Published on the 01/05/2017 | Written by Donovan Jackson
Hadoop inventor pegs ‘big data’ back to ‘modern data infrastructure’…
Whether it is really ‘big data’ or just ‘data’ is a question Doug Cutting is ideally positioned to answer. After all, he co-created Hadoop, the distributed storage and processing framework synonymous with ‘big data’ which is now a project of the Apache Software Foundation, and famously named it after his son’s toy elephant.
“Just calling it ‘data’ is very much the trend. But one of the salient features when this was a new breed of technology with distributed systems was that they could scale much further; the applications for analysis which would deliver the most dramatic benefits were always those which could not scale [until the big data concept came along], so there was a strong focus on size there,” Cutting explained.
There is slightly more to it, too, which Cutting was happy to admit with some levity. “We also wanted to distinguish it, of course. But now it’s fair to say that what we are seeing now is just modern data infrastructure.”
These days, Cutting is chief information architect at Cloudera, unsurprisingly a provider of Hadoop software, support and services. Cloudera has offices in Australia and New Zealand, and Cutting was over this way to meet with customers and do what he does best, which is to advocate for the use of data to solve business problems.
In any event, sometimes simple questions deliver the best insights. Asked what the big deal is with big data, Cutting said the headlines, excitement and interest aren’t hype. “Hype is in excess of reality; the excitement around big data is appropriate,” he declared. Well, he would say that, wouldn’t he?
But, continued Cutting, “What’s behind it is a transition towards things being more digital throughout society, the economy and government. We are seeing that more and more people can afford to and are finding value in using digital devices; these things leave tracks which can be collected and which becomes the eyes and ears on what customers are doing.”
The latter bit may sound a little creepy, until you bear in mind that those ‘tracks’, and the analysis thereof, are precisely what gives companies the ability to tailor products and services in ‘mass personalisation’ efforts. It’s what lets Google be helpful with its traffic and other suggestions, or what helps Amazon sell you something else you probably hadn’t realised you wanted (until laying eyes on it). It’s what banks and telcos want to be able to do to provide you with the services you need, but never realised you wanted. Governments, too, want in on the action, in the way that governments everywhere like to creep further and further into the lives of citizens.
“If you are going to remain competitive, improve and grow [as a service provider], then you need to know what your customers are doing, keep track, learn to analyse and harness information. That’s the trend towards a digitised economy and society,” said Cutting.
With his mention of things going digital, we wanted to know his view on digital transformation; after all, among the jaded members of tech society, there is a perception that anything from an infrastructure project to an ERP implementation can be characterised as ‘digital transformation’. It has practically become a synonym for ‘IT project’.
“Well, the nature of things is changing,” Cutting countered. “For decades, the IT organisation was running certain aspects of the business, like the back end, and that’s been digital for a long time now. However, ‘digital’ now pervades every aspect of the business. It’s no longer a narrow thing [run by techies] but a general capability. And instead of experts to run business systems, we’re moving to where things are more self-serve, where people in sales or marketing for example are using systems directly in coordination with other parts of the organisation.”
Digital organisations, he said, don’t have the strongly centralised IT of the past; IT is not so much ‘running the show’, but coordinating guidelines.
The use cases for big data (or just data, as we can now return to calling it) are diverse, he said, but what stands out are the results that become possible when professionals are given tools they can use to solve problems they understand. That includes groups like nurses or cancer researchers; he said one of the best use cases he has seen involved nurses in a paediatric ward who spontaneously started using data gathered from the treatment regimes of premature babies to determine which interventions caused the most stress and which were more likely to deliver positive outcomes.
Which is why, Cutting said, despite the concerns of ‘industry’ around skills shortages, he does not see this as a major problem. “It is not something we worry about. There are people out there who know what the problems are within their industries and now we have the tools to help them attack those problems.”
The trick, he said, is to create useable tools which rapidly demonstrate their ability to add value. “Once people have those sorts of things, they are hungry for more. Some training is obviously important, but generally, people are good at figuring out how things like data analytics can help them get their jobs done better.”