Published on the 20/09/2017 | Written by Martin Norgrove
Cloud computing is hardly new, but in data warehousing, structural challenges and a lack of maturity have limited widespread adoption. Until now, writes NOW Consulting’s Martin Norgrove...
A question mark has also surrounded security: for enterprises which are using data warehouses, the integrity of information is paramount. But with these issues addressed, the time is ripe for your organisation to, at the very least, test the waters with cloud data warehousing. Why? Because the benefits and advantages are so clear that if you’re not doing it, you are missing out.
Maturity is often just a question of how long something has been in the market. That ‘time in the saddle’ gives vendors the room to experiment, identify and iron out issues, and shape up value propositions. It also gives early adopters the opportunity to get to grips with a new way of doing things.
With data warehousing in the cloud, the major steps to maturity have come in the past 12 to 18 months. The big vendors in this space, including Microsoft Azure with SQL Data Warehouse and Amazon’s AWS with Redshift, now offer highly viable solutions.
More than that, they are now ‘battle proven’; particularly in the last six months, we’ve seen several serious enterprise workloads successfully move into the cloud.
In simple terms, that means when you go ahead and try a data warehouse in the cloud, you will not be a ‘test case’: thousands of other organisations have done it before and are seeing the advantages. The big question of ‘security’ is resolved – not just for data warehousing, but for cloud computing generally. The companies which offer cloud solutions, after all, invest vastly more in security than any of their customers can. Why? Because their very business depends upon it. If Xero and other leading organisations have done it, so can you. “It is not an overstatement to say the cloud has been crucial to Xero’s success,” Andrew Jessett, Xero’s GM of IT, has said. “We could not have succeeded without it.”
In other words, trust in the platforms is established, demonstrated and proven.
Let’s look at the advantages of a cloud data warehouse. Most of the pluses are somewhat generic to any cloud service, but they bear repeating in the context of the data warehouse: cost, efficiency, flexibility, speed (or time-to-value) and simplification.
Start with cost and efficiency, which enterprises recognise as major challenges on the path to value. When the data warehouse is in the cloud, it can be ‘stood up’ in a matter of minutes with a few mouse clicks. There is no need to procure, install and then maintain enterprise hardware (an exercise which typically involves capital budget approval, delivery lead times and installation, all of which can take months). The software, be it Azure SQL, Redshift or anything else, is packaged with the infrastructure, all in one go. The saving in hassle, time and money is immediate and extreme.
Then there is the question of software licence costs. In the cloud, these drop by a factor of four to five when moving from, say, a traditional data warehouse to Redshift. It isn’t just the immediate cost benefit either: traditional data warehouses can be expensive and slow to upgrade and update. It can even be difficult to unlock new features.
What of performance, or scale? Traditionally, you’d need to kick off that tin procurement exercise if you needed more of either. With the cloud, more capacity and performance are on tap – the platform scales linearly. We’ve seen a process which took 80 minutes drop to 10 minutes. For businesses processing data multiple times per day, the advantage is obvious.
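To put the 80-minute and 10-minute figures in perspective, the short calculation below works out the speedup and, taking the linear-scaling claim at face value, the relative capacity it implies. The function names and the four-runs-per-day figure are assumptions made purely for illustration; they do not come from any vendor tooling or measured workload.

```python
def speedup(before_minutes: float, after_minutes: float) -> float:
    """Return how many times faster the job runs after the change."""
    if before_minutes <= 0 or after_minutes <= 0:
        raise ValueError("durations must be positive")
    return before_minutes / after_minutes


def minutes_saved_per_day(before_minutes: float, after_minutes: float,
                          runs_per_day: int) -> float:
    """Return the total processing time saved across one day's runs."""
    return (before_minutes - after_minutes) * runs_per_day


# Figures quoted in the article: 80 minutes before, 10 minutes after.
factor = speedup(80, 10)

# Under strictly linear scaling, an N-times speedup implies N times the capacity.
capacity_multiple = factor

# Hypothetical schedule: the job runs four times per day.
saved = minutes_saved_per_day(80, 10, runs_per_day=4)

print(f"speedup: {factor:.0f}x, capacity: {capacity_multiple:.0f}x, "
      f"saved per day: {saved} min")
```

The more often the job runs, the more the daily saving grows, which is why the gain matters most to businesses processing data several times a day.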
Costs aren’t just a factor of establishing the infrastructure; the skills necessary to get to work on populating the data warehouse must be factored in, too. As far as Microsoft is concerned, the skills are entirely translatable: SQL is SQL. There is a common developer experience, and when a tool like WhereScape RED is introduced into the mix, data translation and automation make for a highly efficient (and fast) data warehouse setup (and for all the WhereScape RED fans out there, the experience in RED is almost identical).
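As a rough illustration of why SQL skills carry over, the sketch below runs a typical warehouse-style aggregate using Python’s built-in sqlite3 module as a stand-in engine. The `sales` table, its columns and the sample rows are hypothetical. The query uses only basic SELECT, SUM, GROUP BY and ORDER BY, the kind of statement most SQL engines accept, though each platform has its own dialect differences that would need checking.

```python
import sqlite3

# In-memory database used purely as a stand-in for a warehouse engine.
conn = sqlite3.connect(":memory:")

# Hypothetical fact table and sample rows, for illustration only.
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("North", 120.0), ("South", 80.0), ("North", 30.0)],
)

# A plain aggregate query with no engine-specific syntax.
query = """
    SELECT region, SUM(amount) AS total_amount
    FROM sales
    GROUP BY region
    ORDER BY region
"""
totals = conn.execute(query).fetchall()
conn.close()

print(totals)
```

The point is the query text rather than the engine: an analyst who can write this statement already has the core skill the paragraph describes.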
It’s all about the analysis
By now, the time-to-value advantage should be making itself obvious. The point of a data warehouse isn’t to have a data warehouse but to have it as a foundation on which to conduct analysis and get outputs of actionable information and insights. When the data warehouse is stood up in the cloud rapidly and at a dramatically reduced overhead, that means the good stuff starts happening far faster. Business analysts can do their thing with the cloud data warehouse in days, rather than standing by and hoping something comes out of it in weeks, months or perhaps, as is sometimes the case, never.
That’s particularly relevant right now, with all the media attention on machine learning, artificial intelligence and other advanced analysis techniques. These may be in the ‘marketing exercise’ phase of maturity, but just as cloud data warehouses took a little time (not a lot, mind) to mature, so too will these concepts shortly be ready for prime time. Putting them to work and getting valuable outcomes is enormously more viable when your data warehouse is in the cloud; already, some algorithms can simply be plugged into your data warehouse. Expect a lot – like really, a LOT – more as these concepts shoot up the maturity curve. With the basics of a data warehouse in the cloud, you are equipped to do the fancy stuff fast, inexpensively and – importantly – experimentally. With such a low cost, you can go right ahead.
Another of the popular words in the press these days is innovation. There’s no better way to take the excitement out of a potentially new way of doing things than telling the eager data scientist he’ll need to wait a few months before the means to test his latest hypothesis is established. The cloud delivers immediacy. Had a great idea which needs data analysis to test it? Stand up a data lake right away. Test, experiment, verify, get into production faster, at lower cost and all the while keeping smart staff members enthused. If it works, fantastic. If it doesn’t, no worry – failing fast allows for iterative improvement without breaking the bank.
That helps foster a culture of innovation, where people in your teams are equipped to try new things. If it doesn’t work, close it down – and you aren’t stuck with a bunch of servers and more shelfware gathering dust.
What’s not to like
With low barriers to entry, proven use cases and the ability to do much more with data, much faster, the case for data warehousing in the cloud is strong. The bottom line is simple: if you aren’t at least testing data warehousing in the cloud, you are losing out.
Martin Norgrove is CTO at NOW Consulting, a data and analytics services company. With a BSc in Chemistry and Physics from The University of Auckland, Martin started his working life as a lab technician at Carter Holt Harvey before discovering his passion for all things data. He’s worked for numerous high-profile brands including Spark, Z Energy, ASB, Auckland Council and Lotto, and thinks the future is very bright.