Is the backup industry killing itself?
It’s been fascinating to watch the flood of service providers who jumped into the online backup market over the past five or six years. Most entered the market when it was considered acceptable to charge £5 per GB per month and to sell rigid three- or five-year contracts; those days are long gone. Now we’re seeing a dogfight in which the price has dropped to less than £1 per GB, and as a consequence the quality and integrity of the services have declined as the margins have shrunk. Yet the providers’ original cost models have remained largely the same, and many will go by the wayside unless they can significantly enhance the perceived value of their services and build a sustainable margin. Those nearing the end of their earlier lock-in contracts, where the margins were ‘generous’, will be the first casualties: they will find it near impossible to survive on a price model of less than £1 per GB.
It will be very challenging for them to make inroads into the upper end of the DR market, where the more established DR providers are so entrenched. Without the experience and the expensive infrastructure needed to provide a more robust DR service model, there is limited scope to find a more profitable part of the market in which to survive.
For those courageous enough to completely re-engineer their business models, there are opportunities to provide valuable services in a market that has witnessed explosive growth in data volumes over the past 15 years. We now have a novel name for the problem – Big Data!
If the industry analysts are to be believed, the world’s stored digital data will double by the end of 2014. To put this into context, the analysts predict a 4,300% increase in annual data generation by 2020.
The first reaction is – ‘great opportunities for selling storage platforms’. But the corporate and government markets are already wise to the fact that the challenge of exploding data cannot be resolved simply by throwing more disk capacity at the problem.
The data tsunami is now a well-recognised reality and, unfortunately, we are all acting like the hapless King Canute, standing on the shore in a futile effort to stem the tide. Until we wake up and recognise the problem of storing so much redundant ‘aged’ data on uber-expensive SANs, IT professionals will continue to throw endless disk capacity at it, spending more and more money just to maintain the status quo. Too few are looking at the efficiency of their data storage and at how much of the information they store is actually being used in their business.
“I know that more than 50% of our data is redundant – the problem is, which 50%?”
So where do the opportunities lie?
There is a certain perverse irony to this whole scenario: the value proposition is about convincing organisations to spend less on expensive storage capacity and to let service providers efficiently manage a progressive reduction of their data volumes.
As surely as turkeys will never vote for Christmas, storage vendors will never admit to the cost savings that can be achieved by changing their approach. They have danced around the problem of how to handle ‘aged’ data for years and continue to advise against the introduction of Hierarchical Storage Management (HSM) solutions – despite the fact that 60% of data is effectively redundant within weeks of creation.
Naturally it’s in their best commercial interests to ensure that expensive primary storage is continually upgraded with more disk capacity – regardless of the fact that much of this capacity exists simply to accommodate redundant data. To add to the unnecessary expense, this data is continually backed up, consuming yet more expensive storage and adding more time to the backup window.
For those who may be unfamiliar with HSM, the concept is quite simple. A stubbing and archiving tool identifies ‘aged’ data – unstructured files that have not been opened for a considerable period. The solution is essentially a data-moving exercise: the body of each aged file is moved to a much cheaper secondary storage array, leaving behind a tiny file known as the stub. The stub remains in the user’s view, enabling them to recover the file seamlessly.
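To make the mechanics concrete, here is a minimal sketch of the idea in Python – not any vendor’s product; the archive path, age threshold and stub format are purely illustrative assumptions, and commercial HSM tools recall files transparently through a filesystem driver rather than via a visible stub file.

    import shutil
    import time
    from pathlib import Path

    AGE_THRESHOLD_DAYS = 180                          # hypothetical policy: untouched for ~6 months
    ARCHIVE_ROOT = Path("/mnt/secondary_storage")     # assumed cheaper secondary array
    STUB_SUFFIX = ".stub"                             # hypothetical stub marker

    def stub_and_archive(primary_root: Path) -> None:
        """Move aged files to cheaper storage, leaving a tiny stub behind."""
        cutoff = time.time() - AGE_THRESHOLD_DAYS * 86400
        for path in list(primary_root.rglob("*")):
            if not path.is_file() or path.suffix == STUB_SUFFIX:
                continue
            # Last-access times assume the filesystem records atime.
            if path.stat().st_atime > cutoff:         # opened recently: stays on primary storage
                continue
            target = ARCHIVE_ROOT / path.relative_to(primary_root)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(path), str(target))       # bulk of the file moves to secondary storage
            # The stub keeps the file visible and records where the data now lives.
            path.with_suffix(path.suffix + STUB_SUFFIX).write_text(str(target))

    stub_and_archive(Path("/data/unstructured"))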
Whilst it has been around for decades in various forms and under slightly different names, HSM has in the past attracted a great deal of criticism for being slow and ‘clunky’. This, however, was for the most part due to the legacy middleware that sat between production servers and the archive storage arrays.
Today, that has all changed: the cumbersome middleware has been removed and the retrieval process is almost as slick as one would expect from a high-performance SAN.
The oil & gas industries are the best examples of this archiving strategy where petabytes of exploration data used to consume acres of expensive storage on a Cathedral scale. Now, if a file is not opened within six months of creation, it will be stubbed and archived automatically. The end user is not aware that their files have been moved to a much cheaper storage array; the Stub file still shows on the screen and the file can be retrieved from the archive just as seamlessly as before; it may take a couple of nano seconds longer, but the cost savings are very compelling.
Another major advance in modern HSM systems is the simplicity of the analysis tools – a quantum leap from the days when you had to commit to an expensive technology without any idea of how effective it might be. Now you can set the analysis tool running as a background task on all your unstructured data servers and play with ‘what if’ rules: “What if we stub all files of a certain type that have not been opened for four months?” The resulting reports show exactly how much disk space you would free up and the reduced rate of growth if you maintained these policies. They also give a complete profile of your aged data, telling you by file type the exact volumes that have not been accessed for two or three years.
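As an illustration of what such a ‘what if’ analysis boils down to – a sketch only, with assumed paths, thresholds and file types rather than any particular vendor’s tool – the following script totals the reclaimable space by file type:

    import time
    from collections import defaultdict
    from pathlib import Path

    def what_if_report(root: Path, months_unopened: int, file_types: set | None = None) -> None:
        """Show how much space a hypothetical stubbing policy would free, by file type."""
        cutoff = time.time() - months_unopened * 30 * 86400
        freed_by_type = defaultdict(int)
        for path in root.rglob("*"):
            if not path.is_file():
                continue
            if file_types and path.suffix.lower() not in file_types:
                continue
            info = path.stat()
            if info.st_atime < cutoff:                # not opened within the policy window
                freed_by_type[path.suffix.lower() or "(none)"] += info.st_size
        total = sum(freed_by_type.values())
        for ext, size in sorted(freed_by_type.items(), key=lambda kv: -kv[1]):
            print(f"{ext:>8}: {size / 1e9:8.1f} GB reclaimable")
        print(f"{'TOTAL':>8}: {total / 1e9:8.1f} GB reclaimable")

    # Example 'what if': stub Office documents not opened for four months.
    what_if_report(Path("/data/shares"), months_unopened=4,
                   file_types={".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx"})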
Armed with this level of detailed analysis, you can make informed decisions and take the next step of a free trial, running real-life tests on representative volumes of data. From a backup and DR perspective, the cost implications are immediately obvious: less data volume means a cheaper backup infrastructure and a much shorter backup window. But be mindful that a copy of this data still needs to be retained off-site for DR purposes. With every innovation there has to be a loser – it may be some time before you see your friendly storage salesman again.
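As a back-of-envelope illustration of that last point – the figures below are assumptions for the sake of arithmetic, not measurements – if backup time scales roughly with data volume, then stubbing the 60% of data that is effectively redundant shrinks the nightly backup in proportion:

    # Assumed figures, for illustration only.
    primary_tb = 100                    # unstructured data on primary storage (TB)
    redundant_fraction = 0.60           # share of data unopened for months (per the article)
    backup_window_hours = 10            # current nightly backup window

    remaining_tb = primary_tb * (1 - redundant_fraction)
    new_window_hours = backup_window_hours * (1 - redundant_fraction)

    print(f"Data left to back up: {remaining_tb:.0f} TB (was {primary_tb} TB)")
    print(f"Approx. backup window: {new_window_hours:.1f} h (was {backup_window_hours} h)")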
This was posted in Bdaily's Members' News section by Covenco.