If industry insiders are correct, 2012 should see an increasing number of vendors and enterprises launching big data initiatives. Many of those projects will involve Apache Hadoop, an open source technology that makes it possible -- and economical -- for enterprises to store large amounts of diverse data on clusters of standard servers and to analyze that data very quickly.
Hadoop has been around since 2006, but it's really been gaining attention in the past year or so. "2011 was kind of the year where a critical mass of enterprise customers and vendors kind of began to realize the opportunity and value behind the Hadoop phenomenon," noted Shaun Connolly, VP of corporate strategy for Hortonworks, one of the key contributors to Hadoop. "I totally expect the trend to continue in 2012." Connolly added that Hortonworks believes "that by 2015, more than half the world's data will be processed by Apache Hadoop."
That prediction has big implications for enterprises, for vendors and for developers working on big data projects.
Hadoop No Longer a 'Science Project'
Many enterprises have already begun experimenting with Hadoop in small ways, but analysts say this could be the year they begin to get serious about the technology. Benjamin Woo, program vice president for worldwide storage systems at IDC, noted that until now most companies have been approaching Hadoop as a "science project." However, Woo said, "What we believe will happen this year is that there will be enterprise acceptance of Hadoop."
What's driving this enterprise rush to Hadoop? The opportunity to make money.
"Google showed us that you can build a large, profitable, fast-growing business entirely out of data. Apache Hadoop represents the opportunity for businesses of all stripes to apply those same technologies and techniques to unlock new value from the under-utilized asset that is their data," explained Charles Zedlewski, VP of product at Cloudera. "It turns out everyone has big data."
Connolly pointed out that while enterprises store a lot of data, "75 percent of the data that flows through enterprises isn't stored." Because Hadoop makes it economically feasible to store much more of that data, "arguably there is now a whole long tail of data that can be stored and farmed for extreme value," he added. "Technology aside, economics are a big factor in this."
According to market research firm Gartner, "Worldwide information volume is growing annually at a minimum rate of 59 percent, and while volume is a significant challenge in managing big data, business and IT leaders must focus on information volume, variety and velocity."
Those three Vs -- volume, variety, and velocity -- explain the appeal of Hadoop. It can deal with large volumes of data. It can handle a variety of data from widely different sources. And it can analyze that data quickly, enabling business leaders to respond rapidly to changing conditions.
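The analytical model behind that speed is MapReduce, which Hadoop uses to split a job across a cluster. As a rough illustration only -- a simplified, single-machine sketch in Python, not Hadoop's actual Java API -- the classic word-count job follows a map, shuffle, and reduce flow:

```python
from collections import defaultdict

def map_phase(record):
    # Map step: each worker emits (key, value) pairs from its slice of
    # the input -- here, (word, 1) for every word in a line of text.
    for word in record.lower().split():
        yield word, 1

def shuffle(pairs):
    # Shuffle step: group all emitted values by key, so each reducer
    # sees every value for the keys it owns.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Reduce step: aggregate each key's values -- here, a simple count.
    return key, sum(values)

records = ["big data big clusters", "data velocity"]
pairs = [kv for record in records for kv in map_phase(record)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)  # {'big': 2, 'data': 2, 'clusters': 1, 'velocity': 1}
```

On a real cluster, Hadoop runs the map and reduce steps in parallel on the nodes where the data already lives, which is what lets it chew through large volumes of varied data quickly.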
They may not be sure exactly how they will use their data, but enterprises are betting that they'll be able to make money by analyzing it.
Hadoop's Place in a Crowded Big Data Market
Of course, enterprises aren't the only ones hoping to make money from big data. Numerous vendors have launched Hadoop-related products and services. In fact, Woo said, "We've identified almost 200 companies in the big data space."
With so many players in the market, it's easy to see that not all of them will flourish. IDC has predicted that this year will see a lot of merger and acquisition activity as large technology companies rush to buy smaller companies with expertise in big data. By 2015, the analysts say it's likely that none of the current "major players" in the Hadoop market will still exist.