Billion Dollar Open-Source
Big Data is one of the defining technology themes of this decade. It has already proven its ability to unlock value across a range of industries with enhanced analysis and agile fulfillment. This trend is only going to accelerate as an entire industry organizes itself around the task of developing and delivering this value.
That Apache Hadoop, the dominant software framework, is an open-source project is no longer a surprise. Open source has proven to be an extremely effective model for incentivizing and organizing the collaborative development of new frameworks by individuals around the world. The payoffs for getting involved in the community include peer recognition, love of the game, and, once the sector is sophisticated enough, market share worth billions of dollars.
The stakes are incredibly high for companies seeking to contribute to the ecosystem and eventually monetize their role. Companies are jockeying to create stable, walled-off versions of the source code, which can then be licensed out to developers along with a portfolio of other value-added services.
A Similar Battle
Today’s battle over the dominance of Hadoop harkens back to a similar battle not too many years ago for another open-source framework known as Linux. Linux was developed as an open-source operating system and was adopted by enterprise disrupters seeking to avoid expensive proprietary software.
Though there were many challengers, including Novell and SUSE, the company that won the battle for Linux dominance was Red Hat. The main parameters that battle was fought over included the stability of the framework, securing a majority of quality partners, and, as the bottom line, who employed the most committers of source code to the Linux project.
That last point, how much code your employees commit to the source, is a large factor in the community support resources that you can offer to third-party application developers. Hadoop, like Linux or Facebook, is considered to be a platform rather than an application. This means that it enables multiple applications with its core utility rather than focusing on refining only one utility.
Third-party platform developers take source code and then build their own utility on top of it. This means that every customer this third party goes on to acquire will become a customer of the provider of the entrenched source code. This provider can offer value-added services, and it enjoys multiple proxy sales forces through these third-party developers. It creates a center of gravity that drives more and more entrants to choose the prevailing standard rather than the minority one. This is how the network effect plays a part in open source, and why Big Data will probably be a winner-take-all scenario.
The Lion’s Prize
Though Big Data has yet to reach a tipping point, there will come a time when one company will unlock a critical mass of adoption and thus set off a virtuous cycle of growth. True, many players will benefit tangentially by bundling Big Data utility with proprietary software and hardware, but there is a lion’s prize still waiting to be taken.
Cloudera and Hortonworks, the two dominant players, have each raised a massive war chest of VC cash in order to hire more developers to commit code, as well as to engage in marketing activities that promote their standard.
Cloudera boasts a two-year lead, early market share, greater traction, and a greater number of partners. On the other hand, Hortonworks promotes the purity of its approach to open source, which is spiritually aligned with the positioning taken by Red Hat. Then again, Cloudera’s product, Cloudera Manager, is considered to be more stable than Hortonworks’ Ambari. All romanticism aside, the early stages of a technology sector generally leave little room for purity to take precedence in a purchasing decision. Fifty percent of a market is composed of stable, risk-averse enterprise customers who adopt new technologies in order to keep up with the competition, or to avoid being clobbered, but are not necessarily interested in pushing the envelope or trying to out-innovate others.
The picture gets a bit murkier when considering the percentage of lifetime patches contributed to all Hadoop-related projects, broken down by employer: the gap between the two top competitors is fairly slim.
It is too early to call a winner, and the turf war over open-source Big Data is set to be one of the defining technology competitions over the next several years.