Apache Hadoop vs Spark: Worldwide Market Share Compared
Apache Hadoop, built by Yahoo for engineers and data scientists, has not aged well. Once praised for its wealth of potential, it has suffered at the hands of swifter products, often from its own ecosystem. “[Apache] Spark killed Hadoop,” H2O.ai founder Sri Ambati told Datanami last week. It seems that, despite the success of do-everything competitors like Google and Microsoft, the Big Data space calls for specifics.
The data sings a different tune. Though Hadoop’s brand seems synonymous with confusion, adoption has not slowed for this decade-old offering. And Spark has by no means taken over. Has the newer product lived up to the hype? We consulted our records to see:
Our starting point, July 2016, shows a level playing field for the complementary products. With no meaningful spikes after the new year, customers don’t seem to have jumped on Spark, or abandoned its elder completely.
Market Penetration by Industry
It’s no surprise that a product created for experts would stay in its lane. Spark, however, boasts a meaningful distribution across industries, thanks perhaps to a proliferation of Big Data principles in all kinds of markets. So while Spark may have a wider spread, Hadoop still dominates its intended user base.
Major Geographic Markets
Worldwide, we see competitor Informatica taking center stage, with a more meaningful presence in Europe and the Americas, and an overall market share of 32%. Hadoop (and, consequently, Spark) remains limited to the markets where it has already succeeded.
Adoption Trends by Customer Company Size
Nor has there been a proliferation of Spark among enterprise customers. Even allowing for the fact that most companies in the world are small (1-50 employees), Spark doesn’t appear to be the only choice for companies of any size. It has emerged as a helpful offering to those already using Hadoop, not as a selling point for the product at large. That said, it’s not limited to one kind of customer, as Hadoop may have been a decade prior.
Who is Using Hadoop with Spark?
Our records point to legacy tech companies. Mainstays like eBay, Verizon, HP and Amazon mingle with UnitedHealth, Ciena, Epsilon, Pronix and Booz Allen.
Curious about the top players in Big Data, or other tech spaces? With data crawlers tracking job listings, resumes, client documentation and more, iDatalabs’ system gauges the probability that a company is using a certain product. Access these insights and more on over 10,000 tech targeting pages.
Share your queries in the comments, or check out our profile on Quora.