How Many Ecommerce Companies Are There?

There is no shortage of top-down research telling us that the ecommerce market is enormous, growing extremely fast, and showing no signs of slowing down. According to sources like eMarketer, ecommerce is the only trillion-dollar industry growing at a double-digit percentage each year. And with the US Census Bureau estimating that only 7% of retail sales are done on the internet, ecommerce still has a lot of runway for growth.

Despite all this research, however, no one seems to be able to answer the key question: how many ecommerce companies are there? The few estimates that exist vary by orders of magnitude, from tens of thousands to nearly a million.

We set out to answer this question for ourselves.

How we did it

We have a secret ingredient that helped us build an estimate from the ground up: proprietary data. Here at RJMetrics, we work with hundreds of online retailers who generously allow us to anonymize high-level data points for analyses like these.

By combining our proprietary data with size and revenue information from third-party sources like the Internet Retailer Top 500 Guide, Alexa, and BuiltWith, we’ve conducted a comprehensive bottom-up analysis of the ecommerce industry.

Size matters

Obviously, the long tail is going to be very long here. Using BuiltWith to identify which websites have ecommerce technologies installed, we found 180,000 live websites with just the Magento shopping cart. When you extrapolate to include the full universe of competing ecommerce technologies, you can see how some estimates approach the one-million mark. As you might have guessed, however, the majority of these sites are not generating revenue on any meaningful scale.

In order to separate the wheat from the chaff, we needed to come up with revenue-based exclusion criteria.

Tying Alexa rank to revenue

Alexa rank is an easily-obtained proxy for traffic. Alexa ranks every website in the world based on traffic volume. A global rank of 1 represents the website with the most traffic in the world (currently Google). Since ecommerce revenue is directly correlated with the number of visitors to a site, we theorized that Alexa rank could serve as a proxy for revenue. To test this, we needed revenue data for a set of ecommerce companies that spanned a broad spectrum of Alexa ranks.

To get revenue data, we turned to the Internet Retailer Top 500 Guide and augmented it with our own proprietary benchmarking data set. The IR 500 includes the heaviest hitters in ecommerce, and our own data covered mid-sized and smaller companies. Between these two data sets we had Alexa rank and revenue data on the full spectrum of ecommerce companies. Here’s what we saw:

Jackpot! There appears to be a pretty clear-cut link between revenue and Alexa rank. To be sure, let’s zoom in past the Walmarts and Amazons of the world and just look at the “long tail” of sites with Alexa ranks between 10,000 and 1,000,000:

Awesome. These combined data sets have given us visibility into the revenue of ecommerce companies throughout the Alexa top 1 million sites.
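To make the comparison concrete, here is a minimal sketch of how sites could be grouped into Alexa rank bands and summarized by mean and median revenue. The sample values, the band boundaries, and the function name `summarize_by_band` are all assumptions made up for illustration; none of them come from the IR 500 or from RJMetrics data.

```python
from statistics import mean, median

# Hypothetical (alexa_rank, annual_revenue_usd) pairs. Illustrative values only,
# not figures from the IR 500 or any proprietary data set.
SITES = [
    (15_000, 40_000_000),
    (60_000, 12_000_000),
    (80_000, 9_000_000),
    (250_000, 3_000_000),
    (400_000, 2_200_000),
    (600_000, 4_000_000),
    (750_000, 500_000),
    (900_000, 300_000),
]

# Rank bands as (inclusive lower bound, exclusive upper bound). Assumed boundaries.
BANDS = [(10_000, 100_000), (100_000, 500_000), (500_000, 1_000_001)]


def summarize_by_band(sites, bands):
    """Return {band: (count, mean_revenue, median_revenue)} for non-empty bands."""
    summary = {}
    for low, high in bands:
        revenues = [rev for rank, rev in sites if low <= rank < high]
        if revenues:
            summary[(low, high)] = (len(revenues), mean(revenues), median(revenues))
    return summary


if __name__ == "__main__":
    for (low, high), (n, avg, med) in summarize_by_band(SITES, BANDS).items():
        print(f"ranks {low:,} to {high - 1:,}: n={n} mean=${avg:,.0f} median=${med:,.0f}")
```

Reporting the median next to the mean matters because a handful of very large sites in a band can pull the mean well above what a typical site earns.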

Meaningful scale

Note that, while the 500k-1M data point is quite low, it’s far from zero. The mean 2013 revenue for sites in that range is actually $1.5 Million and the median is around $500k. As evidenced by that discrepancy, average revenue drops meaningfully in this range.

For this reason, we’ve made an Alexa rank of 1,000,000 the cutoff for sites we include in our count.

While we are aware of many websites ranked beyond 1,000,000 that are generating well into six and even seven figures of revenue, we believe there would be far more false positives than false negatives if we included sites beyond this mark. We’re comfortable concluding that the false positives and false negatives on either side of the threshold roughly balance out when the threshold sits at the Alexa one-million mark.

Defining ecommerce

Now that we had a way of estimating which ecommerce companies are actually generating meaningful revenue, we simply needed some way of figuring out which sites in the Alexa Top One Million are actually ecommerce.

Using the BuiltWith API, we were able to profile every website in the Alexa Top One Million by evaluating the technologies being used by those sites. BuiltWith can detect a whole universe of shopping carts, marketing tools, and other ecommerce-specific technology that makes a website a dead giveaway as ecommerce.

But this wasn’t good enough—we were still getting a lot of false positives and false negatives. We decided to go a step further. We scraped the HTML of each site’s home page and looked for certain words: “shop”, “buy”, “sell”. We also detected defunct pages and sites that looked more like linkspam. We ended up building an entire set of rules to automatically evaluate whether or not a given site was ecommerce.
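A rule set of this kind might look something like the sketch below. Only the three keywords ("shop", "buy", "sell") and the idea of combining them with detected cart technology come from the post. The dead-page markers, the thresholds, and the function names are hypothetical, and the real rule set was certainly larger.

```python
import re

# Keywords named in the post; the scoring scheme around them is an assumption.
COMMERCE_WORDS = ("shop", "buy", "sell")
# Hypothetical markers for parked, defunct, or link-spam pages.
DEAD_PAGE_MARKERS = ("domain is for sale", "under construction", "coming soon")


def visible_text(html: str) -> str:
    """Crudely strip script/style blocks and tags, then lowercase the rest."""
    html = re.sub(r"(?is)<(script|style)\b.*?</\1\s*>", " ", html)
    return re.sub(r"(?s)<[^>]+>", " ", html).lower()


def looks_like_ecommerce(html: str, has_cart_technology: bool) -> bool:
    """Apply a few hand-written rules to a home page.

    has_cart_technology stands in for a technology lookup (for example a
    BuiltWith profile) that is assumed to have been done elsewhere.
    """
    text = visible_text(html)
    if any(marker in text for marker in DEAD_PAGE_MARKERS):
        return False
    words = set(re.findall(r"[a-z]+", text))
    keyword_hits = sum(1 for word in COMMERCE_WORDS if word in words)
    # Assumed rule: a cart plus one keyword is enough; without a cart,
    # require at least two keywords.
    if has_cart_technology:
        return keyword_hits >= 1
    return keyword_hits >= 2


if __name__ == "__main__":
    store = "<html><body><h1>Shop our store</h1><a href='/cart'>Buy now</a></body></html>"
    paper = "<html><body><p>Latest news and sports</p></body></html>"
    print(looks_like_ecommerce(store, has_cart_technology=True))
    print(looks_like_ecommerce(paper, has_cart_technology=True))
```

Matching on whole words rather than substrings avoids counting "workshop" as "shop", and the newspaper example shows why a detected cart alone is not treated as sufficient.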

And at every turn, we evaluated the rules against a set of websites that we had evaluated by hand. Eventually, our algorithm was actually able to predict whether a site was ecommerce with 95% accuracy.
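Checking rules against a hand-labeled set can be sketched as follows. The helper names `accuracy` and `confusion_counts` and the sample labels are assumptions for illustration; the post only states that predictions were compared with hand evaluations and reached 95% accuracy.

```python
def accuracy(predictions, labels):
    """Fraction of sites where the predicted label matches the hand label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("need at least one labeled site")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def confusion_counts(predictions, labels):
    """Count true/false positives and negatives (True means 'is ecommerce')."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for predicted, actual in zip(predictions, labels):
        if predicted and actual:
            counts["tp"] += 1
        elif predicted and not actual:
            counts["fp"] += 1
        elif not predicted and not actual:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical hand labels and rule outputs for five sites.
    hand_labels = [True, False, False, True, True]
    rule_output = [True, True, False, False, True]
    print(accuracy(rule_output, hand_labels))
    print(confusion_counts(rule_output, hand_labels))
```

Tracking false positives and false negatives separately, rather than accuracy alone, shows which kind of rule to add next.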

After we had fine-tuned the algorithm, we turned it loose on the Alexa Global Top One Million sites. Here’s what we found:

There are approximately 110,000 ecommerce websites generating revenue of meaningful scale on the internet.

More than 12% of the 100,000 highest-traffic websites are ecommerce, and that density clearly declines to about 10% for the long tail. According to our data, ecommerce websites make up approximately 10-12% of the internet. And to our knowledge, we’re the first to actually attempt to count them.

I should point out that we include any online transactional business in our assessment. In addition to traditional online retail, this includes companies selling virtual goods, hosted software providers, marketplaces, travel sites, and even mobile apps with a commerce component. Basically, if you can spend money on their website, it qualifies.

It should also be noted that our detection methodology excludes non-English language websites and pornographic websites. When building our algorithm, we had to search for particular content on these pages. We didn’t have the resources to translate and test these rules in other languages, and we didn’t have the…inclination…to test them against pornographic websites. Both of these limitations of our analysis deflate the numbers we report.

Mid-market ecommerce companies generate a ton of revenue

Having just tagged every site on the Alexa Top One Million as ecommerce or not, and having figured out the underlying relationship between Alexa rank and revenue, we have our hands on a pretty interesting dataset. We’ll be exploring this data in several posts down the road, but here’s the first cut we wanted to share with you.

We looked at the revenue breakdown between the largest and smallest of these sites to try to figure out the industry landscape. Based on our dataset, ecommerce clearly breaks down into three distinct groups.

The largest ecommerce sites on the internet make up about 1% of the total population and generate 34% of the total revenue.

A distinct middle tier of ecommerce sites makes up 51% of the total population and generates 63% of the total revenue.

Small ecommerce sites make up 48% of the total population and generate 3% of the total revenue.
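One way to read the three tiers above is to divide each tier's revenue share by its population share, which gives the revenue of an average site in that tier relative to an average ecommerce site overall. The sketch below uses only the percentages quoted above; the function name is made up for illustration.

```python
# (population share %, revenue share %) for each tier, as quoted above.
TIERS = {
    "large": (1, 34),
    "middle": (51, 63),
    "small": (48, 3),
}


def relative_revenue_per_site(tiers):
    """Revenue share divided by population share; 1.0 means an average site."""
    return {name: revenue / population for name, (population, revenue) in tiers.items()}


if __name__ == "__main__":
    ratios = relative_revenue_per_site(TIERS)
    for name, ratio in ratios.items():
        print(f"{name}: {ratio:.2f}x the average site")
    print(f"middle vs small: {ratios['middle'] / ratios['small']:.1f}x")
```

On these shares, an average middle-tier site brings in roughly 20 times the revenue of an average small site, and an average large site roughly 28 times that of a middle-tier site.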

The opportunity in ecommerce

This represents a big opportunity for vendors (like RJMetrics) serving the ecommerce market. Any company that can help merchants move from the bottom to the middle tier of the market will make a very significant impact on their top line. The middle of the market is where traffic volumes start to really bring in dollars, and getting to that scale is an imperative for any ecommerce company focused on growth.

Robert, this is a very interesting analysis. Thanks for putting it up. However, the input data (Alexa and BuiltWith) is unreliable. This link on BuiltWith – http://trends.builtwith.com/shop – shows that there are 4 million websites with cart functionality, which basically qualifies them as ecommerce websites. It doesn’t matter whether they generate revenue or not; technically they fit the definition of ecommerce. My way of coming up with the number of ecommerce websites was to sum up the publicly advertised number of stores from the top 10 most used ecommerce platforms. That summed to almost 800,000 websites, but this is not reliable either, as it’s based on advertising 🙂

http://rjmetrics.com/ Tristan Handy

Hi Traian,

Thanks for your careful review, glad to see how interested you are in the topic!

We evaluated thousands of individual sites by hand while building our algorithm. Without doing that work, it’s impossible to know what’s really out there.

The reason we used the methodology we did is that most of the websites with ecommerce technology installed actually *aren’t* ecommerce businesses. Frequently, they are defunct–an administrator will install a cart platform on a domain and then never actually get the site up and running. Frequently, they’re totally unrelated business models. Content sites such as newspapers also often use cart platforms, but are not “ecommerce”. There are many other situations where someone that has a cart installed isn’t actually an ecommerce vendor; we had to find each of these situations and then build rules into our algorithm to correct for them.

Hope that makes sense!

Tristan

Traian Neacsu

🙂 ecommerce is a subject dear to my heart. As a matter of fact I will be publishing something on ecommerce pretty soon.

Yes, you are totally right about the forgotten code, but technically speaking anything that involves a monetary transaction between two computers connected over the WWW is electronic commerce, right?

David Booth

Interesting analysis Robert. Is your data able to show the location of those ecommerce businesses? Eg “East Coast” vs “West Coast”, or by state, or more granular?

http://rjmetrics.com/ Tristan Handy

Yes–we could absolutely do that. We’re not quite there yet, but I can imagine doing that in a future iteration. Will definitely post updates here.

http://www.goudengids.be Robin Soubry

Hi Robert,

For an analysis on the e-commerce landscape in Belgium, I’m working out a similar initiative.
Would it be possible to have a call on the methodology you used to clear out most of the false positives/negatives?
Could the algorithm be licensed to us for analysis of the market we’re looking at? (However Belgium has 3 official languages, so tweaks will be required).

http://rjmetrics.com/ Tristan Handy

Hey Robin, thanks for your interest. I wish I could help. This was actually a massive software development effort (months!) that used a lot of internal proprietary data. We’re not going to be able to release it because of the sensitivity.

If you read the post as a how-to, you can start to get a sense of how we wrote what we wrote… 🙂

Good luck! It was a really fun project.

Nikolaus Foulkrod

Hey Robert,

I love the article, really great data. I am a college student doing a research project on ecommerce in the US. I don’t know if this is possible, but is there any way that your algorithm could also account for the location of the ecommerce company? I have been looking for a heat map of ecommerce companies by state without any luck. Any suggestions would be greatly appreciated!

http://rjmetrics.com/ Tristan Handy

You’re the second person who has asked for this! That’s awesome. We don’t have that data right now but it looks like we have a good next step…

Nikolaus Foulkrod

Well if you do put something together let me know for sure! That would be awesome

Dimitrios Kourtesis

Hi Robert,

Great article. Thanks for sharing your insights. The revenue share among top/mid/bottom ecommerce sites is illuminating. But I’m curious about the actual total revenue figure that is being broken down. What is the total revenue you’re assuming those ecommerce sites are making?

Dimitrios

Charlie

Great post. I’ve been looking for this for ages. I even found the Referral Candy one you referenced, but it didn’t go far enough. I’m stumped by one of the paragraphs in your post: “Note that, while the 500k-1M data point is quite low, it’s far from zero. The mean 2013 revenue for sites in that range is actually $1.5 Million” It doesn’t seem to correlate to the graph above. I would like to answer the question: what’s the mean revenue for the stores you define as top, mid and bottom? Thanks. Charlie

http://rjmetrics.com/ Tristan Handy

Hey Charlie, let me look into this. I totally see what you’re saying and need to take a look at whether the axis on that chart is off or whether I’m just forgetting some piece of information. Will get back to you…

Charlie

Thanks

http://rjmetrics.com/ Tristan Handy

Hah. I really appreciate you pointing this out. It turns out that when we were prepping the charts for this piece we screwed up the y axis—it was off by a factor of 100. Just fixed. Thanks again 🙂

Charlie

Thanks for checking and fixing this. Really helpful work!

http://liesandsubtweets.wordpress.com Nick Quah

“It should also be noted that our detection methodology excludes non-English language websites”

Quick question on this: don’t you think the exclusion of non-English sites ends up providing a considerably incomplete picture of e-commerce across the (globally-populated) internet landscape?

http://rjmetrics.com/ Tristan Handy

Hello Nick! You’re not wrong–we’re definitely not estimating the global question. Our goal was to produce the best answer, to-date, of at least *part* of that question. We hope to continue enriching this data set in the coming months and years, such that we’ll be able to understand the growing world of ecommerce better and better.

The data that went into this post has been very useful to us, and in sharing this we’re hopeful that you’ll find it valuable as well, even though it isn’t yet perfect 🙂

Bettina Morton

I am also looking for ecommerce seller numbers by state. Have you made any progress on that yet? Bettina

Matthew Goodman

Hi Robert,

You identified the mean revenue for sites in the 500K to 1M range.

Do you have any data on the mean revenue for sites in the 10K to 500K range?

Many thanks
Matt

Joey Lei

Robert – great analysis – I’m looking to size an omnichannel market (ecommerce businesses with a retail presence). How would we determine this using your methodology? Not all ecommerce businesses are pure ecommerce, of course… and this seems to exclude eBay/Amazon sellers?

Storehippo

Interesting post. Just in case you are an online retailer looking to tap the e-commerce marketplace, StoreHippo is your answer. With features like cash on delivery, integrated logistics, leading payment gateways, customized features, multilingual support, beautiful themes, SEO-rich tools, and Google Analytics, StoreHippo fulfills the needs of every retailer. For more info: http://www.storehippo.com

http://www.brightverge.com/ Bright Verge

Thanks for your post. E-commerce websites are steadily gaining popularity, but I think the hard part is not building a website. The hard part is running one, as it requires a lot of money for advertising, keywords, etc.!

https://www.whitehatworlds.com/ WHW

Wonderful blog & good post. It is really helpful for me, and I’m waiting for more new posts. Keep blogging! White Hat World ….

Peter Caputa

Do you have any idea how many people these firms employ? I would love to see a curve of employee headcount vs Alexa rank, or even employee headcount vs revenue. Either one would get me what I’m looking for, since you have the Alexa rank vs revenue graph figured out.

Thank you very much for publishing this, btw. Very helpful.

http://www.datanyze.com Sam Laber

Really interesting article. I know BuiltWith and Alexa data is readily available, but were you guys using your own datasets to determine the revenue of these sites? I wrote a very similar post detailing the open source ecommerce market, so would be interested to understand more about the methodology behind connecting Alexa range to revenue.

http://www.retailreco.com/ Retail Reco

This is the greatest post on the global eCommerce business. First, we are wondering where you got the accurate 2013 revenue numbers. Second, we would like to mention that the mean revenue for the 500K to 1M Alexa rank range would easily increase from 1.5M to at least 2M if all of those sites used the retailreco personalization solution. Our predictive analytics enable on-site and omnichannel personalization solutions that help retailers increase revenue by 20% or more. For details, visit retailreco.com

Daniel Ripoll

Great piece Robert. I appreciate the statistical breakdowns, and found the breakdown into Top, Middle, and Bottom to be especially illuminating. Thanks.

Drew Foster

Hi Robert,

Have you done much refinement with the data set and do you have the same charts with the Median values instead of the Mean? I appreciate the insights and thank you for sharing…

http://www.bitcorati.com/ Ryan Charleston

Great analysis Robert! I’m curious if you had any stats on avg. monthly traffic and/or unique visitors to these eCommerce sites, particularly those ranked in the top 10,000?

http://webnexs.com/ Webnexs

Hello Robert.
You wrote a nice article. A properly planned marketing strategy will give your store a clear 50% increase in potential over the previous year. Hence it’s important to make sure you run a perfect marketing campaign. Webnexs Wcomm is 100% SEO friendly and can be optimized so that your store is indexed easily on popular search engines.
Thanks
Webnexs

Kaajal Baheti

Hi Robert,

Do you have a list of the company names behind the statement that “There are approximately 110,000 ecommerce websites generating revenue of meaningful scale on the internet”? Could you share that list?