Where Do Companies Like Alexa and Compete Get Their Data?

Alexa-Ranking. Image Ranking: rodolfogrimaldi

Alexa is one of the best-known Web analytics companies, and probably the best-known one that isn’t actually known for its analytics. The Amazon-operated company grew famous for its website rankings, which dominated the lower echelons of Internet marketing for years. Even today, you find people who care more about that ranking than they should.

Compete, meanwhile, is an amazing source of website information, showing you a lot of data about users, traffic, and ranking for various queries throughout the web. It’s not the most robust site analytics suite out there, but it has some data that can’t be found anywhere else, except perhaps Alexa itself.

The reason is the unique way both of these sites harvest their data.

Unique data sources
Both Alexa and Compete get a lot of their data from the same type of source, though the exact source differs between them. That source is a browser toolbar. Alexa’s data comes primarily from their Firefox toolbar, whereas Compete gets theirs from their own toolbar, as well as a handful of other sources.

There are two major drawbacks to this practice, and they tend to put the entire business into question if you look too closely:

The first drawback is user awareness. How many users of the Alexa toolbar, for example, understand that their browsing habits and site data are being sent to the Alexa servers? How many Compete users are aware that their data is being monitored? Sure, there’s disclosure here and there, in the fine print, but that’s always a shady practice, isn’t it? Companies, not just Alexa and Compete, have a long history of doing the bare minimum for disclosure, leaving many people simply unaware of what’s going on behind the scenes. Just look at every medication commercial with tiny fine print and a low, monotonous, sped-up voice-over.

The second drawback is data accuracy. Alexa was somewhat notorious for this with their Web rankings. When all of their data comes from users of a toolbar, they’re getting data from a biased set of users: toolbar users.

Data accuracy
Think about it. What does the Alexa toolbar provide? Alexa rank data, link data, access to the Internet Archive of a site, and some site analytics. In other words, marketing data. Who wants that data? Marketers, site owners, and SEO professionals. Who doesn’t? DIYers, stay-at-home parents, athletes, fishermen, and a nearly infinite list of other people.

This means the data ends up skewed toward the habits of the Web marketer. Sites that are frequently visited by such users end up with more data and better rankings than sites that aren’t.
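To make that skew concrete, here is a minimal sketch of how a toolbar-only panel can flip a ranking. Everything in it is an assumption made up for illustration: the two audience groups, their sizes, the toolbar install rates, the visit counts, and the two example sites. It is not how Alexa or Compete actually compute anything; it only shows the arithmetic of a biased sample.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    size: int             # people in the group (hypothetical)
    toolbar_rate: float   # fraction with the toolbar installed (hypothetical)
    visits: dict          # site -> visits per person per month (hypothetical)


def rank_sites(groups, panel_only):
    """Rank sites by total visits, counting everyone or only toolbar users."""
    totals = {}
    for g in groups:
        # In panel mode, only the toolbar-installing share of the group is seen.
        observed_people = g.size * (g.toolbar_rate if panel_only else 1.0)
        for site, per_person in g.visits.items():
            totals[site] = totals.get(site, 0.0) + observed_people * per_person
    return sorted(totals, key=totals.get, reverse=True)


groups = [
    # Marketers: few in number, but far more likely to install the toolbar.
    Group("marketers", 10_000, 0.20,
          {"seo-blog.example": 8, "fishing-forum.example": 1}),
    # Everyone else: a huge audience that almost never installs it.
    Group("everyone else", 990_000, 0.001,
          {"seo-blog.example": 0, "fishing-forum.example": 2}),
]

true_ranking = rank_sites(groups, panel_only=False)
panel_ranking = rank_sites(groups, panel_only=True)

print("True ranking: ", true_ranking)
print("Panel ranking:", panel_ranking)
```

With these made-up numbers, the fishing forum gets far more real traffic, yet the toolbar panel puts the SEO blog on top, simply because the people who install the toolbar are the people who read SEO blogs.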

Now, Compete takes data from several other sources and aggregates it, which makes them correspondingly less biased and more accurate. Alexa has even expanded their data sources in recent years, in response to the criticism they received on exactly this issue. Still, their data is limited compared to many other analytics suites.

Value retention
At the end of the day, though, both sites, inaccuracies and biases and all, are still valuable for marketers. They have to be, right? No business would have lasted 15 years in the online space if it didn’t provide some good value. This goes double for analytics, where every app, program, extension, and eBook seems to have analytics added on.

Other than the global Alexa rankings, which are a curiosity to users outside of the marketing circle, both services are used primarily by marketers. That’s fine. They’re services designed for marketers. There’s nothing inherently wrong with that, so long as they provide accurate data. That’s always the true test.

On the plus side, even if the businesses seem a little shady for minimizing disclosure and getting users to report data without their knowledge, it’s not all that open to abuse. Nothing personally identifiable is sent along with either toolbar; they just include basic browsing habit information. It’s not like Facebook, harvesting reams of information for every person on the site.

[Entrepreneur]