More Fun with Big Data

As Problogger noted, citing Define Media Group, Buzzfeed’s recent picture of traffic referral sources may be slightly skewed. Buzzfeed claims that Facebook generates nearly triple the traffic referrals that Google does. It’s an interesting statistic, but the methodology and data sources are opaque. The problem is compounded when publications such as Recode and The Atlantic propagate the data without verifying it.

Good vs. Evil, Facebook vs. Google, DMG vs. Buzzfeed

But could it even be possible? Facebook has 1.24 billion active users and Google has almost 12 billion monthly searches, so yeah, I guess it’s possible that highly active users post and refer more traffic. Again, I’m dubious: Buzzfeed, a player in the social arena, understandably wants to promote social media, since social media promotes their services.

Reading Recode’s original article about the Buzzfeed phenomenon, it’s hard to tell where the data comes from: “BuzzFeed’s pretty darn big, and its network has some 200 other sites in it, so while we’re not looking at all of the Web here, we’re at least looking at a good-sized chunk of it.” DMG adds more about the data sources, but not much: “According to BuzzFeed their data gathering is done via a tracking code across their network of sites of which ‘represent an audience of more than 300 million people globally.'”


via Define Media Group

Define Media Group, on the other hand, is a marketing firm that provides both search and social media marketing consulting. DMG is very explicit about its methodology and its data sources. Its data suggests almost the opposite relationship between search and social referrals. In my mind, transparent methodology and data sources certainly give DMG the upper hand here.

Hype and manipulated statistics have been around for a long time, but in the internet age they tend to go viral and make big waves.

Surfing and Wiping Out

In his notorious speech “The Golden Age of Bullshit,” in which he slammed new-school marketing pundits, Bob Hoffman brought up the Pepsi Refresh Project.

A few years ago, to much fanfare, Pepsi dropped its marketing campaign in favor of a complete shift to social media marketing. And, after 2010, corporate social media spending climbed 64% each year for several years running, according to stats I found at Hootsuite.

We’re clearly living in a new age, right? An age of conversation, engagement, and buzz?

According to Hoffman, one estimate has it that the Pepsi Refresh Project cost the company between $50 million and $100 million. The popular soft drink dropped from the second best-selling drink to third and lost 5% of its market share before slinking back to its former paid advertising practices.

The same research companies that had proclaimed the death of traditional advertising turned around and stated that social media was a “barely negligible source of sales.”

Hoffman cites Forrester Research, which just a few years earlier had foretold the beginning of a new age of social media marketing and “the end of the era of mass marketing.” Forrester later reversed its position, stating that email marketing was nearly forty times as effective as Facebook and Twitter combined.

What does this tell you about big data?

Big Data = Statistics

Big data is just statistics, only with more of them. It can be insightful and truthful, or it can be skewed and manipulative. Transparency in both methodology and data sources is vital if we are to make any useful sense of the statistics thrown our way. Publications such as The Atlantic and Recode (not to mention anyone else wielding statistics) have a responsibility to do some fact-checking and verification before propagating such big bad data.
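To make the "skewed" part concrete, here is a minimal Python sketch of how the mix of sites in a sample can flip a Facebook-to-Google referral ratio. Every number and name in it is hypothetical and made up for illustration; none of it comes from Buzzfeed, DMG, or any real network, and it is not a reconstruction of either company's methodology.

```python
# Hypothetical numbers for illustration only; none come from Buzzfeed or DMG.

def referral_ratio(sites):
    """Return total Facebook referrals divided by total Google referrals."""
    facebook = sum(site["facebook"] for site in sites)
    google = sum(site["google"] for site in sites)
    return facebook / google

# Made-up monthly referral counts for two kinds of site.
social_heavy = {"facebook": 900, "google": 300}  # e.g. a viral-content site
search_heavy = {"facebook": 200, "google": 800}  # e.g. a reference site

# Two samples of ten sites each, drawn from the same two site types.
network_a = [social_heavy] * 8 + [search_heavy] * 2  # mostly viral-content sites
network_b = [social_heavy] * 2 + [search_heavy] * 8  # mostly reference sites

ratio_a = referral_ratio(network_a)
ratio_b = referral_ratio(network_b)

print(f"Network A Facebook/Google ratio: {ratio_a:.2f}")
print(f"Network B Facebook/Google ratio: {ratio_b:.2f}")
```

The same two site types produce a ratio well above 1 in one sample and well below 1 in the other, purely because of which sites were counted. Without knowing what is in the sample, a headline ratio tells you very little.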

If I had to pick one data set out of the two mentioned above, it would be DMG’s, because they are open about their methodology and statistics. With Buzzfeed’s info, we literally just have a picture, with no insight into the methodology or numbers behind it, just as with Google Trends.