If the Data Looks too Amazing to Be True…
By Tim Wilson in Analysis, Twitter
I’ve hauled out this same anecdote off and on for the past decade:
Back in the early aughts [I’m not Canadian, but I know a few of ’em], I was the business owner of the web analytics tool for a high tech B2B company. We were running Netgenesis (remember Netgenesis? I still have nightmares), which was a log file analysis tool that generated 100 or so reports each month and published them as static HTML pages. It took a week for all of the reports to process and publish, but, once published, they were available to anyone in the company via a web interface. One of the product marcoms walked past my cubicle one day early in the month, then stopped, backed up, and stuck his head in: “Did you see what happened to traffic to <the most visited page on our site other than the home page> last month?” I indicated I had not. We pulled up the appropriate report, and he pointed to a step function in the traffic that had occurred mid-month — traffic had jumped 3X and stayed there for the remainder of the month.
“I made a couple of changes to the metadata on the page earlier in the month. This really shows how critical SEO is! I shared it with the weekly product marketing meeting [which the VP of Marketing attended most weeks].”
I got a sinking feeling in my stomach, told him I wanted to look into it a little bit, and sent him on his way. I then pulled up the ad hoc analysis tool, started doing some digging, and quickly discovered that a pretty suspicious-looking user-agent seemed to be driving an enormous amount of traffic. It turned out that Gomez was trying to sell into the company and had just set up their agent to ping that page so they could get some ‘real’ data for an upcoming sales demo. Since it was a log-file-based tool, and since the Gomez user-agent wasn’t one that we were filtering out, that traffic looked like normal, human-based traffic. With the traffic from that user-agent filtered out, the actual overall visits to the page showed no perceptible change. I explained this to the product marcom, and he then had to do some backtracking on his claims of a wild SEO success (which he had continued to make in the course of the few hours since we’d first chatted and I’d cautioned him that I was skeptical of the data). The moral of the story: If the data looks too dramatic to be true, it probably is!
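For anyone who wants to try that kind of check on their own log data, here is a rough sketch in Python. It is not what the ad hoc tool in the story did internally; the record layout, the `ExampleMonitorBot/1.0` agent string, the `/product` path, and the 50% share threshold are all made up for illustration. The idea is simply to see whether a single user-agent accounts for an outsized share of hits to a page, and then to recount the daily traffic with that agent excluded.

```python
from collections import Counter

def hits_by_agent(records, path):
    """Count requests to `path` per user-agent string."""
    return Counter(agent for _, agent, p in records if p == path)

def dominant_agents(records, path, share_threshold=0.5):
    """Return user-agents that account for more than `share_threshold` of hits to `path`."""
    counts = hits_by_agent(records, path)
    total = sum(counts.values())
    if total == 0:
        return []
    return [agent for agent, n in counts.items() if n / total > share_threshold]

def daily_hits(records, path, exclude_agents=()):
    """Count requests to `path` per day, skipping any excluded user-agents."""
    excluded = set(exclude_agents)
    return Counter(
        day for day, agent, p in records if p == path and agent not in excluded
    )

# Invented sample data: (day_of_month, user_agent, path).
# Ordinary browser traffic is flat at 10 hits a day; a monitoring agent
# adds 20 hits a day from the 15th onward, which looks like a 3X jump.
records = (
    [(d, "Mozilla/5.0 (browser)", "/product") for d in range(1, 31) for _ in range(10)]
    + [(d, "ExampleMonitorBot/1.0", "/product") for d in range(15, 31) for _ in range(20)]
)

suspects = dominant_agents(records, "/product")
raw = daily_hits(records, "/product")
clean = daily_hits(records, "/product", exclude_agents=suspects)

print("Suspicious agents:", suspects)
print("Day 20, raw vs. filtered:", raw[20], clean[20])
```

Real browser traffic is spread across many different user-agent strings, so one string carrying most of a page's hits is a hint worth chasing, not proof of anything. A person still has to look at the flagged agent and decide whether it is a robot.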
This anecdote is an example of The Myth of the Step Function (planned to be covered in more detail in Chapter 10 of the book I’ll likely never get around to writing) — the unrealistic expectation that analytics can regularly deliver deep and powerful insights that lead to immediate and drastic business impact. And, the corollary to that myth is the irrational acceptance of data that shows such a step function.
Any time I do training or a presentation on measurement and analytics, I touch on this topic. In an agency environment, I want our client managers and strategists to be comfortable with web analytics and social media analytics data. I even want them to be comfortable exploring the data on their own, when it makes sense. But (or, really, it’s more like “BUT”), I implore them, if they see anything that really surprises them, to seek out an analyst to review the data before sharing it with the client. More often than not, the “surprise” will turn out to be one of two things:
- A misunderstanding of the data
- A data integrity issue
All of this is to say, I know this stuff. I have had multiple experiences where someone has jumped to a wholly erroneous conclusion when looking at data that they did not understand or that was simply bad data. I’d even go so far as to say it’s one of my Top Five Pieces of Personal Data Wisdom!
And yet, when I did a quick and simple data pull from an online listening tool last week, I had only the slightest of pauses before jumping to a conclusion that was patently erroneous.
Maybe it’s good to get burned every so often. And, I’m much happier to be burned by a frivolous data analysis shared with the web analytics community than to be burned by a data analysis for a paying client. It’s tedious to do data checks (it’s right up there with proofreading blog posts!), and it’s human nature to want to race to the top of the roof and start hollering when a truly unexpected result (or a more-dramatically-than-expected affirming result) comes out of an analysis.
For me, though, this was a good reminder that taking a breath, slowing down, and validating the data is an unskippable step.