One more reason why you CAN’T just start with the data
By Tim Wilson in Analysis, Data Management, Metrics
My boss mentioned Parkinson’s Law to me this morning in reference to a discussion we were having about sales and marketing process efficiency. I was familiar with the concept, but not with the actual law. If you didn’t follow the link, and you don’t know what it is, it’s the principle that “work expands so as to fill the time available for its completion.” This is so true in the business world that it’s, well, kinda sad.
The part of the write-up that jumped out at me, though, was the statement that, “It has been observed over the last 10 years that the memory usage of evolving systems tends to double roughly once every 18 months.” Poor form on the passive voice usage, but that’s a tangent that is not related to this post (or this blog at all, for that matter). I need to do some digging to find the source of this stat. It sounds right, but I did some digging several years ago for this sort of information, and I didn’t find this. What I did find were two different studies by Gartner — performed several years apart — that predicted that there would be a 30x increase in the total volume of enterprise data in the next seven years (I think the studies were done five years apart, and both had a similar projection). I have a clipping somewhere with one of the studies, but it’s in a box en route to Ohio, so I can’t nail the specifics.
These two estimates are so eerily similar that they sort of smell like they came from the same study. Doubling every 18 months would mean you had a 32x increase (2^5) in 7.5 years.
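For anyone who wants to check that arithmetic, here is a quick back-of-the-envelope sketch in Python. The function names are mine, made up purely for illustration, and the only inputs are the numbers quoted above (an 18-month doubling period and a 30x increase over seven years). It assumes smooth, constant exponential growth, which real data volumes are under no obligation to follow.

```python
import math

def growth_factor(years, doubling_months=18.0):
    """Total growth multiple after `years` if volume doubles every `doubling_months`."""
    doublings = years * 12.0 / doubling_months
    return 2.0 ** doublings

def implied_doubling_months(factor, years):
    """Doubling period (in months) implied by growing `factor`-fold over `years`."""
    return years * 12.0 / math.log2(factor)

# Doubling every 18 months for 7.5 years is exactly 5 doublings.
print(growth_factor(7.5))

# The same doubling rate over the 7 years in the Gartner-style projection.
print(round(growth_factor(7), 1))

# Doubling period that a 30x increase over 7 years would imply.
print(round(implied_doubling_months(30, 7), 1))
```

Under those assumptions, seven years of 18-month doublings gives roughly a 25x increase, and a 30x increase in seven years works out to a doubling period of a little over 17 months. Close enough that the two estimates could plausibly be the same number in different clothes.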
As usual, I’m spending way too damn long on the preamble and not getting to the point, which is this:
Rewind seven years and let’s come up with a hypothetical situation whereby you have just started in a new position. In order to get the lay of the land and figure out what you should do first, you ask for a dump of all data that could possibly be related to your domain of responsibility. For chuckles, let’s say that came out to 3 pages of raw data (not realistic, but making it ridiculously small still supports my point). So, you could take that data, print it out, spread it out on your desk, and pore over it for a couple of hours. Make it a day. You could become so intimate with that data that you would feel like plopping back on a big fluffy pillow and smoking a cigarette. If you did plop back on a pillow and take a drag on a smoke, you could then stare up at the ceiling and wait for your brain to work its magic. If there were any interesting, useful insights in that data, your brain would likely find them (assuming your boss doesn’t interrupt your thoughts and want to know: 1) what you’re doing with a big fluffy pillow in your office, or 2) why you’re smoking). That’s one of those really cool things about the brain.
So, in that case, you could start with the data: “Give me the data, I’ll ‘analyze’ it, and then I’ll figure out what action I should take.”
Fast forward seven years. Same situation. Except, there’s been a 30x increase in what you get when you ask for “all the data that could possibly be relevant.” That’s 90 pages of data. Your brain isn’t going to be able to work its magic with that. You could spend 3 weeks looking at the data without feeling like you truly had your head wrapped around it. What most people would do with 90 pages of data would be to start charting it. A picture is worth 1,000 words, right? That’s one way to get 90 pages of data summarized into something that the brain might be able to handle. Of course, with 90 pages of data, you could produce 900 pages of graphs. Obviously, you would have to pick and choose what you would graph and how. Then, you would keep generating one graph at a time until you saw something that showed either an “interesting” trend or a spike somewhere. At that point, you would be so relieved that you had found something that you would quickly copy the chart and paste it into PowerPoint so you could show it to a group in a meeting and prove that you were, by golly, doing stuff (um…see Parkinson’s Law!).
If asked by an anal BI-oriented stickler, “Did you take action on the data?” you would respond, “Absolutely! I charted it, put it in PowerPoint, and showed it in a meeting, where everyone agreed that it was interesting!”