Book Review Part 1 (of ?): Super Crunchers
By Tim Wilson
I promised in an earlier post that I would read Ian Ayres’s new book, Super Crunchers, and it’s taken me longer to get to it than I’d planned. And, while it’s a fairly easy read, it’s not getting a nightly hit on my nightstand because it tends to get me worked up.
I’m fully aware that I got the book with preconceived notions that it might set me off. I’m doing my darnedest, though, to be objective in my observations about it. I haven’t finished it yet, but I’m going to lose track of the different points I want to make if I don’t start recording them.
My fundamental beef with the book is that it picks and pulls anecdotes that paint a picture of how easy it is and how common it is becoming to do super crunching. So far, it has done very little to articulate all the ways that super crunching does not or cannot work…either “yet” or “ever.” Much more on that to come. To me, it reads mostly like an effort to jump on the Freakonomics bandwagon. Freakonomics was original and interesting. Super Crunchers is a poor, poor follow-on.
Ayres does reference some great books in his opening pages. James Surowiecki’s The Wisdom of Crowds is a must-read. And I thoroughly enjoyed Moneyball (although I’m an admitted college baseball junkie). I’m ambivalent about Freakonomics, but generally feel positive about it. It may be that it just seems to play too much to the mass market, and I like to think I’m higher brow than that. 😉
Ayres views the world with a heavy, heavy academic bias. As such, he thinks data is cleaner than it is, people are easier to change than they are, and time machines exist. I’ll tackle the last one first. Ayres presents super crunching as an easy, two-part formula: 1) run regressions (just plug the data into the software and press “Go” is what it sounds like), and 2) run A/B tests, which are a no-brainer (and just occur in the world naturally if you look for them).
Here’s where the time machine comes in. On page 69, he expounds on how politicians are using super crunching. He then suggests how they could use it even more:
“Want to know how negative radio ads will influence turnout of both your and your opponent’s voters? Run the ads at random in some cities and not in others.”
Stop and think about that one for a minute. Clearly, we’re talking about a statewide or federal election. You’re on a 2-, 4-, or 6-year election cycle. You run negative ads in some cities and not others. The election happens. You lose. You assess voter turnout. Great. Now what? The election is OVER!
Earlier in the same paragraph, Ayres started at a more reasonable level by discussing political scientists. Extend that to political consultants, and you’ve got some validity in super crunching. The RNC and DNC could probably learn a lot by exerting some influence and doing some randomized testing on this sort of thing with all of their candidates and then applying that knowledge in the next election (they will get pushback from their candidates, who are much more concerned about winning their current election than they are about providing data for the common good). But Ayres very explicitly dives down to a prescription for one candidate’s strategy in one campaign against one opponent. And that just doesn’t work.
Why does this irk me so much? Because the casual reader might not realize what a joke that statement is. And, thus, might walk away from the book with a vague notion that super crunching can be applied to every situation with ease!
A more general criticism is that Ayres picks mammoth, established companies that have massive data sets to work from for most of his examples. When I worked at a $600 million company, which was by no means mammoth, but still generated a lot of data, we frequently found that we didn’t have enough data to answer the most burning questions with any statistical validity. Why? Because: 1) the real world is very, very noisy, and it’s really hard to control for all of the likely independent variables (in Ayres’s election example, there was no mention of the candidates’ issues and how those may resonate differently in different cities regardless of what ads were run), 2) businesses operate in the fourth dimension (time): if they have a six-month sales cycle and they want to run a test of how something early in the cycle influences events late in the sales cycle, they can be in a pickle when it comes to actually capturing enough data to run a legitimate regression, and 3) the data is NEVER as clean as folks in academia, or anyone else who hasn’t really sat down and understood the processes, think it is.
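To put rough numbers on the “not enough data” problem, here is a minimal sketch of the standard normal-approximation sample-size formula for an A/B test that compares two rates. Everything specific in it is an assumption picked for illustration: the 10% and 12% close rates, the 5% significance level, the 80% power target, and the function name. None of it comes from the company described above.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p_control, p_variant, alpha=0.05, power=0.80):
    """Approximate observations needed in EACH group of a two-sided
    A/B test comparing two proportions (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # cutoff for the significance level
    z_power = NormalDist().inv_cdf(power)          # cutoff for the desired power
    # Sum of the Bernoulli variances of the two groups
    variance = p_control * (1 - p_control) + p_variant * (1 - p_variant)
    effect = p_variant - p_control                 # the lift we hope to detect
    return ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothetical scenario: detect a lift in close rate from 10% to 12%.
needed = sample_size_per_group(0.10, 0.12)
print(needed)
```

With these assumed inputs the answer comes out a bit under 4,000 leads per group. If the outcome is only known at the end of a six-month sales cycle, that is a long wait before a single test can be read, and a smaller lift pushes the number up fast.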
I’ll jump ahead to Ayres’s discussion of evidence-based medicine. He does let in a sliver of “it ain’t always a slam dunk” when he discusses (pp. 90 and 91) a study that linked excessive caffeine consumption to increased risk of heart disease…without controlling for the fact that there was a correlation between caffeine consumption and the likelihood that the subject smoked. That reminded me of a huge commercial super crunching gaffe that used A/B testing and regression heavily…but that bit one of the top consumer brands of all time in the butt: Coca-Cola.
Coke doesn’t show up in Ayres’s index, so I’m assuming I’m not going to stumble across the New Coke fiasco. Two different books have given reasons for that flop: in Stumbling on Happiness (I’m almost positive that was the book, but I read Blink at the same time and there was a lot of overlap in the concepts), Daniel Gilbert made the point that the test was flawed. Coke assumed that a single sip in a blind taste test would be a valid indicator. It turns out that your first sip of a soft drink gives you one experience, while the last 80% of the drink gives you another experience — after your taste buds / brain have adjusted to the “new” experience. In Waiting for Your Cat to Bark, the Eisenberg brothers attributed the failure more to the fact that the Coke flavor was sufficiently ingrained in the soul of Coke drinkers that, by golly, they didn’t want that flavor to change! Either way (and it was likely a combination of both) this was a super crunching disaster!
Back to evidence-based medicine. There is some absolutely fascinating stuff in this chapter. And there is no refuting that the medical profession is wildly behind the times when it comes to adopting information technology. And Ayres outlines some pretty interesting initiatives on that front that I genuinely hope succeed. But, again, there were some alarming statements in the chapter that illustrated Ayres’s academic naivete. He plugs in a quote from Dr. Joseph Britto about a feature under development in his (very cool) Isabel system that will enable doctors to enter their notes in a patient’s medical record through a “structured flexible set of input fields to capture essential data.” Britto is quoted as saying, “If you have structured fields, you then force a physician to go through them and therefore the data that you get are much richer than had you left him on his own to write case notes, where they tend to be brief.”
TWO big beefs with this line of thought.
Beef No. 1: I HAVE gone through many situations where someone who wants to use the data (and, therefore, wants data that is comprehensive and complete) makes a massive logical leap that the answer is simply to make the system “force” the people who are inputting the data to provide more data and to provide it in a way that tees it up for valid, straightforward analysis. Life. Just. Don’t. Work. That. WAY! The only time I have seen this work is when immediate, direct value to the person inputting the data is realized by the change. “Do it this way, and you will feed a much larger aggregate data set that you will ultimately benefit from” simply does not work. “Required fields” don’t work. If you introduce a process that takes as little as 15 seconds longer than the old process, those required fields quickly become populated with a single character or the first value on the list (for structured inputs). Taking a top-down approach and auditing and enforcing better data input requires a Herculean effort, but it’s doable. What concerns me is that neither Britto nor Ayres acknowledges that it’s not as simple as a system deployment. The data analyst’s approach is going to be towards capturing everything about the patient, regardless of whether it seems relevant. That way, data mining may find things that no one thought were relevant but turned out to be. Unfortunately, 95% of that “comprehensive” data is not going to be relevant, and there’s no way to identify where that 95% is. Britto and Ayres want to put the burden of capturing all of that data on the doctors, which is simply not practical.
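One cheap way to see whether “required” fields are being gamed is to measure how often they hold throwaway values. What follows is a minimal sketch, not a description of any real system: the helper name `junk_rate`, the field contents, and the dropdown options are all made up for illustration. The two rules it applies (a single character, or the first value on the list) are simply the failure modes described above.

```python
def junk_rate(values, first_option=None, min_length=2):
    """Share of entries that look like they were typed or picked
    only to get past a required field."""
    def looks_like_junk(value):
        text = value.strip()
        if len(text) < min_length:
            return True  # a single character such as "x" or "."
        # Picking the first dropdown option is only a warning sign:
        # some of those entries will be legitimate.
        return first_option is not None and text == first_option

    if not values:
        return 0.0
    return sum(1 for v in values if looks_like_junk(v)) / len(values)


# Hypothetical free-text entries in a required notes field
notes = ["x", "hip pain, 3 days, worse at night", ".", "-", "fever and rash"]
# Hypothetical required dropdown whose first listed option is "Abdominal"
categories = ["Abdominal", "Abdominal", "Orthopedic", "Abdominal", "Abdominal"]

notes_junk = junk_rate(notes)
categories_junk = junk_rate(categories, first_option="Abdominal")
print(notes_junk, categories_junk)
```

A number like this only tells you that the field is being gamed. It does nothing to recover what the person would have entered if the field had been worth their time.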
My current company is actually experimenting with a tool on the CRM side that acknowledges this challenge. ShadeTree Technology is a plug-in for Salesforce.com that, at the end of the day, is trying to drive more comprehensive, more consistent, cleaner CRM data. BUT, their approach assumes that the only way that will work is if the tool provides immediate, direct benefit to the people who are using it. We are early on in our implementation of the tool, but the fact that ShadeTree understands this concept is a key reason that we are implementing it.
Beef No. 2: I don’t have a medical background, but I have three kids. Our oldest wound up in the hospital for a week when he was six years old. He is a very bright and articulate kid. What landed him in the hospital was a pain that was somewhere in his hip/upper thigh area, deep under the skin. X-rays, bone scans, and CAT scans weren’t able to pinpoint the location of the issue. It took an MRI to find a small abscess. And it took a battery of tests to figure out that he’d gotten salmonella in his bloodstream that ultimately caused the abscess. How is this relevant? Because we had a helluva time trying to understand exactly what his symptoms were. “Upper thigh” is different from “hip.” Even with a willing, articulate kid, identifying “pain in the bone” vs. “pain in the muscle” was damn near impossible. Humans are not numbers and data. “Pain” is subjective. “Small brown spots on the skin” has numerous subjective elements (another example from this chapter). Ayres wildly glosses over the challenges here, and that bothers me.
To be clear, I’m not saying that none of these initiatives will work. And, for all I know, Britto actually is much more sensitive to the issues than Ayres makes him sound. What sticks in my craw is that Ayres continually presents the world as a place where clean, comprehensive, easily analyzed data already exists or where it’s going to be relatively easy to collect it. That is painting an unrealistic picture. It may sell books, but it’s not all that actionable and, I claim, will actually hinder serious readers’ ability to improve their own usage of data.
More to come in Part 2…