Gilligan’s Unified Theory of Analytics (Requests)


The bane of many analysts’ existence is that they find themselves in a world where the majority of their day is spent on the receiving end of a steady flow of vague, unfocused, and misguided requests:

“I don’t know what I don’t know, so can you just analyze the traffic to the site and summarize your insights?”

“Can I get a weekly report showing top pages?”

“I need a report from Google Analytics that tells me the gender breakdown for the site.”

“Can you break down all of our metrics by: new vs. returning visitors, weekend vs. weekday visitors, working hours vs. non-working hours visitors, and affiliate vs. display vs. paid search vs. organic search vs. email visitors? I think there might be something interesting there.”

“Can you do an analysis that tells me why the numbers I looked at were worse this month than last?”

“Can you pull some data to prove that we need to add cross-selling to our cart?”

“We rolled out a new campaign last week. Can you do some analysis to show the ROI we delivered with it?”

“What was traffic last month?”

“I need to get a weekly report with all of the data so I can do an analysis each week to find insights.”

The list goes on and on. And, in various ways, they’re all examples of well-intended requests that lead us down the Nefarious Path to Reporting Monkeydom. It’s not that the requests are inherently bad. The issue is that, while they are simple to state, they often lack context and clarity as to what value fulfilling them will actually deliver. That leads to the analyst spending time on requests that never should have been worked on at all, making risky assumptions about the underlying need, and over-analyzing in an effort to cover all possible bases.

I’ve given this a lot of thought for a lot of years (I’m not exaggerating — see the first real post I wrote on this blog almost six years ago…and then look at the number of navel-gazing pingbacks to it in the comments). And, I’ve become increasingly convinced that there are two root causes for not-good requests being lobbed to the analytics team:

  • A misperception that “getting the data” is the first step in any analysis — a belief that surprising and actionable insights will pretty much emerge automagically once the raw data is obtained.
  • A lack of clarity on the different types and purposes of analytics requests — this is an education issue (and an education that has to be 80% “show” and 20% “tell”).

I think I’m getting close to some useful ways to address both of these issues in a consistent, process-driven way (meaning analysts spend more time applying their brainpower to delivering business value!).

Before You Say I’m Missing the Point Entirely…

The content in this post is, I hope, what this blog has apparently gotten a reputation for — it’s aimed at articulating ideas and thoughts that are directly applicable in practice. So, I’m not going to touch on any of the truths (which are true!) that are more philosophical than directly actionable:

  • Analysts need to build strong partnerships with their business stakeholders
  • Analysts have to focus on delivering business value rather than just delivering analysis
  • Analysts have to stop “presenting data” and, instead, “effectively communicate actionable data-informed stories.”

All of these are 100% true! But, that’s a focus on how the analyst should develop their own skills, and this post is more of a process-oriented one.

With that, I’ll move on to the three types of analytics requests.

Hypothesis Testing: High Value and SEXY!

Hands-down, testing and validation of hypotheses is the sexiest and, if done well, highest value way for an analyst to contribute to their organization. Any analysis — regardless of whether it uses A/B or multivariate testing, web analytics, voice of the customer data, or even secondary research — is most effective when it is framed as an effort to disprove or fail to disprove a specific hypothesis. This is actually a topic I’m going to go into a lot of detail (with templates and tools) on during one of the eMetrics San Francisco sessions I’m presenting in a couple of weeks.

The bitch when it comes to getting really good hypotheses is that “hypothesis” is not a word that marketers jump up and down with excitement over. Here’s how I’m starting to work around that: by asking business users to frame their testing and analysis requests in two parts:

Part 1: “I believe…[some idea]”

Part 2: “If I am right, we will…[take some action]”

This construct does a couple of things:

  • It forces some clarity around the idea or question. Even if the requestor says, “Look. I really have NO IDEA if it’s ‘A’ or ‘B’!” you can respond with, “It doesn’t really matter. Pick one and articulate what you will do if that one is true. If you wouldn’t do anything different if that one is true, then pick the other one.”
  • It forces a little bit of thought on the part of the requestor as to the actionability of the analysis.

And…it does this in plain, non-scary English.

So, great. It’s a hypothesis. But, how do you decide which hypotheses to tackle first? Prioritization is messy. It always is and it always will be. Rather than falling back on the simplistic theory of “effort and expected impact” for the analysis, how about tackling it with a bit more sophistication:

  • What is the best approach to testing this hypothesis (web analytics, social media analysis, A/B testing, site survey data analysis, usability testing, …)? That will inform who in your organization would be best suited to conduct the analysis, and it will inform the level of effort required. 
  • What is the likelihood that the hypothesis will be shown to be true? Frankly, if someone is on a fishing expedition and has a hypothesis that making the background of the home page flash in contrasting colors will somehow help…common sense would say, “That’s a dumb idea. Maybe we don’t need to prove it if we have hypotheses that our experience says are probably better ones to validate.”
  • What is the likelihood that we actually will take action if we validate the hypothesis? You’ve got a great hypothesis about shortening the length of your registration form…but the registration system is so ancient and fragile that any time a developer even tries to check the code out to work on it, the production code breaks. Or…political winds are blowing such that, even if you prove that always having an intrusive splash page pop up when someone comes to your home page is hurting the site…it’s not going to change.
  • What will be the effort (time and resources) to validate the hypothesis? Now, you damn well better have nailed down a basic approach before answering this. But, if it’s going to take an hour to test the hypothesis, even if it’s a bit of a flier, it may be worth doing. If it’s going to take 40 hours, it might not be.
  • What is the business value if this hypothesis gets validated (and acted upon)? This is the “impact” one, but I like “value” over “impact” because it’s a little looser.

I’ve had good results when taking criteria along these lines and building a simple scoring system — assigning High, Medium, Low, or Unknown for each one, and then plugging in a weighted score for each value of each criterion. The formula won’t automatically prioritize the hypotheses, but it does give you a list that is sortable in a logical way. It, at least, reveals the “top candidates” and the “stinkers.”
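
To make the mechanics concrete, here is a minimal sketch of that kind of scoring. The criteria, weights, ratings, and example hypotheses below are all placeholders I made up for illustration, not a prescription; swap in whatever your organization actually agrees on.

```python
# A minimal sketch of a weighted hypothesis-scoring system.
# Criteria, weights, and ratings are hypothetical placeholders.

RATING_SCORES = {"High": 3, "Medium": 2, "Low": 1, "Unknown": 0}

# Weight each criterion by how much it should influence priority.
CRITERIA_WEIGHTS = {
    "likelihood_true": 1.0,
    "likelihood_of_action": 2.0,
    "effort": 1.5,          # rated so that "High" means LOW effort
    "business_value": 2.5,
}

def score_hypothesis(ratings):
    """Turn High/Medium/Low/Unknown ratings into a single sortable score."""
    return sum(
        CRITERIA_WEIGHTS[criterion] * RATING_SCORES[rating]
        for criterion, rating in ratings.items()
    )

hypotheses = {
    "Shorten the registration form": {
        "likelihood_true": "High", "likelihood_of_action": "Low",
        "effort": "Medium", "business_value": "High",
    },
    "Flashing home page background": {
        "likelihood_true": "Low", "likelihood_of_action": "Low",
        "effort": "High", "business_value": "Unknown",
    },
}

# Sort descending: the "top candidates" float up, the "stinkers" sink.
for name, ratings in sorted(hypotheses.items(),
                            key=lambda kv: score_hypothesis(kv[1]),
                            reverse=True):
    print(f"{score_hypothesis(ratings):5.1f}  {name}")
```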

Performance Measurement (think “Reporting”)

Analysts can provide a lot of value by setting up automated (or near-automated) performance measurement dashboards and reports. These are recurring (hypothesis testing is not — once you test a hypothesis, you don’t need to keep retesting it unless you make a change that warrants retesting it).

Any recurring report* should be goal- and KPI-oriented. KPIs and some basic contextual/supporting metrics should go on the dashboard, and targets need to be set (and set up such that alerts are triggered when a KPI slips). Figuring out what should go on a well-designed dashboard comes down to answering two questions:

  1. What are we trying to achieve? (What are our business goals for this thing we will be reporting on?)
  2. How will we know that we’re doing that? (What are our KPIs?)

They need to get asked and answered in order, and that’s often a messier exercise than we’d like it to be. Analysts can play a strong role in getting these questions appropriately answered…but that’s a topic for another time.
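
On the “targets with alerts” point above, here is a minimal sketch of what an automated KPI check could look like. The KPI names, targets, and the 90% threshold are invented for the example; the point is simply that a slipping KPI should announce itself rather than wait to be noticed.

```python
# A minimal sketch of a KPI-vs-target check that could run on a schedule.
# The KPIs, targets, and alert threshold are hypothetical examples.

KPI_TARGETS = {
    "conversion_rate": 0.025,   # 2.5% target
    "revenue_per_visit": 1.80,
}

ALERT_THRESHOLD = 0.90  # alert if a KPI falls below 90% of its target

def check_kpis(current_values):
    """Return alert messages for any KPI slipping below its target."""
    alerts = []
    for kpi, target in KPI_TARGETS.items():
        actual = current_values.get(kpi)
        if actual is None:
            continue  # no data this period; skip rather than alert
        if actual < target * ALERT_THRESHOLD:
            alerts.append(
                f"ALERT: {kpi} is {actual:.3f}, below {ALERT_THRESHOLD:.0%} "
                f"of its {target:.3f} target"
            )
    return alerts

# Example run with this week's (made-up) numbers:
print(check_kpis({"conversion_rate": 0.021, "revenue_per_visit": 1.95}))
```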

Every other recurring report that is requested should be linkable back to a dashboard (“I have KPIs for my paid search performance, so I’d like to always get a list of the keywords and their individual performance so I have that as a quick reference if a KPI changes drastically.”)

Having said that, a lot of tools can be set up to automatically spit out all sorts of data on a recurring basis. I resist the temptation to say, “Hey…if it’s only going to take me 5 minutes to set it up, I shouldn’t waste my time trying to validate its value.” But, it can be hard not to appear obstructionist in those situations, so, sometimes, the fastest route is the best. Even if, deep down, you know you’re delivering something that will get looked at the first 2-3 times it goes out…and will never be viewed again.

Quick Data Requests — Very Risky Territory (but needed)

So, what’s left? That would be requests of the “What was traffic to the site last month?” ilk. There’s a gross misperception when it comes to “quick” requests that there is a strong correlation between the amount of time required to make the request and the amount of time required to fulfill it. Whenever someone tells me they have a “quick question,” I playfully warn them that the length of the question tends to be inversely correlated with the time and effort required to provide an answer.

Here’s something I’ve only loosely tested when it comes to these sorts of requests. But, it looks like I’m going to be embarking on a journey to formalize the intake and management of these in the very near future, so I’m going to go ahead and write my thoughts down here (please leave a comment with feedback!).

First, there is how the request should be structured — the information I try to grab as the request comes in:

  • The basics — who is making the request and when the data is needed; you can even include a “priority” field…the rest of the request info should help vet whether that priority is accurate.
  • A brief (255 characters or so) articulation of the request — if it can’t be articulated briefly, it probably falls into one of the other two categories above. OR…it’s actually a dozen “quick requests” trying to be lumped together into a single one. (Wag your finger. Say “Tsk, tsk!”)
  • An identification of what the request will be used for — there are basically three options, and, behind the scenes, those options are an indication as to the value and priority of the request:
    • General information — Low Value (“I’m curious,” “It would be interesting — but not necessarily actionable — to know…”)
    • To aid with hypothesis development — Medium Value (“I have an idea about SEO-driven visitors who reach our shopping cart, but I want to know how many visits fall into that segment before I flesh it out.”)
    • To make a specific decision — High Value
  • The timeframe to be included in the data — it’s funny how often requests come in that want some simple metric…but don’t say for when!
  • The actual data details — this can be a longer field; ideally, it would be in “dimensions and metrics” terminology…but that’s a bit much to ask for many requestors to understand.
  • Desired delivery format — a multi-select with several options:
    • Raw data in Excel
    • Visualized summary in Excel
    • Presentation-ready slides
    • Documentation on how to self-service similar data pulls in the future

The more options selected for the delivery format, obviously, the higher the effort required to fulfill the request.

All of this information can be collected with a pretty simple, clean, non-intimidating intake form. The goal isn’t to make it hard to make requests, but there is some value in forcing a little bit of thought rather than the requestor being able to simply dash off a quickly-written email and then wait for the analyst to fill in the many blanks in the request.
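
For what it’s worth, here is one hypothetical way the request record behind that form could be structured. The field names and option lists simply mirror the bullets above; none of this is meant as a prescribed schema.

```python
# A minimal sketch of the quick-data-request intake record described above.
# Field names and option lists are illustrative, not a prescribed schema.

from dataclasses import dataclass, field
from datetime import date
from typing import List

USAGE_OPTIONS = {
    "general information": "Low",        # "I'm curious"
    "hypothesis development": "Medium",
    "specific decision": "High",
}

DELIVERY_OPTIONS = [
    "Raw data in Excel",
    "Visualized summary in Excel",
    "Presentation-ready slides",
    "Self-service documentation",
]

@dataclass
class QuickDataRequest:
    requestor: str
    needed_by: date
    description: str                     # keep it to ~255 characters
    usage: str                           # one of USAGE_OPTIONS
    timeframe: str                       # e.g. "March 1 - March 31"
    data_details: str                    # ideally dimensions and metrics
    delivery_formats: List[str] = field(default_factory=list)

    @property
    def value(self) -> str:
        """Behind-the-scenes value/priority implied by the stated usage."""
        return USAGE_OPTIONS.get(self.usage, "Unknown")

# Example (made-up) request:
req = QuickDataRequest(
    requestor="Jane in Marketing",
    needed_by=date(2013, 4, 15),
    description="Visits from organic search that reached the cart",
    usage="hypothesis development",
    timeframe="March 2013",
    data_details="Sessions by channel, filtered to cart pageviews",
    delivery_formats=["Raw data in Excel"],
)
print(req.value)  # prints "Medium"
```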

But that’s just the first step.

The next step is to actually assess the request. This is the sort of thing that, generally, an analyst needs to do, and it covers two main areas:

  • Is the request clear? If not, then some follow-up with the requestor is required (ideally through a system that allows this to happen as comments or a discussion linked to the original request — Jira, SharePoint, Lotus Notes, etc.)
  • What will the effort be to pull the data? This can be a simple High/Medium/Low with hours ranges assigned as they make sense to each classification.

At that point, there is still some level of traffic management required: SLAs based on priority and effort, perhaps, and a part of the organization oriented to cranking out those requests as efficiently as possible.
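
As a tiny illustration of what “SLAs based on priority and effort” could boil down to, here is a sketch of a simple lookup. The turnaround numbers are placeholders I invented, not a recommendation.

```python
# A minimal sketch of SLA assignment by request value and effort.
# The turnaround times (in business days) are placeholder values.

SLA_DAYS = {
    # (value, effort): turnaround in business days
    ("High", "Low"): 1,
    ("High", "Medium"): 3,
    ("High", "High"): 5,
    ("Medium", "Low"): 3,
    ("Medium", "Medium"): 5,
    ("Medium", "High"): 10,
    ("Low", "Low"): 5,
    ("Low", "Medium"): 10,
    ("Low", "High"): 15,
}

def sla_for(value: str, effort: str) -> int:
    """Look up the committed turnaround for a classified request."""
    return SLA_DAYS.get((value, effort), 10)  # default if unclassified

print(sla_for("High", "Low"))   # -> 1
```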

The key here is to be pretty clear that these are not analysis requests. Generally speaking, it’s a request for data for a valid reason, but, in order to conduct an analysis, a hypothesis is required, and that doesn’t fit in this bucket.

So, THEN…Your Analytics Program Investment

If the analytics and optimization organization is framed across these three main types of services, then conscious investment decisions can be made:

  • What is the maximum % of the analytics program cost that should be devoted to Quick Data Requests? Hopefully, not much (20-25%?).
  • How much to performance measurement? Also, hopefully, not much — this may require some investment in automation tools, but once smart analysts have defined and designed the main dashboards and reports, producing them is work that should be automated. Analysts are too scarce to be spending their time on weekly or monthly data exports and formatting.
  • How much investment will be made in hypothesis testing? This is the highest-value work of the three, so, ideally, it is where the bulk of the investment goes.

Having a process in place to capture all three types of efforts in a discrete and trackable way enables reporting back out on the value delivered by the organization (a toy example of this sort of roll-up follows the list):

  • Hypothesis testing — reporting is the number of hypotheses tested and the business value delivered from what was learned
  • Performance measurement — reporting is the level of investment; this needs to be done…and it needs to be done efficiently
  • Quick data requests — reporting is output-based: number of requests received, average turnaround time. In a way, this reporting is highlighting that this work is “just pulling data” — accountability for that data delivering business value really falls to the requestors. Of course, you have to gently communicate that or you won’t look like much of a team player, now, will you?
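
Here is a toy sketch of what that roll-up might look like if all three types of work are captured in one tracked list. The records, fields, and numbers are invented purely to show the shape of the reporting.

```python
# A minimal sketch of reporting back out across the three buckets.
# The tracked records and fields here are invented for illustration.

from statistics import mean

tracked_work = [
    {"type": "hypothesis", "validated": True,  "value_note": "cart test rolled out"},
    {"type": "hypothesis", "validated": False, "value_note": "splash page idea dropped"},
    {"type": "performance", "hours": 6},
    {"type": "quick", "turnaround_days": 2},
    {"type": "quick", "turnaround_days": 1},
]

hypotheses = [w for w in tracked_work if w["type"] == "hypothesis"]
perf = [w for w in tracked_work if w["type"] == "performance"]
quick = [w for w in tracked_work if w["type"] == "quick"]

print(f"Hypotheses tested: {len(hypotheses)} "
      f"({sum(w['validated'] for w in hypotheses)} validated)")
print(f"Performance measurement investment: {sum(w['hours'] for w in perf)} hours")
print(f"Quick requests: {len(quick)}, "
      f"avg turnaround {mean(w['turnaround_days'] for w in quick):.1f} days")
```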

Over time, shifting an organization to think in terms of actionable and testable hypotheses is the goal — more hypotheses, fewer quick data requests!

And, of course, this approach sets up the potential to truly close the loop and follow through on any analysis/report/request delivered through a Digital Insight Management program (and, possibly, platform — like Sweetspot, which I haven’t used personally, but which I love the concept of).

What Do You Think?

Does this make sense? It’s not exactly my opus, but, even hastily banged out this evening, it captures many of the approaches that have brought me the most success in my analytics career, as well as many of the structures that have helped me head off the ways I’ve screwed up and had failures along the way.

I’d love your thoughts!

 

*Of course, there are always valid exceptions.

10 Comments


  1. Thanks, Tim,

    A great article, up there in the Gary Angel camp! Loads of practical advice here.

    Many of your requests sound very familiar, particularly the dreaded panic question “tell me why we have a conversion issue” which I find is often followed with “I need it before lunchtime!”.

    An area you have touched on that I think is really important is for analysts to be given time to do their own work with their own hypotheses. A great way to do this is to schedule a quarterly presentation and report that looks at a snapshot of the last quarter and outlines the analyst’s view of the top-priority issues on the site.

    To achieve the above you need some breathing space between your team and the data requests. The kind of structure you have suggested would really help and I recommend a “pay yourself first approach” to the prioritisation of these projects. :)

    Thanks again,

    Billy Dixon
    @appliedwa

  2. Nice post and comment. Analysts that help ‘frame the question’ and answer it are worth their weight in gold as they can move the needle (business performance) by themselves. That said, it’s a special breed of analyst rarely found in the wild.

    Since many of the ‘actions’ will be based on predictions, the analyst’s skills need a huge bump in education, from where to find, filter, retrieve, and present data to creating coherent distributions, modeling, and time-series forecasting.

    Are you and your readers experiencing the same? And if so, where are you sending staff for such an ‘education’?
    (Posted from mobile device pardon the typos)

  3. Billy — GREAT point. Analysts who are actually tightly ingrained in the business should absolutely be generating hypotheses (in addition to tightening/framing the hypotheses of others). And, I totally agree that, if there’s a buffer in place of some sort (an intake management / capacity planning / traffic management process and function) it makes sense for there to be some non-negotiable time allotted for the analysts to dig in to their ideas. Thanks!

    Kevin — I’d love to get some good tips on that very point. It’s actually an area where I, personally, am weak. And, when I’ve dug into self-education, I’ve come up short. But, I’ve actually had success with “I know what I don’t know”…and, therefore, know when to reach out to others with the proper skillset when statistical rigor and/or predictive analytics is needed. It’s not a perfect solution, but it seems practical?

  4. Hi Tim,

    Great post and I’ve encountered similar issues over the years.

    Before I get to my point, regarding Kevin’s question about education around predictive modelling: I tried taking a predictive analytics course after one of the eMetrics summits and it proved to me I don’t know enough about it :). There were some interesting points in that course and I learned a few things, but I also learned I’d need to spend a few months learning how to apply the various algorithms and modelling to different situations, and I am not so sure the end results would be worth the effort.

    However, I do recommend people read everything written by Kevin Hillstrom (dunno whether the comment above is from the same Kevin, but Hillstrom’s perspective is very good) and get hold of a book called “How to Measure Anything” by Hubbard. This book, in my view, is the missing link between web analytics data and predictive modelling. I have found the various simulations particularly useful: simple stuff like Monte Carlo simulations applied to datasets to help with decisions on unclear outcomes or risky undertakings. But this book set off light bulbs all over the place as far as I was concerned. I’d been struggling with when to use predictive analysis and when not to, and this book got me thinking about dozens of different scenarios I deal with every week that could be enhanced using some of the techniques described. I also found the methods easy to train in, and all of our senior staff now use simulations and some simple modelling when it’s useful. Very simple methods in clear English.

    Anyhow, back to your post and my question. Back in 2008/9 I tried one of the methods you describe here, a form for validating data requests. My company were working for a large enterprise customer at the time and we were wasting our time churning out about 20 reports a month to about 50-60 so-called “stakeholders”. My opinion when I took over the case was that we were wasting our time, so I instructed the team to stop sending all the reports. We got a few requests for one report (happily the one with the best information) but that was all. So we automated that one and dropped all the rest. Then we set up a data request form like you suggest and it stopped all requests coming in.

    So I’d love to see an example of your simple set-up. We wanted to encourage people without putting barriers in place but found our form, and later our simple email questions back to the requestors, were becoming a barrier. We therefore had to drop it and continue supporting these requests as they came in. (We did eventually get over it via a number of workshops educating the stakeholders, but it literally took years.)

    Mainly it was because the organisation I was working for had fairly low-level managers doing the requesting who were so used to not thinking about what they were doing that, when asked to put some effort in, they didn’t know how. I am not blaming them; it’s human nature to take the easiest route, and our company had become the go-to place for data. We had to change that perception and it took ages. But it would be great to hear examples of how you did/do it as well. Changing the perception and making the requests light, I mean.

    Thanks
    Steve

  5. Thanks, Steve!

    I remember when “How to Measure Anything…” came out and wondering if it would be a relevant read…and then never getting around to checking it out. That’s quite an endorsement, so it’s now on my summer reading list. Thanks!

    I completely agree with you on the Kevin Hillstrom front — both his posts and his multi-part tweets!

    Your anecdote about decommissioning reports sounds really familiar. I did a similar exercise once where I had the producers of several weekly reports (manual — fairly quick…but still a weekly manual process) add bold, red text to the top saying, “You will no longer receive this report if you do not reply to let me know.” The reports went out for 3 weeks with that message and then stopped. ONE person responded after they stopped, and, from digging into that person’s needs (copying and pasting a couple of values from that report into another report that was getting compiled), we eliminated almost the entire report *and* gave her the information in a more usable format!

    But, I also agree that it doesn’t take much to be perceived as a barrier. As I’ve changed roles over the years, I’ve never stepped in and immediately implemented a formal intake system. Rather, I’ve worked pretty hard to build up trust and credibility by being responsive and treading *very* carefully to not be perceived as pushing back. Depending on the environment, that takes a few weeks to several months. Then, it’s a matter of transparency: “Look…I’m doing this so that things *don’t* fall through the cracks, and so we’ve got some level of mine-able data that we can use to figure out where to get more efficient and where best to try to develop self-service mechanisms.” I can’t say I’ve built the totality of what’s described here and had it clicking along perfectly. Partly, that’s been because I haven’t been in a role where I’ve needed to. But, it’s all stuff that I’ve prototyped in some fashion, with a layer of, “If I got to do it over again, here’s how I would adjust it.”

    Stay tuned — I’ve got some final tweaking to do, but I’ll get a post up after eMetrics next week that will link to some templates that might inspire some ideas, at least.

    Thanks for the thoughtful comment! Great stuff!

  6. WRT Books/Training/Modeling (and a minor rant, pardon me):

    “How to Measure Anything” by Hubbard combined with Savage’s “Flaw of Averages” could change your life. It did mine, but I spent the better part of 2 years in business with Savage (that’s a disclosure more than a bragging point). Let me sum it up this way: once a person starts to think of Uncertainty as a shape (a probability distribution), there is no going back to a single number (an average). You’ll start to cringe when you hear or see an average and will curse Google Analytics for not creating coherent distributions.

    One of the sad facts is that stats are taught as if it’s still 1950. Savage makes this point in his book, calling it ‘steam era statistics’. My eyes glaze over every time I hear somebody talk about standard deviation, skewness, or fitting a distribution, and if they write a ‘greek’ formula, forget it. Unfortunately, many ‘training’ classes do just that. I have found it useful to teach people how to think in shapes, not formulas, and to send those ‘shapes’ through very basic planning models (think Excel IF statements).

    Anyway, I’ve drifted a bit OT for this blog post so let me end with this – business people are looking to ‘inform (their) intuition’ about the future, confirm that historical transactions are helping meet expectations (KPIs), or comply with policy/regulations. Analysts that can help them inform intuition are golden. As you point out, in many cases this can be done with very little data, making “getting the data” less relevant.

    Enjoy eMetrics.

  7. Tim, I particularly like your thoughts around hypothesis testing and the value analysts can bring to the table.

    I’ve spent a lot of time delving into the lean startup methodology which is fundamentally based on formulating hypotheses and testing them using the minimum amount of resources before any significant resources are spent.

    Lean Analytics (the most recent addition to the Lean Startup movement) proposes a Business Problem-Solution Hypothesis-Metrics canvas which follows the process you’ve suggested and, more importantly, places it in an iterative loop — once you validate/invalidate a hypothesis with data, you then use the lessons learned to formulate the next hypothesis. This way you systematically tackle all aspects of the critical business questions.

    The process is primarily thought out for startups, which are particularly resource-poor, but it would certainly be useful for minimising wasted resources in a larger company as well. Worth adding to your summer reading list maybe?

    Cheers
    Carmen

  8. Pingback: A.D.A.P.T. to Act and Learn | Gilligan on Data by Tim Wilson

  9. Pingback: A/B Testing | Annotary
