Data Management — As Sexy As a High Quality Mattress

Steve Woods of Eloqua invited me to write a guest post on his Digital Body Language blog after we’d gone back and forth a bit about contact data management and marketing automation. Over the past six or seven years, I’ve been thumped on the back of the ear with data management issues again and again. It always hurts, and, by the time I’ve realized I’ve got a mess…it’s a heckuva challenge to recover.

In my current job, I’m a full-time customer data management guy. It is not sexy. Like many large companies, we’ve got customer data that is created and managed in a wide range of disparate systems on diverse platforms, each with multiple decades of system evolution. It’s important. It’s painful.

There are some great opportunities in our increasingly electronic and e-based world to make some real headway with data management. In the case of the guest blog post, I focussed on opportunities to use marketing automation tools and your web site to drive improvements in the quality of your customer data. As for how exactly I made the “high quality mattress” analogy? Click on over and check out the post!

Columbus Web Analytics Wednesday — July 2009 with Bizresearch

Web Analytics Wednesdays are an opportunity for full-time web analysts, part-time web analysts, and anyone who is interested in learning more about web analytics to get together and share their experiences! We will informally network for a bit before sitting down and ordering food, at which point we will have a brief presentation/discussion about Bizwatch led by Laura Thieme.

Details:

When: Wednesday, July 15th at 6:30 PM

Where: Barley’s Smokehouse and Brewpub, 1130 Dublin Road, Columbus, OH 43215

Registration: the Web Analytics Wednesday site

How to find us: We have a room reserved — just go to the back of Barley’s and hang a right

We are excited to welcome a new sponsor this month! Bizresearch will be co-sponsoring the event with the Web Analytics Wednesdays Global Sponsors. The sponsors will be covering food and nonalcoholic beverages only, although you are welcome (and encouraged) to sample Barley’s fine offering of frothy beverages on your own tab.

Laura Thieme, a 12-year search marketing and analytics veteran, has developed a new search analytics application: Bizwatch. Observing the challenges of monthly trend search marketing reporting and analysis, she developed a new application that combines SEO, competitors, keyword research, paid search and web analytics. It focuses on data integration amongst the three areas of search marketing. It focuses on trend analysis and keywords that convert.

Thieme is looking for feedback from industry colleagues on the search analytics application. She is also hoping to hear from search marketers regarding monthly reporting, applications they are using, and other search analytics data integration challenges they are experiencing.

It should be an engaging discussion!

Data Visualization that Is Colorblind-Friendly — Excel 2007?

Wow. This post started out not as a post, but as what I thought was going to be a 5-minute exercise with Google to download a colorblind-friendly palette for Excel charts. That was two weeks ago, and this post is just scratching the surface.

Several weeks ago, one of the presenters in a meeting showed some data as a map overlay. As soon as she projected the first map, someone in the meeting quipped, “Good luck understanding this one, Jim!” Jim, you see, is colorblind. And, apparently, most of the people in the meeting knew it. Approximately 8% of men have some form of color blindness (it’s much more rare in women — only 1 in 200). And the overlays on the map were color-coded very subtly. Jim commented that it was hopeless!

As it happened, I was exploring a fresh set of data that same week, as we’d recently rolled out some new customer data capture capabilities. As I worked through how best to present the results, I decided to grab a colorblind-friendly palette from the web and use it in the visualization of the information. I’d hoped to find a site with one or more Excel files that I could download with such a palette, but, worst case, I was prepared to snag a palette and manually update my Excel file (for future sharing on this blog, of course!).

No. Such. Luck!

What I did find was a slew of information on the different types of color blindness (which I’ll touch on briefly in a bit), as well as a bevy of almost-useful tools and palettes:

  • How to make figures and presentations that are friendly to Colorblind people — ultimately, I used the palette that is ~2/3 of the way down this page for my spreadsheet (the figure labeled “Set of colors that is unambiguous both to colorblinds and non-colorblinds”).  Mr. Excel actually references this palette and provides a macro that will update a workbook’s palette with this palette. The downside of this palette is that, while it may be plenty functional, I can’t say I’m wild about it from an aesthetic viewpoint. But, I’d spent the 30 minutes I’d given myself to dig, so I ran with it.
  • Colorjack Color Blindness Simulation — a view of the color spectrum as seen by people with eight different forms of color blindness. That’s informative…but doesn’t really provide a realistic way to build a functional palette for data visualization purposes.
  • Colorjack — a nifty tool for finding a color palette. Unfortunately…there’s no way to test how colorblind-friendly any of the palettes are
  • Colorblind Web Page Filter — there were a number of tools for sale that would simulate how content would appear to people with different forms of colorblindness, but this is the (free) online tool I wound up using for the exercise below. It couldn’t be easier to use — you just provide a URL and what form of color blindness you’re interested in, and it renders it

So, aside from the one palette that was solely focussed on functionality and not at all on aesthetics, I struck out. As I pondered this over the next few days, it occurred to me that perhaps Excel’s default colors always seemed so gosh-awful because they were actually developed explicitly with colorblindness in mind. I could not find any documentation to support the theory…so I turned left and headed down that rathole to see if I could figure it out myself.

The exercise was pretty simple. I created a 10-color bar chart using the Excel 2007 default palette. Note: This was created purely for palette-testing; the chart itself is a great example of using more color than is needed! Here’s the chart:

Excel 2007 Default Chart Colors

Like the one colorblind-friendly palette I found online, I really don’t like the aesthetics of this palette. It’s been toned down a bit from the Excel 2003 (and earlier) versions, which is good, but it still seems rather harsh. Could that be for colorblind compatibility? I think so! I took the chart above and ran it through the Colorblind Web Page Filter mentioned above for the four most common types of color blindness (as described in a Pearson report by Betsy J. Case):

Deuteranomaly (Affects 4.9% of Men)
Deuteranopia (Affects 1.1% of Men)
Protanopia (Affects 1% of Men)
Protanomaly (Affects 1% of Men)

Overall, the palette seems workable in all four situations. The first three colors absolutely work. Color 4, as well as color 5, start to lose a little contrast from color 1, but they still seem manageable. Color 5 and color 7, as well as color 10, start to get a little problematic in some cases, but, if you’re going beyond four colors in a single chart, you might need to reconsider your chart type anyway. Right?

Now, one final test: for achromatopsia. On the one hand, this is extremely rare. On the other hand…it’s common when your office has a lot of black-and-white printers:

Achromatopsia (Extremely Rare)

Apparently, checking whether a palette works in grayscale is a quick way to check for compatibility with all forms of colorblindness. It’s also…a best practice. Interestingly, the Excel 2007 palette really lays an egg here, in that colors 1, 2, and 4 are all barely distinguishable!
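A grayscale check like the one above can be roughed out in a few lines of code. The sketch below is only an illustration, not the tool used for the images in this post: it converts each RGB color to a single gray level using the common Rec. 601 luma weights (0.299, 0.587, 0.114) and lists the pairs whose gray levels sit closer together than a chosen gap. The sample colors and the gap of 25 are hypothetical placeholders, not the actual Excel 2007 palette values.

```python
from itertools import combinations

def gray_level(rgb):
    """Approximate gray level (0-255) using Rec. 601 luma weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b

def hard_to_tell_apart(palette, min_gap=25):
    """Return pairs of color names whose gray levels differ by less than min_gap."""
    levels = {name: gray_level(rgb) for name, rgb in palette.items()}
    return [
        (a, b)
        for a, b in combinations(levels, 2)
        if abs(levels[a] - levels[b]) < min_gap
    ]

# Hypothetical sample colors (NOT the Excel 2007 defaults).
sample_palette = {
    "blue": (0, 0, 255),
    "red": (255, 0, 0),
    "green": (0, 128, 0),
    "yellow": (255, 255, 0),
}

print(hard_to_tell_apart(sample_palette))
```

With these sample colors, pure red and mid green land on nearly the same gray level, which is exactly the kind of collision a black-and-white printer would expose.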

Clearly, there is an opportunity here to test a variety of functional, attractive palettes for grayscale printability and the top four forms of colorblindness and develop something better than the Excel defaults. But, that’s an exercise for another time. I think I’ll aim for the first four colors of the palette being “highly distinguishable” in all scenarios and the next four being “functionally distinguishable.” What do you think? Would this be useful? What else should I take into consideration?

Columbus Web Analytics Wednesday Meets #fiestamovement

Last night was the monthly Columbus Web Analytics Wednesday at Barley’s Smokehouse and Brewpub, and we were fortunate to have Webtrends sponsor for the second time this year! This time, we managed to get it scheduled in a way that lined up with Noé Garcia’s travel plans, so he wore the dual crown of “Traveled Farthest to the Event” (from Portland, OR) and “Sponsor Representative.” The dual crown looked surprisingly like an empty beer glass:

Noe Garcia of Webtrends

Noé and Bryan Cristina of Nationwide co-facilitated a discussion about going beyond the application of web analytics tools within the confines of the tool itself. The most active discussion on that front was spawned by one of the regular participants in the group who works at a major, Columbus-based online retailer. Not necessarily this guy, but maybe it was him. My lips are sealed.

Monish Datta explains an approach to web analytics

We talked about how web analytics data, tied to order information, and then matched back to offline marketing channels such as printed catalogs, can be very effective at driving marketing efficiency. In the examples that triggered the discussion, as well as from the other participants’ experiences, the consensus was that, while the ideal world would have all of this data hooked together automatically…rolling up your sleeves and tying the data together manually can still yield a substantial payback. Part of the discussion got into volume — for companies that do a lot of direct mail-oriented promotion, using web analytics data to cut the mail volume by even a fraction of a percent (by using that data to better target who does/does not respond to printed mail) can provide significant and quantifiable savings for a company.

I didn’t think I’d ever hear anyone at a WAW say “Zip+4” (that’s shorthand for the 5-digit zip code plus the four additional digits that you see on a lot of your mail)…other than me! But I did! The person who said that may or may not be a different person pictured in the photo above. Again…my lips are sealed!


And…Ford’s Fiesta Movement

Dave Culbertson, a WAW promotional channel unto himself, kicked off an entirely different, but equally intriguing discussion:

Dave Culbertson Expounds

It all started as Dave was driving his Mazda in Grandview a couple of weeks ago. He got quasi-cut off by a 2011 Ford Fiesta two cars ahead of him. That prompted this tweet:

Dave Culbertson's "I just got cut off" tweet

Now, Dave regularly mocks people who promote themselves as being social media gurus/experts/mavens…but he’s one of the most social media savvy marketers I know. He also knows his cars. For one of those reasons (or maybe both) he immediately recognized that the car in front of him was part of Ford’s Fiesta Movement so he nailed a very relevant hashtag with his tweet. As it happened, someone else on Twitter saw the tweet, quickly realized who the likely culprit was, tweeted to her, and she wound up apologizing via Twitter less than an hour after the incident!

Ms. Single Mama's Cut Off Apology

Ms. Single Mama is a popular blogger, and this was the first time that she and Dave met in person. Everyone was curious about her Ford Fiesta agent experience. She obliged us by explaining, and, later, a good chunk of us headed out to the parking lot to see the 2011 Ford Fiesta she is driving for six months:

mssinglemama.com and her 2011 Ford Fiesta

Yes, we had name tags. Yes, the initial group that followed Alaina out to look at her car was entirely male. Yes, all told, about twice as many people as this wound up checking out the car. And, finally, yes, Alaina made a call in the midst of this picture! Andrew (far left) commented that the dashboard looked like the head of a Transformer. He…was right!

Transformer Head

2011 Ford Fiesta Dashboard

Dave even demonstrated his social media hipness by snapping a picture of the vehicle with his iPhone and then tweeting it:

Dave Culbertson iPhones a picture of a 2011 Ford Fiesta

All in all, it was an engaging, informative evening. I’m sure I’ll miss some of the companies that were represented, but they included JPMorgan Chase, Nationwide, Victoria’s Secret Online, Webtrends, Clearsaleing, Bath&Body Works, Cardinal Solutions, Highlights for Children, Rosetta, Foresee Results, Acappella Limited, DK Business Consulting, Lightbulb Interactive…and others! Not. A. Bad. Crowd!

The next WAW will be July 15th. We’re working hard to get our calendar for the rest of the year nailed down, which means we are looking for sponsors and presenters. Please contact me at tim at <this domain> if you are interested on either front.

 

The Teeter-Totter of Customer Data Management

Teeter-totter

I had a professor in business school who used to explain the relationship between the stock market and the bond market as a teeter-totter (in rural southeast Texas, I grew up knowing this as a see-saw): as the yields on one went up, the yields on the other went down and vice versa. 

Managing your customer data can be like that, too — the more of a burden you put on your customers and prospects to keep your data about them clean, the less of a burden you put on yourself. And, likewise, the more of a burden you take on yourself, the less of a burden you’re putting on your customer.

While bouncing through links from a tweet, I stumbled across Steve Woods’s original Contact Washing Machine post, and it set some alarm bells off. Steve’s a damn sharp guy — he was a co-founder and remains the CTO of Eloqua, and he is pretty much an undisputed visionary when it comes to marketing automation technology. Yet, this post sparked an immediate reaction, as well as teeter-totter imagery. Since then, Steve has clarified…and I think I misread his initial premise. His point is that data cleansing should happen as early in the data acquisition process as possible — cleanse the data as it comes in, rather than crossing your fingers and waiting to run batch processes after the fact in the hopes that the data will get cleaned up.

That’s a valid point, but, after digging deeper into the cross-links in the post, I still think there’s some under-estimating of what it takes to “fix” dirty data as it comes in. For starters, when it comes to customer/prospect data, there are typically a range of incoming data entry points:

Web Data Entry

In the world o’ the web, data can come into your systems directly as typed by a visitor to your site, for instance when a user is filling out a web form. On the surface, that’s a great place to do data validation, because you’ve got the actual user right there to clarify anything that has gone amiss. If he’s fat-fingered his phone number or put in an e-mail address that is clearly not valid, it’s best to prompt him right then and there to correct the mistake. But, the teeter-totter comes into play: if that piece of data is really not germane (as perceived by the user), it doesn’t take long for your cleansing to lead to a frustrated visitor to your site. Worse, if you don’t allow the user to bypass the validation step (with an “I don’t care what you think, I’ve entered the information correctly, so just keep it that way and let me move on” option), there is a very good chance that you will keep some visitors from ever getting to where they and you want them to be!

If you include field validation on your web forms, and if you don’t allow the user to override that validation, it behooves you to include detailed form abandonment tracking in your web analytics to make sure you haven’t set up an insurmountable barrier for some of your customers.
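One way to keep the teeter-totter balanced is "soft" validation: prompt once, then let the visitor insist. The sketch below is a minimal illustration of that idea in Python, under my own assumptions. The function name, the messages, and the deliberately loose e-mail pattern are all hypothetical, and a real form would do this in whatever language the site runs on.

```python
import re

# Deliberately loose pattern: catches obvious typos, not every invalid address.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def check_email(value, user_confirmed=False):
    """Return (accepted, message) for a submitted e-mail address.

    A value that looks wrong is rejected once with a prompt; if the
    visitor confirms it anyway, it is accepted and flagged for review.
    """
    if EMAIL_PATTERN.match(value):
        return True, "ok"
    if user_confirmed:
        return True, "accepted with override; flag for later review"
    return False, "This doesn't look like an e-mail address. Fix it, or confirm to continue."

# First submission is challenged; the confirmed resubmission goes through.
print(check_email("jim@example"))
print(check_email("jim@example", user_confirmed=True))
```

Flagging the overridden values, rather than silently accepting them, leaves a trail for back-end cleanup without putting the whole burden on the visitor.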

Human Data Entry

Call centers almost always serve a data entry function as part of the customer service process. In addition, many companies have dedicated data entry staff to translate mail, fax, tradeshow-collected leads, or other transactions. This can be a great opportunity to clean your data up front, as you can certainly place a higher burden for getting the data right, and enforce stricter data validation, on employees of your own company than you can on your customers and prospects.

BUT, this turns out to be a stickier wicket than it seems at first blush. If I had a nickel for every time I heard someone living in the world of backend data propose data augmentation or enhancement by updating the human data entry processes to “just add one more quick step,” I’d be able to buy a Starbucks Venti Caramel Frappuccino® blended coffee (which is a lot of nickels, if you think about it). There are two reasons that a proceed-with-extreme-caution label should be placed prominently on any solution that heads down this path:

  • Call centers typically live and die by the average handle time (AHT) for their calls; yes, they want to meet the customer’s needs, but they also, out of necessity, can save big dollars by cutting the AHT by a few seconds on average. Adding 5 or 10 seconds to every call can have a very real impact (and can make you some quick enemies with call center managers)
  • It’s easy to identify the benefits of more, more complete, or cleaner data…when it comes to backend processes and data analysis. But, is that benefit readily evident to the people whom you’re relying on to capture it? Does it benefit them directly, either through smoothing the immediate next steps in their process or by impacting their compensation? Due to the high-volume nature of call center and data entry work, data that is “just another field you need to fill out” is data that is at risk of falling prey to shortcuts (the first value in the dropdown, “aaa” in a text field, etc.). The most successful introductions of process changes have a net-no-change or net decrease in the number of steps/time/complexity of the process into which it is being introduced.
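To put the handle-time point in rough numbers, here is a back-of-the-envelope sketch. Every input is a hypothetical placeholder (one million calls a year, $30 per fully loaded agent hour), so plug in your own call center’s figures before drawing any conclusions.

```python
def added_handle_cost(calls_per_year, extra_seconds, cost_per_agent_hour):
    """Annual cost of adding extra_seconds to every call."""
    extra_hours = calls_per_year * extra_seconds / 3600
    return extra_hours * cost_per_agent_hour

# Hypothetical inputs: 1,000,000 calls a year, $30 per loaded agent hour.
for seconds in (5, 10):
    cost = added_handle_cost(1_000_000, seconds, 30.0)
    print(f"+{seconds}s per call: ${cost:,.0f} per year")
```

The arithmetic is trivial, but running it before proposing “one more quick step” makes the conversation with the call center manager a lot more grounded.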

Human data entry offers opportunities to get data that is more complete and cleaner…but those opportunities don’t come automatically.

There are many other ways that data can enter your systems: provided by an intermediary (often semi-independent sales channels: distributors, resellers, etc.), sourced from a third-party lead sourcing company, passed in from another system within your company (often a system that doesn’t store the data in the same format or even have the same definitions for what specific fields mean and are used for), etc. There’s value in inspecting the sources of your customer data, assessing how clean the data is that comes from those different sources, and then, with the teeter-totter firmly in mind, investigating where and how to get that data coming in cleaner!

Photo courtesy of jhirtz.

What is “Analysis?”

Stephen Few had a recent post, Can Computers Analyze Data?, that started: “Since ‘business analytics’ has come into vogue, like all newly popular technologies, everyone is talking about it but few are defining what it is.” Few’s post was largely a riff off of an article by Merv Adrian on the BeyeNETWORK: Today’s ‘Analytic Applications’ — Misnamed and Mistargeted. Few takes issue (rightly so) with Adrian’s implied definition of the terms “analysis” and “analytics.” Adrian outlines some fair criticisms of BI tool vendors, but Few’s beef regarding his definitions is justified.

Few defines data analysis as “what we do to make sense of data.” I actually think that is a bit too broad, but I agree with him that analysis, by definition, requires human beings.

Fancy Nancy

With data “coming into vogue,” it’s hard to walk through a Marketing department without hearing references to “data mining” and “analytics.” Given the marketing departments I tend to walk through, and given what I know of their overall data maturity, this is often analogous to someone filling the ice cube trays in their freezer with water and speaking about it in terms of the third law of thermodynamics.

I’ve got a 3-year-old daughter, and it’s through her that I’ve discovered the Fancy Nancy series of books, in which the main character likes to be elegant and sophisticated well beyond her single-digit age. She regularly uses a word and then qualifies it as “that’s a fancy way to say…” a simpler word. For instance, she notes that “perplexed” is a fancy word for “mixed up.”

“Analytics” is a Fancy Nancy word. “Web analytics” is a wild misnomer. Most web analysts will tell you there’s a lot of work to do with just basic web site measurement. And, that work is seldom what I would consider “analytics.” As cliché as it is, you can think about data usage as a pyramid, with metrics forming the foundation and analysis (and analytics) being built on top of them.

Metrics Analysis Pyramid

There are two main types of data usage:

  • Metrics / Reporting – this is the foundation of using data effectively; it’s the way you assess whether you are meeting your objectives and achieving meaningful outcomes. Key Performance Indicators (KPIs) live squarely in the world of metrics (KPIs are a fancy way to say “meaningful metrics”). Avinash Kaushik defines KPIs brilliantly: “Measures that help you understand how you are doing against your objectives.” Metrics are backward-looking. They answer the question: “Did I achieve what I set out to do?” They are assessed against targets that were set long before the latest report was pulled. Without metrics, analysis is meaningless.
  • Analysis – analysis is all about hypothesis testing. The key with analysis is that you must have a clear objective, you must have clearly articulated hypotheses, and, unless you are simply looking to throw time and money away, you must validate that the analysis will lead to different future actions based on different possible outcomes. Analysis tends to be backward-looking as well, asking questions like “Why did that happen?”…but with the expectation that, once you understand why something happened, you will take different future actions using the knowledge.

So, what about “analytics?” I asked that question of the manager of a very successful business intelligence department some years back. Her take has always resonated with me: “analytics” are forward-looking and are explicitly intended to be predictive. So, in my pyramid view, analytics is at the top of the structure — it’s “advanced analysis,” in many ways. While analysis may be performed by anyone with a spreadsheet, and hypotheses can be tested using basic charts and graphs, analytics gets into a more rigorous statistical world: more complex analysis that requires more sophisticated techniques, often using larger data sets and looking for results that are much more subtle. AND, using those results, in many cases, to build a predictive model that is truly forward-looking.

The key is that the foundation of your business (whether it’s the entire company, or just your department, or even just your own individual role) is your vision. From your vision comes your strategy. From your strategy come your objectives and your tactics. If you’re looking to use data, the best place to start is with those objectives — how can you measure whether you are meeting them, and, with the measures you settle on, what is the threshold whereby you would consider that you achieved your objective? Attempting to do any analysis (much less analytics!) before really nailing down a solid foundation of objectives-oriented metrics is like trying to build a pyramid from the top down. It won’t work.

Blogroll Update+

Blogrolls, blogrolls, blogrolls. I realized over the weekend that the blogroll(s) on my site were wildly out of date — they reflected some great blogs…but not exactly the ones that I really follow and read most consistently these days.

So, I updated that. But, in the process, I decided to re-open a nasty can of worms that I’d only casually eyed in the past, and I added a Favorite Feeds page to the site. There were two reasons this was a dicey place to go:

  • While I’ve got the best intentions for putting up the page — to give people who come to my site an easy way to scan the content I’m most likely reviewing through my feed reader and possibly discover a new blog or two they’d like to follow — the “content ownership” makes for a touchy subject. There is plenty of splogging going on out there, and that’s really not my intent.
  • The logistics of actually posting a page with a list that is dynamically generated, yet easy to read and careful to give credit where credit is due, were trickier than it seemed like they ought to be

I think I handled both of these challenges successfully, but please drop a comment if you think I’ve missed something.

Approach to Avoiding Inappropriate Republishing of Content

What I settled on was only posting the post titles and prepending each post with the source in brackets. Clicking on the link takes you to the content on the site where it originated (via feedproxy.google.com, which was entirely unintentional, but may yield some nice benefits down the road — I don’t think this introduces any ethical issues).

Technical Approach for Pulling this Off Using WordPress

I’m sure there are technically more elegant solutions, but here’s the list of how I stitched things together to make the page work:

  1. Created a Yahoo! Pipe that pulls each of these feeds, prepends the source in brackets, and then combines all of the feeds into a single feed sorted from newest to oldest publication date
  2. Ran the pipe through Feedburner — this wasn’t absolutely necessary, but just seemed like a best practice (I subscribe to the feed directly in my feed reader for when time is really short)
  3. Installed both the Exec-PHP WordPress plugin and the WP-RSSImport plugin
  4. To get Exec-PHP to work, and because I do use the WordPress WYSIWYG editor, I created a new user account that has the WYSIWYG editor turned off and used that account to create the new page
  5. To get WP-RSSImport to work, I ran the documentation page through Google to get enough of a translation for me to figure out that I needed to use the following code on the new page I created:
    <?php RSSImport(20,"http://feeds2.feedburner.com/GilliganOnDataFavoriteFeeds",false,false); ?>
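For anyone who would rather not depend on a hosted pipe, the combining logic from step 1 is simple enough to sketch by hand. The Python below is a stand-in under my own assumptions, not what Yahoo! Pipes does internally: it takes feed items that have already been fetched and parsed (that part is left out), prepends the source in brackets, and sorts newest first. The function name, the item fields, and the sample data are all hypothetical.

```python
from datetime import datetime

def combine_feeds(feeds):
    """Merge feeds into one list, newest first, with '[Source] ' prepended to titles.

    `feeds` maps a source name to a list of items, each a dict with
    'title', 'link', and 'published' (a datetime).
    """
    combined = []
    for source, items in feeds.items():
        for item in items:
            combined.append({
                "title": f"[{source}] {item['title']}",
                "link": item["link"],
                "published": item["published"],
            })
    # Newest publication date first.
    combined.sort(key=lambda item: item["published"], reverse=True)
    return combined

# Hypothetical sample data standing in for parsed feed entries.
feeds = {
    "Blog A": [
        {"title": "Older post", "link": "http://example.com/a/1",
         "published": datetime(2009, 5, 1)},
    ],
    "Blog B": [
        {"title": "Newer post", "link": "http://example.com/b/1",
         "published": datetime(2009, 5, 3)},
    ],
}

# Show at most 20 entries, matching the count passed to RSSImport above.
for entry in combine_feeds(feeds)[:20]:
    print(entry["title"], "->", entry["link"])
```

Only titles and links are carried through, which keeps the sketch in line with the titles-only approach to avoiding republishing anyone’s content.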

It took a number of false starts, but the result seems fairly clean, so I’m going to go with it.

Whatcha’ think?

Recovery.gov Needs Some Few and Some Tufte

I caught an NPR story about recovery.gov last week, and it sounded really promising. Depending on where you fall on the political spectrum, the various rounds of stimulus and bailout funding that have come through over the past six months fall somewhere between “throwing money away,” “ready, fire, aim,” and “point in what seems like it might be a good direction, pull the trigger, and shoot.” No one can stand up and say, with 100% certainty, that we’re not going to look back on this approach in a decade or two and say, “Um…oops?”

It’s hard to imagine anyone taking issue with the proclaimed intent of recovery.gov, though — make the process as transparent as possible, including how much money is going where, when it’s going, and what ultimately comes of it. It was a day or two before I found myself at a computer with time to check out the site…and I was disappointed. In the NPR interview, the interviewer commented how the site was slick and clean. Reality is “not so much.”

Now, I did once take a run at downloading the federal budget to try to scratch a curiosity itch regarding, at a macro level, where the federal government allocates its funds. On the one hand, I was pleased that I was able to find a .csv file with a sea of data that I could easily download and open with Excel. On the other hand, the budget is incredibly complex, and it takes someone with a deeper understanding of our government to really translate that sea of data into the answers I was looking for. Really, though, that wasn’t a surprise:

The data is ALWAYS more complex than you would like…when you’re trying to answer a specific question.

To the credit of recovery.gov, they clearly intended to show some high-level charts that would answer some of the more common questions citizens are asking. Unfortunately, it looks like they turned over the exercise to a web designer who had no experience in data visualization.

Examples from the featured area on the home page: 

recovery.gov Funds Distribution Reported by Week

The overall dark/inverse style itself I won’t knock too much (although it bothers me). And, the fact that the gridlines are kept to a minimum is definitely a good thing. My main beef is admittedly a bit ticky-tack. There was an earlier version where there was a $30 B gridline, and that has since been removed. Clearly, someone would have to really be scrutinizing the graph to identify this hiccup, but someone will.

When presenting data to an audience, the data as it stands alone needs to be rock solid. If it contradicts itself, even in a minor way, it risks having its overall credibility questioned.

So, moving on to some more egregious examples:

recovery.gov Relief for America's Working Families

We get a triple-whammy with this one:

  • Pie charts are inherently difficult for the human brain to interpret accurately
  • Pie charts are even worse when they are “tilted” to give a 3D effect — the wedges on the right and left get “shrunk” while wedges on the top or bottom get “stretched”
  • Exploding a pie chart and then providing a pie chart of just the wedge…just ain’t good

Two questions this visualization might have been trying to answer:

  • How much of the stimulus plan is devoted to tax benefits?
  • How much of the stimulus plan is going to the “Making Work Pay” tax credit?

Without doing any math, can you estimate either one of these? For the first question, you’re estimating the size of the small wedge on the left pie chart. It looks like it’s ~ 1/4 of the pie, doesn’t it? In reality, it’s 37%! For the second question, you have to combine your first estimate with an estimate of the lavender wedge in the right pie chart…and that’s way more work than it’s worth. If you do the math, you’ll get that the lavender wedge works out to ~7% of the entire left pie. A simple table or a bar graph would be more effective.
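The mental arithmetic the two-pie layout demands can be spelled out in a short sketch. It uses only the two shares quoted above (37% and roughly 7%, the latter being approximate) and backs out how big the lavender wedge must be within the right-hand pie; the variable names are mine.

```python
# Shares quoted in the text above, as fractions of the whole stimulus plan.
tax_benefits_share = 0.37     # the tax benefits wedge on the left pie
making_work_pay_share = 0.07  # lavender wedge, as a share of the entire left pie (approximate)

# Implied size of the lavender wedge within the right-hand pie.
share_within_tax_benefits = making_work_pay_share / tax_benefits_share
print(f"{share_within_tax_benefits:.0%} of tax benefits")

# What a reader has to do in their head: multiply two eyeballed estimates.
eyeballed_total = 0.25 * share_within_tax_benefits
print(f"Eyeballing 1/4 for the first wedge gives {eyeballed_total:.1%} instead of about 7%")
```

Even with a perfect read of the second pie, misjudging the first wedge as a quarter throws the combined answer off by a third, which is the case for just printing the numbers in a table.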

And, finally, the estimated distribution of Highway Infrastructure Funds:

recovery.gov Distribution of Highway Infrastructure Funding

Well, that’s just silly. There is NO value in making these bars come flying out of the graph. Really.

Now, to the site’s credit, it takes all of 3 clicks to get from the home page to downloading .csv files with department-specific data and weekly updates (which includes human-entered context as to major activities during the prior week). That’s good (assuming it’s not unduly cumbersome to maintain)! And, I’m sure the site will continue to evolve. But, I’d love to see them bring in some data visualization expertise. The model for the visualization should be pretty simple:

  1. Identify the questions that citizens are asking about the stimulus money
  2. Present the data in the way that answers those questions most effectively
  3. Link to the underlying data — the aggregate and the detail — directly from each visualization

As it turns out, Edward Tufte has already been engaged (thanks to Peter Couvares for that tip via Twitter), and is doing some pro bono work. But, it’s not clear that he’s focussing on the high-level stuff. I would love to see Stephen Few get involved as well — pro bono or not! Or, hell, I’d offer my services…but might as well get the Top Dog for something like this.

Starting today, the site is hosting a weeklong online dialogue to engage the public, potential recipients, solution providers, and state, local and tribal partners about how to make Recovery.gov better. I’ve submitted a couple of ideas already!  

 

 



Columbus Web Analytics Wednesday: A Speedy April

We had our monthly Web Analytics Wednesday meetup at Barley’s Smokehouse and Brewpub last week. Once again, the Web Analytics Wednesday Global Sponsors (Coremetrics, Web Analytics Demystified, and SiteSpect) sponsored the event, which is always appreciated!

This month, in lieu of a formal topic, Dave Culbertson facilitated a round of speed networking — like speed dating, but with the purpose of driving interaction beyond everyone’s immediate tablemates. Each round lasted for 1 minute, and the main challenge was getting people to stop talking and shift on to the next person! It was a little intense, but Dave cut it off after 15-20 minutes, and the overwhelming consensus was that it was fun and useful!

 April 2009 Columbus Web Analytics Wednesday


At the end of the exercise, Dave commented that he really hoped we could start extending these 1:1 connections and interactions through social media. As it is, Dave (@daveculbertson) is one of the most interesting people I follow on Twitter, especially when it comes to finding and tweeting links to content that I find interesting and informative. We’d actually thought ahead (if “six hours before the event” counts as “ahead”) and made a sign-up sheet that included a space for the attendees to write their Twitter usernames and indicate if it would be okay to post them. I then proceeded to leave the sign-in sheet behind when I left! Something about Barley’s — last month, I left my notebook behind and had to go and retrieve it the next day (2 beers over 2.5 hours plus a full meal…in case you’re wondering — it’s just something in the air there!).

So, instead, we’re broadening our social media presence. Consider joining one or all of the following, depending on where/how you hang out on the ‘net:

  • Facebook — we’ve had a WAW Columbus group there for some time
  • Twitter Group — this was Dave’s suggestion, and I haven’t used twittgroup.com before, but we’ve now got a cbuswaw group there as well
  • LinkedIn — might as well kick it old school, too, so we’ve now got a Columbus WAW LinkedIn group

Pick your poison, one or all!

Overall, the event had a great mix of practicing web analysts (from companies like Resource Interactive, Highlights for Children, Victoria’s Secret, Lightbulb Interactive, Coldwell Banker, …and I’m just rattling off the companies I can remember, so this is an incomplete list) as well as some web analytics-centric companies: BizResearch, ClearSaleing, SearchSpring, and WebTech Analytics (all the way up from Cincinnati!). And, with a handful of sharp people in the crowd who are currently looking for full-time work, it was great that TeamBuilder Search came out as well! From a quick count of faces in my brain, the attendance broke down to be ~25% first-timers, ~25% loooonnnngg-time attendees, and ~50% who have attended 1-5 times before. All in all, a great mix!

The most-interesting-but-random site/tool that I learned about this month was City-Data.com — think The World Factbook, but for U.S. cities rather than for countries! And, with a slew of charts that are pretty clean and provide a pretty good way to get the flavor of a town — weather, jobs, houses, and so on.


PowerPoint / Presentations / Data Visualization

I wrote a post last week about PowerPoint and how easy it is to use it carelessly — to just open it up and start dumping in a bunch of thoughts and then rearranging the slides. That post wound up being, largely, a big, fat nod to Garr Reynolds / Presentation Zen. Since then, I’ve been getting hit right and left with schtuff that’s had me thinking more broadly about effective communication of information in a business environment:

Put all of those together, and I’ve got a mental convergence of PowerPoint usage, presenting effectively (which goes well beyond “the deck”), and data visualization. These are all components of “effective communication” — the story, the content, how the content is displayed, how the content is talked to. In one of Reynolds’s sets of sample slides, you can clearly see the convergence of data visualization and PowerPoint. And, even he admits that this is a tricky thing to post…because it removes overall context for the content and it removes the presenter. Clearly, there are lots of resources out there that lay out fundamental best practices for effectively communicating in a presentation-style format. Three interrelated challenges, though:

  • The importance of learning these fundamentals is wildly undervalued — it sounds like Abela’s book tries to quantify this value through tangible examples…but it’s a niche book that, I suspect, will not get widely read by the people who would most benefit from reading it
  • “I need to put together a presentation for <tomorrow>/<Friday>/<next week>” – we’re living under enormous time pressure, and it’s incredibly easy to get caught up in “delivering a substantive deliverable” rather than “effectively communicating the information.” When I think about the number of presentations that I’ve developed and delivered over the past 15 years, the percentage that were truly effective, compelling, and engaging is abysmally small. And that’s a waste.
  • Culture/expectations: every company has its own culture and norms. For many companies, the norms regarding presentations are that they are linear, slide-heavy, logically compiled, and mechanically delivered affairs. For recurring meetings, there is often the “template we use every month” whereby the structure is pre-defined, and each subsequent presentation is an update to the skeleton from the prior meeting. Walk into one of those meetings and deliver a truly rich, meaningful presentation…and you’re liable to be shuttled off for a mandatory drug test, followed by a dressing down about “lack of proper preparation” because the slides were not sufficiently text/fact/content-heavy. <sigh>

What’s interesting to me is that I have spent a lot of time and energy boning up on my data visualization skills over the past few years. And, even if it takes me an extra 5-10 minutes in Excel, I never send out something that doesn’t have data viz best practices applied to some extent. As you would expect, applying those best practices is getting easier and faster with repetition and practice. So, can I do the same for presentations? And, again, that’s presentations-the-whole-enchilada, rather than presentations-the-PowerPoint-deck. Can I balance that with cultural norms — gently pushing the envelope rather than making a radical break? Can you? Should you?
