Stephen Few’s Derivation of Tufte: The Data-Pixel Ratio
By Tim Wilson in Data Visualization, Excel
I’ve glanced through various folks’ copies of Stephen Few’s Information Dashboard Design: The Effective Visual Communication of Data on several occasions over the past few years. It was also a heavy influence on the effort that an ad hoc team in the BI department at National Instruments undertook a couple of years ago to standardize and professionalize the work they were putting out.
I finally got around to reading a good chunk of the book as I was flying a three-legged trip out to British Columbia last week…and it is good! One section that particularly struck me started on page 100:
Edward R. Tufte introduced a concept in his 1983 classic The Visual Display of Quantitative Information that he calls the “data-ink ratio.” When quantitative data is displayed in printed form, some of the ink that appears on the page presents data, and some presents visual content that is not data.
He then applies it as a principle of design: “Maximize the data-ink ratio, within reason. Every bit of ink on a graphic requires a reason. And nearly always that reason should be that the ink presents new information.”
This principle applies perfectly to the design of dashboards, with one simple revision: because dashboards are always displayed on computer screens, I’ve changed the word “ink” to “pixels.”
I’ll actually go further and say that “dashboards” can be replaced with “spreadsheets” and this maxim holds true. Taking some sample data straight from Few’s book, below is how at least 50% of Excel users would format a simple table of bookings by geographic region:
Look familiar? The light gray gridlines in the background are turned on in Excel by default. On top of that, there is a failure to resist the urge to put a “thin” grid around the entire data set.
Contrast that with how Few represents the same data:
Do you agree? This is clearly an improvement, and all Few really did was remove the unnecessary non-data pixels.
So, how would I have actually formatted the table? It’s tough to resist the urge to add color, and I am a fan of alternating shaded rows, which I can add with a single button click based on a macro that adds conditional formatting (“=MOD(ROW()+1,2)=0” for shaded and “=MOD(ROW(),2)=0” for not shaded):
In this case…I’d actually vote for Few’s approach. But, even Few gives the okay to lightly shaded alternating rows later in the same chapter, when some sort of visual aid is needed to follow a row across a large set of data. That’s really not necessary in this case. And, does bolding the totals really add anything? I don’t know that it does.
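For anyone curious which rows the two conditional formatting rules above actually hit, here is a minimal Python sketch of the same arithmetic. It is only an illustration, not the macro itself: the function names and the six-row range are made up for the example, and it assumes 1-based row numbers the way Excel’s ROW() reports them.

```python
def is_shaded(row: int) -> bool:
    """Mirror of the Excel rule =MOD(ROW()+1,2)=0 for a 1-based row number."""
    return (row + 1) % 2 == 0


def is_unshaded(row: int) -> bool:
    """Mirror of the Excel rule =MOD(ROW(),2)=0 for a 1-based row number."""
    return row % 2 == 0


# Hypothetical sheet rows 1 through 6, chosen purely for illustration.
rows = range(1, 7)
shaded_rows = [r for r in rows if is_shaded(r)]
unshaded_rows = [r for r in rows if is_unshaded(r)]

print("shaded:", shaded_rows)
print("not shaded:", unshaded_rows)
```

The first rule picks up the odd-numbered rows and the second picks up the even-numbered ones, so between them every row gets exactly one of the two formats.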
The book is a great read. It’s easy to dismiss the topic as inconsequential — the data is the data, and as long as it’s presented accurately, does it really matter if it’s presented effectively? In my book, it absolutely does matter. The more effectively the data is presented, the less work the consumer of the data needs to do to understand it. The human brain, while a wondrously effective computer, has its limits, and presenting data effectively allows the brain to spend the bulk of its effort on assessing the information rather than trying to understand the data.