# Calculating Trend Indicators

## By Tim Wilson on in Data Visualization, Metrics with 8 Comments

Put this down as one of my more tactical posts, brought on by a fit of lingering annoyance with the use (and by “use” I mean “grotesque misuse”) of trend indicators on reports and dashboards. The trouble is that trends are a trickier business than they seem at first blush, and, at the same time, there are a number of quick and easy ways to calculate them…that are all problematic.

With the well-warranted increasing use of sparklines, which are inherently trend-y representations of data, I like to be able to put a meaningful trend indicator that complements the sparkline. Throughout this post, I will illustrate trend*lines*, but I’m really focussed on trend *indicators*, which are a symbol that indicates whether the trend in the data is upward, downward, or flat. Although there are a few minor tweaks I’d love to make once Excel 2010 is released and allows the customization of icon sets, I’m reasonably happy with their 5-arrow set of trend indicators:

They’re clean and clear, and they work in both color and in black and white. And, with conditional formatting, they can be automatically updated as new data gets added to a dashboard or report. While I won’t show these indicators again in this post, the trendlines I do show are the behind-the-scenes constructs that would manifest themselves as the appropriate indicator next to a sparkline or numerically reported measure.

I’ll use a simple 12-period data set throughout this post to illustrate some thoughts (not as a sparkline, but the principles all still apply):

Trends are slippery beasts for several reasons:

**Noise, noise, noise**— all data is noisy, which means it’s easy to over-read into the data and spot a trend that is not really there**The aircraft carrier vs. the speedboat conundrum**— the more data points you use, the more stable your trend, but the longer it takes to collect enough data to identify a trend, or, worse, to determine if you’ve truly impacted the trend going forward

Let’s start this exploration by walking through some of the common ways that “trend” judgments get made and point out why they’re troubling. I will then show an alternative that, while only marginally more complex to implement, works better when it comes to specifying trend-age.

## Trending Approaches of which I’m Leery

### Trending Based on the Change Over the Previous Period

The most common way I see trends reported is on a “change since the previous period” basis.

In this example, the trend would be an “up” because the data went up from the prior period to the current period. The problem with this is that, if you look at the longer pattern of data, you see that the data is pretty noisy, and it’s entirely possible that this “trend” is entirely a case of noise masking the true signal.

### Trending Over an Extended Period

Another way to trend your data, which Excel makes very simple, is to add a trendline using Excel’s built-in trending capabilities (converting this trendline to an indicator would require some use of a couple of Excel functions that I’ll go into a bit in my recommended approach later in the post).

With this method, the trend would be indicated as “slightly up.” While this may be a valid representation of the overall trend…it seldom seems quite right to use it. The trend gets impacted heavily by any sort of big spikes (or dips) in the data. These keep the same upward or downward trend for a very long period of time. I had a blog post during March Madness one year that wound up driving a big spike in traffic to my site. While it was legitimate for that spike to show an upward trend when I looked at my traffic that week or month, that spike has now wreaked havoc on the macro trend indicator that Google Analytics has shown ever since — for several *months* that spike kept my overall trend *up*, and, then, once that spike passed the fulcrum of the tool’s trend calculation, it caused the reporting of a downward trend for severals subsequent months. Through the whole period, I had to mentally discount what the trend indicator showed.

### Year-Over-Year Trending

Because seasonality wreaks havoc with trendlines, it’s not uncommon to see trend indicators based on year-over-year results — if the current reporting period is a higher number than the same period a year ago, then the trend is up. For *trending* purposes, this combines the worst of the two prior examples — it takes a very small number of data points (subjecting the assessment to noise) *and* it uses ancient history data in the equation.

This isn’t to say that comparisons to the same period in the prior year (or even the same period in the prior quarter, since many companies see an intra-quarter pattern) are bad. But, the question those comparisons answer differs from a trend: a trend should be an indication of “where we are heading of late such that, if we continue on the current course, we can estimate whether we will be doing better or worse next week/next month,” while a year-over-year comparison is more a measure of “did we move positively from where we were last year at this time?”

## Trending Approaches I Feel Better About

I’ve spent an embarrassing amount of time thinking about trending over the past four or five years, but I’ve finally settled on an approach that meets all of these criteria:

- It balances the number of data points available for the trend with the sluggishness/timeliness of the results
- It’s reasonably intuitive to explain
- It passes the “sniff test” — while a trend indicator may initially be a little surprising, on closer inspection, the user will realize it’s legit

The last bullet point is really a combination/result of the first two.

### My Failed Exploration: Single Point Moving Range (mR)

Because of criteria above, I’ve discarded what I thought was my most promising approach — using the single point moving range (mR). A light bulb went off last spring when I took an intermediate stats class, and, although the professor glossed over the moving range formulas, I thought it was going to be the answer that would allow me to solve my trendline quandary — it would look at the “change over previous period” and determine if that change was sufficiently large to warrant reporting a measurable trend. After noodling with it quite a bit… I don’t think that it works for the purposes of trend indicators. For chuckles, a moving range chart for the example in this post looks like the following:

If you want to read more about moving ranges, the best explanation I found was on the Quality Magazine web site. I’ll just stop there, though. We’ve already lost on the “reasonably intuitive” front, and I haven’t even calculated the control limits yet!

### And Another Failed Exploration: the Moving Average

There’s also the “moving average” approach, which smooths things out quite a bit:

I always feel like the moving average is some sort of narcotic applied to the data — it makes things fuzzy by having a single data point factored into multiple points represented on the chart. But, I’ll grudgingly admit that it does have its merits in some cases.

### My Approach to Trending (At Last!!!)

There are two key elements to my trending approach, and neither is particularly earth-shattering:

- Break the data into smaller components than the reporting cycle
- Trend only over recent data, rather than over the entire reported timeframe

Going back to the original example here, let’s say that I update a dashboard once a month, and that the dashboard primarily looks at data for the prior 3 months. In that case, the 12 data points each represent (roughly) one week. IF I simply reported the data on a monthly basis, then the chart would look like this:

That shows a clear upward trend, regardless of whether I look at the last month or the last two months of data. It would be hard not to put an upward trend indicator on this plot. But, we’re relying on all of *three* data points, *and* we’re going back three full reporting periods to draw that conclusion. Both of these are a bit concerning. Invariably, we’d want to go back farther in time to get more data points to see if this trend was real…and then we’re falling into the aircraft carrier dilemma.

Instead, though, I can keep the granularity of the reporting at a week, *but only trend over the last four periods*:

I don’t actually plot the trendline shown in the chart above. Rather, I calculate the formula for the line using the SLOPE and INTERCEPT Excel functions. I then calculate the value of the 4-weeks-ago endpoint of the line and the most-recent-week endpoint of the line and look at the percentage change from one to the other. I actually set some named cells in my workbook to specify how many periods I report over (so I can vary from 4 to 6 or something else universally) as well as what the different thresholds are for a strong up, weak up, no change, weak down, or strong down trend.

In the example in this post, the change is a 16% drop, which usually would garner a “strong down” trend — very different from all the upward trends in the early examples! And, even somewhat counter-intuitive, as the most recent change was actually an “up.” If the entire range has been trending upward, as shown by the 3-point plot as well as by a close inspection of the raw basic data (think of it as a sparkline), then you already have that information available as the longer term trend, but, *of late*, the trend seems to be somewhat downward.

### A Note of Caution

This post has gone through what works for me as a general rule. As I read back over it, I realize I’m setting myself up for a case of, “Yeah, you CAN make the data say whatever you want.”

I’m less concerned about prescribing a universally-effective approach to trend calculation as I am about putting out a cautionary tone on the various “obvious” ways to calculate a trend. The sniff test is important — does the trend work for your specific situation when you actually apply it? Or, have you adopted a simplistic, formulaic approach that can actually provide a very clear misrepresentation of the data?

### And…a Nod to Efficiency and Automation

The prospect of introducing SLOPE and INTERCEPT functions may seem a little intimidating from a maintenance and updating perspective, but it really doesn’t need to be. By using built-in Excel functionality, these can be set up once and then dynamically updated as new data comes in. I like to build spreadsheets with a data selector so that the dashboard is a poor man’s BI tool that allows exploring how the data has changed over time. The key is to use some of Excel’s most powerful, yet under-adopted, features:

- Conditional formatting — especially in Excel 2007 where conditional formatting can make use of customized icon sets
- Named cells and named ranges — these are handy for establishing constants used throughout the workbook (thresholds, for instance) that you may want to adjust
- Data validation — using a cell as your “date range selector” that references a named range of the column that lists the dates for which you record the data
- VLOOKUP — because you used data validation, you can then use VLOOKUP to find the current data based on what is selected by the user
- Dynamic charts — these actually aren’t a “feature” of Excel so much as the clever combination of several different features; Jon Peltier has an excellent write-up of how to do this

If set up properly, a little investment up front can make for an easily updated report delivery tool…with meaningful trend indicators!

This is a great discussion of the issues of trending. I believe their is one concept that needs to be included. Trending is generally used to forecast future performance. Your March Madness example is perfect. Trendlines only are indicative of the future if you assume that nothing changes either externally or internally in your processes.

Excellent point, Rick! The analog in marketing would be any sort of “big splash” campaign that affects what you’re measuring. So, then the question is: do you try to remove those outliers from the trend, or do you go deeper on the statistical front and start truly breaking out rational subgroups and building a more rigorous regression? The situation will dictate, right?

Hey Tim,

Great post. I will actually use this today when discussing data visualization with my team members. I think bottom line, is your situation will dictate how and what statistical technique you use to trend. This is probably why MR did not work for you. I think it has some applications for us, but to be honest, I never even used it that much in Quality Control charts. You may want to check out an exponentially weighted moving average. Its like a moving average, but it gives more weight to the most recent data point and less weight to previous data points. I have used it in the past to detect small shifts within my mean, which is typically what you want to know….

Tim – came across this article while musing the same topic, myself. Can you share a sample excel or some formulas to further illustrate how you are using slope and intercept within your formulas? Thanks!

Pingback An Excel Dashboard Widget | Gilligan on Data by Tim Wilson

How many data points are needed to prove a trend?

So, I’m sitting here, looking at the Pew Center’s 2014 Religious Landscape survey. And I have taken note of Pew’s explanation that this survey is the second time (first in 2007), it has undertaken this “landscape” poll.

I suppose one way to think about the two polls, at a macro kind of level is that each comprises a single data point. Inasmuch as there are two data points now, I asked myself: can a trend be discerned among the data at this point; what level of confidence could be had in such a discerned trend.

So I ran a google search for data points needed for a trend, and here I am.

It’s not your job to worry about why I’d wonder about things, but I suppose if it was my job to wonder about why I worry about things, I’d start out by saying, I don’t have nearly enough data to establish a trend at this point.

Thoughts?

http://www.wmctc.co.uk/contact.asp