When to use the start-at-zero rule
Junk Charts 2014-04-24
This post is a response to a tweet forwarded to me. The person tweeting complained that FiveThirtyEight uses charts that don’t start the vertical axis at zero. The example given was this:
In this post, I want to clear up some confusion around the "start-at-zero" rule.
This rule is an absolute must for column (or bar) charts only; it is not intended for line charts. Here is a bar chart with the axis starting at 60% instead of 0:
I highlighted the columns for 1993 and 1996. Visually, the height of one column is twice that of the other column. And yet the axis labels tell us that the difference is 65% versus 62.5%.
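The arithmetic behind that distortion can be sketched in a few lines of Python. This is only an illustration: the function name `drawn_height_ratio` is made up for this example, and the only inputs are the two values read off the axis labels (65% and 62.5%) and the two axis starting points discussed here (60 and 0).

```python
def drawn_height_ratio(taller, shorter, baseline=0.0):
    """Return how many times taller one column is drawn than another.

    A column's drawn height is proportional to (value - baseline), where
    baseline is the value at which the vertical axis starts.
    """
    if shorter <= baseline:
        raise ValueError("the shorter value must lie above the axis baseline")
    return (taller - baseline) / (shorter - baseline)


# Values read off the axis labels in the post, in percent.
truncated = drawn_height_ratio(65.0, 62.5, baseline=60.0)
from_zero = drawn_height_ratio(65.0, 62.5, baseline=0.0)

print(f"axis starts at 60: drawn ratio = {truncated:.2f}")
print(f"axis starts at 0:  drawn ratio = {from_zero:.2f}")
```

With the axis starting at 60, the drawn ratio is 2.00. With the axis starting at zero, it is 1.04, which matches the ratio of the actual values.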
***
The reason for the start-at-zero rule is to avoid exaggerating meaningless differences.
To judge whether a change is meaningful in time-series data like this, we have to use history to understand the general variability in college enrollment rates. Based on what we can see in this data (about 20 years), the college enrollment rate hovers between 60 and 70 percent. There is no data between 0 and 60 percent; those values are irrelevant for this data series. This is why starting at zero is counterproductive.
Here is the line chart starting at zero:
This display has the unintended effect of squashing meaningful changes over time by inserting a lot of empty space below the line.
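To put a rough number on the squashing, here is a small Python sketch that computes what share of the vertical axis the data actually occupies. It assumes the series spans roughly 60 to 70 percent, as described earlier, and that the top of the axis sits at 70. The second figure is an assumption made for illustration, and the function name `used_height_fraction` is hypothetical.

```python
def used_height_fraction(data_min, data_max, axis_min, axis_max):
    """Return the fraction of the axis height spanned by the data range."""
    if axis_max <= axis_min:
        raise ValueError("axis_max must be greater than axis_min")
    if not (axis_min <= data_min <= data_max <= axis_max):
        raise ValueError("the data range must lie within the axis range")
    return (data_max - data_min) / (axis_max - axis_min)


# Approximate data range from the post; the axis top of 70 is an assumption.
zero_based = used_height_fraction(60.0, 70.0, axis_min=0.0, axis_max=70.0)
fitted = used_height_fraction(60.0, 70.0, axis_min=60.0, axis_max=70.0)

print(f"axis from 0 to 70:  data uses {zero_based:.0%} of the height")
print(f"axis from 60 to 70: data uses {fitted:.0%} of the height")
```

Under these assumptions, the zero-based axis leaves the line about one-seventh of the plot height (14%), while an axis fitted to the data uses all of it.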
A column chart starting at zero looks like this:
This is a fix for the truncated column chart from above. But it also squashes meaningful changes over time. A column chart is just a poor choice to illustrate this dataset.
For those who don't like the line chart, consider using a dot plot: