My Favorite Charts
eagereyes 2014-09-04
There are many charts I hate, because they’re badly done, sloppy, meaningless, deceiving, ugly, or for any number of other reasons. But then there are the ones I keep coming back to because they’re just so clear, well-designed, and effective.
All of these are a few years old. Like a fine wine analogy that I could insert here, it probably takes a while for a chart to come up again and again in conversation and when looking for examples to realize how good it is.
Scatterplot
My favorite scatterplot, and perhaps my favorite chart ever, is Why Is Her Paycheck Smaller? by Hannah Fairfield and Graham Roberts. It shows men’s versus women’s weekly earnings, with men on the horizontal axis and women on the vertical. A heavy black diagonal line shows equal wages; three additional lines show where women make 10%, 20%, and 30% less. Any point to the bottom right of the heavy diagonal means that women make less money than men.
The diagonal lines are a stroke of genius (pun fully intended). When you see a line in a scatterplot, it’s usually a regression line that models the data; i.e., a line that follows the points. But such a line only helps reinforce the difficulty of judging the differences between the two axes, which is something we’re not good at, and which is not typically something you do in a scatterplot anyway.
But the diagonal line, as simple as it is, makes that comparison not just possible, but effortless. It’s such a simple device and yet so clear and effective. All the points on the line indicate occupations where men and women make the same amount of money. To the top left of the line is the area where women make more money than men, and to the bottom right where women make less.
The additional lines show 10%, 20%, and 30% less for women. If it’s hard to tell if a point is lying on the main diagonal of a scatterplot, it becomes impossible to guess the percentage it is off. The additional lines make it possible to guesstimate that number to within a few percent. That is a remarkable level of precision, and it is achieved with three simple lines.
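To make the arithmetic behind those reference lines concrete, here is a minimal Python sketch. The function names and the sample wages are my own illustration, not anything taken from the chart or its data: each line is simply the men’s wage scaled down by a fixed percentage, and a point’s position between two lines brackets its gap.

```python
# Hypothetical helpers for illustration; names and numbers are assumptions,
# not part of the original chart or its data.

def reference_line(men_wage: float, pct_less: float) -> float:
    """Women's wage on the line where women earn pct_less percent less than men."""
    return men_wage * (1 - pct_less / 100)


def percent_gap(men_wage: float, women_wage: float) -> float:
    """Percent by which women's wage falls short of men's (negative if women earn more)."""
    return (1 - women_wage / men_wage) * 100


def band(men_wage: float, women_wage: float, steps=(0, 10, 20, 30)) -> str:
    """Name the region between reference lines that a point falls into."""
    gap = percent_gap(men_wage, women_wage)
    if gap < steps[0]:
        return "women earn more"  # top left of the equal-wage diagonal
    for lower, upper in zip(steps, steps[1:]):
        if gap < upper:
            return f"{lower}-{upper}% less"
    return f"more than {steps[-1]}% less"


# A made-up occupation: men earn 1000 a week, women earn 850.
print(band(1000, 850))
```

The same reading the eye does in the chart happens here in `band`: find the two lines a point sits between, and you know the gap to within ten percentage points before estimating any further.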
There is some interactivity: mousing over points brings up a tooltip that shows the occupation the point represents and how much more one gender makes than the other. Filters in the top left corner let you focus on just a small number of occupations, which include annotations for a few select jobs.
But the key element is the inclusion of the reference lines that help people make sense of the scatterplot and read it with a high level of precision. Simple but effective, and powerful.
Line Chart
My favorite line chart is The Jobless Rate for People Like You by Shan Carter, Amanda Cox, and Kevin Quealy. This chart is somewhat ancient, having been created in Flash and showing unemployment data from January 2007 to September 2009. But its brilliant design and interaction make it timeless.
It’s a line chart, but with a twist. The first thing you see is the heavy blue line, showing the overall unemployment rate. But there are more lines in the background. What are those? So you mouse over and they respond: they light up and there’s a tooltip telling you what they represent. Each is the unemployment rate for a subset of the population, defined as the combination of race, gender, age group, and education. How are Hispanic men over 45 with only a high school diploma doing compared to the overall rate? What about black women 15–24? Or white college grads of any age and gender?
Clicking on a line moves the blue line there so it’s easier to see, but the overall rate stays easily visible. The y axis also rescales nicely when the values go above what it can currently display.
In addition, the filters at the top also respond to the selection to show who is selected. Clicking around inside the chart updates them. Hm, so maybe I can use those to explore too? And of course you can, broadening or narrowing your selection, or clicking through different age groups of the same subset of the population, etc.
The Human-Computer Interaction field has a nice term for an indication of more data and interaction: information scent. This is usually used with widgets, which indicate where more information can be found (like the little tick marks on the scrollbar in Chrome when you search within the page). What makes this chart so good is its clever use of information scent to entice viewers to dig deeper, explore, and ask questions.
It also brilliantly and clearly demonstrates the fact that the overall unemployment rate is a rather meaningless number. The actual rate in your demographic is likely to look very different, and the range is huge. This was the inspiration for my What Means Mean piece, though I don’t think that was nearly as clear as this.
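A tiny worked example shows why a single overall rate can hide so much. The sketch below uses three invented subgroups (the group names, labor-force sizes, and unemployment counts are all made up for illustration and have nothing to do with the real data in the chart); the overall rate is just the pooled total, dominated by the largest group.

```python
# Invented numbers for illustration only; not taken from the chart.
# Each entry maps a subgroup to (labor force, unemployed).
groups = {
    "group A": (800, 40),
    "group B": (150, 30),
    "group C": (50, 20),
}


def rate(labor_force: int, unemployed: int) -> float:
    """Unemployment rate in percent."""
    return 100 * unemployed / labor_force


def overall_rate(groups: dict) -> float:
    """Pooled rate across all subgroups (a size-weighted mean of subgroup rates)."""
    total_force = sum(force for force, _ in groups.values())
    total_unemployed = sum(unemp for _, unemp in groups.values())
    return rate(total_force, total_unemployed)


def rate_range(groups: dict) -> tuple:
    """Lowest and highest subgroup rate."""
    rates = [rate(force, unemp) for force, unemp in groups.values()]
    return min(rates), max(rates)


print(overall_rate(groups))  # 9.0
print(rate_range(groups))    # (5.0, 40.0)
```

With these made-up figures the pooled rate is 9%, yet no subgroup is actually at 9%: they sit at 5%, 20%, and 40%. That is the kind of spread the background lines in the chart make visible.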
The chart shows interesting data, explains a somewhat advanced concept, and invites people to interact with it. This comes in a package that is understated and elegant in its design. Best line chart ever.
Bar Chart
I have already written about the Bikini Chart, and it remains my favorite bar chart. It’s an incredibly effective piece of communication, and it’s all just based on a simple time series. The fact that the bars point down clearly communicates how it is supposed to be read: down is bad, less down is better than more down.
Bar charts are not exactly a common medium for artistic expression, but the designers of this chart managed to subtly but clearly get a message across.
Bubble Chart/Animated Scatterplot
Animated scatterplots may not have been invented by Hans Rosling and gapminder, but they certainly were not a common thing until his TED talk in 2007. And while it may seem a bit facetious to point to the only reasonably well-known example of a particular chart type as my favorite one, this is clearly one of my favorite charts, no matter what type.
The animation may seem like a bit of a gimmick (and it has been criticized as not being terribly effective), but it works to communicate a number of important pieces of information.
The main piece of information, of course, is change over time. How have different countries changed in terms of their wealth, healthcare, etc.? This is reasonably effective, because there are trends, and many countries follow them. The outliers are reasonably easy to spot, especially when you can turn on trails and replay the animation. It’s not always immediately possible to see everything, but it does invite people to play and explore.
But then, there are the explanations. There is the clever animation that constructs the two-dimensional scatterplot from a one-dimensional distribution. There is the clever drill-down animation that breaks continents down into countries, and countries down into quintiles, to show the enormous range of values covered by each. This is not just a simple data display, but a way to introduce people to statistical concepts and data operations they may have heard of but don’t understand (drill-down), or have never heard of in the first place (quintiles).
Rosling’s video, and the gapminder software, not only introduced millions of people to data they knew nothing about (the video has over 8.5 million views!), they also demonstrated how a compelling story can be told without a single photograph or other image, just with data. That is an incredible achievement that opened our eyes to the possibilities of data visualization for communication.
Appreciating Good Work
It’s easy to find, and make fun of, bad charts. But between all the pie chart bashing and general criticism of bad charts, it is equally important to find the good examples and try to figure out what makes them work so well. Even if it may be more fun to beat up the bad examples, we will ultimately learn more from understanding the design choices and ideas that went into the good ones.