Psychology experiments to understand what’s going on with data graphics?
Statistical Modeling, Causal Inference, and Social Science 2013-04-22
Ricardo Pietrobon writes, regarding my post from last year on attitudes toward data graphics,
Wouldn’t it make sense to start formally studying the usability of graphics from a cognitive perspective? With platforms such as Mechanical Turk, it should be fairly straightforward to test alternative methods and come to some conclusions about what might be more informative and what might better support decisions. By the way, my guess is that these two constructs might not necessarily agree with each other.
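As a concrete, purely hypothetical illustration of the kind of Mechanical Turk study Pietrobon describes, here is a minimal analysis sketch: workers would be randomly assigned one of two chart designs, answer a value-reading question, and we would compare accuracy and response time across conditions. The condition names, sample sizes, and all numbers below are simulated assumptions, not results from any real experiment.

```python
# Minimal sketch of analyzing a two-condition Mechanical Turk graphics
# experiment. All data below are simulated, not real results.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 200  # workers per condition (hypothetical)

# Simulated outcomes: correctness of a value-reading question (0/1)
# and response time in seconds, for a "plain" vs. "embellished" chart.
correct_plain = rng.binomial(1, 0.85, n)
correct_embellished = rng.binomial(1, 0.78, n)
time_plain = rng.lognormal(mean=2.0, sigma=0.4, size=n)
time_embellished = rng.lognormal(mean=2.2, sigma=0.4, size=n)

# Accuracy: chi-square test on the 2x2 table of condition x correctness.
table = np.array([
    [correct_plain.sum(), n - correct_plain.sum()],
    [correct_embellished.sum(), n - correct_embellished.sum()],
])
chi2, p_acc, _, _ = stats.chi2_contingency(table)

# Response time: Welch t-test on log-times, since times are right-skewed.
t, p_time = stats.ttest_ind(np.log(time_plain),
                            np.log(time_embellished),
                            equal_var=False)

print(f"accuracy: {correct_plain.mean():.2f} vs "
      f"{correct_embellished.mean():.2f}, p = {p_acc:.3f}")
print(f"median time: {np.median(time_plain):.1f}s vs "
      f"{np.median(time_embellished):.1f}s, p = {p_time:.3f}")
```

The same design could add a decision-task outcome alongside the value-reading question, which is one way to probe Pietrobon's guess that "more informative" and "better supports decisions" need not agree.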
And Jessica Hullman provides some background:
Measuring success for the different goals that you hint at in your article is indeed challenging, and I don’t think that most visualization researchers would claim to have met this challenge (myself included). Visualization researchers may know the user psychology well when it comes to certain dimensions of a graph’s effectiveness (such as quick and accurate responses), but I wouldn’t agree with this statement as a general claim. This isn’t to say there isn’t interest in reaching more holistic evaluations of visualization success. Researchers have been, and still are, thinking through and experimenting with new evaluation techniques, through the BELIV workshops and the evaluation panels and sessions at the VisWeek conferences. I think in the next five to ten years we’ll see more and more studies that try to get at things beyond error rates and response time, and that try to capture what distinguishes communicative visualizations.
Some possible outcomes that come to mind for demonstrating alternative affordances of visualizations (which could together fall under the term ‘engagement’) are memorability, metaphor, affect, preference, and likelihood to share. Some of these have been explored a bit already, others not as much. The memorability aspect is addressed by the chart-junk papers (Bateman et al.’s CHI 2009 paper and this year’s InfoVis paper by Borgo et al.) that include memory (both long and short term) along with the traditional metrics like accuracy and visual search time. The results make clear that there are trade-offs between what sticks in a viewer’s mind (because it’s attractive, because the metaphor is well matched to the content, etc.) and what is best for visual search and accurate perception. This research lends some support to the idea that graph design can be impactful or persuasive over and above the raw results of analyses in communication contexts. A few additional studies look specifically at the importance of the visual metaphor used (e.g., Ziemkiewicz & Kosara 2008). But there’s plenty of additional work to do in identifying other reasons why embellished or otherwise non-traditional visualizations can be effective in certain cases. For instance, while the ability of visualizations to support immediate, perceptually based intuitions is usually a key affordance, there may be cases where designing a graph that intentionally goes against a user’s tendency to form a quick intuitive interpretation can be helpful. Some of the visualizations created by Fox News around election data, for example, contain skewed axes and other flaws to mislead a viewer into drawing a conclusion. Going against graph conventions in cases where data flaws are likely might actually help people make more informed decisions from the data.
Eye-tracking can be used to verify which parts of a visualization captured a user’s attention most (through the timing and duration of focus on certain areas). But it may also lead to a better understanding of the role of visual comparisons in interpreting visualized data. In general, access to finer-grained perceptual data on interactions could help us understand how external visualizations relate to the internal representations that are constructed or cued as one processes a graph, such as comparisons between an external visualization and an internal representation that captures one’s expectations. These aspects are relevant to both exploratory and communicative vis. The cognitive psychologist Mary Hegarty spoke in her VisWeek keynote a couple of years back about how her and her colleagues’ studies in graph perception have helped show where mental animation abilities are effective and even preferable for learning over external animations (reducing the need for fancy interactive effects in some cases), as well as where they fall short. And if you would expect small multiples to be less efficient than animations for representing time-based changes in a variable, there’s a counterexample from InfoVis: Robertson et al. (2008) find that small multiples are both more accurate and faster to use, despite requiring many individual comparisons across charts. Still, while there may be mentions of a need to better understand internal representations in InfoVis (see also this call from Liu and Stasko http://www.zcliu.org/wp-content/uploads/2010/07/infovis10-mbr.pdf), the empirical work has been done primarily by psychologists who aren’t necessarily publishing in the information visualization venues.
Another part of the challenge of measuring engagement may be that it is often expressed naturally online through social-media activity that isn’t easy to study in a controlled way. Shares of a visualization via Facebook or Twitter are one form of evidence that it’s engaging, but it’s hard to compare the engagement of two graphs in social-media contexts, since so many other aspects of the network are also crucial, like the connectedness of the sharer. If someone could come up with a study design to look at engagement in a social environment while controlling for all these other factors, that would be a nice step forward toward understanding the real-world reception of visualizations.
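As a purely hypothetical sketch of what “controlling for all these other factors” could look like statistically, here is a simulated Poisson regression of share counts on graph design plus network covariates (the sharer’s follower count and time of day). Every variable name and number below is made up for illustration; this is not any study’s actual model.

```python
# Hypothetical sketch: comparing engagement (shares) of two graph designs
# while adjusting for network covariates like the sharer's connectedness.
# All data are simulated; nothing here comes from a real study.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 500  # observed sharing events (hypothetical)

graph_b = rng.binomial(1, 0.5, n)        # 0 = design A, 1 = design B
log_followers = rng.normal(6.0, 1.5, n)  # sharer's log follower count
hour = rng.integers(0, 24, n)            # hour of day when posted
prime_time = ((hour >= 18) & (hour <= 22)).astype(float)

# Simulated share counts: engagement depends mostly on the network,
# with a modest effect of the graph design itself.
log_mu = -3.0 + 0.3 * graph_b + 0.8 * log_followers + 0.4 * prime_time
shares = rng.poisson(np.exp(log_mu))

# Poisson regression: the graph_b coefficient estimates the design
# effect on log shares, holding the network covariates fixed.
X = sm.add_constant(np.column_stack([graph_b, log_followers, prime_time]))
fit = sm.GLM(shares, X, family=sm.families.Poisson()).fit()
print(fit.summary(xname=["const", "graph_b", "log_followers", "prime_time"]))
```

In a real network, confounding is of course messier than a few covariates can capture, which is exactly why a credible study design here would be a real contribution.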
A final type of evaluation we may begin to see more of was suggested by Mary Czerwinski in this year’s VisWeek keynote. Her idea is that visualization evaluation could learn a lot from studying physical responses or affect through galvanic skin response (GSR) and other physiological signals. Preference has thus far mostly been relegated to a side investigation, if it’s captured at all in user studies. There are problems with gathering people’s self-reports of what they like from a single use of a visualization, as opposed to looking at longer-term use or sharing with others. But GSR and other physiological signals could help verify when a visualization is found immediately engaging or exciting, or could reveal, over longer periods of use, where users become particularly frustrated with a visualization.
A symptom of this ongoing discussion and exploration of engagement metrics may be that researchers can now try out, in working systems, principles that would once have been considered irrelevant. A system that caught my eye at the InfoVis conference this year deals with intentionally sketchy rendering of visualizations (http://tobias.isenberg.cc/VideosAndDemos/Wood2012SRI), motivated by the XKCD comic. It was generally well received, and so provides one example of how InfoVis folks are thinking outside the box about what makes a graph useful.
Just for example, here’s a discussion of different goals in graphics, and here’s my paper with Antony Unwin on tradeoffs.