Halo Effects vs. Intention-Laden Ratings: Separating Baby and Bathwater
R-bloggers 2013-04-09
Summary:
Are halo effects real or illusory? Much has been written arguing that rating scales contain extensive amounts of measurement bias. Some tell us to avoid ratings altogether (What do customers really want?). Others warn against the use of rating scales without major adjustments (e.g., overcoming scale usage heterogeneity with the R package bayesm). Obviously, by invoking the baby and bathwater idiom, I am suggesting that there may be something "real" in those halo effects: real in the sense that a tendency to rate all the items higher or all the items lower may tell us much about one's intentions toward the entity being rated.

A Concept from Performance Appraisal

The concept of a halo effect flows from a dualism inherent in some theories of human perception. There is a world independent of the observer, and there are reports about that world from human informants. It is not an accident that the first use of the term "halo effect" came from human resources. Personnel decisions are supposedly based on merit. An employee's performance is satisfactory or it is not. We do not collect supervisor ratings because we care about the feelings of the supervisor. The supervisor is merely the measuring instrument, and in such cases, halo effects seem to be a form of measurement bias. To be clear, halo effects are bias only when the informant is the measurement instrument and nothing more. If we cared about the feelings of our observers, then removing the "halo effect" would constitute throwing the baby out with the bathwater.

Why Intention-Laden Ratings?

The title, Intention-Laden Ratings, comes from N.R. Hanson's work on theory-laden observation. According to Hanson, when Galileo looked at the moon, he did not see discontinuities on the lunar surface. He saw craters; he saw the quick and violent origins of these formations. Similarly, the phrase "intention-laden ratings" is meant to suggest that ratings are not simply descriptive.
Observations are not theory-free, and ratings are not intention-free. Perception serves action, and ratings reflect intentions to act. Failure to understand this process dooms us to self-defeating attempts to control for halo effects when what is called "halo" is the very construct that we are trying to measure. We seem to forget that we are not reading off numbers from a ruler. People provide ratings, and people have intentions.

I intend to continue using R for all my statistical analysis. I climbed up that steep learning curve, and I am not going back to SPSS or SAS. Still, I would not give top-box ratings to every query about R. R has its own unique pattern of strengths and weaknesses, its own "signature." There may be considerable consensus among R users, both because we share common experiences learning and using R and because we all belong to an R community that shares and communicates a common view. Moreover, I would expect that those who have not made a commitment to R do not see a different relative pattern of pros and cons. They would, however, tend to give lower ratings across all the items. What some might call "halo" in brand ratings can more accurately be described as intention-laden ratings, in particular, a commitment or intention to continue using the brand. If it helps, think of intention-laden ratings as a form of cognitive dissonance.

R Enables Us to Provide an Example

An example might help explain my point. Suppose that we collected five-point satisfaction ratings from 100 individuals who responded "yes" when asked if they use R for at least some portion of their statistical computing. Let us say that we asked for nine ratings tapping the features that one might mention when recommending R or when suggesting some other statistical software. Here is a summary of the nine ratings.

                  Number of Respondents Giving Each Score 1-5
var  mean  sd     1  2  3  4  5
1    4.59  0.89   2  2
