Data Science: What the Facebook Controversy is Really About
Data & Society / saved 2014-07-01
Summary:
Facebook has always “manipulated” the results shown in its users’ News Feeds by filtering and personalizing for relevance. But this weekend, the social giant seemed to cross a line when it announced that it had engineered emotional responses two years ago in an “emotional contagion” experiment, published in the Proceedings of the National Academy of Sciences (PNAS).
Since then, critics have examined many facets of the experiment, including its design, methodology, approval process, and ethics. Each of these tacks tacitly accepts something important, though: the validity of Facebook’s science and scholarship. There is a more fundamental question in all this: What does it mean when we call proprietary data research data science?
As a society, we haven't fully established how we ought to think about data science in practice. It's time to start hashing that out.
Before The Data Was Big...
Data by definition is something that is taken as “given,” but somehow we’ve taken for granted the terms under which we came to agree on that fact. Once, the professional practice of “data science” was called business analytics. The field has now rebranded as a science in the context of buzzwordy “Big Data,” but unlike other scientific disciplines, most data scientists don’t work in academia. Instead, they’re employed in commercial or governmental settings.
The Facebook Data Science team is a prototypical data science operation. In the company’s own words, it collects, manages, and analyzes data to "drive informed decisions in areas critical to the success of the company, and conduct social science research of both internal and external interest." Last year, for example, it studied self-censorship: cases in which users input but do not post status updates. Facebook’s involvement with data research goes beyond its in-house team. The company is actively recruiting social scientists with the promise of conducting research on "recording social interaction in real time as it occurs completely naturally." So what does it mean for Facebook to have a Core Data Science Team that describes its work on its own product as data science?
Contention about just what constitutes science has been around since the start of scientific practice. By claiming that what it does is data science, Facebook benefits from the imprimatur of an established body of knowledge. It looks objective, authoritative, and legitimate, built on the backs of the scientific method and peer review. By publishing in a prestigious journal, Facebook legitimizes its data collection and analysis activities by demonstrating their contribution to scientific discourse, as if to say, “this is for the good of society.”
"A data scientist is a statistician who lives in San Fransisco" #monkigras pic.twitter.com/HypLL3Cnye
Jeremy Jarvis (@jeremyjarvis), January 30, 2014
So it may be true that Facebook offers one of the largest samples of social and behavioral data ever compiled, but all of its studies, including this one on social contagion, only describe things that happen on Facebook. The data is structured by Facebook, entered in a status update field created by Facebook, produced by users of Facebook, analyzed by Facebook researchers, with outputs that will affect Facebook’s future News Feed filters, all to build the business of Facebook. As research, it is an over-determined and completely constructed object of study, and its outputs are not generalizable.
Ultimately, Facebook has only learned something about Facebook.
Means and Ends
For-profit companies have long conducted applied science research. But the reaction to this study seems to suggest there is something materially different in the way we perceive commercial data science research’s impacts. Why is that?
At GE or Boeing, two long-time applied science leaders, the incentives for research scientists are the same as they are for those at Facebook. Employee-scientists at all three companies hope to produce research that directly informs product development and leads to revenue. However, the outcomes of their research are very different. When Boeing does research, it contributes to humanity's ability to fly. When Facebook does research, it serves its own ideological agenda and perpetuates Facebooky-ness.
Facebook is now more forthright about this. In a response to the recent controversy, Facebook data scientist Adam Kramer wrote, "The goal of all of our research at Facebook is to learn how to provide a better service...We were concerned that exposure to friends' negativity might lead people to avoid visiting Facebook. We didn't clearly state our motivations in the paper."
Facebook’s former head of data science Cameron Marlow offers, “Our goal is not to change the pattern of communication in society. Our goal is to understand it so we can adapt our platform to give people the experience that they want.”
But data scientists don’t just produce knowledge about observable, naturally occurring phenomena; they shape outcomes. A/B testing and routinized experimentation in real time are done on just about every major website in order to opti