“But the data is already public”: on the ethics of research in Facebook

Ethics and Information Technology

Abstract

In 2008, a group of researchers publicly released profile data collected from the Facebook accounts of an entire cohort of college students from a US university. While good-faith attempts were made to hide the identity of the institution and protect the privacy of the data subjects, the source of the data was quickly identified, placing the privacy of the students at risk. Using this incident as a case study, this paper articulates a set of ethical concerns that must be addressed before embarking on future research in social networking sites, including the nature of consent, properly identifying and respecting expectations of privacy on social network sites, strategies for data anonymization prior to public release, and the relative expertise of institutional review boards when confronted with research projects based on data gleaned from social media.

Notes

  1. While no individuals within the T3 dataset were positively identified (indeed, the author did not attempt to re-identify individuals), discovering the source institution makes individual re-identification much easier, perhaps even trivial, as discussed below.

  2. See also bibliography maintained by danah boyd at http://www.danah.org/SNSResearch.html.

  3. The research team includes Harvard University professors Jason Kaufman and Nicholas Christakis, UCLA professor Andreas Wimmer, and Harvard sociology graduate students Kevin Lewis and Marco Gonzalez.

  4. See “Social Networks and Online Spaces: A Cohort Study of American College Students”, Award #0819400, http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0819400.

  5. See relevant National Science Foundation Grant General Conditions (GC-1), section 38. Sharing of Findings, Data, and Other Research Products (http://www.nsf.gov/publications/pub_summ.jsp?ods_key=gc109).

  6. The dataset is archived at the IQSS Dataverse Network at Harvard University (http://dvn.iq.harvard.edu/dvn/).

  7. College Board, http://www.collegeboard.com.

  8. This process is described at the Harvard College Office of Residential Life website: http://www.orl.fas.harvard.edu/icb/icb.do?keyword=k11447&tabgroupid=icb.tabgroup17715.

  9. Screenshot of http://dvn.iq.harvard.edu/dvn/dv/t3 taken on October 22, 2008, on file with author.

  10. Screenshot of http://dvn.iq.harvard.edu/dvn/dv/t3 taken on March 27, 2009, on file with author. Webpage remains unchanged as of April 29, 2009.

  11. Screenshot of http://dvn.iq.harvard.edu/dvn/dv/t3 taken on November 1, 2009, on file with author. As of May 29, 2010, this message remains in place.

  12. Facebook allows users to control access to their profiles based on variables such as “Friends only”, or those in their “Network” (such as the Harvard network), or to “Everyone”. Thus, a profile might not be discoverable or viewable to someone outside the boundaries of the access setting.

  13. Simply stripping names from records is rarely a sufficient means to keep a dataset anonymous. For example, Latanya Sweeney has shown that 87 percent of Americans could be identified by records listing solely their birth date, gender and ZIP code (Sweeney 2002).

  14. See, for example, the California Senate Bill 1386, http://info.sen.ca.gov/pub/01-02/bill/sen/sb_1351-1400/sb_1386_bill_20020926_chaptered.html.

  15. European Union Data Protection Directive 95/46/EC, http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX:31995L0046:EN:HTML.

  16. http://www.fas.harvard.edu/~research/hum_sub/.

  17. Attempts to obtain information about the IRB deliberations with regard to the T3 project have been unsuccessful.

  18. This section is intended as an informal analysis of the discourse used when talking about the T3 project. It is meant to reveal gaps in broader understanding of the issues at hand, and not necessarily directed against a particular speaker.

  19. After the T3 research project was funded and well underway, Kaufman became a fellow at the Berkman Center for Internet & Society at Harvard University, an organization dedicated to studying a number of Internet-related issues, including privacy. While Kaufman presented preliminary results of his research to the Berkman community prior to joining the center (Kaufman 2008a), there is no evidence that others at Berkman were consulted prior to the release of the T3 dataset.

  20. I thank an anonymous reviewer for suggesting this organizing framework.

  21. See, for example, the United States Federal Trade Commission’s Fair Information Practice Principles (http://www.ftc.gov/reports/privacy3/fairinfo.shtm), which include “Access” as a key provision, providing data subjects the ability to view and contest inaccurate or incomplete data.

  22. See Part 46 Protection of Human Subjects of Title 45 Public Welfare of the Code of Federal Regulations at http://www.hhs.gov/ohrp/humansubjects/guidance/45cfr46.htm.

  23. See, for example, the “Internet Research Ethics: Discourse, Inquiry, and Policy” research project directed by Elizabeth Buchanan and Charles Ess (http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0646591).

  24. An important movement in this direction is the recently funded “Internet Research and Ethics 2.0: The Internet Research Ethics Digital Library, Interactive Resource Center, and Online Ethics Advisory Board” project, also directed by Elizabeth Buchanan and Charles Ess (http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0924604 and http://www.internetresearchethics.org/).

Acknowledgments

The author thanks the participants at the International Conference of Computer Ethics: Philosophical Enquiry in Corfu, Greece, as well as the Internet Research 10: Internet Critical conference in Milwaukee, Wisconsin, for their helpful comments and feedback. Additional thanks to Elizabeth Buchanan, Charles Ess, Alex Halavais, Anthony Hoffmann, Jon Pincus, Adam Shostack, and Fred Stutzman for their valuable insights and conversations, both online and off. The author also thanks the anonymous reviewers for their helpful suggestions and criticisms. This article would not have been possible without the research assistance of Wyatt Ditzler and Renea Drews. Finally, I would like to thank Jason Kaufman and Colin McKay at the Berkman Center for Internet & Society, for their valued and continued feedback regarding this work.

Author information

Corresponding author

Correspondence to Michael Zimmer.

About this article

Cite this article

Zimmer, M. “But the data is already public”: on the ethics of research in Facebook. Ethics Inf Technol 12, 313–325 (2010). https://doi.org/10.1007/s10676-010-9227-5

  • DOI: https://doi.org/10.1007/s10676-010-9227-5
