DOI: 10.1145/2976749.2978313
research-article

Online Tracking: A 1-million-site Measurement and Analysis

Published: 24 October 2016

ABSTRACT

We present the largest and most detailed measurement of online tracking conducted to date, based on a crawl of the top 1 million websites. We make 15 types of measurements on each site, including stateful (cookie-based) and stateless (fingerprinting-based) tracking, the effect of browser privacy tools, and the exchange of tracking data between different sites ("cookie syncing"). Our findings include multiple sophisticated fingerprinting techniques never before measured in the wild. This measurement is made possible by our open-source web privacy measurement tool, OpenWPM, which uses an automated version of a full-fledged consumer browser. It supports parallelism for speed and scale, automatic recovery from failures of the underlying browser, and comprehensive browser instrumentation. We demonstrate our platform's strength in enabling researchers to rapidly detect, quantify, and characterize emerging online tracking behaviors.


Published in

CCS '16: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security
October 2016
1924 pages
ISBN: 9781450341394
DOI: 10.1145/2976749

Copyright © 2016 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

CCS '16 Paper Acceptance Rate: 137 of 831 submissions, 16%
Overall Acceptance Rate: 1,261 of 6,999 submissions, 18%
