ABSTRACT
Web pages include extraneous material that a user may view as undesirable. Increasingly, many Web sites also require users to register to access all or part of the site. This tension between content owners and users has resulted in a "cat and mouse" game between how content is provided and how users access it.

We carried out a measurement-based study to understand the nature of extraneous content and its impact on performance as perceived by users. We characterize how this content is distributed, the effectiveness of blocking mechanisms to stop it, and the countermeasures taken by content owners to negate such mechanisms. We also examine sites that require some form of registration to control access and the attempts made to circumvent it.

Results from our study show that extraneous content exists on a majority of popular pages and that blocking such content can yield a 25-30% reduction in downloaded objects and bytes, with a corresponding reduction in latency. The top ten advertisement delivering companies delivered 40% of all URLs matched as ads in our study. Both the server name and the remainder of the URL are important in matching a URL as an ad. A majority of popular sites require some form of registration, and for such sites users can obtain an account from a shared public database. We discuss future measures and countermeasures on the part of each side.
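The observation that both the server name and the remainder of the URL matter when matching ads can be illustrated with a minimal sketch. It is not the study's matching procedure: the patterns, the example hosts, and the function names `match_ad` and `blocked_fraction` are all hypothetical, chosen only to show how a URL might be classified by either part and how the share of blocked objects and bytes could be computed.

```python
import re
from urllib.parse import urlsplit

# Hypothetical patterns for illustration only; they are NOT taken from
# the filter lists used in the study.
AD_SERVER_PATTERNS = [
    re.compile(r"(^|\.)ads?\d*\."),           # e.g. ads.example.com, ad2.example.com
]
AD_PATH_PATTERNS = [
    re.compile(r"/(ads?|banners?)/"),         # e.g. /banners/top.gif
    re.compile(r"[?&]adid="),                 # e.g. ?adid=123
]


def match_ad(url):
    """Return 'server', 'path', or None depending on which part of the URL matched."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    rest = parts.path + ("?" + parts.query if parts.query else "")
    if any(p.search(host) for p in AD_SERVER_PATTERNS):
        return "server"
    if any(p.search(rest) for p in AD_PATH_PATTERNS):
        return "path"
    return None


def blocked_fraction(objects):
    """Given (url, size_in_bytes) pairs, return (object fraction, byte fraction) blocked."""
    total_bytes = sum(size for _, size in objects)
    if not objects or total_bytes == 0:
        return (0.0, 0.0)
    blocked = [(url, size) for url, size in objects if match_ad(url)]
    blocked_bytes = sum(size for _, size in blocked)
    return (len(blocked) / len(objects), blocked_bytes / total_bytes)


if __name__ == "__main__":
    page = [
        ("http://ads.example.com/a.gif", 300),
        ("http://www.example.com/index.html", 700),
    ]
    print(blocked_fraction(page))
```

Checking the server name first and the rest of the URL second lets the sketch report which part triggered the match, which is the kind of breakdown needed to judge how much each part contributes.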
Index Terms
- Cat and mouse: content delivery tradeoffs in web access