Cantina: a content-based approach to detecting phishing web sites

Authors:
Yue Zhang, University of Pittsburgh
Jason I. Hong, Carnegie Mellon University
Lorrie F. Cranor, Carnegie Mellon University

WWW '07: Proceedings of the 16th international conference on World Wide Web, May 2007, Pages 639–648. https://doi.org/10.1145/1242572.1242659

Published: 08 May 2007

ABSTRACT

Phishing is a significant problem involving fraudulent email and web sites that trick unsuspecting users into revealing private information. In this paper, we present the design, implementation, and evaluation of CANTINA, a novel, content-based approach to detecting phishing web sites, based on the TF-IDF information retrieval algorithm. We also discuss the design and evaluation of several heuristics we developed to reduce false positives. Our experiments show that CANTINA is good at detecting phishing sites, correctly labeling approximately 95% of phishing sites.
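
The abstract names TF-IDF as the basis of the approach but does not describe how the scores are computed or used. As a rough illustration only, the sketch below computes TF-IDF weights for the words of one page against a small reference corpus and returns the highest-weighted terms. The function names, the toy inputs, and the smoothed IDF formula are assumptions made for this example and are not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(page_text, corpus_texts):
    """Return {term: tf * idf} for one page against a reference corpus.

    tf  = occurrences of the term in the page / number of tokens in the page
    idf = log((1 + N) / (1 + df)) + 1, a smoothed variant chosen here so
          that terms absent from the corpus do not cause division by zero
          (an assumption of this sketch; other IDF variants exist).
    """
    tokens = tokenize(page_text)
    if not tokens:
        return {}
    corpus_sets = [set(tokenize(doc)) for doc in corpus_texts]
    n_docs = len(corpus_sets)
    scores = {}
    for term, count in Counter(tokens).items():
        tf = count / len(tokens)
        df = sum(1 for doc in corpus_sets if term in doc)
        idf = math.log((1 + n_docs) / (1 + df)) + 1
        scores[term] = tf * idf
    return scores


def top_terms(scores, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]
```

Words that are frequent on the page but rare in the reference corpus receive the highest weights, while common words such as "the" are pushed down. How such terms would then be used to judge a page is not stated in the abstract and is left out of this sketch.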

Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking

Published in
WWW '07: Proceedings of the 16th international conference on World Wide Web
May 2007
1382 pages
ISBN: 9781595936547
DOI: 10.1145/1242572
General Chairs: Carey Williamson (University of Calgary, Canada), Mary Ellen Zurko (IBM, USA)
Program Chairs: Peter Patel-Schneider (Bell Labs Research, USA), Prashant Shenoy (University of Massachusetts at Amherst, USA)
Copyright © 2007 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
TF-IDF
anti-phishing
evaluation
phishing
toolbar
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%