Abstract
The KDD-Cup 2005 competition was held in conjunction with the Eleventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. The task was to classify 800,000 internet user search queries into 67 predefined categories. The task is easy to understand, but the lack of a straightforward training set, the subjective nature of user intent, the sparse information in short queries, and the high noise level make it very challenging. In this paper, we summarize the competition task, the evaluation method, and the results of the competition. We highlight only some key techniques used in the submitted solutions; the technical details of the solutions from the three award-winning teams are available in their separate papers in this issue of SIGKDD Explorations. At the end, we also share the results of a survey conducted with this year's Cup participants. To facilitate research in this area, the task description, data, answer set, and related information for this KDD-Cup are published at the KDD-Cup 2005 web site: http://www.acm.org/sigs/sigkdd/kdd2005/kddcup.html.