DOI: 10.1145/2556195.2556215
research-article

Latent Dirichlet allocation based diversified retrieval for e-commerce search

Published: 24 February 2014

ABSTRACT

Diversified retrieval is an important problem on many e-commerce sites, such as eBay and Amazon. IR approaches that do not optimize for diversity return a clutter of redundant items that belong to the same products. Existing product taxonomies are often too noisy, with overlapping structures and non-uniform granularity, to be used directly in diversified retrieval. To address this problem, we propose a Latent Dirichlet Allocation (LDA) based diversified retrieval approach that selects diverse items based on hidden user intents. Our approach first discovers the hidden user intents of a query using the LDA model, then ranks those intents by trading off their relevance against their information novelty. Finally, it chooses the most representative item for each user intent to display. To evaluate the diversity of search results on e-commerce sites, we propose a new metric, average satisfaction, which measures user satisfaction with the search results. Through an empirical study on eBay, we show that the LDA model discovers meaningful user intents and that the LDA-based approach provides significantly higher user satisfaction than the eBay production ranker and three other diversified retrieval approaches.
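The three steps named in the abstract (discover intents, rank them by relevance and novelty, pick one representative item per intent) can be illustrated with a small sketch. The abstract does not give the scoring formulas, so everything below is an assumption made for illustration: the function name `diversify`, the inputs `item_intent` and `intent_term` (standing in for the output of an already fitted LDA model), the use of mean intent weight as relevance, cosine similarity as the redundancy measure, and the trade-off weight `lam`. It is a greedy, MMR-style stand-in, not the paper's method.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def diversify(item_intent, intent_term, k, lam=0.5):
    """Return up to k (intent, item) pairs, one representative item per intent.

    item_intent[i][t]: weight of intent t for retrieved item i
                       (hypothetical input, e.g. LDA document-topic weights).
    intent_term[t]:    term distribution of intent t
                       (hypothetical input, e.g. LDA topic-word weights).
    lam:               assumed trade-off between relevance and novelty.
    """
    n_items = len(item_intent)
    n_intents = len(intent_term)

    # Assumed relevance of an intent: its mean weight over the retrieved items.
    relevance = [
        sum(row[t] for row in item_intent) / n_items for t in range(n_intents)
    ]

    ranked = []
    remaining = set(range(n_intents))
    while remaining and len(ranked) < k:
        def score(t):
            # Redundancy: similarity to the closest intent already selected.
            redundancy = max(
                (cosine(intent_term[t], intent_term[s]) for s in ranked),
                default=0.0,
            )
            return lam * relevance[t] - (1 - lam) * redundancy

        # sorted() makes tie-breaking deterministic (lowest intent id wins).
        best = max(sorted(remaining), key=score)
        ranked.append(best)
        remaining.remove(best)

    # Representative item: the unused item with the highest weight on the intent.
    results = []
    used = set()
    for t in ranked:
        candidates = [i for i in range(n_items) if i not in used]
        if not candidates:
            break
        item = max(candidates, key=lambda i: item_intent[i][t])
        used.add(item)
        results.append((t, item))
    return results


# Toy data: intents 0 and 1 share almost the same vocabulary, intent 2 does not.
item_intent = [
    [0.9, 0.1, 0.0],
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.2, 0.7],
]
intent_term = [
    [0.50, 0.50, 0.0, 0.0],
    [0.45, 0.45, 0.1, 0.0],
    [0.00, 0.00, 0.5, 0.5],
]
print(diversify(item_intent, intent_term, k=3))
```

On this toy input the sketch prints `[(0, 0), (2, 3), (1, 2)]`: intent 2 is promoted above the more relevant intent 1 because intent 1 is nearly redundant with the already selected intent 0.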


    Published in

      WSDM '14: Proceedings of the 7th ACM international conference on Web search and data mining
      February 2014
      712 pages
      ISBN:9781450323512
      DOI:10.1145/2556195

      Copyright © 2014 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      WSDM '14 paper acceptance rate: 64 of 355 submissions, 18%. Overall acceptance rate: 498 of 2,863 submissions, 17%.
