Visual Search at Pinterest

Authors:
Yushi Jing, David Liu, Dmitry Kislyuk, Andrew Zhai, Jiajing Xu, Jeff Donahue, Sarah Tavel

Pinterest, San Francisco, CA, USA

KDD '15: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2015, Pages 1889–1898. https://doi.org/10.1145/2783258.2788621

Published: 10 August 2015

ABSTRACT

We demonstrate that, with the availability of distributed computation platforms such as Amazon Web Services and open-source tools, it is possible for a small engineering team to build, launch and maintain a cost-effective, large-scale visual search system. We also demonstrate, through a comprehensive set of live experiments at Pinterest, that content recommendation powered by visual search improves user engagement. By sharing our implementation details and learnings from launching a commercial visual search engine from scratch, we hope visual search becomes more widely incorporated into today's commercial applications.

Supplemental Material

p1889.mp4 (MP4, 273.6 MB)

Index Terms

Visual Search at Pinterest
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
2. Information systems
  1. Information retrieval
    1. Information retrieval query processing


Published in
KDD '15: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2015
2378 pages
ISBN: 9781450336642
DOI: 10.1145/2783258
General Chairs: Longbing Cao (University of Technology, Sydney), Chengqi Zhang (University of Technology, Sydney)
Program Chairs: Thorsten Joachims (Cornell University), Geoff Webb (Monash University), Dragos D. Margineantu (Boeing Research), Graham Williams (Australian Taxation Office)
Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
computer vision
deep learning
distributed systems
information retrieval
Qualifiers
- research-article
Acceptance Rates
KDD '15 Paper Acceptance Rate: 160 of 819 submissions, 20%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%