ABSTRACT
Local image features have been used successfully for large-scale object recognition, but they are not effective for recognizing book spines on bookshelves, because some book spines contain only text components that do not yield distinguishing image features. To overcome this issue, we develop a new approach that combines a text-based spine recognition pipeline with an image feature-based spine recognition pipeline. The text within the book spine image is recognized and used as keywords to search a book spine text database. The image features of the book spine image are used to search a book spine image database. The search results of the two pipelines are then carefully combined to form the final result. We implement the proposed hybrid book recognition pipeline as part of a book inventory management system and conduct extensive experiments to evaluate its performance. The experimental results show that while a text-based or an image feature-based system alone achieves a recall of only 72%, the proposed hybrid system achieves a recall of approximately 91%.
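The abstract does not specify how the two result lists are combined, so the following is only a minimal sketch of one common option, weighted score fusion, and not the authors' method. The function names (`normalize`, `fuse_results`), the `text_weight` parameter, and the assumption that each pipeline returns a non-negative score per candidate book are all hypothetical and introduced here for illustration.

```python
from typing import Dict, List, Tuple


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale scores into [0, 1] by dividing by the maximum score.

    Assumption: scores are non-negative, with higher meaning a better match.
    """
    if not scores:
        return {}
    top = max(scores.values())
    if top <= 0:
        return {book_id: 0.0 for book_id in scores}
    return {book_id: s / top for book_id, s in scores.items()}


def fuse_results(
    text_scores: Dict[str, float],
    image_scores: Dict[str, float],
    text_weight: float = 0.5,
) -> List[Tuple[str, float]]:
    """Combine two candidate lists with a weighted sum of normalized scores.

    A book returned by only one pipeline gets a score of 0 from the other,
    so a text-only spine can still be ranked using its text match alone.
    """
    text_norm = normalize(text_scores)
    image_norm = normalize(image_scores)
    candidates = set(text_norm) | set(image_norm)
    fused = {
        book_id: text_weight * text_norm.get(book_id, 0.0)
        + (1.0 - text_weight) * image_norm.get(book_id, 0.0)
        for book_id in candidates
    }
    # Sort by descending fused score; break ties by book id for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical scores for three candidate books.
    ranked = fuse_results({"a": 2.0, "b": 1.0}, {"b": 4.0, "c": 2.0})
    for book_id, score in ranked:
        print(book_id, round(score, 3))
```

In this sketch, a candidate supported by both pipelines rises above candidates supported by only one, which is the intuition behind a hybrid system; the actual weighting and combination rule would have to be taken from the full paper.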
Index Terms
- Combining image and text features: a hybrid approach to mobile book spine recognition