DOI: 10.1145/1989323.1989331

CrowdDB: answering queries with crowdsourcing

Published: 12 June 2011

ABSTRACT

Some queries cannot be answered by machines only. Processing such queries requires human input for providing information that is missing from the database, for performing computationally difficult functions, and for matching, ranking, or aggregating results based on fuzzy criteria. CrowdDB uses human input via crowdsourcing to process queries that neither database systems nor search engines can adequately answer. It uses SQL both as a language for posing complex queries and as a way to model data. While CrowdDB leverages many aspects of traditional database systems, there are also important differences. Conceptually, a major change is that the traditional closed-world assumption for query processing does not hold for human input. From an implementation perspective, human-oriented query operators are needed to solicit, integrate and cleanse crowdsourced data. Furthermore, performance and cost depend on a number of new factors including worker affinity, training, fatigue, motivation and location. We describe the design of CrowdDB, report on an initial set of experiments using Amazon Mechanical Turk, and outline important avenues for future work in the development of crowdsourced query processing systems.
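
The abstract's claim that SQL serves both to pose queries and to model data is realized in the paper through CrowdSQL, a small set of SQL extensions. The sketch below is illustrative rather than verbatim: it assumes the CROWD keyword for columns and tables whose missing values or rows are crowdsourced, plus the CROWDEQUAL (~=) and CROWDORDER functions for fuzzy matching and subjective ranking. The table and column names (Department, Professor, Picture) are example names in the spirit of the paper's running examples, not definitive schemas.

    -- CROWD columns: tuples may arrive with url/phone unknown; CrowdDB
    -- generates crowdsourcing tasks to fill in the missing values.
    CREATE TABLE Department (
        university VARCHAR(100),
        name       VARCHAR(100),
        url        VARCHAR(100) CROWD,
        phone      VARCHAR(32)  CROWD,
        PRIMARY KEY (university, name)
    );

    -- CROWD table: under the open-world assumption, workers may also
    -- contribute entirely new rows, not just missing attribute values.
    CREATE CROWD TABLE Professor (
        name       VARCHAR(100) PRIMARY KEY,
        email      VARCHAR(100) UNIQUE,
        university VARCHAR(100),
        department VARCHAR(100),
        FOREIGN KEY (university, department)
            REFERENCES Department(university, name)
    );

    -- CROWDEQUAL (~=): workers judge whether 'CS' refers to a stored
    -- department name such as 'Computer Science'.
    SELECT url FROM Department WHERE name ~= 'CS';

    -- CROWDORDER: workers rank results by a subjective criterion
    -- (assumes an illustrative Picture(p, subject) table of photos).
    SELECT p FROM Picture
    WHERE subject = 'Golden Gate Bridge'
    ORDER BY CROWDORDER(p, 'Which picture visualizes better %subject%');

Because CROWD tables are open-world, a query over Professor can never know it has seen every row; such queries are therefore bounded, for example with a LIMIT clause, rather than run to exhaustion.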


      Published in

      SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
      June 2011, 1364 pages
      ISBN: 9781450306614
      DOI: 10.1145/1989323

      Copyright © 2011 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article

      Acceptance Rates

      Overall acceptance rate: 785 of 4,003 submissions, 20%
