Unsupervised Relation Extraction Using Dependency Trees for Automatic Generation of Multiple-Choice Questions

Afzal, Naveed; Mitkov, Ruslan; Farzindar, Atefeh

doi:10.1007/978-3-642-21043-3_4


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6657)

Included in the following conference series:

Canadian Conference on Artificial Intelligence


Abstract

In this paper, we investigate an unsupervised approach to Relation Extraction for use in the automatic generation of multiple-choice questions (MCQs). MCQs are a popular large-scale assessment tool: they make it much easier for test-takers to take tests and for examiners to interpret the results. Our approach aims to identify the most important semantic relations in a document without assigning explicit labels to them, which ensures broad coverage that is not restricted to predefined relation types. We present an approach that learns semantic relations between named entities by employing a dependency tree model. Our findings indicate that the approach is capable of achieving high precision, which is much more important than recall in the automatic generation of MCQs, and that enhancing it with linguistic knowledge helps to produce significantly better patterns. The intended application of the method is an e-Learning system for the automatic assessment of students’ comprehension of training texts; however, it can also be applied to other NLP scenarios in which it is necessary to recognise the most important semantic relations without any prior knowledge of their types.
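
To make the idea of relating named entities through a dependency tree more concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the pattern model, so the sketch assumes that a candidate pattern is the shortest dependency path between two entity mentions and that frequency is used as a simple stand-in for importance. The toy parse, the entity type PROTEIN, the relation names and all function names are hypothetical.

```python
from collections import Counter, deque
from itertools import combinations

# Toy dependency parse: (index, word, head index, relation, entity type or None).
# A head index of -1 marks the root. The parse and entity tags are hand-written
# assumptions; a real system would obtain them from a parser and an NE tagger.
SENTENCE = [
    (0, "ProteinA", 1, "subj", "PROTEIN"),
    (1, "activates", -1, "root", None),
    (2, "ProteinB", 1, "obj", "PROTEIN"),
    (3, "in", 1, "prep", None),
    (4, "cells", 3, "pobj", None),
]


def dependency_path(tokens, start, end):
    """Return the token indices on the shortest tree path from start to end."""
    neighbours = {tok[0]: set() for tok in tokens}
    for index, _, head, _, _ in tokens:
        if head >= 0:
            neighbours[index].add(head)
            neighbours[head].add(index)
    previous = {start: None}
    queue = deque([start])
    while queue:  # breadth-first search over the undirected tree
        node = queue.popleft()
        if node == end:
            break
        for nxt in sorted(neighbours[node]):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    if end not in previous:
        return []
    path, node = [], end
    while node is not None:
        path.append(node)
        node = previous[node]
    return path[::-1]


def candidate_pattern(tokens, a, b):
    """Build an unlabelled pattern; entity types replace the entity words."""
    parts = []
    for index in dependency_path(tokens, a, b):
        _, word, _, relation, entity = tokens[index]
        parts.append(f"{entity or word}:{relation}")
    return " <-> ".join(parts)


def collect_patterns(sentences):
    """Count candidate patterns over every pair of entities in each sentence."""
    counts = Counter()
    for tokens in sentences:
        entities = [tok[0] for tok in tokens if tok[4] is not None]
        for a, b in combinations(entities, 2):
            counts[candidate_pattern(tokens, a, b)] += 1
    return counts


if __name__ == "__main__":
    for pattern, count in collect_patterns([SENTENCE, SENTENCE]).most_common():
        print(count, pattern)
```

No relation label is assigned at any point: the pattern itself is the unit that gets counted, which mirrors the abstract's point about not being restricted to predefined relation types. Frequency is only one possible ranking signal and is used here purely for illustration.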



Author information

Authors and Affiliations

Research Institute for Information and Language Processing (RIILP), University of Wolverhampton, Wolverhampton, UK
Naveed Afzal & Ruslan Mitkov
NLP Technologies Inc., 1255 University Street, Suite 1212, Montreal, QC, H3B 3W9, Canada
Atefeh Farzindar


Editor information

Editors and Affiliations

Department of Computer Science, University of Regina, 3737 Wascana Parkway, Regina, S4S 0A2, Saskatchewan, Canada
Cory Butz
Department of Mathematics and Computing Science, Saint Mary’s University, B3H 3C3, Halifax, Nova Scotia, Canada
Pawan Lingras


Cite this paper

Afzal, N., Mitkov, R., Farzindar, A. (2011). Unsupervised Relation Extraction Using Dependency Trees for Automatic Generation of Multiple-Choice Questions. In: Butz, C., Lingras, P. (eds) Advances in Artificial Intelligence. Canadian AI 2011. Lecture Notes in Computer Science, vol 6657. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21043-3_4


DOI: https://doi.org/10.1007/978-3-642-21043-3_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21042-6
Online ISBN: 978-3-642-21043-3
eBook Packages: Computer Science, Computer Science (R0)
