Abstract
In this paper, we investigate an unsupervised approach to Relation Extraction for use in the automatic generation of multiple-choice questions (MCQs). MCQs are a popular large-scale assessment tool that makes it much easier for test-takers to take tests and for examiners to interpret their results. Our approach aims to identify the most important semantic relations in a document without assigning explicit labels to them, which ensures broad coverage unrestricted by predefined relation types. We learn semantic relations between named entities by employing a dependency tree model. Our findings indicate that the approach is capable of achieving high precision, which matters much more than recall in automatic generation of MCQs, and that enhancing it with linguistic knowledge helps to produce significantly better patterns. The intended application of the method is an e-Learning system for automatic assessment of students’ comprehension of training texts; however, it can also be applied to other NLP scenarios where the most important semantic relations must be recognised without any prior knowledge of their types.
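To make the idea of an unlabelled, dependency-based relation pattern concrete, the following is a minimal sketch and not the authors' implementation. It assumes a sentence has already been parsed and tagged for named entities; the `Token` class, the `connecting_pattern` function, the dependency labels, and the toy sentence are all hypothetical and introduced only for illustration. The sketch takes the tree path linking two named entities through their lowest common ancestor and replaces each entity mention with its type, so that the pattern carries no explicit relation label.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Token:
    text: str
    head: int                      # index of the syntactic head, -1 for the root
    deprel: str                    # dependency label (hypothetical label set)
    ne_type: Optional[str] = None  # named-entity type, if the token is an entity


def path_to_root(tokens: List[Token], i: int) -> List[int]:
    """Return the token indices from i up to the root, inclusive."""
    path = [i]
    while tokens[i].head != -1:
        i = tokens[i].head
        path.append(i)
    return path


def connecting_pattern(tokens: List[Token], a: int, b: int) -> Tuple[str, ...]:
    """Build an unlabelled pattern from the tree path linking tokens a and b."""
    up_a = path_to_root(tokens, a)
    up_b = path_to_root(tokens, b)
    # Lowest common ancestor: first node on a's path that also lies on b's path.
    on_b = set(up_b)
    lca = next(i for i in up_a if i in on_b)
    left = up_a[:up_a.index(lca)]    # from a up to, but excluding, the ancestor
    right = up_b[:up_b.index(lca)]   # from b up to, but excluding, the ancestor
    nodes = left + [lca] + right[::-1]
    # Replace entity mentions by their type so the pattern generalises.
    return tuple(tokens[i].ne_type or tokens[i].text for i in nodes)


# Toy, hand-built parse of the made-up sentence "ProteinA activates ProteinB in cells".
sentence = [
    Token("ProteinA", 1, "subj", "PROTEIN"),
    Token("activates", -1, "root"),
    Token("ProteinB", 1, "obj", "PROTEIN"),
    Token("in", 1, "prep"),
    Token("cells", 3, "pcomp"),
]

print(connecting_pattern(sentence, 0, 2))  # ('PROTEIN', 'activates', 'PROTEIN')
```

In a full system, patterns of this kind would be collected over a corpus and ranked so that only the most important ones are kept; the ranking step is not shown here because the abstract does not specify it.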
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Afzal, N., Mitkov, R., Farzindar, A. (2011). Unsupervised Relation Extraction Using Dependency Trees for Automatic Generation of Multiple-Choice Questions. In: Butz, C., Lingras, P. (eds) Advances in Artificial Intelligence. Canadian AI 2011. Lecture Notes in Computer Science, vol 6657. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21043-3_4
DOI: https://doi.org/10.1007/978-3-642-21043-3_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21042-6
Online ISBN: 978-3-642-21043-3
eBook Packages: Computer Science, Computer Science (R0)