Abstract
We computed linguistic information at the lexical, syntactic, and semantic levels for the Recognizing Inference in Text (RITE) tasks for both traditional and simplified Chinese in NTCIR-9 and NTCIR-10. We employed techniques for syntactic parsing, named-entity recognition, and near-synonym recognition, and considered features such as counts of common words, statement lengths, negation words, and antonyms to judge the entailment relationship between two statements. We explored both heuristics-based functions and machine-learning approaches. The reported systems showed their robustness by achieving second place in the binary-classification subtasks for both simplified and traditional Chinese in NTCIR-10 RITE-2. We conducted further experiments with the test data of NTCIR-9 RITE, with good results. We also extended our work to search for better configurations of our classifiers and investigated the contributions of individual features. This extended work showed interesting results and should encourage further discussion.
Notes
The pronunciations and translations of all Chinese strings mentioned in this paper are provided in the Appendix.
Acknowledgments
This research was supported in part by the student travel fund of the Department of Computer Science of National Chengchi University and in part by grants 100-2221-E-004-014, 101-2221-E-004-018, 102-2420-H-001-006-MY2, and 103-2918-I-004-001 from the Ministry of Science and Technology of Taiwan. The second author was granted access to the digital library services of the Harvard Library during his visit to Harvard University.
Additional information
Communicated by C.-S. Lee.
Appendix
We provide information about the Chinese text included in this paper. The Section column indicates the sections where the Chinese text appears. The Chinese column shows the Chinese text itself. The Pronunciation column shows the pronunciation of the Chinese text in Hanyu pinyin. The Translation/Interpretation column provides a way to interpret the Chinese text in this paper.
Cite this article
Huang, WJ., Liu, CL. Exploring lexical, syntactic, and semantic features for Chinese textual entailment in NTCIR RITE evaluation tasks. Soft Comput 21, 311–330 (2017). https://doi.org/10.1007/s00500-015-1629-1
DOI: https://doi.org/10.1007/s00500-015-1629-1