Exploring lexical, syntactic, and semantic features for Chinese textual entailment in NTCIR RITE evaluation tasks

  • Focus
  • Soft Computing

Abstract

We computed linguistic information at the lexical, syntactic, and semantic levels for the Recognizing Inference in Text (RITE) tasks for both traditional and simplified Chinese in NTCIR-9 and NTCIR-10. We employed techniques for syntactic parsing, named-entity recognition, and near-synonym recognition, and we considered features such as counts of common words, statement lengths, negation words, and antonyms to judge the entailment relationship between two statements. We explored both heuristics-based functions and machine-learning approaches. The reported systems demonstrated their robustness by ranking second in the binary-classification subtasks for both simplified and traditional Chinese in NTCIR-10 RITE-2. Additional experiments with the test data of NTCIR-9 RITE also yielded good results. We further extended our work to search for better classifier configurations and to investigate the contributions of individual features. This extended work produced interesting results that should encourage further discussion.
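
The lexical features named in the abstract (common-word counts, statement lengths, negation words) can be illustrated with a minimal sketch. This is not the authors' implementation: the negation list, the `LexicalFeatures` and `extract_features` names, the use of the Sørensen–Dice coefficient as an overlap score, and the threshold in `heuristic_entails` are all assumptions made for illustration. The sketch also assumes that both statements have already been segmented into word tokens.

```python
from dataclasses import dataclass

# Hypothetical negation list for illustration only; the paper's actual
# list is not reproduced on this page.
NEGATION_WORDS = {"不", "沒", "沒有", "未", "非", "無"}


@dataclass
class LexicalFeatures:
    common_words: int    # distinct tokens shared by both statements
    length_t1: int       # token count of the first statement
    length_t2: int       # token count of the second statement
    negation_diff: int   # absolute difference in negation-word counts
    dice: float          # Sørensen–Dice overlap of the two token sets


def extract_features(t1_tokens, t2_tokens):
    """Compute simple lexical features for a pair of pre-segmented statements."""
    set1, set2 = set(t1_tokens), set(t2_tokens)
    common = len(set1 & set2)
    total = len(set1) + len(set2)
    # Dice coefficient: 2|A ∩ B| / (|A| + |B|); defined as 0.0 for two empty sets.
    dice = 2 * common / total if total else 0.0
    neg1 = sum(tok in NEGATION_WORDS for tok in t1_tokens)
    neg2 = sum(tok in NEGATION_WORDS for tok in t2_tokens)
    return LexicalFeatures(common, len(t1_tokens), len(t2_tokens),
                           abs(neg1 - neg2), dice)


def heuristic_entails(features, dice_threshold=0.6):
    """Toy heuristic (an assumption): high overlap and matching negation counts."""
    return features.dice >= dice_threshold and features.negation_diff == 0


if __name__ == "__main__":
    t1 = ["他", "沒有", "去", "學校"]   # "He did not go to school."
    t2 = ["他", "去", "學校"]           # "He went to school."
    feats = extract_features(t1, t2)
    print(feats, heuristic_entails(feats))
```

In a machine-learning setting, the same feature values would be passed to a classifier instead of the fixed-threshold function, which serves here only as a stand-in for a heuristics-based decision.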

Notes

  1. The pronunciations and translations of all Chinese strings mentioned in this paper are provided in the Appendix.

  2. http://www.nist.gov/tac/2010/RTE/.

  3. http://www.cl.ecei.tohoku.ac.jp/rite2.

  4. https://sites.google.com/site/ntcir11riteval/.

  5. http://ehownet.iis.sinica.edu.tw/.

  6. http://nlp.stanford.edu/software/index.shtml.

  7. The pronunciations and translations of all Chinese strings mentioned in this paper are provided in the Appendix.

  8. http://nlp.stanford.edu/software/segmenter.shtml.

  9. http://wordnet.princeton.edu/.

  10. http://dict.revised.moe.edu.tw/.

  11. http://ehownet.iis.sinica.edu.tw/.

  12. [Inline figure ae, not reproduced here]

  13. [Inline figure ai, not reproduced here]

  14. http://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient.

Acknowledgments

This research was supported in part by the student travel fund of the Department of Computer Science of National Chengchi University and in part by grants 100-2221-E-004-014, 101-2221-E-004-018, 102-2420-H-001-006-MY2, and 103-2918-I-004-001 from the Ministry of Science and Technology of Taiwan. The second author was granted access to the digital library services of the Harvard Library during his visit to Harvard University.

Author information

Correspondence to Chao-Lin Liu.

Additional information

Communicated by C.-S. Lee.

Appendix

We provide information about the Chinese text included in this paper. The Section column indicates the sections where the Chinese text appears. The Chinese column shows the Chinese text itself. The Pronunciation column gives its pronunciation in Hanyu pinyin. The Translation/Interpretation column provides a way to interpret the Chinese text in this paper.

[Appendix table with columns Section, Chinese, Pronunciation, and Translation/Interpretation: figures cf, cg, and ch, not reproduced here]

Cite this article

Huang, WJ., Liu, CL. Exploring lexical, syntactic, and semantic features for Chinese textual entailment in NTCIR RITE evaluation tasks. Soft Comput 21, 311–330 (2017). https://doi.org/10.1007/s00500-015-1629-1

  • DOI: https://doi.org/10.1007/s00500-015-1629-1
