Lemaza: An Arabic why-question answering system*

AQIL M. AZMI; NOUF A. ALSHENAIFI

doi:10.1017/S1351324917000304

Published online by Cambridge University Press: 24 August 2017

Affiliation (both authors): Department of Computer Science, King Saud University, Riyadh 11543, Saudi Arabia
e-mail: aqil@ksu.edu.sa, noalshenaifi@ksu.edu.sa

Abstract

Question answering systems retrieve information from documents in response to queries. Most of the questions are who- and what-type questions that deal with named entities. A less common and more challenging question to deal with is the why-question. In this paper, we introduce Lemaza (Arabic for why), a system for automatically answering why-questions for Arabic texts. The system is composed of four main components that make use of Rhetorical Structure Theory. To evaluate Lemaza, we prepared a set of why-question–answer pairs whose answers can be found in a corpus that we compiled from the Open Source Arabic Corpora. Lemaza performed best when the stop-words were not removed. The performance measures were 72.7%, 79.2% and 78.7% for recall, precision and c@1, respectively.
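For readers unfamiliar with c@1, the following is a minimal sketch of how the measure is commonly computed. It rewards a system for leaving a question unanswered rather than answering it incorrectly: c@1 = (n_R + n_U · n_R / n) / n, where n is the total number of questions, n_R the number answered correctly and n_U the number left unanswered. The function name, argument names and the counts in the usage line are illustrative assumptions and are not taken from the paper; the abstract does not state how many questions were answered or left unanswered.

```python
def c_at_1(n_correct: int, n_unanswered: int, n_total: int) -> float:
    """Compute c@1 = (n_R + n_U * n_R / n) / n.

    n_correct    -- questions answered correctly (n_R)
    n_unanswered -- questions the system chose not to answer (n_U)
    n_total      -- all questions in the test set (n)
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if n_correct < 0 or n_unanswered < 0 or n_correct + n_unanswered > n_total:
        raise ValueError("counts are inconsistent with n_total")
    # Unanswered questions are credited at the system's overall accuracy.
    accuracy = n_correct / n_total
    return (n_correct + n_unanswered * accuracy) / n_total


# Hypothetical counts for illustration only (NOT the paper's data).
score = c_at_1(n_correct=60, n_unanswered=20, n_total=100)
print(f"c@1 = {score:.3f}")
```

When no question is left unanswered, c@1 reduces to plain accuracy, which is why it is usually reported alongside precision and recall rather than in place of them.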

Type: Articles
Information: Natural Language Engineering, Volume 23, Issue 6, November 2017, pp. 877–903

DOI: https://doi.org/10.1017/S1351324917000304
Copyright: Copyright © Cambridge University Press 2017

Footnotes

We would like to thank W. Al-Sanie for sharing his RST implementation, and the language specialist for helping us with the why-question–answer pairs. The first author would like to thank Miss Maryam for her assistance in proofreading the manuscript. Special thanks to all three anonymous reviewers for their constructive comments, which helped to further improve the manuscript. This work was supported by a special fund in the Research Center of the College of Computer & Information Sciences (CCIS) at King Saud University, for which the authors are thankful.
