Document Details

Document Type : Thesis 
Document Title : Automatic Question Answering System for Arabic Language Textual Data
نظام أسئلة وإجابة آلي للبيانات النصية في اللغة العربية
 
Subject : Faculty of Computing and Information Technology 
Document Language : Arabic 
Abstract : Question answering (QA) has a long tradition involving many disciplines, ranging from philosophy to database theory, and each discipline investigates different aspects of the question answering process. In this thesis, an Arabic Information Retrieval (IR) system has been implemented. Searching inside a large corpus is a hard and time-consuming task for the user, so an effective way of retrieving data for the user is valuable. The main concern is the Prophetic Hadith. We assume that the corpus is divided into main topics, each of which is divided into sub-topics, and so on. The thesis also presents the application of a pattern recognition algorithm based on statistical learning, the Hidden Markov Model (HMM), which builds one model per topic from the training texts related to that topic. Before training, the texts go through stemming, a processing step common to IR systems that removes morphological information from a word. Stemming has a long tradition in document retrieval, and a variety of stemmers are available. Arabic is a highly inflected language with a complex morphology. After stemming, a feature vector is generated for each word in the corpus for training purposes. A new approach has been implemented that creates the feature vector for each word from its frequency inside the topics. Labels are then generated by clustering the words into groups and giving one label to all words in a cluster; the clustering uses the k-means algorithm, which groups the stems based on their attributes/features. Although a Prophetic Hadith corpus was used, the system could be used in any other context. Several experiments were carried out to increase the performance of the system, and the highest accuracy achieved was 64%.
Supervisor : Dr. Reda A. Alkhoribi 
Thesis Type : Master Thesis 
Publishing Year : 1430 AH / 2009 AD
 
Co-Supervisor : Dr. Omar A. Batarfi 
Added Date : Monday, December 28, 2009 
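
The feature-vector and labelling steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the topic names, the English placeholder stems, the helper names (`topic_frequency_vectors`, `kmeans_labels`), the choice of k, and the deterministic centroid initialisation are all assumptions made for illustration. Each stem is mapped to a vector of its counts per topic, and a plain k-means pass then assigns one cluster label to every stem.

```python
from collections import Counter

def topic_frequency_vectors(topic_docs):
    """Map each stem to a vector of its counts per topic (topics in sorted order)."""
    topics = sorted(topic_docs)
    counts = {t: Counter(stem for doc in topic_docs[t] for stem in doc)
              for t in topics}
    vocab = sorted({stem for c in counts.values() for stem in c})
    return topics, {stem: [counts[t][stem] for t in topics] for stem in vocab}

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_labels(vectors, k, iterations=20):
    """Plain k-means. Assumption: the initial centroids are the first k
    distinct vectors in sorted stem order, which keeps the run deterministic."""
    stems = sorted(vectors)
    centroids = []
    for stem in stems:
        if vectors[stem] not in centroids:
            centroids.append(list(vectors[stem]))
        if len(centroids) == k:
            break
    labels = {}
    for _ in range(iterations):
        # Assignment step: each stem takes the label of its nearest centroid.
        labels = {
            stem: min(range(len(centroids)),
                      key=lambda i: squared_distance(vectors[stem], centroids[i]))
            for stem in stems
        }
        # Update step: move each centroid to the mean of its members.
        for i in range(len(centroids)):
            members = [vectors[s] for s in stems if labels[s] == i]
            if members:
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical, already-stemmed toy corpus: topic -> list of documents.
topic_docs = {
    "prayer": [["pray", "mosque", "pray"], ["pray", "time"]],
    "fasting": [["fast", "month", "fast"], ["fast", "time"]],
}

topics, vectors = topic_frequency_vectors(topic_docs)
labels = kmeans_labels(vectors, k=2)
for stem in sorted(vectors):
    print(stem, vectors[stem], "cluster", labels[stem])
```

In a pipeline like the one the abstract outlines, such cluster labels could then be used as the symbols on which one HMM per topic is trained. That connection is an assumption here, since the record does not spell out how the labels are consumed.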

Researchers

Researcher Name (Arabic) : توفيق زهير حسنين
Researcher Name (English) : Hasanain, Tawfeq ZUHAIR
Researcher Type : Researcher
Dr Grade : Master
Email : TAWFEQ@TAWFEQ.COM

Files

File Name : 24551.pdf
Type : pdf
