Published online by Cambridge University Press: 16 April 2015
This work presents the development and evaluation of an extended Urdu parser. It discusses issues related to this parser and describes the changes made to the Earley algorithm to obtain accurate and relevant results. The parser uses a morphologically rich context-free grammar extracted from a linguistically rich Urdu treebank. With sufficient information encoded in it, this grammar meets state-of-the-art parsing requirements for the morphologically rich Urdu language. Together, the extended parsing model and the linguistically rich extracted grammar yield better evaluation results in the Urdu/Hindi parsing domain. The parser achieves an f-score of 87%, which outperforms existing treebank-based parsing work for Urdu/Hindi.
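For readers unfamiliar with the baseline that the parser extends, the following is a minimal sketch of a standard textbook Earley recognizer, not the extended algorithm described in this work. The toy grammar, the tag names, the English placeholder words (arranged in verb-final order) and the function name `earley_recognize` are all illustrative assumptions; none of them come from the Urdu treebank or the extracted grammar. The sketch also assumes the grammar has no empty right-hand sides.

```python
from collections import namedtuple

# An Earley item: rule lhs -> rhs, position of the dot, and the chart
# index at which recognition of this rule started.
Item = namedtuple("Item", "lhs rhs dot origin")

# Hypothetical toy grammar (verb-final), for illustration only.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("N",), ("ADJ", "N")],
    "VP": [("NP", "V"), ("V",)],
}
# Hypothetical lexicon mapping placeholder words to preterminal tags.
LEXICON = {"boy": {"N"}, "book": {"N"}, "good": {"ADJ"}, "reads": {"V"}}


def earley_recognize(grammar, lexicon, start, tokens):
    """Return True if `tokens` can be derived from `start`.

    Assumes no rule has an empty right-hand side and that preterminal
    tags never appear as keys of `grammar`.
    """
    n = len(tokens)
    chart = [[] for _ in range(n + 1)]

    def add(i, item):
        if item not in chart[i]:
            chart[i].append(item)

    for rhs in grammar[start]:
        add(0, Item(start, rhs, 0, 0))

    for i in range(n + 1):
        j = 0
        while j < len(chart[i]):  # chart[i] may grow while we process it
            item = chart[i][j]
            j += 1
            if item.dot < len(item.rhs):
                nxt = item.rhs[item.dot]
                if nxt in grammar:
                    # Predictor: expand the expected nonterminal at position i.
                    for rhs in grammar[nxt]:
                        add(i, Item(nxt, rhs, 0, i))
                elif i < n and nxt in lexicon.get(tokens[i], ()):
                    # Scanner: the next word carries the expected tag.
                    add(i + 1, item._replace(dot=item.dot + 1))
            else:
                # Completer: advance every item that was waiting for item.lhs.
                for prev in list(chart[item.origin]):
                    if prev.dot < len(prev.rhs) and prev.rhs[prev.dot] == item.lhs:
                        add(i, prev._replace(dot=prev.dot + 1))

    return any(
        it.lhs == start and it.dot == len(it.rhs) and it.origin == 0
        for it in chart[n]
    )


accepted = earley_recognize(GRAMMAR, LEXICON, "S", "boy good book reads".split())
rejected = earley_recognize(GRAMMAR, LEXICON, "S", "reads boy".split())
```

Under these assumptions, `accepted` is `True` and `rejected` is `False`. A full parser would additionally store back-pointers in the chart to recover trees, and a morphologically rich grammar would carry feature information on its symbols; both are omitted here for brevity.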