The computational analysis of morphosyntactic categories in Urdu

Andrew Hardie
Urdu is a language of the Indo-Aryan family, widely spoken in India and Pakistan, and an important minority language in Europe, North America, and elsewhere. This thesis describes the development of a computer-based system for part-of-speech tagging of Urdu texts, consisting of a tagset, a set of tagging guidelines for manual tagging or post-editing, and the tagger itself. The tagset is defined in accordance with a set of design principles, derived from a survey of...