Information And Communication Technology For Development For Africa
This paper presents the development of a free/open-source finite-state morphological transducer for Wolaytta, an Omotic language of Ethiopia, using the Helsinki Finite-State Transducer toolkit (HFST). Developing a full-fledged morphological analysis tool for an under-resourced language like Wolaytta is an important step towards developing further natural language processing (NLP) applications. Morphological analyzers for highly inflectional languages are most efficiently developed using finite-state transducers. To develop the transducer, a lexicon of root words was obtained semi-automatically. The morphotactics of the language were implemented by hand in the lexc formalism, and morphophonological rules were implemented in the twol formalism. Evaluation of the transducer shows that it has decent coverage (over 80%) of forms in a large corpus and exhibits high precision (94.85%) and recall (94.11%) over a manually verified test set. To the best of our knowledge, this work is the first systematic and exhaustive implementation of the morphology of Wolaytta in a morphological transducer.
Wolaytta language, Morphological analysis and generation, HFST, Apertium, NLP
F. Mekuria, E. Enideg Nigussie, W. Dargie, M. Edward, and T. Tegegne
September 25–27, 2017
Bahir Dar, Ethiopia
T. A. Gebreselassie, Jonathan North Washington, M. Gasser, and B. Yimam.
"A Finite-State Morphological Analyzer For Wolaytta".
Information And Communication Technology For Development For Africa.