A Finite-State Morphological Analyzer For Wolaytta
Document Type
Conference Proceeding
Publication Date
2018
Published In
Information And Communication Technology For Development For Africa
Series Title
Lecture Notes Of The Institute For Computer Sciences, Social Informatics And Telecommunications Engineering
Abstract
This paper presents the development of a free/open-source finite-state morphological transducer for Wolaytta, an Omotic language of Ethiopia, using the Helsinki Finite-State Transducer toolkit (HFST). Developing a full-fledged morphological analysis tool for an under-resourced language like Wolaytta is an important step towards developing further NLP (Natural Language Processing) applications. Morphological analyzers for highly inflectional languages are most efficiently developed using finite-state transducers. To develop the transducer, a lexicon of root words was obtained semi-automatically. The morphotactics of the language were implemented by hand in the lexc formalism, and morphophonological rules were implemented in the twol formalism. Evaluation of the transducer shows that it has decent coverage (over 80%) of forms in a large corpus and exhibits high precision (94.85%) and recall (94.11%) over a manually verified test set. To the best of our knowledge, this work is the first systematic and exhaustive implementation of the morphology of Wolaytta in a morphological transducer.
Keywords
Wolaytta language, Morphological analysis and generation, HFST, Apertium, NLP
Published By
Springer
Editor(s)
F. Mekuria, E. Enideg Nigussie, W. Dargie, M. Edward, and T. Tegegne
Conference
ICT4DA 2017
Conference Dates
September 25–27, 2017
Conference Location
Bahir Dar, Ethiopia
Recommended Citation
T. A. Gebreselassie, Jonathan North Washington, M. Gasser, and B. Yimam. (2018). "A Finite-State Morphological Analyzer For Wolaytta". Information And Communication Technology For Development For Africa. Volume 244, 14-23.
DOI: 10.1007/978-3-319-95153-9_2
https://works.swarthmore.edu/fac-linguistics/267