A Finite-State Morphological Analyzer For Wolaytta

Document Type

Conference Proceeding

Publication Date

2018

Published In

Information And Communication Technology For Development For Africa

Series Title

Lecture Notes Of The Institute For Computer Sciences, Social Informatics And Telecommunications Engineering

Abstract

This paper presents the development of a free/open-source finite-state morphological transducer for Wolaytta, an Omotic language of Ethiopia, using the Helsinki Finite-State Transducer toolkit (HFST). Developing a full-fledged morphological analysis tool for an under-resourced language like Wolaytta is an important step towards developing further Natural Language Processing (NLP) applications. Morphological analyzers for highly inflectional languages are most efficiently developed using finite-state transducers. To develop the transducer, a lexicon of root words was obtained semi-automatically. The morphotactics of the language were implemented by hand in the lexc formalism, and morphophonological rules were implemented in the twol formalism. Evaluation of the transducer shows that it has decent coverage (over 80%) of forms in a large corpus and exhibits high precision (94.85%) and recall (94.11%) over a manually verified test set. To the best of our knowledge, this work is the first systematic and exhaustive implementation of the morphology of Wolaytta in a morphological transducer.
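
The three evaluation figures quoted in the abstract (coverage, precision, recall) can be illustrated with a minimal sketch. The abstract does not state how the authors computed them, so the definitions below are the conventional ones and are an assumption: coverage as the share of corpus tokens that receive at least one analysis, and precision and recall over (form, analysis) pairs compared against a gold set. All function names, tokens, and analyses are hypothetical placeholders, not real Wolaytta data or output from the transducer.

```python
# Hypothetical sketch: conventional definitions, not the paper's evaluation code.

def coverage(tokens, analyses):
    """Share of corpus tokens that receive at least one analysis."""
    analysed = sum(1 for t in tokens if analyses.get(t))
    return analysed / len(tokens)


def precision_recall(predicted, gold):
    """Precision and recall over sets of (form, analysis) pairs."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Placeholder corpus tokens and analyzer output (not real Wolaytta forms).
tokens = ["w1", "w2", "w3", "w4", "w5"]
analyses = {"w1": {"a"}, "w2": {"b", "c"}, "w3": set(), "w4": {"d"}}

# Placeholder analyzer output and manually verified gold analyses.
predicted = {("w1", "a"), ("w2", "b"), ("w2", "c"), ("w4", "d")}
gold = {("w1", "a"), ("w2", "b"), ("w4", "d"), ("w4", "e"), ("w3", "f")}

cov = coverage(tokens, analyses)
prec, rec = precision_recall(predicted, gold)
print(f"coverage={cov:.2f} precision={prec:.2f} recall={rec:.2f}")
```

In this toy setup, a token with an empty analysis set or no entry at all counts as uncovered, and an analysis that the analyzer produces but the gold set lacks lowers precision, while a gold analysis the analyzer misses lowers recall.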

Keywords

Wolaytta language, Morphological analysis and generation, HFST, Apertium, NLP

Published By

Springer

Editor(s)

F. Mekuria, E. Enideg Nigussie, W. Dargie, M. Edward, and T. Tegegne

Conference

ICT4DA 2017

Conference Dates

September 25–27, 2017

Conference Location

Bahir Dar, Ethiopia
