Debre Berhan University Institutional Repository

A Bi-Directional Ge’ez-Amharic Neural Machine Translation: a Deep Learning Approach


dc.contributor.author Amdework, Asefa Belay
dc.date.accessioned 2021-09-23T11:47:20Z
dc.date.available 2021-09-23T11:47:20Z
dc.date.issued 2021-07
dc.identifier.uri http://etd.dbu.edu.et:80/handle/123456789/737
dc.description.abstract Due to globalization, the world is becoming a single village and human languages increasingly cross national boundaries. Until now, human interpreters have bridged the communication gap between people who speak different languages; however, since human translation is costly and inconvenient, much research seeks to address this problem with Machine Translation (MT) techniques. MT is the process of automatically translating text or speech from one human language to another by computer. Neural Machine Translation (NMT) uses artificial neural networks such as the Transformer, the state-of-the-art model that shows promising results over previous MT approaches. Many ancient manuscripts written in the Ge’ez language and in need of translation are held in Ethiopia and abroad, and young people and researchers are increasingly interested in studying Ge’ez and Amharic manuscripts. This thesis therefore aims to demonstrate the capabilities of deep learning algorithms on MT tasks for these morphologically rich languages. A bi-directional, text-based Ge’ez-Amharic MT system was tested on two deep learning models, namely Seq2Seq with attention and the Transformer (both the model and its evaluation are illustrated in the sketches following this record). A total of 20,745 parallel sentences was used for the experiments, of which 13,787 were collected from earlier researchers and 6,958 were newly prepared; in addition, a Ge’ez-Latin numeric corpus of 3,078 parallel lines was added to handle numeric translation. We conducted four experiments, and the Transformer outperformed the other techniques, scoring 22.9 BLEU from Ge’ez to Amharic and 29.7 BLEU from Amharic to Ge’ez on the 20,745-sentence corpus. The typical Seq2Seq model improved on the BLEU scores of the SMT model reported by previous researchers by +0.65 and +0.79 points, i.e., increments of 2.46% and 4.66% from Ge’ez to Amharic and from Amharic to Ge’ez respectively, using 13,833 parallel sentences. Further research with a larger, cleaner corpus and with pre-trained models may improve the results reported here; however, we faced a scarcity of corpora and pre-trained models for the Amharic and Ge’ez languages. en_US
dc.language.iso en en_US
dc.subject Artificial Neural Network, Attention, BLEU, Seq2seq, Machine Translation, Neural Machine Translation, Transformer en_US
dc.title A Bi-Directional Ge’ez-Amharic Neural Machine Translation: a Deep Learning Approach en_US
dc.type Thesis en_US
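The abstract describes a Transformer encoder-decoder trained for Ge’ez-Amharic translation. The following PyTorch code is a minimal sketch of the general shape of such a model, not the author's implementation: the vocabulary sizes, model width, head count, and layer counts are illustrative assumptions, and positional encodings and the training loop are omitted for brevity.

    import torch
    import torch.nn as nn

    # Illustrative (assumed) sizes -- not the thesis's actual settings.
    SRC_VOCAB, TGT_VOCAB, D_MODEL = 8000, 8000, 256

    class GeezAmharicTransformer(nn.Module):
        """Encoder-decoder Transformer for one translation direction."""
        def __init__(self):
            super().__init__()
            self.src_emb = nn.Embedding(SRC_VOCAB, D_MODEL)
            self.tgt_emb = nn.Embedding(TGT_VOCAB, D_MODEL)
            self.transformer = nn.Transformer(
                d_model=D_MODEL, nhead=8,
                num_encoder_layers=3, num_decoder_layers=3,
                batch_first=True)
            self.out = nn.Linear(D_MODEL, TGT_VOCAB)

        def forward(self, src_ids, tgt_ids):
            # Causal mask stops the decoder attending to future target tokens.
            tgt_mask = self.transformer.generate_square_subsequent_mask(
                tgt_ids.size(1))
            hidden = self.transformer(self.src_emb(src_ids),
                                      self.tgt_emb(tgt_ids),
                                      tgt_mask=tgt_mask)
            return self.out(hidden)  # (batch, tgt_len, TGT_VOCAB) logits

    model = GeezAmharicTransformer()
    src = torch.randint(0, SRC_VOCAB, (2, 10))  # 2 dummy source sentences
    tgt = torch.randint(0, TGT_VOCAB, (2, 12))  # shifted dummy target input
    logits = model(src, tgt)                    # -> torch.Size([2, 12, 8000])

A model for the reverse direction would swap the source and target vocabularies; the thesis's bi-directional results come from evaluating both directions.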
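The reported 22.9 and 29.7 BLEU scores are corpus-level metrics. Below is a minimal sketch of computing such a score with the sacrebleu library; the file names are hypothetical, and the thesis does not state which BLEU implementation it used.

    import sacrebleu

    # Hypothetical files: system output and reference, one sentence per line.
    with open("geez2amh.hyp", encoding="utf-8") as f:
        hypotheses = [line.strip() for line in f]
    with open("amh.ref", encoding="utf-8") as f:
        references = [line.strip() for line in f]

    # sacrebleu takes a list of reference streams (a single reference here).
    bleu = sacrebleu.corpus_bleu(hypotheses, [references])
    print(round(bleu.score, 1))  # the thesis reports 22.9 for Ge'ez -> Amharic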

