
Training and analyzing a Transformer-based machine translation model

Abstract

This work analyzes Transformer-based machine translation models and tests the feasibility of using models trained on a specialized corpus. To train such a model, a parallel English-French corpus was built from seven texts related to the Convention of 25 October 1980 on the Civil Aspects of International Child Abduction. The translations produced by the trained model were compared with those produced by Google Translate. The evaluation stage combined automatic evaluation with sacreBLEU and human evaluation. In the automatic evaluation, sentences produced by the trained model scored higher, on average, than those generated by the non-trained model. The human evaluation revealed adequacy errors in the use of language specific to the subject matter of the 1980 Hague Convention, both in sentences generated by the trained model and in sentences generated by Google Translate.

Keywords:
Computational Linguistics; Machine translation; Transformer; Parallel corpus; Machine translation evaluation

Universidade Federal de Minas Gerais (UFMG), Av. Antônio Carlos, 6627, Pampulha, CEP 31270-901, Belo Horizonte, Minas Gerais, Brazil. Tel: +55 (31) 3409-6009
E-mail: revistatextolivre@letras.ufmg.br