NLP

Kakao Enterprise’s WMT21 Machine Translation using Terminologies Task Submission

박윤주 (Kakao Enterprise), 선지민 (Kakao Enterprise, Carnegie Mellon University), 김재현 (Kakao Enterprise), 류성원 (Kakao Enterprise), 이창민 (Kakao Enterprise)

WMT 2021 System Papers

2021-11-10

Abstract

This paper describes Kakao Enterprise’s submission to the WMT21 shared task on Machine Translation using Terminologies. We integrate terminology constraints by pre-training with target lemma annotations and then fine-tuning with exact target annotations drawn from the given terminology dataset. The resulting model performs strongly in both translation quality and term consistency, ranking first on COMET in the En→Fr language direction. We also explore back-translation, explicitly training on terminologies as additional parallel data, and in-domain data selection.
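
To make the idea of target annotations concrete, the following is a minimal sketch of how a source sentence might be annotated inline with target-side terms before it is fed to the translation model. It is an illustration only, not the authors' implementation: the marker tokens `<t>`, `<sep>`, `</t>`, the function name `annotate_source`, and the single-word, case-insensitive matching are all assumptions, since the abstract does not specify the annotation format. Under this sketch, pre-training would pass a dictionary that maps source words to target lemmas, while fine-tuning would pass the exact target forms from the terminology dataset.

```python
from typing import Dict, List

# Hypothetical marker tokens; the abstract does not specify the real format.
TERM_START, TERM_SEP, TERM_END = "<t>", "<sep>", "</t>"


def annotate_source(source: str, terminology: Dict[str, str]) -> str:
    """Insert the target-side term next to each matching source word.

    `terminology` maps a lowercased single source word to its target term
    (a lemma or an exact surface form, depending on the training stage).
    Multi-word terms are not handled in this sketch.
    """
    annotated: List[str] = []
    for token in source.split():
        # Separate trailing punctuation so "outbreak." still matches "outbreak".
        core = token.rstrip(".,;:!?")
        trailing = token[len(core):]
        target = terminology.get(core.lower())
        if target is None:
            annotated.append(token)
        else:
            annotated.append(
                f"{TERM_START} {core} {TERM_SEP} {target} {TERM_END}{trailing}"
            )
    return " ".join(annotated)


if __name__ == "__main__":
    terms = {"vaccine": "vaccin", "outbreak": "flambée"}
    print(annotate_source("The vaccine slowed the outbreak.", terms))
    # The <t> vaccine <sep> vaccin </t> slowed the <t> outbreak <sep> flambée </t>.
```

Keeping the annotation step independent of the dictionary contents means the same preprocessing can serve both stages, with only the terminology source changing between pre-training and fine-tuning.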