Kakao Enterprise’s WMT21 Machine Translation using Terminologies Task Submission

박윤주 (Kakao Enterprise), 선지민 (Kakao Enterprise, Carnegie Mellon University), 김재현 (Kakao Enterprise), 류성원 (Kakao Enterprise), 이창민 (Kakao Enterprise)

WMT 2021 System Papers



This paper describes Kakao Enterprise’s submission to the WMT21 Machine Translation using Terminologies shared task. We integrate terminology constraints in two stages: pre-training with target lemma annotations, then fine-tuning with exact target annotations using the provided terminology dataset. The resulting model performs strongly in both translation quality and term consistency, ranking first by COMET in the En→Fr language direction. We also explore back-translation, explicitly training on terminologies as additional parallel data, and in-domain data selection.
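To make the idea of annotating source text with target terms concrete, the following is a minimal sketch, not the submission's actual pipeline. It assumes single-token terms, whitespace tokenization, and hypothetical `<t>` / `</t>` marker tokens; the function name `annotate_source` and the example dictionaries are likewise illustrative. Under these assumptions, the same routine covers both stages: a lemma dictionary for pre-training and an exact-form dictionary for fine-tuning.

```python
from typing import Dict, List


def annotate_source(
    tokens: List[str],
    term_dict: Dict[str, str],
    open_tag: str = "<t>",    # hypothetical marker token, not from the paper
    close_tag: str = "</t>",  # hypothetical marker token, not from the paper
) -> List[str]:
    """Insert the target-side term after each matching source token.

    Simplification: only single-token source terms are matched,
    using a case-insensitive dictionary lookup.
    """
    annotated: List[str] = []
    for token in tokens:
        annotated.append(token)
        target_term = term_dict.get(token.lower())
        if target_term is not None:
            annotated.extend([open_tag, target_term, close_tag])
    return annotated


# Illustrative En->Fr entries (assumed, not taken from the task data).
lemma_terms = {"vaccines": "vaccin"}   # target lemma, as in the pre-training stage
exact_terms = {"vaccines": "vaccins"}  # exact target form, as in the fine-tuning stage

source = "The vaccines are effective".split()
pretrain_input = annotate_source(source, lemma_terms)
finetune_input = annotate_source(source, exact_terms)
print(" ".join(pretrain_input))
print(" ".join(finetune_input))
```

A real system would also need multi-word term matching and subword-aware tokenization, which this sketch leaves out.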