OutFlip: Generating Examples for Unknown Intent Detection with Natural Language Attack

DongHyun Choi (Kakao Enterprise), Myeong Cheol Shin (Kakao Enterprise), EungGyun Kim (Kakao Enterprise), Dong Ryeol Shin (Sungkyunkwan University)

ACL-IJCNLP Findings of ACL



Out-of-domain (OOD) input detection is vital in a task-oriented dialogue system because accepting unsupported inputs can lead the system to respond incorrectly. This paper proposes OutFlip, a method that automatically generates out-of-domain samples using only the in-domain training dataset. HotFlip, a white-box natural language attack method, is revised to generate out-of-domain samples instead of adversarial examples. Our evaluation results show that integrating OutFlip-generated out-of-domain samples into the training dataset can significantly improve an intent classification model’s out-of-domain detection performance.
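
For readers unfamiliar with HotFlip, the following is a minimal sketch of the gradient-based token flip it relies on: the loss change from swapping one token for another is estimated to first order as the dot product of the embedding difference and the loss gradient. The toy vocabulary, embeddings, weights, and function names are all hypothetical, and the sketch shows only a plain HotFlip-style flip on a mean-pooled linear classifier. It does not reproduce how OutFlip revises the attack, which the abstract does not specify.

```python
import math

# Toy setup (all values hypothetical): 4-word vocabulary, 2-d embeddings, 2 intents.
VOCAB = ["play", "music", "weather", "today"]
EMB = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9]]
W = [[2.0, -2.0], [-2.0, 2.0]]  # one weight row per intent: 0 = music, 1 = weather


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(token_ids):
    """Intent probabilities computed from the mean token embedding."""
    n = len(token_ids)
    mean = [sum(EMB[t][k] for t in token_ids) / n for k in range(2)]
    logits = [sum(w * x for w, x in zip(row, mean)) for row in W]
    return softmax(logits)


def loss_grad_wrt_embedding(token_ids, label):
    """Gradient of the cross-entropy loss w.r.t. each token's embedding.

    For this mean-pooled linear model the gradient is the same at every
    position: (1/n) * W^T (p - onehot(label)).
    """
    probs = predict(token_ids)
    err = [p - (1.0 if c == label else 0.0) for c, p in enumerate(probs)]
    n = len(token_ids)
    return [sum(W[c][k] * err[c] for c in range(len(W))) / n for k in range(2)]


def best_flip(token_ids, label):
    """Return (position, new_token_id, estimated_loss_increase) for the best single flip."""
    grad = loss_grad_wrt_embedding(token_ids, label)
    best = None
    for pos, old in enumerate(token_ids):
        for new in range(len(VOCAB)):
            if new == old:
                continue
            # First-order estimate of the loss change caused by swapping old -> new.
            gain = sum((EMB[new][k] - EMB[old][k]) * grad[k] for k in range(2))
            if best is None or gain > best[2]:
                best = (pos, new, gain)
    return best


sentence = [0, 1]  # "play music", labelled as intent 0
pos, new, gain = best_flip(sentence, label=0)
flipped = list(sentence)
flipped[pos] = new  # apply the single flip with the highest estimated loss increase
```

In this toy case the selected flip replaces "play" with "weather", which lowers the model's confidence in the original intent. A real white-box attack would obtain the gradient by backpropagation through the actual intent classifier rather than from a closed-form expression.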