Effective Korean units for sequence encoding in deep learning

Abstract

Deep learning has emerged as a new area of machine-learning research and has been successfully applied to natural language processing tasks such as machine translation and sentence classification. In this work, we investigate effective Korean input token units for encoding Korean sentences in classification problems such as topic detection. Recurrent and convolutional neural networks for Korean sentence encoding are briefly reviewed, and various Korean input token units, including character, morpheme-tag, morpheme, word, subword, and syllable-window units, as well as hybrids of morpheme and character methods, are explored. Extensive experiments on sentiment analysis, topic detection, and intention understanding tasks are conducted to find effective input token units.
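Several of the token units compared in the paper can be illustrated with a short sketch. This is not the paper's pipeline (morpheme and subword units require a morphological analyzer or a trained segmentation model); it is a minimal example, on a hypothetical sentence, of the word, character (syllable), and syllable-window units:

```python
# Illustrative sketch of some Korean input token units (assumption:
# example sentence and window size n=2 are ours, not from the paper).
sentence = "나는 학교에 간다"  # "I go to school"

# Word units: split on whitespace.
words = sentence.split()

# Character (syllable) units: each Hangul syllable block is one token.
chars = [c for c in sentence if c != " "]

# Syllable-window units: overlapping n-grams over the syllable sequence.
def syllable_windows(text, n=2):
    syls = [c for c in text if c != " "]
    return ["".join(syls[i:i + n]) for i in range(len(syls) - n + 1)]

print(words)                       # ['나는', '학교에', '간다']
print(chars)                       # ['나', '는', '학', '교', '에', '간', '다']
print(syllable_windows(sentence))  # ['나는', '는학', '학교', '교에', '에간', '간다']
```

Morpheme and morpheme-tag units would additionally require a Korean morphological analyzer, and subword units a learned vocabulary, which is why they are omitted here.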

Publication
In Journal of KIISE, Vol. 45, No. 5, pp. 457-465, May