딥러닝 기반 언어모델을 이용한 한국어 학습자 쓰기 평가의 자동 점수 구간 분류 -KoBERT와 KoGPT2를 중심으로-

조희련; 이유미; 임현열; 차준우; 이찬규

doi:10.15652/ink.2021.18.1.217

Automatic Score Range Classification of Korean Essays Using Deep Learning-based Korean Language Models -The Case of KoBERT & KoGPT2-

Citations (Web of Science): 1

Abstract

We investigate how well deep learning-based Korean language models perform at automatically classifying Korean essays written by foreign students into score ranges. We constructed an experimental data set of 304 essays on four topics: the criteria for choosing a job (‘job’), the conditions of a happy life (‘happiness’), the relationship between money and happiness, and the definition of success. The essays were divided into four scoring levels, and we fine-tuned two deep learning-based Korean language models, KoBERT and KoGPT2, on this 4-class data set for the automatic essay classification experiment. Under 7-fold cross-validation, classification accuracy on the ‘job’ and ‘happiness’ essays was 48.8% and 65.2%, respectively, for KoBERT, and 50.6% and 58.9%, respectively, for KoGPT2. On the integrated data set combining all essays, 7-fold cross-validation accuracy was 54.5% for KoBERT and 46.5% for KoGPT2.
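The abstract reports accuracies from 7-fold cross-validation on a 4-class data set. The following is a minimal sketch of how such a figure can be computed, not the authors' code: the helper names are hypothetical, the essays and level labels are synthetic placeholders, and a majority-class predictor stands in for a fine-tuned KoBERT or KoGPT2 model. It also assumes stratified folds and a simple mean over fold accuracies, neither of which the abstract specifies.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, k, seed=0):
    """Split indices into k folds, keeping class proportions roughly equal."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin, continuing where the previous class stopped.
        for i, idx in enumerate(indices):
            folds[(offset + i) % k].append(idx)
        offset += len(indices)
    return folds


def majority_class_predictor(train_labels):
    """Placeholder model (assumption): always predicts the most frequent label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda _essay: most_common


def cross_validated_accuracy(essays, labels, k=7, seed=0):
    """Mean held-out accuracy over k folds."""
    folds = stratified_folds(labels, k, seed)
    accuracies = []
    for held_out in folds:
        held_out_set = set(held_out)
        train_labels = [
            labels[i] for i in range(len(labels)) if i not in held_out_set
        ]
        # A real experiment would fine-tune a language model on the training
        # essays here; the placeholder only looks at the training labels.
        predict = majority_class_predictor(train_labels)
        correct = sum(predict(essays[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k


# Synthetic stand-in data: 304 placeholder essays with made-up levels 0-3.
essays = [f"essay {i}" for i in range(304)]
labels = [i % 4 for i in range(304)]
mean_accuracy = cross_validated_accuracy(essays, labels, k=7)
print(f"mean 7-fold accuracy: {mean_accuracy:.3f}")
```

With four balanced classes, a majority-class placeholder scores near chance (about 25%), which gives a rough baseline against which the reported accuracies can be read.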

Keywords

Deep Learning; Language Model; Korean Essays; KoBERT; KoGPT2; Automatic Score Range Classification

Title: 딥러닝 기반 언어모델을 이용한 한국어 학습자 쓰기 평가의 자동 점수 구간 분류 -KoBERT와 KoGPT2를 중심으로-

Title (other language): Automatic Score Range Classification of Korean Essays Using Deep Learning-based Korean Language Models -The Case of KoBERT & KoGPT2-

Authors: 조희련; 이유미; 임현열; 차준우; 이찬규

DOI: 10.15652/ink.2021.18.1.217

Publication date: 2021-04

Journal: 한국언어문화학

Volume: 18

Issue: 1

Pages: 217-241