Context-aware Korean sentence ordering using large language models

박상춘; 이혜원; 서정민; 여서연; 김대호; 곽일엽

doi:10.5351/KJAS.2026.39.2.221

Abstract

Sentence order critically affects textual coherence and comprehension, yet real-world data often exhibit disrupted ordering. This study investigates context-aware sentence ordering for Korean with large language models, comparing three approaches (Pairwise, Sequence, and Global) by fine-tuning pretrained models such as KLUE-BERT, KoELECTRA, KLUE-RoBERTa, and T5. Experiments were conducted on the DACON Context-Aware Sentence Ordering AI Competition dataset, comprising 7,350 training and 1,780 test samples. The Pairwise approach effectively captured local sentence relations but failed to model global coherence. The Sequence approach provided an intuitive framework, yet its performance degraded with longer inputs due to overfitting. By contrast, the Global approach, formulated as a classification problem over all permutations, produced the most consistent and best results. Notably, the Global model based on KLUE-RoBERTa achieved the highest score of 83.71% on the private leaderboard.
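
The abstract describes the Global approach as classification over all permutations. As a rough illustration of what that label space looks like, the sketch below maps each possible ordering to one class index and decodes a predicted class back into a sentence order. It assumes four sentences per sample, which the abstract does not state, and the function names are hypothetical; the classifier itself (for example a fine-tuned KLUE-RoBERTa) is left out.

```python
from itertools import permutations


def build_label_space(num_sentences):
    """Enumerate every ordering of `num_sentences` positions as one class."""
    perms = list(permutations(range(num_sentences)))
    perm_to_label = {perm: label for label, perm in enumerate(perms)}
    return perms, perm_to_label


def decode(label, shuffled_sentences, perms):
    """Reorder shuffled sentences using the permutation behind a class label."""
    return [shuffled_sentences[i] for i in perms[label]]


# Assumption: 4 sentences per sample, so the classifier has 4! = 24 classes.
perms, perm_to_label = build_label_space(4)

# Toy sample: the permutation lists, for each output slot, which shuffled
# position to read from.
shuffled = ["B", "D", "A", "C"]
label = perm_to_label[(2, 0, 3, 1)]
restored = decode(label, shuffled, perms)
```

Because the number of classes grows factorially with the sentence count, a formulation like this is only practical when each sample contains a small, fixed number of sentences.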

Keywords

deep learning; sentence order prediction; large language model; Korean natural language processing

Title: Context-aware Korean sentence ordering using large language models

Title (original language): 대규모 언어 모델을 활용한 한국어 문맥 기반 문장 순서 예측 모형

Authors: 박상춘; 이혜원; 서정민; 여서연; 김대호; 곽일엽

DOI: 10.5351/KJAS.2026.39.2.221

Publication date: 2026-04

Type: Article

Journal: The Korean Journal of Applied Statistics (응용통계연구)

Volume: 39

Issue: 2

Pages: 221-234