LLM-based Animation Reframing for Short-Form Video (대규모 언어 모델 기반 애니메이션 숏폼 리프레이밍)

이강희; 양해준; 배재형; 김탁훈; 최종원

doi:10.5909/JBE.2026.31.1.152

Abstract

Because most multimodal large language models (MLLMs) are trained on real-world data, they struggle to interpret animated videos accurately: such videos feature stylized visual characteristics such as exaggerated geometry and simplified shading. As a result, video reframing often places key characters outside the frame, and recognition performance degrades when characters are not included in the pre-training data. To address this issue, this paper proposes a training-free short-form transformation pipeline that jointly interprets animated video content and script-based text prompts while using character images as visual queries. The proposed approach first performs scene extraction using an MLLM, then detects objects based on the visual queries, and finally applies Adaptive Zooming to mitigate object loss and cropping errors that may occur during reframing.
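
The abstract does not spell out how Adaptive Zooming is computed, so the following is only a minimal sketch of the general idea of a crop window that adapts its zoom level to a detected character box. Everything here is an assumption made for illustration, not the authors' method: the `Box` type, the `reframe_crop` function, the 9:16 target aspect ratio, the `margin` padding, and the `min_scale` zoom limit are all hypothetical names and values.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned character box in pixel coordinates (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float


def reframe_crop(frame_w, frame_h, box, aspect=9 / 16, margin=0.1, min_scale=0.5):
    """Return (x, y, w, h, fits) for a vertical crop around `box`.

    `aspect` is crop width / crop height. `fits` is False when the padded
    box is too large to fit inside even the largest possible crop.
    """
    # Largest crop of the target aspect ratio that fits in the frame.
    max_w = min(frame_w, frame_h * aspect)
    # Do not zoom in further than `min_scale` of the largest crop.
    min_w = max_w * min_scale

    # Box size padded by `margin` on every side.
    padded_w = (box.x2 - box.x1) * (1 + 2 * margin)
    padded_h = (box.y2 - box.y1) * (1 + 2 * margin)

    # Crop width needed to contain the padded box in both dimensions.
    needed_w = max(padded_w, padded_h * aspect)

    # Zoom out only as far as needed, within the allowed range.
    crop_w = min(max(needed_w, min_w), max_w)
    crop_h = crop_w / aspect

    # Center on the box, then clamp the window to the frame bounds.
    cx = (box.x1 + box.x2) / 2
    cy = (box.y1 + box.y2) / 2
    x = min(max(cx - crop_w / 2, 0), frame_w - crop_w)
    y = min(max(cy - crop_h / 2, 0), frame_h - crop_h)

    return x, y, crop_w, crop_h, needed_w <= max_w


if __name__ == "__main__":
    # A small character near the middle of a 1920x1080 frame.
    print(reframe_crop(1920, 1080, Box(900, 400, 1000, 600)))
```

In this sketch a small box yields a tighter crop, a larger box widens the crop up to the full frame height, and the `fits` flag signals the case where the character would still be cut off, which a real pipeline would need to handle separately.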

Keywords

Large Language Model; Segment Retrieval; Visual Query Localization; Animation

Title: 대규모 언어 모델 기반 애니메이션 숏폼 리프레이밍

Title (English): LLM-based Animation Reframing for Short-Form Video

Authors: 이강희; 양해준; 배재형; 김탁훈; 최종원

DOI: 10.5909/JBE.2026.31.1.152

Publication date: 2026-01

Type: Y

Journal: 방송공학회 논문지 (Journal of Broadcast Engineering)

Volume: 31

Issue: 1

Pages: 152-161
