Automatic background animation generation aligned with LLM-generated lyrics for children’s songs

Lee, Sanghyuck; Khairulov, Timur; Park, Ye-Chan; Seo, Wangduk; Lee, Jaesung

doi:10.1038/s41598-025-30139-6

Abstract

Media content creation is labor-intensive, expensive, and time-consuming. Recent developments in artificial intelligence have introduced generative models with considerable potential in the entertainment industry. Meanwhile, demand for video content tailored to children's songs has steadily increased, reflecting the contribution of such songs to early education and entertainment. In this paper, we present a generative model-based approach to automated video creation for children's songs. The proposed pipeline consists of three key steps: generating lyrics using a language model, producing background images with a diffusion model, and overlaying dynamic visual effects to enhance the final output. Our experiments compare conventional diffusion models and prompt engineering methods, highlighting the superior performance of CascadeSD and the efficacy of landscape or image-style prompting. Lastly, we provide experimental results comparing text-to-video models with our pipeline. The code for our project is available in the following repository: https://github.com/KhrTim/BAGen.
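
The three steps named in the abstract can be pictured as a simple chain of stages. The following is only a structural sketch and is not taken from the BAGen repository: every function name here (`generate_lyrics`, `generate_background`, `overlay_effect`, `build_video`) is hypothetical, the language model and diffusion model are replaced by deterministic placeholders, and the way the style keyword is appended to the prompt is an assumption about what "landscape or image-style prompting" might look like.

```python
from dataclasses import dataclass
from typing import List

# Grayscale stand-in for a generated background image (rows of pixel values).
Image = List[List[int]]


@dataclass
class Scene:
    lyric: str
    frames: List[Image]


def generate_lyrics(topic: str) -> List[str]:
    # Placeholder for the language-model step (hypothetical): fixed lines.
    return [f"{topic} in the morning", f"{topic} at night"]


def generate_background(prompt: str, width: int = 8, height: int = 4) -> Image:
    # Placeholder for the diffusion-model step (hypothetical): a flat image
    # whose brightness (0..199) depends deterministically on the prompt text.
    shade = sum(map(ord, prompt)) % 200
    return [[shade] * width for _ in range(height)]


def overlay_effect(background: Image, n_frames: int = 3) -> List[Image]:
    # Stand-in for a dynamic visual effect: one bright pixel in the top row
    # that moves one column to the right on each frame.
    frames = []
    for t in range(n_frames):
        frame = [row[:] for row in background]  # copy so the background is untouched
        frame[0][t % len(frame[0])] = 255
        frames.append(frame)
    return frames


def build_video(topic: str, style: str = "landscape") -> List[Scene]:
    # Chain the three stages: lyrics -> background per lyric line -> animated frames.
    scenes = []
    for line in generate_lyrics(topic):
        prompt = f"{line}, {style}"  # assumed prompt format, for illustration only
        scenes.append(Scene(line, overlay_effect(generate_background(prompt))))
    return scenes
```

Calling `build_video("Little star")` would yield one `Scene` per lyric line, each holding a short list of frames. In a real system the placeholders would be replaced by calls to an actual language model and diffusion model.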

Title: Automatic background animation generation aligned with LLM-generated lyrics for children’s songs

Authors: Lee, Sanghyuck; Khairulov, Timur; Park, Ye-Chan; Seo, Wangduk; Lee, Jaesung

DOI: 10.1038/s41598-025-30139-6

Published: 2026

Type: Article

Journal: Scientific Reports

Volume: 16

Issue: 1
