- Lee, Sanghyuck;
- Khairulov, Timur;
- Park, Ye-Chan;
- Seo, Wangduk;
- Lee, Jaesung
Abstract
Creating media content is labor-intensive, expensive, and time-consuming. Recent advances in artificial intelligence have introduced generative models with considerable potential for the entertainment industry. Meanwhile, demand for video content tailored to children's songs has steadily increased, reflecting their important role in early education and entertainment. In this paper, we present a generative model-based approach to automated video creation for children's songs. The proposed pipeline consists of three key steps: generating lyrics using a language model, producing background images with a diffusion model, and overlaying dynamic visual effects to enhance the final output. Our experiments compare conventional diffusion models and prompt engineering methods, highlighting the superior performance of CascadeSD and the efficacy of landscape or image-style prompting. Lastly, we provide experimental results comparing text-to-video models with our pipeline. The code for our project is available in the following repository: https://github.com/KhrTim/BAGen.
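The three-step pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration of the control flow only, not the authors' implementation: the names `Scene`, `build_video_plan`, and the three stub functions are hypothetical, the stubs stand in for a language model, a diffusion model, and an effect selector, and the prompt template is an assumed example of landscape or image-style prompting.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Scene:
    """One segment of the planned video (hypothetical structure)."""
    lyric: str
    image: str   # placeholder string; a real pipeline would hold image data
    effect: str  # name of the visual effect to overlay


def build_video_plan(
    theme: str,
    generate_lyrics: Callable[[str], List[str]],
    generate_background: Callable[[str], str],
    pick_effect: Callable[[str], str],
) -> List[Scene]:
    """Chain the three steps: lyrics, then a background per line, then an effect."""
    scenes = []
    for line in generate_lyrics(theme):                      # step 1: lyrics
        # Assumed prompt template illustrating landscape / image-style prompting.
        prompt = f"{line}, landscape, picture-book illustration style"
        image = generate_background(prompt)                  # step 2: background
        scenes.append(Scene(line, image, pick_effect(line)))  # step 3: effect
    return scenes


# Stubs standing in for real models (assumptions, for illustration only).
def stub_lyrics(theme: str) -> List[str]:
    return [f"{theme} line {i}" for i in (1, 2)]


def stub_background(prompt: str) -> str:
    return f"image<{prompt}>"


def stub_effect(line: str) -> str:
    return "sparkle"


plan = build_video_plan("rainbow", stub_lyrics, stub_background, stub_effect)
for scene in plan:
    print(scene)
```

Passing the three steps in as callables keeps the sketch independent of any particular model; each stub could be swapped for a real generator without changing `build_video_plan`.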
- Title
- Automatic background animation generation aligned with LLM-generated lyrics for children's songs
- Authors
- Lee, Sanghyuck; Khairulov, Timur; Park, Ye-Chan; Seo, Wangduk; Lee, Jaesung
- Publication date
- 2026-01
- Type
- Article
- Volume
- 16
- Issue
- 1