GODiff: Region-Specific Semantic Editing with CLIP-Guided Diffusion Models
Citations
Web of Science: 1
Scopus: 0

Abstract

Text-based image editing enables intuitive and flexible content creation, but existing methods often suffer from issues such as the loss of original image characteristics within the target area or unintended modifications in irrelevant regions. Additionally, high-quality results often require additional training or fine-tuning, which causes considerable inconvenience in practical use. To address these limitations, we propose GODiff, a method that offers a more convenient and user-friendly editing experience. It allows immediate and precise modifications of a single image based on dynamically changing text prompts and can be flexibly applied to various diffusion models without additional training. In particular, the editing region is automatically identified, and text-based guidance is focused solely on that area, thereby improving efficiency and minimizing unnecessary changes. Furthermore, by optimizing the guidance in real time during the generative process, GODiff more accurately reflects the intent of the text prompt and produces more natural and consistent results. It has demonstrated consistent performance across diverse datasets and model architectures, and a human evaluation confirmed that it provides superior editing quality in terms of naturalness and user preference compared to existing methods. © 2013 IEEE.

Keywords

CLIP-Based Image Manipulation; Diffusion Models; Image Editing
Title
GODiff: Region-Specific Semantic Editing with CLIP-Guided Diffusion Models
Authors
Kim, Junho; Kim, Eunwoo; Lee, Kyungjae
DOI
10.1109/ACCESS.2025.3580263
Publication date
2025
Type
Article
Journal
IEEE Access
Volume
13
Pages
112818-112834
