GODiff: Region-Specific Semantic Editing with CLIP-Guided Diffusion Models

Kim, Junho; Kim, Eunwoo; Lee, Kyungjae

doi:10.1109/ACCESS.2025.3580263

Citations

Web of Science: 1

Scopus: 0

Abstract

Text-based image editing enables intuitive and flexible content creation, but existing methods often suffer from issues such as the loss of original image characteristics within the target area or unintended modifications in irrelevant regions. Additionally, high-quality results often require additional training or fine-tuning, which causes considerable inconvenience in practical use. To address these limitations, we propose GODiff, a method that offers a more convenient and user-friendly editing experience. It allows immediate and precise modifications of a single image based on dynamically changing text prompts and can be flexibly applied to various diffusion models without additional training. In particular, the editing region is automatically identified, and text-based guidance is focused solely on that area, thereby improving efficiency and minimizing unnecessary changes. Furthermore, by optimizing the guidance in real time during the generative process, GODiff more accurately reflects the intent of the text prompt and produces more natural and consistent results. It has demonstrated consistent performance across diverse datasets and model architectures, and a human evaluation confirmed that it provides superior editing quality in terms of naturalness and user preference compared to existing methods. © 2013 IEEE.
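The region-restricted guidance described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the abstract does not say how the editing region is detected or what form the guidance takes, so the mask here is supplied by hand, a simple quadratic loss stands in for the CLIP-based text guidance, and every name (`toy_guidance_grad`, `masked_guidance_step`, `step_size`) is hypothetical.

```python
from typing import List

Image = List[List[float]]


def toy_guidance_grad(image: Image, target: float) -> Image:
    """Per-pixel gradient of the stand-in loss 0.5 * (pixel - target) ** 2.

    Assumption: a real system would use the gradient of a CLIP text-image
    loss here; the quadratic loss only keeps the sketch self-contained.
    """
    return [[px - target for px in row] for row in image]


def masked_guidance_step(
    image: Image, mask: Image, target: float, step_size: float = 0.5
) -> Image:
    """Apply one guidance update only where mask == 1.

    Pixels with mask == 0 receive a zero update, so regions outside the
    editing area are left unchanged.
    """
    grad = toy_guidance_grad(image, target)
    return [
        [px - step_size * m * g for px, m, g in zip(img_row, mask_row, grad_row)]
        for img_row, mask_row, grad_row in zip(image, mask, grad)
    ]


# Hand-specified 2x2 example: only the top-left pixel is in the edit region.
image = [[0.0, 0.0], [0.0, 0.0]]
mask = [[1.0, 0.0], [0.0, 0.0]]
edited = masked_guidance_step(image, mask, target=1.0)
```

In this sketch, multiplying the gradient by the mask is what confines the update to the selected region; in a diffusion sampler the same masking would presumably be applied at each denoising step.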

Keywords

CLIP-Based Image Manipulation; Diffusion Models; Image Editing

Publication date: 2025

Type: Article

Journal: IEEE Access

Volume: 13

Pages: 112818-112834

ScholarWorks@Chung-Ang University
