Text-guided cross-position attention for image analysis: Case of medical image
  • Lee, Go-Eun
  • Choi, Sang Tae
  • Kim, Seon Ho
  • Chung, Jaewoo
  • Cho, Jungchan
  • Choi, Sang-Il
Citations (Web of Science): 0
Citations (Scopus): 0

Abstract

In this study, we propose a novel text-guided cross-position attention module that applies multimodal text and image data to position attention in medical image segmentation. To match the dimension of the text feature to that of the image feature map, learnable parameters are multiplied by the text features, which are then combined with the image features via cross-attention to capture multimodal semantics. This allows the model to learn the dependencies between various text and image characteristics. The proposed model demonstrated superior performance compared with other medical models trained on image-only or image-text data. The module was applied to an active sacroiliitis diagnosis system and to brain midline shift analysis by keypoint detection. In the first application, a region of interest (ROI) was segmented to classify inflammation of the sacroiliac joints. In the second application, the module was extended to a keypoint detection task that applied a Gaussian heatmap to detect the points required to determine the brain midline shift. Experimental results show that the proposed module improves results in both applications. © 2025
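The fusion mechanism the abstract describes can be sketched in PyTorch. This is a hypothetical illustration, not the authors' published implementation: the class name, the learnable projection `text_proj` (the "learnable parameters multiplied by text features" used to match dimensions), and all shapes are assumptions inferred from the abstract.

```python
import torch
import torch.nn as nn


class TextGuidedCrossPositionAttention(nn.Module):
    """Hypothetical sketch of the module described in the abstract:
    a learnable parameter matrix projects the text feature to the image
    feature-map dimension, then cross-attention lets each spatial
    position of the image attend to the text tokens."""

    def __init__(self, text_dim: int, img_channels: int, num_heads: int = 4):
        super().__init__()
        # Learnable parameters multiplied by the text features to match
        # the image channel dimension (assumed form of the projection).
        self.text_proj = nn.Parameter(torch.randn(text_dim, img_channels))
        self.attn = nn.MultiheadAttention(img_channels, num_heads, batch_first=True)

    def forward(self, img_feat: torch.Tensor, text_feat: torch.Tensor) -> torch.Tensor:
        # img_feat: (B, C, H, W); text_feat: (B, T, text_dim)
        b, c, h, w = img_feat.shape
        # Project text tokens to the image channel dimension: (B, T, C)
        text_tokens = text_feat @ self.text_proj
        # Flatten spatial positions into a token sequence: (B, H*W, C)
        queries = img_feat.flatten(2).transpose(1, 2)
        # Cross-attention: image positions (queries) attend to text tokens.
        fused, _ = self.attn(queries, text_tokens, text_tokens)
        # Residual fusion, restored to the original feature-map shape.
        return (queries + fused).transpose(1, 2).reshape(b, c, h, w)
```

The output keeps the input feature-map shape, so the module could in principle be dropped between encoder stages of a segmentation network; how the authors actually place it is not specified in the abstract.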

Keywords

Cross-positional attention; Image segmentation; Medical image; Multimodal learning; Text-guided attention
Title
Text-guided cross-position attention for image analysis: Case of medical image
Authors
Lee, Go-Eun; Choi, Sang Tae; Kim, Seon Ho; Chung, Jaewoo; Cho, Jungchan; Choi, Sang-Il
DOI
10.1016/j.compbiomed.2025.110297
Publication date
2025-07
Type
Article
Journal
Computers in Biology and Medicine
Volume
193