Text-guided cross-position attention for image analysis: Case of medical image
  • Lee, Go-Eun
  • Choi, Sang Tae
  • Kim, Seon Ho
  • Chung, Jaewoo
  • Cho, Jungchan
  • Choi, Sang-Il
Citations (Web of Science): 0
Citations (Scopus): 0

Abstract

In this study, we propose a novel text-guided cross-position attention module that applies multimodal text and image data to position attention in medical image segmentation. To match the dimension of the text feature to that of the image feature map, learnable parameters are multiplied by the text features, which are then combined with the image features via cross-attention to capture multimodal semantics. This allows the model to learn the dependencies between various text and image characteristics. The proposed model demonstrated superior performance compared with other medical models trained on image-only or image-text data. The module was applied to an active sacroiliitis diagnosis system and to brain midline shift analysis by keypoint detection. In the first application, a region of interest (ROI) was segmented to classify inflammation of the sacroiliac joints. In the second application, the module was extended to a keypoint detection task that applied a Gaussian heatmap to detect the points required to determine the brain midline shift. Experimental results show that the proposed module improves results in both applications. © 2025
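The fusion mechanism the abstract describes can be sketched in PyTorch. This is a hypothetical illustration, not the authors' published implementation: the class name, the learnable projection `text_proj` (the "learnable parameters multiplied by text features" used to match dimensions), and all shapes are assumptions inferred from the abstract.

```python
import torch
import torch.nn as nn


class TextGuidedCrossPositionAttention(nn.Module):
    """Hypothetical sketch of the module described in the abstract:
    a learnable parameter matrix projects the text feature to the image
    feature-map dimension, then cross-attention lets each spatial
    position of the image attend to the text tokens."""

    def __init__(self, text_dim: int, img_channels: int, num_heads: int = 4):
        super().__init__()
        # Learnable parameters multiplied by the text features to match
        # the image channel dimension (assumed form of the projection).
        self.text_proj = nn.Parameter(torch.randn(text_dim, img_channels))
        self.attn = nn.MultiheadAttention(img_channels, num_heads, batch_first=True)

    def forward(self, img_feat: torch.Tensor, text_feat: torch.Tensor) -> torch.Tensor:
        # img_feat: (B, C, H, W); text_feat: (B, T, text_dim)
        b, c, h, w = img_feat.shape
        # Project text tokens to the image channel dimension: (B, T, C)
        text_tokens = text_feat @ self.text_proj
        # Flatten spatial positions into a token sequence: (B, H*W, C)
        queries = img_feat.flatten(2).transpose(1, 2)
        # Cross-attention: image positions (queries) attend to text tokens.
        fused, _ = self.attn(queries, text_tokens, text_tokens)
        # Residual fusion, restored to the original feature-map shape.
        return (queries + fused).transpose(1, 2).reshape(b, c, h, w)
```

The output keeps the input feature-map shape, so the module could in principle be dropped between encoder stages of a segmentation network; how the authors actually place it is not specified in the abstract.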

Keywords

Cross-positional attention; Image segmentation; Medical image; Multimodal learning; Text-guided attention
Title
Text-guided cross-position attention for image analysis: Case of medical image
Authors
Lee, Go-Eun; Choi, Sang Tae; Kim, Seon Ho; Chung, Jaewoo; Cho, Jungchan; Choi, Sang-Il
DOI
10.1016/j.compbiomed.2025.110297
Publication date
2025-07
Type
Article
Journal
Computers in Biology and Medicine
Volume
193