- Lee, Seungeun;
- Choi, Sunmook;
- Kang, Taein;
- Chung, Sanghyeok;
- Han, Soyul;
- ... Kwak, Il-Youp;
- and 4 others
WEB OF SCIENCE: 0
SCOPUS: 0
Abstract
Recent advances in deep learning have led to the widespread use of pre-trained large-scale speech models, such as wav2vec 2.0 (w2v2), in voice spoofing detection. However, the interpretability of such models remains a critical challenge due to their complex internal representations. In this paper, we propose iWAX, an interpretable voice spoofing countermeasure that combines a fine-tuned w2v2 front-end, the AASIST back-end, and an XGBoost classifier. iWAX leverages the feature importance mechanism of XGBoost to identify which temporal segments and frequency bands of the audio the w2v2 model prioritizes during spoofing detection. To enable frequency-based interpretability, we apply sinc filters to isolate specific spectral regions of input raw waveforms. Temporal analysis is conducted by selecting key features extracted from w2v2 and analyzing their contribution across time. Experimental results on the ASVspoof 2019 LA dataset demonstrate that iWAX not only outperforms baseline models such as AASIST and w2v2-AASIST but also provides human-understandable explanations of its predictions. Further analysis with LightGBM validates the robustness of our approach across different boosting models. Overall, iWAX offers a compelling balance between interpretability and performance, addressing the limitations of both traditional machine learning and modern deep learning-based countermeasures. © 2025. The Author(s).
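The abstract mentions applying sinc filters to isolate spectral regions of the raw waveform but does not give the filter design. As a rough illustration only, the sketch below assumes a Hamming-windowed sinc band-pass FIR filter built as the difference of two low-pass sinc kernels. The function names, the 1000-2000 Hz band, the 16 kHz sample rate, and the 101-tap length are all hypothetical choices for this example, not details taken from the paper, and plain Python is used to keep the example self-contained.

```python
import math

def sinc_bandpass_kernel(low_hz, high_hz, sample_rate, num_taps=101):
    """Hamming-windowed sinc band-pass kernel (illustrative design, not the paper's)."""
    if not 0 <= low_hz < high_hz <= sample_rate / 2:
        raise ValueError("need 0 <= low_hz < high_hz <= Nyquist")
    if num_taps % 2 == 0:
        raise ValueError("num_taps must be odd so the kernel is symmetric")
    mid = (num_taps - 1) // 2
    f_lo = low_hz / sample_rate    # cutoffs in cycles per sample
    f_hi = high_hz / sample_rate

    def lowpass(fc, n):
        # Ideal low-pass impulse response: sin(2*pi*fc*n) / (pi*n), with limit 2*fc at n = 0.
        if n == 0:
            return 2.0 * fc
        return math.sin(2.0 * math.pi * fc * n) / (math.pi * n)

    kernel = []
    for i in range(num_taps):
        n = i - mid
        hamming = 0.54 - 0.46 * math.cos(2.0 * math.pi * i / (num_taps - 1))
        # Band-pass = low-pass at the upper edge minus low-pass at the lower edge.
        kernel.append((lowpass(f_hi, n) - lowpass(f_lo, n)) * hamming)
    return kernel

def apply_fir(signal, kernel):
    """Same-length convolution, treating samples outside the signal as zero."""
    mid = (len(kernel) - 1) // 2
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(kernel):
            idx = t + mid - k
            if 0 <= idx < len(signal):
                acc += h * signal[idx]
        out.append(acc)
    return out

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

# Hypothetical check: a 1500 Hz tone lies inside the band, a 5000 Hz tone outside it.
SAMPLE_RATE = 16000
kernel = sinc_bandpass_kernel(1000.0, 2000.0, SAMPLE_RATE)
num_samples = 1600
in_band = [math.sin(2 * math.pi * 1500 * t / SAMPLE_RATE) for t in range(num_samples)]
out_band = [math.sin(2 * math.pi * 5000 * t / SAMPLE_RATE) for t in range(num_samples)]

edge = (len(kernel) - 1) // 2  # skip samples affected by zero padding at the edges
passed_rms = rms(apply_fir(in_band, kernel)[edge:-edge])
blocked_rms = rms(apply_fir(out_band, kernel)[edge:-edge])
print("in-band RMS:", passed_rms, "out-of-band RMS:", blocked_rms)
```

Running one such filter per band would yield band-limited copies of a waveform that could then be fed to a detector separately; how iWAX actually selects and combines bands is described in the paper itself, not here.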
Keywords
- Title: iWAX: interpretable Wav2vec-AASIST-XGBoost framework for voice spoofing detection
- Authors: Lee, Seungeun; Choi, Sunmook; Kang, Taein; Chung, Sanghyeok; Han, Soyul; Seo, Jaejin; Park, Seoyoung; Kim, Eujin; Oh, Seungsang; Kwak, Il-Youp
- Publication date: 2025-11
- Type: Article
- Volume: 15