Fractal-Guided Token Pruning for Efficient Vision Transformers
- Kim, Seong Rok;
- Lee, Minhyeok
Web of Science citations: 0
Scopus citations: 0
Abstract
Vision Transformers achieve strong performance across computer vision tasks but suffer from computational complexity that is quadratic in the token count, which limits deployment in resource-constrained environments. Existing token pruning methods rely on attention scores to identify important tokens, but attention mechanisms capture query-specific relevance rather than intrinsic information content, so they can discard tokens that carry information for subsequent layers or for different downstream tasks. We propose fractal-guided token pruning, a method that uses the correlation dimension D_corr of token embeddings as a task-agnostic measure of geometric complexity. Our key insight is that tokens with high D_corr span higher-dimensional manifolds in representation space, indicating complex patterns, while tokens with low D_corr collapse to simpler structures that represent redundant information. By computing a local D_corr for each token and pruning those with the lowest values, our method retains geometrically complex tokens independently of attention-based relevance. The correlation dimension quantifies how token embeddings fill the representation space: embeddings from uniform background regions cluster tightly in low-dimensional subspaces (low D_corr), while embeddings from complex textures or object boundaries spread across higher-dimensional manifolds (high D_corr), reflecting their richer information content. Experiments on CIFAR-10 and CIFAR-100 with fine-tuned ViT-B/16 models show that fractal-guided pruning consistently outperforms random and norm-based pruning across all tested ratios. At a 40% pruning ratio, fractal pruning maintains 92.26% accuracy on CIFAR-10, a drop of only 0.99 percentage points from the 93.25% baseline, while achieving a 1.17x speedup. Our approach provides a geometry-based criterion for token importance that complements attention-based methods and shows promising generalization between the CIFAR-10 and CIFAR-100 datasets.
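The abstract does not specify how the local correlation dimension is estimated, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a Grassberger-Procaccia style estimate, in which the correlation sum C(r) (the fraction of neighbours within distance r) scales roughly as r^D, and it restricts the fit to each token's k nearest neighbours to obtain a per-token value. The function names, the neighbourhood size `k`, and the choice to score patch tokens only (any class token is assumed to be handled separately) are all assumptions made for illustration.

```python
import numpy as np


def local_correlation_dimension(tokens, k=16, eps=1e-12):
    """Per-token slope of log C(r) versus log r over the k nearest neighbours.

    tokens: (N, D) array of token embeddings, N >= 3.
    Returns an (N,) array; larger values suggest a locally higher-dimensional
    neighbourhood. This estimator is an illustrative assumption.
    """
    n = tokens.shape[0]
    k = min(k, n - 1)  # cannot have more neighbours than other tokens

    # Pairwise Euclidean distances, shape (N, N).
    sq = np.sum(tokens ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (tokens @ tokens.T)
    dist = np.sqrt(np.maximum(d2, 0.0))

    # Sorted distances to the k nearest neighbours (column 0 is the token itself).
    knn = np.sort(dist, axis=1)[:, 1:k + 1]

    # At radius r equal to the m-th neighbour distance, the local correlation
    # sum is C(r) = m / k, so we regress log(m / k) on log(r).
    log_r = np.log(np.maximum(knn, eps))        # (N, k)
    log_c = np.log(np.arange(1, k + 1) / k)     # (k,)

    x = log_r - log_r.mean(axis=1, keepdims=True)
    y = log_c - log_c.mean()
    var = np.sum(x * x, axis=1)

    # Least-squares slope; neighbourhoods with no spread in radius score 0.
    return np.where(var > eps, (x @ y) / np.maximum(var, eps), 0.0)


def prune_tokens(tokens, prune_ratio=0.4, k=16):
    """Drop the prune_ratio fraction of tokens with the lowest scores.

    Returns the kept tokens (in their original order) and their indices.
    """
    n = tokens.shape[0]
    n_keep = max(1, n - int(round(prune_ratio * n)))
    scores = local_correlation_dimension(tokens, k=k)
    keep = np.sort(np.argsort(scores)[-n_keep:])
    return tokens[keep], keep


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patch_tokens = rng.normal(size=(196, 768))  # hypothetical ViT-B/16 patch tokens
    kept, idx = prune_tokens(patch_tokens, prune_ratio=0.4)
    print(kept.shape)
```

The pairwise distance matrix costs O(N^2 D) per image, which is comparable in order to a single attention layer; whether the reported speedup accounts for a cost of this kind depends on details not given in the abstract.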
- Title: Fractal-Guided Token Pruning for Efficient Vision Transformers
- Authors: Kim, Seong Rok; Lee, Minhyeok
- Publication date: 2025-12
- Type: Article
- Journal: Fractal and Fractional
- Volume: 9
- Issue: 12