FracGrad: A Discretized Riemann-Liouville Fractional Integral Approach to Gradient Accumulation for Deep Learning
Web of Science: 0
Scopus: 0
Abstract
Gradient accumulation enables training large-scale deep learning models under GPU memory constraints by aggregating gradients across multiple microbatches before each parameter update. Standard gradient accumulation weights all microbatches equally through simple averaging, implicitly assuming that all stochastic gradient estimates are equally reliable. This assumption becomes problematic in non-convex optimization, where gradient variance across microbatches is high and some gradient estimates are noisy and less representative of the true descent direction. This paper proposes FracGrad, a simple weighting scheme for gradient accumulation that biases toward recent microbatches via a power-law schedule derived from a discretized Riemann-Liouville integral. Unlike uniform summation, FracGrad reweights the gradient of microbatch $i$ by $w_i(\alpha) = \dfrac{(N-i+1)^{\alpha} - (N-i)^{\alpha}}{\sum_{j=1}^{N}\left[(N-j+1)^{\alpha} - (N-j)^{\alpha}\right]}$, controlled by $\alpha \in (0, 1]$. When $\alpha = 1$, standard accumulation is recovered. In experiments on mini-ImageNet with ResNet-18 using up to $N = 32$ accumulation steps, the best FracGrad variant, with $\alpha = 0.1$, improves test accuracy from 16.99% to 31.35% at $N = 16$. Paired t-tests yield $p \approx 2 \times 10^{-6}$.
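The weighting formula in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the function names `fracgrad_weights` and `accumulate` are hypothetical, gradients are represented as plain lists of floats, and the sketch assumes the weights are normalized to sum to 1 and applied as a weighted sum over the microbatch gradients. How the paper scales the accumulated gradient beyond this is not stated in the abstract.

```python
def fracgrad_weights(n_steps, alpha):
    """Return [w_1, ..., w_N] following the abstract's power-law formula.

    Hypothetical helper: n_steps is N, alpha is the exponent in (0, 1].
    """
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    # Unnormalized weight of microbatch i (1-indexed, i = N is the most recent).
    raw = [(n_steps - i + 1) ** alpha - (n_steps - i) ** alpha
           for i in range(1, n_steps + 1)]
    # The denominator is a telescoping sum, equal to n_steps ** alpha
    # up to floating-point rounding.
    total = sum(raw)
    return [r / total for r in raw]


def accumulate(grads, alpha):
    """Weighted sum of per-microbatch gradients (assumed: lists of floats)."""
    weights = fracgrad_weights(len(grads), alpha)
    dim = len(grads[0])
    return [sum(w * g[k] for w, g in zip(weights, grads)) for k in range(dim)]
```

With `alpha = 1.0` every raw term equals 1, so the weights reduce to the uniform `1 / N` of standard accumulation. For `alpha < 1` the weights increase with `i`, so later microbatches contribute more, which matches the bias toward recent microbatches described above.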
- Title: FracGrad: A Discretized Riemann-Liouville Fractional Integral Approach to Gradient Accumulation for Deep Learning
- Author: Lee, Minhyeok
- Publication date: 2025-11
- Type: Article
- Journal: FRACTAL AND FRACTIONAL
- Volume: 9
- Issue: 11