A Survey on LLM Edge-Intelligence: Recent Advances and Open Challenges

Abstract

Large language models (LLMs) are increasingly deployed on edge devices to reduce latency, bandwidth usage, and privacy risks. This survey reviews recent techniques that enable efficient edge-LLM inference and fine-tuning in three areas: (1) memory-efficient model architectures, (2) edge-aware inference orchestration, and (3) privacy-preserving fine-tuning. Representative works are analyzed for their algorithmic contributions and empirical gains in memory, latency, and accuracy. We identify three core challenges: (1) generalizing compression policies across diverse LLM families; (2) designing lightweight, online orchestration that jointly optimizes latency, bandwidth, memory, and energy under dynamic conditions; and (3) ensuring privacy-preserving, adaptive fine-tuning without catastrophic forgetting. The paper concludes with a roadmap for unified, end-to-end frameworks that balance resource constraints, performance, and security in practical edge-LLM deployments.

Keywords

Distributed Computing; Edge Computing; Large Language Model
Title
A Survey on LLM Edge-Intelligence: Recent Advances and Open Challenges
Authors
Nam, Sanghyuck; Kim, Kyeongyeon; Park, Sangoh
DOI
10.1109/ICOIN68469.2026.11480489
Publication Date
2026
Type
Conference Paper
Journal
International Conference on Information Networking
Pages
996-998