Automated Logging With Large Language Models: Eliminating Task-Specific Pretraining

Wang, Dae-Sung; Kim, Tae-Il; Lee, Chan-Gun; Hong, Hyun-Taek

doi:10.1109/ACCESS.2025.3645237

Citations (Web of Science): 0

Citations (Scopus): 0

Abstract

Logging is a fundamental practice in software engineering, supporting monitoring, debugging, and system comprehension. However, determining where to insert log statements, what information to record, and how to phrase messages remains labor-intensive and error-prone. Recent research has explored automated logging, but most approaches rely on domain- or language-specific pretraining, which limits their applicability across diverse projects and environments. This paper investigates whether large language models (LLMs) trained on general code and natural-language corpora can eliminate the need for task-specific pretraining in automated logging. In our setting, models receive no Java- or task-specific pretraining and are trained only through lightweight supervised fine-tuning on the logging dataset. We compare LEONID, a state-of-the-art pretraining-centric logging system, with off-the-shelf models, including T5, CodeT5, and Qwen. The evaluation covers single-log injection (levels, variables, positions, and messages), log-message quality assessed with reference-based metrics, and multilog injection, which requires multiple statements in a method. The results indicate that general-purpose LLMs rival or surpass LEONID in several aspects. Notably, Qwen-7B improves insertion position accuracy by 14 percentage points, addressing one of the most critical factors for the practical deployment of automated logging. Moreover, CodeT5 and Qwen achieve competitive or superior performance in message generation, and Qwen-7B demonstrates stronger structural reasoning for insertion positions and multilog scenarios. Although LEONID remains more stable in level selection and variable binding, the findings indicate that task-specific pretraining is not universally required. This work provides the first head-to-head evaluation of a pretraining-based approach versus off-the-shelf LLM approaches for automated logging, offering practical guidance on when to apply custom pretraining and when general-purpose models are sufficient.

Keywords

Automated logging; code intelligence; large language model; software engineering automation

Publication date: 2025

Type: Article

Journal: IEEE Access

Volume: 13

Pages: 213260–213272

ScholarWorks@중앙대학교
