Integrating fine-tuned large language models into traditional back-end web architectures

Bowen Li, Chuan Zhang

Article ID: 3168
Vol 3, Issue 1, 2025
DOI: https://doi.org/10.54517/cte3168
Received: 18 December 2024; Accepted: 11 March 2025; Available online: 19 March 2025; Issue release: 31 March 2025

Abstract

Integrating Large Language Models (LLMs) into traditional back-end systems can significantly reduce development overhead and enhance flexibility. This paper presents a novel approach using a fine-tuned LLama3 model as a modular back-end component capable of processing JSON-formatted inputs and outputs. We developed a specialized dataset through advanced prompt engineering with the Phi-3.5 model and fine-tuned LLama3 using Quantized Low-Rank Adaptation (QLoRA) on a single NVIDIA T4 GPU. An API layer was designed to facilitate seamless communication between clients and the LLM, effectively replacing conventional application logic. Our fine-tuned model achieved an average accuracy of 76.5% and a response time of 3.56 s across 100 test cases, demonstrating its effectiveness in handling back-end tasks. This work underscores the potential of LLMs to transform AI-driven back-end architectures, offering scalable and efficient solutions for modern web services.
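The API layer the abstract describes — accepting a JSON request, prompting the model, and returning JSON in place of conventional application logic — can be sketched as below. This is a minimal illustration, not the paper's implementation: the `llm_generate` stub stands in for an inference call to the fine-tuned LLama3 checkpoint, and the prompt format and field names are assumptions for the example.

```python
import json

# Hypothetical stand-in for the fine-tuned LLama3 model. In the paper's setup
# this would be an inference call to the QLoRA-adapted checkpoint; here it is
# an echo-style stub so the sketch runs without a GPU.
def llm_generate(prompt: str) -> str:
    payload = json.loads(prompt.split("INPUT:", 1)[1])
    return json.dumps({"status": "ok", "echo": payload})

def handle_request(raw_body: str) -> dict:
    """API layer: validate JSON in, prompt the model, validate JSON out."""
    data = json.loads(raw_body)                 # JSON-formatted input
    prompt = "Respond with JSON only.\nINPUT: " + json.dumps(data)
    reply = llm_generate(prompt)                # model call replaces app logic
    try:
        return json.loads(reply)                # JSON-formatted output
    except json.JSONDecodeError:
        # Guard against the model emitting non-JSON text.
        return {"status": "error", "detail": "model returned non-JSON"}

print(handle_request('{"task": "create_user", "name": "Ada"}'))
```

The key design point is that both boundaries of the model call are validated JSON, so the LLM can be dropped into an existing client/server contract like any other back-end module.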


Keywords

LLM; AI; back-end; fine-tuning; web service; LLama3



Copyright (c) 2025 Author(s)

License URL: https://creativecommons.org/licenses/by/4.0/