Integrating fine-tuned large language models into traditional back-end web architectures

Bowen Li, Chuan Zhang

Article ID: 3168
Vol 3, Issue 1, 2025
DOI: https://doi.org/10.54517/cte3168
Received: 18 December 2024; Accepted: 11 March 2025; Available online: 19 March 2025; Issue release: 31 March 2025



Abstract

Integrating Large Language Models (LLMs) into traditional back-end systems can significantly reduce development overhead and enhance flexibility. This paper presents a novel approach using a fine-tuned LLama3 model as a modular back-end component capable of processing JSON-formatted inputs and outputs. We developed a specialized dataset through advanced prompt engineering with the Phi-3.5 model and fine-tuned LLama3 using Quantized Low-Rank Adaptation (QLoRA) on a single NVIDIA T4 GPU. An API layer was designed to facilitate seamless communication between clients and the LLM, effectively replacing conventional application logic. Our fine-tuned model achieved an average accuracy of 76.5% and a response time of 3.56 s across 100 test cases, demonstrating its effectiveness in handling back-end tasks. This work underscores the potential of LLMs to transform AI-driven back-end architectures, offering scalable and efficient solutions for modern web services.
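The core flow the abstract describes — an API layer that forwards a JSON request to the fine-tuned model and returns the model's JSON answer in place of conventional application logic — can be sketched as below. This is a minimal illustration, not the authors' implementation: the prompt template, the `generate` callable, and the stubbed model reply are all assumptions standing in for the fine-tuned LLama3 endpoint.

```python
import json

def build_prompt(payload: dict) -> str:
    # Wrap the client's JSON request in an instruction so the
    # fine-tuned model answers with JSON only (template is hypothetical).
    return ("Respond with a single JSON object.\n"
            f"Request: {json.dumps(payload)}")

def parse_response(text: str) -> dict:
    # The model may emit extra tokens around the payload, so
    # extract the first {...} span before parsing.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object in model output")
    return json.loads(text[start:end + 1])

def handle_request(payload: dict, generate) -> dict:
    # `generate` is the model's text-generation callable; in a real
    # deployment it would invoke the fine-tuned LLama3 behind the API.
    return parse_response(generate(build_prompt(payload)))

# Stand-in for the fine-tuned model, so the flow runs without a GPU.
def fake_generate(prompt: str) -> str:
    return 'Sure: {"result": 42}'

print(handle_request({"op": "add", "a": 40, "b": 2}, fake_generate))
```

In this design the API layer stays model-agnostic: swapping the stub for a call to the hosted fine-tuned model changes only the `generate` argument, not the request/response handling.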


Keywords

LLM; AI; back-end; fine-tuning; web service; LLama3





Copyright (c) 2025 Author(s)

License URL: https://creativecommons.org/licenses/by/4.0/