Beyond LoRA: Can you beat the most popular fine-tuning technique?

Overview

Published June 18, 2026, by Benjamin Bossan, Sayak Paul, Marian, and Kashif Rasul.

When you plan to fine-tune a model in a parameter-efficient way, think beyond LoRA.

If you want to fine-tune an open model on your own data, you are probably interested in so-called parameter-efficient fine-tuning, or PEFT for short.

Key Takeaways

PEFT describes techniques that significantly reduce the memory required to fine-tune a model.
Although there are dozens of these techniques, almost everyone chooses one called "LoRA".
Quantization reduces a model's memory footprint, but quantized models can't be fine-tuned directly.
So a set of techniques emerged to cut the memory needed for fine-tuning, called "parameter-efficient fine-tuning", or PEFT.

LoRA: The queen of fine-tuning techniques 👑

One parameter-efficient fine-tuning technique that emerged early and proved quite effective is "Low-Rank Adaptation", or "LoRA" for short.
It works by adding a handful of parameters on top of the base model, freezing the base model weights, and only training those few parameters.
…0% of PEFT checkpoints are LoRAs.
Searching for the code snippet on GitHub (example GH query), 71…
This all leads to the question: Are we all leaving performance on the table by shunning better techniques?
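
The LoRA recipe described above (add a few parameters on top of the base model, freeze the base weights, train only the additions) can be illustrated with a small sketch. This is not the PEFT library's implementation. It assumes the standard LoRA formulation, in which a frozen weight matrix W of shape (d_out, d_in) is complemented by two small trainable matrices, B (d_out x rank) and A (rank x d_in). The layer size of 4096 and the rank of 8 are hypothetical values chosen for illustration, as are all function names.

```python
import random

def lora_param_counts(d_in, d_out, rank):
    """Return (frozen, trainable) parameter counts for one linear layer with a LoRA adapter."""
    frozen = d_out * d_in              # base weight W, never updated
    trainable = rank * (d_in + d_out)  # A is (rank x d_in), B is (d_out x rank)
    return frozen, trainable

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def lora_forward(W, A, B, x, scale=1.0):
    """Compute y = W x + scale * B (A x); only A and B would receive gradients."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]

# Parameter count for a hypothetical 4096 x 4096 layer with rank 8.
frozen, trainable = lora_param_counts(d_in=4096, d_out=4096, rank=8)
print(frozen, trainable, f"{trainable / frozen:.2%}")  # 16777216 65536 0.39%

# Tiny forward pass: B starts at zero, so the adapter initially changes nothing.
random.seed(0)
d_in, d_out, rank = 3, 2, 1
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
A = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(rank)]
B = [[0.0 for _ in range(rank)] for _ in range(d_out)]
x = [1.0, 2.0, 3.0]
y = lora_forward(W, A, B, x)
```

In this hypothetical layer, the adapter adds well under 1% of the base layer's parameters, which is where the memory savings during training come from.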

Stats & Key Facts

Here are a few estimates: of a sample of 20,834 model cards on the Hugging Face Hub that mention exactly one PEFT technique, 20,509 mention LoRA (about 98.4%).

PEFT describes techniques that significantly reduce the memory required to fine-tune a model. Although there are dozens of these techniques, almost everyone chooses one called "LoRA". In this blog post, we explore whether LoRA is really the best choice, what tools are available to make an informed decision, and how you can benefit from broadening your horizons beyond LoRA.

What is PEFT and when do you need it

There are countless open models available, but they often aren't quite good enough for your use case. Prompting may help, but it usually isn't enough. Rather than training a new model from scratch, you should consider fine-tuning an existing one.

Fine-tuning, however, is memory-hungry: you generally need enough memory to fit the whole model several times over. Quantization reduces a model's memory footprint, but quantized models can't be fine-tuned directly. So a set of techniques emerged to cut the memory needed for fine-tuning, called "parameter-efficient fine-tuning", or PEFT.
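
To make "several times over" concrete, here is a rough back-of-the-envelope estimate. It is a sketch under stated assumptions, not a measurement: it assumes full fine-tuning in 32-bit precision with an Adam-style optimizer that keeps two extra values per parameter, and it ignores activations and framework overhead. The 7-billion-parameter model size and the function name are hypothetical.

```python
BYTES_FP32 = 4  # assumption: weights, gradients, and optimizer state all stored in 32-bit floats

def full_finetune_bytes(num_params, bytes_per_value=BYTES_FP32):
    """Estimate memory for full fine-tuning, ignoring activations and overhead."""
    weights = num_params * bytes_per_value
    gradients = num_params * bytes_per_value            # one gradient per parameter
    optimizer_state = 2 * num_params * bytes_per_value  # Adam-style: two running averages per parameter
    return {
        "weights": weights,
        "gradients": gradients,
        "optimizer_state": optimizer_state,
        "total": weights + gradients + optimizer_state,
    }

# Hypothetical 7-billion-parameter model.
estimate = full_finetune_bytes(7_000_000_000)
for name, num_bytes in estimate.items():
    print(f"{name:>16}: {num_bytes / 1e9:.0f} GB")
```

Under these assumptions, the total comes to four times the memory of the weights alone, before counting activations.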

For more details please read the original article at Hugging Face.

Originally published by Hugging Face
