Fine-Tuning is a specialized form of Transfer Learning in which a model that has already been trained on a broad dataset (see Pre-training) is further trained (or “refined”) on a smaller, domain-specific dataset to adapt it to a particular task.
Why it Matters
- Efficiency: Training a model from scratch is prohibitively expensive. Fine-tuning builds upon existing “knowledge,” requiring significantly less data and compute.
- Performance: A fine-tuned model typically outperforms generic “zero-shot” models on specialized tasks (e.g., medical diagnosis, code generation).
Methods
1. Full Fine-Tuning
This involves updating all the parameters (weights) of the pre-trained model during training (see the sketch after the pros and cons below).
- Pros: Maximum adaptability.
- Cons: Extremely computationally expensive; requires storing a full copy of the model for every new task.
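
Here is a minimal full fine-tuning sketch in PyTorch using Hugging Face transformers. The model name, toy dataset, and hyperparameters are illustrative assumptions, not a recommended recipe; the point is that the optimizer receives every parameter, so all weights are updated.

```python
import torch
from torch.optim import AdamW
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Model choice and label count are placeholders for illustration.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Toy task-specific dataset of (text, label) pairs.
train_data = [("great product", 1), ("terrible service", 0)]

# Full fine-tuning: the optimizer sees *all* model parameters.
optimizer = AdamW(model.parameters(), lr=2e-5)

model.train()
for epoch in range(3):
    for text, label in train_data:
        batch = tokenizer(text, return_tensors="pt")
        loss = model(**batch, labels=torch.tensor([label])).loss
        loss.backward()        # gradients flow to every weight
        optimizer.step()
        optimizer.zero_grad()
```

Because every weight changes, the result is an entirely new model checkpoint, which is why each task requires its own full copy of the model.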
2. Parameter-Efficient Fine-Tuning (PEFT)
Techniques that update only a small subset of the model’s parameters, or add small trainable layers (adapters), while keeping the vast majority of the pre-trained weights frozen (see the LoRA sketch after this list).
- Examples: Low-Rank Adaptation (LoRA), Adapters, Prompt Tuning.
- Pros: Drastically reduces memory usage; small “adapters” can be swapped in and out for different tasks on the same base model.
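
To make LoRA concrete, below is a hand-rolled sketch of the core idea: the pre-trained weight matrix W is frozen, and only a low-rank update B·A (scaled by alpha/r) is trained. The class name, layer sizes, and hyperparameters are illustrative; in practice a library such as Hugging Face’s peft typically handles this wrapping.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a pre-trained nn.Linear: frozen base weight W plus a
    trainable low-rank correction, so the effective weight is
    W + (alpha / r) * B @ A."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze pre-trained weight
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        # Low-rank factors: B @ A has shape (out_features, in_features).
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # starts at zero
        self.scale = alpha / r

    def forward(self, x):
        # Frozen path plus trainable low-rank correction.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768))  # layer size is illustrative
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable: {trainable} / {total}")  # only A and B are updated
```

Only A and B (roughly 12,000 parameters here) require gradients, versus the hundreds of thousands in the frozen base layer; this gap is what makes per-task adapter files so small and easy to swap.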
