The research team at Predibase analyzed 700+ fine-tuning experiments spanning 13 of the most popular open-source models and 31 distinct datasets and tasks, with GPT-3.5, GPT-4, and GPT-4o as comparison baselines. We limited the open-source models to a maximum of 7B parameters so that any organization can train them on low-end GPUs. For evaluation, we used task-specific metrics, including accuracy, ROUGE, and HumanEval, to assess performance.
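To make the setup concrete, here is a minimal sketch of what a LoRA fine-tuning run on a sub-7B open-source model might look like using Hugging Face PEFT. The model name, dataset path, and hyperparameters below are illustrative assumptions, not the exact configuration used in these experiments.

```python
# Minimal LoRA fine-tuning sketch (assumed setup, not the exact experiment config).
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "mistralai/Mistral-7B-v0.1"  # any <=7B open-source model
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)

# LoRA trains small low-rank adapter matrices instead of the full weights,
# which is what keeps training cheap enough for low-end GPUs.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total parameters

# Tokenize a task-specific dataset (placeholder file path).
dataset = load_dataset("json", data_files="task_train.jsonl")["train"]
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="lora-out",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=3,
        learning_rate=2e-4,
        fp16=True,
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("lora-adapter")  # only the small adapter weights are saved
```

Because only the adapter weights are updated and saved, the trainable parameter count (and therefore GPU memory and cost) stays a small fraction of the base model's size.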
Key Takeaways
- LoRA fine-tuned models outperform GPT-4 on specialized tasks
- LoRA fine-tuned models are fast and cheap to train and serve, costing less than $8 each on average to train
- Specialized tasks are ideal for LoRA fine-tuning