There's a resurgence of interest in fine tuning LLMs
I've yet to see a successful public use case where fine tuning > prompting.
But here's where I see fine tuning *mattering*:
First, fine tuning is for teaching an LLM specific tasks or behaviors,
not for teaching it new knowledge. For new knowledge, use retrieval (store your data in an outside database and strategically pull the right chunks in to give the LLM context for your question).
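Here's a rough sketch of that retrieval idea, just to make it concrete. Big assumptions: plain word overlap stands in for the embedding search a real setup would likely use, an in-memory list stands in for the outside database, and the function names and sample chunks are made up for illustration.

```python
# Toy retrieval sketch: word overlap stands in for a real embedding search,
# and a Python list stands in for the outside database.
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks that share the most words with the question."""
    q = tokenize(question)
    ranked = sorted(chunks, key=lambda c: len(q & tokenize(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, chunks: list[str], k: int = 2) -> str:
    """Place the retrieved chunks ahead of the question as context."""
    context = "\n".join(f"- {c}" for c in retrieve(question, chunks, k))
    return f"Use this context to answer.\n{context}\n\nQuestion: {question}"


# Hypothetical data, for illustration only.
CHUNKS = [
    "Refunds are issued within 14 days of a return request.",
    "Our office is closed on public holidays.",
    "Shipping to Canada takes 5 to 7 business days.",
]

print(build_prompt("How long do refunds take?", CHUNKS, k=1))
```

The model itself never changes here. The knowledge lives in the chunks, and only the prompt is different from question to question.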
But even in teaching LLMs specific tasks or behaviors - here's the catch...
LLMs are remarkably good at picking up tasks and behaviors from just a good prompt
THIS is what makes LLMs mind blowing after all
So that raises the question:
Where is fine tuning actually helpful?
Some use cases I could see developing are teaching LLMs tasks that are exceptionally difficult to describe or to fit into the ~10 examples you can add to a prompt.
One way to think about this: if it would take someone a few weeks doing a task to 'master it' instead of being able to read training materials and get the picture...
That *may* be a use case for fine tuning
But proceed with caution
To truly teach an LLM a new behavior or task, you'll need to treat this like a machine learning project, not just throw examples in and expect magic in return (it still blows my mind that ChatGPT does this so well for us).
Things like:
- Dataset design
- Training and test data
- Overfitting
And more, as the tooling around fine tuning gets more sophisticated
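To make the list above a bit more concrete, here's a minimal sketch of holding out test data and checking for overfitting. It assumes your dataset is a list of (prompt, expected answer) pairs and that `predict` is whatever function calls your model. Both are hypothetical stand-ins, not any particular fine tuning API.

```python
import random


def split_examples(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and hold out a test set."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def accuracy(predict, examples):
    """Fraction of (prompt, expected) pairs that predict gets right."""
    correct = sum(1 for prompt, expected in examples if predict(prompt) == expected)
    return correct / len(examples)


def overfitting_gap(predict, train, test):
    """Train accuracy minus test accuracy; a large gap suggests memorization."""
    return accuracy(predict, train) - accuracy(predict, test)


# Hypothetical dataset and a deliberately bad "model" that only memorizes.
examples = [(f"ticket {i}", "billing" if i % 2 else "support") for i in range(10)]
train, test = split_examples(examples)
memorized = dict(train)


def memorizer(prompt):
    """Answer from memory if the prompt was seen in training, else give up."""
    return memorized.get(prompt, "unknown")


print(overfitting_gap(memorizer, train, test))  # perfect on train, useless on test
```

The memorizer looks perfect if you only score it on the data it trained on, which is exactly why the held-out test set matters.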
The other obvious use case is cost.
If you can get a super small language model to do a task instead of GPT-4, there are meaningful cost savings.
And if you're using a language model for large scale tasks like triaging your customer support inbox or analyzing public data for insights,
the costs can add up.
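A back-of-the-envelope way to see how that adds up. Every number below is a placeholder I made up for illustration, not real pricing for GPT-4 or any other model, and the sketch ignores what it costs to train and host the smaller model.

```python
def monthly_cost(requests_per_day, tokens_per_request, price_per_1k_tokens, days=30):
    """Estimated monthly spend in dollars for a model priced per 1K tokens."""
    total_tokens = requests_per_day * tokens_per_request * days
    return total_tokens / 1000 * price_per_1k_tokens


# Placeholder numbers, NOT real pricing for any provider or model.
LARGE_PRICE = 0.03   # hypothetical $ per 1K tokens, large general model
SMALL_PRICE = 0.002  # hypothetical $ per 1K tokens, small fine tuned model

large = monthly_cost(10_000, 500, LARGE_PRICE)
small = monthly_cost(10_000, 500, SMALL_PRICE)
print(f"large: ${large:,.2f}  small: ${small:,.2f}  saved: ${large - small:,.2f}")
```

Swap in your own volumes and the current prices for the models you're comparing. The point is only that the gap scales with volume.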
But if you're wondering where the heck to invest in fine tuning...
My answer at the moment for most businesses is still:
Make sure you can't do it with prompts.