- Adding examples to the prompt does help, but only up to a certain point.
- 10 examples seems to be good enough: We noticed that beyond 10 examples, there is generally no significant improvement in performance.
- Fine-tuning with very few examples gives worse performance than few-shot: Fine-tuning can hurt performance when done with very few examples (10 in our case).
by Eden AI
In this post, we investigate to what extent AI models are capable of learning from examples provided within the prompt itself. We attempt to figure out the best way to leverage what is called “few-shot learning” in practical scenarios, and when to prefer fine-tuning over it.
What is few-shot learning?
Few-shot learning is about providing the AI with some example pairs of inputs and associated outputs as part of the prompt. This helps the AI to understand what kind of output format we are looking for, and to better understand the task.
In this post, we look at how few-shot learning can be used for classification. The examples in the prompt look like this:
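As a rough sketch, a few-shot classification prompt could be assembled as shown here. The example plots, the genre labels, the instruction wording, and the `build_prompt` helper are all hypothetical and are not the ones used in the experiment.

```python
# Hypothetical labelled examples; not taken from the dataset used in this post.
EXAMPLES = [
    ("A detective hunts a serial killer through a rain-soaked city.", "thriller"),
    ("Two rival chefs fall in love during a cooking contest.", "romance"),
]


def build_prompt(examples, query):
    """Format (plot, genre) pairs, then append the unlabelled query."""
    blocks = ["Classify the genre of each movie plot."]
    for plot, genre in examples:
        blocks.append(f"Plot: {plot}\nGenre: {genre}")
    # The last block leaves the label empty for the model to complete.
    blocks.append(f"Plot: {query}\nGenre:")
    return "\n\n".join(blocks)


prompt = build_prompt(
    EXAMPLES, "A crew of astronauts discovers a signal from a distant moon."
)
print(prompt)
```

Passing an empty list of examples yields the zero-shot version of the same prompt.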
What is the ideal number of examples?
We investigate the ideal number of examples for our task (multi-class text classification). We use a movie plot dataset, and try to classify the movie genre given the plot.
In the figure below, we plot classification accuracy as a function of the number of examples for 3 different models (GPT-3.5 Turbo, Cohere Command, and Anthropic Claude 2). 0 examples corresponds to zero-shot learning (no examples in the prompt).
Looking at these curves, we can note that:
- Adding examples to the prompt does help, but only up to a certain point. Adding more and more examples doesn’t continuously improve performance, and could even decrease it at some point.
- 10 examples seems to be good enough: We noticed that beyond 10 examples, there is generally no significant improvement in performance. This is consistent with what has been observed in the literature for few-shot learning tasks.
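A sweep over the number of examples can be sketched as follows. This is a minimal illustration, not the evaluation code behind the figure: `stub_classify` is a hypothetical stand-in for a real model call, and the tiny `pool` and `test_set` are made up.

```python
from collections import Counter


def stub_classify(shots, text):
    """Hypothetical stand-in for a model call: predicts the most common shot label."""
    if not shots:
        return "unknown"
    return Counter(label for _, label in shots).most_common(1)[0][0]


def accuracy_by_shots(classify, pool, test_set, shot_counts):
    """Return {k: accuracy} when the first k items of pool are used as examples."""
    results = {}
    for k in shot_counts:
        shots = pool[:k]
        correct = sum(classify(shots, text) == label for text, label in test_set)
        results[k] = correct / len(test_set)
    return results


# Made-up data, only to show the shape of the inputs.
pool = [("plot 1", "drama"), ("plot 2", "comedy"), ("plot 3", "drama")]
test_set = [("plot A", "drama"), ("plot B", "comedy")]
scores = accuracy_by_shots(stub_classify, pool, test_set, [0, 1, 3])
print(scores)
```

In practice `classify` would build a prompt from `shots` and send it to the model under test, and the resulting dictionary is what gets plotted against the number of examples.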
How should examples be chosen?
Choosing the examples wisely can make a big difference. Here we investigate how the choice of examples impacts performance.
Random vs Diverse few-shot
Providing diverse examples can help the model better understand the range of possible inputs. We compare random and diverse examples below.
- Diversity helps on average but may not help for every model
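The two selection strategies can be sketched as below. The post does not spell out how diversity was measured, so this sketch assumes that "diverse" means covering the genres as evenly as possible; `random_shots` and `diverse_shots` are hypothetical helper names.

```python
import random
from collections import defaultdict


def random_shots(pool, k, seed=0):
    """Pick k examples uniformly at random, ignoring their labels."""
    return random.Random(seed).sample(pool, k)


def diverse_shots(pool, k, seed=0):
    """Pick up to k examples round-robin across labels (assumed notion of diversity)."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in pool:
        by_label[label].append((text, label))
    for items in by_label.values():
        rng.shuffle(items)
    labels = sorted(by_label)
    picked = []
    # Take one example per label in turn until k are picked or the pool is empty.
    while len(picked) < k and any(by_label[label] for label in labels):
        for label in labels:
            if by_label[label] and len(picked) < k:
                picked.append(by_label[label].pop())
    return picked


# Made-up pool: 3 genres with 4 plots each.
pool = [(f"{g} plot {i}", g) for g in ("comedy", "drama", "horror") for i in range(4)]
print(diverse_shots(pool, 3))
```

Other definitions of diversity, such as spreading examples apart in an embedding space, would fit the same interface.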
Fixed vs Dynamic (or Adaptive) few-shot
Dynamic few-shot means that the examples are chosen based on the input. For example, we can use a retrieval model to find the most similar examples to the input. We compare fixed and dynamic examples below.
- Dynamic is better than fixed examples for all models
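Dynamic selection can be sketched as a nearest-neighbour lookup over the labelled pool. The post mentions a retrieval model without naming one, so this sketch swaps in a simple bag-of-words cosine similarity as a stand-in; `dynamic_shots` and the sample pool are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lower-case word counts; a crude substitute for a learned embedding."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def dynamic_shots(pool, query, k):
    """Return the k pool examples most similar to the query text."""
    q = bag_of_words(query)
    ranked = sorted(pool, key=lambda ex: cosine(bag_of_words(ex[0]), q), reverse=True)
    return ranked[:k]


# Made-up labelled pool.
POOL = [
    ("A spaceship crew fights aliens on a distant planet.", "sci-fi"),
    ("A couple falls in love in Paris.", "romance"),
    ("A detective investigates a murder in London.", "thriller"),
]
print(dynamic_shots(POOL, "Aliens attack a spaceship near a distant planet.", 1))
```

Because the examples change with every input, the prompt has to be rebuilt per request, whereas fixed few-shot reuses one prompt prefix for all inputs.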
When should we prefer fine-tuning over few-shot learning?
Fine-tuning is generally much more powerful than few-shot learning when done correctly. However, it takes more time and effort to set up. There are also some cases where fine-tuning gives worse results than few-shot learning.
In our experiment, we fine-tune the Davinci model (GPT-3) with 3 different numbers of training examples: 2 examples per class (10 total), 50 examples per class (250 total), and 500 examples per class (2500 total). We compare the fine-tuned models with the best few-shot learner.
- Fine-tuning with very few examples gives worse performance than few-shot: Fine-tuning can hurt performance when done with very few examples (10 in our case). More data is needed.
- Fine-tuning with 500 examples per class gives much better performance than few-shot: Fine-tuning is preferred when you have a larger amount of annotated data.
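Building the training subsets of different sizes can be sketched as follows. This is an assumption-laden illustration: it uses the prompt/completion JSONL layout of OpenAI's legacy fine-tuning format, and the prompt template, the `sample_per_class` and `to_jsonl` helpers, and the toy dataset are hypothetical rather than the setup used in the experiment.

```python
import json
import random
from collections import defaultdict


def sample_per_class(dataset, n_per_class, seed=0):
    """Draw up to n_per_class (text, label) pairs for every label."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in dataset:
        by_label[label].append((text, label))
    subset = []
    for label in sorted(by_label):
        items = by_label[label]
        subset.extend(rng.sample(items, min(n_per_class, len(items))))
    return subset


def to_jsonl(rows):
    """Serialize rows as one JSON object per line (assumed prompt/completion layout)."""
    return "\n".join(
        json.dumps({"prompt": f"Plot: {text}\nGenre:", "completion": f" {label}"})
        for text, label in rows
    )


# Made-up dataset: 2 genres with 3 plots each.
dataset = [(f"{g} plot {i}", g) for g in ("comedy", "drama") for i in range(3)]
train = sample_per_class(dataset, 2)
print(to_jsonl(train))
```

Calling `sample_per_class` with 2, 50, and 500 on a 5-class dataset would produce subsets matching the 10, 250, and 2500 example totals described above.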
Takeaways for few-shot learning
- 10 examples in the prompt is generally enough
- Dynamic few-shot outperforms fixed few-shot
- Diverse examples help on average but may not help every model
- Fine-tuning is preferred when you have more than ~50 training examples per class

