A 2-minute introduction to the fundamental building block behind Large Language Models:
Text Embeddings
The Internet is mainly text. For centuries, we've captured most of our knowledge using words, but there's one problem:
Neural networks hate text.
Judging by how good language models are today, this might not be obvious, but turning words into numbers is more complex than you think.
Imagine a 4-word vocabulary: King, Queen, Prince, and Princess.
The most straightforward approach to turning our vocabulary into numbers is to use consecutive values:
• King → 1
• Queen → 2
• Prince → 3
• Princess → 4
Unfortunately, neural networks tend to see what's not there. Is the concept of a princess four times as important as a king? How can we prevent the network from misreading these values?
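To make the problem concrete, here is a minimal Python sketch of the consecutive-value encoding. The name `word_to_id` is mine for illustration, not part of any library. Arithmetic on these IDs works fine, but the results suggest relationships that don't exist.

```python
# Consecutive-value encoding from the list above.
word_to_id = {"King": 1, "Queen": 2, "Prince": 3, "Princess": 4}

# Arithmetic on the IDs is well defined, but the results carry no meaning.
ratio = word_to_id["Princess"] / word_to_id["King"]
gap_king_queen = abs(word_to_id["King"] - word_to_id["Queen"])
gap_king_princess = abs(word_to_id["King"] - word_to_id["Princess"])

print(ratio)              # 4.0: "Princess" looks four times as large as "King"
print(gap_king_queen)     # 1
print(gap_king_princess)  # 3: "King" looks closer to "Queen" than to "Princess"
```

Nothing about the words justifies those numbers. They only reflect the order in which we happened to list the vocabulary.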
We need a better representation.
Instead of using a single number per word, we can use a vector of ones and zeros to differentiate each word (a one-hot encoding):
• King → [1, 0, 0, 0]
• Queen → [0, 1, 0, 0]
• Prince → [0, 0, 1, 0]
• Princess → [0, 0, 0, 1]
Notice how we create a different vector for each word by changing the position of the 1. This encoding fixes the problem of a network misinterpreting ordinal values but introduces a new one.
According to the Oxford English Dictionary, there are 171,476 words in use. If we wanted to represent the entire language, every word would need a vector with 171,476 components, all of them zeros except for a single 1.
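The one-hot scheme and its cost can be sketched in a few lines of Python. This is an illustration under my own naming (`one_hot`, `vectors`, `distances` are hypothetical helpers), using only the standard library. It also shows a side effect worth noticing: every pair of one-hot vectors is exactly the same distance apart, so the encoding says nothing about which words are related.

```python
import math

vocab = ["King", "Queen", "Prince", "Princess"]

def one_hot(word, vocab):
    """Return a vector with a 1 at the word's position and 0 everywhere else."""
    return [1 if w == word else 0 for w in vocab]

vectors = {w: one_hot(w, vocab) for w in vocab}

# Distance between every pair of distinct words (6 pairs for 4 words).
distances = {
    (a, b): math.dist(vectors[a], vectors[b])
    for a in vocab
    for b in vocab
    if a < b
}

# Cost of scaling up to the word count quoted above.
vocab_size = 171_476
zero_fraction = (vocab_size - 1) / vocab_size  # share of zeros in one vector
total_entries = vocab_size * vocab_size        # entries needed for all words

print(vectors["Prince"])          # [0, 0, 1, 0]
print(set(distances.values()))    # a single value, sqrt(2), for every pair
print(zero_fraction > 0.99999)    # True
print(total_entries)              # about 29 billion entries
```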
Here is where the idea of "word embeddings" enters the picture.
We can use a different model to learn vectors that represent words. These vectors will be much shorter and dense (mostly non-zero values), and they will have a crucial characteristic:
Vectors representing related words should be close to each other.
The attached image is a two-dimensional chart where I placed the four words from our vocabulary. I organized them so the words in each pair, King/Queen and Prince/Princess, are close to each other. That's the crucial part!
But something magic happens.
Move on the horizontal axis from left to right. We go from masculine (King and Prince) to feminine (Queen and Princess). Our embedding encodes the concept of "gender"!
And if we move on the vertical axis, we go from young (Prince and Princess) to old (King and Queen). Our embedding also encodes the concept of "age"!
We can derive the new vectors from the coordinates of the chart:
• King → [3, 1]
• Queen → [3, 2]
• Prince → [1, 1]
• Princess → [1, 2]
The first component is the position on the vertical axis and represents the concept of "age": King and Queen have a value of 3, indicating they are older than Prince and Princess, which have a value of 1.
The second component is the position on the horizontal axis and represents the concept of "gender": King and Prince have a value of 1, indicating male, while Queen and Princess have a value of 2, indicating female.
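Here is a small Python sketch that checks these properties using the four vectors from the list above. The helper names (`add`, `sub`, `embedding`) are mine for illustration. It verifies that related words sit close together and that moving along one component changes only one concept.

```python
import math

# Vectors taken from the list above: [age, gender].
embedding = {
    "King": [3, 1],
    "Queen": [3, 2],
    "Prince": [1, 1],
    "Princess": [1, 2],
}

def sub(a, b):
    """Component-wise a - b."""
    return [x - y for x, y in zip(a, b)]

def add(a, b):
    """Component-wise a + b."""
    return [x + y for x, y in zip(a, b)]

# Related words are close; less related words are farther apart.
d_king_queen = math.dist(embedding["King"], embedding["Queen"])
d_king_prince = math.dist(embedding["King"], embedding["Prince"])
print(d_king_queen, d_king_prince)  # 1.0 2.0

# The "gender" step is the same for both generations.
gender_old = sub(embedding["Queen"], embedding["King"])
gender_young = sub(embedding["Princess"], embedding["Prince"])
print(gender_old, gender_young)  # [0, 1] [0, 1]

# The "age" step is the same for both genders.
age_male = sub(embedding["King"], embedding["Prince"])
age_female = sub(embedding["Queen"], embedding["Princess"])
print(age_male, age_female)  # [2, 0] [2, 0]

# Start at Prince and apply the King-to-Queen step: we land on Princess.
result = add(embedding["Prince"], gender_old)
print(result == embedding["Princess"])  # True
```

Real embeddings are learned rather than hand-placed, so relationships like these hold only approximately there.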
I used two dimensions for this example because we only have four words, but using more dimensions would allow us to represent many more concepts besides gender and age. For instance, OpenAI's largest GPT-3 model uses vectors with 12,288 dimensions.
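In practice, a model stores its embeddings as a table with one row per word and one column per dimension. The sketch below is my own illustration, not how any particular model implements it: `dim = 8` is an arbitrary choice, and random numbers stand in for values a real model would learn. It shows that looking up a row gives the same result as multiplying the word's one-hot vector by the table, which is how the two representations connect.

```python
import random

vocab = ["King", "Queen", "Prince", "Princess"]
dim = 8  # hypothetical number of dimensions, chosen only for the example

# Random placeholder values; a trained model would learn these.
rng = random.Random(0)
table = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in vocab]

def embed(word):
    """Look up the word's row in the table."""
    return table[vocab.index(word)]

def one_hot(word):
    """One-hot vector for the word."""
    return [1 if w == word else 0 for w in vocab]

def one_hot_times_table(word):
    """Multiply the one-hot vector by the table (vector-matrix product)."""
    oh = one_hot(word)
    return [
        sum(oh[i] * table[i][j] for i in range(len(vocab)))
        for j in range(dim)
    ]

queen = embed("Queen")
print(len(queen))                             # 8
print(queen == one_hot_times_table("Queen"))  # True
```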
Embeddings are the backbone of generative AI models. They encode complex relationships in a compact form. However, we need to fine-tune these embeddings to tailor a model for specific tasks.
Fine-tuning adjusts the model to better suit particular applications, refining its responses to be contextually relevant. Fine-tuning a model is a complex, expensive process. It takes a lot of time, effort, and GPU computing. Fortunately, techniques like LoRA and QLoRA help you fine-tune large models faster and more cheaply than ever before.
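To give a feel for why LoRA is cheaper, here is a minimal pure-Python sketch of its core idea: keep the original weight matrix W frozen and train only two small matrices, A and B, whose product is added to W's output. The sizes, names, and the `matvec` helper are my own assumptions for illustration. This is not the API of any fine-tuning library, and it leaves out training entirely. QLoRA additionally stores the frozen weights in a compressed 4-bit form, which this sketch does not show.

```python
import random

rng = random.Random(0)
d_in, d_out = 6, 4  # hypothetical layer size
r, alpha = 2, 4     # hypothetical LoRA rank and scaling factor

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Frozen pretrained weights (random placeholders here).
W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
# Trainable low-rank factors: A starts small and random, B starts at zero.
A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
B = [[0.0] * r for _ in range(d_out)]

def lora_forward(x):
    """Output of the adapted layer: W x + (alpha / r) * B (A x)."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

x = [rng.gauss(0, 1) for _ in range(d_in)]

# Because B starts at zero, the adapted layer initially matches the original.
same_at_start = lora_forward(x) == matvec(W, x)
print(same_at_start)  # True

def full_params(n_in, n_out):
    """Weights updated by ordinary fine-tuning of one layer."""
    return n_in * n_out

def lora_params(n_in, n_out, rank):
    """Weights updated by LoRA for the same layer."""
    return rank * (n_in + n_out)

# For a 4096 x 4096 layer with rank 8, LoRA trains 256 times fewer weights.
print(full_params(4096, 4096))     # 16777216
print(lora_params(4096, 4096, 8))  # 65536
```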
If you have a model and want to fine-tune it, check out @monsterapis' platform.
Monsterapi.ai built the first platform that offers no-code fine-tuning of open-source models. They sponsored me and gave me 10,000 free credits for anyone who uses the code "SANTIAGO" in their dashboard.
If you want to read their latest updates, get free credits and special offers, join their Discord server:
discord.com/invite/mVXfag4kZ…