Gm, CT.
I just read an article on Retrieval-Augmented Generation (RAG) by
@0xPrismatic, on how it attempts to make LLMs smarter, especially by reducing hallucinations (which freak me out).
(The article was featured on his Chain of thoughts site, btw.)
So, why do LLM hallucinations freak me out?
>Because they sound confident even when they’re wrong, and if the subject of the output is outside your field, how do you even know if it’s correct or wrong?
Exactly. You don’t know.
>Wrong info spreads quickly online, especially if it looks polished, and in real-world settings like healthcare, finance, or academic writing, misinformation can be costly.
I, for one, know how this feels 😅.
Years back, when I first used ChatGPT for advanced research in school, I asked it for Harvard-style academic references to match the citations in my paper.
It gave me good-looking citations, yes, but being me with my trust issues, I double-checked, only to find out that these citations didn’t exist.
I had to ask it directly, “Are these real?” and it admitted they weren’t.
(Not some, btw. None of them were real.)
It wasn’t lying; it just didn’t have access to academic databases, so it generated what looked like references based on patterns.
Now, I know this was peculiar to the older GPT models, but it is a prime example of what hallucination looks like.
So what does RAG have to do with all this?
RAG stands for Retrieval-Augmented Generation. As
@0xPrismatic puts it, “RAG is a framework that lets LLMs access external knowledge at runtime”.
Simply put, the model retrieves data before responding instead of relying only on information from pre-training.
>You ask a question.
>The system retrieves documents that look relevant (from wherever you’ve pointed it: company docs, a website, internal files) and passes them to the LLM.
>The LLM uses both its original training and the fresh material to answer.
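For the devs in the replies, here is a rough Python sketch of those three steps. Everything in it is an assumption for illustration, not how the article or any real product implements it: `DOCS` is a made-up mini knowledge base, `retrieve` uses toy word-overlap scoring (real systems typically use embeddings and a vector store), and `fake_llm` is just a stand-in for an actual model call.

```python
import re

# Hypothetical knowledge base; in practice: company docs, a website, internal files.
DOCS = [
    "RAG lets LLMs access external knowledge at runtime.",
    "Hallucinations are confident answers that are not grounded in facts.",
    "Pre-training gives a model fixed knowledge up to a cutoff date.",
]


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, docs, k=2):
    """Step 2: rank docs by how many words they share with the question (toy scoring)."""
    q = tokenize(question)
    ranked = sorted(docs, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_docs):
    """Put the retrieved material in front of the question."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return (
        "Answer using the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


def fake_llm(prompt):
    """Stand-in for a real LLM call (assumption: swap in your own client here)."""
    return f"[model answer based on a prompt of {len(prompt)} characters]"


def rag_answer(question, docs=DOCS):
    """Steps 1-3: take a question, retrieve context, then generate."""
    context_docs = retrieve(question, docs)
    prompt = build_prompt(question, context_docs)
    return fake_llm(prompt), context_docs


answer, sources = rag_answer("What does RAG let LLMs access?")
print(sources[0])
```

The part worth noticing is that the sources come back alongside the answer, so you can check what the model was actually given.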
In comparison:
>An LLM-only model would generate responses from its fixed training.
Responses are fast, but it can hallucinate and it can’t cite new info.
>A retrieval-only system (no LLM) would be able to retrieve docs and show them raw.
We would have accurate sources, but no synthesis or context.
>An LLM + RAG model would retrieve data before responding, leading to fewer hallucinations.
What difference would RAG make?
A lot. Like in my case, if RAG had been active, the system wouldn’t have had to make up citations. It would have searched actual databases (if connected), pulled real papers, and then used those to cite properly. The whole point is to reduce false information in responses, and that is where AI development is headed.
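To make my citation story concrete, here is a tiny Python sketch of the idea, under loud assumptions: `PAPERS` is a dummy stand-in for a connected academic database (the entries are placeholders, not real papers), the title matching is naive, and the reference format is only a simplified Harvard-like string. The point is just that references get built from records that exist, and when nothing matches, the system says so instead of inventing one.

```python
# Dummy stand-in for a connected academic database (placeholder records, not real papers).
PAPERS = [
    {"authors": "Author, A.", "year": 2021, "title": "Placeholder study on retrieval systems"},
    {"authors": "Author, B.", "year": 2019, "title": "Placeholder survey of language models"},
]


def find_papers(topic, papers):
    """Return records whose title contains any word from the topic (naive matching)."""
    words = topic.lower().split()
    return [p for p in papers if any(w in p["title"].lower() for w in words)]


def harvard_reference(paper):
    """Format a record as a simplified Harvard-like reference string."""
    return f"{paper['authors']} ({paper['year']}) {paper['title']}."


def cite(topic, papers=PAPERS):
    """Only cite what was actually retrieved; admit it when nothing is found."""
    matches = find_papers(topic, papers)
    if not matches:
        return ["No matching source found."]
    return [harvard_reference(p) for p in matches]


print(cite("retrieval"))
print(cite("quantum"))
```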
Really great read btw 👏