Gartner has just published one of their famous Hype Cycle curves for 2023, specialized to Generative AI!
The curve is encouragingly optimistic on Vector Databases, predicting we still have 5-10 years to hit the infamous Peak of Inflated Expectations!
Here are 5 reasons why I also think Vector Databases are still an upcoming technology and nowhere near the peak: (1) RAG "GPT-5", (2) RAG Easy Fine-Tuning, (3) Easy Data Ingestion, (4) Generative Feedback Loops, and (5) Self-Driving DBs
1. RAG "GPT-5"
As a TLDR, the next-generation "GPT-5" will likely be a long-context LLM. There is a huge opportunity to pre-train these kinds of LLMs on more naturally long-context data such as podcast transcriptions or code. It is also quite likely that these models will have some retrieval-aware tuning as well, to prevent hallucination relative to the retrieved context.
Many people currently come to Vector DBs with the classic "I have a 20-30 page PDF that I can't fit into ChatGPT" problem. In my opinion this misses the point of Zero-Shot LLM RAG: don't just give the LLM your document -- give it the background knowledge as well! This is why I am a huge fan of the work in LlamaIndex and LangChain to pioneer query engines across multiple search indexes.
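To make the "document plus background knowledge" idea concrete, here is a minimal sketch of querying several search indexes and merging the hits into one prompt. Everything in it is an assumption for illustration: the bag-of-words `embed` function stands in for a real embedding model, the two index names and their passages are made up, and none of this is the LlamaIndex or LangChain API.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' (assumption: stands in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search(passages, query, k=1):
    """Return the k passages most similar to the query."""
    q = embed(query)
    return sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)[:k]

def build_prompt(query, indexes, k=1):
    """Retrieve from every index and assemble a single prompt for the LLM."""
    sections = []
    for name, passages in indexes.items():
        for hit in search(passages, query, k):
            sections.append(f"[{name}] {hit}")
    context = "\n".join(sections)
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical data: one index for the user's PDF, one for background knowledge.
indexes = {
    "my_pdf": [
        "The contract renews every 12 months unless cancelled.",
        "Payment is due within 30 days of invoice.",
    ],
    "background": [
        "Auto-renewal clauses in a contract are regulated in many jurisdictions.",
        "Invoices usually list a payment due date.",
    ],
}
prompt = build_prompt("When does the contract renew?", indexes)
print(prompt)
```

The point of the sketch is only the shape of the flow: each index contributes its best passages, and the LLM sees both the document and the surrounding context.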
2. RAG Easy Fine-Tuning
The tooling for fine-tuning is getting really strong. Quick hat tip to HuggingFace, MosaicML, and Weights & Biases.
Imagine you are a lawyer. In addition to having the relevant laws you need to solve a case, you also need to have the skill of making the case. Making the case could entail surface-level "style" (currently the most common argument for fine-tuning) or more complex compositional generalization that may only be possible to represent in high-dimensional data structures with non-linear interaction effects.
RAG is a fundamental modeling architecture that is perfectly amenable to fine-tuning. RAG generally adds (1) interpretability (you can see the docs that influenced the prediction, although of course the link is not 100% guaranteed), (2) parameter efficiency (by decomposing the model into a retriever and a reader you can get away with cheaper readers, e.g. ATLAS), and (3) continual updating (keeping the data in a parametric-only LLM as fresh as a Kafka stream or what have you is unlikely, although the unlearning stuff is cool).
RAG Fine-Tuning also has a massive opportunity to create better search by training the search models end-to-end with the gradients from the reader: think RLHF propagated back to the embedding models (maybe rankers can be trained like this as well).
3. Easy Data Ingestion
Parsing PDFs into both unstructured text and structured layout information will of course dramatically increase how many people can use Vector DBs. The tooling here is also getting incredibly strong thanks to Unstructured, LlamaIndex, and LangChain. Connecting this with your Twitter APIs, web scrapers, etc. through scheduled cron jobs will be amazing.
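One practical detail of scheduled ingestion is that the same content gets fetched again on every run, so the job should be safe to repeat. Here is a rough sketch of that idea, assuming a plain dict as a stand-in for the database and a content hash as the object ID. The `ingest` function and the `(source, text)` tuples are hypothetical, not part of Unstructured, LlamaIndex, or LangChain.

```python
import hashlib

def content_id(text):
    """Derive a stable ID from the content so re-fetched text maps to the same key."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def ingest(store, documents):
    """Insert only documents whose content is not already stored.

    `store` is a dict standing in for the database (assumption).
    `documents` is an iterable of (source, text) pairs.
    Returns the number of newly added documents.
    """
    added = 0
    for source, text in documents:
        key = content_id(text)
        if key not in store:
            store[key] = {"source": source, "text": text}
            added += 1
    return added

store = {}
# First scheduled run: both documents are new.
first_run = ingest(store, [("pdf", "Quarterly report text"), ("web", "Scraped blog post")])
# Second run: the PDF is unchanged, only the tweet is new.
second_run = ingest(store, [("pdf", "Quarterly report text"), ("twitter", "A new tweet")])
print(first_run, second_run, len(store))
```

In a real pipeline the parsing step and the vector import would sit on either side of this check, but the repeat-safe structure stays the same.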
4. Generative Feedback Loops
RAG innovates on the output from DBs. Generative Feedback Loops, where we save generated or transformed data from an LLM back into the database, will innovate on the fundamental question: what's in the database?
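As a rough sketch of what such a loop looks like, the snippet below reads objects from a database, runs them through a generator, and writes the results back as new objects that point at their source. The dict-based `db`, the `summarize` placeholder (which just keeps the first sentence instead of calling an LLM), and the `gen-` ID prefix are all assumptions made up for this example.

```python
def summarize(text):
    """Placeholder for an LLM call (assumption): keeps only the first sentence."""
    return text.split(".")[0].strip() + "."

def feedback_loop(db, generate):
    """Generate new objects from existing ones and save them back into db.

    Generated objects are marked and linked to their source so they are
    not fed through the generator again. Returns how many were added.
    """
    new_objects = []
    for obj_id, obj in db.items():
        # Skip generated objects and sources that already have an output.
        if obj.get("generated") or f"gen-{obj_id}" in db:
            continue
        new_objects.append({
            "text": generate(obj["text"]),
            "generated": True,
            "source_id": obj_id,
        })
    # Write back after the scan so db is not mutated while iterating.
    for new in new_objects:
        db[f"gen-{new['source_id']}"] = new
    return len(new_objects)

db = {"doc-1": {"text": "Vector search finds similar items. It relies on embeddings."}}
added_first = feedback_loop(db, summarize)
added_second = feedback_loop(db, summarize)  # nothing new to generate
print(added_first, added_second, db["gen-doc-1"]["text"])
```

Keeping the `source_id` link is a design choice worth noting: it lets you trace generated content back to what it was derived from, and it keeps the loop from feeding on its own output.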
This will really get us to the peak in my opinion because it will also evangelize everyone having 100M vectors on their laptops (if managing their own personal DB), or the same kind of thing in a knowledge management platform like Notion / Confluence / GitHub / HuggingFace. Scale will really unlock the value of Vector DBs, not that I really agree with the "NumPy is all you need" argument anyway, which overlooks CRUD compatibility, cloud scaling, symbolic properties, search features like hybrid search / filtering, etc. TLDR: AI will take the documents, pictures, movies, and songs you like and create more of them! You will then need databases to navigate this explosion of content!
For this reason I think it is also important to think of Vector DBs as a combination of traditional DBs, Search Engines, and Recommendation Systems -- because Recommendation Systems have a bit of nuance vs. Search alone, with user representations and more use of symbolic re-rankers like XGBoost -- and potentially an explore-exploit RL component in the recommender (kudos to whoever builds that).
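The simplest version of an explore-exploit component is epsilon-greedy selection, sketched below. This is only one of many possible strategies and is my own illustrative choice; the `scores` dict stands in for whatever a re-ranker would output, and the item names are made up.

```python
import random

def recommend(scores, epsilon, rng):
    """Epsilon-greedy selection over candidate items.

    With probability epsilon, pick a random item (explore);
    otherwise pick the highest-scored item (exploit).
    `scores` maps item -> ranker score (assumed to come from a re-ranker).
    """
    if rng.random() < epsilon:
        return rng.choice(sorted(scores))
    return max(scores, key=scores.get)

rng = random.Random(0)  # seeded so runs are repeatable
scores = {"song_a": 0.2, "song_b": 0.9, "song_c": 0.5}

always_exploit = recommend(scores, epsilon=0.0, rng=rng)   # always the top-scored item
always_explore = recommend(scores, epsilon=1.0, rng=rng)   # always a random item
print(always_exploit, always_explore)
```

A fuller recommender would also update the scores from user feedback, but the selection step already shows the trade-off: a small epsilon mostly trusts the ranker while still surfacing items it would otherwise never show.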
5. Self-Driving DBs
Gorilla is an exciting research project that translates natural language commands to API syntax. Text-to-SQL is making a ton of progress! I think this will generalize not only to Text-to-SQL, but also to the structuring of data itself (e.g. properties, tables, key joins), learned by monitoring and maintaining the system. I also think it's possible to use LLMs to optimize lower-level physical storage configurations.
Between RAG Next-Gen Zero-Shot LLMs, RAG Easy Fine-Tuning, Easy Data Ingestion, Generative Feedback Loops, and Self-Driving DBs -- I think Gartner is right and we still have a long way to go to the peak of Vector Databases!
Thanks for reading! Check out Weaviate! 😎👍