As first author, I am incredibly proud to introduce Gemini-SQL2 🚀. We've built the world's most powerful Text-to-SQL coding LLM, achieving a state-of-the-art score of 80.04% on the highly competitive BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation) benchmark.
Independently evaluated by the BIRD team on a hidden test set of nearly 2,000 tasks, Gemini-SQL2 is the first-ever LLM to break the 80% execution accuracy barrier. This result outperforms submissions from all frontier models across the industry.
Gemini-SQL2 was born out of my nearly year-long internship at Google Research. While its name continues the legacy of its predecessor, I had the privilege of designing, developing, training, and deploying a substantially revamped architecture, harness, and serving system end-to-end.
Text-to-SQL is part of the broader AI coding picture, and also a cornerstone of Database AI. Translating natural language directly into executable SQL enables both human non-experts and LLMs to "talk with" 💬 complex databases in plain language. It is one of the most exciting AI coding frontiers - and one of the few areas where AI is still on its way to closing the gap with human experts.
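For anyone new to the metric, here is a minimal sketch of what "execution accuracy" roughly means: run the predicted SQL and a reference SQL against the same database and check whether the results match. This is only an illustration, not the BIRD team's official evaluator and not anything from Gemini-SQL2 itself; the `orders` table, the sample queries, and the function names are all hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "Ada", 120.0), (2, "Bo", 35.5), (3, "Ada", 60.0)],
    )
    return conn


def execution_match(conn, predicted_sql, gold_sql):
    """Return True when both queries produce the same set of rows."""
    try:
        predicted_rows = set(conn.execute(predicted_sql).fetchall())
    except sqlite3.Error:
        # A prediction that fails to execute counts as incorrect.
        return False
    gold_rows = set(conn.execute(gold_sql).fetchall())
    return predicted_rows == gold_rows


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) query pairs whose results match."""
    hits = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return hits / len(pairs)


# Question: "Which customers placed an order of more than 50?"
GOLD = "SELECT customer FROM orders WHERE total > 50"
GOOD_PREDICTION = "SELECT customer FROM orders WHERE total >= 60"  # different text, same result
BAD_PREDICTION = "SELECT buyer FROM orders WHERE total > 50"  # column does not exist

if __name__ == "__main__":
    conn = build_demo_db()
    print(execution_accuracy(conn, [(GOOD_PREDICTION, GOLD), (BAD_PREDICTION, GOLD)]))
```

The point of comparing results rather than query text is that two differently written queries can both be correct, which is why benchmarks in this space score what the SQL returns rather than how it is phrased.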
A massive thank you to my Google Research team, my collaborators from Google Cloud Research, and leadership for the tremendous support and for helping share this milestone. Wrapping up this journey has been deeply rewarding.
@yanbang_wang, @qitianwu_, Sami Abu-El-Haija, Mohammadreza Pourreza, @michael_galkin, @hemmatihadi, Hailong Li, @yeounoh, Fatma Ozcan, @phanein, @mirrokni
#ai #db #sql #artificialintelligence #AICoding #AgenticAI