Historically, unstructured data has dominated the spotlight in AI, while the mission-critical structured data that drives most enterprise workflows has remained under-leveraged, with few proven recipes for AI workloads.
Today, we’re changing that by fully open-sourcing Contextual-SQL, a state-of-the-art Text-to-SQL pipeline that ranks highly on the BIRD benchmark and that you can run entirely on-prem.
A surprisingly simple pipeline delivers these results by leaning on two core ideas:
📖 Context beats parameters
Going from DDL → mSchema (table and column comments) → mSchema plus one few-shot example lifts execution accuracy from 54.7% to 62.5%. Before reaching for a larger model, enrich your schema docs and drop in a golden demo query.
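To make the idea concrete, here is a minimal sketch of a prompt that carries a commented schema plus one golden example. The text layout only loosely imitates mSchema and is not the exact format Contextual-SQL uses. The helper names (`describe_schema`, `build_prompt`), the `comments` dictionary, and the `orders` table are all made up for illustration.

```python
import sqlite3


def describe_schema(conn, comments):
    """Render every table with its column types and human-written comments.

    `comments` maps (table, column) to a note; (table, None) holds the
    table-level note. This layout is an assumption, not the real mSchema spec.
    """
    lines = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        table_note = comments.get((table, None), "")
        lines.append(f"# Table: {table}" + (f" -- {table_note}" if table_note else ""))
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for _cid, col, ctype, *_rest in conn.execute(f'PRAGMA table_info("{table}")'):
            note = comments.get((table, col), "")
            lines.append(f"  ({col}: {ctype})" + (f" -- {note}" if note else ""))
    return "\n".join(lines)


def build_prompt(schema_text, example, question):
    """Combine the schema, one few-shot (question, SQL) pair, and the new question."""
    example_question, example_sql = example
    return (
        f"Schema:\n{schema_text}\n\n"
        f"Example question: {example_question}\n"
        f"Example SQL: {example_sql}\n\n"
        f"Question: {question}\nSQL:"
    )


# Hypothetical demo database and comments.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, placed_at TEXT)")
comments = {
    ("orders", None): "one row per customer order",
    ("orders", "amount"): "order total in USD",
    ("orders", "placed_at"): "ISO-8601 date the order was placed",
}
prompt = build_prompt(
    describe_schema(conn, comments),
    ("How many orders are there?", "SELECT COUNT(*) FROM orders"),
    "What is the total revenue?",
)
print(prompt)
```

The comments are where the context lives: a bare `amount REAL` says nothing about currency or meaning, while the annotated line gives the model something to ground on.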
📈 Scale at inference
Spin up 1,000 diverse SQL candidates in parallel, filter out invalid queries with a fast sqlite3 check, then rank what’s left by combining a lightweight reward model, built on the same Qwen base, with log-prob confidence. That single trick bumps pass@1 to ~73%, which is cheaper and cleaner than fine-tuning.
The whole flow is just five steps: generate → filter → rank → pick → run, and it lives on GitHub. Fork it, point it at your schema, and ship a private text-to-SQL solution.
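The filter → rank → pick → run part of that flow can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code in the repository: candidates arrive with precomputed scores (the generator and reward model are left out), validity is approximated by asking SQLite to plan the query with `EXPLAIN QUERY PLAN`, and the reward score and log-prob are combined with a made-up linear weight (`logprob_weight`).

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Candidate:
    sql: str
    reward: float   # hypothetical reward-model score, higher is better
    logprob: float  # hypothetical mean token log-prob from the generator


def is_valid(conn, sql):
    """Cheap validity check: let SQLite compile and plan the query without running it."""
    try:
        conn.execute(f"EXPLAIN QUERY PLAN {sql}")
        return True
    except (sqlite3.Error, sqlite3.Warning):
        # Syntax errors, unknown tables or columns, multi-statement input.
        return False


def pick_best(conn, candidates, logprob_weight=0.1):
    """Filter invalid candidates, then return the highest-scoring survivor (or None)."""
    valid = [c for c in candidates if is_valid(conn, c.sql)]
    if not valid:
        return None
    # Assumed scoring rule: reward plus a small weighted log-prob term.
    return max(valid, key=lambda c: c.reward + logprob_weight * c.logprob)


# Hypothetical demo data standing in for the generated candidates.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 5.5)])

candidates = [
    Candidate("SELECT SUM(amount) FROM orders", reward=0.90, logprob=-0.2),
    Candidate("SELECT SUM(amt) FROM orders", reward=0.95, logprob=-0.1),   # unknown column
    Candidate("SELEC * FROM orders", reward=0.80, logprob=-0.3),           # syntax error
    Candidate("SELECT COUNT(*) FROM orders", reward=0.40, logprob=-0.1),
]

best = pick_best(conn, candidates)
result = conn.execute(best.sql).fetchall()
print(best.sql, result)
```

Note that the candidate with the highest raw reward is dropped because it references a column that does not exist, which is the point of filtering before ranking.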
For a deeper dive, code, and benchmarks, see Sheshansh’s thread and the full blog post below.
Excited to release Contextual-SQL!
🏆#1 local Text-to-SQL system that is currently top 4 (behind API models) on the BIRD benchmark!
🌐Fully open-source, runs locally
🔥MIT license
🧵