LLMs ace bar exams, but even the best gets 1 in 12 local queries wrong.
We tested 4 leading LLMs (Claude, GPT, Gemini, Perplexity) on 345 real-world local search prompts (finding restaurants, checking hours, planning routes, booking tables), run with and without web search, for 2,415 evaluations in total.
Every recommended place was verified against Google Search and Maps. OpenAI's GPT leads (90.7/100), followed by Gemini (86.4), Claude (85.9), and Perplexity (80.4). No single provider wins everything, and rankings shift by task type. Even GPT, the best performer, recommends a place that doesn't exist, has permanently closed, or is in the wrong neighborhood 8% of the time.
BTW, we solve this problem at VOYGR. Full report here:
voygr-tech.github.io/llm-loc…