Daniel Schofield

5 Sep 2025

The Document AI space has seen a fundamental shift in the past year. Everyone, from scrappy startups to established players, has pivoted from custom supervised models to wrapping the same handful of closed-source multimodal models. Yet even though we're all using essentially the same approach and the same models under the hood, there's no shortage of “benchmark triumphs” from Document AI vendors touting the best performance on the market.

I find it especially comical when these vendors compare their product against ours at @UnstructuredIO. Instead of comparing their VLM wrapper against our VLM wrapper (which, according to our own benchmarks, outperforms theirs), they compare it to our free, open-source product, a product that doesn't depend on massive, powerful, expensive closed-source models. *blink blink* I'm sorry, but that's like comparing public transportation in Rome to driving an Alfa Romeo 4C Spider convertible through the Tuscan hills: the two were designed with different purposes in mind.

Here’s the truth: when Fortune 500 teams run real head-to-head evaluations, our commercial platform consistently performs on par with or better than the best in the business. Month to month, we trade #1 spots with the leaders.

But the bigger problem is this: benchmark theater is costing enterprises dearly. Choosing a vendor whose own benchmarks tout best-in-class PDF transformation, but who can't process other document types, forces organizations to build a rat's nest of supplemental home-grown capabilities that require management and maintenance and eventually grow to the point where they need to be swapped out for a more scalable solution. Those glossy accuracy charts usually measure PDFs in isolation, while critical data in .docx, .pptx, .eml, .msg, .tiff, .epub, or .xlsx files goes completely unseen.
And what about model fallback, dynamic content-based routing, retries, and all the other features needed to ensure your VLM wrapper actually works at scale? Finally, let's not forget that when it comes to benchmark performance, most vendors fine-tune prompts (to the point of overfitting) to perform well on major public benchmarks.

At the end of the day, document transformation quality isn’t about cherry-picked metrics. It’s about coverage, fidelity, metadata richness, and mitigating the cost of missed information.

Ready to see what benchmarks look like when they reflect real business impact?

🎙️ Join our deep-dive in next week's webinar on Wednesday, Sept. 10: Document Transformation Quality Series: Pushing the Boundaries of Document Transformation Quality
→ Sign up here: unstructured.io/events/pushi…

#DocumentTransformation #BenchmarkTruth #EnterpriseAI #UnstructuredData #DocumentAI #Unstructured #BenchmarkData
