SmithDB is a perfect example of how far performance can be pushed by having full control over the storage layer. The DataFusion
@vortexdotdev stack seems to be emerging as THE way to build next-generation databases.
#ParquetIsForFloors
We built SmithDB: the database purpose-built for agent observability workloads that now powers many parts of LangSmith.
Agent observability presents a challenging data problem. Agent traces can contain tens of thousands of intermediate spans and large, unbounded payloads. These characteristics are a direct result of agents running over longer time horizons and LLM context windows growing larger.
Traditional data infrastructure was not built to handle the complexities associated with storing and querying this data.
SmithDB gives LangSmith up to 12x performance improvements across the access patterns most important for agent observability. I’ve been working on SmithDB directly with an amazing team over the past few months, and I’m incredibly proud of the results we’re seeing.
I wrote a bit more about the story and engineering challenges behind SmithDB in this blog post.
Additionally, if you’re a systems engineer interested in building the future of agent observability, please reach out!