Braintrust

Tweets

Pinned Tweet

Braintrust

@braintrust

Jun 1

Topics is now GA on all plans. Continuously find the patterns worth investigating across your production traffic.

Braintrust

@braintrust

Jun 12

How does your team rank when it comes to shipping quality AI products? Braintrust's AI quality assessment maps your current practices to the next useful step, whether you're still manually checking outputs or already running online scores in production.

Braintrust

@braintrust

Jun 12

Take the assessment → braintrustdata.link/ai-quali…

AI quality assessment - Braintrust

Find gaps in your AI development loop and get an AI quality roadmap.

braintrust.dev

Braintrust

@braintrust

Jun 11

When production issues hit, engineers need to search logs and identify problems in real time. Brainstore delivers query times under one second, even across terabytes of AI observability data. It's 23.9x faster at full text search and 3.73x faster at loading spans compared to leading competitors.

Braintrust

@braintrust

Jun 11

Read more → braintrustdata.link/database…

Brainstore makes AI observability at scale possible - Blog - Braintrust

Real-world benchmarks show Brainstore is up to 24x faster than competitors, making it possible to observe AI systems at production scale.

braintrust.dev

Braintrust

@braintrust

Jun 11

Your agent has brain rot

Braintrust

@braintrust

Jun 10

Traces tell you how customers are using your agents. Topics groups those interactions into patterns so you can uncover opportunities for improvement. Braintrust is hosting a workshop on using Topics to identify customer use cases from production data and turning those observations into agent decisions.

Braintrust

@braintrust

Jun 10

Join us → braintrustdata.link/ship-ai-…

Online workshop: Understand customer use cases · Zoom · Luma

Your production traces show how customers are using your product. Every session is someone trying to get something done. Topics groups those interactions into…

luma.com

Braintrust

@braintrust

Jun 9

Build a full eval pipeline (dataset, prompt, scorer, experiment) using just the Braintrust CLI and skills. In this video, we test GPT-5 on chess puzzles and analyze the results of our experiments with only natural language, no code written. Watch here → braintrustdata.link/CLI-skil…

Braintrust

@braintrust

Jun 8

The inaugural Agent Open is happening June 30 in SF. Come talk about AI observability, then show off your skills at one thing agents can’t do: pickleball. Hosted by the teams at @braintrust, @Cursor_ai, @llama_index, @turbopuffer, @p0, @modal, @browserbase.

Braintrust

@braintrust

Jun 8

See you there → braintrustdata.link/agent-op…

The Agent Open: AI's pickleball tournament · Luma

The inaugural Agent Open SF 2026 is happening. An afternoon of pickleball, food, drinks, AI conversations, and a tournament bracket full of people that take…

luma.com

Braintrust

@braintrust

Jun 5

What's new:
- Topics is now GA, with $249 in credits for Pro plans
- Multi-user human review, with averaged scores
- Workload identity federation for Anthropic, Vertex AI, and Azure
- Run remote evals and sandboxes as experiments
- Automate data preparation with dataset pipelines

Braintrust

@braintrust

Jun 5

Read more → braintrustdata.link/whats-ne…

Product updates - Braintrust

New updates and product improvements

braintrust.dev

Braintrust retweeted

Ankur Goyal

@ankrgyl

Jun 4

x.com/i/article/206263386151…

Braintrust

@braintrust

Jun 4

Raw agent traces can include millions of tokens across hundreds of spans. Too large for direct embedding, too irregular for classic topic modeling, and too high-volume for full-trace LLM classification. Topics solves this by summarizing traces into facets, then continuously embedding, clustering, and classifying them.

Braintrust

@braintrust

Jun 4

It sounds simple, but the hard part is making it work across millions of production traces without blowing out token costs or breaking down at scale. Here's how we built Topics → braintrustdata.link/architec…

How we made continuous trace intelligence possible at scale - Blog - Braintrust

A deep dive into the architecture of Topics.

braintrust.dev

Braintrust

@braintrust

Jun 3

Production traces capture where your AI falls short and what users are trying to do. Building evals with that data is how you catch failures earlier and decide what to ship next. Braintrust is leading a workshop on how to:
- Use the patterns Braintrust surfaces automatically
- Turn them into a labeled eval dataset
- Run the same workflow every time a new pattern shows up

Braintrust

@braintrust

Jun 3

Join us → braintrustdata.link/producti…

Online workshop: Build evals from real production data · Zoom · Luma

Production traces capture where your AI falls short and what users are trying to do. Building evals from that data is how you catch failures earlier and make…

luma.com

Braintrust

@braintrust

Jun 2

Vibes-based testing and manual review don't scale. Automated evals are easy to set up and can make an immediate impact on AI development speed. Learn about three automated approaches to get started quickly with evals: LLM judges, heuristics, and comparative evals.

Braintrust

@braintrust

Jun 2

Read more → braintrustdata.link/getting-…

Getting started with automated evaluations - Blog - Braintrust

Three actionable approaches for engineering teams to get started with automated evaluations.

braintrust.dev

Braintrust retweeted

Ankur Goyal

@ankrgyl

Jun 1

x.com/i/article/206147973632…
