Quesma

Pinned Tweet

Quesma @QuesmaOrg

Jan 26

Recently we built OTelBench – a benchmark to test how well LLMs handle OpenTelemetry instrumentation. We tested 14 models. The best (Claude Opus 4.5) hit only 29%. These weren't trick questions, just a small subset of typical SRE tasks. Link here: quesma.com/blog/introducing-…

1,127

Quesma retweeted

Piotr Migdal @pmigdal

Feb 27

AI Ghidra by NSA = reverse-engineering fun. I am speaking at @AITinkerers Warsaw, 4th Mar 2026. One of my favorite event series - by and for the creators community. Vibe-resurrecting an old game from binaries 👾 and vibe-hardware-ing a LED backpack 🎒🌈.

372

Quesma retweeted

Piotr Migdal @pmigdal

Feb 10

Claude can code, but can it read machine code? We gave AI agents access to Ghidra (a decompiler by the NSA) and tasked them with finding hidden backdoors in servers - working solely from binaries, without any access to source code. See our BinaryAudit: quesma.com/blog/introducing-…

179

1,442

231,926

Quesma retweeted

Ryan Marten

@ryanmart3n

Jan 25

Great to see the community releasing benchmarks in @harborframework now. These are invaluable resources for collectively building the most useful agents.

Jacek Migdal

@jakozaur

Jan 25

Replying to @ryanmart3n

Last week @QuesmaOrg released “terminal-bench-sre-part-1” called OTelBench in Harbor. Another release is coming soon. Maybe even next week.

1,740

Quesma retweeted

Jacek Migdal

@jakozaur

Jan 9

I used to cite Gartner, now I quote @GergelyOrosz and his Pragmatic Engineer. Enjoy our new blog post: quesma.com/blog/prompts-sour…

261

Quesma @QuesmaOrg

24 Nov 2025

Finally, an AI that can draw a map without getting lost. Nano Banana Pro uses tools to create factually correct infographics - and it's a game-changer. quesma.com/blog/nano-banana-…

Nano Banana Pro: raw intelligence with tool use - Quesma Blog

Finally, an AI that can draw a map or create an infographic. The capability of leveraging tools pushed the frontier of image generation.

quesma.com

263

Quesma retweeted

Jacek Migdal

@jakozaur

5 Nov 2025

Postmortems are painful to write, especially this one. Sharing my startup Quesma journey so far. quesma.com/blog/database-gat…

A postmortem on our $2.5M database gateway: lessons from pilot purgatory - Quesma Blog

We had a great team, $2.5M, and a validated problem. A year later, we sold our IP for parts. Here’s what we learned about urgency and co-foundership.

quesma.com

2,040

Quesma @QuesmaOrg

24 Oct 2025

Interesting use case for AWS Lambda that we explored: sandboxing AI-generated code. We tried WebAssembly first but hit a wall. So we scrapped that experiment in favor of AWS Lambda with Docker containers in an isolated VPC. Full writeup from @pmigdal: awsfundamentals.com/blog/san…

Running Untrusted Code Safely at Scale with AWS Lambda

A case study on how Quesma built a secure, isolated execution environment on AWS Lambda to safely run untrusted, AI-generated code in production.

awsfundamentals.com

Tobias Schmidt @tpschmidt_

24 Oct 2025

Lambda has tons of use cases, but one I've missed: using it as some kind of sandbox for running AI-generated code. Lambda's isolation and scaling are a solid fit for this problem.

183

Quesma retweeted

AISecHub

@AISecHub

22 Oct 2025

The security paradox of local LLMs - quesma.com/blog/local-llms-s… by @jakozaur at @QuesmaOrg If you’re running a local LLM for privacy and security, you need to read this. Our research on gpt-oss-20b (for OpenAI’s Red‑Teaming Challenge) shows they are much more prone to being tricked than frontier models. When attackers prompt them to include vulnerabilities, local models comply with up to 95% success rate. These local models are smaller and less capable of recognizing when someone is trying to trick them. #AISecurity #LLMSecurity #LocalLLM #GenAI #MLOps #ModelRisk #DataPrivacy #AIPrivacy #PromptInjection #AIThreats #AIGovernance #EdgeAI

The security paradox of local LLMs - Quesma Blog

Local LLMs prioritize privacy over security. Our research reveals a 95% backdoor injection success rate.

quesma.com

350

Quesma @QuesmaOrg

18 Sep 2025

Can AI compile 22-year-old code? We built CompileBench to find out. We know that LLMs can vibe-code or even win IOI, but what about dependency hell or legacy build systems? (image based on XKCD 2347)

ALT Cartoon about dependency hell; tangled ‘dependencies’ making simple tasks complex.

194

Quesma @QuesmaOrg

18 Sep 2025

Cost-efficiency crown: @OpenAI. Across difficulties, OpenAI models dominate the Pareto frontier of cost. GPT-5-mini (high reasoning) is a great price/perf pick; GPT-4.1 is the fastest with solid wins.

ALT Scatter plot of success vs cost, highlighting OpenAI models.

137

Quesma @QuesmaOrg

18 Sep 2025

See the full ranking and every run (logs, commands, binaries), methodology & code: ▶️ compilebench.com 💻 github.com/QuesmaOrg/Compile… 📃 quesma.com/blog/introducing-…

CompileBench

Benchmark of LLMs on real open-source projects against dependency hell, legacy toolchains, and complex build systems.

compilebench.com

101

Quesma @QuesmaOrg

17 Sep 2025

Our blog post is second on Hacker News. Enjoy!

2,932

Quesma @QuesmaOrg

22 May 2025

Our new blog post about Apache Iceberg limitations: quesma.com/blog-detail/apach…

149

Quesma @QuesmaOrg

9 May 2025

At #IcebergSummit 2025, Ryan Blue unveiled Iceberg beyond Java, plus the path to Table Spec V3 & forward to V4. Przemysław Delewski’s new blog covers Fokko Driesprong on PyIceberg, Matt Topol on Go, Julien Le Dem on modular DBs. Essential read for next-gen data platforms. Link👇

197

Quesma @QuesmaOrg

9 May 2025

quesma.com/blog-detail/highl…

Quesma @QuesmaOrg

8 May 2025

Iceberg Table V3 is coming: dremio.com/blog/apache-icebe…

What’s New in Apache Iceberg Format Version 3? | Dremio

Explore what Apache Iceberg V3 brings with support for new data types, schema evolution controls and high-performance scalability at scale.

dremio.com

113

Quesma retweeted

Piotr Migdal @pmigdal

24 Apr 2025

Everything is better when Kawaii 🌸🌸🌸: Titanic survival rates with freshly-released Quesma Charts. app.charts.quesma.com/s/20bv… At @DataCouncilAI conference in Oakland with Jacek Migdał. #dataViz @QuesmaOrg @jakozaur

311