Caleb Gross

Tweets

Pinned Tweet

Caleb Gross

@noperator

Feb 10

1/ Agentic LLMs can automate vuln detection. Very exciting, but doesn't address the hardest part (imo) of vuln research: prioritization. Can we reliably explore the search space and separate signal from noise? I wrote a paper (and OSS tool) to solve this. arxiv.org/pdf/2512.06155

217

104,630

Caleb Gross

@noperator

Jun 11

getting closer to frontier capability @ home. I am personally running ds4 on Framework Desktop (AMD Strix Halo).

> Full SWE-Bench Verified score [for 2-bit quant DeepSeek-V4-Flash] is between 67.5–85%.
> The headline SWE-Bench Verified score for DeepSeek-V4-Flash is 80.8% for full-precision version.
> It is incredibly impressive that the version of the same model having some layers quantized down to 2 bits still performs comparatively well.
> To put it in a perspective, Claude 4.5 Opus scores 76.8% according to the official leaderboard.

antirez @antirez

Jun 11

That's why people using DS4F with DwarfStar, 2 bit quantized, are often surprised by the results. It's not a frontier model but it is not a toy, it is something you can actually use to get work done, and nobody can tell you what to do with it.

720

Caleb Gross

@noperator

Jun 10

beware cognitive surrender

242

Caleb Gross

@noperator

Jun 10

straight from the horse's mouse

David DiMolfetta @ddimolfetta

Jun 9

CISA will soon release a directive pushing agencies to stop treating every cyber vuln as equally urgent, acting director Nick Andersen said. “If we try to say that everything is equally as important, then absolutely nothing’s going to be important.” nextgov.com/cybersecurity/20…

1,112

Caleb Gross

@noperator

Jun 8

I'm here for "Claude Noir" (N-hour) as the offsec-focused version of Claude

> “N-day” has become dangerously misleading. N-hour is closer to the reality we now operate in.

red.anthropic.com/2026/n-day…

2,044

Caleb Gross

@noperator

Jun 8

synergy through the roof in this ds4/amd community collab github.com/antirez/ds4/issue…

Caleb Gross

@noperator

Jun 5

Replying to @antirez @AMD

using 26.04 on framework desktop without issue. latest experimental rocm optimizations on github.com/antirez/ds4/issue… are improving prefill/gen tps by 1.5–2X.

955

Caleb Gross

@noperator

Jun 5

tldr: trim the fat

Paul Graham

@paulg

Jun 5

I strive to make my writing unsummarizable, in the sense that it has so little fluff left in it that if you take any words out, as summaries by definition do, you lose a lot of interesting ideas.

1,512

Caleb Gross

@noperator

Jun 6

(this was a joak)

750

Caleb Gross retweeted

Paul Graham

@paulg

Jun 5

Brevity is more than politeness to the reader. Compression is understanding.

137

200

2,933

149,863

Caleb Gross

@noperator

Jun 5

framework makes the dramework 💪

antirez @antirez

Jun 5

I just received a Framework Desktop (Strix Halo) courtesy of @AMD in order to merge and continue the development in DwarfStar of the ROCm support (currently community handled). What Linux distro should I install? Ubuntu 24.04.4 LTS which is officially supported, or Fedora 43?

433

Caleb Gross

@noperator

May 30

This concern makes sense if the medium doesn't matter. Perhaps we should question our desire for raw, unmediated information.

Steve McGuire

@sfmcguire79

May 29

“AI is demoralizing.” A Princeton Professor says he kept wondering this semester (while lecturing) if his students would be better off learning from Claude:

1,017

Caleb Gross

@noperator

May 30

Is that all a professor is—someone who "knows the most"? Is that all education is—information acquisition?

348

Caleb Gross

@noperator

May 30

thanks for the signal boost @antirez :) first time occupying the top spot on hacker news front page.

Caleb Gross

@noperator

May 28

x.com/i/article/206009633034…

220

74,816

Caleb Gross

@noperator

May 28

x.com/i/article/206009633034…

788

294,818

Caleb Gross

@noperator

May 23

marked safe from the earthquake in Hawaii 🤙

1,323

Caleb Gross retweeted

Goni Zahavy

@gonizahavy

May 21

My MLX Vulkan backend just passed both CPP AND Python test suites!!

28,421

Caleb Gross

@noperator

May 21

betting that GitHub's "Download ZIP" button gets _way_ more clicks since the advent of LLMs. I very frequently drop entire zipped codebases in front of ChatGPT.

532

Caleb Gross retweeted

Simone Margaritelli

@evilsocket

May 18

Earlier today Cloudflare's CSO shared how they tested Anthropic Mythos using an unreleased 8-stage vulnerability-discovery agent. So I asked Opus to implement the agent for me, it works via Claude SDK with a Pro or Max subscription, no API. Enjoy github.com/evilsocket/audit

103

561

47,809

Caleb Gross retweeted

the tiny corp

@__tinygrad__

May 18

Replying to @halvarflake

I find most computer security people have a grounded notion on AI because they have practical experience with things like fuzzing and z3 and see things as search. Search is powerful, but the space is bounded, and even within a space it can be just hard. AI is mainstream search.

1,111

Caleb Gross retweeted

Sudo su

@sudoingX

May 13

buy a gpu. 3090, 4090, dgx spark, whatever fits your budget. tier doesn't matter. running your first local model does. the moment your first prompt lands with no api between you and the model, your brain rewires. that single moment is worth more than every take you'll ever read on a timeline.

644

30,734

Caleb Gross

@noperator

May 11

a benefit of working from home: I can dictate a stream-of-consciousness rant into my mic while iterating on ideas with Claude. not sure how in-office folks handle this.

339