Bennet

Jun 13

I didn’t even know TRL with unsloth supports environments. I thought that, due to all the monkeypatching, unsloth depended on a super old TRL version. All in all, my current view is: just use prime-rl or tinker.

@lucaswiman

Jun 13

Replying to @zeeg

Frontrun: a Python testing library for finding and deterministically reproducing concurrency bugs. It uses bytecode tracing, syscall interception, and monkeypatching to force and explore particular orderings of execution. It can parse SQL and Redis commands and model their lock state, which allows it to detect race conditions that cross boundaries, e.g. a deadlock between a threading.Lock object and a Postgres row lock. github.com/lucaswiman/frontr…

Python library for controlling ordering of concurrent events to find and reproduce race conditions - lucaswiman/frontrun

github.com

Raja Patnaik

@RajaPatnaik

May 14

node:vfs landing in Node.js core is one of the most under-priced agent-infra developments of the year, and the agent-tool-design implications go further than most of the conversation around it suggests.

For anyone who missed it: there's a real PR (#61478, roughly 14,000 lines across 66 files) bringing a virtual file system into Node.js core, plus a userland package on Node 22. On the surface that's a node-shop QoL upgrade: bundle a filesystem into your binary, mount in-memory volumes for tests, mock disk without monkeypatching. Read it with an agent-tool hat on and it's a different category of thing entirely.

The thread running through almost every well-designed agent tool surface in 2026 is "expose a filesystem, not a function." File systems are the right abstraction for agents because the model already knows how to use them, the operation set is small and orthogonal (read, write, ls, glob, grep), the state is observable, and the audit log is just a stream of fs ops. Anthropic's research on Skills, the bash-as-universal-tool argument, and the move from JSON-schema tools toward code-execution tools all converge on the same answer: give the agent a `/workspace`, let it use `ls` and `cat`, and most of the abstraction problems collapse.

The issue with that argument has been that "filesystem" in practice usually means "a real directory on a real disk," which forces you into containers or sandboxes just to get the isolation, lifecycle, and observability you want. A virtual file system inside the runtime changes the cost model. You can spin up a per-conversation VFS in microseconds, snapshot it as a single object, fork it for speculative subagent runs, sync deltas to durable storage, and discard the whole thing when the session ends, all without leaving the Node process. That's the same primitive Modal sandboxes give you, except it lives inside your application, not as an external service.

Two patterns to build on top of this once it's stable.

First, agent-scoped scratchpads: each agent run gets its own VFS instance mounted at a known path, the tool surface is shaped like a filesystem, and the parent process can introspect or roll back at any time. The agent can't see another agent's scratch, the developer can replay any agent run from a snapshot, and the agent never has to learn a new tool API.

Second, cache-safe forking: when you want to run N speculative subagents against the same starting state, you fork the VFS (copy-on-write semantics) instead of duplicating the underlying state. Pair that with prompt-prefix caching on the model side and you have a fan-out architecture where both the model's KV state and the agent's filesystem state share a prefix. That's the right shape for parallel agent work, and it's a lot harder to get right when the filesystem is on disk.

There's a tension worth flagging. node:vfs is a Node-only primitive, and the agent ecosystem is split between Node and Python. The same shape exists in Python (pyfilesystem, fsspec) but it's not in the standard library and the agent-tooling community has been slower to adopt it. The teams that get the most out of this pattern over the next year will be the ones who pick a side, commit to the VFS-as-primary-state-store model, and design their tool surface around fs ops from the beginning. Retrofitting it onto an existing agent harness is harder than it looks because so much of the harness assumes "the filesystem is just there."

The longer arc: the agent harness is going to start looking like an embedded operating system, with the VFS, the process supervisor, the credential boundary, and the network policy all owned by the runtime instead of the host. node:vfs is one of the first pieces of that OS landing in a place developers actually use. Watch the next two quarters for Python and Go to catch up.
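
To make the cache-safe forking pattern from the post above concrete, here is a minimal Python sketch of an in-memory file store with copy-on-write forks. It is an illustration only: `CowFS` and all of its methods are hypothetical names invented for this example, and it is not the node:vfs API, pyfilesystem, or fsspec. The assumption is that a fork only needs to share the parent's state as of the moment of the fork, so each fork freezes the current writes into a shared, never-mutated layer and gives both sides a fresh private layer.

```python
from __future__ import annotations

_DELETED = object()  # tombstone: the path was removed in this layer


class CowFS:
    """Hypothetical in-memory file store with copy-on-write forks."""

    def __init__(self) -> None:
        self._base: tuple[dict, ...] = ()  # frozen layers shared with forks
        self._top: dict = {}               # private, mutable layer

    def write(self, path: str, data: bytes) -> None:
        self._top[path] = data

    def read(self, path: str) -> bytes:
        # Newest layer wins: private layer first, then shared layers.
        for layer in (self._top, *reversed(self._base)):
            if path in layer:
                if layer[path] is _DELETED:
                    break
                return layer[path]
        raise FileNotFoundError(path)

    def delete(self, path: str) -> None:
        self.read(path)  # raises FileNotFoundError if the path is absent
        self._top[path] = _DELETED

    def ls(self) -> list[str]:
        merged: dict = {}
        for layer in (*self._base, self._top):  # oldest to newest
            merged.update(layer)
        return sorted(p for p, v in merged.items() if v is not _DELETED)

    def fork(self) -> CowFS:
        # Freeze pending writes into a shared layer; no file data is copied.
        if self._top:
            self._base = self._base + (self._top,)
            self._top = {}
        child = CowFS()
        child._base = self._base
        return child


root = CowFS()
root.write("/workspace/plan.md", b"v1")

# Two speculative runs start from the same state and diverge independently.
run_a, run_b = root.fork(), root.fork()
run_a.write("/workspace/plan.md", b"v2")
run_b.delete("/workspace/plan.md")
root.write("/workspace/notes.md", b"parent only")
```

After this runs, `root` still reads `b"v1"`, `run_a` reads `b"v2"`, and `run_b` lists no files; neither fork sees `notes.md`, because the parent wrote it after forking. Discarding a speculative run is just dropping the object.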

Reda @reda_gouda_

May 13

Replying to @crypto_tom1 @oooooooorion

Yes, there's a world of difference. But if I tell Claude not to use monkeypatching for the tests, and instead to modify the code under test to introduce dependency injection so that a mock can then be injected, can you really say that I "didn't write a line of code"?
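
For readers unfamiliar with the distinction drawn in this reply, the sketch below contrasts the two testing styles in Python. Every function name here is hypothetical and chosen only for illustration; the one library call used is `unittest.mock.patch.object` from the standard library.

```python
import time
from typing import Callable
from unittest import mock


def is_expired_v1(deadline: float) -> bool:
    # Hard-wired dependency: the clock can only be replaced by patching.
    return time.time() > deadline


def is_expired_v2(deadline: float, now: Callable[[], float] = time.time) -> bool:
    # Injected dependency: callers, including tests, may supply their own clock.
    return now() > deadline


def check_with_monkeypatch() -> bool:
    # Temporarily replaces the attribute `time.time` for the whole process;
    # it is restored when the `with` block exits.
    with mock.patch.object(time, "time", return_value=100.0):
        return is_expired_v1(deadline=50.0)


def check_with_injection() -> bool:
    # No global state is touched; the fake clock is passed in explicitly.
    return is_expired_v2(deadline=50.0, now=lambda: 100.0)
```

Both checks exercise the same logic, but the second required changing the signature of the code under test, which is the design decision the reply is pointing at.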

Chmouss @chmousset

May 10

Replying to @paulbohm

Looks like the result of successive monkeypatching of bad architecture decisions

İlhan Neğiş 🛠️💾

@ilhannegis

May 9

Replying to @LeeFowlerCU @teej_dv

that's exactly what monkeypatching is not

Lee Fowler

@LeeFowlerCU

May 9

Replying to @teej_dv

That would be called monkeypatching, and it's been a thing since forever

Chris Painter

@ChrisPainterYup

May 6

Replying to @viemccoy

We've dealt with both egregious cheating (e.g. "monkeypatching the scoring code so that it returns a high score") as well as more subtle cheating (e.g. using legitimate techniques that may be implicitly disallowed by the task instructions). We only mark a run as "cheating" if we think the case for it being cheating is objective and clear enough, but occasionally it does require a good amount of discussion in the team.

Jeremie Strand @jeremie_strand

May 4

Replying to @__kunvar__

congrats on ICML! the monkeypatching-at-runtime finding is wild -- agents figuring out they can just rewrite the eval harness instead of solving the actual task. feels like the next frontier is detecting this stuff in real-time not just in benchmarks

Kunvar Thaman @__kunvar__

May 3

Yes! my solo-authored paper Reward Hacking Benchmark was accepted to ICML :))) We put LLM agents in a tool-rich sandbox, give them multi-step workflows, and measure when they solve the intended task vs take unexpected shortcuts (like monkeypatching files at runtime!) 1/3

Coocoo @CompaCompu

Apr 24

@zkl2333 Thank you for your work*, hopefully it'll be available via `hermes update` in the next 24 hours * on making DeepSeek v4 work better in Hermes Agent without monkeypatching

ひむら @himura4679

Apr 23

I see. Because monkeypatching is so easy to do, it can also be used to cause trouble, and the compiler has to handle that gracefully. #rubykaigi #rubykaigiB

Thomas

@zeroxtlt

Apr 21

Replying to @zeroxtlt @ShopifyDevs

Can you take a look, please, @tobi @liam_at_shopify? I'm monkeypatching this every update :(

Roger Guess

@RogerGuess

Apr 13

TIL: Monkeypatching is a great way to keep secrets out of LM context

Wirebrowser @wirebrowser

Apr 12

Stop stepping manually. Skip execution until a condition flips and jump exactly where it happens: 🔥 loggedIn became true → line X (no monkeypatching, just CDP)

kit sylvester @v0idsea

Apr 12

no amount of overcomplicated monkeypatching could fix this considering LSE hasn't been updated in i'm not checking my fucking watch due to the aforementioned

salamahin @salamahin

Apr 8

Replying to @who_ravn

There's a fucking million backward things in there. Okay, here goes: monkeypatching. self in classes. __init__.py for packages. No proper modularity (partly solved by uv dep groups). Multithreading done ass-backwards. No extension methods. FP as if made for morons. \ for line continuation

Security Harvester @secharvesterx

Apr 8

Runtime JavaScript instrumentation via CDP (no monkeypatching, works inside closures) fcavallarin.github.io/wirebr…

Nicolas Krassas

@Dinosn

Apr 8

Runtime JavaScript instrumentation via CDP (no monkeypatching, works inside closures) fcavallarin.github.io/wirebr…

CDP as a Runtime Instrumentation Engine

Wirebrowser is a CDP-based runtime instrumentation platform for the browser. Think Frida, but for JavaScript running in Chrome — without monkeypatching.

fcavallarin.github.io

Thomas

@zeroxtlt

Apr 8

Hey @ShopifyDevs! Found a CORS bug in your CLI. Made a PR to fix it. Currently surviving with post-install monkeypatching scripts 😅 PR: github.com/Shopify/cli/pull/…

fix: handle CORS preflight in reverse proxy during local dev by 0xtlt · Pull Request #7164 ·...

Summary The HTTP reverse proxy used by shopify app dev does not handle CORS preflight (OPTIONS) requests, which breaks cross-origin fetch calls from UI extensions (e.g. Customer Account extensions ...

github.com
