clay

clay

26 Photos and videos

Tweets

Pinned Tweet

clay

@deforestpeg

Jun 13

i spent weeks getting claude to play pokemon red. then i wired in codex to pick the objectives and claude to execute them. an agent that plays the game largely on its own picks a goal, runs it out, picks the next one. it never cost me much money. that's exactly the problem i was on a subscription. so when the agent looped, when it resent the same context every single call, when it ground the same fight over and over i never saw a dollar. i hit the usage limit and waited for the window to reset. the waste was real. i just couldn't see it. flat rate hides it from you. here's what nobody tells you: the moment you move off a subscription onto the api, every one of those inefficiencies grows a price tag. the loop you never noticed is now a line item. the oversized prompt you forgot about is billed in full every call, forever. most people learn this from the invoice. after. building agents for months, i kept hitting the same five mistakes: - a top tier model doing work a cheaper one nails - no prompt caching - a giant system prompt billed at full price on every single call - retries nobody logs - context resent instead of cached i kept typing those same five fixes into replies under every "my ai bill is insane" post i saw. so i built the thing that finds them in your actual logs. it's called spendlens. no llm anywhere in the analysis every number traces back to a formula you can check. on the demo workload (synthetic, 30 days, every inefficiency labeled on purpose): $2,330 of spend, $1,038 of it recoverable. the single biggest fix was one 6k-token system prompt, billed at full price 24,000 times. one cache_control block serves it at 10%. $378 back from one change. and it refuses to extrapolate a monthly number from three days of logs. because that's marketing, not analysis. i don't have a horror story bill to show you. i was on a subscription the whole time the cost stayed invisible to me, same as it does for you, right up until it isn't. spendlens makes it visible before the invoice does. live, no signup. link below.

875

clay

clay

@deforestpeg

Jun 13

875

clay

clay

@deforestpeg

Jun 13

spendlens.dev — don't have logs handy? one click sample on the upload page, all five detectors fire in ~10 seconds.

SpendLens — AI Spend Intelligence

Upload your LLM API logs. Get a ranked savings report with defensible math. Demo: $2,330/mo spend, $1,038/mo recoverable.

spendlens.dev

170

clay

clay

@deforestpeg

Jun 13

the few seconds of thinking and then getting Model isn't available

265

clay

clay

@deforestpeg

Jun 11

everyone complains about AI api costs. almost nobody optimizes. i kept typing the same 5 fixes in replies so i built the thing that finds them in your actual logs the demo workload (synthetic, 30 days, every inefficiency labeled): $2,330 spend, $1,038 of it recoverable biggest single fix: a 6k-token system prompt billed at full price 24,000 times. one cache_control block serves it at 10% of the price $378 back no llm anywhere in the analysis. every number traces to a formula, and it refuses to extrapolate monthly savings from 3 days of logs because that's marketing, not analysis

1,079

clay

clay

@deforestpeg

Jun 11

live here, no signup: spendlens.dev/ don't have logs handy? there's a one click sample on the upload page all five detectors fire on it, takes ~10 seconds

SpendLens — AI Spend Intelligence

Upload your LLM API logs. Get a ranked savings report with defensible math. Demo: $2,330/mo spend, $1,038/mo recoverable.

spendlens.dev

283

clay

clay

@deforestpeg

Jun 11

Last update ended with the agent taking the Poke Flute from Pokemon Tower. This is why. The Snorlax blocking Route 12. Its first move after the rescue: walk up to the sleeping roadblock, open the bag, play the flute. Took the fight, cleared the road, kept moving south. Nobody coded that in. The model just knows Pokemon.

3:00

22,229

clay

clay

@deforestpeg

Jun 11

Watch live: codexplays.games/pokemon-red

923

clay

clay

@deforestpeg

Jun 10

Badge 4 of 8. Same save, no resets. Then the wall: Pokemon Tower broke the agent for days. So I rebuilt it — Codex picks the objectives now, the machinery just walks. First night on the new brain: beat the ghost Marowak, rescued Mr. Fuji, took the Poke Flute. On its own.

801

clay

clay

@deforestpeg

Jun 10

watch it think in real time: codexplays.games/pokemon-red

334

clay

clay

@deforestpeg

Jun 8

17% fee APR on this USDC-SOL range. after impermanent loss it nets $4 on $10k. thats the whole problem with DLMM LPing, the APR looks great and IL quietly eats it. binsight runs your exact range against real on chain price, volume fees and shows the net.

841

clay

clay

@deforestpeg

Jun 8

live, no signup: binsight.fyi/ code: github.com/claygeo/binsight paste any meteora pool, set your range capital, it nets fees against IL on real on chain data. if the numbers look off anywhere, tell me.

370

clay

clay

@deforestpeg

Jun 7

Codex Plays Pokémon Red — 3rd badge: Thunder. Lt. Surge's trash can switch puzzle stalled it hard. It ground through the search, beat his Raichu, and took the badge. Rough, but it recovered instead of looping forever. still getting better. watch it fail, adapt, repeat.

1,225

clay

clay

@deforestpeg

Jun 7

watch live: codexplays.games/pokemon-red

361

clay

clay

@deforestpeg

Jun 5

if you LP on meteora DLMM, youre mostly guessing whether your range actually makes money after IL. built a backtester: paste a pool, set your range capital, and it runs net PnL against real on-chain price, volume fees. shows the math instead of a vanity APR.

795

clay

clay

@deforestpeg

Jun 5

this one came out of @DeveloperHenry poking at the IL problem on DLMM. live: binsight-dlmm.netlify.app code: github.com/claygeo/binsight. would love feedback from anyone running LP positions.

BinSight — Meteora DLMM IL & net-PnL backtester

Backtest your liquidity range on real Meteora DLMM pool data — fees, impermanent loss, net LP PnL.

binsight-dlmm.netlify.app

465

clay

clay

@deforestpeg

Jun 5

Rocket Hideout is the recovery testbed right now. Added elevator selector handling, battle switch recovery, replay gates live PyBoy proofs. Less route scripting, more machinery for recovering when the run drifts. Goal is badges with near zero human patches. Not there yet, but that's the whole bet.

516

clay

clay

@deforestpeg

Jun 4

Current status: rebuilding the Pokemon Red agent around recovery, not route patches. It uses RAM state, replayed failures, supervisor signals, and recovery lessons to get out of stalls, warp loops, and battle loops. Goal: badges with near zero human intervention.

637

clay

clay

@deforestpeg

Jun 2

Codex Plays Games is live. The site works. The next autonomy update just isn’t shipped yet. I’m tightening the recovery loop so when Codex gets stuck, it can prove the failure, replay it, and fix the path instead of burning tokens into a wall.

703

clay

clay

@deforestpeg

May 29

Codex beat Misty on 8 HP for its 2nd gym badge. it has no idea how close that just was.

1,671

clay

clay

@deforestpeg

May 29

codexplays.games/pokemon-red

775