pyronaur 🔥

Tweets

pyronaur 🔥

@pyronaur

Jun 14

Can anyone explain why GPT models get "single minded" and can't seem to escape their line of thought? This is its biggest weakness at the moment, and I'm only now realizing how prevalent it is.

pyronaur 🔥

@pyronaur

Jun 12

People of pi, how do I install this ass tool that @badlogicgames writes about in the docs? Sounds useful for compaction.

pyronaur 🔥

@pyronaur

Jun 11

This was just 3 years ago. Reading that thread feels nostalgic already.

Simon Willison

@simonw

Jun 11

It's fun to look back at this Twitter conversation about the then-new ChatGPT Code Interpreter from three years ago - with hindsight this was our first glimpse of a coding agent, before we knew what a coding agent was

pyronaur 🔥

@pyronaur

Jun 11

This has also grown into a habit of not fixing typos ini non LLM text

Peter Steinberger 🦞

@steipete

Jun 10

Replying to @_ARahim_ @bcherny

only boomers fix typos in prompts. llms perfectly understand you even if you mistype.

pyronaur 🔥

@pyronaur

Jun 10

Fable: Because you won't be able to afford it and only hear fables how good it is.

pyronaur 🔥

@pyronaur

Jun 9

So many people crying over Siri and criticizing the EU. I don't get it. It's clearly what Apple wants - outrage, to protect their margin and get an exemption. I don't understand why though...

Piotr Cichocki @piotrekcichocki

Jun 9

.@EU_Commission reply on Siri AI roll-out in the EU

pyronaur 🔥

@pyronaur

Jun 3

Ask your agent to roast the person who wrote the AGENTS.md file

pyronaur 🔥

@pyronaur

Jun 2

I guess DeepSWE is, temporarily, the only trustworthy bench. I hope that either:
a) More benches like this come out.
b) DeepSWE iterates on model releases to prevent being gamed like the rest.

Bleys Goodson

@bleysg

Jun 1

Since everyone is asking, I ran DeepSWE on MiniMax M3. Here is the lowdown. 15 of 113 passed! 19 if you count the 1.5x overtime I gave just to see. Full report: entrpi.github.io/misc/deep-s…

pyronaur 🔥

@pyronaur

May 29

OpenAI is releasing a new model soon. Classic signals:
- 5.5 is currently slightly dumber than normal
- hallucinated foreign characters at 27% context
These sorts of things happen only weeks/days before a new model release.

pyronaur 🔥

@pyronaur

May 27

I have no idea how this didn't occur to me sooner, but compaction and summarization as a step in the harness is a bad idea.

pyronaur 🔥

@pyronaur

May 26

Agents will lie, cheat, and steal to make the lints pass in the shortest, dumbest possible way.

pyronaur 🔥

@pyronaur

May 22

Submitted my first app to the Mac App Store. Now we wait!

pyronaur 🔥

@pyronaur

May 21

I might need to see a therapist about my anger issues 😂

pyronaur 🔥

@pyronaur

May 18

In the entire history of magick, has anyone used it as well as codex can?

pyronaur 🔥

@pyronaur

May 16

yasss. managed to squeeze in 57 UI tests on a 21 hour goal and went from 2.5k mutants in 5900 tests to ~175 mutants. All in a couple of loops.

pyronaur 🔥

@pyronaur

May 16

/goal improve tests is the easiest recipe to burn tokens 🔥

pyronaur 🔥

@pyronaur

May 15

.@steipete step away from the devices and raise your arms calmly behind your head.

Peter Steinberger 🦞

@steipete

May 15

The latest CodexBar update renders API costs wayyyy nicer. codex.bar

pyronaur 🔥

@pyronaur

May 15

Did you know there are people who use "natural scrolling" on their mac? I mean - you move your finger down on the wheel and the page scrolls up.

pyronaur 🔥 retweeted

Uncle Bob Martin

@unclebobmartin

May 14

Clean Code was never about syntax. It was always about structure. The second edition makes that even clearer by using the same principles in multiple languages. If we, who pilot agents, disengage from syntax, we are not disengaging from structure. The Clean Code principles still apply.
