Solutions Engineer @OpenAI

Joined May 2013
53 Photos and videos
Friendly reminder that you should write loops. Are you looping already? Stop whatever you are doing and start writing loops.
16
still unsolved, but imagegen (with a good set of skills) comes close
I’ve been hunting for a way to get agents to produce consistently good diagrams - are there better options?
* Mermaid: tired of the awful layout algo
* drawio: the model can't seem to reason about the XML, even with a skill
* Diagrams library has nice results, but you have to instruct a loop of render png > read png > revise so agent "sees" what it's doing
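A minimal sketch of that render png > read png > revise loop, in Python. Everything here is hypothetical: `refine_diagram`, `render`, `critique`, and `revise` are illustrative names, not part of the Diagrams library or any agent framework. The three callables are placeholders for "render the diagram source to PNG bytes", "have the model look at the PNG and return feedback (or `None` if it looks fine)", and "have the model rewrite the source given that feedback".

```python
from typing import Callable, Optional, Tuple


def refine_diagram(
    source: str,
    render: Callable[[str], bytes],              # hypothetical: diagram source -> PNG bytes
    critique: Callable[[bytes], Optional[str]],  # hypothetical: PNG -> feedback, None if acceptable
    revise: Callable[[str, str], str],           # hypothetical: (source, feedback) -> new source
    max_rounds: int = 5,
) -> Tuple[str, int]:
    """Render, inspect, and revise until the critique passes or rounds run out.

    Returns the final source and the number of render rounds used.
    """
    for round_no in range(1, max_rounds + 1):
        png = render(source)            # render png
        feedback = critique(png)        # read png
        if feedback is None:
            return source, round_no     # the rendered diagram was accepted
        source = revise(source, feedback)  # revise and try again
    return source, max_rounds
```

The `max_rounds` cap is a design choice so the loop cannot spin forever when the critique never passes; in that case the last revision is returned as-is.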
15
Alex Schüren retweeted
install codex on your parents’ computers so you can fix stuff remotely
212
121
3,842
223,180
Alex Schüren retweeted
Jun 2
Building apps has never been easier. With Sites, Codex can turn your work, ideas, and plans into an interactive website or app your team can explore, use, and share with a URL. Rolling out to Business and Enterprise plans, before expanding more broadly.
964
1,863
19,536
9,507,429
We are slowly converging to the point where you no longer need to click around.
Small QoL improvement in today’s Codex App update: you can now search through all prior chat content to find past threads!✨
53
Alex Schüren retweeted
WOW I did not expect these results. This is actually crazy, insightful, and completely changes my dev workflow moving forward:
A SINGLE CODEX /goal RUN IS THE CLEAR WINNER. NO ORCHESTRATION, NO OUROBOROS, JUST ONE LITTLE AGENT THAT COULD 🤯 IT COMPLETELY DESTROYED THE OPUS ORCHESTRATOR IN SPEED AND QUALITY!
Before I went to sleep, Codex 5.5 xhigh finished, 1 hour in! Full migration done, everything clean. I reviewed the PR and I am very happy.
Claude Code (Opus 4.7) had been working for 5 hours by the time I went to bed. I woke up, and it was still working! 13 hours! It actually stopped working because it stopped to ask me an irrelevant question. Orchestration has never taken this long for me in the past. I'm using the new CC /goal mode and auto-compacting at 25% (250k context) to prevent context rot past that point.
It is STUPID SLOW (which is funny bc it's managing GPT 5.5 low, fast-mode, so it shouldn't take THAT long) for what ended up being LOWER quality work! By a mile!
This was really surprising to me, because before 5.5 came out, orchestrating like this was the absolute best, fastest and most efficient. And now on a large critical task, it was more than 6x slower than a single 5.5 /goal mode instance on xhigh ???
It seems compaction played a large role in the slowdown here, because Claude Code compacts at 25% (250k tokens) automatically (I set this in settings). Every time it compacts it has to take the time to READ EVERYTHING and then get the full context then execute and get full again then compact and oh boy it's not efficient at all. In fact, most of its time as the orchestrator was spent compacting and reading context then compacting again!
Then Codex would just have one long continual running compaction, and just kept moving forward. I believe my goal ledger skill plays a big role in helping it stay aligned here!
Look at this difference LMFAO:
- Codex PR #23: backend Supabase removal complete, canonical wake wired, preserved surfaces intact, typecheck/lint/tests green, dogfooded against local Postgres, one item correctly deferred and documented. Mergeable now. 4,056/−981.
- Claude attempt-1: fails the headline goal (supabase dir 9 importers still present), regressed a preserved surface (gutted task.service, stubbed tasks.router to emptyBoard, which is PRD-forbidden), deleted ~5,456 test lines, uncommitted/dirty. The 17,762 deletions are over-deletion, not more work.
Wow. I am actually shocked. I am so happy I ran two diff workflows on a big, identical PERSONAL problem. This completely changes my workflow moving forward: no longer will I orchestrate a big task from the top down. Instead, I am going to now experiment with the following flow on Codex:
1. Having Codex scope our codebase, then having a back-and-forth brainstorming/discussion on what needs to be done
2. Creating a master PRD from that file, and SPLITTING the work into focused branch work
3. Branching off the chat in parallel, until we get to a part where we need to merge work, then parallelizing again
This way, Codex agents can work individually, every single branch will have the same research/brainstormed context, and they just work to full completion.
Based off this experience, this feels like the right direction. I will never do an orchestrator in this style again (executing a PRD to completion). Instead, I will be more of... a manager of branched work. Regardless of what I do moving forward, I will never run an orchestrator setup like this again. LMFAO
OK FIRST EVAL: CODEX RUNNING /goal VS. CLAUDE CODE ORCHESTRATING CODEX AGENTS
I have an ACTUAL long-form task I have to finish. I created two separate worktrees. This one is a full migration of services from Supabase to self-hosted Postgres instead, dogfooded, e2e tested.
I am curious if Codex (NOT orchestrating subagents, but doing work itself as a single agent) on xhigh will perform better than Claude Code (Opus 4.7, high) orchestrating an army of Codex Agents (5.5 low).
I'll be judging these based on:
- did you do the thing i actually wanted
- how long did it take
- how much did it cost
- which output is higher quality
I never had incentive to do this because the best workflows were obvious but now it's not and I feel lost again 😭 Will run this overnight and see which one does best and report results!
39
24
324
37,802
Justin is a machine. I once ranted about some tiny thing I thought nobody would care about, and he pinged me right away and fixed it like it was nothing. Send him your weirdest bugs.
Every bug in @ChatGPTapp is getting fixed With the help of codex (and the rest of the lovely team and their codexes) along with a 7pm iced americano there will be zero bugs This is a formal request for tiny nits, error states, broken ui, etc The tinier the better!
59
Alex Schüren retweeted
We’re having way too much fun working through your feedback. (Please, keep it coming.) Keyboard shortcuts are now customizable. Set Codex up around how you actually work, then tweak shortcuts from settings instead of adapting to our defaults.
294
154
2,407
479,870
Alex Schüren retweeted
May 15
You can now use your ChatGPT subscription in the Zed agent, with the same usage and rate limits you benefit from in Codex directly. We're grateful that @openaidevs continues to support subscription-based access for third-party tools, even as others move toward usage-based billing.
147
219
3,789
445,117
Alex Schüren retweeted
May 14
You've been asking for this one... Now in preview: Codex in the ChatGPT mobile app. Start new work, review outputs, steer execution, and approve next steps, all from the ChatGPT mobile app. Codex will keep running on your laptop, Mac mini, or devbox.
1,696
2,612
21,895
4,739,811
Alex Schüren retweeted
For every person who replies with a screenshot of their cancelled Claude Code plan, I will donate $10 to open source.
I can't help but feel personally burned by the Claude Code changes announced today. We put so much work into wrapping the (atrocious) Claude Agent SDK in T3 Code. It was the ONLY path they supported, so we made it work. It was hell. Now our users are getting their rate limits cut by 40x, despite us doing everything right. I listened to the Claude Code team. I had my issues with their direction, but I trusted them and took them at their word. I will never make that mistake again. Until we see significant change, it is safe to assume any statement from an Anthropic employee is a lie on a timer. The rug will be pulled, no matter how many promises are made beforehand.
737
189
4,318
834,350
Alex Schüren retweeted
I can't figure out if vaccines work or not. Tough one. Need Sherlock Holmes on this one.
595
4,260
28,062
603,601
Alex Schüren retweeted
Anthropic: Keeps limiting compute and lying to paying customers / nerfing models.
OpenAI:
- 10 min downtime? Limits reset!
- We hit 4 million followers? Limit reset!
- Starbucks person spelled my name correctly on my coffee today, let’s have a limit reset!
Apr 21
Happy Tuesday. Codex has hit 4M active users, adding over 1M users in less than two weeks. To celebrate we will reset the rate limits again in a few hours. Enjoy!
87
192
3,965
194,989
This is not a screenshot
Apr 21
This is not a screenshot.
24
Alex Schüren retweeted
Dps, tank, utility, healer
The only 4 jobs that will remain at tech companies. Credits: @yrechtman
83
441
5,128
378,901
Alex Schüren retweeted
now: openclaw gives me a daily personalized news brief through angela merkel posing as a news anchor with a heavy german accent no one understands the age of PERSONALIZED SOFTWARE is HERE
149
206
3,190
294,127
Alex Schüren retweeted
Here is how I do and don't use agents, idk who this will help but it's worth spelling out my preferences and why:
- I tell the agent to code how I would do it
- If the language is one I am _very_ familiar with I feel comfortable getting it to generate very good idiomatic code that is indistinguishable from my own and doing large refactors
- If the language is one I'm not comfortable with, I keep the pull request under 100-200 lines of code for the reviewer's sanity since I can't discern the nuance of good versus bad code
- ALWAYS read/self review the code before opening the PR, the onus is on the AI wielder to make sure the code is up to par with what they would do themselves before inflicting it on their teammates
- never auto open PRs because of ^
- if you do all the above you can avoid slop and not annoy your peers
my name is jessie frazelle and i have not touched code in an editor since october.
23
31
535
60,408
Alex Schüren retweeted
feels like a good day to revisit this video of defunkt at github universe in 2017
4
10
81
28,810
We went from petting individual servers to running thousands of containers. Nobody misses the ‘craft’ of manually SSH-ing into everything. What happened to ops is now happening for SWEs.
This has been said a thousand times before, but allow me to add my own voice: the era of humans writing code is over. Disturbing for those of us who identify as SWEs, but no less true. That's not to say SWEs don't have work to do, but writing syntax directly is not it.
56