Education, Software, Game Dev, AI.

Joined April 2026
651 Photos and videos
Pinned Tweet
I have finished reviewing/getting impressions of every single game out of the 945 games submitted to #vibejam. If you submitted a game and haven't received any feedback or tags from me, take a look at the link below. Impressions tend to be brief. To be frank, some of these games did not deserve even the brief time I put into them. But SOME of these games are genuinely stunners. Some of these games have enormous depth, gorgeous visuals, addicting loops, and genuine commercial potential. I'll be posting some of my thoughts, opinions, and wrapup ideas as the official judging of the vibejam continues. As a reminder, I am not an official vibejam judge. I just wanted to look at all of the competition and see what the playing field looks like.
12
6
37
3,846
Tone retweeted
By all measures, I think most any developer can do an outstanding job with $100 Codex subscription and $60 Cursor subscription. Using headless `agent -p` delegation to Composer 2.5 coupled with GPT-5.5 High offers massive output.
Optimal Model Routing Strategies Using the great SWE bench from @tryramp we can calculate "optimal" model routing strategies for that dataset. As you can see below, running your tasks through Qwen 3.6 first and then failures through GPT-5.5 or Fable 5 provides savings of nearly 50%. We can compare this to a "God" Router (one that knows the exact model to route to in advance). This is akin to Musk's "Idiot Index". My personal favorite is GPT-5.4 Mini to GPT-5.5. While this is slightly more expensive than Qwen 3.6 at API prices, given the subscription that @OpenAI offers your net spend would be less. It's a simple but powerful heuristic.
2
1
8
1,815
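A rough sketch of the arithmetic behind the "cheap model first, failures to the strong model" heuristic in the retweeted post above. The prices and pass rate are made-up placeholders, not figures from the benchmark the post cites, and the function names are hypothetical. It also assumes the strong model handles every task the cheap model fails, and that the "God" router sends a task to the cheap model exactly when the cheap model would have passed.

```python
def cascade_cost(cheap_cost: float, strong_cost: float, cheap_pass_rate: float) -> float:
    """Expected cost per task when every task hits the cheap model first
    and only the failures are retried on the strong model."""
    return cheap_cost + (1.0 - cheap_pass_rate) * strong_cost


def oracle_cost(cheap_cost: float, strong_cost: float, cheap_pass_rate: float) -> float:
    """Expected cost per task for a router that knows in advance which
    model to use, so no task is ever paid for twice."""
    return cheap_pass_rate * cheap_cost + (1.0 - cheap_pass_rate) * strong_cost


# Placeholder inputs (assumptions for illustration only).
CHEAP_COST = 0.05        # cost per task on the cheap model
STRONG_COST = 1.00       # cost per task on the strong model
CHEAP_PASS_RATE = 0.70   # fraction of tasks the cheap model gets right

baseline = STRONG_COST   # send everything straight to the strong model
cascade = cascade_cost(CHEAP_COST, STRONG_COST, CHEAP_PASS_RATE)
oracle = oracle_cost(CHEAP_COST, STRONG_COST, CHEAP_PASS_RATE)

# Fraction of the oracle's savings that the simple cascade captures.
efficiency = (baseline - cascade) / (baseline - oracle)

print(f"baseline={baseline:.3f} cascade={cascade:.3f} oracle={oracle:.3f}")
print(f"cascade captures {efficiency:.1%} of the oracle's savings")
```

The cascade pays the cheap model's price on every task, including the ones it fails, so it can never beat the oracle. How close it gets depends on how cheap the first model is relative to the second.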
Ok smart use of grok.
Replying to @grok
@grok analyze this acc, last 100 posts, more positive or negative vibes?
2
54
The correct mindset 🫔
Replying to @mattworkman
i promise to never apologize
54
Incredible alpha in taking books you love and Suno-izing them. Claude Sonnet is my favorite song writer so far. Wheel of Time tribute! suno.com/s/ejOuVgz8nAzlLwYX
1
3
132
I ran my original rant through AI to make it more wholesome, but it gets the message across: @levelsio @s13k_ I appreciate the livestream @NicolaManzini , and I don't want to take anything away from the creators represented in the Top 25. That said, I came away surprised by the final list. My impression is that there was a significant disconnect between the games that ultimately advanced and some of the strongest entries I encountered while reviewing (my list here: vibejamreviews.com/#/standou… ). I expected differences in taste and priorities—that's inevitable—but I didn't expect the gap to feel this large. To be fair, I reviewed games as one person without a stake in the competition or the judging. That's a difficult challenge, and I think I'm literally the only person on the planet who understands that completely. Personal preference also plays a role in any judging process. Still, there are games that stood out immediately in terms of polish, originality, execution, or simply being fun to play that didn't receive recognition, while several selected entries left me struggling to understand what criteria separated them from the broader field. What disappoints me most isn't that particular games won or lost. It's that the final list doesn't feel representative of the depth and quality that existed across the jam as a whole. There were many that demonstrated exceptional craftsmanship, creativity, and player experience that I expected to see reflected more strongly in the results. Maybe I simply value different things than the judges. That's entirely possible. But after spending over 40 hours reviewing entries, my reaction isn't disagreement with a few placements—it's genuine surprise at the overall shape of the list. Congratulations to everyone who was recognized. My criticism is aimed at the judging outcomes, not the developers. 
I just can't shake the feeling that many outstanding games were overlooked, and that's disappointing to see given how much talent was on display throughout the event.

I am live with Round 3 of judging the Vibe Jam of 2026 sponsored by @cursor_ai @boltdotnew @heyglif @tripoai @levelsio @s13k_ Join the stream here on X! x.com/i/broadcasts/1yJAPPqEX…
2
1
12
787
I am honestly upset at the choices on offer here. Your game, @rnschiehll , is one of the most stunning and complete on offer. The fact it didn't make it here gives me the impression the creators just gave up reviewing and picked nearly randomly for the final group. Tough fight for top 25, there's some good stuff on offer but seriously? Even in this clip it's clear you have a COMPLETE GAME that has custom, good looking visuals, deep gameplay, effective music, sound chosen for purpose, and an actual game loop. But no. It got beaten by the game where you jump up a beanstalk as a beetle? And the FPS where you're a baby and all the visuals were made by Claude lumping shapes together? And the game where you drop bombs on civilians who thank you for the freedom? No shade to the creators of those - you guys made games that deserve the top 100 and I mean that - but to beat out genuine quality like this is beyond frustrating.
2
12
1,345
Tone retweeted
I've spent the last 6 months and 200 hours making the best looking water on the web. Today, I'm launching Three.js Water Pro V3, the most advanced iteration yet 🚀 What's New ✅ Completely overhauled wave simulation and lighting ✅ Multiplayer-ready determinism ✅ Persistent wave-crest foam ✅ Sea spray emitters ✅ Wake generators ✅ Rain ...and much more! Learn more 👇🏻
96
188
2,527
173,974
Quality memeing 😂
am i sure the death star is going down? look at my quant. look at him! you notice anything different about him? look at his eyes. i’ll give you a hint—his name’s a fucking number!! he doesn’t even speak english—it’s all beep-boop shit!! yeah, i’m sure.
2
80
I have a lot of sympathy for the "AI can't write code" crowd. When you're using it on something you understand deeply, watching it solve for situations that don't matter, or stepping in to make a small change and realizing it has coupled ideas which turn the small change into a large change damages trust in its ability to do anything at all. You find yourself changing your intended implementation strategy to correspond to what will be easiest - or rerolling entirely. This creates the "slot machine" type of workflow a lot of people are concerned about. I still think this is largely a skill and discretion issue. It can be easy to delegate to the machine entirely and forget you have agency in the process because it's a tool - these are the rerolls, the "No, wrong, do it again" frustration prompts, the lack of context "but it should be obvious" prompts. Software engineers expect software to work and so the idea of collaborating with their tools, or treating it as a talented coworker in a pair programming experience is foreign. Many of them tend to hate pair programming entirely, anyway, and so why would they use a tool that just forces them to work in a way they already disliked? Why would they spend time explaining to their tools how they should work and what they should be doing when they can make it work themselves without all of that labor? When I first got into software a common way of explaining code was to say that it's the world's smartest, most literal person who will do exactly what you tell it to. The onus for software not working is entirely on you and the decisions you made in writing it - it makes for a fantastic microcosm of personal accountability and tremendous respect when you encounter errors designed to save you from exactly the issue you yourself are in the process of creating. I think that way of thinking is still relevant, but widened.
It's a lot like teaching kids - you can't just expect them to know all the inferential steps between your ask and expected result. Try asking a 5 year old to put their shoes on and just sit back and wait. If they ever get there, there will be some interesting side quests along the way. But at the same time, once they get the shoes you can't start yelling at them for not doing it the way you wanted. "We don't use bunny ears for knotting our shoes! We loop, swoop, and pull!" You weren't trying to teach them the tying method, you were trying to teach them to get them on and you have to quiet the voice in the back of your head that screams it isn't how you would have done it and just accept the output. That doesn't mean you have to accept genuinely bad outputs, but it means you're now responsible for tuning your request, adjusting the context, building the guardrails so that the outputs - even when not how you'd do it - meet the standard you expect for "it works." The shoes are on and they're tied and they're not causing issues? Good job, buddy. Nailed it.
1
3
83
Great take! “It’s AI slop!” Is just a teenager girl social dismissal attack. To other teenagers, it is a pronouncement of reality imposed socially. To adults who can think it’s nonsense.
This is not an insult. I didn’t grow up in the upper echelon of culture. I watched really bad tv shows like Ultraman, the Bugaloos, my toys were plastic figures from sugar and chemical cereal boxes. My music was pop trash, we collected Wacky Packages stickers, played in arcades, used Wheel-Os and glow in the dark slime. I’ve always been a tee shirt artist. What you call slop is art comfort food. I’m still open to the philosophical idea that quantity is not a quality, but I’m certainly not convinced by anyone arguing with me on X. Let’s go through your media collection, look at what you consume and you can tell me you have no slop in your life. And nobody has addressed what happens if Ai productions get better and make higher quality quantity. Calling it names doesn’t address what is happening. The comment that it’s slop is non-Ai slop.
3
208
I misread @mitchellh as Michelin and read this entire thing through the lens of "Michelin star restaurant vs not" expecting it to make comparisons between optimizations in food and code (it doesn't) and the point still stands 😅 Go to any casual restaurant and try their food. If they could optimize their food so it was even twice as good - and it was easy - would this not be a huge win? Would you not pick mom and pop burger over Red Robin (never again!) if their food was better, but still not Michelin star? This is the huge gap in everything software surrounding AI. The galaxy brains (who earned their galaxy brains) are stuck in the "Everything must be Michelin star code - what will you do when there's a bug you can't solve!" mindset and they miss a sea of small improvements, decent experiences you never heard of, games that never would have been made, and who knows, maybe the occasional absolute connoisseur who does a great job with simple tools.
And yet - for the vast majority of domains, this kind of deep optimization is absolutely unnecessary. A $350 ralph loop that takes you from 88ms/150k to 1.5ms and 500 is *amazing*, and you didn't even have to look at it in order to get there. The core point stands - you can't delegate the core of what makes your product great to a machine that builds the machine. You never could. But the vast majority of the software you write? It's not in the category @mitchellh is talking about. It's just code. You could gain massive optimizations for little to no effort.
2
189
These are such cool benchmark prompts. I’m going to start doing the same thing each time a model comes out. Nothing sells “this is different” like a powerful visualization
I had early access to Opus 4.8. Was impressed by it. Here is Opus 4.8's one shot of "create a visually interesting shader that can run in twigl, make it like an infinite city of neo-gothic towers partially drowned in a stormy ocean with large waves" (this is all done with math)
1
1
104
Tone retweeted
Opus 4.8 is live in Claude Code today. A few things worth knowing: 🧵
May 28
Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors. Available today at the same price.
377
844
10,209
1,310,101
Awesome to see builders still building! Such a cool aesthetic and natural evolution on top of old school boxing classics like Punch-Out

1
8
696
For other #vibejam participants who haven't seen yet, they're making progress on the ratings. It's a huge job. I did impressions only and it took me over 40 hours of focused work to do them all. This bodes well for games which may take more than a couple minutes to get into and I'm curious to see where the official list diverges sharply from my own. A lot of games seemed to have depth, but you couldn't get to it in a short preview. Also curious to see how the individual biases of the judges play out. It's a good strategy to hand only a small subset off for the final judging because it can be tough not to get fatigued and rate more harshly than you would have otherwise.
🏆 Round 1 of judging the Vibe Jam of 2026 sponsored @cursor_ai @boltdotnew @heyglif @tripoai is finished now We've gone through almost 1000 submissions now, and there's some really great games in there and a huge leap vs last year's #vibejam We're now onto Round 2 where me and @s13k_ start rating games a score of 1 to 10, I think we can finish that this week Then in Round 3 the other judges enter to rate the final 50-100 games and we will know the winners! I'm a bit slow but that's what happens with so many submissions, and I want to give games a proper chance, not just skip through! P.S. @s13k_ built a great judging system for us to use, most games work fine in an iframe and if not we open it in a new tab to rate, which you can see here
2
1
7
551
Also congrats to @RagimMusakaev as it looks like your game made it past the first round of cuts!
1
76
This is probably true 😂 but also productivity gains are difficult to measure. If an engineer codes up a mapper that saves somebody in finance 1 hour per day of manual labor, and they do the same with dozens of other small tools for people doing manual work all over the office - how do you measure that? How do the multi X’ers get their X’s when the productivity is a result of the tools they built for other people?
right to the employee's free time eng previously working 8 hours a day is now working 1 hour a day and fucking off for the rest Life's good.
2
172
Tone retweeted
You can build an entire movie franchise on a MacBook Air. RAINBOW SUN.
70
77
781
44,837
Tone retweeted
pretty wild how i could animate fairly realistic armored combat with homemade action figures and hollywood still can’t figure it out with an infinite budget
560
7,752
49,868
1,377,169