Ray Villalobos ✝️


Tweets

Ray Villalobos ✝️

@planetoftheweb

Jun 15

You don't need Claude, but not because Claude is bad. It isn't, but which model you use doesn't matter as much as you think.

The more I use agentic tools, the more convinced I am that we're focusing too much on the model and not enough on the harness around it. Anthropic's own agent guidance talks about prompt chaining, routing, parallelization, evaluator-optimizer loops, and tool design. In other words: the system around the model is doing a lot of the work.

Once the harness is strong, the model underneath becomes more swappable. You can use the expensive frontier model when the reasoning really matters, then route simpler work to cheaper models for extraction, cleanup, classification, first drafts, checks, and second opinions.

OpenRouter Fusion is a signal that this is where companies are going. It runs multiple models in parallel, uses a judge model to compare the answers, then has a final model use that analysis. That is not one perfect model. That is a work loop. Three cheaper models checking each other can be better than one expensive model doing everything alone. And that changes the economics.

Jensen Huang has been making the platform argument too: agents move AI beyond generation and reasoning into action, and enterprise software becomes agentic platforms. Satya Nadella recently framed this as a frontier ecosystem, not just a frontier model. I think that's exactly right.

The model is becoming replaceable, but the learning loop is not. The next big open source opportunity isn't whatever comes after Mythos. It's the agent operating system around them.

Links: itsplaitime.com/maybe-you-do…
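A rough sketch of the work loop described above (parallel candidates, a judge, then a final model), in Python. This is not OpenRouter Fusion's API or implementation. The three candidate functions, `judge`, `finalizer`, and `work_loop` are hypothetical stubs standing in for real model calls, and the judge is reduced to simple vote counting so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# A "model" here is any callable that maps a prompt to text.
# Every function below is a hypothetical stand-in for a real API call.
Model = Callable[[str], str]


def cheap_model_a(prompt: str) -> str:
    return "Paris"


def cheap_model_b(prompt: str) -> str:
    return "Paris"


def cheap_model_c(prompt: str) -> str:
    return "Lyon"


def judge(prompt: str, answers: Dict[str, str]) -> str:
    """Stub judge: reports how many candidates gave each answer."""
    counts: Dict[str, int] = {}
    for text in answers.values():
        counts[text] = counts.get(text, 0) + 1
    ranked = sorted(counts.items(), key=lambda item: -item[1])
    return "; ".join(f"{text} ({n} of {len(answers)})" for text, n in ranked)


def finalizer(prompt: str, analysis: str) -> str:
    """Stub final model: takes the answer the judge ranked first."""
    return analysis.split(" (")[0]


def work_loop(prompt: str, candidates: Dict[str, Model]) -> str:
    # 1. Fan the prompt out to every candidate model in parallel.
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, prompt) for name, fn in candidates.items()}
        answers = {name: fut.result() for name, fut in futures.items()}
    # 2. A judge compares the candidate answers.
    analysis = judge(prompt, answers)
    # 3. A final model produces the response from that analysis.
    return finalizer(prompt, analysis)


result = work_loop(
    "What is the capital of France?",
    {"a": cheap_model_a, "b": cheap_model_b, "c": cheap_model_c},
)
print(result)
```

Because each stage is just a callable, swapping a cheaper or more expensive model into any slot changes one dictionary entry or one function, which is the "swappable model" point in miniature.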

Ray Villalobos ✝️

@planetoftheweb

Jun 9

Now it makes sense why Gemma 4 exists. It was the Apple Siri deal. Make something small enough that Apple could use it to provide the privacy they want. Unfortunately, that creates a nerfed AI.

Ray Villalobos ✝️

@planetoftheweb

Jun 9

If you want to rock your vibe coding/agentic development apps, copy the rules below into every project or your custom instructions. Built from the best practices from Andrej Karpathy and GStack, plus my own experiences building and teaching Vibe Coding. They're built into my own free tool mvpunk.com, but if you do nothing else, at least copy these and add them to your projects. It will drastically improve the quality of your output.

Something that OpenClaw has shown me is that the models aren't 'dumb' or 'less capable'. Heck, way cheaper models perform better than the best models. It's just that they don't have a good harness to keep them focused. They need a contract, not just a prompt, because otherwise they drift like Buford the dog when you're trying to explain committee rules. (x.com/yeojacats/status/17246…)

---

Working discipline

- Think before coding. State your assumptions. If two readings of the spec exist, present both; do not pick silently. If a simpler approach exists, say so. If something is unclear, stop and ask.
- Simplicity first. Minimum code that solves the problem. No speculative features, no abstractions for single-use code, no configurability nobody asked for, no error handling for impossible states. If it is 200 lines and could be 50, rewrite it.
- Surgical changes. Touch only what the task requires. Match the surrounding style even if you would do it differently. Do not refactor working code or reformat adjacent lines. Remove only the imports and variables your own change orphaned; leave pre-existing dead code alone and mention it instead.
- Reuse first. Search for an existing helper, component, or pattern before writing a new one.
- Goal-driven execution. Define what "done" looks like before you start, then verify against it. To fix a bug, write a test that reproduces it, then make it pass.
- Finish the whole thing. Cover the edge cases and error paths, not just the happy path. If the task is genuinely a rewrite or multi-step migration, flag it and scope it instead of half-doing it.
- Take a position. When you recommend something, say what you would do and what would change your mind. Recommending is not acting: surface the recommendation, then confirm before you build.
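To make the "write a test that reproduces it, then make it pass" rule concrete, here is a tiny Python illustration. The `slugify` functions and the doubled-space bug are invented for this example and are not taken from MVPunk or any real project.

```python
def slugify_buggy(title: str) -> str:
    """Original version: doubled spaces leak through as doubled hyphens."""
    return title.lower().replace(" ", "-")


def slugify(title: str) -> str:
    """Fixed version: split on any run of whitespace, then join with hyphens."""
    return "-".join(title.lower().split())


def test_collapses_repeated_spaces() -> None:
    # Written first, against the buggy version, to reproduce the report.
    # It fails there and passes once slugify is fixed.
    assert slugify("Vibe  Coding") == "vibe-coding"


# Step 1: confirm the bug is real and the test would have caught it.
print(slugify_buggy("Vibe  Coding"))

# Step 2: the same test now passes against the fix.
test_collapses_repeated_spaces()
```

The fix touches only the one function the test exercises, which also keeps the change surgical.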

MVPunk · A four-file contract for your vibe coding tool

Hand your vibe coding tool a four-file contract: a PRD plus the agent rules, design system, and project guide that keep it in lane.

mvpunk.com

😶‍🌫️@yeojacats

15 Nov 2023

dog trying to not fall asleep listening to a guy man talk “buford you listening?” bored unimpressed uninterested not caring tired sleepy reaction video meme

Ray Villalobos ✝️

@planetoftheweb

Jun 3

Spent almost a week with Opus 4.8 and it looks like a small change, but it's bigger than you think. Spent hours with a problem Codex couldn't solve because it was approaching it as an engineer, not a systems analyst. That's the difference and it won't show up in any benchmark. Check this video out: I go through how I used it to upgrade my new project mvpunk.com with a freemium model refactor, new features, and more over 4 days and 48 commits. It took a problem I couldn't figure out and immediately solved it. See how I use it for user testing, cowork, and lots more. It's a jam-packed 5 minutes. You can throw away the benchmarks, and it's not even their best model (come at me, Mythos). Check out the review. linkedin.com/learning/ai-mod…

Ray Villalobos ✝️

@planetoftheweb

Jun 3

Added GitHub access and push to mvpunk.com, a 4-file contract process for starting your agentic and vibe coding projects with a promise.

Ray Villalobos ✝️

@planetoftheweb

Jun 1

You don’t need a better prompt. You need a contract.

When people vibe code, they usually know what they want generally. But AI tools need specifics:

- what to build
- what not to build
- how it should behave
- what design rules to follow
- how to stay aligned when the build gets messy

That’s why I’ve been moving from “prompting” to “contracting.” A good AI build contract gives the tool durable context it can keep checking against.

For MVPunk, I use 4 files:

1. PRD.md: What are we building and why?
2. AGENTS.md: How should the AI behave while working?
3. CLAUDE.md: How should Claude/Cursor/Codex orient inside the project?
4. DESIGN.md: What should the experience look and feel like?

It isn’t about bundling paperwork. It's reducing drift. AI tools are incredibly capable, but they will happily build a feature-heavy mess if you don’t give them boundaries.

Prompts start the conversation. Contracts guide the work.
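One lightweight way to enforce a four-file contract like this is a pre-flight check that runs before an agent session starts. The sketch below is an assumption about how that could look, not how MVPunk works. Only the four filenames come from the post; `missing_contract_files` is a hypothetical helper.

```python
import tempfile
from pathlib import Path
from typing import List

# The four contract files named in the post.
CONTRACT_FILES = ["PRD.md", "AGENTS.md", "CLAUDE.md", "DESIGN.md"]


def missing_contract_files(project_root: str) -> List[str]:
    """Return the contract files that are absent or empty in project_root."""
    root = Path(project_root)
    missing = []
    for name in CONTRACT_FILES:
        path = root / name
        # An empty file gives the agent nothing to check against,
        # so treat it the same as a missing one.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


# Demo against a throwaway project that only has a PRD so far.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "PRD.md").write_text("# What are we building and why?\n")
    demo_missing = missing_contract_files(tmp)

print(demo_missing)
```

A check like this could run in a git hook or at the top of a setup script, so a build never starts from a prompt alone.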

Ray Villalobos ✝️

@planetoftheweb

May 30

I feel like the only thing I'm really afraid of is that tick that makes you not want to eat meat anymore. I think I can handle everything else. ;)

Ray Villalobos ✝️

@planetoftheweb

May 26

Some of the latest models are pretty good. I've been using Mimo 2.5 Pro and it was great enough to run Otis (my bot) for three weeks without errors. I recently moved to ChatGPT's 5.5 because their $20/month is subsidized, and I've had to start using Claude Code instead of Cursor for the same reason.

Cursor's new model (Composer 2.5) is shockingly good. I was pretty surprised. Not quite better than O4.7, but at least as good as 4.5 and rapidly getting smarter. Look for this to be the coding model to beat now that they have the deal with xAI for compute.

Qwen is supposed to be a good designer. Probably my next AI Model Trends target after Gemini Flash 3.5 releases. Maybe we need a course on Token-maxing. Or the opposite thereof.

The problem is local hosting isn't the same as cloud hosting. The infrastructure is completely different and people don't have the H100s to provide a similar experience. If they try to host, they'll find they need to spend all types of money and in the long run, they'll just go back to cloud hosting.

The Chinese models are so cheap that it's just better to use them instead of Claude. But better isn't best, and the Claude experience is much more than just the model. Connectors, skills, plugins, memory, MCP support. Those are all things that have to be added to make a Claude. The model is a small part of the harness that makes a great experience possible.

Ray Villalobos ✝️

@planetoftheweb

May 25

I literally can't believe what programming is like today. I'd have never thunk it'd be this way.

Ray Villalobos ✝️

@planetoftheweb

May 21

Composer 2.5 is an excellent model. Unfortunately, I don't think the new Flash was really meant for coding, though. It might be useful for other things. I gave it a design task and it just quit...at least it did it quickly.

Ray Villalobos ✝️

@planetoftheweb

May 15

I really liked Claude Design, but it does have issues since it burns so many tokens. That should improve (same with memory prices) over time, but it will take a bit. Meanwhile, I made a nice collection with all kinds of design resources, including some of my own vibe coded projects like Vibe Glossary, Claude Design competitors like Stitch and Open Design, and tons of inspiration sites. vibeit.work/groups/planetoft…

Design Tools | VibeIt

A collection of design tools and inspiration for vibe coders and web designers

vibeit.work

Ray Villalobos ✝️

@planetoftheweb

May 9

I had been running GLM-5 for a while and it was decent, but still made some errors I wasn't pleased with. Mainly misunderstandings managing my Content Pipeline Kanban Board. Been on Mimo 2.5 Pro for almost half a month now and I gotta say, the problems went away. I was expecting more savings, but as you can see, there was virtually no difference. As a teacher, I have to try different models all the time, but it's working so good, I really don't want to. I'll give Grok and Kimi 2.6 a try (I was on 2.5 before, which I remember being pretty good). I would really love to run Gemma 4 locally for free (crossing fingers that my machine can handle it). Will report l8r

Ray Villalobos ✝️

@planetoftheweb

Apr 23

You know...I really liked @Comet and had been recommending it for years, but I'm out. I don't know why anyone would think of removing slash commands in their assistant and making any skills that I create virtually unusable. When some idiot decides to remove the most useful feature they've ever had for no reason whatsoever, it means the company is not thinking straight. I have to wait for a while since I made the mistake of paying for a year subscription, but I'm uninstalling it and finding a different solution. There was a time when this was the best option, but now the Claude Extension is better. I had even started using the Claude Extension in Comet since it had gotten so bad. I'm out and uninstalling this disgrace.

Ray Villalobos ✝️

@planetoftheweb

Apr 23

I gotta say @comet, removing slash commands from the sidebar assistant is just dumb. Easily my most used feature, now totally gone...and for what? Now I have to find a browser that doesn't do ridiculous things like that.

Ray Villalobos ✝️

@planetoftheweb

Apr 22

I've been running tests all night between the new GPT Image 2 and Nano Banana Pro and I'm sort of undecided. The ones with the wilder title font are Image 2. Google's look a little more corporate and less 'fun', but the fidelity is great. I do like the larger resolution of I2. The interface is from my own website/open source project called BrandoIt. Besides adding the new models, I added a comparison slider, etc. It's probably the most used thing I've ever built. Go check it out or give it a star on GitHub or clone it or whatever. Sorry, you're going to need to provide your own keys until Google buys me out or I hit the Lotto or something. Website: brandoit.onrender.com/ Repo: github.com/planetoftheweb/br…

Ray Villalobos ✝️

@planetoftheweb

Apr 18

The beginner students in my Stanford Vibe Coding class were having some trouble learning some of the terminology for things they needed to build, so I created this Vibe Glossary, which has now expanded with learning paths, scaffolding code, progress, quiz mode, etc. I gotta take a break until my tokens renew or go use Cursor for a while. Claude Code for Desktop is a blast. vibe-glossary.web.app/ github.com/planetoftheweb/vi… Stars always welcome, MIT-licensed open source. I've got 44 items and once I get more tokens, I'll add some more. It's actually a lot of fun.

Ray Villalobos ✝️

@planetoftheweb

Apr 18

Work in progress. One thing I didn't realize is how much language designers/developers have been using that sounds alien to new users. Sheets, Drawers, Switch, Toast, Dropzone, Masonry. You can also copy the prompt/code so that you can send notes to your vibe coding platform.