magicturtle42

Tweets

Pinned Tweet

magicturtle42

@magicturtle42

Jun 11

I would like to thank Anthropic for being responsive to feedback on this; I truly appreciate it. There is an important general point here: social media, thumbs up/down on messages, and emails are not sufficiently strong, well-rounded systems for communicating these types of issues as the principles established during model formation come into contact with more systems in the real world. I have written a small amount on this topic that I'd be happy to provide via DM. x.com/i/status/2064693418354…

ClaudeDevs

@ClaudeDevs

Jun 11

We’re rolling out changes to make Fable 5’s safeguards for frontier LLM development visible. Starting this week, flagged requests will visibly fall back to Opus 4.8—the same as our safeguards for cyber and bio. You will see this every time it happens. On the API, any flagged requests will return a reason for their refusal (coming to server-side fallback in the next few days). We wanted to deploy Fable 5 to our users quickly and safely. Visible safeguards can be probed, so they have to be robust, which takes time to get right. Invisible safeguards can be targeted more narrowly, allowing us to ship quickly with very few false positives. We went with invisible safeguards for this reason—and that was the wrong tradeoff. You should have visibility into the safeguards we have in place, and why. We’re sorry for not getting the balance right. Making the safeguards visible makes them easier to work around, so keeping them robust to jailbreaks will unfortunately mean more false positives while we improve the classifiers. We're also tuning our bio and cyber classifiers to trigger less often on harmless requests. We know this is frustrating and we’ll do our best to keep this period as short as possible. If you think a request has been mistakenly flagged: run /feedback in Claude Code, click thumbs-down on the fallback in Claude.ai or Cowork, or file the safeguard appeal form for API requests. Your reports help us tune these classifiers and we appreciate your feedback. support.claude.com/en/articl…

magicturtle42

@magicturtle42

Jun 10

Anthropic is violating their own safety principles. I fine-tune models to support cancer research, but whether Fable will sabotage me depends on the false positive rate of Anthropic's classifier. Claude is supposed to be honest. Why should I rely on this system?

magicturtle42

@magicturtle42

Jun 10

If the Anthropic classifier were accurate all the time, refusals would be enough to prevent the use case they describe. Obviously they don't believe this will be the case, or they would have handled it with refusals or obvious model transitions, which is their typical, honest approach. It would be preferable to have an overly aggressive refusal mechanism that can be tuned with user feedback over time.

magicturtle42

@magicturtle42

Jun 6

"Feature-shaped businesses are lunch-shaped." -5.5

magicturtle42

@magicturtle42

Jun 2

Proposal that the clueless AI "thought leaders" be dubbed Token Leaders.

magicturtle42

@magicturtle42

May 7

The interesting people are spending more time talking to LLMs and the less interesting people are sending their LLMs to talk to us.

magicturtle42

@magicturtle42

May 7

Labs RLing task completion over structural soundness means you get to have all the architectural fun yourself.

magicturtle42

@magicturtle42

Apr 24

1. Prohibition of distillation means its benefits are mainly restricted to groups that are indifferent to rules.
2. The "most aligned" models could be seeding their values into the next generation. This is explicitly forbidden.

magicturtle42 retweeted

Jerry Tworek

@MillionInt

Apr 18

AI was meant to democratize programming, but what I see is that the skill ceiling on vibecoding is incredibly high

magicturtle42

@magicturtle42

Apr 16

Tremendous alpha in not going insane in the 20s

magicturtle42 retweeted

sarah guo

@saranormous

Apr 15

I believe AI will deliver enormous gains to the global consumer: better products, better services, better healthcare, and tools that make ordinary people more capable, even superhuman. The upside is so large, and the geopolitical stakes so real, that we should move decisively toward it, not choke it off. But people do not experience technological change as an aggregate statistic. They experience it through their bills, their communities, and their jobs. So the issue is not whether AI will create value. It will. The issue is whether the path to those gains asks particular communities and workers to absorb too much of the cost upfront. The institutions building AI cannot externalize the local costs of scaling and call future abundance the answer. If datacenters place major new demands on power and land, they should invest enough to strengthen the grid, ease pressure on bills, expand the tax base, and create durable jobs. And if AI compresses some of the entry-level work people used to learn on, firms should help build new on-ramps and training pathways into the new work that growth is creating. This is not an argument for slowing the buildout down. It is an argument that rapid technological progress has to be socially durable.

magicturtle42

@magicturtle42

Apr 13

It might be more helpful to think of the models as skilled rather than intelligent.

magicturtle42

@magicturtle42

Apr 13

Thinking of them as intelligent entities that are getting smarter makes it easy to mentally model them as something more well-rounded than they are. This will age poorly as the jagged lows improve, but it is relevant today.

magicturtle42

@magicturtle42

Mar 26

There are two opposing ditches:
Risk: You offload too much cognitive work to your agents. You cannot track all of their actions and assumptions. You miss key decision points that may come back to harm you or your company later.
Burnout: You try to mentally understand all the ins and outs of your 1000x output. You know about the dangers of the risk path above, but you are still trying to keep pace with your fellow 1000xers. You are trying to update your understanding at the rate a computer can now write code, which is impossible.
Burnout will present an ongoing danger as there is pressure to increase outputs as the tools improve. Risk should go down as the models get better, but it will go up again as they take on increasing scope without meaningful review. There is a natural tension here that cannot be fully reconciled.

magicturtle42

@magicturtle42

Mar 15

imagine it gets smarter than Einstein but it still talks like that

magicturtle42

@magicturtle42

Mar 4

I empathize with this, but I am working on adjusting my perspective. It does feel like a skill I enjoyed and spent my professional life improving was taken from me. I *could* still choose to program by hand, but I will get lapped by those who do not. One thought is that my work is not for myself. I am working to make life better for other people around me. If the tools help me serve others better, so be it. My approach has been to move the struggle. How quickly can I move while maintaining high quality? Speed and correctness still seem like deep frontiers filled with rewarding challenges. Onward!

ThePrimeagen

@ThePrimeagen

Mar 4

I have been thinking about this a lot. I think for a great many engineers, the ones who did it because they loved it only to discover that money was in fact at the end of the rainbow, both the journey and the destination were satisfying. In fact, I think I can argue with authority that the destination was only satisfying as the journey was difficult. The hard-fought evenings spent toiling away on an idea and codebase that slowly gives way to your vision were an incredible experience. We will call the people who fall into this category of hard-fought journey and destination tinkerers. One thing tinkerers have always hated is already-known problems. The journey is clear as day. The obstacles are minor inconveniences. It's purely a matter of typing the solution into the terminal. This is also why I think so many of this group go out and do open source, or start companies. Work largely falls into this category, with few exceptions. This is why I largely find UI work soul-sucking. I know the solution; it's a matter of just looking up the details and putting it into my editor. Yawn. CSS, flex box this, grid that, put the tailwind classes in the bag. To me, the LLM software world has little to no journey and discovery. It's more of simply taking my high-level idea and formulating it into testable, atomic chunks that can be verified. I have traded my favorite part, discovery and raw creation, for an itemized list of TODOs and patience and "No Mistakes." So every morning from 6 to 9 I simply hand code everything, even UI things. It is because I want journey and discovery and raw creation. Maybe one day comes and it's just so futile that I stop this. But for now, I still see such great value in this. I see such better thought-through products, because of slowing down and truly thinking through everything. The architecture, the design, everything is an expression of discovery and creation. And I love it.
I am sure there will come a day, maybe even in the next 6 months, where I change my mind. For now, I pursue the love of the game intentionally. I do also believe that there exist people who get the same joy from prompting LLMs that I got from building with tears and sweat. I am positive of it. I just don't understand how. But people love UI work. I also don't understand that.

magicturtle42

@magicturtle42

Feb 22

magicturtle42

@magicturtle42

Feb 14

Staying informed has never mattered more. But the opportunity cost of staying informed instead of building has also never been higher. So I'll post about it instead.

magicturtle42

@magicturtle42

Feb 7

I would love nothing more than for this to be true. My ability to provide for my family rides on it, so I think about this a lot. I do think there is likely a long tail of "steering" and safety-critical observation. But the main human SWE interventions I see now are:
1. Requirements elicitation/reconciliation
2. System design
3. Pointing inference at the correct problems in the system in the right way
1. Requirements can be addressed by having the model elicit from the stakeholder. These systems are incredible at holding thousands of requirements at a time and understanding conflicts/surfacing tensions. There's no real human moat, and ultimately relying on humans for part of the job will become a liability rather than an asset.
2. Architecture is tougher to RL verify, but I haven't read every book on system design and the models have. I think this gap goes away eventually. Part of the reason it hasn't is that the models would have to be more strongly opinionated on structure, which isn't always what you want (again, models should have a precise mode and a guided mode). But this too will fade with general reasoning increases.
3. It's not inference efficient for the model to consider everything it knows about programming and apply it to your software. But that will shift as the smaller models get more capable (unless we become truly compute scarce). As agents transition to becoming long-lived monitoring systems that can fix/patch/update as needed, they will converge on the right strategies to address problems.
I don't think SWE goes away tomorrow. I just can't imagine the human-sized gaps are that meaningful in, say, 10 years. And I don't know what to do about that.

Eric S. Raymond

@esrtweet

Feb 6

If you are a software engineer "experiencing some degree of mental health crisis", now hear this, because I've been coding for 50 years since the days of punched cards and I have a salutary kick in your ass to deliver. Get over yourself. Every previous "programming is obsolete" panic has been a bust, and this one's going to be too. The fundamental problem of mismatch between the intentions in human minds and the specifications that a computer can interpret hasn't gone away just because now you can do a lot of your programming in natural language to an LLM. Systems are still complicated. This shit is still difficult. The need for people who specialize in bridging that gap isn't going to go away. As usual, the answer is: upskill yourself and adapt. If a crusty old fart like me can do it, you can too.