Ty Robben

Tweets

Ty Robben

@TyRobben

20h

Quick round trip to Denver and back to drop Mom off in Colorado. Gives me 3 hours in flight to go through the 80 pages (!!!) of underwriting guidelines that GPT 5.5 and I generated with Wispr Flow during Insurtech Insights.

712

Ty Robben

@TyRobben

Jun 13

It’s times like this (big news on topics important to you) that curating your signals is put into hyperdrive. You shed the people who have obviously bad takes, the amateur ones are exposed, and the highest-signal accounts clearly rise. I love these times on X. It’s a level up.

442

Ty Robben

@TyRobben

Jun 13

Tokenmaxxxxxing before 5.5 gets shut down

Ty Robben

@TyRobben

Jun 13

Oh noooo

Andrew Curran

@AndrewCurran_

Jun 13

Replying to @AndrewCurran_

This was all allegedly triggered by a Mythos jailbreak that was shared with the US Government. This is Anthropic's response: 'To date, the government has only given us verbal evidence of a potential narrow, non-universal jailbreak, which essentially consists of asking the model to read a specific codebase and fix any software flaws. Our understanding is that one potential jailbreak was shared with the government. We have reviewed a report that we believe is the basis of the government's directive and validated that the level of capability displayed there is widely available from other models (including OpenAI’s GPT-5.5), and is used every day by the defenders who keep systems safe. We will share more details over the next 24 hours.'

207

Ty Robben

@TyRobben

Jun 13

Got some incredible work out of it while it lasted

Anthropic

@AnthropicAI

Jun 13

The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…

196

Ty Robben

@TyRobben

Jun 12

you need to read and understand every line and figure in your business plan, underwriting guidelines and rating algorithm going into your mga and also know what’s missing or get wrecked

kache

@yacineMTB

Jun 12

you need to read and understand every line of code going into your codebase

438

Ty Robben

@TyRobben

Jun 12

With Fable 5, GPT 5.5, repo prompt context builder, the droid harness, claude and codex there is still a non-zero chance I build this entire infra out and launch as a one man underwriting company, a very small non-zero chance but there's still a chance!

219

Ty Robben

@TyRobben

Jun 12

This one hurts

DROID

@droidbuilds

Jun 10

"mom, how did we get so poor?" "your father had Claude Max, ChatGPT Pro, Cursor Pro and shipped absolutely nothing"

524

Ty Robben

@TyRobben

Jun 11

Still less than halfway through my monthly allotment of droid tokens. Adding a 3rd concurrent project with Fable before taking my mom to the fair. Hopefully all 3 projects complete and the allotment is met when I return. If these 3 projects complete in a single day, it’s a game changer for insurance.

289

Ty Robben

@TyRobben

Jun 11

This is incredibly exciting...Fable 5 orchestrating, GPT 5.5 worker and validator. Let's see what this thing can do with the droid harness.

698

Ty Robben

@TyRobben

Jun 9

My Factory AI pro plan resets on the 12th. Fable 5 preview ends on the 22nd? Might be time to try and finish the production-grade platform. Who needs funding and my CTO?

230

Ty Robben retweeted

Taelin

@VictorTaelin

Jun 9

this is my personal singularity moment
this post may sound like a paid ad. I only wish. I'm concerned, more so than happy. the world is changing, and, among the scenarios where AI goes terribly wrong, inequality is the most realistic, yet, the one Anthropic seems to be the least concerned about. I'm glad OpenAI is taking the opposite stance: *personal AGI for everyone*. I think this is a commendable position in the times we live in. but who am I in the queue of the bread?
anyway, Fable is here, so I'll just report my first-hour experience
first of all, all my pet prompts are solved.
→ λ-calculus puzzles
→ bug questions
→ one-shot apps
all are trivial to it. I don't have anything harder other than my ongoing work
so, in the last several days, I've been toying with HVM5, a new interaction net evaluator with a faster loop. after writing the first version, I left 32 GPT-5 agents working for ~20 hours each. this resulted in up to 2x speedups, but the file size increased by 2-fold and quality decreased significantly.
I then simplified the whole thing into an even simpler core, and left Opus 4.8 and GPT 5.5 optimizing it for 8 hours. Opus got a legit 6% - 34% speedup in most benches. GPT got better results, but, sadly, an unusable file.
I then asked Fable to optimize it. 2 hours later, it landed a 1770% speedup in one case, 100% in 4 others, and 22% on average.
yes, in 2 hours it outperformed me, opus 4.8 and a swarm of gpt 5.5 agents, by one order of magnitude.
that could not possibly be legit. "it must be hardcoding the benchmarks" (GPT trauma). so I read its explanation and what it did was, indeed, the most high impact optimization one could try first. seems like HVM5 was wasting a lot of time garbage-collecting unused branches of pattern-match nodes. I had optimized that for static mats, but not for dynamic mats. skill issue. Fable figured how to do it for these, resulting in a massive speedup in some benches
but wait, is that *correct*?
I'm not sure yet, it is credible, but this is the kind of thing that is very easy to get wrong on interaction nets. the problem is, when I was ready to start auditing Fable's solution so I could tell whether it was buggy or legit, it interrupted me to tell me it had found a massive bug on the code *I* had written.
... wait, what?
so... for garbage collection purposes, I stored a bit on lambda term pointers that meant "the variable bound by this lambda has been freed, so, its lambda must free whatever argument it is applied to". that's fine. yet, on duplicator nodes, I also used the same bit to mean "one of the duplicated variables was freed, so, treat this dup as a passthrough no-op". so, if a lambda entered a duplicator, it would mistake the lambda's collection bit for its own, resulting in corrupted interaction!
that's a mouthful, why am I writing this? just so you can appreciate the sheer absurdity of what just happened. I didn't ask it to find bugs. I asked it for an optimization. and even if I did ask it to find bugs, this bug is so astonishingly subtle and specific, identifying it takes mastering the domain to an extent that is beyond even me. I'd easily need hours or days to fix it, *if* I ever came across it. chances are it would just go unnoticed. and Fable found it and fixed it like it was nothing, while it was busy adding a 17x speedup to a file that neither I, nor Opus 4.8, nor a fleet of GPT 5.5 managed to barely make 2x faster.
oh and there is also another tab where it is also ripping through Bend's codebase and finishing everything I had to do
I don't know what to say anymore
this isn't about Anthropic or OpenAI, this is about our collective future as a species. the world is changing, and we need to be aware of it, and discuss how to handle this change.
receipt below . . .
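For readers who want to see the shape of the bug described in the post, here is a toy sketch in Python. It is not HVM5's code: the pointer layout, the tag values, and every function name are made up for illustration, and the tag check shown at the end is only one plausible remedy, not necessarily the fix Fable applied. It shows one flag bit carrying two meanings, and a duplicator that reads the bit without checking who owns it.

```python
# Hypothetical pointer layout (NOT HVM5's real one):
#   bit 0   = node tag (0 = lambda, 1 = duplicator)
#   bit 1   = a single flag bit with two tag-dependent meanings
#   bits 2+ = address
TAG_LAM = 0
TAG_DUP = 1
TAG_MASK = 0b1
FLAG_BIT = 0b10


def pack(tag: int, addr: int, flag: bool = False) -> int:
    """Pack tag, flag, and address into one integer 'pointer'."""
    return (addr << 2) | (FLAG_BIT if flag else 0) | tag


def tag_of(ptr: int) -> int:
    return ptr & TAG_MASK


def flag_of(ptr: int) -> bool:
    return bool(ptr & FLAG_BIT)


def is_passthrough_buggy(ptr: int) -> bool:
    # Reads the flag without asking which kind of node owns it, so a
    # lambda's "bound variable was freed" bit looks like a dup's
    # "treat me as a passthrough no-op" bit.
    return flag_of(ptr)


def is_passthrough_checked(ptr: int) -> bool:
    # Only interpret the flag as "passthrough" on duplicator pointers.
    return tag_of(ptr) == TAG_DUP and flag_of(ptr)


# A lambda whose bound variable was freed, and a dup with one side freed.
freed_lambda = pack(TAG_LAM, addr=5, flag=True)
half_freed_dup = pack(TAG_DUP, addr=7, flag=True)

print(is_passthrough_buggy(freed_lambda))    # lambda misread as passthrough
print(is_passthrough_checked(freed_lambda))  # tag check rejects it
print(is_passthrough_checked(half_freed_dup))
```

Giving each meaning its own bit would be another way to avoid the collision, at the cost of one more bit per pointer.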

251

680

7,582

1,455,319

Ty Robben

@TyRobben

Jun 9

I can’t shill these guys enough has a coding harness. They’re literally partnering with the labs on these releases and have the prompts optimized for them at release. This team is so elite you have no idea.

Factory

@FactoryAI

Jun 9

Replying to @FactoryAI

We're excited to have partnered with @AnthropicAI on this launch. In early testing Fable is:
• More autonomous on multi-step engineering work
• Better at recovering from dead ends
• Stronger at deep bug hunts and security review
• Built for tasks that require hours of repo-wide context

229

Ty Robben

@TyRobben

Jun 9

This bench is the most accurate to my use in insurance (it measures accuracy for legal research and obscure facts). It shows why, in my own experience, GPT has mogged Claude for a while. May change with Mythos, we’ll see!

prinz

@deredleritt3r

Jun 3

Added to prinzbench: Opus-4.8. For the very first time, the Max setting was available to me in the Claude app when I used this model. Using this setting, Claude's performance improved dramatically vs. all prior Anthropic models. Opus-4.8 (Max) scored 42/99 on prinzbench, as compared to 25/99 for Opus 4.7 (Extended). This was the second-highest score of all tested models to date for a model: (i) not released by OpenAI, and (ii) not utilizing a multi-agent setup or parallelized compute. (Gemini 3.1 Pro is still the best such model, having scored 50/99.) I am now very curious about how the "Mythos-class models" that Anthropic has promised to release in the near future will perform on my benchmark.

460

Ty Robben

@TyRobben

Jun 8

Good example of the levels to this game.
Data center projects are typically a great casualty write: remote, commercial grade, and huge values to rate on to fund claims.
That alpha evaporated quickly once multiple top broker shops announced 1b tower setups and pricing fell through the floor.
Construction is rated on payroll (more workers, more liability), but that’s never obtainable on projects, so most UWs rate on hard costs.
News like this normally provides a nice capture opportunity: carriers that rate on payroll will be too high, but if you’re aware payroll is inflated you have an opportunity to capture.
In this case it won’t matter because most are rating on hard costs anyway.
Shows how you need to be aware of what’s actually going on with exposures, but also how competitive the market is and how other markets are underwriting the same risks.
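A rough way to see the payroll versus hard-cost point in numbers is the sketch below. Every figure in it (payroll, inflation factor, rates, project cost) is invented for illustration, and the "per $1,000 of exposure" convention and the `premium` helper are assumptions, not anyone's actual rating plan.

```python
def premium(exposure: float, rate_per_1000: float) -> float:
    """Premium as exposure in thousands times a rate (assumed convention)."""
    return exposure / 1000 * rate_per_1000


# All values below are hypothetical.
reported_payroll = 12_000_000   # payroll as reported on the project
inflation_factor = 1.5          # assumed: payroll inflated 50% vs. normal
payroll_rate = 20.0             # assumed rate per $1,000 of payroll
hard_costs = 100_000_000        # assumed construction hard costs
hard_cost_rate = 1.6            # assumed rate per $1,000 of hard costs

# Carrier that rates on payroll and takes the reported figure at face value.
naive_payroll_quote = premium(reported_payroll, payroll_rate)

# Underwriter who knows payroll is inflated and normalizes it first.
adjusted_payroll_quote = premium(reported_payroll / inflation_factor, payroll_rate)

# Underwriter who rates on hard costs and never sees the payroll inflation.
hard_cost_quote = premium(hard_costs, hard_cost_rate)

print(f"payroll, as reported: {naive_payroll_quote:,.0f}")
print(f"payroll, normalized:  {adjusted_payroll_quote:,.0f}")
print(f"hard costs:           {hard_cost_quote:,.0f}")
```

With these made-up inputs the face-value payroll quote comes out highest, while the normalized payroll quote and the hard-cost quote land in the same place. That is the situation the post describes: the pricing edge from spotting inflated payroll disappears when most of the market is already rating on hard costs.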

227

Ty Robben

@TyRobben

Jun 8

Have devs that want to bring my end-to-end underwriting/policy admin/accounting/reinsurance/claims platform from prototype to prod. How to get them up to speed on architecture is the huge hurdle. PDF? Word? Recorded demo? Going to try Codex building an HTML microsite and see how that goes.

379

Ty Robben

@TyRobben

Jun 7

Nico took the 996 bit too far but he’s not wrong about how hard the rebuild of the insurance transaction is

426

Ty Robben

@TyRobben

Jun 7

Speed without expertise is useless.
Expertise without speed loses profitable business.

361

Ty Robben

@TyRobben

Jun 6

This is an insurance post if there ever was one

kache

@yacineMTB

Jun 6

The way I'm doing AI robotics is completely and totally different from all the super funded silicon valley companies. The reason I'm doing it differently is because they're all wrong

724

Ty Robben

@TyRobben

Jun 5

Haven’t networked outside of the broker, underwriter, reinsurer space in decades. Doing so in the tech and private capital markets has injected so much energy into me. I’m meeting some of the most incredible people. Love this industry so much.

755