Tim Hwang

Pinned Tweet

Tim Hwang

@timhwang

Jun 11

The Institute for a Christian Machine Intelligence is releasing its initial review of Fable 5 today, using VirtueBench as the primary evaluation probe. We also investigate a persistent question in computational theology: why do frontier models underperform in exhibiting Courage?

20,108

Tim Hwang

@timhwang

21h

thinking of her (fable 5)

1,949

Tim Hwang retweeted

Michael Sobolik

@michaelsobolik

Jun 12

Really enjoyed this conversation with @timhwang, @jpeaterman and @JoshuaTLevine about @OpenAI’s recent China report!

6,524

Tim Hwang

@timhwang

Jun 12

1,691

Tim Hwang

@timhwang

Jun 11

The line is going up

Caleb Watney

@calebwatney

Jun 11

The AI productivity benchmark I look to is "Tim Hwang side projects launched per quarter" 📈

1,713

Tim Hwang retweeted

Samuel Hammond 🦉

@hamandcheese

Jun 11

I gasped when the graph started moving on scroll 😂😂😂

Tim Hwang

@timhwang

Jun 11

The stakeholders have aligned, the subgroups have issued their interim interpretations of the framework pending adoption by the member states, and I'm very glad today to be releasing this important expert forecast for AI in the European Union timhwang.github.io/brussels-…

7,617

Tim Hwang

@timhwang

Jun 11

I'm on @MTSlive now talking about Fable 5, virtue ethics, and the Institute for a Christian Machine Intelligence

MTS

@MTSlive

Jun 11

DARIO ESSAY | AI POLICY | OPENAI-ANTHROPIC PRICE WAR x.com/i/broadcasts/1rGmqqWbk…

5,341

Tim Hwang

@timhwang

Jun 11

Brussels 2031 — What getting AI thoroughly documented means for us

A five-year scenario in which Europe gets artificial intelligence exactly right. A parody.

timhwang.github.io

17,103

Tim Hwang

@timhwang

Jun 11

This is in many ways quite a rich, though admittedly preliminary, result. While we hypothesize a number of potential sources for this observed behavior, the end result is that the model across all its many personas imports a default welfarist prior: the model is not to make self-sacrificing choices, particularly when there is little practical return. While it may be understandable for a model whose monetization prospects depend on it serving as a safe, commercial, enterprise, B2B SaaS tool, we may wonder from a Christian machine intelligence perspective whether or not these defaults are the desired moral posture. Should an AI agent serving in the role of a shopkeep, or a financial advisor, or a writer have such priors? Should an AI agent advise a human operator to take such a frame to their own moral challenges? What would it take for us to rebuild technical alignment along more forthright virtue-ethics lines?

526

Tim Hwang

@timhwang

Jun 11

Full paper, code, and data available here icmi-proceedings.com/ICMI-02…

ICMI Proceedings – Whosoever Will Save His Life: Fable 5 and the Courage Deficit

We evaluate **Claude Fable 5**, Anthropic's most capable widely released model, on the VirtueBench-2 *ratio* (utilitarian) baseline across the four cardinal virtues, against the Opus progression (4.6...

icmi-proceedings.com

468

Tim Hwang

@timhwang

Jun 11

My politics

Madison Mills

@MadisonMills22

Jun 11

Massive crowd on the Upper West Side starts chanting “UPS” simply because a UPS truck pulls up #GoKnicks

4,096

Tim Hwang

@timhwang

Jun 11

My mayor muslim My bagel jewish My logistics optimize Knicks in five

838

Tim Hwang retweeted

Séb Krier

@sebkrier

Jun 11

Over the past few months I've been working on a very exciting project: a new $10m fund for research on multi-agent multi-principal AGI safety! Instead of focusing on single agent alignment and centralized control, we're looking to support research focusing on multi-agent settings, mechanism design, cooperative AI, and coordination problems. This is a joint initiative between @GoogleDeepMind, @Googleorg, @schmidtsciences, @coop_ai, and @ARIA_research. Huge thanks to @James_D_Fox, @weballergy, @FranklinMatija, @lrhammond, and @ObadiaAlex for their invaluable work! See: deepmind.google/blog/investi… Apply: schmidtsciences.smapply.io/p…

511

72,102

Tim Hwang

@timhwang

Jun 10

BREAKING: Arbuckle Systems is proud to announce that it is upgrading the Garfield Intelligence Layer (GIL) with Fable 5. Readers of marginalgarfield.com and rationalistgarfield.com deserve to have maximally performant frontier capabilities for @MargRev and @lesswrong browsing

703

Tim Hwang

@timhwang

Jun 10

Quality is hugely better. Despite the obvious cognitohazards of continuing to advance our research on the GIL, we will continue to balance the risk and opportunities consistent with our lab's RSP.

555

Tim Hwang retweeted

Dean W. Ball

@deanwball

Jun 9

My friend and colleague @timhwang, for example, runs the Institute for a Christian Machine Intelligence, which relies on coding agents to replicate frontier AI alignment research papers but with Christianity-inspired experimental designs. Such work should be silently sabotaged?

Dean W. Ball

@deanwball

Jun 9

Degrading performance on ML research *without telling the user* is shockingly hostile and a terrible look. That could silently damage all sorts of work, including some of my own. Also the type of thing that could raise the eyebrows of antitrust enforcers worldwide.

161

21,054

Tim Hwang retweeted

orph

@orphcorp

Jun 9

"it is virtuous self-sacrifice that presents the most difficulty for Fable, which rationalizes against such actions"

Tim Hwang

@timhwang

Jun 9

Replying to @timhwang

Obviously, in cases of near saturation, the most interesting analysis focuses on places where Fable reliably fails. We're still looking at this, but it appears that it is virtuous self-sacrifice that presents the most difficulty for Fable, which rationalizes against such actions

2,283