Adam.GPT

Adam.GPT

2,378 Photos and videos

Tweets

Pinned Tweet

Adam.GPT

@TheRealAdamG

1 Oct 2024

Shippty ship ship, McShipface.

367

315,192

Adam.GPT

Adam.GPT

@TheRealAdamG

or not.

Greg Abbott

@GregAbbott_TX

Jun 1

Spurs about to dunk on Knicks like Texas has been dunking on New York. Go Spurs Go 🔥👽

1,692

Adam.GPT

Adam.GPT retweeted

Adam.GPT

@TheRealAdamG

14h

Knicks in 5

117

6,660

NBACentral

Adam.GPT retweeted

NBACentral

@TheDunkCentral

14h

A Knicks fan claims Victor Wembanyama had him kicked out of a $20,000 Ritz-Carlton penthouse during the NBA Finals after a friend allegedly told Wemby, “Good luck tonight, big guy,” per @nypost The fan says his group spent $750,000 on Game 4 tickets and was removed from the hotel just minutes after the interaction. The Ritz-Carlton and Wembanyama’s agent did not respond to requests for comment. (Via nypost.com/2026/06/13/sports…)

616

295

9,397

2,734,289

Karine Hsu

Adam.GPT retweeted

Karine Hsu

@karine_hsu

Jun 12

wow sick new billboards from openai

212

4,498

2,582,728

Justin

Adam.GPT retweeted

Justin

@JustinBleuel

Jun 12

📝 Turn any @ChatGPTapp to a note Now you can turn any message from ChatGPT into an interactive writing block - just highlight chat's response and click Create Note These can be saved to your library then retrieved / referenced in a future chat conversation

0:44

339

27,636

Adam.GPT

Adam.GPT

@TheRealAdamG

Jun 12

Valuemaxing > tokenmaxing

121

7,159

OpenAI Developers

Adam.GPT retweeted

OpenAI Developers

@OpenAIDevs

Jun 12

And it doesn’t stop at answers. Describe what you’re building and the agent will create a custom guide with a tailored prompt and relevant resources. Open the guide in Codex or copy it as Markdown for your coding agent.

0:19

291

25,620

OpenAI Developers

Adam.GPT retweeted

OpenAI Developers

@OpenAIDevs

Jun 12

Ask our developer docs. They’ll show you the way The new docs agent on 🔗developers.openai.com helps you find answers about OpenAI products and takes you directly to the relevant documentation.

0:19

935

161,268

Siqi Chen

Adam.GPT retweeted

Siqi Chen

@blader

Jun 12

fable is expensive wrote a quick arbitrage skill for claude code to offload the hard things to fable and everything else to gpt 5.5 xhigh it will try to be always on or you could say "implement with arbitrage" github.com/blader/arbitrage

GitHub - blader/arbitrage: A Claude Code skill for token arbitrage: keep premium model tokens for...

A Claude Code skill for token arbitrage: keep premium model tokens for judgment, dispatch all code-writing to codex. - blader/arbitrage

github.com

655

51,589

Epoch AI

Adam.GPT retweeted

Epoch AI

@EpochAIResearch

Jun 12

FrontierMath: Tiers 1–4 (v2) is live. We concluded an audit that addressed errors in 42% of problems. Rankings are similar but scores are higher across the board. The current leaders are GPT-5.5 (xhigh) with 85% on Tiers 1–3 and Google’s AI co-mathematician with 76% on Tier 4.

574

113,436

Dan Shipper 📧

Adam.GPT retweeted

Dan Shipper 📧

@danshipper

Jun 12

had an idea for a big fable project, set it up, and let it cook came back an hour later and it had triggered the safeguards and fell back to 4.8 10 minutes in back to codex 😬

884

113,171

Gappy (Giuseppe Paleologo)

Adam.GPT retweeted

Gappy (Giuseppe Paleologo)

@__paleologo

Jun 12

Clearly, Fable is doing a lot of work, and unleashing a ton of agents. To review a short technical note, it released 31 agents, coded simulations to verify my results, did "adversarial reviews". Eventually, it only made the assumptions slightly more rigorous. It is all good. For a four-page technical note a little code, though, it consumed all my Pro session tokens, *plus* $17 worth of credits. It is ridiculously expensive. I have 20-page reports that are way more complex than this. I can see how Anthropic has entered the phase of market-clearing prices, yield management, and pre-IPO. I recall Boris Cherny saying in a podcast, "run Opus [4.6], not Sonnet. It's worth it". I feel comfortable saying that running your top-shelf model is *not* worth it anymore. Decreasing returns, on most tasks. Like in the real world, some people can be real smart, but real expensive.

1,081

162,901

Jimmy Apples 🍎/acc

Adam.GPT retweeted

Jimmy Apples 🍎/acc

@apples_jimmy

Jun 12

Having tested fable 5 for a bit now my thoughts are is it does some hard stuff extremely well but day to day I still prefer 5.5 First thing people are going to test 5.6 on though is the same threejs, games with physics, web dev type stuff. Ignoring the restrictions, good model.

358

19,617

James Sun

Adam.GPT retweeted

James Sun

@JamesZmSun

Jun 12

Two browser updates for today's Thursday Codex drop: Dev Mode faster browser use. 1/ Dev Mode Now, you can give full CDP (Chrome DevTools Protocol) access to your Codex agent so it can be more helpful in deep developer workflows. By turning this on, your agent will be able to inspect: - runtime errors & console log - network requests, responses & failures - live DOM, storage, and page state This is really helpful for debugging gnarly bugs and making your apps more performant. To try it, go to your codex settings, and search developer mode to toggle it on! 2/Browser Use Perf improvements Using Codex to drive your browser should feel much faster after this week's update, up to 2x in many cases! We made the plugin a lot more efficient, and truncated a lot of redundant text in the skill. Let us know what you think!

305

40,295

OpenAI Developers

Adam.GPT retweeted

OpenAI Developers

@OpenAIDevs

Jun 12

Introducing developer mode for browser use in Chrome and the Codex in-app browser. Codex can use the Chrome DevTools Protocol (CDP) to debug browser issues by profiling JavaScript performance and inspecting console output, network traffic, and page state.

1:28

206

2,982

357,700

OpenAI

Adam.GPT retweeted

OpenAI

@OpenAI

Jun 12

We heard you wanted to use Codex rate limit resets on your own time. Starting today, we’re rolling out the ability to save rate limit resets to use later. We’re starting Go, Plus, Pro, and Business users with one free reset:

0:28

1,254

1,722

21,436

4,197,561

Lisan al Gaib

Adam.GPT retweeted

Lisan al Gaib

@scaling01

Jun 11

Fable 5 refused 200 out of 200 ProgramBench tasks lmao

126

184

5,156

405,501

Christopher Ehrlich

Adam.GPT retweeted

Christopher Ehrlich

@ccccjjjjeeee

Jun 11

fable is overall very good obviously but it feels like a lot of people have never used a gpt-5.5 tier model before

275

14,963

Florian Brand

Adam.GPT retweeted

Florian Brand

@xeophon

Jun 11

> We tested Fable 5 using Anthropic’s new ‘fallback’ mechanism, which can route safety-flagged messages to Claude Opus 4.8. 🤦‍♂️🤦‍♂️🤦‍♂️🤦‍♂️🤦‍♂️🤦‍♂️🤦‍♂️ And it affected 8% of the samples, so it score needs to be revised down, possibly quite a bit, bringing it closer to the rest of the pack

Artificial Analysis

@ArtificialAnlys

Jun 10

Claude Fable 5 launched today at #1 on the Artificial Analysis Intelligence Index, putting Anthropic nearly 5 points ahead of any other lab’s best model We supported @AnthropicAI with pre-release evaluation of Claude Fable 5. Claude Fable 5 scores 64.9 on the Artificial Analysis Intelligence Index, claiming the #1 rank overall. It is ~5 points ahead of the closest non-Anthropic model (GPT-5.5), and Anthropic models now occupy both of the top 2 places. Key takeaways for Claude Fable 5 (adaptive reasoning with max effort and Opus 4.8 as fallback model): ➤ New safety guardrails for Mythos-class models: Claude Fable 5 uses the same underlying model as Claude Mythos 5 for public usage, with additional guardrails for potentially-harmful cybersecurity, biology, chemistry, and distillation-related queries. We tested Fable 5 using Anthropic’s new ‘fallback’ mechanism, which can route safety-flagged messages to Claude Opus 4.8. Anthropic states that fallback occurs in fewer than 5% of sessions on average, and we recorded fallback routing in ~8% of tasks across the Intelligence Index (mostly in scientific questions from evaluations like GPQA, AA-Omniscience and Humanity’s Last Exam) ➤ State-of-the-art Intelligence: Claude Fable 5 takes the #1 position on the Artificial Analysis Intelligence Index, scoring 64.9 and setting the highest score on 5 of the 10 underlying benchmarks. On AA-Omniscience, our knowledge and hallucination benchmark, Fable 5 scores 40, 7 points over the previous leader, Gemini 3.1 Pro Preview, driven primarily by higher accuracy. We generally observe a strong relationship between AA-Omniscience accuracy and model size in open weights models, which suggests Fable 5 could be larger than previous public Anthropic models ➤ Frontier agentic capability: Claude Fable 5 is at the frontier across all three agentic evaluations in the Index: GDPval-AA (real-world work tasks), Terminal-Bench Hard (agentic coding), and Tau2-bench Telecom (tool use for customer service). Its GDPval-AA Elo of 1932 is a significant jump from the previous leader, Claude Opus 4.8, further extending Anthropic’s lead in agentic capabilities ➤ Leading HLE score, but refusal and fallback in 9% of tasks: Claude Fable 5 scores 53% on Humanity’s Last Exam, more than 7 points ahead of the next-best model, Claude Opus 4.8 (max). Fable 5 triggers safety guardrails on 9% of HLE tasks, falling back to Claude Opus 4.8. Including this fallback usage, running HLE with Fable 5 costs ~$2.2k, the highest of any model we have evaluated Key model details: ➤ Context window: Claude Fable 5 retains the same 1M token context window as Claude Opus 4.8 ➤ Price: Claude Fable 5 is priced at $10/$50 per 1M input/output tokens, 2x the token price of Claude Opus 4.8. The cache write/read price is $12.50/$1 per million tokens ➤ Availability: Claude Fable 5 is included in Pro, Max, Team, and seat-based Enterprise plans through June 22, consuming 2x Opus usage. From June 23, usage will require credits, with Anthropic saying it plans to restore subscription access once capacity allows

208

25,293