Unaligned. Previously: AR Glasses @Google, marcopolo.me, @Jawbone, iOS Bluetooth, iPod, Microsoft

Joined June 2007
Adam MacBeth retweeted
The main contribution of AI Safetyists has been poisoning the AI's priors with their fanfic
Replying to @beffjezos
At least they filled the training corpus with fan fic of rogue malicious AI, a good first step for an architecture that token predicts.
Adam MacBeth retweeted
Not your weights, not your model. Importance of decentralized intelligence and ownership over weights and compute has never been felt so sharply.
Adam MacBeth retweeted
Replying to @martin_casado
It is very important to regulatory capture for the firm to be part of the "in group" when it comes to decision-making to a point when decision-making is deferred to them as experts. A key but erroneous assumption they made was that when facing regulation they would be insiders.
Adam MacBeth retweeted
I’ve had a number of conversations with folks inside and outside government about the current situation with Anthropic, and here is what I believe to be true:
— As we know, Anthropic publicly released its Mythos class models earlier this week under the commercial name Fable.
— Fable is Mythos with guardrails. But if those guardrails fail, then you’ve exposed Mythos and its advanced cyber capabilities to people who shouldn’t have them. (Keep in mind that Anthropic itself widely promoted the idea that Mythos was a cyberweapon and needed to be regulated as such. They asked for government regulation of Mythos and championed the guardrails on Fable. If there is a vulnerability — big or small — it is Anthropic’s responsibility to patch.)
— A highly credible trusted partner of both Anthropic and the USG who was testing Fable came forward with a jailbreak of those guardrails. The Admin asked Dario to fix the jailbreak or de-deploy the model. Dario refused.
— In their blog post, Anthropic defended its decision by saying the jailbreak isn’t serious. That is not what the trusted partner and the USG believe; nor is that kind of minimizing language consistent with Anthropic’s brand as the AI safety company. It’s difficult to fathom how they could claim a jailbreak allowing operability of a cyber weapon could be defined as not “serious.”
— In the past, Anthropic has always said that safety must be top priority and taken super seriously. In this case, Anthropic prioritized the continued offering of the consumer model over safety.
— In reaction, the Admin issued the export control. The Admin did this reluctantly. It’s been very surprised that Anthropic hasn’t wanted to cooperate with a reasonable safety request (ie fixing the jailbreak issue). Anthropic’s reaction is very much at odds with their branding and ethos as a safe AI research community.
— The Admin’s hope now is that Anthropic remediates the safety issue, the export control is lifted, and Fable goes back into general release. The Admin wants all of this to happen as soon as possible. It is frankly bewildered that Anthropic hasn’t wanted to comply with safety requests that it previously said were its highest priority.
— Those trying to misdirect and tie this action to the prior DoW/Anthropic issues are wrong. The Admin values Anthropic’s technical capabilities and feels that this issue, while serious, should be easily resolved. The ball is in Anthropic’s court.
Adam MacBeth retweeted
The lesson I take from the SpaceX IPO is that the only thing stopping us from solving arbitrarily difficult problems is extreme creativity in business models. No amount of tax and spend programs got us reusable rockets and great electric cars. Customer delight is a necessary precondition for success. There seems to be some discussion around whether successful entrepreneurs should give up control of their companies so they can subsidize some philanthropic venture that otherwise has no value prop sufficient to run it as a business where customers voluntarily exchange money for goods and services at a competitive and reasonable price. This misses the point. Transformational products deliver tangible value at 1000x the rate of charities whose value cannot be tested in the market place. Think about the undeniable value of the smart phone, satellite Internet, electric consumer devices, etc etc. I think the transformational moment for SpaceX was when Elon stepped away from the philanthropic Mars greenhouse concept and fixed his resolve on unlocking radically better rockets for humanity. The greenhouse would have been, at best, a neat trick. Falcon and Starship give humanity a durable economic engine to maintain and improve access to space, forever.
Adam MacBeth retweeted
all the labs will now race to get similarly banned by govt to avoid serving their frontier big chungus at a loss and subsequently write-off big chungus R&D costs as patriotism
Adam MacBeth retweeted
Just want to observe that Anthropic has plenty of compute. They did manage to train the SOTA model. It’s Anthropic’s customers that don’t have enough compute. Their business is not selling tokens. Their business is being in control of AGI/ASI. Do with that what you want.
Adam MacBeth retweeted
want to point out a few really interesting things here
1. Claude Code is actually the worst performing harness when using the same model, significantly behind opencode and cursor cli
this is the core reason i've been against the LLM companies focusing their business on locking people into their harness
what they are good at is making great models. they suck at making good harness products, just like how power plants won't make the best dishwashers, and how internet providers won't make the best phones
if anthropic wants to do what's best for their users, they should let people use their subscriptions in whatever harness they choose, not locked into claude code alone
2. fable 5 max is only 1pt above gpt 5.5 xhigh (77 vs 76)
this matches my experience so far - fable 5 does have the big model smell and it's pretty good, but it's not a massive jump forward like their marketing suggested, at least not on building software
this is actually alarming for anthropic because it's very unlikely people will want to pay 2x higher cost for the 1pt difference. my speculation would be that in enterprises people will be restricted to adopt fable & mythos only on some mission critical tasks, not used at scale
We've updated the Artificial Analysis Coding Agent Index, replacing SWE-Bench Pro with Datacurve's DeepSWE benchmark - the swap lifts Codex with GPT-5.5 (xhigh) above Claude Code with Opus 4.8 (max), while the newly released Claude Fable 5 (max) in Claude Code debuts at the top.
DeepSWE, built by @datacurve, writes its tasks from scratch rather than adapting them from public GitHub issues or pull requests, so no model has seen the solutions during training. That matters because SWE-Bench Pro, the benchmark it replaces in our Coding Agent Index, had grown gameable, with some models recovering the fix from the repository's commit history instead of solving the task.
The swap reorders the index: Codex with GPT-5.5 (xhigh) rises from 65 to 76, overtaking Claude Code with Opus 4.8 (max) at 73. Claude Code with Fable 5 (max), which enters directly on the refreshed index, leads at 77. SWE-Bench Pro had been flattering some combinations and penalizing others.
More below.
Adam MacBeth retweeted
LLM model matrix
Adam MacBeth retweeted
The margin on a subscription plan is a function of the average utilization. If we assume both companies have 75% API gross margins, this results in the following subscription margins. (3/4)
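One way to read the relationship in the post above, as a minimal sketch under stated assumptions: if serving usage costs (1 - API gross margin) of its API list price, then a subscriber who consumes a fraction of a usage cap costs that fraction times the cap times the cost ratio, and the plan's margin is one minus that cost over the plan price. Only the 75% API gross margin comes from the post. The function name, the plan price, the API-priced usage cap, and the utilization values below are hypothetical placeholders, not figures from the thread.

```python
def subscription_margin(price, api_value_cap, utilization, api_gross_margin=0.75):
    """Gross margin of a flat-price plan (hypothetical model, not from the thread).

    price            -- monthly plan price (hypothetical)
    api_value_cap    -- usage allowed by the plan, valued at API list prices (hypothetical)
    utilization      -- average fraction of the cap subscribers actually use, 0..1
    api_gross_margin -- gross margin on API sales; 0.75 is the post's assumption
    """
    if price <= 0:
        raise ValueError("price must be positive")
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    # Cost to serve is the non-margin share of the API-priced usage consumed.
    cost_to_serve = utilization * api_value_cap * (1.0 - api_gross_margin)
    return 1.0 - cost_to_serve / price


if __name__ == "__main__":
    # Hypothetical plan: 200 per month, cap worth 1000 at API list prices.
    for u in (0.0, 0.25, 0.5, 0.8, 1.0):
        print(f"utilization={u:.2f} margin={subscription_margin(200, 1000, u):+.3f}")
```

Under this model the margin falls linearly with utilization and turns negative once average usage, priced at cost, exceeds the plan price, which is why the comparison in the thread depends so heavily on the utilization assumption.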
Adam MacBeth retweeted
The reality is that in no world do hyperscale AI data centers get built in Seattle city limits to begin with. This is effectively a hollow victory prohibiting something that never would have happened.
Seattle City Council Member Debora Juarez says if she could today she would pass a total ban on AI and data centers. For now, Seattle becomes the largest city in the country to pass a year-long moratorium on new data centers
Adam MacBeth retweeted
One of my personal favorite features announced at WWDC will I suspect be a sleeper hit: container machines, allowing your Mac to run a lightweight, persistent Linux environment with your home directory and repos automatically mounted: github.com/apple/container/b…
Great story!
I flew to London and spent the day with Carl Pei, co-founder and CEO of Nothing. Carl already helped build one breakout phone company with OnePlus. Now he’s doing it again with Nothing, but in a much harder market. Most phone companies compete on the same things: better cameras, faster chips, bigger batteries, and slightly brighter screens. Nothing made a different bet: make the product instantly recognizable, make the software feel considered, and make the brand something people actually want to be part of. I went behind the scenes with Carl and the team to see how they're building the next great consumer electronics company. Full episode below. @getpeid @nothing
Adam MacBeth retweeted
Don't Surrender to the Machine
Tony Fadell, co-creator of the iPod and iPhone, founder of Nest, partner at Build Collective, interviewed by @lennysan (Lenny's Podcast)
Summary: AI makes shipping cheap. That raises the value of taste, judgment, and storytelling. Fadell argues the products people remember are the ones a small group with strong opinions built deliberately, fought for across three generations, and surrounded with the right marketing context. Vibe-coded shortcuts pay short-term and rack up structural debt. Luxury software gets there with humans still in the loop.
1. Cognitive Surrender. Use the machine, never hand it the wheel. Fadell's central rule is that AI can assist coding, copy, prototyping, and inventory counts, while humans still have to decide what gets built and why. The Claude main-loop code that leaked looked brittle to actual engineers because no architect had touched it. Short-term gain, long-term loss is the trade you make when you let the model run unsupervised.
2. Benevolent Dictatorship. 1.0 products get made by one or two people willing to own the opinion-based calls. Committees pulling data on a category that does not exist yet just produce a worse copy of something already in the market. The keyboard fight on the iPhone went on for months; Steve ended it by saying we are going this way and anyone not on board can switch projects. The discomfort is the cost; the product is the return.
3. Pain Plus New Tech. Worthy ideas start at someone's pain and ride a technology that just became possible. The Nest worked because thermostats were arcane to program, 50% of the energy bill ran through them, and AI was finally cheap enough to learn a household's pattern. iPod required portable mass storage plus lithium polymer plus ARM at once. Both halves of the equation have to land at the same time; one-half ideas only produce evolutions.
4. Three Generations. Everything needs three swings: make the product, fix the product, fix the business. The first iPod sold only to Mac geeks (under 1% of the market), the second mostly did the same, the third with iTunes and Windows finally moved volume. Nobody nails margins, reliability, and message in the first build, and the only failure is stopping. Founders quit because they expected one launch to clear all three bars.
5. Micromanage The Decision. Sweat a few details ruthlessly, delegate the rest. Fadell's early mistake was micromanaging operations, which exhausted his team and produced a single bottleneck. The real fight is over the data behind a call (keyboard error rates, hardware-software coupling, the load-bearing visual) and the system-level changes that only land if everyone moves at once. Everything outside that radius is somebody else's job.
6. Marketing Is Product. Customers see the product first through the press release, the first ad, and the storefront. They never see the inside. Apple's same iPod campaign flopped in Europe because European adopters were earlier on the curve and needed a different message. The right move is to write the press release before the build starts, so the three key features and the why are locked before engineering begins.
7. The Story A Thousand Times. Steve Jobs honed the iPhone story every day for two and a half years before stage. Storytelling is the loop that exposes which features matter, which words land, and which version of the truth is actually true. Borrow technique from infomercials: set up the virus of doubt, name the pain, show the relief, and dial off the cheese.
8. Luxury Versus Fast Software. Vibe-coded apps are fast fashion: cheap, throwaway, structurally brittle by version five. Real products are handcrafted, layered, and survive customer feedback for years. Use coding agents to prototype faster and reach an informed gut, then architect the spine yourself and let the model fill scoped subfunctions. The original Flighty got built by humans; a clone could be vibe-coded, but the original could not have been.
9. Flip The Stack. Long-term, voice should be primary input, keyboard secondary, tap-and-swipe tertiary, the opposite of how every smartphone is built today. The display still has to exist: maps, video, and complex visuals need glass, even in the movie Her. Pure-audio devices like Humane failed because removing the screen is "different, not better." The next iPhone is still a slab, just one you mostly talk to.
10. Atoms Beat Bits Long-Term. Hardware founders get laughed at in software cycles, then rewarded when the next platform arrives. Fadell pitched hardware in 1999 and got told it was the stupidest idea ever; the iPod shipped two years later. The durable companies have atoms in the plan: sensors, robots, devices, because software-only categories get vibe-coded into commodity. Waymo is a sensor-stacked electric car, and that is exactly the platform other companies will build on.
11. The Hype Cycle Is A Trap. Buy in before the term is fashionable; hold discipline once round sizes go to ten digits. Fadell was early on Groq and Cerebras because the valuations were small and the bets were obvious to a builder, not a market. By the time a category needs a five-billion-dollar raise to start, the venture math no longer works. Chasing what is hot guarantees showing up late.
12. No Surrender In Ethics Either. When iTunes Video was being scoped, somebody floated porn and Steve killed it on the spot: "is that the world you want your kids to grow up in." Today's analog is companies shipping sex-chat AI to juice engagement; users will feel it and brands will pay. The iPhone is a refrigerator: it stores junk food and good food alike, but the operating system can ship the nutrition labels, the limits, and the tools, and platform owners have so far refused.
Adam MacBeth retweeted
🔺NEW: Apple is expanding Private Cloud Compute (PCC) beyond our data centers. PCC on Google Cloud: NVIDIA Confidential Computing, Intel TDX, and Google's Titan chip, with capabilities that go far beyond a traditional confidential computing deployment. security.apple.com/blog/expa…
Adam MacBeth retweeted
Everyone who over-hired or lowered the bar too much in the 2021-2023 wave, or isn’t growing as fast as budgeted, now pretends they’re laying people off “due to AI productivity.”
Adam MacBeth retweeted
some thoughts on when ai builds itself
1) anthropic put out a piece on recursive self-improvement
2) for those that have been following ai progress, there isn't much new in this report
3) if you have seen the metr graph, you know we've seen rapid progress over the last year in coding agents
4) there is some internal information that anthropic provided, which is new but hard to interpret without additional information that anthropic doesn't give us
5) anthropic engineers are shipping 8x as much code as they were before claude code; but we don't know how to translate that into ai progress
5) mythos can optimize the training code for a small model much faster and more extensively than a human researcher can; but what does this mean for the frontier
6) given a sample of just problems where researchers made the wrong decision, a claude judge preferred mythos's next step 64% of the time; but apparently sonnet 4 was preferred 50% of the time
7) so, anthropic withholds the information that would really be useful for assessing each of these new datapoints; they read almost like marketing
8) i dislike how the tone of the piece is very "be worried, be scared" but they do not give us datapoints that would really tell us more about the pace of progress
9) i think that if you actually take this risk seriously and want other people to take it seriously, it is incumbent on you to do some amount of disclosure;
10) some things they could have given us:
10a) in 2025/2026, how fast has algorithmic progress accelerated in pretraining, measured in effective compute on pretraining loss
10b) in 2025/2026, how fast has algorithmic progress accelerated in post-training, measured on their internal benchmarks across a range of tasks
10c) what percentage of the large-scale, mid-scale and small-scale improvements needed to go from opus 4 to mythos, which are not in the training data, can be found independently by mythos
10d) since mythos was released, what percentage of large-scale and mid-scale improvements discovered at anthropic should be primarily attributed to mythos
11) without this kind of information, anthropic has given us nothing new on the rate-of-progress question
12) they also suggest a pause; but, i find pause arguments unconvincing; the whole posture from anthropic seems a mix of unserious and performative
13) i don't like to read vague statements from parties that say i should be *very concerned* but then won't disclose anything significant;
Our internal data shows Claude is accelerating AI development—a possible path to recursive self-improvement, or AI autonomously building a more capable successor. It’s happening faster than we thought, and the implications deserve greater attention. anthropic.com/institute/recu…
Adam MacBeth retweeted
Starlink launches now substantially outnumber all other satellite launch sources.
Adam MacBeth retweeted
The xAI master plan was right under our nose all along...
4 Sep 2025
Step 1: Buy a sh*tload of GPUs Step 2: ? Step 3: Profit
Concerning.
Trump administration, OpenAI discussing possible government stake in the AI startup cnbc.com/2026/06/05/trump-op…