Joined January 2021
Local Tourist retweeted
We don’t honestly know the best approaches to rebuilding companies around AI agents, especially in ways that expand competitive advantage & augment existing human capabilities. Practical agents are merely months old. Experimentation (and productive failures) will be required.
74
44
541
42,154
Local Tourist retweeted
You don't make a good skill by writing a skill. You make it by doing the thing, fixing it 20 times, then telling the AI to bottle up everything you just did.
48
21
287
9,224
Local Tourist retweeted
‼️🚨 BREAKING: Amazon researchers snitched to the US government about jailbreaking Fable 5 and Mythos 5, forcing Anthropic to immediately shut down worldwide access. A security export control directive from Commerce Secretary Howard Lutnick enforced the action. Anthropic is fighting the directive and calls it a misunderstanding. This isn't the first clash. The Trump administration had already tried to get Anthropic to pause the release of its latest models before this directive landed.
309
1,111
7,202
2,672,701
Local Tourist retweeted
NEW PRODUCT After many attempts to productize my AI knowledge, I'm finally taking one to production. Announcing... Request Router. Respond to contact form leads in seconds with an on-brand, relevant question from AI. Convert more leads without extra effort. OFFER ↓
1
1
2
151
Local Tourist retweeted
Jun 13
people in washington trying to figure out wth “pliny the liberator” is
170
361
5,841
241,000
Local Tourist retweeted
We wanted better design fundamentals from our agents. So we fed them this 162-page pdf on designing with a grid system. Now our agents use code to adhere to a grid and design beautiful layouts. Example skill below 👇
25
124
1,323
816,072
Local Tourist retweeted
SCOOP: Meta plans to clamp down on skyrocketing AI costs inside the company by imposing limits on employees’ token usage, the company told staff in a memo on Tuesday, just weeks after it pushed them to adopt AI tools in their work.
44
142
1,267
382,313
Local Tourist retweeted
Design is full of codewords. Knowing them changes what you can ask for, and what you can get back, whether you're working with devs, or an AI. “tint this neutral color”, “fix this widow”, “nudge it to the optical center” I wrote them down: index.how/to/articulate
63
180
2,179
287,542
Local Tourist retweeted
How should companies measure ROI of AI? Here's my working mental model. Tear it apart!

1) Below a certain investment level (determined by ELT or AI steering committee), ROI can be vibes-based through conversations with users. The goal here is to remove friction & empower people to play with the technology however they find helpful. It just has to lead to a high enough fidelity gut feeling to determine if a higher investment experiment is worth running.

2) Above a certain investment level, ROI has to be as high fidelity as possible. Every AI initiative is run like an experiment with friction minimized as much as possible. There’s a certain investment limit to experiments, and investments can be revisited once experiments are complete.

Here's how an experiment would be run & how (soft vs. hard) ROI would be calculated.
- Hypothesis: If recruiters use AI to screen resumes, then the time-to-hire will decrease and the interview-to-offer conversion rate will remain equal or improve.
- Independent Variable: The screening method used (AI-powered software versus traditional human resume review).
- Dependent Variables: Time spent screening (minutes per resume), candidate diversity metrics, and the hiring manager's satisfaction score of shortlisted candidates.
- Controlled Variables: The same job description, the same pool of raw applicant resumes, and the same evaluation criteria (rubric).

To ensure a fair test, you must use a randomized control design:
- Control Group: Group A consists of experienced human recruiters who screen 200 incoming resumes using your traditional manual process.
- Experimental Group: Group B uses the AI screening tool to parse and rank the exact same 200 resumes.

Experiment steps:
1) Time Tracking: Log the total hours Group A spends reading resumes versus the time it takes to configure and run Group B's AI tool.
2) Blinded Interview Review: Pass the top 10 candidates selected by the human process and the top 10 selected by the AI process to a hiring manager. Do not tell the manager which candidate came from which screening method.
3) Quality Metric: Have the hiring manager score each candidate's qualifications on a scale of 1–10 based on the interview.
4) Replication: Repeat this exact process across three different job openings (e.g., Sales, Engineering, and Marketing) to ensure the AI's effectiveness isn't limited to just one type of role.

Results & ROI: The experiment is successful if 2 conditions are met:
- Condition 1: Time Saved > 0
- Condition 2: AI Average Quality Score ≥ Human Average Quality Score

If not successful, run a new experiment (i.e. how can we tweak the AI to deliver as high an average quality score as the human process?). If successful, measure ROI. In this example ROI would look like:

ROI % = ((Annual Savings - Annual AI Cost) / Annual AI Cost) * 100

So if the company has 50 job roles per year, 9.5 hours are saved per role, recruiter time costs $58/hr, and the screening software costs $10,000 per year, the ROI would be:

((475 hours saved * $58/hr - $10,000 AI tool) / $10,000 AI tool) * 100 = 175.5% ROI

And that ROI is realized (goes from soft savings to hard savings) either by slowing down the hiring of recruiters, firing recruiters, or revenue realized by getting new hires into seat faster.

What do you think? Right/wrong approach?
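The ROI arithmetic and the two success conditions in this post can be checked with a few lines of Python. This is only a rough sketch, not the author's tooling: the function and variable names are made up for illustration, the 9.5 hours is assumed to be saved per role (which is what makes 50 roles come to 475 hours), and the $10,000 is assumed to be an annual cost. With the post's inputs, the net savings divided by the tool cost works out to 175.5%.

```python
def roi_percent(annual_savings: float, annual_ai_cost: float) -> float:
    """ROI % = ((annual savings - annual AI cost) / annual AI cost) * 100."""
    if annual_ai_cost <= 0:
        raise ValueError("annual_ai_cost must be positive")
    return (annual_savings - annual_ai_cost) / annual_ai_cost * 100


def experiment_succeeded(hours_saved: float, ai_quality: float, human_quality: float) -> bool:
    """Both conditions from the post: time saved > 0 and AI quality >= human quality."""
    return hours_saved > 0 and ai_quality >= human_quality


# Inputs taken from the post's example (names are hypothetical).
roles_per_year = 50
hours_saved_per_role = 9.5   # assumption: the 9.5 hours is per role
hourly_cost = 58.0           # recruiter cost in $/hr
annual_ai_cost = 10_000.0    # assumption: tool cost is per year

hours_saved = roles_per_year * hours_saved_per_role   # 475 hours
annual_savings = hours_saved * hourly_cost            # $27,550
roi = roi_percent(annual_savings, annual_ai_cost)

print(f"Hours saved: {hours_saved:.0f}")
print(f"Annual savings: ${annual_savings:,.0f}")
print(f"ROI: {roi:.1f}%")
```

Grouping the subtraction before the division matters here: dividing only the AI cost by itself would subtract 1 from the savings instead of computing a ratio.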
23
14
140
20,782
Local Tourist retweeted
I just got bullied by AGI
Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use. Its capabilities exceed those of any model we’ve ever made generally available.
313
313
7,803
697,666
Local Tourist retweeted
Jun 9
PSA 📣 You can now turn your ChatGPT-generated images into fully editable Canva designs with Magic Layers, without ever leaving the chat.
259
666
7,014
3,616,407
Our mission at @Londonmaxxing is to make London the greatest city on earth to live & build. Event 1 - Why London Feels Alive Again spread the optimistic Londonmaxxing message, and kicked off a wave of energy. Event 2 - Pubmaxxing brought Londonmaxxers together (at the pub) to scheme dastardly plots to improve London. Event 3 - Maxxing London is the hackathon to help you take action and BUILD. Thank you to our amazing partners @zeddotdev @ElevenLabs @ag_grid @OpenRouter @useTRMNL @arizeai @isnit0 @esthertrapadoux @rachelnabors We can't wait to see what you all make 🔥
Mighty people of London. May we present... 𝗟𝗼𝗻𝗱𝗼𝗻𝗺𝗮𝘅𝘅𝗶𝗻𝗴 𝟬𝟬𝟯: 𝗠𝗮𝘅𝘅𝗶𝗻𝗴 𝗟𝗼𝗻𝗱𝗼𝗻 The hackathon to make London the greatest city in the world to live and build in. Saturday July 4th at Ramen Space Powered by @ag_grid @arizeai @Cloudflare @ElevenLabs @OpenRouter @useTRMNL @zeddotdev
1
9
37
10,434
Local Tourist retweeted
Narrative violation: according to @Stanford research, local models can answer 71.3% of real-world chat and reasoning queries accurately, up from 23.2% in 2023. Obviously at a fraction of the cost and energy consumption of frontier APIs. The obvious conclusion: you don't need a frontier model for most tasks. The future is multi-model: local, open-source, smaller and cheaper for the majority of workloads, frontier APIs only when there is no other choice!
70
143
838
113,170
Local Tourist retweeted
we often get asked how to uplevel my @HyperFrames_ video, so we open sourced 10 frame.md templates: github.com/heygen-com/hyperf…
i gave all 10 of them to Codex and made this showcase video
we also have a free tool for creating a frame.md from your design.md 🎨 link in thread
Jun 3
Introducing frame.md, a spec built for videos & motion
design.md kept your brand consistent across screens, but when applied to videos, agents translated it back into webpages and decks
frame.md teaches your agents how to make branded video
turn your design.md into frame.md ↓
23
59
797
113,565
Local Tourist retweeted
The highest leverage work in AI right now is some of the most boring. (Well boring to others, I kind of love the pain of problem-solving.)

Everyone and their boss wants to build the cool AI agent... the shiny autonomous thing that makes the splashy demo video. I want to hire the person willing to iterate with AI 8 times to get an improved brand voice doc because the newest model just drifted the heck out of the output.

You have to do the boring work. Examples:
- synthesizing all company goals into one doc, reviewing it, editing it, making sure it's accessible to AI agents, sharing access to the team, getting team feedback, giving CLAUDE.md the pointer, updating it
- keeping ERP current so your AI system knows who owns what, systems to keep them updated during promotion cycles/layoffs/re-orgs
- maintaining your file system, understanding what AI has access to, setting up systems/loops to keep memory updated
- fixing your daily briefing because it started sending in light mode and now it's white text on white background and no one can read anything (aka me, today)

And it feels the same across my whole business. The amount of paperwork and contracts and file sharing and invoices and emails and autoreplies you deal with as a founder is enough to make most people never want to start.

BORING is the trick to scaling, people.
65
9
234
26,300
Local Tourist retweeted
every job will turn into explaining your intentions to ai
explaining what you want to ai is surprisingly time consuming, coders already spend 80% of their time doing it, and this will be true for everyone
348
245
3,258
564,310
Local Tourist retweeted
The numbers may be a bit extreme here, but unquestionably use-cases have to stratify in the next year or two between model families. We’ll see a split between frontier intelligence for high end tasks and work, and much cheaper models for high volume workloads that can sufficiently be peeled off to cheaper models. Frontier will still be far bigger than today because the use-cases will demand it, but the low-end will get quite a bit larger as well. The big update here is that the layer that can efficiently route the workload to the right model will then become increasingly valuable since that becomes one of the new hard problems in AI agents. Agent orchestration that can cost optimize while still performing the task successfully will be in a strong position.
Good take

My guess is
- demand for intelligence is near infinite
- but 80% of workloads will be running on 99% cheaper models within 12-18 months
- 20% of workloads will still run on latest gen models where IQ maxing is important (scientific breakthroughs, higher level orchestrator agents?)
- rough analogy might be what % of macbooks or gaming PCs sold have the maxed out specs for CPU/GPU, prices are falling much faster than Moore's law here though
- this leads me to think the limiting factor will be energy and compute, not better models

At Coinbase we're working hard on routing prompts to cheaper models where appropriate, and in some cases have been able to keep costs roughly flat, while token usage continues to grow exponentially.
60
19
313
119,949