⿻ Andrew Trask

⿻ Andrew Trask

257 Photos and videos

Tweets

OpenMined retweeted

⿻ Andrew Trask

@iamtrask

Apr 8

A few days ago I asked if anyone was still interested in decentralized AI. Turns out... yeah! So here's lecture 1: Decentralized AI From Scratch We build a peer-to-peer AI from scratch in about 50 lines of Python. It runs on your laptop, answers your friends' WhatsApp messages using your local data, and begins to address the privacy / prompt-injection problem through user-specific context management.

33:43

⿻ Andrew Trask

@iamtrask

Apr 4

is anyone still interested in decentralized AI? mind if I ask why?

222

25,625

⿻ Andrew Trask

OpenMined retweeted

⿻ Andrew Trask

@iamtrask

Apr 4

is anyone still interested in decentralized AI? mind if I ask why?

174

231

82,202

Dawn Chen

OpenMined retweeted

Dawn Chen @dawnchenx

Feb 19

Excited to share our preprint on BioVault – a open-source privacy-first platform for global biomedical collaboration using data visitation. 🌍 🔗 biovault.net 📄 biorxiv.org/content/10.64898…

Join the Beta - BioVault

BioVault is a free, open-source, permissionless network for collaborative genomics. Share insights without ever sharing raw data.

biovault.net

1,555

⿻ Andrew Trask

OpenMined retweeted

⿻ Andrew Trask

@iamtrask

19 Oct 2025

I've just drafted a new blogpost "GPU demand is (~1Mx) distorted by efficiency problems which are being solved" Mid-2024, Andrej Karpathy trained GPT-2 for $20. Six months later, Andreessen Horowitz reported LLM costs falling 10x annually. Two months after that, DeepSeek shocked markets with radical reductions in training and inference requirements. For AI researchers, this is all good news. For executives, policymakers, and investors forecasting GPU demand... less so. Many were caught off guard. The problem isn’t that executives / policymakers / investors lacked access to information per se… it’s that the technical/non-technical divide prevents them from seeing the difference between waste-based GPU demand and fundamental GPU demand. Meanwhile, tech experts like Karpathy, a16z, and DeepSeek understand fundamental principles which are easy to overlook if you’re not implementing the algorithms yourself. But in presenting their results as merely “AI progress”, they buried the lede… The Lede: If version X of an algorithm achieves the same result as version X-1 at 1/10th the compute cost, what exactly were we paying for in version X-1? The answer has profound implications for anyone forecasting future GPU demand: version X-1 was roughly 90% waste. And a16z’s report, Karpathy’s achievement, and DeepSeek’s breakthrough indicate this isn’t a single 12-month event… it’s a multi-year pattern. Version X-1 was 90% waste. Version X-2 was 99% waste. Version X-3... Wait… Leading AI labs allow waste? The obvious question: if this waste exists at such scale, wouldn’t the labs building these systems have eliminated it already? They are eliminating it. That’s what the 10x annual cost reduction represents. While hardware cost reduction accounts for some of the annual efficiency gain, software updates from AI labs constitute the vast majority… an ~86% efficiency gain annually. The puzzle isn’t whether labs are optimising… clearly they are. The puzzle is why so much waste existed to eliminate in the first place... and how much remains. ... (link on profile page)

358

70,589

Foresight Institute

OpenMined retweeted

Foresight Institute

@foresightinst

30 Sep 2025

We are very excited to announce our amazing speaker line-up for Vision Weekend! Join these field-leading researchers and builders as we explore the frontiers of neurotech, biotech, AI, security, space, and energy! Get tickets: foresight.org/events/vision-… Speakers: • Ed Boyden (Boyden Lab) @eboyden3 • Viren Jain (@Google) @stardazed0 • Chiara Marletto (@UniofOxford) • Laura Deming (@untillabs) @LauraDeming • Alan Mardinly (Science) @mardinly • Andrew Trask (@openminedorg) @iamtrask • Liv Boeree (Win-Win Podcast) @Liv_Boeree • Cate Hall (@AsteraInstitute) @catehall • Adam Brown (@GoogleDeepMind & @Stanford) • Greg Wayne (Google DeepMind) • Joe Betts-LaCroix (@RetroBio_) @bettslacroix • Andrew Payne (@E11BIO) @Andrew_C_Payne • Ariel Ekblaw (@aurelia_labs) @ariel_ekblaw • Eli Dourado (@AsteraInstitute) @elidourado • Gwern Branwen (gwern.net) • Adam Goldstein (Softmax) @adamjgoldstein • Erika Alden DeBenedictis (@Pioneer__Labs) @erika_alden_d • John Hallman (@OpenAI) @johnohallman • Joshua Elliott (@RenPhil21) • Juan Benet (@protocollabs) @juanbenet • Matthew Cullinen (HSBC) • Sean Escola (Protocol Labs/ARNI) • Steve Jurvetson (Future Ventures) @FutureJurvetson • Molly MacKinlay (Protocol Labs) @momack28 • Anastasia Gamick (@Convergent_FROs) @AGamick • Ela Madej (@fiftyyears) @elamadej • Brandon Goldman (@LionheartVC) @BrandonGoldman • Ant Rowstron (@ARIA_research) @rowstron

0:31

32,007

⿻ Andrew Trask

OpenMined retweeted

⿻ Andrew Trask

@iamtrask

24 Sep 2025

IMO — Ilya is wrong - Frontier LLMs are are trained on ~200 TBs of text - There's ~200 Zettabytes of data out there - That's about 1 billion times more data - It doubles every 2 years The problem is the data is private. Can't scrape it. The problem is not data scarcity, it's data access. The solution is attribution-based control (article below) "Unlocking a Million Times More Data For AI"

Andrew Curran

@AndrewCurran_

14 Dec 2024

Ilya Sutskever made a rare appearance at NeurIPS. He said the internet is the fossil fuel of AI, that we are at peak data, and that 'Pre-training as we know it will unquestionably end'.

134

986

269,000

OpenMined

OpenMined

@openminedorg

16 Sep 2025

Want to demo/play around with new analysis tech? Are you an academic using AI in your data analysis? We’re building open-source tools to solve using private and unpublished data in AI workflows. Help us help you (5min) bit.ly/3VPxBPp #AI #OpenScience #AcademicTwitter

AI and Data collaboration for Academic Research

Thank you for your interest. At OpenMined, we are exploring new AI-powered tools to help academics collaborate and leverage their data. This survey should take about 5 minutes to complete.

docs.google.com

3,670

⿻ Andrew Trask

OpenMined retweeted

⿻ Andrew Trask

@iamtrask

12 Sep 2025

IMO — Decentralized AI is more than: - an AI model in the sky, with good external auditing - an AI model in the sky, which people vote on how to use - an AI model in the sky, which is free for anyone to use - open source AI - federated training None of these are truly an interface to the world's collective intelligence. Each is actually... *mostly* centralized AI... but with the right ambitions!!! In this podcast, I lay out what I think a true decentralized AI ecosystem looks like, and my guesses on how to get there. The key use-case is broad listening (video below describes broad listening) (link to full podcast in reply)

1:04

65,186

⿻ Andrew Trask

OpenMined retweeted

⿻ Andrew Trask

@iamtrask

11 Sep 2025

Genuine breakthrough in hallucination detection UX, but the fine-tuning approach repeats the exact flaw that creates hallucinations. But that's fixable — which makes me optimistic the hallucination problem is solvable w/ 3 ingredients 1) take this UX breakthrough 2) combine it with attribution breakthroughs 3) layer over the right cryptography tech IMO - that starts to look like a real solution to the hallucination problem. I mean there's work to do, but that looks plausible to me because there's at least *some* way to address the main sub-problems of hallucinations. The central problem of hallucinations is pretty easy to understand: users don't get to choose which pre-training data sources are combined into which tokens (and with what weighting) If they did, AI users could detect and steer AI models around hallucinations pretty easily. Why? Take an example. Let's say you prompted: Prompt: "Who is Kim Kardashian's boyfriend?" FAILURE MODE 1: wrong documents (prompt missed context) If you could observe the LLM starting to use pre-training documents titled: - Kim's Teenage Years - How Kim Rose to Fame - ... you could suspect it's about to hallucinate... becuase it's not indexing into documents from today. FAILURE MODE 2: never heard of Kim Kardashian You might start seeing the LLM using pre-training documents titled: - Disney's Kim Possible vs Ron Stoppable - Kimmy Schmidt is Live on Saturday Night - ... the LLM would still say something grammatical and confident... but it's clearly not focusing on the same thing FAILURE MODE 3: someone poisoned your data If you start seeing the LLM index into pre-training documents from sources that are questionable: - kardashianfanfiction.com - theonion.com If you could see this metadata behind EVERY token an LLM produces - hallucinations would become pretty hard. Every user could ensure their predictions are only coming from documents/sources they trust. Now... why can't you normally see/control what documents from an LLM's pre-training are informing the current tokens? Well, that's a complex topic. But it comes down to the fact that attribution data gets erased during training. Where does it get erased? It gets erased during ADDITION!! Consider the difference between addition and concatenation ADDITION: 2 3 = 5 1 4 = 5 CONCATENATION: 2 3 = 23 1 4 = 14 When you add two numbers, you inadvertently erase ANY signal about the original source numbers which created it. You have NO idea what numbers were used to create the number 5... just by looking at the 5. But with concatenation... totally different story. The resulting number (e.g. 23) reveals loads of information about what numbers were used to create it! Ok, let's return to the problem of attribution... why can't you tell which documents are informing which AI tokens? It's because addition is all over the place... two places in particular. When you train an AI model, each weight update is addition, so the influence of different documents gets smeared across all the weights. And of course, when you make a prediction, every matrix multiplication is full of additions and multiplications (both of which have this same property of erasing source information... although multiplication tends to leave more traces). So how do we solve hallucinations? We need to replace additions with enough concatenations that we can see which datapoints contribute to which other ones? This might sounds like a really revolutionary concept... but it's actually super ordinary. Consdier a few examples: - RAG: keep data concatenated in a database - Mixture of Experts: train separate sub-models and pick which model you want at inference time - Model Ensembling: like MoE but simpler - Model Merging: did you know there's whole paradigms for merging models losslessly? (git-rebasin is insane!!) This is why things like RAG are helping with hallucinations. They increase the concatenation-to-addition ratio. So you can imagine a world where: - RAG keeps data separate - Mixture-of-Experts has 1 expert per data source - You can ensemble models from different sources - You can on-the-fly merge models from sources you trust for a specific prompt Keep in mind... several of these techniques are already in production in the SOTA models... Ok... continuing on... we still have some problems we need to solve to help with hallucinations... Even if you use these techniques... the model stil outputs tokens without any metadta about whose data is causing them This is where dual-number systems, and sensitive systems from cryptography are really quite powerful. Things lke differential privacy can calcualte "if i modify the input to a function... how much will the output change?" It turns out... that's the key problem we need to solve for hallucinations. you need to know... "if i removed this pre-training datapoint... how much would the output token prediction change?" If you know the answer to that... you can detect/stop hallucinations. And sensitivity tracking systems like differential privacy can do this (specifically... individual differential privacy...). The problem they usually face is the computational complexity gets INSANE when you do this for highly non-linear functions. But this is where the RAG/MoE/etc. stuff comes in... it linearizes the relationship between input sources and the final prediciton... making sensitivty tracking computationally tractable. But all of this is irrelevant unless we can actually empower end-users to know and control which sources they're using to make predictions (i.e. full "Attribution-based control" or ABC) And this is why the paper I'm quote-tweeting is so exciting. They've got the right INTERFACE. Users need to be able to see at the token level... highlighting which indicates which sources are informing which predictions. The underlying deep learning cryptography tech is there (RAG/MoE sensitivity tracking), the interface was a major missing piece. It's an exciting time for people working on hallucinations. Ok to wrap up... this means you could prompt something like: Prompt: Who is Kim Kardashian's boyfriend? and then you'd get token-by-token highlights which give you the % that each token is being informed by different documents/sources from the pre-training data. That's a powerful interface. Anyway... this is what makes me so optimistic that hallucinations can be solved. Few deep learning tweaks, little bit of cryptography... of course there's still some engineering to do to get there... exciting times! For more on attribution-based control - see the link below.

Oscar Balcells Obeso @OBalcells

9 Sep 2025

Imagine if ChatGPT highlighted every word it wasn't sure about. We built a streaming hallucination detector that flags hallucinations in real-time.

0:15

15,595

⿻ Andrew Trask

OpenMined retweeted

⿻ Andrew Trask

@iamtrask

7 Sep 2025

IMO — this paper misses the core driver of hallucinations A LLM with a billion neurons is like a billion tiny databases — database per neuron When you prompt it, the LLM looks in all the databases (i.e. neurons) for patterns it recognizes For example, when you prompt "Kim Kardashian is dating ..." The LLM looks in its billions of little hash tables and pulls out patterns: - vocabulary (words like Kim, instagram, etc.) - grammar (subjects -> verb -> object) - semantics (Kim Kardashian's known associates) But here's the problem.... when you prompt it for something unfamiliar, the LLM still recognizes some patterns (e.g. good grammar) - vocabulary (words like Kim, instagram, etc.) - grammar (subjects -> verb -> object) But if it doesn't find all the right cache entries: - semantics (Kim Kardashian's known associates) - date ranges (maybe she dated different people at different times) Then the LLM will make next-token predictions based on the hash-hits it found... but without the benefit of the hash-misses it lacks. So to return to the prompt: "Kim Kardashian is dating ..." - Grammar patterns: the next token will be a noun - Semantic patterns: the next token will be a first name (because "is dating" is usually followed by a name) - Gender pattern: the next token will be a male - Relationship patterns: the next token will be a male Kim is associated with a lot ... but if it can't find the hash-hit in its internal neuraons for the SPECIFIC male she's dating... it can hit on other things.... like - generic male names - males who appear in articles with Kim - other grammatically correct words like "no-one" We call this a hallucination, but IMO it's closer to a cache miss. So how do you solve hallucination? This paper from OpenAI suggests that we solve hallucination by putting "I don't know" in a bunch of the databases. But this isn't how you solve for cache misses — this is just how you create more cache hits of a certain type. If you had a database which was returning erroneous results, would you *fill* the database with "I don't know" entries???... On the one hand, that WOULD increase the chances that the erroneous result was "I don't know"... so you'd make some partial progress at a surface level. But IMO it's not solving the underlying problem... which is closer to detecting the sources/datapoints used for each prediction (MoE, RAG, etc. are making progress on this). IMO - a more fundamental solution would involve solving attribution-based control (link below)

Ethan Mollick

@emollick

6 Sep 2025

Paper from OpenAI says hallucinations are less a problem with LLMs themselves & more an issue with training on tests that only reward right answers. That encourages guessing rather than saying “I don’t know” If this is true, there is a straightforward path for more reliable AI.

721

100,082

Existential Hope

OpenMined retweeted

Existential Hope

@HopeExistential

19 Aug 2025

What futures could, and should, we create with advanced AI? Today we are announcing two possible paths – Tool AI and d/acc – with contributors like @VitalikButerin @owocki @AdamMarblestone and @AnthonyNAguirre

0:16

13,389

⿻ Andrew Trask

OpenMined retweeted

⿻ Andrew Trask

@iamtrask

15 Jul 2025

A different take — when LLMs allow people to summarise (more or less) infinite amounts of content, attention will cease to be a bottleneck as it once was. The attention economy is an imbalance of two things: - broad-casting scale: 1 person can talk to 1 million - broad-listening scale: we each mostly listen to 1 person at a time. This is the problem of "information overload"... BUT .... LLMs can enable you to summarise millions of pieces of content into overall vibes/summaries/reports/etc. LLMs are the beginning of the end of the attention economy.

Andrej Karpathy

@karpathy

10 Jul 2025

I often rant about how 99% of attention is about to be LLM attention instead of human attention. What does a research paper look like for an LLM instead of a human? It’s definitely not a pdf. There is huge space for an extremely valuable “research app” that figures this out.

17,884

OpenMined

OpenMined

@openminedorg

1 Jul 2025

How can the UK deliver a National Data Library that actually works? Join us at London Data Week for a hands-on event built for technologists, policy thinkers & public data advocates. 🗓 10 July 📍 UCL East 🎟 bit.ly/4lCIejH 🧵↓

National Data Library: Practical Tech for Public Data

Join us for demos and a panel exploring the potential of open-source data infrastructure tools in building the National Data Library.

eventbrite.co.uk

3,282

OpenMined

OpenMined

@openminedorg

1 Jul 2025

National Data Library: Practical Tech for Public Data In collab with @uclsteapp ⚙️ Featuring real, working tools: • @openminedorg's SyftBox • @bennettoxford’s OpenSAFELY 🎤 Speakers from @ODIHQ , @tenthinktank, @mozilla more

1,316

OpenMined

OpenMined

@openminedorg

1 Jul 2025

Wrap up with a drinks reception 🍻 — keep the conversations going with fellow builders & changemakers. Don’t miss it → bit.ly/4lCIejH #LondonDataWeek #NDL #PublicData

National Data Library: Practical Tech for Public Data

Join us for demos and a panel exploring the potential of open-source data infrastructure tools in building the National Data Library.

eventbrite.co.uk

751

OpenMined

OpenMined

@openminedorg

26 Jun 2025

We just dropped a new FL library to support researchers and organizations grappling with Federated Learning projects. Syft_Flwr combines Flower’s flexibility with the privacy-preserving networking capabilities of SyftBox. Links in the thread ↓

7,207

OpenMined

OpenMined

@openminedorg

26 Jun 2025

📚 Tutorial: bit.ly/3HYURqN 🐙 Library on GitHub: bit.ly/4nn8nEy ✨ Is your team actively working on a challenging FL project? Check out our Co-Design Program: bit.ly/3TH4wEK #FederatedLearning #AI #DataPrivacy #FlowerAI #OpenMined #Syftbox

480

OpenMined

OpenMined

@openminedorg

3 Jun 2025

Demo time! – Learn how enable secure, privacy-preserving data access. Links in the thread ↓

1,859

OpenMined

OpenMined

@openminedorg

3 Jun 2025

➡️ Register and learn more → bit.ly/4juvhqJ 📆 Don't miss invites to events like this → bit.ly/4dP9Tv9

378

OpenMined

OpenMined

@openminedorg

19 May 2025

📆 Unlocking Private Data for AI: Join OpenMined’s Masterclass during #NYTechWeek 🔗 Link in the thread ↓

1,910

OpenMined

OpenMined

@openminedorg

19 May 2025

📆 This masterclass → bit.ly/3FlpNAk ✉️ Get event invites from OpenMined → openmined.org/event-invitati…

AI Alliance Masterclass: Unlocking Private Data for AI - #NYTechWeek | Partiful

To date, AI has consumed nearly all publicly available data. This raises an urgent question: Where can the next wave of data available for AI come from? Non-public, sometimes sensitive, or private...

partiful.com

384