Joined August 2022
865 Photos and videos
Self retweeted
The 3rd edition of my book Deep Learning with Python is being printed right now, and will be in bookstores within 2 weeks. You can order it now from Amazon or from Manning. This time, we're also releasing the whole thing as a 100% free website. I don't care if it reduces book sales, I think it's the best deep learning intro around, and more people should be able to read it.
313
940
6,236
913,827
Self retweeted
Llama.cpp supports the new gpt-oss model in native MXFP4 format. The ggml inference engine (powering llama.cpp) can run the new gpt-oss model with all major backends, including CUDA, Vulkan, Metal and CPU, at exceptional performance. This brings the unprecedented quality of gpt-oss into the hands of virtually everyone - from local AI enthusiasts to enterprises doing inference at the edge or in the cloud. The unique inference capabilities of ggml unlock a vast number of use cases across the entire spectrum of consumer-grade hardware available on the market today - use cases that are impossible to support with any other inference framework in existence. Today, gpt-oss, trained with the MXFP4 format, effectively “leaps” over the existing resource barriers and allows us to experience SOTA AI quality on our own personal devices. The era of natively trained 4-bit local models has officially begun, and ggml will continue to lead the way forward!
19
140
1,021
113,972
Self retweeted
12 Apr 2025
do you guys realise everyone is making the same fucking thing
12 Apr 2025
wow, Canva just launched its own AI code generator (competitor to v0, Lovable, Bolt, etc.) Things just got interesting 👀
286
676
14,493
834,341
Self retweeted
The next level has been unlocked. America First 🇺🇸
93
204
643
31,374
Self retweeted
How to be more emotionally intelligent (without trying so hard) 🧵 for @threadapalooza
146
926
5,603
1,642,574
Self retweeted
24 Feb 2025
Introducing Claude 3.7 Sonnet: our most intelligent model to date. It's a hybrid reasoning model, producing near-instant responses or extended, step-by-step thinking. One model, two ways to think. We’re also releasing an agentic coding tool: Claude Code.
1,015
2,730
18,634
3,617,243
Self retweeted
SFT Memorizes, RL Generalizes. A new paper from @GoogleDeepMind shows that reinforcement learning generalizes across domains in rule-based tasks, while SFT primarily memorizes the training rule. 👀
Experiments
1️⃣ Model & Tasks: Llama-3.2-Vision-11B; GeneralPoints (text/visual arithmetic game); V-IRL (real-world robot navigation)
2️⃣ Setup: SFT-only vs RL-only vs hybrid (SFT→RL) pipelines; RL variants: 1/3/5/10 verification iterations (“Reject Sampling”)
3️⃣ Metrics: In-distribution (ID) vs out-of-distribution (OOD) performance
4️⃣ Ablations: Applied RL directly to base Llama-3.2 without SFT initialization; tested extreme SFT overfitting scenarios; compared computational costs versus performance gains
Insights/Learning
💡 Outcome-based rewards are key for effective RL training
🎯 SFT is necessary for RL training when the backbone model does not follow instructions
🔢 Multiple verification/Reject Sampling iterations help improve generalization by up to ~6%
🧮 Used outcome-based/rule-based reward focusing on correctness
🧠 RL generalizes in rule-based tasks (text & visual), learning transferable principles
📈 SFT leads to memorization and struggles with out-of-distribution scenarios
15
171
950
95,576
Self retweeted
4 Jan 2025
There's still no better explainer of what neural networks are and how they work than @3blue1brown's 8-video playlist. He recently added four new videos to this playlist: 1. Large Language Models explained briefly 2. Transformers (how LLMs work) explained visually 3. Attention in transformers, visually explained 4. How might LLMs store facts If you are interested in AI, I can't recommend these videos enough.
26
199
2,135
173,384
Self retweeted
7 Jun 2024
PvP - PPP. One of the most important things for me is to shift crypto from a PvP into a PPP mindset. The difference: - PPP (player pump player) communities want the last person in to win - PvP (player vs player) communities want the last person in to be their exit. PPP games are by definition infinite games, while PvP games are by default deeply finite games with a clear end point - the last sucker. But why is PvP so prevalent, to the point where most outsiders and even many insiders feel that crypto is fundamentally a PvP game? It is because PvP "communities" are super easy to form, since all the members need to agree on is that attracting more people to buy their coin quite literally makes them richer (thanks to magic money creation), and so they should do a lot more of that. In contrast, PPP communities are incredibly hard to build, since the core emphasis of the community cannot be on attracting others in the hopes of dumping on them, but rather on gaining more allies for building a common long-term future together. That said, the future of crypto is most certainly PPP, because even if 99.999% of new "communities" are PvP, the 0.001% that last and thrive are certainly all PPP. The hallmark of great communities like Bitcoin, Eth, Sol (and hopefully Jup) has been a community that is very aligned towards helping every single new entrant into the network *win* vs treating them as a pile of money to buy their bags.
PPP will win in the end, because: - It is the best mindset in crypto - combining the essence of communal vision and profiting - It is neither cynical nor delusional - instead having both idealism and pragmatism - It is the only way forward in crypto - because there is no community that can survive being eaten from the inside. Yet of course today, crypto has a really bad rap for being PvP, where the public perception is often that of greedy mofos creating and shilling magic internet money and saying anything in order to dump on the last suckers entering. The core problem with that perception, of course, is that there is a lot of truth to it, to the point that even the most earnest participants often think The thing we can do, of course, is to actively reject PvP communities and invest our time, effort and energy into cultivating real relationships and expertise in PPP communities, vs the transient man eat boy ones in PvP spaces. If we all eat each other vs grow each other, there’ll be nothing much to eat soon for anyone. Crypto is probably the one industry where there’s absolutely no limit to which game you as an individual choose to play, and which communities you choose to join. Choose PPP. Your mom will be proud.
1,060
1,555
5,122
1,728,054
31 Dec 2024
Interesting. Next to @pydantic's Pydantic AI, this is one of the more interesting abstractions. Need to check out which workflow makes the most sense.
31 Dec 2024
For months, we've worked on building @huggingface's new moonshot: agentic systems. So today we're very proud to announce the release of 𝚜𝚖𝚘𝚕𝚊𝚐𝚎𝚗𝚝𝚜! It's the simplest library we could make to let people build powerful agents. 💥 The main logic for agents fits in ~1000 lines of code. So it's really dead simple. 🧑‍💻 The main agent class is CodeAgent, an agent that writes its actions in code. That means, contrary to the standard set by OpenAI of writing tool calls as JSON blobs, this agent writes code snippets. It's much more natural for LLMs to write actions this way, and as a result performance is vastly improved. 🌍 It supports any LLM through @LiteLLM integration. 🛡️ We enabled secure code execution via @e2b_dev sandboxes.
1
4
704
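A rough sketch of the contrast drawn in the smolagents announcement above, between tool calls written as JSON blobs and actions written as code. This is not the smolagents API: the tools `get_price` and `multiply` and the helpers `run_json_action` and `run_code_action` are hypothetical names made up for illustration, and the emptied builtins are only a gesture at isolation, not a real sandbox.

```python
import json

# Hypothetical tools, defined here only for illustration.
def get_price(item: str) -> float:
    prices = {"apple": 0.5, "pear": 0.75}
    return prices[item]

def multiply(a: float, b: float) -> float:
    return a * b

TOOLS = {"get_price": get_price, "multiply": multiply}

def run_json_action(blob: str):
    """Dispatch one JSON tool call of the form {"tool": name, "args": {...}}."""
    call = json.loads(blob)
    return TOOLS[call["tool"]](**call["args"])

def run_code_action(snippet: str):
    """Run a code snippet that can see only the tools, then return `result`.

    Emptying __builtins__ is NOT a security boundary; real systems would
    use an actual sandbox for untrusted code.
    """
    namespace = {"__builtins__": {}, **TOOLS}
    exec(snippet, namespace)
    return namespace["result"]

# JSON style: one call per step; the caller threads values between steps.
price = run_json_action('{"tool": "get_price", "args": {"item": "apple"}}')
json_total = run_json_action(
    json.dumps({"tool": "multiply", "args": {"a": price, "b": 4}})
)

# Code style: a single snippet composes both tool calls.
code_total = run_code_action("result = multiply(get_price('apple'), 4)")

print(json_total, code_total)
```

In the JSON style each step needs a separate round trip, while the code style lets one action compose several tool calls, which may be part of why the announcement calls code the more natural format.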
Self retweeted
24 Dec 2024
Introducing ASAL: Automating the Search for Artificial Life with Foundation Models sakana.ai/asal/ Artificial Life (ALife) research holds key insights that can transform and accelerate progress in AI. By speeding up ALife discovery with AI, we accelerate our understanding of emergence, evolution, and intelligence, core principles that can inspire the next generation of AI systems! We proudly collaborated with MIT, OpenAI, Swiss AI Lab IDSIA, and Ken Stanley on this exciting project. Full Paper (Website): pub.sakana.ai/asal/ Full Paper (arXiv): asal.sakana.ai/paper/ Code: github.com/SakanaAI/asal/ In this work, we propose a new algorithm called Automated Search for Artificial Life (“ASAL”) to automate the discovery of artificial life using vision-language foundation models. Instead of tediously hand-designing every tiny rule of an ALife simulation, simply describe the space of simulations to search over, and ASAL will automatically discover the most interesting and open-ended artificial lifeforms! Because of the generality of foundation models, ASAL can discover new lifeforms across a diverse range of seminal ALife simulations, including Boids, Particle Life, Game of Life, Lenia, and Neural Cellular Automata. ASAL even discovered novel cellular automata rules that are more open-ended and expressive than the original Conway’s Game of Life. We believe this new paradigm may reignite ALife research by overcoming the bottleneck of manually designed simulations, thus advancing beyond the limits of human ingenuity.
75
626
2,827
749,969
24 Dec 2024
the next unlock is realizing that "ordinary" now is already a mystical experience. no fireworks and state chasing needed.
I can’t find it, but I saw some tweet recently that said something like “I thought the path would be about crazy mystical experiences, but it turns out it’s been mostly about feeling okay with my regular emotions” and I feel that.
102
23 Dec 2024
type(Christ_consciousness) === type(Buddha_nature) === type(Brahman)
1
1
144
Self retweeted
19 Dec 2024
ᴡᴀʟᴋɪɴɢ ᴀ ᴛʜᴏᴜꜱᴀɴᴅ ᴍɪʟᴇꜱ ᴡɪᴛʜ ᴍʏ 𝔪𝔬𝔫𝔰𝔱𝔢𝔯 ㅤ
47
854
5,953
130,392
Self retweeted
I'll get straight to the point. We trained 2 new models. Like BERT, but modern. ModernBERT. Not some hypey GenAI thing, but a proper workhorse model, for retrieval, classification, etc. Real practical stuff. It's much faster, more accurate, longer context, and more useful. 🧵
127
654
4,687
437,356
Self retweeted
19 Dec 2024
I feel like I missed the pdf wrapper bubble and now might miss the agent bubble. Gotta build quick
21
15
517
36,476
Self retweeted
I can't be the only one that sees this correspondence between the past and the present.
43
67
698
64,395
Self retweeted
28 Oct 2024
Tried to find the Rogan/Trump interview on YouTube but no matter what I search, it's not coming up. Would be beyond bonkers if they're actively trying to suppress it. Must be a glitch, right?
2,156
1,919
17,275
3,530,346
Self retweeted
The junior dev asked the senior dev, “Why are you pushing this code with no abstraction? What if you want to change it in the future?” The senior dev responded, “Then I will change it in the future.” In that moment the junior dev was enlightened.
119
781
12,636
1,103,692
Self retweeted
27 Sep 2024
Cultivating and exploiting the insecurities of developers is a massive growth opportunity and market. No wonder VCs are rushing in to get some of that sweet 100x arbitrage action. We need open source to push back with tools and education.
27 Sep 2024
The pendulum is swinging back
32
112
1,277
155,456