David Hendrickson

David Hendrickson

6,615 Photos and videos

Tweets

Pinned Tweet

David Hendrickson

@TeksEdge

Mar 27

🎗️ "Medium-Sized" LLM Burners Coming Soon! 🔥 This Could Make Local HyperToken Generation a Reality. ⚡️ NVIDIA’s worst nightmare? 😱 ⚙️ Application-Specific Hardware Taalas new PCIe ASIC board would burn the entire medium-sized Qwen 3.5-27B LLM straight into silicon 🤯 (already doing it with small models) Taalos said medium models on ASIC would be available in their lab by Spring '26. 💭Imagine: 🚫 No more loading weights 🚀 ~10,000 Tokens Per Second locally (Llama 3.1 8B already @ 17,000 tps) 💻 Standard PC slot, ultra-low power (10x less) 🔋 🌍 100% offline with no cloud, no GPU farm 💰 Reddit unit cost rumor $300 to $400 🖥️ Imagine HyperToken generation on your desktop. 🤖 AI agents that think at light speed. ⚡️ Are you ready? 👀

179

421

2,716

491,862

David Hendrickson

David Hendrickson

@TeksEdge

💡New RISC-V @SipeedIO K3 AI-Box Tested ⇢ Yes, it can inference! $600 Local AI option Been waiting for a good post, and @rcarmo came through. Here is what he found 👇 🛠️ HW: Sipeed K3 (32GB LPDDR5, 128GB UFS) 💰 $600 🔋 11W idle/22W load 𖣘 quiet 🖧 10GbE/WiFi 🐧 Bianbu Linux 🧠 Real Benchmarks (llama.cpp fork A100 cores) » TinyLlama: ~36 tps » Gemma4 E2B: ~13 tps » Qwen3.6-28B-REAP-A3B: ~7 tps » Gemma4 E4B: ~6 tps » Gemma 4 26B-A4B: ~5 tps Not bad for a little dev board imho Link to his article in ALT

ALT https://taoofmac.com/space/reviews/2026/06/11/1830

Rui Carmo ☯️@rcarmo

Jun 11

Wrote another thing about local models and dedicated hardware. Anyone want to send me a spare GB10? taoofmac.com/space/reviews/2…

1,346

David Hendrickson

David Hendrickson

@TeksEdge

See what it says at the bottom of the sale page that sells a similarly spec’d AI Max 395 PC w/2TB?

David Hendrickson

@TeksEdge

Jun 13

🚀 AMD Ryzen AI Halo is now available for pre-order! A compact local AI developer platform powered by the Ryzen AI Max 395: 🧠 128GB unified LPDDR5x memory ⚡ 40 CU Radeon 8060S graphics (RDNA 3.5) 📦 Run models up to 200B parameters locally 🖥️ Windows Linux support out of the box Build and deploy AI workflows without cloud dependency. Pre-order → @ amd

0:16

618

David Hendrickson

David Hendrickson

@TeksEdge

Holy 💩! 56,000 tps has to be a 🌎 Guinness World Record. 🤯 If true, this is the inferencing 👑! FPGAs applied to LLMs can only support smaller samller models. ASICs can handle a much larger LLM. What do you do with 56,000 tokens per second? ⏲️ Responds to a "Hi" prompt in 1 millisecond.

Fabio Guzman

@FGuzmanAI

17h

56,000 tokens/sec at just 80 MHz. 🤯 I burned a full Transformer with KV cache into a custom chip. Designed gate by gate as a 100% digital integrated circuit. Prototyped on a FPGA. (No GPU. No CPU) Just pure digital silicon running @karpathy microGPT, spelling out names on a tiny LCD. This is GateGPT 👇

0:24

1,880

David Hendrickson

David Hendrickson

@TeksEdge

Most are betting Fable 5 will be restored in the new week or so.

2,618

David Hendrickson

David Hendrickson

@TeksEdge

10h

More shots of the new AMD Strix Halo Dev PC going head to head with DGX-Spark (ROCm/Vulkan vs. CUDA).

Wccftech

@wccftech

Jun 13

AMD tackles NVIDIA's $4679 DGX Spark AI PC with its $3999 Ryzen AI Halo: Now available with 128 GB memory for blazing fast LLMs. 🔗 wccf.tech/1kmsb

5,822

David Hendrickson

David Hendrickson

@TeksEdge

13h

So people are asking where it says Gemini-3-Flash Kimi-K2.6 Deepseek-V4-Pro got within 1% of Fable 5 @ 50% the cost using the new Fusion tool, here it is from @OpenRouter's official blog post. 🧪 What is the DRACO Benchmark? 👇 DRACO (Deep Research Agentic Comparison) is a benchmark designed to test AI models on complex, real-world research tasks. Key details: 📍 Created by Perplexity AI Contains 100 deep research tasks across 10 domains (law, medicine, finance, tech, product comparison, etc.) Evaluates reasoning, tool use, synthesis, factual accuracy, and citation quality Uses a detailed rubric with ~39 weighted criteria per task ⚠️ Is it independent? No, it was developed by Perplexity, so it’s not fully independent. However, the benchmark is public (arXiv) and can be used by anyone.

6,357

David Hendrickson

David Hendrickson

@TeksEdge

14h

Is this for real? Did someone leak this intentionally?

Piotr Pomorski

@PtrPomorski

23h

Someone put Fable 5 on the pirate bay, 3.4TB 😂

Community note

A Pirate Bay search for "fable" returns no relevant results, and further, there is no "Other / Models" category as claimed in the screenshot. thepiratebay.org/search.php?q=f…

307

David Hendrickson

David Hendrickson

@TeksEdge

14h

🚨 Exciting news! 🔀 OpenRouter Fusion is now available and it might help while Fable 5 is restricted. 💰 A budget panel!! 👀 👀 👉 Gemini 3 Flash Kimi K2.6 DeepSeek V4 Pro ➤ scored within 1% of Fable 5 performance at roughly half the cost How to use it:. → Set model to "openrouter/fusion" 🔧 It runs server-side with tools enabled and supports custom panels. 🛡️ Set up ZDR for additional privacy!!!

OpenRouter

@OpenRouter

16h

Introducing the Fusion API, the smartest compound model in the market. Fusion achieves Fable-level intelligence at half the price. How it works 👇

7,456

David Hendrickson

David Hendrickson

@TeksEdge

16h

People are waking up to the reality that the AI we use every day can be taken away or priced out of reach.

338

David Hendrickson

David Hendrickson

@TeksEdge

17h

It's hard to believe that, with all the AI news this week, WWDC '26 was just last Monday.

275

David Hendrickson

David Hendrickson

@TeksEdge

17h

"Token austerity" advice coming from Big Tech is bizarre indeed. Strange times we live in for sure.

480

David Hendrickson

David Hendrickson

@TeksEdge

17h

🚀 📰 GLM-5.2 Status Update Here's where things stand right now: ✅ Available today! → GLM Coding Plan users (Lite, Pro, Max, Team) ⏳ API Chatbot → Launching next week ⏳ Open Source (MIT) → Releasing next week on Hugging Face 📊 Currently on AgentArena ❖ GLM-5.2 brings strong coding performance 1M context. Full public release is coming soon.

Z.ai

@Zai_org

Jun 13

Intelligence should be open, accessible, and ready to build with, empowering every developer, everywhere. GLM-5.2 is now available to all GLM Coding Plan users, including Lite, Pro, Max, and Team plans. docs.z.ai/devpack/latest-mod… As our new flagship model, GLM-5.2 delivers powerful coding capabilities, usable 1M-context support, and continued strengths in long-horizon tasks. API and Chatbot services will launch next week. The model will also be officially open-sourced next week under the MIT License. The future of AI is open, and it belongs to the people.

115

10,085

David Hendrickson

David Hendrickson

@TeksEdge

Jun 13

Maybe this is a better deal.

David Hendrickson

@TeksEdge

Jun 12

At least the cable is included in the price! Now can it run MiniMax M3?

18,268

David Hendrickson

David Hendrickson

@TeksEdge

Jun 13

Best open source model today. Now just need 256GB of VRAM to run it.

David Hendrickson

@TeksEdge

Jun 12

📊 With MiniMax M3 open source now out, here is what to expect on quants and sizes, including VRAM needed: MiniMax M3 (428B MoE, ~23B active) 🔥 GGUF Size Estimates Q8_0 → ~430-450 GB Q6_K → ~340-360 GB Q5_K_M/XL → ~280-310 GB Q4_K_M/XL → ~220-250 GB (Best balance) Q3_K_XL → ~170-200 GB Q2_K → ~110-140 GB Last resort Very efficient due to extreme sparsity! Practical local runs will need high-VRAM setups (multiple 5090s or better).

12,191