gabs

Tweets

Pinned Tweet

gabs

@gabsprogrammer

May 2

x.com/i/article/205060677170…

gabs retweeted

Mike Bradley

@The_Only_Signal

This is not the GMKtec EVO-X2; it’s the Ryzen AI Halo. Also, in no way does it replace a $200-per-month subscription to a 2T-parameter frontier cloud AI model. All of this stuff has value, but we have to try to be accurate about what that value is, or people will buy it and say they got rugged. That’s not helping anybody.

starmex

@starmexxx

Jun 13

AMD CEO LISA SU HELD A MINI PC ON STAGE THAT RUNS A 235B MODEL AND REPLACES YOUR $440/MONTH AI STACK
amd's ryzen ai max 395 is the first x86 chip that runs a 200 billion parameter model on one piece of silicon. cpu and gpu share 128gb of unified memory, no separate graphics card needed
the gmktec evo-x2 runs qwen3 235b fully, deepseek v3 comfortably and llama 3.3 70b with headroom. on linux you get 110gb of usable vram out of 128gb
amd claimed the chip beat an nvidia rtx 5080 by more than 3x on deepseek r1 inference. a lunchbox sized pc outrunning a $1,000 discrete gpu on a real ai workload
a heavy ai user pays $200 for claude code max, $200 for chatgpt pro, $20 for cursor and $20 for gemini. that's $5,280 a year and the box pays itself off in 9 to 10 months
install ollama, pull the model, point claude code at localhost. same interface, nothing leaves the machine, nothing costs per request
bookmark this and read the article below
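
The subscription arithmetic in the post above can be checked in a few lines of Python. This is only a sketch: the monthly prices are the ones quoted in the post, but the hardware price is not stated there, so the function names and the implied price range are illustrative assumptions derived solely from the "9 to 10 months" payback claim.

```python
# Monthly subscription prices as quoted in the post (USD).
subscriptions = {
    "Claude Code Max": 200,
    "ChatGPT Pro": 200,
    "Cursor": 20,
    "Gemini": 20,
}

monthly_total = sum(subscriptions.values())  # total spend per month
annual_total = monthly_total * 12            # total spend per year


def payback_months(hardware_price, monthly_spend):
    """Months until avoided subscription fees equal the hardware price.

    hardware_price is a hypothetical input; the post does not state one.
    """
    return hardware_price / monthly_spend


def implied_price_range(monthly_spend, months_low, months_high):
    """Hardware price range implied by a claimed payback window."""
    return monthly_spend * months_low, monthly_spend * months_high


if __name__ == "__main__":
    low, high = implied_price_range(monthly_total, 9, 10)
    print(f"monthly: ${monthly_total}, annual: ${annual_total}")
    print(f"implied hardware price: ${low} to ${high}")
```

The calculation only compares prices. It says nothing about whether a local model matches the quality of the hosted services, which is the objection raised in the reply above.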

gabs retweeted

Ahmad

@TheAhmadOsman

Jun 13

Replying to @JefinhoMenes

Anthropic is evil and this is part of their plan x.com/TheAhmadOsman/status/2…

Ahmad

@TheAhmadOsman

Jun 12

x.com/i/article/206529156971…

gabs retweeted

NVIDIA AI

@NVIDIAAI

Jun 10

Congrats to @GoogleDeepMind on the launch of DiffusionGemma. The model generates 256 tokens in parallel per step, delivering 150 TPS on DGX Spark, and 1,000 TPS on a single H100.
We're supporting it from day one with:
• BF16 and NVFP4 checkpoints on @huggingface🤗
• Free GPU-accelerated endpoints on build.nvidia.com
• @vllm_project support with FP8 precision
Get started with DiffusionGemma on NVIDIA: nvda.ws/43ro19u

Try NVIDIA NIM APIs

Experience the leading models to build enterprise generative AI apps now.

build.nvidia.com

Google AI Developers

@googleaidevs

Jun 10

DiffusionGemma, our experimental open model released under an Apache 2.0 license, explores text diffusion, an exceptionally fast approach to text generation.
Here’s how DiffusionGemma accelerates development:
• Faster token output: By shifting the bottleneck from memory bandwidth to raw compute, the model generates up to 4x faster token output on dedicated GPUs
• Accessible hardware footprint: Activates just 3.8B parameters during inference, fitting comfortably within 24GB-VRAM high-end consumer GPUs when quantized
• Novel workflows: Parallel token generation enables self-correction, making it ideal for code infilling, in-line editing, and non-linear structures
DiffusionGemma prioritizes speed over raw quality and accelerates best on compute-bound hardware (like @NVIDIAAI GPUs). Standard @GoogleGemma 4 remains recommended for production quality and memory-bound devices.

gabs retweeted

Ahmad

@TheAhmadOsman

Jun 10

Replying to @Shaughnessy119

This year opensourceaimustwin.com/?sha… x.com/theahmadosman/status/2…

Opensource AI Must Win

Civilizational intelligence infrastructure must remain free to study, build, deploy, and run, not rented from closed institutions.

opensourceaimustwin.com

Ahmad

@TheAhmadOsman

Feb 10

A frontier opensource lab in the West will be born this year. Zero doubt. It requires serious capital, like I’ve said before. Working on it. One day I’ll tell the story of how it started in a basement and ended at the frontier.

gabs retweeted

mr-r0b0t

@mr_r0b0t

Jun 9

If you're running with 24GB VRAM or unified memory, this is for you! x.com/mr_r0b0t/status/206436…

mr-r0b0t

@mr_r0b0t

Jun 9

x.com/i/article/206435307716…

gabs

@gabsprogrammer

Jun 9

Hello Everyone 👋
I’m building DreamOS, a minimal Linux-based operating system focused entirely on local AI inference. It boots directly into a dashboard for chatting, model management, benchmarks and live hardware metrics. The core control plane is written in Rust.
DreamOS uses the official NVIDIA Linux/CUDA stack, while owning the layers above it: hardware-aware configuration, model residency, KV-cache management, inference scheduling, context handling, telemetry and automatic backend tuning.
The current development target is Qwen3.5-9B Q4_K_M on an RTX 4060 with a 32K context. Initial runtime experiments have reached up to approximately 45 tokens/s, but I’m still building matched, reproducible benchmarks before claiming a definitive improvement over standard llama.cpp.
I’m now profiling the exact CUDA decode bottlenecks, experimenting with custom Q4_K kernels, KV-cache compression, speculative decoding and persistent execution.
There is also a longer-term bare-metal DreamOS research track, but the practical product comes first: making local models run faster, remain loaded, consume less memory and feel like a complete AI-native system rather than another application running on a general-purpose desktop OS.
*Illustrative image

gabs

@gabsprogrammer

Jun 9

If Brazil really does block AI services because of the election, which would be terrible, at least we have the alternative of helping people understand that learning about Local AI is the future. More and more models are getting better, faster, and less dependent on running on "supercomputers".

gabs retweeted

Mike Bradley

@The_Only_Signal

Jun 7

Geeking out about local AI and the mission of Light Heart Labs with @liamjf444 on his Podcast 💪❤️ youtu.be/LfnnwDQnQ-Y?si=vGyP…

Michael Bradley is Democratizing Access to Local AI

Artificial intelligence is becoming increasingly powerful, but most...

youtube.com

gabs retweeted

Ahmad

@TheAhmadOsman

Jun 7

Fun LLM Question from Mike today
> Let’s say Thanos snaps away every model from existence except Qwen3.5-2B
> Ahmad, what would you do?
Assumptions & Clarifications
1. All papers, tech reports, synthetic datasets, HF repos, checkpoints, eval harnesses, and research artifacts are gone from existence
2. I still have my own knowledge and experience, and all my hardware
3. Inference Engine for Qwen3.5-2B is available to me
What I would do…
Phase I:
- Immediately write down every training, inference, evaluation, data generation, and scaling trick I can remember before I forget details
- Build a highly-deterministic synthetic data harness around Qwen3.5-2B
- Generate purpose-specific datasets, starting with synthetic data generation
- Finetune Qwen3.5-2B for generating better synthetic data
Phase II:
- Finetune for specialized versions on coding, research, agents, and kernels using further generated synthetic data from the finetuned model
- Create automated research loops that continuously improve datasets and training runs
- Scale what works and stop what doesn’t
Phase III:
- Train a 9B-class model using the above
- Finetune a highly specialized 9B-class model that dominates a narrow domain, focusing on high-return $$$ markets
Phase IV:
- Raise capital to buy massive amounts of GPUs
- Hire researchers
- Repeat with scaling up as the primary goal
Kinda simplified plan but I thought this was an interesting one to share

gabs retweeted

Mike Bradley

@The_Only_Signal

Jun 6

We should switch to certification and licensing systems for professions. You go through standardized and recognized tests of competency in a field or area, and if you have the skills and abilities and knowledge it should be wholly irrelevant how you learned it all and got there.

gabs retweeted

Mike Bradley

@The_Only_Signal

Jun 6

Colleges and universities are overpriced and often unnecessary middlemen for what real learning and growth looks like in 2026.

gabs

@gabsprogrammer

Jun 5

It was good enough for me to clear up some doubts before adapting it to my own kernel.

Elliot Arledge

@elliotarledge

Jun 5

if you haven't gone through it yet, i highly recommend

gabs retweeted

Ahmad

@TheAhmadOsman

Jun 5

Finally, today's Nemotron 3 Ultra release makes me very hopeful for the future of Opensource AI
Jensen knows that this is important to keep the powers in check, and I believe he's sincere in his answer to me that there will be continuity to the Nemotron Coalition releases
Big W

Ahmad

@TheAhmadOsman

Mar 17

I asked Jensen whether we will see more Nemotron models or if the recent releases were just to prove NVFP4 training works

gabs retweeted

Markets & Mayhem

@Mayhem4Markets

Jun 5

This is what progress looks like, friends. 😎 Let's hope we get better support for our RTX Blackwell cards and DGX Sparks soon!

NVIDIA AI

@NVIDIAAI

Jun 5

Replying to @Mayhem4Markets @LLMWildling @TheAhmadOsman @nvidia

We see you and flagged to the team.

gabs retweeted

TeslaZoa

@TeslaZoa

Jun 5

🚨Jensen Huang gifted Faker a one-of-a-kind graphics card personally signed by him. “Only one in the world. This might be worth a million dollars. I might have to keep this now.” The king of AI handing a legendary gift to the king of League. A truly iconic moment.

Community note

The signed graphics card was not gifted to Faker but raffled to a fan after being signed by both Jensen Huang and Faker. koreaherald.com/article/107649… en.sedaily.com/technology/202…

gabs retweeted

Google Gemma

@googlegemma

Jun 3

Meet Gemma 4 12B! A unified, encoder-free multimodal model designed to bring high-performance intelligence directly to your laptop, and released under an Apache 2.0 license. Bridging the gap between edge efficiency and advanced reasoning. Here is what’s new with Gemma 4 12B: 👇

gabs retweeted

Ahmad

@TheAhmadOsman

Jun 3

14x RTX 3090s
Qwen 3.6 27B
Running 42 agents IN PARALLEL at full 256k context
- exl3 6bpw
- fp8 KV Cache
- Aphrodite Inference Engine w/ tp=2, pp=7
The world of agents will run locally btw
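
The parallelism settings quoted in the post line up with the GPU count: when tensor parallelism and pipeline parallelism are combined, the number of devices used is normally the product of the two degrees. A minimal Python sketch of that check follows. The function names are illustrative and are not part of Aphrodite or any other engine's API.

```python
def gpus_required(tensor_parallel, pipeline_parallel):
    """Devices needed for a combined tensor-parallel / pipeline-parallel layout.

    Each pipeline stage is sharded across `tensor_parallel` devices,
    so the total is the product of the two degrees.
    """
    if tensor_parallel < 1 or pipeline_parallel < 1:
        raise ValueError("parallel degrees must be positive integers")
    return tensor_parallel * pipeline_parallel


def layout_fits(available_gpus, tensor_parallel, pipeline_parallel):
    """True if the layout uses exactly the available devices."""
    return gpus_required(tensor_parallel, pipeline_parallel) == available_gpus


if __name__ == "__main__":
    # Values quoted in the post: tp=2, pp=7 on 14 GPUs.
    print(gpus_required(2, 7))
    print(layout_fits(14, 2, 7))
```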

gabs retweeted

njdeprê - marlon

@njdmarlon

Jun 2

Neymar via Instagram. The last dance. 🙏🏾🇧🇷 Let's go get it

gabs retweeted

NVIDIA AI

@NVIDIAAI

Jun 1

Introducing Cosmos 3: Our latest frontier model for Physical AI
Cosmos 3 is the world’s first fully open omnimodel with native vision reasoning, world and action generation. Today we’re releasing Super (32B) and Nano (8B) variants.

gabs retweeted

X Freeze

@XFreeze

May 30

Entire world: We need more GPUs
Meanwhile, Jensen Huang:
