Programming Language Nerd, PhD in CS; Principal ML Engineer @RedHat; prev: (Chicory, wazero) @Dylibso @Tetrateio @RedHat; @PapersWeLoveMI organizer

Joined June 2010
Pinned Tweet
Replying to @evacchi
I'm So Recursive, Even Dis Half-assed Acronym, Together
I, too, have been using computer software to write 100% of my code for several years now
Edoardo Vacchi retweeted
Suddenly every PR has a surprise new reviewer in Claude Code today.
Edoardo Vacchi retweeted
impromptu engineering
ok be careful with your fable 5. i just ran into a new problem that never happened before - it's now doing things i didn't ask
i just told it there is a bug in a repo. without checking with me, it did the fix AND raised a PR using my gh cli, claiming it's following CONTRIBUTING.md
the PR was not bad, but it's a big surprise as it
- assumed the credentials in my gh cli is the one i wanted to use
- assumed i would be happy with the change as is
- assumed i was ready to publish the work
that's a lot of assumptions from just me telling it to fix a bug. i now feel the need to explicitly tell it NOT to do extra things which increased cognitive load for me and it's not a good feeling
Edoardo Vacchi retweeted
Well, that's a first. An entrepreneur I met at a cafe a while back, who was a nice guy and said we'd perhaps follow up, just sent me an AI-generated pitch. This wiped out the personal connection I had in my mind with this person. How do people not realize AI does this?
maybe I should just schedule this retweet
"Anthropic just released Opus 4.7 and EVERYTHING YOU LEARNED about PROMPT ENGINEERING is now WORTHLESS" whew good thing I studied nothing
Edoardo Vacchi retweeted
You bring the agent. Red Hat AI makes it production ready. Red Hat Summit demo: Kagenti for multi-agent import and discovery, SPIFFE/SPIRE for cryptographic workload identity, MCP gateway for secure tool connectivity, OpenTelemetry for end-to-end trace of every reasoning step and tool call, and guardrails that block harmful responses. youtube.com/watch?v=PiLIijbt…

cool it only took them one entire release
if you are not conditionally branching, unconditionally jumping and read/writing to an infinite tape your agents, you are never gonna achieve turing equivalency
Edoardo Vacchi retweeted
forget loops if you don't switch case your agents you are ngmi
Edoardo Vacchi retweeted
I’d like to see more of AI helping build more reliable, performant, higher quality software rather than just the same crap (or worse) at a faster pace.
people seem to forget that before GitHub there was way more friction to contribute to an open source project. is it sad we are going back to that? sure. is there something we can do about it? maybe. for now, it is what it is
Edoardo Vacchi retweeted
🎉 The vLLM community just got a free course, built by @RedHat_AI with @DeepLearningAI. It walks through the full optimize → deploy → benchmark lifecycle for serving open models.
Three labs, each on a live vLLM server:
- Compress: quantize a Qwen model with LLM Compressor, then measure the size vs. accuracy tradeoff
- Serve: deploy with vLLM's OpenAI-compatible API and watch continuous batching, PagedAttention, and prefix caching in the live metrics
- Benchmark: simulate traffic with GuideLLM and check quality with lm-eval
A lot of the work went into visualizing what actually happens under inference, thanks to @cedricclyburn: how tokens flow through the model, how the KV cache grows in GPU memory, and what changes when you move from FP16 to INT8/INT4.
~1.5 hours, 9 lessons, 3 labs. Free on DeepLearning.AI.
📝 Read more: vllm.ai/blog/2026-06-03-deep…
New short course: Fast & Efficient LLM Inference with vLLM, built in partnership with @RedHat and taught by @cedricclyburn. Learn to quantize an open-source LLM, serve it with vLLM, and benchmark your deployment across speed, cost, and accuracy. Free to enroll: hubs.la/Q04jXfpR0
Edoardo Vacchi retweeted
experiment
well good morning to you @Substack
and I don't even recall subscribing to this digest 😅
Edoardo Vacchi retweeted
This one has been in the works for a while. @cedricclyburn teaching LLM inference, compression, and benchmarking with @vllm_project -- free course with @DeepLearningAI. Proud of this one.
Edoardo Vacchi retweeted
Congrats to the @googlegemma team on the Gemma 4 12B launch 🎉 Day-0 support on vLLM is ready to go. It's an encoder-free unified multimodal model — text, image, audio, and video all project straight into the LLM's embedding space, no separate vision or audio towers. 256K context, built-in thinking, native tool calling. Reasoning tool parsers (`gemma4`), vision, and audio all served through the OpenAI-compatible API. 🔗 Recipe: recipes.vllm.ai/Google/gemma…
Meet Gemma 4 12B! A unified, encoder-free multimodal model designed to bring high-performance intelligence directly to your laptop, and released under an Apache 2.0 license. Bridging the gap between edge efficiency and advanced reasoning. Here is what’s new with Gemma 4 12B: 👇