Soverius AI

Murat Sari retweeted

Soverius AI @SoveriusAI

Jun 12

What are Claude Code & Codex doing under the hood? At a minimum: Running a model, a loop, and a few tools. @wolfmanfx explains how to build a fully local coding agent from scratch: llama.cpp, a hand-written harness, and NVIDIA OpenShell as the sandbox. soverius.ai/blog/implementin…

Build a local AI coding agent from scratch

Build a local AI coding agent from scratch — Gemma 4 on llama.cpp, three tools, one loop — then learn why running it unsandboxed is dangerous and how NVIDIA OpenShell contains it.

soverius.ai

Murat Sari retweeted

Soverius AI @SoveriusAI

Jun 9

Tokens are becoming increasingly expensive. Are local LLMs ready to replace them? Join our free webinar to see for yourself. We'll demo local setups, tools, and hardware. You'll leave with a production-ready setup for immediate use. Register now! soverius.ai/webinars/softwar…

Murat Sari retweeted

Gokhan Avkarogullari @gavkar

Mar 20

We released two tech talks today going over how to take advantage of the new architecture, features and associated developer tools. 1) Accelerate your machine learning workloads with the M5 and A19 GPUs developer.apple.com/videos/p… youtube.com/watch?v=wgJX1Hnd…

Accelerate your machine learning workloads with the M5 and A19 GPUs - Tech Talks - Videos - Apple...

Discover how to take advantage of the M5 and A19 GPUs to accelerate machine learning. Find out how to use the Neural Accelerators inside...

developer.apple.com

177

15,956

Murat Sari retweeted

Jun 8

macOS 27 brings 3 new Metal command-line tools. Now you can capture, debug, and analyze the performance of your Metal app seamlessly from your favorite coding agents, scripts, or terminal. Learn more about these tools in developer.apple.com/videos/p… and developer.apple.com/videos/p…

121

10,635

Murat Sari retweeted

Jun 8

Building super fast experiences with Gemma just got easier. Gemma 4 MTP is now officially merged into llama.cpp. Developers can now pair MTP with Gemma 4 QAT for a fast, lightweight setup.

185

2,077

103,673

Murat Sari retweeted

Rainer Hahnekamp @rainerhahnekamp

Jun 3

Angular 22 is released today, and we have an "Angular 22 celebration discount". Our Modern #Angular Testing workshop takes place next week and will already include the relevant Angular 22 testing updates. 12% off with CGNE4E5G angulararchitects.io/en/trai…

899

Murat Sari retweeted

Georgi Gerganov @ggerganov

May 29

llama.cpp now has an official website: llama.app

Our goal is to make local AI accessible to everyone, and improving the user experience is a big part of that.

On the new landing page you’ll find a single-line cross-platform installer. The installation provides a single unified `llama` entrypoint which you can use to run/serve models and interface with 3rd-party agentic applications. While oriented towards simplified user experience, the new `llama` application also provides all the advanced functionality of the existing llama.cpp tooling with which experienced users are already familiar.

Also note that all GGUF models that you might have already downloaded with llama.cpp in the past will be automatically available to use without downloading again (they are stored in the common HF cache on your machine).

We have many improvements in the pipeline both at the UX and at the engine level and we plan to iteratively ship new things over the coming months. One of the main focuses will be seamless integration with local-friendly 3rd-party agents (such as Pi).

In the meantime, we’ll continue to listen for feedback from the community and adjust accordingly, so keep letting us know what you think and need.

483

2,980

164,068

Murat Sari retweeted

May 28

I've got an agent in a loop optimizing a renderer with the goal to minimize frame times (and tests to measure). It got times down from 88ms to 2ms and allocations down from ~150K to 500. Sounds good, right? Wrong. This is exactly why agent psychosis is a big fucking problem.

As an experiment, I rewrote the Ghostty core render state in Go, with access to identically laid out data structures as Ghostty and the exact same validation tests. I made a purposely naive renderer (simple, correct, but slow). 88ms per frame with 150,000 allocations (horrendous, lol)!

I then kickstarted a Ralph loop to bring the frame times down. I told it it can't modify input data structures or the public API or tests (they're correct), but it can do anything else it wants. It got to work. It has worked for about 4 hours. I've spent around $350 on this experiment so far.

The results? 88ms => 1.5ms 150K allocs => ~500 allocs

Incredible right? Nope. My hand-written renderer I ported has frame times (same benchmark) of ~20us (0.020ms) and 0 allocations in the update path.

This is the problem with psychosis and lacking systems understanding. If you don't understand the system, you're going to accept that this is an incredible result. If you understand the system, you'll see better solutions immediately and can do roughly 75x better on throughput.

The people who blindly trust agent output are in the former camp. They're sheeple, overdrinking from a fountain of mediocrity.

Standard disclaimer: I use AI all the time. I like AI. The point I'm making is to not blindly accept results. Think. Analyze. Learn.

308

979

8,937

791,354

Murat Sari retweeted

Soverius AI @SoveriusAI

May 28

For a long time, AI discussions were mostly about the model. 🤖 Which one is better? 📊 Which benchmark was beaten? 🚀 Which new release changed everything? But the model alone is not enough. The layer around it is just as important: That's the harness. 📖 soverius.ai/blog/what-is-a-h…

What is an Agent Harness?

Learn what an Agent Harness is, why it matters, and how it improves LLMs through tools, context management, agent runtimes, guardrails, and intent alignment.

soverius.ai

755

Murat Sari retweeted

Rainer Hahnekamp @rainerhahnekamp

May 28

As teased last week, my article about harnesses has finally been published. This is something I wanted to write for a long time. Not just what it is, but how the term evolved and why it is so important for AI in general. Big thanks to @wolfmanfx for the review & discussions.

Soverius AI @SoveriusAI

May 28

583

Murat Sari retweeted

Soverius AI @SoveriusAI

May 18

Using @CopilotKit & Google's #A2UI with local LLMs? 🤖 Following up on our article last week, here's the fully interactive, visual dashboard showcasing our benchmarks, created by @wolfmanfx: 👉 a2ui-bench.web.app/ The 3 key takeaways from our evaluation prompts 👇 (1/4)

916

Murat Sari @wolfmanfx

May 17

I need a new office chair; any tips are welcome 🙏

Murat Sari retweeted

Mike Hartington @mhartington

May 13

Welp it's official, I'm on the hunt for my next role. I learned a lot in the past 8 months, and I'm ready to get back out there for my next role. If you're looking for someone for devrel, I got over a decade of experience in developer relations and ready to help your team!

24,406

Murat Sari retweeted

Rainer Hahnekamp @rainerhahnekamp

May 13

And here’s our article for the week! 🚀 This time, we’re diving into A2UI - in my opinion, one of the most critical topics at the intersection of AI and Frontend development right now.

Soverius AI @SoveriusAI

May 13

AI offers massive potential for UIs. The interface can be dynamically generated for each user. A key technology is #A2UI, which is framework-agnostic. Our new post covers: ✅ What is A2UI? ✅ How does it work? ✅ How to run it fully autonomously. soverius.ai/blog/behind-the-…

768

Murat Sari retweeted

Soverius AI @SoveriusAI

May 13

A2UI, Generative UI, LLM Benchmark, Local LLMs, DGX Spark, vLLM, llama

Can local LLMs produce valid A2UI? I benchmarked 14 model and engine combos on a DGX Spark. The 14B cliff and the failure modes are not what you'd guess.

soverius.ai

889

Murat Sari retweeted

Rainer Hahnekamp @rainerhahnekamp

May 8

I published a deep dive into #NgRx SignalStore extensions. For me, SignalStore's extensibility is its outstanding feature. So I always enjoy recording videos about it. And this one is my most complete so far: youtu.be/dM9lfElODK4

2,067

Murat Sari retweeted

Soverius AI @SoveriusAI

May 8

RAG in a browser tab? No backend? Yes, it is possible. @wolfmanfx spoke at AI India on "From the AI Jungle to RAG in a Tab." If you want to build full RAG pipelines that run entirely client-side for ultimate privacy and speed, the slides are live: soverius.ai/talks/from-the-a…

From the AI Jungle to RAG in a Tab

“RAG”, “local AI”, “vector databases”… it’s easy to treat them as a checklist. I did too—until I tried to build a RAG app that runs entirely inside the browser. In this talk, I will show how a local-...

soverius.ai

Murat Sari retweeted

Rainer Hahnekamp @rainerhahnekamp

May 4

Our first video on AI Fundamentals is live! It’s the video version of last week’s article - covering principles, models & limitations. It’s a deep dive (~1hr), but chaptered so you can watch in stages. 📺 youtu.be/nwkrpyGh4F8

Fundamentals of AI Engineering: Models and Their Limits

Given the current pace of AI, it is hard to keep up. In this video,...

youtube.com

1,227

Murat Sari retweeted

Andrew Price @andrewpprice

May 1

x.com/i/article/205031117141…

156

691

308,085

Murat Sari retweeted

Rainer Hahnekamp @rainerhahnekamp

Apr 30

As promised last week, here’s our first article on AI. We start at the core: the model. What you need to know to understand how it works, its limitations, and how modern AI mitigates them. Long read, but long weekend ahead. So you have enough time 😅 soverius.ai/blog/ai-understa…

382