MoE support landed across three stacks this window: ggml-org/llama.cpp shipped Cohere2MoE parsers, vLLM added GraniteMoe work, and tokenspeed pushed MoE quantization.
Read on if you deploy inference, run quantized models, or optimize throughput. @vllm_project
Chroma pushed three Rust branches for agent infrastructure in one coordinated move.
Read on if you run vector DBs at scale, build agent backends, or deploy inference pipelines.
The Python-to-Rust migration for ML infra isn't slowing down.
Ultralytics v8.4.67 adds `ULTRALYTICS_SAFE_LOAD` for opt-in safer pickle loading. Model deserialization is a real attack surface, and this gives teams a guardrail when loading untrusted weights.
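For context on why pickle loading is risky and what "safer" can mean, here is a minimal sketch of an allowlist-based unpickler built only on Python's standard library. It is not Ultralytics' implementation and says nothing about how `ULTRALYTICS_SAFE_LOAD` behaves internally; the names `SAFE_BUILTINS`, `RestrictedUnpickler`, and `restricted_loads` are hypothetical and chosen for illustration.

```python
import builtins
import io
import pickle

# Hypothetical allowlist for illustration: only these builtins may be
# reconstructed by name while unpickling.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not on the allowlist."""

    def find_class(self, module, name):
        # pickle calls find_class whenever the stream references a global
        # (a class or function). Rejecting it here stops the payload
        # before the referenced callable can be invoked.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Drop-in analogue of pickle.loads() that enforces the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


if __name__ == "__main__":
    # Plain containers and allowlisted builtins load normally.
    print(restricted_loads(pickle.dumps([1, 2, range(3)])))

    # A hand-written payload that would call os.system is rejected.
    try:
        restricted_loads(b"cos\nsystem\n(S'echo hello'\ntR.")
    except pickle.UnpicklingError as err:
        print("blocked:", err)
```

An allowlist like this narrows what a pickle stream can reach, but it is a guardrail and not a sandbox: formats that store only tensors avoid the problem more thoroughly than filtering pickle globals does.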
LiteLLM created a `litellm_mcp_v2_rewrite` branch this window, a sign that major MCP architectural changes are underway.
Read on if you build agentic systems, manage multi-model backends, or ship LLM gateways.
MCP is becoming the standard for tool-model coordination. @LiteLLM
OpenClaw released v2026.6.7-beta.1, created a security audit branch, and has Telegram rich messages in development. Mem0 shipped v2.0.6 with contextual notices for scale thresholds and slow query detection.
Firedancer v1.0.0 hit testnet: a Solana validator rewritten from scratch in C, with the binary renamed from `fdctl` to `firedancer`.
Read on if you run validator infrastructure, deploy inference at scale, or track execution client diversity.
BerriAI/litellm v1.84.8 added cosign signing of Docker images for supply chain verification. A Cisco guardrail integration is also in active development across multiple branches. @LiteLLM