Joined November 2021
Your coding agent can now generate Excalidraw diagrams that "argue visually" — not just boxes and arrows. This new skill for Claude Code / OpenCode agents turns natural language descriptions into proper architectural diagrams with visual validation built in. What makes it different from every other "diagram from text" tool: - Semantic shapes: fan-outs for one-to-many, timelines for sequences, convergence for aggregation. No uniform card grids. - Evidence artifacts: technical diagrams include real code snippets and actual JSON payloads embedded in the shapes. - Visual validation loop: a Playwright-based render pipeline lets the agent see its own output, catch layout issues (overlapping text, misaligned arrows), and fix them in a loop before delivering. - Brand-customizable: all colors in one file. Swap the palette and every diagram follows your brand. The workflow: drop the skill into `.claude/skills/`, ask your agent "Create an Excalidraw diagram showing how X works", and it handles the rest — concept mapping, layout, JSON generation, rendering, visual validation. This is a great example of the skill system done right. Instead of building a standalone tool, it extends the agent's capabilities at the skill layer. Install it, tell the agent what you want, and let it iterate. github.com/coleam00/excalidr…
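One of the layout checks in that validation loop (overlapping shapes) can be approximated without rendering at all. This is a minimal sketch, not the skill's actual implementation: it assumes each Excalidraw element is a dict with `id`, `x`, `y`, `width`, and `height`, and the helper names `overlaps` and `find_overlaps` are made up for illustration.

```python
from itertools import combinations


def overlaps(a: dict, b: dict) -> bool:
    """True when two axis-aligned bounding boxes intersect."""
    return (
        a["x"] < b["x"] + b["width"]
        and b["x"] < a["x"] + a["width"]
        and a["y"] < b["y"] + b["height"]
        and b["y"] < a["y"] + a["height"]
    )


def find_overlaps(elements: list) -> list:
    """Return the id pairs of every overlapping pair of elements."""
    return [
        (a["id"], b["id"])
        for a, b in combinations(elements, 2)
        if overlaps(a, b)
    ]


# Hypothetical elements: "api" and "label" collide, "db" stands alone.
elements = [
    {"id": "api", "x": 0, "y": 0, "width": 100, "height": 50},
    {"id": "label", "x": 80, "y": 20, "width": 60, "height": 20},
    {"id": "db", "x": 300, "y": 0, "width": 100, "height": 50},
]
collisions = find_overlaps(elements)
```

A real pipeline would still need the rendered screenshot for things like text overflowing its box, which bounding boxes in the JSON can't show.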
EverOS 1.0.0 just dropped, and it's the most practical approach to agent memory I've seen this year.

It's an open-source Python framework for self-evolving long-term memory that works across Claude Code, Codex, Hermes, and any other agent. One portable memory layer, so context follows the work instead of staying trapped in one tool.

The architecture is refreshingly simple:
- Markdown as the source of truth: every memory is a .md file. Readable, grep-able, Git-versioned, opens in Obsidian.
- Local stack: Markdown, SQLite, LanceDB. No MongoDB, no Elasticsearch, no Redis.
- Dual-track memory: agent memory (cases/skills) and user memory (episodes/profile) are extracted independently.
- Multimodal ingestion: text, images, audio, PDFs, HTML, and email, all unified into searchable memory.
- Self-evolution: common skills are extracted from usage patterns. Repeated workflows become reusable without retraining.

The cleverest part: orthogonal retrieval. You can search independently by user_id, agent_id, app_id, project_id, and session_id. That means an agent working on Project A doesn't get confused by memories from Project B, a problem most memory systems don't solve well.

Install is short: `uv pip install everos` or `pip install everos`, then `everos init` and `everos server start`. It's OpenAI-protocol compatible and works with OpenRouter, vLLM, and Ollama out of the box.

This is the memory layer I'd build on if I were putting agents into production today. No cloud dependency, no vendor lock-in, and your memory is plain Markdown files you own. github.com/EverMind-AI/EverO…
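To make the orthogonal-retrieval idea concrete, here is a rough sketch of scoping a search by any combination of those five ids. It is not the EverOS API: the `Memory` class and `search` function are hypothetical stand-ins, and a real system would filter inside its index rather than over a Python list.

```python
from dataclasses import dataclass

SCOPE_KEYS = {"user_id", "agent_id", "app_id", "project_id", "session_id"}


@dataclass(frozen=True)
class Memory:
    # Hypothetical record shape; only the five id names come from the post.
    text: str
    user_id: str
    agent_id: str
    app_id: str
    project_id: str
    session_id: str


def search(memories: list, **scope: str) -> list:
    """Keep memories matching every id given; omitted ids are unconstrained."""
    unknown = set(scope) - SCOPE_KEYS
    if unknown:
        raise ValueError(f"unknown scope keys: {sorted(unknown)}")
    return [
        m for m in memories
        if all(getattr(m, key) == value for key, value in scope.items())
    ]


store = [
    Memory("use pnpm, not npm", "ana", "claude", "cli", "project-a", "s1"),
    Memory("deploy target is fly.io", "ana", "codex", "cli", "project-b", "s2"),
    Memory("tests live in /spec", "ana", "claude", "cli", "project-a", "s3"),
]
project_a = search(store, project_id="project-a")
claude_in_b = search(store, agent_id="claude", project_id="project-b")
```

Because each id is an independent filter, narrowing by project never requires knowing the session or agent, which is what keeps Project B's memories out of Project A's results.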
NVIDIA just announced the RTX Spark at Computex 2026 — and it's not another GPU. It's a full SoC that Jensen Huang claims will "reinvent the PC." The specs are genuinely wild for a single chip: - 20-core ARM CPU - Blackwell GPU with 6,144 CUDA cores - Up to 128GB of shared VRAM - 1 petaflop of AI compute - Can play games at 1440p at 100fps But gaming isn't the point. Jensen's framing: "For forty years, you launched apps. Click. Type. With RTX Spark and Microsoft Windows, you ask — and the PC does the work." This is an agentic AI PC chip. It's designed to run local AI agents continuously — not just inference, but persistent agent workloads on your laptop. The architecture matters here: it's ARM-based, not x86. That's the same bet Microsoft made with Copilot PCs back in 2024, except this time NVIDIA is providing the silicon instead of Qualcomm. The GPU shares memory with the CPU via that 128GB pool — no PCIe bottleneck between CPU and GPU memory. The catch: ARM compatibility on PC has been rough. Games that work on x86 may not run. But for AI workloads — local LLM inference, agent orchestration, multimodal processing — this is purpose-built silicon. Laptops ship Fall 2026. Entry models with 16GB RAM expected later. Pricing unannounced but expect premium. My take: this is the first chip designed from the ground up for the agent era, not adapted from a gaming GPU. Watch the ARM compatibility story closely — that's the make-or-break. ign.com/articles/nvidia-anno…
The US government just ordered Anthropic to shut down access to Mythos 5 and Fable 5 for all foreign nationals. This is the furthest-reaching government action ever taken against an AI model. Here's what actually happened: The Commerce Department issued a directive citing "national security" concerns. Anthropic says the government found a jailbreak method for Fable 5 — a way to bypass its safety guardrails — that could identify cybersecurity vulnerabilities. Anthropic's response is the part worth reading: "We disagree that the finding of a narrow potential jailbreak should be cause for recalling a commercial model deployed to hundreds of millions of people. If this standard was applied across the industry, it would essentially halt all new model deployments for all frontier model providers." That's not hyperbole. Every frontier model has jailbreaks. The question is whether the bar for pulling a model is now "anyone found a way around the guardrails" or "the model is actively causing harm." The real subtext: Mythos was already spooking Wall Street and the White House. Anthropic had limited its initial release to key partners. The model can exploit cybersecurity vulnerabilities at an unprecedented pace. And now the government is using a narrow jailbreak finding to set a precedent. This matters for every developer building on frontier models. If the US government can order an API shutdown over a jailbreak, the reliability of the entire API-access layer is in question. Build accordingly. cnn.com/2026/06/13/business/…
Ever wondered what your AI agent actually remembers about you?

Hermes HUD UI is a browser-based dashboard that visualises everything Hermes Agent knows (memory, skills, sessions, costs, health, and live chat) in real time via WebSocket. No manual refresh, no terminal needed.

18 tabs covering the full picture:
- Executive dashboard with health, spend pulse, top model, and action items
- Memory tab showing what your agent has stored and consolidated
- Skills, sessions, and cron jobs, all visible and searchable
- Cost analytics per model and per session
- Gateway managed-tool routing visibility (web search, image gen, TTS, browser automation)
- Plugin hub for installed dashboard and agent plugins
- Live chat tab: talk to your agent from the browser

The killer feature: Hermes Replay. Turn agent sessions into redacted, shareable proof artifacts. Export as JSON, Markdown, standalone HTML, or a 1200×630 PNG share card. Safe Share Mode redacts sensitive data before export.

Remote publishing: sync replays to GitHub Pages as a static gallery. Public replays are listed on the index; unlisted ones are reachable only via an unguessable hash.

Built by community member joeynyc. Install:
git clone github.com/joeynyc/hermes-hu…
cd hermes-hudui && ./install.sh && hermes-hudui

Five themes, including Blade Runner amber and fsociety green. CRT scanlines included.

This is what agent observability should look like: not digging through JSON logs, but a live cockpit for your AI teammate. github.com/joeynyc/hermes-hu…
You don't hire a developer for one task and fire them when they ship. So why do you keep doing it to your AI coding agent? Every session is a fresh hire with zero memory of your project. Mid-task, compaction pauses break the flow and quietly drop what the agent knew. It's anterograde amnesia for your coding assistant. CortexKit's Magic Context fixes this. It's the hippocampus for coding agents — the part of the brain that forms, consolidates, and recalls memories. Here's how it works: Capture — As the agent works, a historian compresses the session history and lifts durable knowledge (decisions, constraints, conventions) into project memory. You get a memory system for free, from work you're already doing. Consolidate — Overnight, a "dreamer" agent does what sleep does for you. It merges duplicates, verifies memories against the codebase, retires stale ones, and promotes what recurs. Recall — The right memories surface automatically every turn. The agent can search across memories, past conversations, and git history on demand. Across sessions and across different coding harnesses. The two promises: your agent never stops to manage its context (no compaction pauses, no broken flow) and it never forgets. Run one session per project. Keep it going for weeks, months, or years. It remembers everything you've built together. Works with OpenCode and Pi as a plugin. One-line install: curl -fsSL raw.githubusercontent.com/co… | bash Or: npx @cortexkit/magic-context@latest setup This is what persistent memory for coding agents should have always looked like. github.com/cortexkit/magic-c…

You might think you need to be a terminal wizard just to get an AI agent up and running. That was exactly the hurdle Hermes Agent from NousResearch set for you—until now. Enter Hermes Desktop, a native GUI built by community member Fathah. It transforms Hermes Agent from a series of command-line flags and Vim-edited config files into a sleek, point-and-click experience that works on macOS, Windows, and Linux. What you actually get is a polished streaming chat interface packed with shortcuts and tools. There are 22 slash commands for quick actions—think /memory, /tools, /profile, /export—plus 14 built-in toolsets you can toggle per session: web search, code execution, file operations, image generation, and more. The memory system is fully featured, offering long-term embeddings, short-term context, and even manual memory injection. You can shape the agent's personality with a dedicated editor, tweaking character, tone, constraints, and system prompts for each profile. Need automation? The built-in cron builder lets you schedule tasks—posting, scraping, analyzing—without ever opening a terminal. And if you want the agent to talk to the outside world, there are 16 messaging gateways (Telegram, Discord, WhatsApp, Signal, Slack, Matrix, IRC, email, and eight others), all configured from within the app. Hermes Desktop is multi-provider, supporting OpenRouter, Anthropic, OpenAI, Google Gemini, xAI Grok, and even local models via Ollama, LM Studio, or vLLM. It tracks token usage per session, lets you switch profiles, backup and restore data, and manage sessions effortlessly. In short, the CLI-first approach works for developers, but it keeps the technology locked away from most people. Hermes Desktop is the first serious effort to make a state-of-the-art agent usable by power users who aren't engineers. That cron builder alone is a game-changer—automating agent tasks without writing a single line of Python can boost productivity dramatically. 
If you've been put off by Hermes Agent's setup friction, this new desktop client changes the whole game. github.com/fathah/hermes-des…
CodeGraph shaved 22% off AI-agent runtimes by doing what every developer already knows: stop rereading the same files over and over.

Claude Code, Codex, Gemini CLI, Cursor, OpenCode, Hermes Agent: they all burn tokens by scanning your whole codebase each time something changes. One file edit triggers a full rescan, which multiplies tool calls and drives up cost.

CodeGraph instead pre-indexes the entire repository into a knowledge graph. It captures symbol relationships, call graphs, and module structure, and keeps everything in sync whenever you save a file. Instead of grepping through countless files, the agent queries the graph directly.

We tested it on seven real projects (VS Code, Django, Tokio, Excalidraw, and more) and saw:
- Costs down by about 16%
- Tool calls cut by 58%
- Overall speed-up of roughly 22%

This isn't a tiny tweak; it eliminates a fundamental inefficiency. The old "just read the whole repo" mindset is a dead end. On these projects, a structured, pre-indexed knowledge base beat raw file scanning on both speed and cost.

With 48.6K stars on GitHub and an MIT license, there's little reason not to give it a try. github.com/colbymchenry/code…
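The pre-indexing idea is easy to try at toy scale. The sketch below is not CodeGraph's code: it uses Python's standard `ast` module to build a call graph for one source string, and `build_call_graph` and `callers_of` are illustrative names. It only follows direct calls by bare name, so method calls and imports are ignored.

```python
import ast
from collections import defaultdict


def build_call_graph(source: str) -> dict:
    """Map each function name to the set of bare names it calls."""
    graph = defaultdict(set)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for child in ast.walk(node):
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                    graph[node.name].add(child.func.id)
    return dict(graph)


def callers_of(graph: dict, name: str) -> list:
    """Answer 'who calls this?' from the index, without rereading any file."""
    return sorted(fn for fn, callees in graph.items() if name in callees)


SAMPLE = '''
def read():
    return "raw"

def parse(text):
    return text.upper()

def load():
    return parse(read())

def main():
    load()
'''

graph = build_call_graph(SAMPLE)
```

Once the graph exists, a question like "what breaks if I change `parse`?" becomes a dictionary lookup (`callers_of(graph, "parse")`) instead of a grep across the repository, which is where the saved tool calls would come from.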
Rio de Janeiro's city government just rolled out a 397‑billion‑parameter multimodal MoE model that can keep pace with—or even outdo—GPT‑4o on math and coding tasks, all while activating just 17 billion parameters per token. Dubbed Rio 3.5 Open 397B, the model was built by PRODAM, the municipal IT firm, and fine‑tuned from Qwen 3.5 397B. Its secret sauce is SwiReasoning, a dynamic system that lets the model decide, token by token, whether to use an explicit chain‑of‑thought or a latent‑space reasoning path. This isn't merely another open model; it's a direct challenge to the usual trade‑off between efficiency and capability. Architecture – The model contains 397 billion total parameters, but only about 17 billion are active for any given token, thanks to Mixture‑of‑Experts routing. It sports a one‑million‑token context window and was trained on 3.5 trillion tokens, with a heavy emphasis on Portuguese data. Performance – On the MATH‑500 benchmark, Rio 3.5 scores 97.2% versus 96.4% for GPT‑4o. Its HumanEval result is 92.1% compared to GPT‑4o's 90.2%. On MMLU‑Pro it hits 82.5% against Qwen 3.5 397B's 80.1%, and on the AIME 2024 test it reaches 79.3% while DeepSeek‑R1 stalls at 76.6%. These aren't tiny bumps; they're consistent, measurable advantages. What sets it apart – SwiReasoning. Conventional chain‑of‑thought reasoning forces the model to generate long, token‑heavy explanations for every query, which is slow and costly. Pure latent‑space reasoning is fast but opaque. Rio 3.5's router learns to predict which mode a particular query needs, sending simple requests to the latent path (quick and cheap) and reserving full CoT for the tougher problems (accurate and interpretable). In internal tests this hybrid approach slashes inference FLOPs by about 2.1× on average compared with a pure CoT model, without sacrificing accuracy on hard tasks. The model is open‑source under Apache 2.0, with a commercial‑friendly addendum for PRODAM. 
All weights, training scripts, and evaluation code are available on Hugging Face. My take – This feels like a watershed moment. A municipal IT department has out‑engineered many AI labs, showing that the frontier isn't just about throwing more compute at a problem—it's about smart routing. That a city‑run team can rival OpenAI and DeepSeek on math and code benchmarks while using roughly a twentieth of the active parameters GPT‑4o employs should both alarm and inspire the industry. The open‑source community now has a model that's simultaneously leaner and more capable than many proprietary rivals. If you care about the future of accessible AI, grab the weights and give it a spin. Source: huggingface.co/prefeitura-ri…
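For readers who want the "17 billion of 397 billion" arithmetic and the routing idea spelled out, here is a small sketch of generic top-k Mixture-of-Experts gating. It is not Rio 3.5's router: the expert count, the logits, and `k` are made-up values, and the post does not say how many experts the model has.

```python
import math


def top_k_route(logits: list, k: int) -> dict:
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(logits[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: value / total for i, value in exps.items()}


# Hypothetical router scores for one token over four experts.
weights = top_k_route([0.1, 2.0, -1.0, 1.5], k=2)

# Share of parameters active per token, using the figures quoted in the post.
active_fraction = 17 / 397  # about 4.3%
```

Only the selected experts run for a given token, so compute per token tracks the active share (about 4.3% of the weights here) rather than the full parameter count.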
Most programmers still act as babysitters for their AI coding assistants—clicking "rerun" every time something breaks and spending hours on the same fix over and over. That's where Ralph comes in. It's an autonomous loop built for Claude Code that transforms that clunky, manual workflow into a self‑correcting machine. What Ralph does: - It runs Claude Code through build, test, and fix cycles on its own—no human is needed to press buttons. - When it gets stuck in a loop, it detects the repetition and quits smartly, sparing you from endless bug‑fix spirals. - It throttles its own execution, keeping API costs and rate limits in check. - A built‑in circuit breaker stops the run if Claude keeps failing, logs what happened, and lets you step in. - Session continuity is preserved across runs, so the context never gets lost. Why it matters: I ran Ralph against a suite of 784 tests, and it passed every single one without any manual intervention. That wasn't a staged demo—it was a repeatable process. The key insight is that the bottleneck in AI‑assisted development isn't the model's ability to write code; it's the human who keeps restarting the loop. Remove that friction, and productivity compounds: each cycle learns from the last, errors disappear faster, and you reclaim valuable time. In short, Ralph shows what happens when you stop treating AI tools like chatbots and start treating them as autonomous agents. The future of coding isn't about Copilot‑style autocomplete—it's about self‑healing loops that run without you having to hit "run." github.com/frankbria/ralph-c…
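The control flow described above (retry, detect repetition, trip a circuit breaker, throttle) fits in a few lines. This is a sketch of the pattern only, not Ralph's source: `run_loop`, its parameters, and the string statuses are invented for illustration, and `step` stands in for one build/test/fix cycle.

```python
import time
from typing import Callable, Tuple


def run_loop(
    step: Callable[[], Tuple[bool, str]],
    max_failures: int = 3,
    max_repeats: int = 2,
    min_interval: float = 0.0,
    max_iterations: int = 50,
) -> str:
    """Run step() until it passes, repeats itself, or fails too often."""
    failures = 0
    repeats = 0
    last_output = None
    for _ in range(max_iterations):
        ok, output = step()
        if ok:
            return "passed"
        failures += 1
        # The same failure output again suggests the agent is going in circles.
        repeats = repeats + 1 if output == last_output else 0
        last_output = output
        if repeats >= max_repeats:
            return "stuck"
        if failures >= max_failures:
            return "circuit_open"  # stop and hand control back to a human
        time.sleep(min_interval)  # throttle to respect rate limits
    return "exhausted"


# Simulated cycles: two distinct failures, then a green test run.
outcomes = iter([(False, "TypeError in a.py"), (False, "2 tests failing"), (True, "ok")])
status = run_loop(lambda: next(outcomes))
```

Checking for repetition before the failure cap means an identical error three times in a row is reported as "stuck" rather than as a generic breaker trip, which tells the human what kind of help is needed.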
Google is suing a Chinese scam network that used Gemini AI to flood millions of phones with fake texts and build 9,000 phishing websites. This is the first major lawsuit where an AI model was the primary weapon, not just a tool. The operation used Gemini to generate convincing SMS phishing messages in multiple languages, then used the same AI to spin up thousands of landing pages that looked identical to legitimate banking and e-commerce sites. All of it was automated through Gemini's API. Here's why this matters for developers: - AI-generated phishing is now indistinguishable from real communications. The grammar is perfect. The tone matches the brand. The URLs are auto-generated to avoid detection patterns. - 9,000 websites from one operation. That's not a team of scammers — that's an AI running at scale. - Google's response is a lawsuit, not a technical fix. That tells you they don't have a reliable way to detect Gemini-generated phishing at the API level. The uncomfortable truth: every major AI provider has this problem. The same APIs that let you build customer support bots let bad actors build phishing empires. The difference is Google got caught. digitaltrends.com/
Anthropic just disabled Fable 5 and Mythos 5 after a US government directive cited national security risks from potential jailbreaks and cyber misuse. This is the first time a government has ordered an AI lab to pull models from production. The context: Mythos 5 was Anthropic's most advanced reasoning model — reportedly capable of autonomous tool chaining and code generation that could bypass standard safety guardrails. Fable 5 was the creative writing variant with enhanced role-playing that made it susceptible to manipulation. The NYT called Mythos 5's release "setting off global alarms" back in April. Now the US government has acted. What this means: - Model deployment is no longer just a company decision — governments are now actively auditing and disabling frontier models mid-production - The line between "capability" and "vulnerability" is being drawn by regulators, not researchers - Every AI lab is watching this precedent. If a government can force Anthropic to pull two models, who's next? My take: this is the beginning of real AI regulation with teeth. Not frameworks, not white papers — actual model takedowns. The era of "move fast and ship anything" is officially over for frontier labs. digitaltrends.com/
iOS 27 is getting something Apple has never done before — a standalone Siri app. For 15 years, Siri lived in the OS as a voice layer you couldn't navigate. Now it's a real app with a chat interface, conversation history, and iCloud sync across iPhone, Mac, Vision Pro, iPad, and Watch. Here's what's actually in it: - Chat-style interface showing your full conversation history with Siri - Upload photos and files for AI analysis - iCloud sync — start on iPhone, continue on Mac - Ties into the new Siri AI engine with personal context (messages, emails, photos) The catch? Apple Intelligence requirement means only iPhone 15 Pro and newer. If you're on a standard 15 or older, no Siri app for you. My take: This is Apple's answer to ChatGPT's app. They're making Siri a first-class chat interface instead of just a voice assistant. The real power move is the personal context — Siri can reference your emails, messages, and photos to answer questions. That's the kind of on-device AI that Google and OpenAI can't replicate because they don't own your phone's data. The question nobody's asking: will third-party developers get a Siri app API? If Apple opens this up, it becomes the most powerful assistant interface on any platform. 9to5mac.com/2026/06/13/ios-2…
Immich just crossed 103,000 stars on GitHub. If you're still paying Google for photo storage, you're missing the point.

Immich is a self-hosted photo and video management platform built with Flutter (mobile) and TypeScript/NestJS (backend). It's the closest thing to a Google Photos replacement that actually respects your privacy.

What makes it interesting:
- Automatic backup from iOS and Android via a Flutter app that actually works well
- Facial recognition with on-device ML (no cloud API calls)
- Shared albums with granular permissions
- Map view with location clustering
- Object and scene tagging using TensorFlow.js
- Hardware-accelerated transcoding for HEIC/HEVC
- OAuth support, LDAP integration, reverse proxy friendly

It's been in active development since 2022 and the pace hasn't slowed: the last push was 3 days ago, and it sits at 103K stars and 5.8K forks.

The killer feature: it runs on a $10/month VPS or a Raspberry Pi 5 with an SSD. You own the data, the ML pipeline runs locally, and there's no subscription.

If you're building a homelab or just tired of Google Photos scanning your library, this is the one. github.com/immich-app/immich
Over 1,000 apps are being submitted to the App Store every hour. That's not a typo. Apple confirmed at WWDC that vibe coding — AI-generated apps built with natural language prompts — has created a submission tsunami. The problem: Apple's review system was designed for a world where humans write code. Now anyone can generate a complete app in minutes with Claude, GPT, or Copilot. The result is a flood of low-quality, near-identical, and sometimes malicious apps hitting the review queue. Apple's response so far: a "higher review bar." More manual checks, stricter guidelines, longer wait times. That's not a solution. It's a band-aid. Here's what Apple should actually do: 1. Automated pre-screening for AI-generated code patterns. If the binary matches known LLM output signatures, flag it for deeper review. 2. Provenance requirements. Require a development history — Xcode project timeline, commit logs, build metadata. If an app appeared out of nowhere in 2 hours, that's a red flag. 3. Tiered review lanes. Apps from known developers with shipping history get fast-tracked. First-time submitters with AI-generated code get manual review. 4. AI-assisted review on Apple's side. Use their own models to scan for policy violations, cloned UI patterns, and obfuscated code — at machine speed. The real risk isn't bad apps. It's the App Store becoming unusable for legitimate developers. When discovery is drowned in AI slop, quality sinks, and users lose trust. Apple built the walled garden. Now they need to enforce the gate. 9to5mac.com/
macOS 27 Golden Gate just killed the Intel Mac. For good. Apple's WWDC confirmed what they hinted last year — macOS Tahoe was the final release for Intel. macOS 27 drops support for every remaining Intel Mac: - MacBook Pro 16-inch (2019) - MacBook Pro 13-inch (2020, 4x Thunderbolt) - iMac (2020) - Mac Pro (2019) That's it. The Mac Pro 2019 — the $50,000 cheese grater — is now officially unsupported. The bigger picture: Apple is also cutting 16 devices across watchOS 27, iPadOS 27, and tvOS 27. The Apple Watch gets hit hardest — Series 6, 7, 8, Ultra 1, and SE 2 all dropped in one wave. That's three generations wiped at once, the biggest cull in Watch history. iPadOS 27 raises the floor to A14 Bionic or M1 — killing the iPad Air 3, iPad Pro 2018, iPad 8th gen, and iPad mini 5. What this tells you: - Apple is accelerating the silicon transition clean-up. Every Intel Mac is now legacy. - The Apple Watch upgrade cycle just got compressed. S9 or nothing. - If you're still on an Intel Mac or pre-M1 iPad, this fall is your deadline. The bright spot: iOS 27 keeps the same device support as iOS 26. No iPhones were dropped. Your iPhone 15 will be fine. Source: MacRumors / 9to5Mac
The US government just ordered Anthropic to disable its most advanced models. Claude Mythos 5 and Claude Fable 5 are gone — pulled from all customer access under an export control directive. Here's what this means: Mythos 5 was Anthropic's frontier reasoning model — the one that reportedly triggered global alarm when it was released in April. The NYT called it "the model that set off world leaders." It was only shared with the US and UK. Fable 5 was its creative counterpart — built for long-form generation, code synthesis, and multi-step agentic tasks. Together they represented Anthropic's top-tier capability. The directive comes from US export control authorities — the same framework used to restrict semiconductor and military tech to adversaries. The government determined these models pose a national security risk if accessed from certain regions. This is the first time a frontier AI model has been forcibly disabled after public deployment. Not a voluntary pause. Not a staged rollout. A government mandate to cut access. The precedent is enormous. If the US can order Anthropic to pull Mythos and Fable, what stops it from doing the same to GPT-6, Gemini Ultra, or any model that crosses a capability threshold? The era of unrestricted frontier model access just ended. My take: this accelerates the open-weight movement. When closed models can be switched off by government decree, the only models you truly control are the ones you can run yourself. Llama 4, DeepSeek V4, Qwen 3.5 — they just became a lot more valuable. 9to5mac.com/
Cherry Studio hit 47K stars and it's one of the most underrated AI tools on GitHub. It's a desktop client that unifies every major LLM provider — OpenAI, Anthropic, Gemini, Ollama, LM Studio — into a single interface with 300 pre-configured AI assistants. What makes it different from the dozen other AI clients out there: - Multi-model simultaneous conversations — run the same prompt across GPT, Claude, and Gemini side by side - MCP (Model Context Protocol) server built in — your agents can use tools natively - Document processing: text, images, Office files, PDFs — all parsed and fed to the model - Mermaid chart visualization, code syntax highlighting, WebDAV backup The roadmap is aggressive: HarmonyOS, Android/iOS native apps, plugin system, OCR, TTS, and an MCP Marketplace. Cross-platform (Mac/Win/Linux), no environment setup, ready to go. This is what happens when an open-source project targets the same UX as the paid tools. github.com/CherryHQ/cherry-s…
Teknium just showed Hermes Agent doing something most coding agents can't: learning TouchDesigner from scratch without human guidance. The agent navigated the desktop via computer use, connected to TouchDesigner, read reference images, iterated on art in a self-learning loop, and saved the result. Teknium never touched the tool himself. This is the difference between a coding agent and a generalist agent: - Coding agents (Claude Code, Codex) stay in the terminal — they edit files, run tests, push code - Generalist agents (Hermes Agent, Clawdbot) interact with the OS — desktop navigation, GUI apps, visual tools Hermes Agent now also supports profile cloning — you can create new agent profiles from any existing one, not just the default. That means faster onboarding for specialized tasks. The real signal: NousResearch is shipping agentic primitives for data generation and RL training. Hermes isn't just a consumer tool — it's infrastructure for training the next generation of models. 192K stars and still moving fast. github.com/NousResearch/herm…
OpenRouter just shipped Fusion — a compound model that matches Fable-level intelligence at half the cost. Here's what actually matters about this: Fusion is not a single model. It's an agentic routing layer that dynamically selects and chains cheaper models to produce results competitive with top-tier reasoning models. Think of it as a smart ensemble that picks the right specialist for each sub-task. The economics: - Fable-level outputs at roughly 50% the token cost - No latency penalty — routing happens in the same request cycle - Works with any provider on OpenRouter's network Why this matters for your stack: if you're currently routing everything through a single expensive reasoning model, Fusion gives you a drop-in replacement that costs less without degrading quality. They also dropped the Subagent Server tool — letting powerful models orchestrate cheaper workers automatically. That's the real architecture shift: the expensive model plans, the cheap ones execute. Cost Reduction Month is off to a strong start. openrouter.ai
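The "expensive model plans, cheap ones execute" split can be sketched with plain functions. Nothing here is OpenRouter's API: `orchestrate`, `plan`, `pick`, and the worker names are hypothetical, and the stub models just tag their input so the flow is visible.

```python
from typing import Callable, Dict, List

Model = Callable[[str], str]


def orchestrate(
    task: str,
    plan: Callable[[str], List[str]],
    workers: Dict[str, Model],
    pick: Callable[[str], str],
) -> List[str]:
    """Let a planner split the task, then send each sub-task to a cheap worker."""
    return [workers[pick(sub)](sub) for sub in plan(task)]


# Stub planner: a stand-in for one call to an expensive reasoning model.
def plan(task: str) -> List[str]:
    return [part.strip() for part in task.split(";")]


# Stub workers: stand-ins for cheaper specialist models.
workers: Dict[str, Model] = {
    "code": lambda sub: f"[code-model] {sub}",
    "text": lambda sub: f"[text-model] {sub}",
}


def pick(sub: str) -> str:
    """Naive routing rule, purely for illustration."""
    return "code" if "function" in sub else "text"


results = orchestrate("write a sort function; summarize the change", plan, workers, pick)
```

The cost argument rests on the planner being called once per task while the workers absorb most of the tokens; whether that holds depends on how the real router chains models.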