Research shows that 26% of Agent Skills contains vulerabilities and 5% show likely malicious intent, NVIDIA just released a security scanner that helps you check if a skill is safe to install!
You might be thinking that it has been a while and LLM models are more secure and safe from prompt injections and data exfiltration attacks @wunderwuzzi23 has bad news for you all. Read the article in the reply!
imo there’s a pretty solid default recipe that everyone should use to optimize a system of
Agent = Model Harness
you should “train” both
1. Build v1 agent using a sensible base harness and some task specific prompting tools
2. Harness Engineering using eval tasks that roughly match prod
this is often enough - most companies can get acceptable perf doing this. then they collect traces, mine them for patterns, and make slight tweaks from there
3. SFT using data collected from traces) or synthetic data. Often is good candidate for “distillation tasks” to train a cheaper model while maintaining existing performance
4. RL if you have the bandwidth and ability and desire to create environments and designing rewards that represents the tasks you want your agent to be good at. Push past the SFT behavior of “copying” data from existing model to pushing past in some dimension
5. Light harness engineering again to squeeze any more juice (ex: slight prompting) using the trained model that’s better at your task distribution
this loop will largely be productized as a general purpose recipe for building and improving agents
we’re still in the earliest innings of the world’s companies getting comfortable with steps 1-2 of this loop. Harness engineering will probably be the dominant way ppl will optimize agents
but i expect a large number of companies to onboard through this entire loop on some trial project of interest in the next year
NVIDIA's LocateAnything is a new vision model for grounding and detection. Very performant and accurate!
> 10x faster than Qwen3-VL
> 138M queries 785M boxes
> GUI, OCR, docs, dense detection
> Free & open source
research.nvidia.com/labs/lpr…
We’ve shipped a security-guidance plugin for Claude Code that helps identify and fix vulnerabilities as you’re writing code.
Available for all Claude Code users. Install from the plugin marketplace (/plugins).
A 16-line bash script that builds its own coding agent harness, then uses the harness to build a backdoor file browser. No frameworks. No dependencies beyond `curl`, `jq`, and `python3`
A simple prompt and a network call and you have a T-1000 kind of “weapon” at your disposal.
We’re adding more visibility into where your Claude Code usage goes.
Run /usage to see a breakdown of what's driving it: parallel sessions, subagents, cache misses, long context, plus tips to optimize each.
Built a small AI project over the last few days as a submission for the @devpost Learning Hackathon for Spec Driven Development. The idea: provide a company's stock ticker and get analysis around buy/sell/hold signals - based on fundamentals and recent company and economic news
🔥 Redpoint's 2026 AI Market Update is a must-read. The TL;DR will make you rethink everything about software.
AI is splitting tech in two:
Public SaaS at its lowest multiples since 2007 (4.1x). Horizontal SaaS down 35%. The market isn't questioning this quarter — it's questioning survival.
Meanwhile AI-native companies hit $100M ARR in under 2 years. Cursor does $6.1M ARR/employee — 12x Salesforce.
The paradox: AI automates coding, yet dev job postings are rising. It's the ATM effect — lower costs → more output → more demand.
The timing matters. Across every platform shift, Years 4–5 produced the biggest companies. ChatGPT launched Nov 2022. We're in the window.
Key signals from 141 CIOs:
- 45% of AI budgets replacing existing software
- 54% consolidating vendors
- Only 3% expect AI to mean more vendors
Markets are pricing worst-case disruption for incumbents AND funding AI-native replacements at record valuations. Someone's wrong.
Software is being re-founded in real time.