Oxpay retweeted
Big week ahead for HumanLayer. 👀 We’re making it easier than ever to prove you’re human online.
gatieme retweeted
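Lots of AI agent demos look cool, but fall apart the moment they hit production?

HumanLayer's open-source 12-factor-agents is about exactly that: how to turn an agent from a demo into an engineering system that can actually ship.

It doesn't teach you to "adopt a framework." Instead, it distills 12 principles for building reliable LLM applications:

1. Natural language → tool calls
2. Own your prompts
3. Own your context window
4. Tools are essentially structured outputs
5. Unify execution state and business state
6. Launch, pause, and resume with simple APIs
7. Contact humans with tool calls
8. Own your control flow
9. Compact errors into the context window
10. Small, focused agents
11. Trigger from anywhere, respond in place
12. Agent = stateless reducer

Its most valuable point: don't think of an agent as "one model, a pile of tools, and an infinite loop." The agents that actually ship tend to be mostly deterministic software engineering plus a small number of key LLM decisions.

Who is it for?
Developers currently building AI agents
Teams that want to wire agents into real business workflows
People who find LangChain / LangGraph too much of a black box
People who want to understand context engineering
People who want to take their AI application from an 80 to a 95

In one sentence: 12-factor-agents is not yet another agent framework. It reads more like an engineering design handbook for production-grade agents.

GitHub: github.com/humanlayer/12-fac…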
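To make factors 4, 7, 8, and 12 from the list above concrete, here is a minimal Python sketch. It is not code from the 12-factor-agents repo: `Event`, `TOOLS`, `reduce_step`, `run`, and `scripted_decide` are all hypothetical names, and the model is replaced by a scripted stand-in so the example runs offline. The idea it illustrates is that the model only emits a structured "next step", while ordinary deterministic code owns the loop, the tool dispatch, and the state.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass(frozen=True)
class Event:
    kind: str  # "user", "tool_call", "tool_result", "error", "human_request", "done"
    data: dict = field(default_factory=dict)

# The whole agent state is an append-only tuple of events.
State = tuple[Event, ...]

# Hypothetical tool registry; a real system would register its own tools.
TOOLS: dict[str, Callable] = {"add": lambda a, b: a + b}

def reduce_step(state: State, decide: Callable[[State], dict]) -> State:
    """Stateless reducer: takes the old state, returns a new one, mutates nothing."""
    call = decide(state)  # the only LLM decision; it comes back as structured output
    intent = call["intent"]
    if intent == "done":
        return state + (Event("done", call),)
    if intent == "ask_human":
        # Contacting a human is just another structured output.
        return state + (Event("human_request", call),)
    tool = TOOLS.get(intent)
    if tool is None:
        return state + (Event("error", {"message": f"unknown tool: {intent}"}),)
    try:
        result = tool(**call.get("args", {}))
    except Exception as exc:
        # Errors become a short event the next decision can see.
        return state + (Event("error", {"message": str(exc)}),)
    return state + (Event("tool_call", call), Event("tool_result", {"value": result}))

def run(state: State, decide: Callable[[State], dict], max_steps: int = 10) -> State:
    """Deterministic control flow: the loop, the step cap, and the pauses live in code."""
    for _ in range(max_steps):
        if state and state[-1].kind in ("done", "human_request"):
            break  # finished, or paused until a human replies
        state = reduce_step(state, decide)
    return state

def scripted_decide(state: State) -> dict:
    """Stand-in for a model call, so the sketch runs without any API."""
    results = [e for e in state if e.kind == "tool_result"]
    if not results:
        return {"intent": "add", "args": {"a": 2, "b": 3}}
    return {"intent": "done", "answer": results[-1].data["value"]}

final = run((Event("user", {"text": "add 2 and 3"}),), scripted_decide)
print([e.kind for e in final])
```

Because the state is plain data and `reduce_step` has no hidden memory, the run can be serialized at a `human_request` event and resumed later by appending the human's reply and calling `run` again.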
Replying to @tpierrain
It's true that it takes 30 seconds and is very handy. HumanLayer places the Dumb Zone at around 40% of the context. They most likely take that from Pocock's January article on the Ralph loop. So it's hard to get a real figure. The best option is to be able to work cold.
Jun 12
Replying to @vtahowe @shcallaway
humanlayer: outsource the thinking (to us)
The most important file in your repo isn't code. It's a markdown file most engineers spend five minutes on.

I've been shipping agents into production for a while now, and the pattern is always the same: teams obsess over model selection, prompt templates, and framework choice, then throw together a CLAUDE.md or AGENTS.md in five minutes and wonder why their agent keeps going sideways.

Here's what Anthropic's own engineering team wrote about context engineering for agents: CLAUDE.md files get loaded into context upfront, forming the agent's persistent understanding of your project. Everything the agent does (every file it reads, every decision it makes, every line it writes) is filtered through that context. A bad CLAUDE.md doesn't just waste tokens. It actively misdirects the agent on every single action. (anthropic.com/engineering/ef…)

And here's the finding that should change how you think about this: HumanLayer tested auto-generated CLAUDE.md files across a variety of repos. The LLM-generated ones actually hurt agent performance while costing 20% more in reasoning tokens. Agents spent 14-22% more tokens processing the instructions, took more steps to complete tasks, ran more tools, and didn't improve resolution rates at all. Auto-generated context files were worse than no context file. (humanlayer.dev/blog/skill-is…)

Let that sink in. The most natural thing to do, "let AI write its own instructions," makes the agent worse. Hand-crafted context beats generated context. Every time.

Why? Because effective context engineering is about what you leave OUT, not what you put in. A 200-line CLAUDE.md that covers your entire tech stack, coding conventions, testing rules, deployment procedures, and API patterns burns context on every session whether those rules are relevant or not. Your React patterns load when you're debugging a database migration. Your deployment rules load when you're writing unit tests. The agent drowns in irrelevant instructions.
The best practice from teams actually shipping agents at scale (Salesforce; the GitHub trending repo on agentic engineering, with 69 tips and input from the engineer who built Claude Code; and the practitioners I work with): Keep CLAUDE.md under 200 lines. Make it a router, not an encyclopedia. Point to task-specific docs that load on demand. If you do something twice, make it a slash command. If a rule only applies to one domain, put it in a skill file, not the root context. Swallow passing test output and only surface errors: 4,000 lines of passing tests flooding the context window causes the agent to hallucinate. (claudefa.st/blog/guide/devel…)

The parallel to Salesforce's 18x acceleration is exact. Their 231-day migration done in 13 days didn't come from a better model. It came from rule-based frameworks, reference implementations, and feedback loops, all of which are context engineering. They told the agent what "correct" looks like, gave it examples, and built a system where every PR review improved the next run. That's a CLAUDE.md philosophy, not a prompt philosophy. (salesforce.com/news/stories/…)

Context engineering is the new meta-skill of AI-assisted development. It can't be automated (automating it makes it worse). It requires deep understanding of your codebase (what matters, what doesn't, what the agent needs to know RIGHT NOW versus what it can discover on its own). And it's the single biggest determinant of whether your agent is a productivity multiplier or an expensive token-burning machine.

The irony: the skill that matters most in AI-assisted engineering is writing clear, structured prose in a markdown file. The age of "just write code" is over. Now you have to write the context that makes the code possible.

Your CLAUDE.md is your real codebase. Treat it like one.
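The advice to swallow passing test output and only surface errors can be sketched as a small wrapper around the test command. This is an illustration under assumptions, not a tool from any of the linked posts: `summarize_test_run` and `run_tests` are hypothetical names, and "surface errors" is approximated here by keeping only the tail of the output from a failing run.

```python
import subprocess
import sys

def summarize_test_run(returncode: int, output: str, max_lines: int = 40) -> str:
    """Return one line on success, or only the tail of the output on failure."""
    if returncode == 0:
        # Thousands of lines of passing output never reach the context window.
        return "tests passed"
    tail = output.splitlines()[-max_lines:]
    return f"tests FAILED (exit code {returncode}):\n" + "\n".join(tail)

def run_tests(cmd: list[str]) -> str:
    """Run a test command and hand the agent only the summarized result."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return summarize_test_run(proc.returncode, proc.stdout + proc.stderr)

# Usage with stand-in commands; a real call might pass a pytest or npm command.
print(run_tests([sys.executable, "-c", "print('4000 lines of dots')"]))
print(run_tests([sys.executable, "-c", "import sys; print('boom'); sys.exit(1)"]))
```

Taking the last `max_lines` lines is a crude heuristic. A runner-specific parser that extracts just the failing test names and tracebacks would keep the context even tighter.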
Hacker 2025 retweeted
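🧠 A "must-read guide" for AI agents: 12-factor-agents, which teaches you specifically how to take an agent from a toy demo to something you can actually deliver to customers in production.

23k GitHub stars. The author is Dex, founder of HumanLayer, who has worked deep in agent development for a long time, and it comes with a 17-minute talk from AI Engineer World's Fair.

github.com/humanlayer/12-fac…

Many people build agents on the "throw in a prompt and a pile of tools, then loop until the goal is met" paradigm, and the result breaks, runs out of control, and can't be delivered as soon as it meets real users.

Dex tried the mainstream frameworks (crewAI, LangChain, LangGraph, smolagents, and others) and found that the customer-facing agents actually running in production almost never use them. Most are built by the teams themselves, with LLM steps sprinkled in only at key points and the vast majority of the rest being ordinary software.

This guide distills that counterintuitive design approach into 12 actionable principles.

For example, factor 3, "own your context window," is the context engineering direction that is the hottest topic in the agent community right now, and it tells you explicitly how to decide what goes in, what stays out, and what gets compressed.

It isn't a framework you have to install; once you've read it, you can rework your own agent architecture. Anthropic's "building effective agents" is its companion reference, and reading the two together is about the most solid introduction to agent engineering available right now.

Thinking your agent architecture through is worth more than piling on tool calls.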
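As a rough illustration of "what goes in, what stays out, and what gets compressed," here is a minimal Python sketch of a hand-rolled context builder. It is not taken from the guide: the `build_context` function, the event dictionaries, the `debug` kind, and the truncation rule are all assumptions chosen to keep the example small.

```python
def build_context(events: list[dict], keep_recent: int = 3, max_chars: int = 200) -> str:
    """Serialize an event history into prompt text under explicit rules."""
    lines = []
    cutoff = len(events) - keep_recent  # events before this index count as "old"
    for i, event in enumerate(events):
        kind, text = event["kind"], event["text"]
        if kind == "debug":
            continue  # out: internal noise never reaches the model
        if i < cutoff and len(text) > max_chars:
            # compressed: old, bulky entries are cut down to a short prefix
            text = text[:max_chars] + " [truncated]"
        lines.append(f"<{kind}>{text}</{kind}>")  # in: everything else, tagged by kind
    return "\n".join(lines)

history = [
    {"kind": "user", "text": "deploy the service"},
    {"kind": "tool_result", "text": "x" * 500},
    {"kind": "debug", "text": "noise"},
    {"kind": "tool_result", "text": "ok"},
    {"kind": "error", "text": "timeout"},
    {"kind": "user", "text": "retry"},
]
print(build_context(history))
```

Truncation stands in for compression here. A production system might instead summarize old entries or drop errors that a later step has already resolved.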