from prompt to context to harness engineering.
three terms keep coming up in AI engineering, and they get conflated all the time. here is the cleanest way to understand what each one is and how they fit together.
prompt engineering is the message.
the model has no memory of anything before this single call, so the prompt has to carry the full universe of what it needs to know. that means a role, some background, the instructions, a few examples, and a format.
these get assembled into one input and sent to the model. when the output falls short, the skill is figuring out which ingredient is actually letting you down, not rewriting the instructions every time.
the unit of work is one input.
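a minimal sketch of that assembly step, assuming the five ingredients are plain strings. the `PromptParts` class, the `build_prompt` function, and the `## name` section labels are illustrative choices, not a standard format:

```python
from dataclasses import dataclass, field


@dataclass
class PromptParts:
    """the five ingredients named above; all field names are illustrative."""
    role: str
    background: str
    instructions: str
    examples: list[str] = field(default_factory=list)
    output_format: str = ""


def build_prompt(parts: PromptParts) -> str:
    """assemble the ingredients into one input, skipping empty sections."""
    sections = [
        ("role", parts.role),
        ("background", parts.background),
        ("instructions", parts.instructions),
        ("examples", "\n".join(parts.examples)),
        ("format", parts.output_format),
    ]
    return "\n\n".join(f"## {name}\n{body}" for name, body in sections if body)


prompt = build_prompt(PromptParts(
    role="you are a support triage assistant.",
    background="tickets come from a billing product.",
    instructions="classify the ticket as 'bug', 'billing', or 'other'.",
    examples=["ticket: card charged twice -> billing"],
    output_format="reply with one lowercase word.",
))
print(prompt)
```

keeping each ingredient as its own field makes it easier to change one at a time when the output falls short, instead of rewriting everything at once.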
context engineering is the memory.
across multiple steps, the window is finite and the information available is not, which forces a curation step. without it, important details get buried under stale tool outputs and old turns, and the model's attention degrades on the things that actually matter.
a curator selects what stays, compresses what is useful but bulky, and drops the rest. each step's output then feeds into the next step. good curation is more about knowing what to throw away than about packing more in.
the unit of work is what stays in the window, step by step.
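one way to picture the select / compress / drop step, as a rough sketch. everything here is an assumption made for illustration: the `Item` shape, the integer `priority` score, counting words instead of real tokens, and truncation standing in for actual summarization:

```python
from dataclasses import dataclass


@dataclass
class Item:
    """one piece of context, e.g. a past turn or a tool output (hypothetical shape)."""
    text: str
    priority: int  # higher = more important; how to score this is left open


def rough_tokens(text: str) -> int:
    """crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def compress(text: str, max_words: int) -> str:
    """placeholder compression: truncate. a real system might summarize instead."""
    words = text.split()
    if len(words) <= max_words:
        return text
    return " ".join(words[:max_words]) + " ..."


def curate(items: list[Item], budget: int, max_words_per_item: int = 20) -> list[str]:
    """keep the highest-priority items that fit the budget, compressing bulky ones."""
    kept: list[tuple[int, str]] = []
    used = 0
    # visit the most important items first, but remember original positions
    ranked = sorted(enumerate(items), key=lambda pair: -pair[1].priority)
    for index, item in ranked:
        text = compress(item.text, max_words_per_item)
        cost = rough_tokens(text)
        if used + cost > budget:
            continue  # drop: it does not fit in what is left of the window
        kept.append((index, text))
        used += cost
    # restore chronological order so the window still reads as a history
    return [text for _, text in sorted(kept)]


history = [
    Item("user wants a refund for order 1182", priority=3),
    Item("tool output: " + "log line " * 50, priority=1),
    Item("policy: refunds allowed within 30 days", priority=2),
]
window = curate(history, budget=20)
print(window)
```

with a budget of 20 the stale tool output is dropped entirely. with a larger budget it would survive in compressed form.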
harness engineering is the machine.
on its own, a model just generates text. the harness is what turns it into something that can take actions, check its own work, and recover when a step goes wrong.
the full loop has three phases:
- gather pulls together everything the model needs
- act runs the model and calls tools or sub-agents
- and verify checks the output with tests or a judge
on failure, the whole loop retries with updated context, which is the entire difference between calling an API and running an agent.
the unit of work is the machine itself.
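the loop above can be sketched in a few lines. this is a toy version under stated assumptions: `run_harness`, `fake_model`, and `check_sum` are hypothetical names, the model is any function from prompt to text, and the verifier returns a pass flag plus feedback. no real LLM API is called:

```python
from typing import Callable


def run_harness(
    task: str,
    model: Callable[[str], str],
    verify: Callable[[str], tuple[bool, str]],
    max_attempts: int = 3,
) -> str:
    """gather -> act -> verify, retrying with feedback added to the context."""
    notes: list[str] = []  # context carried between attempts
    for attempt in range(1, max_attempts + 1):
        # gather: curate context and build the prompt for this attempt
        prompt = "\n".join([f"task: {task}", *notes])
        # act: call the model (tool and sub-agent calls would also happen here)
        output = model(prompt)
        # verify: check the output with tests or a judge
        ok, feedback = verify(output)
        if ok:
            return output
        # on failure, update the context so the next attempt sees what went wrong
        notes.append(f"attempt {attempt} failed: {feedback}")
    raise RuntimeError(f"no verified output after {max_attempts} attempts")


# stand-in model: answers wrongly until it sees failure feedback in the prompt
def fake_model(prompt: str) -> str:
    return "4" if "failed" in prompt else "5"


def check_sum(output: str) -> tuple[bool, str]:
    return (output == "4", f"expected 2 + 2, got {output}")


result = run_harness("what is 2 + 2?", fake_model, check_sum)
print(result)  # prints 4, on the second attempt
```

the retry only helps because the failure feedback is written back into the context. rerunning the same prompt unchanged would just be calling the API again.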
here is the part that ties it together.
prompt engineering and context engineering both live inside gather. the harness is the outer container, context is what it curates, and the prompt is what it finally hands to the model.
zoom out and the unit of work gets bigger. zoom in and you are back at the prompt.
i also published this deep dive (article) on agent harness engineering, covering the orchestration loop, tools, memory, context management, and everything else that transforms a stateless LLM into a capable agent.
the article is quoted below.