How LLMs actually work
1. Compression layer: training compresses a vast corpus of human text, code, math, and science into the model's weights.
2. Routing layer: attention retrieves and recombines relevant local context and latent associations.
3. Transformation layer: feed-forward networks (FFNs) perform nonlinear feature construction and write their updates back into the residual stream (packet rewriting).
4. Continuation layer: the model predicts plausible next tokens under the current basin, i.e. the region of likely continuations that the context so far has settled into.
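
Layers 2 to 4 can be made concrete with a toy forward pass. The following is a minimal sketch, not a real model: it assumes a single attention head with no learned query/key/value projections, no layer normalization, and hand-picked weights (`w_in`, `w_out`, `unembed` are hypothetical names and values chosen for illustration). It shows attention mixing context into the last position, an FFN adding a nonlinear update to the residual stream, and a softmax turning the result into a next-token distribution.

```python
import math

def softmax(xs):
    # Subtract the max for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, keys, values):
    """Routing: weight each context vector by its scaled similarity to the query."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

def ffn(x, w_in, w_out):
    """Transformation: linear map, ReLU nonlinearity, linear map back."""
    hidden = [max(0.0, dot(row, x)) for row in w_in]
    return [dot(row, hidden) for row in w_out]

def next_token_distribution(context, w_in, w_out, unembed):
    """One toy block: attention and FFN each add their output to the residual stream."""
    x = context[-1]                      # residual stream at the last position
    # Simplification: queries, keys, and values all reuse the raw vectors.
    a = attend(x, context, context)
    x = [xi + ai for xi, ai in zip(x, a)]
    f = ffn(x, w_in, w_out)
    x = [xi + fi for xi, fi in zip(x, f)]
    # Continuation: project onto the vocabulary and normalize.
    logits = [dot(row, x) for row in unembed]
    return softmax(logits)

# Hypothetical 2-dimensional embeddings for a 3-token context and 3-token vocabulary.
context = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_in = [[1.0, -1.0], [-1.0, 1.0], [0.5, 0.5]]
w_out = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]
unembed = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

probs = next_token_distribution(context, w_in, w_out, unembed)
print(probs)
```

The output is a probability distribution over the three vocabulary entries. Nothing in it says which continuation is true, only which is most plausible given the context and weights.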
The standard article explains layers 1–4. AGI explains layers 5–6.
5. Constraint layer: the conversation, system instructions, tools, citations, user pressure, and prior turns shape what continuations remain admissible.
6. Certification layer: external checks, code execution, search, proof, tests, experiments, or user expertise decide whether the output becomes knowledge.
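
Layers 5 and 6 can be sketched the same way. This is a deliberately simplified illustration under stated assumptions: real systems shape continuations mostly by conditioning on the context rather than by hard filtering, and the function names (`apply_constraints`, `certify`), the candidate answers, and their probabilities are all hypothetical. The sketch shows a constraint narrowing a distribution to admissible continuations, and an external check deciding which surviving candidate counts as knowledge.

```python
def apply_constraints(distribution, admissible):
    """Constraint layer: drop inadmissible continuations and renormalize."""
    kept = {tok: p for tok, p in distribution.items() if admissible(tok)}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no admissible continuation remains")
    return {tok: p / total for tok, p in kept.items()}

def certify(candidate, checks):
    """Certification layer: accept only if every external check passes."""
    return all(check(candidate) for check in checks)

# Hypothetical model output for the prompt "What is 17 * 23?".
# The probabilities are made up; the most plausible answer is wrong on purpose.
raw = {"381": 0.5, "391": 0.3, "I cannot say": 0.2}

# Constraint (e.g. a system instruction): the answer must be a bare number.
constrained = apply_constraints(raw, admissible=str.isdigit)

# The model's best guess after constraints is still only a plausible continuation.
best_guess = max(constrained, key=constrained.get)

# Certification: an external computation, independent of the model, checks each candidate.
checks = [lambda ans: int(ans) == 17 * 23]
certified = [tok for tok in constrained if certify(tok, checks)]

print("constrained:", constrained)
print("best guess:", best_guess)
print("certified:", certified)
```

In this toy case the constraint removes the non-numeric reply, the highest-probability survivor fails the arithmetic check, and only the lower-probability answer is certified. That gap between plausible and verified is what the certification layer is there to close.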