How It Works (Mechanics)
Ingest
•Drop raw material (PDFs, web clippings, GitHub repos, datasets, images) into a raw/ folder.
•Images are stored locally so the LLM can be passed a file path for visual reference.
Compilation → Wiki
•An LLM (called via script or CLI) scans the raw/ directory incrementally.
•For each source it:
  1. Generates a concise summary.
  2. Assigns tags / categories (e.g., “reinforcement learning”, “hardware”).
  3. Creates Obsidian‑style backlinks to related notes.
  4. Writes a full markdown article that expands on the summary.
  5. Updates index files (tables of contents, concept maps).
•The result is a plain‑folder markdown wiki (*.md) that can be opened directly in Obsidian.
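The incremental scan and the per-source write-out can be sketched as follows. This is a minimal illustration, not the actual script: the `analyze` callable stands in for the LLM call and is hypothetical, as are the JSON manifest of content hashes, the note layout, and the choice to hyphenate tags (Obsidian tags cannot contain spaces). Index-file updates (step 5) are left out.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def file_digest(path: Path) -> str:
    """Content hash used to detect new or modified sources."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def render_note(title: str, summary: str, tags: list[str],
                related: list[str], body: str) -> str:
    """Render one article: front matter, summary, body, backlinks."""
    # Assumption: tags are hyphenated because Obsidian tags cannot contain spaces.
    tag_list = ", ".join(t.replace(" ", "-") for t in tags)
    lines = ["---", f"tags: [{tag_list}]", "---", "",
             f"# {title}", "", f"> {summary}", "", body, "",
             "## Related", ""]
    lines += [f"- [[{name}]]" for name in related]
    return "\n".join(lines) + "\n"


def compile_wiki(raw_dir: Path, wiki_dir: Path, manifest_path: Path,
                 analyze: Callable[[Path], dict]) -> list[Path]:
    """Compile only the sources whose content hash changed since the last run."""
    manifest = (json.loads(manifest_path.read_text(encoding="utf-8"))
                if manifest_path.exists() else {})
    wiki_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(p for p in raw_dir.rglob("*") if p.is_file()):
        key = src.relative_to(raw_dir).as_posix()
        digest = file_digest(src)
        if manifest.get(key) == digest:
            continue  # unchanged since the last run, skip the LLM call
        # Hypothetical LLM call: returns summary, tags, related, body.
        info = analyze(src)
        # Simplification: two sources with the same stem would collide.
        note = wiki_dir / f"{src.stem}.md"
        note.write_text(
            render_note(src.stem, info["summary"], info["tags"],
                        info["related"], info["body"]),
            encoding="utf-8",
        )
        manifest[key] = digest
        written.append(note)
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return written
```

Hashing file contents rather than comparing modification times means a re-downloaded but identical source is not recompiled, at the cost of reading every file on each run.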
Frontend
•Obsidian provides browsing, graph view, and plugins (Marp for slides, Dataview for queries).
•No proprietary database; the file system is the knowledge store.
Query / Q&A Layer
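•When a question is asked, the LLM is fed either the whole wiki, when it fits in the context window, or a targeted subset selected via the index. At roughly 1.3 tokens per English word, ≈ 400 k words is on the order of 500 k tokens, which is more than a 128 k‑token window holds, so a wiki of that size relies on the targeted subset or a longer‑context model.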
•The model navigates the markdown links internally, synthesizing answers that reference specific notes.
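The choice between "whole wiki" and "targeted subset" comes down to a token budget. The sketch below is one possible way to make that decision, under stated assumptions: the 1.3 tokens-per-word ratio is only a rule of thumb for English text (a real tokenizer would give exact counts), the index is modelled as a plain tag-to-note-names mapping, and `select_notes` is a hypothetical helper name.

```python
def estimate_tokens(text: str, tokens_per_word: float = 1.3) -> int:
    """Rough token estimate; assumes about 1.3 tokens per English word."""
    return round(len(text.split()) * tokens_per_word)


def select_notes(question: str, notes: dict[str, str],
                 index: dict[str, list[str]], budget: int) -> list[str]:
    """Return every note if the wiki fits the budget, else an index-guided subset."""
    total = sum(estimate_tokens(body) for body in notes.values())
    if total <= budget:
        return sorted(notes)  # the whole wiki fits in the window

    # Otherwise keep notes filed under a tag that the question mentions.
    wanted = question.lower()
    candidates: list[str] = []
    for tag, names in index.items():
        if tag.lower() in wanted:
            for name in names:
                if name in notes and name not in candidates:
                    candidates.append(name)

    # Greedily pack candidates until the budget is spent.
    chosen, used = [], 0
    for name in candidates:
        cost = estimate_tokens(notes[name])
        if used + cost <= budget:
            chosen.append(name)
            used += cost
    return chosen
```

In practice the budget would be the model's context window minus room for the question, instructions, and the answer itself.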
Output & Feedback Loop
•Answers are not plain text; the LLM emits structured artifacts (markdown reports, Marp slide decks, code snippets, matplotlib plots).
•These artifacts are dropped back into the wiki, automatically expanding it.
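Filing an artifact back into the wiki can be as simple as writing a new note and appending a link to an index file. The sketch below assumes a layout that the source does not specify: an `outputs/` subfolder, date-prefixed file names, and a top-level `index.md`. All of these, and the `file_artifact` helper itself, are illustrative.

```python
import re
from datetime import date
from pathlib import Path
from typing import Optional


def slugify(title: str) -> str:
    """Lower-case the title and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def file_artifact(wiki_dir: Path, title: str, body: str,
                  cited_notes: list[str],
                  today: Optional[date] = None) -> Path:
    """Write an answer as a new note and register it in index.md."""
    today = today or date.today()
    out_dir = wiki_dir / "outputs"  # assumed location for generated artifacts
    out_dir.mkdir(parents=True, exist_ok=True)
    note = out_dir / f"{today.isoformat()}-{slugify(title)}.md"

    # Link back to the notes the answer drew on, so it stays traceable.
    links = "\n".join(f"- [[{name}]]" for name in cited_notes)
    note.write_text(f"# {title}\n\n{body}\n\n## Sources\n\n{links}\n",
                    encoding="utf-8")

    # Append to the index so later queries can discover the artifact.
    with (wiki_dir / "index.md").open("a", encoding="utf-8") as fh:
        fh.write(f"- [[{note.stem}]] ({today.isoformat()})\n")
    return note
```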
Maintenance (“Linting”)
•Periodic runs where the LLM scans the wiki for:
  •Inconsistent terminology or definitions.
  •Missing citations or gaps in coverage.
  •Redundant or stale notes.
  •Suggested new connections or follow‑up questions.
•Optional web searches can fill gaps, and the new material is integrated as fresh notes.
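Most of these checks need the LLM's judgment, but the purely mechanical part of a lint pass can be done without it. As a rough sketch, the function below reports wikilinks that point at no existing note and notes that nothing links to, which could then be handed to the LLM as candidates for gap filling or new connections. It simplifies Obsidian's link resolution: targets are matched exactly against file stems, folder paths in links are not handled, and `![[...]]` embeds are skipped.

```python
import re
from pathlib import Path

# Matches [[Target]], [[Target#Heading]] and [[Target|alias]], but not ![[embeds]].
WIKILINK = re.compile(r"(?<!!)\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def lint_links(wiki_dir: Path) -> dict:
    """Report broken wikilinks and notes without inbound links."""
    notes = {p.stem: p.read_text(encoding="utf-8")
             for p in wiki_dir.rglob("*.md")}
    broken: list[tuple[str, str]] = []
    linked: set[str] = set()
    for name, text in sorted(notes.items()):
        for target in WIKILINK.findall(text):
            target = target.strip()
            if target not in notes:
                broken.append((name, target))  # link to a note that does not exist
            elif target != name:
                linked.add(target)  # self-links do not count as inbound
    orphans = sorted(set(notes) - linked)
    return {"broken": broken, "orphans": orphans}
```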
Auxiliary Tools
•A lightweight keyword search (e.g., ripgrep or a tiny SQLite index) that the LLM can call as a tool.
•Future scaling ideas: generate synthetic data from the wiki and fine‑tune a model so the knowledge becomes part of the model weights.
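For the keyword-search tool, a "tiny SQLite index" might look like the sketch below, built on Python's standard `sqlite3` module. The table layout and the function names are assumptions made for illustration. It uses a plain `LIKE` substring match, which is case-insensitive for ASCII by default; where the SQLite build includes it, a full-text extension such as FTS5 would be a natural upgrade.

```python
import sqlite3
from pathlib import Path


def build_index(wiki_dir: Path, db_path: str = ":memory:") -> sqlite3.Connection:
    """Load every markdown note into a single-table SQLite index."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notes (path TEXT PRIMARY KEY, body TEXT)")
    for note in sorted(wiki_dir.rglob("*.md")):
        conn.execute(
            "INSERT OR REPLACE INTO notes VALUES (?, ?)",
            (note.relative_to(wiki_dir).as_posix(),
             note.read_text(encoding="utf-8")),
        )
    conn.commit()
    return conn


def search(conn: sqlite3.Connection, keyword: str, limit: int = 10) -> list[str]:
    """Return paths of notes whose body contains the keyword."""
    # Escape LIKE wildcards so the keyword is matched literally.
    escaped = (keyword.replace("\\", "\\\\")
                      .replace("%", "\\%")
                      .replace("_", "\\_"))
    rows = conn.execute(
        "SELECT path FROM notes WHERE body LIKE ? ESCAPE '\\' "
        "ORDER BY path LIMIT ?",
        (f"%{escaped}%", limit),
    )
    return [row[0] for row in rows]
```

Exposed as a tool, `search` would take the keyword from the LLM and return the matching paths, which the model can then open and read.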
2. Why This Approach Is Effective (Rationale)
•LLMs excel at structured knowledge work – summarizing, linking, and maintaining coherence across many documents, which traditional note‑taking tools don’t automate.
•File‑system markdown is sufficient at personal‑research scale; no vector DB or embedding engine is needed to achieve high relevance.
•Self‑reinforcing loop – each query enriches the wiki, which in turn yields better future answers – a virtuous cycle of knowledge accumulation.
•Low cognitive overhead – the researcher stays inside a familiar environment (Obsidian) while the LLM does the heavy lifting of compilation and upkeep.
•Persistence vs. stateless chat – the wiki retains context across sessions, eliminating the “context collapse” problem of typical LLM chats.
•Traceability – every claim is anchored to a source file and a generated note, making the system auditable and easier to debug than black‑box RAG pipelines.