Knowledge Ingestion Workflow

Move useful material from capture to reviewed knowledge without letting drafts, raw output, or private material enter default retrieval.

This is a sub-pattern of the Knowledge Operating System. It is the workflow that decides which captured material can become reusable context for agents.

For leaders

The problem in one sentence. A useful note, a rough transcript, and a reviewed operating artifact should not land in the same searchable pile.

What the pattern changes. Candidate material is classified, curated, approved, logged, and only then made eligible for retrieval.

The failure it prevents. Drafts, private material, or unreviewed claims surfacing inside an agent's answer as if they were reviewed knowledge.

What to ask your team. What has to happen before a captured item becomes something our agents are allowed to retrieve?

The Problem

Useful material arrives in uneven shape. One item may be a rough note. Another may be a working summary, a cleaned-up guide, a raw session export, or a private record. If all of it goes into one search index, the agent cannot tell what is ready to use.

That creates two failures. Raw or private material can show up in a task where it does not belong. Working material can also look more settled than it is. A reader sees a confident answer, but the answer may rest on a draft, a partial capture, or a source that was never approved for reuse.

The better move is to make ingestion a workflow. Each candidate item needs a lane, a sensitivity label, a source trail, a review decision, and a clear retrieval rule.

The Pattern

Treat every captured item as a candidate artifact first. A candidate is not trusted knowledge yet. It is material that may be useful after someone classifies it, cleans it up, checks its sources, and approves where it can be used.

The workflow has five moves:

  1. Identify the candidate. Name what the item is, where it came from, who owns it, and why it may be useful later.
  2. Classify it. Set its lane (lifecycle zone), sensitivity, source type, and default retrieval eligibility.
  3. Curate it. Attach source references, remove private detail, clean up the wording, and mark open questions.
  4. Approve it. A reviewer or validation check decides whether the item has enough source detail and metadata to be promoted.
  5. Promote it. The registry records the manifest entry and promotion log, then marks the item eligible for retrieval only if it passed the approval step.
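
The five moves above can be sketched as an ordered sequence of stages. This is a minimal illustration, not the pattern's reference implementation: the `Stage`, `Candidate`, and `advance` names are hypothetical, and the rule that each move must follow the previous one exactly is an assumption made to keep the example small.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Stage(IntEnum):
    """Hypothetical stage names, one per move in the workflow."""
    CAPTURED = 0
    IDENTIFIED = 1
    CLASSIFIED = 2
    CURATED = 3
    APPROVED = 4
    PROMOTED = 5


@dataclass
class Candidate:
    title: str
    stage: Stage = Stage.CAPTURED
    retrievable: bool = False
    history: list = field(default_factory=list)


def advance(item: Candidate, target: Stage, note: str = "") -> Candidate:
    """Move an item forward exactly one stage; skipping a move is an error."""
    if target != item.stage + 1:
        raise ValueError(f"cannot go from {item.stage.name} to {target.name}")
    item.stage = target
    item.history.append((target.name, note))
    # Only the final move makes the item eligible for retrieval.
    item.retrievable = target == Stage.PROMOTED
    return item
```

In this sketch a candidate stays non-retrievable through identification, classification, curation, and approval, and a call that tries to jump straight to `Stage.PROMOTED` raises an error instead of quietly promoting the item.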

This keeps the boundary visible. Capture is not knowledge. Curation is not approval. Approval is not factual proof. The workflow only decides whether an item has been reviewed enough to become reusable context. Important claims still need a human check before anyone relies on them.

Diagram

flowchart TB
    capture["Captured material"] --> candidate["Candidate artifact"]
    candidate --> classify["Classify source and sensitivity"]
    classify --> zone{"Lane"}
    zone -->|working| hold["Hold outside default retrieval"]
    zone -->|private or raw| exclude["Exclude or read selectively"]
    zone -->|reuse candidate| curate["Curate with source refs"]
    curate --> gate{"Approval gate"}
    gate -->|needs work| hold
    gate -->|approved| registry["Manifest and promotion log"]
    registry --> retrieve["Eligible for retrieval"]
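
The branches in the diagram can also be read as a small routing function. The sketch below is illustrative only: the lane strings and outcome strings are hypothetical labels taken from the diagram's wording, and raising an error on an unknown lane is an assumption rather than something the pattern specifies.

```python
def route(lane: str, approved: bool = False) -> str:
    """Return the diagram outcome for a classified candidate."""
    if lane in ("private", "raw"):
        return "exclude_or_read_selectively"
    if lane == "working":
        return "hold_outside_default_retrieval"
    if lane == "reuse_candidate":
        # Curated items still pass the approval gate before promotion.
        if approved:
            return "eligible_for_retrieval"
        return "hold_outside_default_retrieval"
    # Assumption: an unclassified lane is treated as an error, not a default.
    raise ValueError(f"unknown lane: {lane!r}")
```

Note that only one path reaches retrieval: a reuse candidate that has also cleared the approval gate.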

Template Or Small Example

An ingestion checklist can be attached to any candidate artifact:

| Field | Question |
| --- | --- |
| Candidate | What is the item, and why might it matter later? |
| Source | Where did it come from? |
| Owner | Who is accountable for it? |
| Lane | Is it working, curated, raw, runtime, or private material? |
| Sensitivity | Is any part private or unsafe for default retrieval? |
| Source references | Which sources support the useful claims? |
| Open questions | What is still unverified or incomplete? |
| Approval | Who or what must approve promotion? |
| Retrieval rule | Can agents search it by default, selectively, or not at all? |

A rough note might enter as a candidate, stay in the working lane, and remain out of default retrieval. After a person cleans it up, attaches source references, removes private detail, and approves the promotion, the registry can mark it as curated knowledge.
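
One way to make the checklist machine-checkable is to store it as a record and list the fields that are still unanswered. This is a sketch under assumptions: the `IngestionChecklist` class, its field names, and the `unanswered` helper are hypothetical, and treating open questions as the only optional field is a choice made for the example.

```python
from dataclasses import dataclass, field, fields


@dataclass
class IngestionChecklist:
    """Hypothetical record with one attribute per checklist field."""
    candidate: str = ""
    source: str = ""
    owner: str = ""
    lane: str = ""
    sensitivity: str = ""
    source_references: list = field(default_factory=list)
    open_questions: list = field(default_factory=list)
    approval: str = ""
    retrieval_rule: str = ""


def unanswered(checklist: IngestionChecklist) -> list:
    """Return the names of required fields that are still empty."""
    # Assumption: an empty open-questions list is a valid answer.
    optional = {"open_questions"}
    return [
        f.name
        for f in fields(checklist)
        if f.name not in optional and not getattr(checklist, f.name)
    ]


rough_note = IngestionChecklist(
    candidate="Rough note on onboarding steps",
    source="Team meeting",
    owner="Operations lead",
    lane="working",
    sensitivity="internal",
    retrieval_rule="not at all",
)
missing = unanswered(rough_note)
```

For the rough note above, the source references and the approval decision are the fields still missing, which matches the reason it stays in the working lane.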

Failure Mode It Guards Against

The failure it prevents is premature trust. Material gets captured, indexed, and reused before anyone has decided whether it is safe, sourced, current, or reviewed.

That failure is easy to miss because the agent output may look polished. The problem is underneath the answer: a draft became searchable, a private detail crossed a boundary, or a partial note turned into apparent authority. The ingestion workflow blocks that path by making promotion a gate, not a side effect of saving a file.
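
The difference between saving and promoting can be shown with a small in-memory store. This is a hedged sketch, not a description of any real system: the `KnowledgeStore` class and its methods are hypothetical, and revoking eligibility when an item is re-saved is an assumption about how edited content should be handled.

```python
class KnowledgeStore:
    """Hypothetical store where search only sees promoted items."""

    def __init__(self):
        self._items = {}
        self._promoted = set()
        self.promotion_log = []

    def save(self, item_id: str, text: str) -> None:
        # Saving never grants eligibility. Assumption: an edit to a
        # promoted item sends it back for review.
        self._items[item_id] = text
        self._promoted.discard(item_id)

    def promote(self, item_id: str, approved_by: str) -> None:
        if item_id not in self._items:
            raise KeyError(item_id)
        self._promoted.add(item_id)
        self.promotion_log.append((item_id, approved_by))

    def search(self, term: str) -> list:
        # Default retrieval looks only at promoted items.
        return sorted(
            item_id
            for item_id in self._promoted
            if term.lower() in self._items[item_id].lower()
        )
```

A saved draft is invisible to `search` until `promote` is called, and every promotion leaves an entry in `promotion_log`, so eligibility is always traceable to an explicit decision.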

What This Pattern Does Not Prove

This workflow helps an agent find reviewed material. It does not make that material true by itself. A promoted artifact can still be incomplete, stale, or wrong. Important claims still need a human check, and retrieval quality still needs evaluation.

Proof Note

The checklist and diagram are reproducible without a private system. A reader can take a set of captured notes, classify each one, record the approval decision, and decide whether it belongs in default retrieval. The data model expresses the same pattern as a schema: artifacts, source references, manifest entries, promotion logs, index records, and validation status are tracked separately.