Knowledge as Code

A development pattern for building knowledge bases that verify themselves, resist decay, and serve both humans and machines from plain text files.

Status: Working title. This pattern emerged from building AI Tool Watch. We're documenting it as we go and actively looking for prior art. Join the discussion.

The pattern

Knowledge as Code applies software engineering practices to knowledge management. The knowledge lives in version-controlled plain text files. It is validated by automated processes. It produces multiple outputs from a single source. And it actively resists becoming outdated.

This site is an instance of the pattern. The page you're reading was built by it.

Six properties

Property	What it means	In this project
Plain text canonical	Knowledge lives in human-readable, version-controlled files. No database, no CMS, no vendor lock-in.	Markdown and YAML files in `data/`
Self-healing	Automated verification detects when the knowledge has drifted from reality. The system flags decay before humans notice it.	Multi-model cascade cross-checks all data twice weekly, opens GitHub issues for human review
Multi-output	One source produces every format needed — human-readable, machine-readable, agent-queryable, search-optimized.	HTML site, JSON API, MCP server, 125 SEO bridge pages, sitemap, `llms.txt`
Zero-dependency	No external packages. The build uses only language built-ins. Nothing breaks when you come back in a year.	One Node.js script, no `package.json`, no `node_modules`
Git-native	Git is the collaboration layer, the audit trail, the deployment trigger, and the contribution workflow.	Issues, PRs, CI/CD, version history — all through Git
Ontology-driven	A vendor-neutral taxonomy of concepts maps to vendor-specific implementations. The structure is the data model.	18 capabilities map to 72 implementations across 12 products
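
As a rough illustration of "the structure is the data model", the sketch below models a tiny ontology in plain JavaScript and checks that every implementation points at a capability and product that exist. The record shapes, the IDs, and the `validateOntology` helper are assumptions made for illustration, not the project's actual schema, which is documented in design/ONTOLOGY.md.

```javascript
// Hypothetical, simplified ontology. The real project stores this in
// Markdown and YAML files; the shapes below are assumed for illustration.
const ontology = {
  capabilities: [
    { id: "web-search", name: "Web search" },
    { id: "image-generation", name: "Image generation" },
  ],
  products: [
    { id: "product-a", name: "Product A" },
    { id: "product-b", name: "Product B" },
  ],
  implementations: [
    { capability: "web-search", product: "product-a", vendorName: "Browse" },
    { capability: "web-search", product: "product-b", vendorName: "Live Search" },
    { capability: "image-generation", product: "product-a", vendorName: "Image Studio" },
  ],
};

// Return a list of human-readable problems; an empty list means every
// implementation refers to a capability and a product that exist.
function validateOntology({ capabilities, products, implementations }) {
  const capabilityIds = new Set(capabilities.map((c) => c.id));
  const productIds = new Set(products.map((p) => p.id));
  const problems = [];
  for (const impl of implementations) {
    if (!capabilityIds.has(impl.capability)) {
      problems.push(`Unknown capability "${impl.capability}" (${impl.vendorName})`);
    }
    if (!productIds.has(impl.product)) {
      problems.push(`Unknown product "${impl.product}" (${impl.vendorName})`);
    }
  }
  return problems;
}

console.log(validateOntology(ontology)); // []
```

A check like this could run in CI so that a pull request introducing a dangling reference fails before it is merged.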

Why these choices compound

Any one of these properties is a reasonable design choice. The value is in the combination:

Plain text + Git means anyone can contribute with no dev environment. Edit a file, open a PR.
Plain text + zero-dependency build means the project still builds in five years. Nothing to update, nothing to break.
Ontology + multi-output means one correction fixes the site, the API, the MCP server, and every bridge page at once.
Self-healing + Git means verification results are tracked as issues with full audit trail. Nothing is silently changed.
Zero-dependency + self-healing means maintenance cost stays low even as the knowledge grows. The system scales through automation, not staffing.
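
To make the "one correction fixes every output" claim concrete, here is a minimal sketch of a single-source build that uses only language built-ins. The record shape and the `renderHtml`, `renderJson`, and `buildAll` names are hypothetical; the project's real build script (scripts/build.js) is not reproduced here.

```javascript
// Hypothetical single-source record; field names are assumed for illustration.
const record = {
  id: "web-search",
  name: "Web search",
  summary: "Lets the assistant query the live web.",
};

// Escape text before placing it in HTML so the source can stay plain text.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function renderHtml(r) {
  return `<article id="${escapeHtml(r.id)}"><h2>${escapeHtml(r.name)}</h2><p>${escapeHtml(r.summary)}</p></article>`;
}

function renderJson(r) {
  return JSON.stringify(r, null, 2);
}

// Every output is derived from the same record, so a fix to the record
// reaches all of them on the next build.
function buildAll(r) {
  return { html: renderHtml(r), json: renderJson(r) };
}

// Correct the summary once, then rebuild both formats.
const corrected = { ...record, summary: "Lets the assistant search the live web." };
const outputs = buildAll(corrected);
console.log(outputs.html);
console.log(outputs.json);
```

Because each renderer is a pure function of the record, adding another output format means adding one more function rather than another copy of the data.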

Standing on shoulders

This pattern didn't appear from nowhere. It draws from established ideas and the people who developed them:

File over app

"If you want to create digital artifacts that last, they must be files you can control, in formats that are easy to retrieve and read." — Steph Ango, 2023

Also: Derek Sivers on plain text permanence. The permacomputing movement on resilient, minimal-dependency software.

Docs as code

The practice of managing documentation with the same tools as software — version control, pull requests, CI, plain text formats. Popularized by the Write the Docs community. Key figures: Tom Preston-Werner (Jekyll, 2008), Eric Holscher (Read the Docs), Anne Gentle (Docs Like Code, 2017), Andrew Etter (Modern Technical Writing, 2016), Riona MacNamara (docs-as-code at Google).

Living documentation

Cyrille Martraire's Living Documentation (2019) argues that documentation should evolve at the same pace as the system it describes. His framework generates docs from code annotations and tests. This project extends the idea: the knowledge isn't derived from code, and verification uses AI models rather than test suites.

GitOps

Coined by Weaveworks (2017). Git as single source of truth, with automated agents that detect drift between declared state and actual state, then reconcile. Originally for infrastructure — but the pattern maps directly to knowledge:

GitOps (infrastructure)	Knowledge as Code
YAML declares desired state	Markdown declares what's true
Controller detects drift	AI cascade detects drift from reality
Auto-reconciliation or alert	GitHub issues for human review
Git as single source of truth	Git as single source of truth

Anti-entropy

In distributed systems (Dynamo, Cassandra), anti-entropy is the background process that detects and repairs divergence between replicas. The scheduled verification cascade is an anti-entropy process for knowledge: it finds where reality has moved away from what the files say and flags the gap.
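
A minimal sketch of such a drift check, assuming the stored knowledge and the freshly observed facts can both be reduced to flat key-value maps. The `detectDrift` function, the field names, and the issue-draft shape are hypothetical. The sketch only reports differences and changes nothing itself, in line with the human-review step in the comparison table.

```javascript
// Compare what the files declare with what a checker observed.
// Returns one issue draft per field that differs or could not be observed.
// Nothing is modified here: reconciliation is left to human review.
function detectDrift(declared, observed) {
  const drafts = [];
  for (const [field, declaredValue] of Object.entries(declared)) {
    if (!Object.hasOwn(observed, field)) {
      drafts.push({
        field,
        title: `Could not verify "${field}"`,
        declared: declaredValue,
        observed: null,
      });
    } else if (observed[field] !== declaredValue) {
      drafts.push({
        field,
        title: `Possible drift in "${field}"`,
        declared: declaredValue,
        observed: observed[field],
      });
    }
  }
  return drafts;
}

// Hypothetical data: the file says one thing, the checker saw another.
const declared = { plan: "Pro", priceUsd: 20, availableOnMobile: true };
const observed = { plan: "Pro", priceUsd: 25 };

for (const draft of detectDrift(declared, observed)) {
  console.log(`${draft.title}: declared=${draft.declared}, observed=${draft.observed}`);
}
```

Each draft could then be turned into a GitHub issue, which keeps the audit trail in Git rather than silently rewriting the files.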

Multi-model verification

Academic foundations for using multiple AI models as cross-checking judges:

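Zheng et al., "Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena" (NeurIPS 2023): established LLMs as evaluators and identified their biases
Verga et al., "Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models" (2024): introduces PoLL, a Panel of LLM evaluators, in which several smaller models outperform a single large judge
Du et al., "Improving Factuality and Reasoning in Language Models through Multiagent Debate" (2023): LLM instances debate to reduce hallucinations
Huang et al., "Large Language Models Cannot Self-Correct Reasoning Yet" (ICLR 2024): finds that models struggle to correct their own reasoning without external feedback, which motivates checking one model's output with others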
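
The panel idea can be sketched as a simple vote. In this sketch each judge is a stand-in function rather than a real model call, and the verdict labels, the `panelVerdict` name, and the strict-majority threshold are all assumptions; the project's actual cascade is described in VERIFICATION.md.

```javascript
// Each judge returns "confirmed", "contradicted", or "unsure" for a claim.
// Real judges would call different AI models; these stubs are placeholders
// that simply look for a price string.
const judges = [
  (claim) => (claim.includes("$20") ? "confirmed" : "contradicted"),
  (claim) => (claim.includes("$20") ? "confirmed" : "unsure"),
  (claim) => (claim.length > 0 ? "contradicted" : "unsure"),
];

// Tally the verdicts and flag the claim for human review unless a strict
// majority of the panel confirms it.
function panelVerdict(claim, panel) {
  const tally = { confirmed: 0, contradicted: 0, unsure: 0 };
  for (const judge of panel) {
    tally[judge(claim)] += 1;
  }
  const needsReview = tally.confirmed * 2 <= panel.length;
  return { claim, tally, needsReview };
}

console.log(panelVerdict("The Pro plan costs $20 per month", judges));
console.log(panelVerdict("The Pro plan costs $25 per month", judges));
```

With these stubs the first claim is confirmed by two of three judges and passes, while the second gets no confirmations and is flagged for review.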

Wiki philosophy

Ward Cunningham's original wiki (1995) established the idea of knowledge that evolves incrementally, maintained by a community, with every change tracked. Mike Caulfield's "The Garden and the Stream" (2015) distinguished gardens (networked, iterative knowledge) from streams (chronological feeds). The digital gardens movement extended this into personal knowledge management. This project is a tended garden with AI gardeners.

What we think is new

We haven't found prior art for these specific applications. If you know of any, please tell us:

"Knowledge as Code" as a named pattern — the "-as-code" lineage (infrastructure, policy, docs, everything) is well-established, but this specific application to maintained knowledge bases doesn't appear to be named
AI verification cascades for documentation: multi-model evaluation exists in academic literature, but we have not seen it applied as a scheduled process that maintains a knowledge base's factual accuracy
Multi-format output from the same plain text — HTML + JSON API + MCP endpoints + SEO bridge pages, all from markdown/YAML, with zero dependencies
Ontology-driven static site generation — using a formal taxonomy to drive site structure, navigation, and programmatic pages
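
As a hedged illustration of ontology-driven generation, the sketch below derives a list of pages from a small taxonomy: one page per capability, plus one bridge page per capability-product pair that has an implementation. The URL scheme, the record shapes, and the `planPages` name are hypothetical and are not taken from the project.

```javascript
// Hypothetical taxonomy; the real one is documented in design/ONTOLOGY.md.
const taxonomy = {
  capabilities: [{ id: "web-search" }, { id: "voice-mode" }],
  implementations: [
    { capability: "web-search", product: "product-a" },
    { capability: "web-search", product: "product-b" },
    { capability: "voice-mode", product: "product-a" },
  ],
};

// The taxonomy decides which pages exist: adding an implementation to the
// data adds a page, with no template or routing change.
function planPages({ capabilities, implementations }) {
  const pages = capabilities.map((c) => ({
    path: `/capabilities/${c.id}/`,
    kind: "capability",
  }));
  for (const impl of implementations) {
    pages.push({
      path: `/${impl.product}/${impl.capability}/`,
      kind: "bridge",
    });
  }
  return pages;
}

for (const page of planPages(taxonomy)) {
  console.log(`${page.kind}\t${page.path}`);
}
```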

Try it

The entire project is open source. Beyond Git and Node.js, there is nothing to install.

git clone https://github.com/snapsynapse/ai-tool-watch.git
cd ai-tool-watch
node scripts/build.js
open docs/index.html   # macOS; on Linux use xdg-open, on Windows use start

To understand the architecture: design/ARCHITECTURE_PATTERNS.md

To understand the data model: design/ONTOLOGY.md

To understand the verification system: VERIFICATION.md

To contribute: CONTRIBUTING.md

This page is itself an instance of the pattern. It lives in a Git repo, gets deployed by CI, and will be verified alongside everything else. If something here is wrong or incomplete, open an issue or start a discussion.