Every capability, with its tradeoff.
Not a feature checklist. Each item below names the decision it implies, the failure mode it prevents, and the cost it accepts.
Knowledge graph, not text soup.
An LLM pipeline turns every episode into typed entities, temporal facts, and candidate procedures. No manual tagging. No "vectorize and hope".
Entity extraction
Each episode runs through gpt-5-mini with a constrained taxonomy: person, organization, project, service, technology, pattern, environment, document, metric, decision. Up to 10 entities per episode, each with known aliases.
The model is told to use the most specific common name — "Webhook Gateway", not "event delivery system". Generic concepts are rejected: authentication is not an entity, but Webhook Signature Validation Pattern is.
↳ Keeps the graph mergeable. Duplicate entities can be merged later; wrongly merged ones cannot be cleanly separated.
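The constraints above can be enforced after the model responds. The sketch below is a minimal illustration, not the actual pipeline: the function name `validate_entities`, the dict shape of each entity, and the choice to truncate past 10 are all assumptions.

```python
# Hypothetical post-processing of LLM output; field names are assumptions.
TAXONOMY = {
    "person", "organization", "project", "service", "technology",
    "pattern", "environment", "document", "metric", "decision",
}
MAX_ENTITIES = 10  # per-episode cap stated in the text


def validate_entities(raw_entities):
    """Keep only entities with a name and a taxonomy type, capped at 10."""
    kept = []
    for ent in raw_entities:
        name = str(ent.get("name", "")).strip()
        etype = str(ent.get("type", "")).strip().lower()
        if not name or etype not in TAXONOMY:
            continue  # off-taxonomy or unnamed items are dropped
        # An alias identical to the canonical name adds nothing.
        aliases = [a for a in ent.get("aliases", []) if a != name]
        kept.append({"name": name, "type": etype, "aliases": aliases})
    return kept[:MAX_ENTITIES]
```

Rejecting generic concepts such as "authentication" is a judgment the text assigns to the model prompt; a type check like this only catches items that fall outside the taxonomy.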
Three-pass resolution
Prefer fragmentation over collision. Two records for the same thing can be merged with one UPDATE. Two things wrongly merged corrupt every fact on both.
| Pass | Method | Conf. | Condition |
|---|---|---|---|
| 1 | Exact name + type + ns | 1.00 | case-insensitive |
| 2 | Alias match | 0.95 | single match only |
| 3 | Semantic (embedding) | > 0.92 | top-2 gap > 0.03, else conflict |
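The three passes in the table can be sketched as a single function. This is an illustration under assumptions: `Entity`, `resolve`, and the `similarity` callable are hypothetical names, alias matching is assumed to be case-insensitive, and pass 3 is assumed to compare against every entity in the namespace regardless of type.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    id: str
    name: str
    type: str
    namespace: str
    aliases: list = field(default_factory=list)


def resolve(name, etype, namespace, entities, similarity):
    """Return (entity_id or None, confidence, outcome)."""
    pool = [e for e in entities if e.namespace == namespace]
    key = name.casefold()

    # Pass 1: exact name + type + namespace, case-insensitive.
    for e in pool:
        if e.type == etype and e.name.casefold() == key:
            return e.id, 1.00, "exact"

    # Pass 2: alias match, accepted only when exactly one entity matches.
    hits = [e for e in pool if key in (a.casefold() for a in e.aliases)]
    if len(hits) == 1:
        return hits[0].id, 0.95, "alias"

    # Pass 3: semantic similarity. The best score must exceed 0.92 and
    # lead the runner-up by more than 0.03; otherwise report a conflict.
    scored = sorted(((similarity(name, e), e) for e in pool),
                    key=lambda pair: pair[0], reverse=True)
    if scored and scored[0][0] > 0.92:
        gap = scored[0][0] - scored[1][0] if len(scored) > 1 else 1.0
        if gap > 0.03:
            return scored[0][1].id, scored[0][0], "semantic"
        return None, scored[0][0], "conflict"

    # Nothing matched: the caller creates a new entity (fragmentation).
    return None, 0.0, "new"
```

Returning `"conflict"` instead of picking the top candidate is the code-level form of preferring fragmentation over collision: an ambiguous match never silently merges two things.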
Canonical predicate registry — 24 verbs, 4 categories
The LLM sees the full predicate list and is told to use canonical ones. Non-canonical predicates are tracked with occurrence counts for promotion review.
uses · used_by · contains · contained_in · depends_on · dependency_of · implements · implemented_by · integrates_with · authored · authored_by · owns · owned_by
replaced · replaced_by · decided · decided_by · deployed_to · hosts · manages · managed_by · configured_with · targets · blocks · blocked_by
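Tracking non-canonical predicates for promotion review could look like the sketch below. The predicate set is copied from the list above; the names `record_predicate` and `promotion_candidates` and the normalization rule (lowercase, spaces to underscores) are assumptions for illustration.

```python
from collections import Counter

# Canonical predicates as listed in the text.
CANONICAL = frozenset("""
uses used_by contains contained_in depends_on dependency_of implements
implemented_by integrates_with authored authored_by owns owned_by
replaced replaced_by decided decided_by deployed_to hosts manages
managed_by configured_with targets blocks blocked_by
""".split())

custom_counts = Counter()  # occurrence counts for non-canonical predicates


def record_predicate(raw):
    """Normalize a predicate; count it if it is not canonical."""
    pred = raw.strip().lower().replace(" ", "_")
    if pred in CANONICAL:
        return pred, True
    custom_counts[pred] += 1
    return pred, False


def promotion_candidates(min_count):
    """Non-canonical predicates seen at least min_count times."""
    return [(p, c) for p, c in custom_counts.most_common() if c >= min_count]
```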
Facts are superseded, not overwritten.
Every fact carries valid_from and valid_until. A contradicting fact marks the old one superseded; it is never deleted. History survives forever; working memory doesn't.
Supersession in SQL
-- March 1: Customer Portal uses Semantic Kernel
fact_id: a1b2… | predicate: uses | valid_until: NULL | status: observed

-- March 10: same subject, new object → old fact superseded
fact_id: a1b2… | predicate: uses | valid_until: 2026-03-10 | status: superseded
fact_id: c3d4… | predicate: uses | valid_until: NULL | status: observed
Seven evidence statuses track the lifecycle: user_asserted → observed → extracted → inferred → promoted → deprecated → superseded. User-asserted outranks LLM-extracted. Superseded is excluded from compilation, kept for compliance.
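The supersession step can be tried end to end with SQLite. This is a sketch under assumptions: the table layout, the `assert_fact` helper, the fact IDs `f1`/`f2`, and the object `Another Framework` are hypothetical, and every predicate is treated as single-valued, which the source does not state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE facts (
        fact_id     TEXT PRIMARY KEY,
        subject     TEXT NOT NULL,
        predicate   TEXT NOT NULL,
        object      TEXT NOT NULL,
        valid_from  TEXT NOT NULL,
        valid_until TEXT,            -- NULL means still current
        status      TEXT NOT NULL
    )
""")


def assert_fact(conn, fact_id, subject, predicate, obj, observed_on):
    """Insert a fact, superseding any current fact it contradicts."""
    with conn:  # one transaction: close the old fact, then add the new one
        conn.execute(
            """UPDATE facts
                  SET valid_until = ?, status = 'superseded'
                WHERE subject = ? AND predicate = ? AND object != ?
                  AND valid_until IS NULL""",
            (observed_on, subject, predicate, obj),
        )
        conn.execute(
            "INSERT INTO facts VALUES (?, ?, ?, ?, ?, NULL, 'observed')",
            (fact_id, subject, predicate, obj, observed_on),
        )


assert_fact(conn, "f1", "Customer Portal", "uses", "Semantic Kernel", "2026-03-01")
assert_fact(conn, "f2", "Customer Portal", "uses", "Another Framework", "2026-03-10")
```

No row is deleted: the old fact keeps its ID and gains an end date, so a query filtered on `valid_until IS NULL` returns current state while the full table remains the history.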
Task-specific context packages.
Different tasks need different memory. The compiler classifies intent, retrieves across multiple strategies, and weights memory types per task.
Dual-profile classification
Borg identifies a primary and optional secondary task class. If the confidence gap is < 0.3, both profiles run retrieval — eliminating single-path classification failure.
| Task | Retrieval profile | Episodic | Semantic | Procedural |
|---|---|---|---|---|
| debug | Graph + Episode Recall | 1.0 | 0.7 | 0.8 |
| architecture | Fact Lookup + Graph | 0.5 | 1.0 | 0.3 |
| compliance | Episode Recall + Facts | 1.0 | 0.8 | 0.0 |
| writing | Fact Lookup | 0.3 | 1.0 | 0.6 |
| chat | Fact Lookup | 0.4 | 1.0 | 0.3 |
Weight 0.0 = hard exclude. Procedural memory is excluded from compliance — candidate patterns aren't authoritative enough for audit trails.
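The classification gap rule and the weight table can be combined in a few lines. The sketch assumes the classifier returns a confidence per task class and that each candidate carries a `memory_type` and a `similarity`; the function names are hypothetical.

```python
# Memory-type weights per task, copied from the table above.
MEMORY_WEIGHTS = {
    "debug":        {"episodic": 1.0, "semantic": 0.7, "procedural": 0.8},
    "architecture": {"episodic": 0.5, "semantic": 1.0, "procedural": 0.3},
    "compliance":   {"episodic": 1.0, "semantic": 0.8, "procedural": 0.0},
    "writing":      {"episodic": 0.3, "semantic": 1.0, "procedural": 0.6},
    "chat":         {"episodic": 0.4, "semantic": 1.0, "procedural": 0.3},
}


def active_task_classes(confidences, gap_threshold=0.3):
    """Return the primary class, plus the secondary if the gap is small."""
    ranked = sorted(confidences.items(), key=lambda kv: kv[1], reverse=True)
    primary = ranked[0]
    if len(ranked) > 1 and primary[1] - ranked[1][1] < gap_threshold:
        return [primary[0], ranked[1][0]]
    return [primary[0]]


def weight_candidates(task, candidates):
    """Apply memory-type weights; a weight of 0.0 removes the candidate."""
    weights = MEMORY_WEIGHTS[task]
    weighted = []
    for cand in candidates:
        w = weights[cand["memory_type"]]
        if w == 0.0:
            continue  # hard exclude, not merely a low score
        weighted.append({**cand, "weighted": cand["similarity"] * w})
    return weighted
```

Treating 0.0 as a filter rather than a multiplier matters: a multiplied score of zero could still be selected when few candidates exist, while a filtered candidate never reaches the ranking stage.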
Four-dimension ranking — no opaque composite.
Every candidate is scored on four interpretable dimensions. All four land in the audit trace.
| Dimension | Weight | How it works |
|---|---|---|
| Relevance | 0.40 | Vector similarity × memory-type weight modifier. |
| Recency | 0.25 | Linear decay over 90 days from occurred_at. |
| Stability | 0.20 | Evidence status score + salience, blended 70/30. |
| Provenance | 0.15 | procedure_assist 0.9 · fact_lookup 0.8 · graph 0.7 · episode 0.6 |
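A worked version of the four dimensions, keeping the per-dimension breakdown alongside the total. Several details are assumptions: the dimensions are combined as a weighted sum, "70/30" is read as 70% evidence status and 30% salience, and the numeric `status_score` for each evidence status is supplied by the caller because the source does not list those values.

```python
from datetime import datetime, timedelta, timezone

WEIGHTS = {"relevance": 0.40, "recency": 0.25, "stability": 0.20, "provenance": 0.15}
PROVENANCE = {"procedure_assist": 0.9, "fact_lookup": 0.8, "graph": 0.7, "episode": 0.6}


def recency(occurred_at, now, horizon_days=90):
    """Linear decay from 1.0 at occurred_at to 0.0 after horizon_days."""
    age_days = (now - occurred_at).total_seconds() / 86400
    return max(0.0, 1.0 - age_days / horizon_days)


def score_candidate(similarity, type_weight, occurred_at, now,
                    status_score, salience, source):
    """Return (per-dimension scores, weighted total)."""
    dims = {
        "relevance": similarity * type_weight,
        "recency": recency(occurred_at, now),
        "stability": 0.7 * status_score + 0.3 * salience,
        "provenance": PROVENANCE[source],
    }
    total = sum(WEIGHTS[name] * value for name, value in dims.items())
    return dims, total


now = datetime(2026, 3, 10, tzinfo=timezone.utc)
dims, total = score_candidate(
    similarity=0.8, type_weight=1.0,
    occurred_at=now - timedelta(days=45), now=now,
    status_score=1.0, salience=0.5, source="graph",
)
```

Returning `dims` with the total is what lets all four values land in an audit trace instead of a single opaque number.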
Two output formats
Structured XML for Claude, Claude Code, Copilot. Metadata lives on attributes the model can reason about.
Compact JSON for GPT, Codex, local models. No tag overhead.
The format is selected from the model parameter; an explicit override is available.
Specific facts extraction
A dedicated pass pulls the details generic extraction loses: IPs, CLI invocations, resource names, port numbers, version strings, numeric counts. Stored as structured metadata on the episode, retrievable alongside facts and procedures.
Patterns earn trust through observation.
Each episode's procedure pass captures candidate workflows, decision rules, best practices, conventions, and troubleshooting patterns. They start weak and accumulate evidence.
Each procedure starts with confidence = 0.3 and evidence_status = extracted. When the same pattern reappears, the record is merged: observation count +1, confidence recomputed as a weighted average, source episode appended.
Procedures do not participate in compilation until promoted — which requires observation in 3+ distinct episodes and confidence ≥ 0.8.
Deliberately conservative. A pattern that appears once may be a one-off. One that appears in five conversations over two weeks is probably real practice.
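The merge and promotion rules can be sketched as follows. The source says confidence is "recomputed as a weighted average" without giving the formula, so the running mean weighted by observation count used here is an assumption, as are the `Procedure` and `merge_observation` names.

```python
from dataclasses import dataclass, field


@dataclass
class Procedure:
    pattern: str
    confidence: float = 0.3            # starting confidence from the text
    evidence_status: str = "extracted"
    observations: int = 1
    episodes: list = field(default_factory=list)


def merge_observation(proc, episode_id, observed_confidence):
    """Fold one more sighting of the pattern into the record."""
    n = proc.observations
    # Assumed formula: mean of all sightings, prior ones weighted by count.
    proc.confidence = (proc.confidence * n + observed_confidence) / (n + 1)
    proc.observations = n + 1
    if episode_id not in proc.episodes:
        proc.episodes.append(episode_id)
    # Promotion gate: 3+ distinct episodes and confidence >= 0.8.
    if len(proc.episodes) >= 3 and proc.confidence >= 0.8:
        proc.evidence_status = "promoted"
    return proc
```

With this formula, a pattern first seen at 0.3 and then re-observed at 0.9, 1.0, 1.0, and 1.0 is still unpromoted after three episodes and crosses the gate only on the later sightings, which matches the conservative intent described above.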
Every decision, traceable.
The audit log is the primary mechanism for improving retrieval quality. Every borg_think call writes a full trace; every extraction run logs its outcome.
Per-compile trace
Classification (primary + secondary w/ confidences), retrieval profiles executed, candidates found / selected / rejected with per-item score breakdowns, rejection reasons, compiled token count, output format, per-stage latency.
Per-episode trace
Entities extracted / resolved / new, facts extracted, custom predicates encountered, evidence strengths, procedures extracted / merged, and any errors. Everything joinable with borg_audit_log.
Hard isolation, by default.
Every entity, fact, and episode belongs to exactly one namespace. No cross-namespace queries. Per-namespace token budgets. Cross-namespace access is a future feature, not something that happens by accident.
If "APIM" appears in two projects it exists as two separate entity rows. The only way information crosses is an explicit merge. No config flag will relax the query boundary.
| Setting | Default | Notes |
|---|---|---|
| Isolation mode | hard | Structural. Not a permission check. |
| Token budget | 4,096 tokens per namespace | Compile trims to this ceiling. |
| Protected namespaces | default | Cannot be deleted. |
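One way to make isolation structural is to put the namespace in the primary key and in every query signature. The sketch below assumes a SQLite table and the hypothetical helpers `find_entities` and `trim_to_budget`; the namespace names `project-a` and `project-b` are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE entities (
        namespace TEXT NOT NULL,
        name      TEXT NOT NULL,
        type      TEXT NOT NULL,
        PRIMARY KEY (namespace, name, type)  -- same name, different rows
    )
""")
conn.executemany(
    "INSERT INTO entities VALUES (?, ?, ?)",
    [("project-a", "APIM", "service"), ("project-b", "APIM", "service")],
)


def find_entities(conn, namespace, name):
    """namespace is a required argument: no lookup path omits it."""
    return conn.execute(
        "SELECT namespace, name, type FROM entities "
        "WHERE namespace = ? AND name = ?",
        (namespace, name),
    ).fetchall()


def trim_to_budget(ranked_items, budget=4096):
    """Keep ranked (text, tokens) items until the budget would be exceeded."""
    kept, used = [], 0
    for text, tokens in ranked_items:
        if used + tokens > budget:
            break
        kept.append(text)
        used += tokens
    return kept, used
```

Because the boundary lives in the key and the function signature rather than in a permission layer, there is no flag to flip: a query without a namespace simply cannot be expressed.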