A memory stronghold
for your AI coding agent.
The first thing a new engineer needs is context. The same is true for your AI coding agent. Borg compiles it automatically from every prior session — one Postgres, no SDKs, no re-explaining.
Run one install command, type borg init in your project, and every AI coding session builds a knowledge graph that makes the next session smarter.
Open source. Apache 2.0. One local install.
curl -fsSL https://raw.githubusercontent.com/villanub/borgmemory/main/install.sh | sh
borg init

> "What patterns do I follow when debugging auth issues?"
borg_think → classify: debug (0.92) + architecture (0.31)
→ retrieve: graph_neighborhood + episode_recall + fact_lookup
→ rank: 14 candidates → 6 selected (1,840 tokens)
→ compile: structured XML
<borg model="claude" ns="product-engineering" task="debug">
<knowledge>
<fact status="observed" salience="0.94">Webhook gateway verifies HMAC signatures before enqueue</fact>
<fact status="observed" salience="0.88">Background jobs retry with exponential backoff</fact>
</knowledge>
<episodes>
<episode source="claude-code" date="2026-03-01">Fixed duplicate webhook delivery during replay</episode>
<episode source="claude-code" date="2026-02-14">Resolved OAuth scope mismatch in staging</episode>
</episodes>
<patterns>
<procedure confidence="0.92">Debug auth: verify scopes, then inspect token audience and issuer</procedure>
</patterns>
</borg>

What Borg does differently
Most memory tools store text and search by similarity. Borg extracts structured knowledge, builds a temporal graph, and compiles task-specific context packages.
Knowledge Graph Extraction
An LLM pipeline extracts entities, facts, and procedures from every conversation. Three-pass entity resolution prevents collisions. 24 canonical predicates ensure consistent relationships.
Entities are resolved by exact match → alias match → semantic similarity (0.92 threshold). Fragmentation is preferred over collision — two separate entities can be merged later.
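The three-pass resolution order can be sketched as follows. This is an illustrative sketch, not Borg's implementation: the registry shape and function names are invented, and `SequenceMatcher` stands in for the embedding-based semantic similarity the real pipeline uses.

```python
from difflib import SequenceMatcher  # stand-in for embedding cosine similarity

SEMANTIC_THRESHOLD = 0.92  # from the text: high bar, fragmentation over collision

def resolve_entity(name, registry):
    """registry: {canonical_name: {"aliases": [...]}}. Returns canonical or None."""
    # Pass 1: exact match on the canonical name
    if name in registry:
        return name
    # Pass 2: alias match
    for canonical, rec in registry.items():
        if name in rec["aliases"]:
            return canonical
    # Pass 3: semantic similarity above the threshold; below it we deliberately
    # return None so the caller creates a NEW entity (two can be merged later)
    best, best_score = None, 0.0
    for canonical in registry:
        score = SequenceMatcher(None, name.lower(), canonical.lower()).ratio()
        if score > best_score:
            best, best_score = canonical, score
    return best if best_score >= SEMANTIC_THRESHOLD else None

registry = {"API Management": {"aliases": ["APIM"]}}
print(resolve_entity("APIM", registry))       # alias hit -> "API Management"
print(resolve_entity("job queue", registry))  # no match -> None, new entity
```

Returning `None` rather than forcing the best available match is what makes fragmentation the failure mode instead of collision.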
Temporal Facts with Supersession
Facts carry valid_from and valid_until timestamps. When a new fact contradicts an old one, the old fact is marked superseded — not deleted. The full history is always available for compliance queries.
Seven evidence statuses: user_asserted, observed, extracted, inferred, promoted, deprecated, superseded.
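The supersession rule described above can be sketched in a few lines. Field and function names here are illustrative, not Borg's schema; the point is the semantics: a contradicted fact is closed out, never deleted.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class Fact:
    subject: str
    predicate: str
    object: str
    status: str = "observed"
    valid_from: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    valid_until: Optional[datetime] = None

def assert_fact(store, new):
    """Same subject + predicate with a different object supersedes the old fact."""
    for old in store:
        if (old.subject == new.subject and old.predicate == new.predicate
                and old.object != new.object and old.valid_until is None):
            old.status = "superseded"          # never deleted
            old.valid_until = new.valid_from   # history stays queryable
    store.append(new)

store = []
assert_fact(store, Fact("preview-env", "expires_after", "30 days"))
assert_fact(store, Fact("preview-env", "expires_after", "7 days"))
# store now holds both facts; the first is superseded with a closed interval
```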
Task-Specific Compilation
Dual-profile intent classification determines what kind of memory a query needs. Debug tasks get episodic + procedural memory. Architecture tasks get semantic facts. Compliance tasks exclude procedural entirely.
Memory-type weight modifiers bias ranking without hard exclusion (except procedural in compliance where weight = 0.0).
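A minimal sketch of weight-modifier biasing follows. The weight values and task classes are invented for illustration; the only number taken from the text is procedural = 0.0 under compliance.

```python
# Illustrative weight table -- only compliance/procedural = 0.0 is from the docs.
TYPE_WEIGHTS = {
    "debug":        {"episodic": 1.0, "procedural": 1.0, "semantic": 0.6},
    "architecture": {"episodic": 0.5, "procedural": 0.6, "semantic": 1.0},
    "compliance":   {"episodic": 0.8, "procedural": 0.0, "semantic": 1.0},
}

def weighted_relevance(relevance, task, memory_type):
    """Bias, don't exclude: a low weight demotes a type without filtering it out,
    except where the weight is exactly 0.0."""
    return relevance * TYPE_WEIGHTS[task].get(memory_type, 1.0)

print(weighted_relevance(0.9, "compliance", "procedural"))  # 0.0 -> hard exclusion
print(weighted_relevance(0.9, "debug", "semantic"))         # demoted, still ranked
```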
Inspectable Ranking
Every candidate is scored on four dimensions: relevance, recency, stability, and provenance. Every compilation logs which items were selected, which were rejected, and why.
The audit log is the primary tool for improving retrieval quality. No opaque composite scores.
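The "no opaque composite scores" idea can be sketched like this: each dimension is computed and logged separately, and the total is a transparent combination of the four. The decay curve, provenance weights, and field names are assumptions for illustration, not Borg's actual formula.

```python
import math
import time

def score(candidate, now=None):
    """Score one candidate on four named dimensions; return both the total and
    the per-dimension breakdown so the audit log can record the 'why'."""
    now = now or time.time()
    age_days = (now - candidate["created_at"]) / 86400
    dims = {
        "relevance":  candidate["similarity"],
        "recency":    math.exp(-age_days / 30),   # illustrative 30-day decay
        "stability":  candidate["salience"],
        "provenance": {"user_asserted": 1.0, "observed": 0.9,
                       "inferred": 0.5}.get(candidate["status"], 0.7),
    }
    total = sum(dims.values()) / len(dims)        # plain average, easy to audit
    return total, dims                            # dims go to the audit log verbatim

total, dims = score({"similarity": 0.82, "created_at": time.time() - 5 * 86400,
                     "salience": 0.94, "status": "observed"})
```

Keeping `dims` alongside `total` is what makes rejections explainable: the log can say "rejected: low recency" instead of "score 0.41".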
Namespace Isolation
Hard isolation by default. Every entity, fact, and episode belongs to exactly one namespace. No cross-namespace queries. Configurable token budgets per namespace.
If 'APIM' appears in two projects, it exists as two separate entity records. Restrictive by design — cross-namespace is a future feature, not an accident.

PostgreSQL Maximalism
One database, no exceptions. Graph traversal via recursive CTEs. Embeddings via pgvector. Audit via pgAudit. No external graph database, no separate vector store. Nothing gets out of sync.
15 tables + 1 function. Runs on Azure PostgreSQL Flexible Server, Supabase, Neon, or any Postgres 14+.
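Graph traversal in plain SQL looks roughly like the query below. SQLite stands in for PostgreSQL so the sketch is self-contained; the same `WITH RECURSIVE` shape runs on Postgres 14+. Table and column names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE edges (src TEXT, dst TEXT);
INSERT INTO edges VALUES ('webhook-gateway', 'hmac-verifier'),
                         ('hmac-verifier', 'secret-store'),
                         ('webhook-gateway', 'job-queue');
""")

# Bounded-depth neighborhood of a seed entity via a recursive CTE --
# no external graph database required.
rows = conn.execute("""
WITH RECURSIVE neighborhood(node, depth) AS (
    SELECT 'webhook-gateway', 0
    UNION
    SELECT e.dst, n.depth + 1
    FROM edges e JOIN neighborhood n ON e.src = n.node
    WHERE n.depth < 2                      -- cap the hop count
)
SELECT node, depth FROM neighborhood ORDER BY depth, node;
""").fetchall()
print(rows)
```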
How it works
Two pipelines that share a database but never share runtime. Online never waits for offline.
The online pipeline serves borg_think queries. It is latency-sensitive and compiles context in real time.
1. Classify intent
Dual-profile — primary + secondary task class with confidence scores. Both profiles run retrieval.
2. Retrieve candidates
Up to three of four strategies run in parallel: fact lookup, episode recall, graph neighborhood, procedure assist. Vector search when embeddings exist, recency fallback otherwise.
3. Rank and trim
4-dimension scoring (relevance × type weight, recency, stability + salience, provenance). Dedup by content. Trim to namespace token budget.
4. Compile package
Structured XML for Claude/Copilot, compact JSON for GPT/Codex. Model assignment via parameter.
5. Update access tracking
Batch-update entity_state and fact_state for selected candidates. Feeds hot-tier promotion.
6. Audit log
Full trace: classification, profiles executed, score breakdowns, rejection reasons, latency per stage.
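Steps 2 and 3 above (concurrent retrieval, dedup, budget trim) can be sketched as below. The strategy functions are placeholders returning canned `(text, token_count)` pairs, and the trim is a simple greedy pass; the real ranking described earlier would run between retrieval and trimming.

```python
import asyncio

# Placeholder strategies -- real ones query Postgres / pgvector.
async def fact_lookup(q):
    return [("fact: HMAC verified before enqueue", 120)]

async def episode_recall(q):
    return [("episode: fixed duplicate delivery", 200)]

async def graph_neighborhood(q):
    return [("fact: HMAC verified before enqueue", 120),   # duplicate on purpose
            ("entity: webhook-gateway", 40)]

async def retrieve_and_trim(query, token_budget):
    results = await asyncio.gather(fact_lookup(query), episode_recall(query),
                                   graph_neighborhood(query))
    seen, selected, used = set(), [], 0
    for text, tokens in (c for batch in results for c in batch):
        if text in seen:
            continue                      # dedup by content
        seen.add(text)
        if used + tokens <= token_budget:
            selected.append(text)         # greedy fill up to the namespace budget
            used += tokens
    return selected, used

selected, used = asyncio.run(retrieve_and_trim("debug webhook timeout", 300))
```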
The offline pipeline processes episodes via borg_learn. It runs asynchronously and never blocks queries.
1. Ingest + dedup
SHA-256 content hash + source_event_id unique constraint. Duplicate episodes return existing ID.
2. Generate embedding
Azure OpenAI text-embedding-3-small (1536-dim). Gracefully skips if not configured.
3. Extract entities
LLM extracts up to 10 entities per episode with typed taxonomy and aliases.
4. Resolve entities
Three-pass: exact match → alias match → semantic (0.92 threshold). Ambiguous matches flagged as conflicts.
5. Extract facts + validate predicates
LLM extracts up to 8 fact triples. Predicates validated against 24-predicate canonical registry. Custom predicates tracked.
6. Supersession check
Same subject + predicate + different object → old fact marked superseded with valid_until.
7. Extract procedures
LLM extracts up to 3 repeatable patterns. Existing patterns merged (observation count bumped, confidence averaged).
8. Snapshot
Every 24h, hot-tier state captured for all namespaces. Enables cold-start and drift detection.
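Two of the worker details above, the dedup key from step 1 and the procedure-merge rule from step 7, can be sketched together. Field names are illustrative, not Borg's schema.

```python
import hashlib

def dedup_key(content, source_event_id):
    """Step 1: an episode is unique by (SHA-256 of content, source event id);
    a repeat of either component short-circuits to the existing episode ID."""
    return (hashlib.sha256(content.encode()).hexdigest(), source_event_id)

def merge_procedure(existing, new_confidence):
    """Step 7: bump the observation count and keep a running average of
    confidence instead of creating a duplicate pattern."""
    n = existing["observations"]
    existing["confidence"] = (existing["confidence"] * n + new_confidence) / (n + 1)
    existing["observations"] = n + 1
    return existing

key = dedup_key("Decided to version event payloads...", "evt-42")
proc = merge_procedure({"observations": 3, "confidence": 0.90}, 0.80)
# proc -> observations 4, confidence ~0.875
```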
Three MCP tools
Not five. Three tools that cover the entire interaction surface.
borg_think
Compile context for a query. Runs the full online pipeline — classify, retrieve, rank, compile. Returns structured or compact context package.
borg_think(
query: "debug webhook delivery timeout",
namespace: "product-engineering",
model: "claude",
task_hint: "debug"
)

borg_learn
Record a decision, discovery, or conversation. Stored immediately, extraction happens in the background. Returns in milliseconds.
borg_learn(
content: "Decided to version event payloads through a schema registry...",
source: "claude-code",
namespace: "product-engineering"
)

borg_recall
Search memory directly without compilation. Returns raw episodes, facts, and procedures. For when you want to browse, not compile.
borg_recall(
query: "release checklist",
namespace: "product-engineering",
memory_type: "semantic"
)

Try it
# In Codex, Claude Code, or Kiro with Borg connected:
> "Remember that preview environments expire after 7 days unless renewed"
borg_learn → episode accepted, queued for extraction
→ worker: embedded, 2 entities extracted, 1 fact created
# Later, in the same client or a different one:
> "What's our preview environment policy?"
borg_think → classify: architecture (0.87)
→ fact_lookup: "Preview environments expire after 7 days unless renewed"
→ compiled into context package (340 tokens)
# It works across clients because they all hit the same PostgreSQL database.

The stack
API Runtime
FastAPI + FastMCP 3
Streamable HTTP MCP and REST on :8080
Auth
Passthrough
No authentication — single-user, local deployment
Database
PostgreSQL 14+
Knowledge graph, pgvector embeddings, and audit trail
Extraction
OpenAI / Azure OpenAI
Supports both standard OpenAI and Azure OpenAI endpoints