
Memory infrastructure for AI agents

Capture events → compile knowledge → return compressed reasoning context.

Works with any LLM. Pick your path:
Get Started
Choose your agent
import { AgentMemory } from "@hippocortex/sdk"

const memory = new AgentMemory()
memory.capture(event)
const context = await memory.synthesize(query)
agent.run(context)
403 events/sec
Ingestion throughput
18ms p50
Context synthesis
100,000x
Memory compression
0 LLM calls
Fully deterministic
Works with
OpenAI
🦜LangChain
LangGraph
CrewAI
AutoGen
🦞OpenClaw
THE MEMORY PIPELINE

Six Stages of Agent Memory

Each node in the brain represents a stage in the memory pipeline. Events flow through capture, learning, compilation, prediction, transfer, and synthesis.

memory.capture({
  type: "tool_call",
  agent: "deploy-bot",
  payload: { fn: "deploy", target: "staging" }
})
await memory.learn({
  scope: "session",
  minConfidence: 0.7,
  extractors: ["procedures", "preferences"]
})
const artifacts = await memory.compile({
  type: "procedure",
  topic: "deployment",
  sources: ["episodic", "semantic"]
})
memory.predict({
  agent: "deploy-bot",
  horizon: "next_action",
  warmCache: true
})
await memory.transfer({
  from: "deploy-bot",
  to: "monitor-bot",
  artifacts: ["deployment-procedures"]
})
const ctx = await memory.synthesize({
  query: "deploy payments to staging",
  budget: 4000,
  include: ["procedures", "failures"]
})
THE PROBLEM

Why Your Agents Need Memory Infrastructure

Massive Token Waste

Agents dump entire conversation histories into context windows. 200k tokens of raw chat for a task that needs 2 KB of reasoning. You're burning 99% of your token budget on noise.

Catastrophic Forgetting

Your agent solved this exact problem yesterday. Today it starts from scratch - no memory of what worked, what failed, or why decisions were made. Every session is day one.

Reasoning Instability

Without structured memory, agents produce different outputs for the same inputs. No determinism, no provenance, no way to trace why a decision was made.

Slow Context Assembly

RAG retrieves similar text, not relevant knowledge. Vector similarity ≠ reasoning utility. Your agents spend more time searching than thinking.

Without Hippocortex
200k
tokens of raw chat history
~99% noise, redundancy, and irrelevant context
With Hippocortex
~2 KB
compressed reasoning pack
Procedures, facts, and provenance - nothing else
> agent.getContext("deploy to staging")
Returning 197,432 tokens of raw history... $0.59/query
> memory.synthesize("deploy to staging", { budget: 4000 })
Returning 1,847 tokens: 3 procedures, 2 failure warnings, 1 preference. $0.006/query
SYSTEM ARCHITECTURE

How It Fits Together

Hippocortex sits between your agent and the LLM. Events flow in, compressed reasoning context flows out.

High-Level Flow
Agent
Your AI agent
Hippocortex
Memory layer
LLM
Any model
Expanded Pipeline
Capture
Event ingestion
Compile
Artifact builder
Artifacts
Knowledge store
Context Pack
Compressed output
Agent Reasoning
Enhanced output
Input
  • Agent events (messages, tool calls)
  • Session metadata and timestamps
  • Structured payloads with schemas
Processing
  • Event queue with ordered processing
  • Deterministic compilation pipeline
  • Redis-backed predictive cache
Output
  • Compressed reasoning context
  • Provenance-tagged artifacts
  • Token-budget-optimized packs
INFRASTRUCTURE

Built for Production

Six infrastructure components that handle the full lifecycle of agent memory. Redis for speed. PostgreSQL for durability. Rust for determinism.

Agent SDK

TypeScript / Python

TypeScript and Python clients. Three function calls: capture(), learn(), synthesize(). Framework-agnostic with adapters for OpenAI Agents, LangGraph, CrewAI, and more.

Capture API

REST / gRPC

RESTful event ingestion endpoint. Accepts structured events with automatic dedup, salience scoring, and schema validation. Sustained 403 events/sec.

Event Queue

Redis Streams

Ordered event processing with at-least-once delivery. Idempotent workers ensure deterministic processing regardless of retries or failures.
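The at-least-once + idempotency combination described above can be sketched as follows. This is an illustrative in-memory model, not the actual worker code: a `Set` of processed event IDs stands in for Redis, and `handle` is a hypothetical name.

```typescript
// Sketch: at-least-once delivery made safe by idempotent workers.
// A real deployment would back `seen` with Redis; a Set stands in here.
type QueueEvent = { id: string; payload: string };

class IdempotentWorker {
  private seen = new Set<string>(); // IDs of already-processed events
  public results: string[] = [];

  // Safe to call any number of times per event: redelivery is a no-op.
  handle(event: QueueEvent): void {
    if (this.seen.has(event.id)) return; // duplicate delivery, skip
    this.seen.add(event.id);
    this.results.push(event.payload.toUpperCase()); // the "processing"
  }
}

const worker = new IdempotentWorker();
const e = { id: "evt-1", payload: "deploy" };
worker.handle(e);
worker.handle(e); // redelivered after a timeout: no duplicate side effect
console.log(worker.results); // ["DEPLOY"]
```

Because the dedup check and the side effect are keyed on the event ID, retries and redeliveries leave the output unchanged, which is what makes the downstream compilation deterministic.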

Memory Compiler

Rust Core

Deterministic pattern extraction engine. Replays event traces, identifies procedures, detects contradictions, and produces versioned knowledge artifacts.
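The determinism guarantee can be illustrated with a toy compiler: sort events into canonical order, then fingerprint the result. The FNV-1a hash and the `compile` shape are stand-ins for the real Rust engine's versioning scheme, not its actual implementation.

```typescript
// Sketch: deterministic compilation — the same event trace always yields
// the same artifact version, regardless of arrival order.
type MemEvent = { sequence: number; type: string; payload: string };

// FNV-1a: a simple stable 32-bit hash (illustrative fingerprint).
function fnv1a(s: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

function compile(events: MemEvent[]): { steps: string[]; version: string } {
  // Canonical ordering removes any dependence on delivery order.
  const ordered = [...events].sort((a, b) => a.sequence - b.sequence);
  const steps = ordered.map((e) => `${e.type}:${e.payload}`);
  return { steps, version: fnv1a(JSON.stringify(steps)) };
}

const trace: MemEvent[] = [
  { sequence: 2, type: "tool_call", payload: "deploy" },
  { sequence: 1, type: "message", payload: "plan" },
];
const a = compile(trace);
const b = compile([...trace].reverse()); // different arrival order
console.log(a.version === b.version); // true: same inputs → same artifact
```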

Artifacts Store

PostgreSQL

Persistent storage for compiled knowledge artifacts. Full-text search, temporal indexing, and provenance tracking. GDPR-ready with hard deletion.

Context Synthesizer

Redis + Rust

Multi-signal ranking engine with token-budget-aware assembly. 18ms p50 latency. Predictive cache pre-warms common query patterns.
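Token-budget-aware assembly can be sketched as a greedy pack: rank artifacts by relevance, then admit them until the budget is spent. The field names and single-score ranking are simplifying assumptions; the real synthesizer combines multiple signals.

```typescript
// Sketch: budget-aware context assembly via greedy selection.
type Artifact = { id: string; tokens: number; score: number };

function assemble(artifacts: Artifact[], budget: number): Artifact[] {
  const pack: Artifact[] = [];
  let used = 0;
  // Highest-scoring artifacts first; skip anything that would overflow.
  for (const a of [...artifacts].sort((x, y) => y.score - x.score)) {
    if (used + a.tokens <= budget) {
      pack.push(a);
      used += a.tokens;
    }
  }
  return pack;
}

const pack = assemble(
  [
    { id: "proc-deploy", tokens: 900, score: 0.92 },
    { id: "failure-rollback", tokens: 600, score: 0.81 },
    { id: "chat-history", tokens: 3800, score: 0.3 },
  ],
  4000
);
console.log(pack.map((a) => a.id)); // ["proc-deploy", "failure-rollback"]
```

Note how the low-score 3,800-token history is dropped even though it would fit on its own: the budget is spent on the highest-utility artifacts first, which is the inversion of raw retrieval.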

Infrastructure Stack
Rust workers
Compute
Redis
Cache
PostgreSQL
Storage
Redis Streams
Queue
OPEN STANDARD

Built on the Open HMX Protocol

HMX is an open, language-agnostic wire protocol for agent memory interoperability. No vendor lock-in. Full portability.

HMX-E

HMX Events

The atomic unit of observable information. 13 event types from messages to state changes.

HMX-A

HMX Artifacts

Compiled knowledge units. Task schemas, failure playbooks, decision policies, causal patterns.

HMX-F

HMX Fingerprints

Portable compressed memory state. Export an agent's entire learned knowledge as a single transferable object.
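A fingerprint round-trip might look like the sketch below. The JSON-in-base64 encoding and the `exportFingerprint`/`importFingerprint` names are assumptions for illustration; HMX-F's actual wire encoding may differ.

```typescript
// Sketch: exporting an agent's compiled artifacts as one portable blob
// and restoring them elsewhere (Node's Buffer handles the base64 step).
type Artifact = { id: string; body: string };

function exportFingerprint(agentId: string, artifacts: Artifact[]): string {
  // One self-describing JSON object, base64-encoded for transport.
  return Buffer.from(
    JSON.stringify({ hmx: "HMX-F/1.0", agentId, artifacts })
  ).toString("base64");
}

function importFingerprint(blob: string): {
  agentId: string;
  artifacts: Artifact[];
} {
  const { agentId, artifacts } = JSON.parse(
    Buffer.from(blob, "base64").toString("utf8")
  );
  return { agentId, artifacts };
}

const blob = exportFingerprint("deploy-bot", [
  { id: "deployment-procedures", body: "step 1: …" },
]);
const restored = importFingerprint(blob);
console.log(restored.agentId); // "deploy-bot"
```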

HMX-C

HMX Context Packs

Token-budget-aware reasoning inputs. Multi-signal ranking with provenance tracking.

HMX-V

HMX Versioning

Semantic versioning with forward/backward compatibility negotiation.
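Compatibility negotiation under plain semver rules can be sketched as: same major version means compatible, and both sides speak the lower minor. This negotiation logic is an assumption for illustration, not the published HMX-V algorithm.

```typescript
// Sketch: semver-style version negotiation between two HMX peers.
function parseVersion(v: string): [number, number] {
  const m = /^HMX-(\d+)\.(\d+)$/.exec(v);
  if (!m) throw new Error(`bad version: ${v}`);
  return [Number(m[1]), Number(m[2])];
}

function negotiate(ours: string, theirs: string): string | null {
  const [aMaj, aMin] = parseVersion(ours);
  const [bMaj, bMin] = parseVersion(theirs);
  if (aMaj !== bMaj) return null; // major mismatch: incompatible
  return `HMX-${aMaj}.${Math.min(aMin, bMin)}`; // speak the lower minor
}

console.log(negotiate("HMX-1.2", "HMX-1.0")); // "HMX-1.0"
console.log(negotiate("HMX-2.0", "HMX-1.4")); // null
```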

Minimal HMX Event
{
  "hmx_version": "HMX-1.0",
  "event_type": "tool_call",
  "agent_id": "agent-ops-1",
  "session_id": "sess_abc123",
  "timestamp": "2026-03-15T00:12:34Z",
  "sequence": 4,
  "content": {
    "tool_name": "deploy",
    "input": { "service": "payments", "env": "staging" }
  }
}

One protocol. Every framework. Full interoperability.

INTEGRATIONS

Works With Every Framework

Three function calls. Any language. Any framework. Drop in Hippocortex without rewriting your agent.

gateway.py
from openai import OpenAI

# Change your base URL. That's it. ~99% reliability.
client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "sk-...",  # Your OpenAI key
    },
)

# Use normally. Every call now has memory.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Deploy payments to staging"}]
)
# Works with OpenAI, Anthropic, Groq, Together, Ollama, Mistral...
Gateway · OpenAI · Anthropic · Claude Code · Cursor · Windsurf · OpenClaw · LangGraph · CrewAI · AutoGen · Groq · Together · Ollama · Mistral · Python · Node.js
PERFORMANCE

Numbers, Not Promises

403/sec
Event ingestion
sustained throughput
18ms
Context synthesis
p50 latency
60%+
Compression ratio
vs. raw retrieval
0
Internal LLM calls
fully deterministic
Deterministic compilation
Same inputs → same artifacts. Always.
Predictive context cache
Pre-warms artifacts before queries arrive.
Token-budget optimization
Specify your limit, get optimal context.
Full provenance tracking
Every fact traceable to source evidence.
GDPR-ready deletion
Hard deletion with tombstones and audit trail.
Idempotent processing
Safe retries - no duplicate side effects.
SECURITY

Enterprise-Grade Security

Agent memory contains sensitive operational data. Every layer of Hippocortex is built with security as a first-class constraint.

Tenant Isolation

Complete data isolation between tenants. Every query is scoped. No cross-tenant data leakage by design.

API Key Authentication

Scoped API keys with fine-grained permissions. Key rotation without downtime. Full audit trail on every request.

Rate Limiting

Per-key and per-tenant rate limits. Configurable burst and sustained thresholds. Graceful degradation under load.

Idempotent Workers

Every processing step is idempotent. Safe retries, no duplicate side effects. Deterministic output regardless of failures.

Request Hardening

Input validation, payload size limits, schema enforcement, and request signing. Defense in depth at every layer.

Data Encryption

TLS 1.3 in transit. AES-256 at rest. Encryption keys managed per-tenant with regular rotation.

COMPARISON

How Hippocortex Compares

Not a wrapper around vector search. Infrastructure-grade memory compilation that no existing tool provides.

Feature              | LangChain Memory         | Mem0               | Zep               | Hippocortex
Artifact Compilation | None                     | None               | Basic summaries   | Deterministic typed artifacts
Predictive Context   | None                     | None               | None              | Pre-warmed cache from patterns
Adaptive Learning    | Static after indexing    | Manual updates     | Session summaries | Continuous pattern mining
Memory Compression   | None                     | Basic              | Summarization     | 60%+ deterministic compression
Provenance           | Embeddings opaque        | Partial            | Session-level     | Full chain to source
LLM Dependency       | Embedding model required | LLM for extraction | LLM for summaries | Zero internal LLM calls
Determinism          | Probabilistic            | Non-deterministic  | Non-deterministic | Same inputs → same outputs
Token Budgets        | Manual truncation        | Not supported      | Not supported     | Budget-aware synthesis
PRICING

Start Free. Scale When Ready.

No credit card required. Free tier is real — not a 7-day trial. Upgrade when your agents need more memory.

Most Popular

Pro

€9.99/mo

For developers who need the sharpest memory

  • 25K captures/month
  • 5K synthesize calls/day
  • 50 learn calls/day
  • Unlimited agents & API keys
  • Collective Brain access
  • 10-min memory distillation
  • Knowledge Graph & LLM-enhanced NER
  • Smart Retrieval with query enhancement
  • Encrypted Vault & auto-secret detection
  • Add-on Hub integrations
  • Priority support (24h SLA)

Free

€0/mo

For prototyping and exploration

  • 5K captures/month
  • 75 synthesize calls/day
  • 7 learn calls/day
  • 3 agents, 1 API key
  • 5K stored memories
  • Instant raw memory search
  • Auto-instrumentation
  • Community support
Best Value

Unlimited

€29.99/mo

Unlimited everything. No limits.

  • Everything in Pro
  • Unlimited capture events
  • Unlimited synthesize calls
  • Unlimited learn calls
  • Unlimited stored memories
  • Advanced analytics dashboard
  • Higher API rate limits (6K/min)
  • Priority support (12h SLA)

Team

€299/mo

For teams building production AI. Up to 10 seats.

  • Everything in Unlimited
  • Up to 10 seats included
  • +€10/mo per additional seat
  • Organizations, teams & RBAC
  • Namespaces & full audit logs
  • Memory lineage & lifecycle policies
  • Behavioral intelligence
  • Priority support (12h SLA)

All plans include REST API, JS/TS SDK, auto-instrumentation, and dashboard access. No hidden fees.

Install Hippocortex in seconds.

Free tier. No credit card. Production-ready memory for your AI agents.