Onboarding: Gateway (Any Provider)
Add persistent memory to any application that uses the OpenAI SDK. Works with OpenAI, Anthropic, Google Gemini, Groq, Together, Mistral, Fireworks, Ollama, and any OpenAI-compatible provider.
- Best path: Gateway (change one URL)
- Time: ~2 minutes
- Reliability: ~99% (fully automatic, with graceful fallback)
The Gateway runs the full pipeline: capture, synthesize (semantic search, graph retrieval, collective brain, behavioral context), learn, and vault — all automatically.
Step 1: Get an API key
Sign up at dashboard.hippocortex.dev. Copy your API key (hx_live_...).
Step 2: Change your base URL
Point your OpenAI client at the Hippocortex Gateway and pass your LLM provider credentials as headers.
OpenAI:
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "sk-...",
    },
)
```
Anthropic:
```python
client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "sk-ant-...",
        "X-LLM-Base-URL": "https://api.anthropic.com",
    },
)
```
Google Gemini:
```python
client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "your-google-ai-key",
        "X-LLM-Base-URL": "https://generativelanguage.googleapis.com/v1beta/openai",
    },
)
```
Groq:
```python
client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "gsk_...",
        "X-LLM-Base-URL": "https://api.groq.com/openai",
    },
)
```
Together:
```python
client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "tog_...",
        "X-LLM-Base-URL": "https://api.together.xyz",
    },
)
```
Mistral:
```python
client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "your-mistral-key",
        "X-LLM-Base-URL": "https://api.mistral.ai/v1",
    },
)
```
Fireworks:
```python
client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "your-fireworks-key",
        "X-LLM-Base-URL": "https://api.fireworks.ai/inference/v1",
    },
)
```
Ollama (local):
```python
client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "unused",  # Ollama requires no API key; any placeholder works
        "X-LLM-Base-URL": "http://localhost:11434",
    },
)
```
Step 3: Use normally
```python
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
```
Every call is now captured, relevant past context is injected into each request, and compilation runs in the background, all automatically.
Step 4: Verify
Check dashboard.hippocortex.dev to see captured events and compiled knowledge artifacts.
Headers reference
| Header | Required | Description |
|---|---|---|
| `Authorization` | Yes | `Bearer hx_live_...` (your Hippocortex key; the SDK sets this from `api_key`) |
| `X-LLM-API-Key` | Yes | Your LLM provider's API key |
| `X-LLM-Base-URL` | No | Provider's base URL (defaults to OpenAI) |
| `X-LLM-Model` | No | Override the model in the request body |
| `X-Hippocortex-Session` | No | Custom session ID for grouping conversations |
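The optional headers can also be set per request rather than on the client. A minimal sketch, assuming the OpenAI SDK's standard `extra_headers` parameter; the `gateway_headers` helper and the session ID below are illustrative, not part of the Gateway API:

```python
def gateway_headers(llm_key, base_url=None, model=None, session=None):
    """Build the X-LLM-* / X-Hippocortex-* header dict for one request.

    Header names match the reference table above; only headers with a
    value are included, so optional ones fall back to Gateway defaults.
    """
    headers = {"X-LLM-API-Key": llm_key}
    if base_url:
        headers["X-LLM-Base-URL"] = base_url
    if model:
        headers["X-LLM-Model"] = model
    if session:
        headers["X-Hippocortex-Session"] = session
    return headers

# Per-request usage (network call, client configured as in Step 2):
# client.chat.completions.create(
#     model="gpt-4o",
#     messages=[{"role": "user", "content": "Hello!"}],
#     extra_headers=gateway_headers("sk-...", session="support-thread-42"),
# )
```

Grouping requests under one session ID keeps related turns together in the dashboard.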
Streaming
Works with stream=True. Tokens stream back in real time. The full response is captured after the stream completes.
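A minimal sketch of consuming a streamed response, assuming the standard OpenAI SDK streaming shape (chunks carrying `choices[0].delta.content`); the `collect_stream` helper is illustrative:

```python
def collect_stream(stream):
    """Accumulate streamed chat-completion deltas into one string."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # some chunks (e.g. the final one) carry no content
            parts.append(delta)
    return "".join(parts)

# Usage against the Gateway (network call, client configured as in Step 2):
# stream = client.chat.completions.create(
#     model="gpt-4o",
#     messages=[{"role": "user", "content": "Hello!"}],
#     stream=True,
# )
# text = collect_stream(stream)
```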
TypeScript
The same pattern works with the TypeScript OpenAI SDK:
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.hippocortex.dev/v1',
  apiKey: 'hx_live_...',
  defaultHeaders: {
    'X-LLM-API-Key': 'sk-...',
  },
})
```