Onboarding: Gateway (Any Provider)

Add persistent memory to any application that uses the OpenAI SDK. Works with OpenAI, Anthropic, Google Gemini, Groq, Together, Mistral, Fireworks, Ollama, and any OpenAI-compatible provider.

Best path: Gateway (change one URL)
Time: 2 minutes
Reliability: ~99% (fully automatic with graceful fallback)

The Gateway runs the full pipeline: capture, synthesize (semantic search, graph retrieval, collective brain, behavioral context), learn, and vault — all automatically.


Step 1: Get an API key

Sign up at dashboard.hippocortex.dev. Copy your API key (hx_live_...).
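Rather than hardcoding the key, you can read it from an environment variable before constructing the client. The variable name HIPPOCORTEX_API_KEY below is just a convention for this sketch, not something the platform mandates:

```python
import os

# HIPPOCORTEX_API_KEY is an assumed variable name, not mandated by Hippocortex.
# Falls back to a placeholder so the example runs; in production, omit the
# default and fail loudly if the key is missing.
hx_api_key = os.environ.get("HIPPOCORTEX_API_KEY", "hx_live_...")
```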

Step 2: Change your base URL

Point your OpenAI client at the Hippocortex Gateway and pass your LLM provider credentials as headers.

OpenAI:

from openai import OpenAI

client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "sk-...",
    },
)

Anthropic:

client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "sk-ant-...",
        "X-LLM-Base-URL": "https://api.anthropic.com",
    },
)

Google Gemini:

client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "your-google-ai-key",
        "X-LLM-Base-URL": "https://generativelanguage.googleapis.com/v1beta/openai",
    },
)

Groq:

client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "gsk_...",
        "X-LLM-Base-URL": "https://api.groq.com/openai",
    },
)

Together:

client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "tog_...",
        "X-LLM-Base-URL": "https://api.together.xyz",
    },
)

Mistral:

client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "your-mistral-key",
        "X-LLM-Base-URL": "https://api.mistral.ai/v1",
    },
)

Fireworks:

client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "your-fireworks-key",
        "X-LLM-Base-URL": "https://api.fireworks.ai/inference/v1",
    },
)

Ollama (local):

client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "unused",
        "X-LLM-Base-URL": "http://localhost:11434",
    },
)

Step 3: Use normally

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

Every call is now captured and relevant past context is injected into each request, all automatically. Compilation runs in the background.

Step 4: Verify

Check dashboard.hippocortex.dev to see captured events and compiled knowledge artifacts.


Headers reference

Header                   Required   Description
Authorization            Yes        Bearer hx_live_... (your Hippocortex key)
X-LLM-API-Key            Yes        Your LLM provider's API key
X-LLM-Base-URL           No         Provider's base URL (defaults to OpenAI)
X-LLM-Model              No         Override the model in the request body
X-Hippocortex-Session    No         Custom session ID for grouping conversations
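For example, to group all of one user's conversations under a single session, you could add X-Hippocortex-Session alongside the other headers. The session ID value below ("user-42-support") is our own example; per the table, any string should work:

```python
# Extra header to group conversations under one session.
# "user-42-support" is an arbitrary example session ID.
session_headers = {
    "X-LLM-API-Key": "sk-...",
    "X-Hippocortex-Session": "user-42-support",
}
# Pass as default_headers=session_headers when constructing the OpenAI
# client, exactly as in the Step 2 examples.
```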

Streaming

Works with stream=True. Tokens stream back in real time. The full response is captured after the stream completes.
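With stream=True, each chunk carries a delta whose content field may be a text fragment or None (for example, the initial role-only chunk). Client code concatenates the fragments; a minimal accumulation loop, shown here over stand-in chunks rather than a live request:

```python
# Stand-in for the delta.content values yielded by
# client.chat.completions.create(..., stream=True).
# A real stream yields chunk objects; read chunk.choices[0].delta.content.
deltas = [None, "Hel", "lo", "!"]

full_text = ""
for piece in deltas:
    full_text += piece or ""  # skip None deltas
    # In a live stream you would also print(piece or "", end="", flush=True)

# full_text is now "Hello!"
```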

TypeScript

The same pattern works with the TypeScript OpenAI SDK:

import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.hippocortex.dev/v1',
  apiKey: 'hx_live_...',
  defaultHeaders: {
    'X-LLM-API-Key': 'sk-...',
  },
})