Onboarding: Gateway (Any Provider)
Add persistent memory to any application that uses the OpenAI SDK. Works with OpenAI, Anthropic, Google Gemini, Groq, Together, Mistral, Fireworks, Ollama, and any OpenAI-compatible provider.
- Best path: Gateway (change one URL)
- Time: ~2 minutes
- Reliability: ~99% (fully automatic, with graceful fallback)
The Gateway runs the full pipeline: capture, synthesize (semantic search, graph retrieval, collective brain, behavioral context), learn, and vault — all automatically.
Step 1: Get an API key
Sign up at dashboard.hippocortex.dev. Copy your API key (hx_live_...).
Step 2: Change your base URL
Point your OpenAI client at the Hippocortex Gateway and pass your LLM provider credentials as headers.
OpenAI:
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "sk-...",
    },
)
```
Anthropic:
```python
client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "sk-ant-...",
        "X-LLM-Base-URL": "https://api.anthropic.com",
    },
)
```
Google Gemini:
```python
client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "your-google-ai-key",
        "X-LLM-Base-URL": "https://generativelanguage.googleapis.com/v1beta/openai",
    },
)
```
Groq:
```python
client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "gsk_...",
        "X-LLM-Base-URL": "https://api.groq.com/openai",
    },
)
```
Together:
```python
client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "tog_...",
        "X-LLM-Base-URL": "https://api.together.xyz",
    },
)
```
Mistral:
```python
client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "your-mistral-key",
        "X-LLM-Base-URL": "https://api.mistral.ai/v1",
    },
)
```
Fireworks:
```python
client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "your-fireworks-key",
        "X-LLM-Base-URL": "https://api.fireworks.ai/inference/v1",
    },
)
```
Ollama (local):
```python
client = OpenAI(
    base_url="https://api.hippocortex.dev/v1",
    api_key="hx_live_...",
    default_headers={
        "X-LLM-API-Key": "unused",  # Ollama requires no API key; any placeholder works
        "X-LLM-Base-URL": "http://localhost:11434",
    },
)
```
Step 3: Use normally
```python
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
```
Every call is now captured, relevant past context is injected into each request, and compilation runs in the background, all automatically.
Step 4: Verify
Check dashboard.hippocortex.dev to see captured events and compiled knowledge artifacts.
Headers reference
| Header | Required | Description |
|---|---|---|
| `Authorization` | Yes | `Bearer hx_live_...` (your Hippocortex key; the SDK sets this from `api_key`) |
| `X-LLM-API-Key` | Yes | Your LLM provider's API key |
| `X-LLM-Base-URL` | No | Provider's base URL (defaults to OpenAI) |
| `X-LLM-Model` | No | Override the model in the request body |
| `X-Hippocortex-Session` | No | Custom session ID for grouping conversations |
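The optional headers can also be set per request rather than on the client. A minimal sketch, assuming the OpenAI SDK's standard `extra_headers` parameter; the `gateway_headers` helper and the session ID below are illustrative, not part of the Gateway API:

```python
def gateway_headers(llm_key, base_url=None, model=None, session=None):
    """Build the X-LLM-* / X-Hippocortex-* header dict for one request.

    Header names match the reference table above; only headers with a
    value are included, so optional ones fall back to Gateway defaults.
    """
    headers = {"X-LLM-API-Key": llm_key}
    if base_url:
        headers["X-LLM-Base-URL"] = base_url
    if model:
        headers["X-LLM-Model"] = model
    if session:
        headers["X-Hippocortex-Session"] = session
    return headers

# Per-request usage (network call, client configured as in Step 2):
# client.chat.completions.create(
#     model="gpt-4o",
#     messages=[{"role": "user", "content": "Hello!"}],
#     extra_headers=gateway_headers("sk-...", session="support-thread-42"),
# )
```

Grouping requests under one session ID keeps related turns together in the dashboard.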
Streaming
Works with stream=True. Tokens stream back in real time. The full response is captured after the stream completes.
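A minimal sketch of consuming a streamed response, assuming the standard OpenAI SDK streaming shape (chunks carrying `choices[0].delta.content`); the `collect_stream` helper is illustrative:

```python
def collect_stream(stream):
    """Accumulate streamed chat-completion deltas into one string."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # some chunks (e.g. the final one) carry no content
            parts.append(delta)
    return "".join(parts)

# Usage against the Gateway (network call, client configured as in Step 2):
# stream = client.chat.completions.create(
#     model="gpt-4o",
#     messages=[{"role": "user", "content": "Hello!"}],
#     stream=True,
# )
# text = collect_stream(stream)
```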
TypeScript
The same pattern works with the TypeScript OpenAI SDK:
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.hippocortex.dev/v1',
  apiKey: 'hx_live_...',
  defaultHeaders: {
    'X-LLM-API-Key': 'sk-...',
  },
})
```