Synthesize API

The Synthesize endpoint retrieves compressed, relevant context from all memory layers for a given query. This is how your agent recalls what it has learned.

When you call Synthesize, the retrieval engine searches across recent events, compiled artifacts, and entity knowledge. It ranks results using an 8-signal scoring system (semantic similarity, temporal recency, frequency, salience, provenance quality, entity overlap, session relevance, and confidence). Results are then packed into a context pack that fits within your specified token budget.

The output is organized into sections (procedures, failures, decisions, facts, causal patterns) so your agent receives structured, actionable context rather than a raw dump of memories.

POST /v1/synthesize

curl -X POST https://api.hippocortex.dev/v1/synthesize \
  -H "Authorization: Bearer hx_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "query": "deploy payment service to production",
    "options": {
      "maxTokens": 8000,
      "sections": ["procedures", "failures", "decisions"],
      "minConfidence": 0.5,
      "includeProvenance": true
    }
  }'
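The same request can be built and sent from Python. A minimal sketch using only the standard library; the helper name and the way defaults are applied are illustrative, but the field names and defaults mirror the Request Body table below (the actual HTTP call is omitted so the example stands alone):

```python
import json

def build_synthesize_request(query, max_tokens=4000, sections=None,
                             min_confidence=0.3, include_provenance=True):
    """Build the JSON body for POST /v1/synthesize (defaults mirror the docs)."""
    options = {
        "maxTokens": max_tokens,
        "minConfidence": min_confidence,
        "includeProvenance": include_provenance,
    }
    if sections is not None:
        options["sections"] = sections  # omit to include all sections
    return {"query": query, "options": options}

body = build_synthesize_request(
    "deploy payment service to production",
    max_tokens=8000,
    sections=["procedures", "failures", "decisions"],
    min_confidence=0.5,
)
print(json.dumps(body, indent=2))
```

Send the resulting body with your HTTP client of choice, with the `Authorization: Bearer` header shown in the curl example.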

Request Body

Field                      Type      Default     Description
query                      string    (required)  The query to synthesize context for
options.maxTokens          number    4000        Token budget for the response
options.sections           string[]  all         Which sections to include
options.minConfidence      number    0.3         Minimum confidence threshold (0-1)
options.includeProvenance  boolean   true        Attach source references

Available Sections

Section     What It Contains
procedures  Relevant task schemas and step sequences
failures    Failure playbooks matching the query
decisions   Decision policies and conditional rules
facts       Known facts and entity information
causal      Causal patterns and relationships
context     General background context

Response

{
  "ok": true,
  "data": {
    "packId": "pack-abc123",
    "entries": [
      {
        "section": "procedures",
        "content": "To deploy the payment service to production:\n1. Run test suite\n2. Build Docker image\n3. Push to registry\n4. Update deployment\n5. Verify health check",
        "confidence": 0.87,
        "provenance": [
          {
            "sourceType": "artifact",
            "sourceId": "art-deploy-001",
            "artifactType": "task_schema",
            "evidenceCount": 12
          }
        ]
      },
      {
        "section": "failures",
        "content": "Known issue: deployment can fail if Redis connection pool is exhausted. Recovery: restart Redis and increase pool size.",
        "confidence": 0.92,
        "provenance": [...]
      }
    ],
    "budget": {
      "limit": 8000,
      "used": 3200,
      "compressionRatio": 12.5,
      "entriesIncluded": 5,
      "entriesDropped": 2
    }
  }
}
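Given a decoded response like the one above, a small sketch that folds the `entries` array into a prompt-ready text block, grouped by section. The field names come from the example response; the output format and the helper name are illustrative:

```python
def format_context_pack(data, min_confidence=0.0):
    """Render pack entries, grouped by section, as a text block for a prompt."""
    by_section = {}
    for entry in data["entries"]:
        if entry["confidence"] >= min_confidence:
            by_section.setdefault(entry["section"], []).append(entry)
    lines = []
    for section, entries in by_section.items():
        lines.append(f"## {section}")
        for e in entries:
            lines.append(f"- ({e['confidence']:.2f}) {e['content']}")
    return "\n".join(lines)

# Trimmed-down version of the response above, for illustration.
data = {
    "entries": [
        {"section": "procedures", "content": "Run tests, build, deploy.", "confidence": 0.87},
        {"section": "failures", "content": "Redis pool exhaustion on deploy.", "confidence": 0.92},
    ]
}
print(format_context_pack(data))
```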

The budget object shows how the token budget was spent: limit is the requested maximum, used is the token count of the returned pack, compressionRatio indicates how much raw knowledge was compressed to fit, and entriesDropped shows how many lower-ranked entries were excluded to stay within budget.

How Retrieval Works

The retrieval engine evaluates each candidate memory against 8 signals:

  1. Semantic similarity to the query
  2. Temporal recency (recent memories score higher)
  3. Frequency of the pattern across events
  4. Salience of the source events
  5. Provenance quality (how well-supported the knowledge is)
  6. Entity overlap with the query
  7. Session relevance (same session context scores higher)
  8. Confidence of the compiled artifact

Results are sorted by combined score, then packed into sections until the token budget is exhausted. Higher-scoring entries are included first; an entry that would exceed the remaining budget is dropped in favor of smaller, lower-ranked entries that still fit.
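The ranking-and-packing loop can be sketched as a weighted sum over the eight signals followed by a greedy fill of the token budget. The weights, the equal weighting, and the per-entry token counts are illustrative assumptions, not the engine's actual values:

```python
SIGNALS = ["semantic", "recency", "frequency", "salience",
           "provenance", "entity_overlap", "session", "confidence"]

def score(candidate, weights):
    """Combined score: weighted sum of the eight retrieval signals (each 0-1)."""
    return sum(weights[s] * candidate[s] for s in SIGNALS)

def pack(candidates, weights, max_tokens):
    """Sort by score, then greedily include entries until the budget is spent."""
    ranked = sorted(candidates, key=lambda c: score(c, weights), reverse=True)
    included, used = [], 0
    for c in ranked:
        if used + c["tokens"] <= max_tokens:
            included.append(c)
            used += c["tokens"]
    return included, used

weights = {s: 1.0 / len(SIGNALS) for s in SIGNALS}  # assumed equal weighting
candidates = [
    {**{s: 0.9 for s in SIGNALS}, "id": "a", "tokens": 300},
    {**{s: 0.4 for s in SIGNALS}, "id": "b", "tokens": 500},
    {**{s: 0.8 for s in SIGNALS}, "id": "c", "tokens": 4000},
]
chosen, used = pack(candidates, weights, max_tokens=4000)
```

Note that "c" ranks second but is skipped because it would blow the budget, while the lower-ranked "b" still fits; the skipped entry would count toward entriesDropped.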

Best Practices

  1. Set appropriate token budgets. Account for your model's context window minus your prompt and expected output.
  2. Use section filters when you only need specific types of context (e.g., only procedures and failures for a deployment task).
  3. Check the budget response. If entriesDropped is high, consider increasing maxTokens or narrowing the query.
  4. Use provenance for debugging. The sourceId links back to the artifact that produced each entry, which traces further to source events.
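Best practice 3 can be automated. A hedged sketch that inspects the budget object and retries with a larger maxTokens while too many entries were dropped; the threshold, growth factor, and attempt count are arbitrary, and synthesize_fn stands in for the actual HTTP call (stubbed here so the example is self-contained):

```python
def synthesize_with_budget_retry(synthesize_fn, query, max_tokens=4000,
                                 max_dropped=2, growth=2, attempts=3):
    """Call Synthesize, widening the token budget while entriesDropped
    stays above max_dropped."""
    for _ in range(attempts):
        data = synthesize_fn(query, max_tokens)
        if data["budget"]["entriesDropped"] <= max_dropped:
            return data
        max_tokens *= growth
    return data  # best effort after the final attempt

# Stub for illustration: drops fewer entries as the budget grows.
def fake_synthesize(query, max_tokens):
    return {"budget": {"limit": max_tokens,
                       "entriesDropped": max(0, 5 - max_tokens // 4000)}}

result = synthesize_with_budget_retry(fake_synthesize, "deploy payment service")
```

With the stub above, the budget doubles twice (4000 → 8000 → 16000) before entriesDropped falls within the threshold.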