Synthesize API
The Synthesize endpoint retrieves compressed, relevant context from all memory layers for a given query. This is how your agent recalls what it has learned.
When you call Synthesize, the retrieval engine searches across recent events, compiled artifacts, and entity knowledge. It ranks results using an 8-signal scoring system (semantic similarity, temporal recency, frequency, salience, provenance quality, entity overlap, session relevance, and confidence). Results are then packed into a context pack that fits within your specified token budget.
The output is organized into sections (procedures, failures, decisions, facts, causal patterns) so your agent receives structured, actionable context rather than a raw dump of memories.
POST /v1/synthesize
curl -X POST https://api.hippocortex.dev/v1/synthesize \
-H "Authorization: Bearer hx_live_..." \
-H "Content-Type: application/json" \
-d '{
"query": "deploy payment service to production",
"options": {
"maxTokens": 8000,
"sections": ["procedures", "failures", "decisions"],
"minConfidence": 0.5,
"includeProvenance": true
}
}'
Request Body
| Field | Type | Default | Description |
|---|---|---|---|
| query | string | (required) | The query to synthesize context for |
| options.maxTokens | number | 4000 | Token budget for the response |
| options.sections | string[] | all | Which sections to include |
| options.minConfidence | number | 0.3 | Minimum confidence threshold (0-1) |
| options.includeProvenance | boolean | true | Attach source references to each entry |
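The defaults in the table can be applied client-side before sending the request. Below is a minimal sketch of such a helper; `build_synthesize_request` and the `DEFAULTS` mapping are hypothetical names, not part of any official SDK, and the default values are taken directly from the table above.

```python
# Hypothetical helper: build a Synthesize request body, filling in the
# documented defaults for any option the caller omits.
DEFAULTS = {
    "maxTokens": 4000,
    "sections": None,  # None = all sections (the documented default)
    "minConfidence": 0.3,
    "includeProvenance": True,
}

def build_synthesize_request(query, **options):
    if not query:
        raise ValueError("query is required")
    opts = dict(DEFAULTS)
    for key, value in options.items():
        if key not in DEFAULTS:
            raise ValueError(f"unknown option: {key}")
        opts[key] = value
    return {"query": query, "options": opts}

# Mirrors the curl example: override two options, inherit the rest.
body = build_synthesize_request(
    "deploy payment service to production",
    maxTokens=8000,
    sections=["procedures", "failures", "decisions"],
)
```

Options the caller does not pass through (here minConfidence and includeProvenance) keep their documented defaults.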
Available Sections
| Section | What It Contains |
|---|---|
| procedures | Relevant task schemas and step sequences |
| failures | Failure playbooks matching the query |
| decisions | Decision policies and conditional rules |
| facts | Known facts and entity information |
| causal | Causal patterns and relationships |
| context | General background context |
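Since an unrecognized section name silently matches nothing, it can be worth validating the filter before sending a request. A small sketch, using the six section names documented above (`check_sections` is a hypothetical helper, not an SDK function):

```python
# The six section names documented in the table above.
VALID_SECTIONS = {"procedures", "failures", "decisions",
                  "facts", "causal", "context"}

def check_sections(sections):
    """Reject a sections filter containing undocumented names."""
    unknown = set(sections) - VALID_SECTIONS
    if unknown:
        raise ValueError(f"unknown sections: {sorted(unknown)}")
    return list(sections)
```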
Response
{
"ok": true,
"data": {
"packId": "pack-abc123",
"entries": [
{
"section": "procedures",
"content": "To deploy the payment service to production:\n1. Run test suite\n2. Build Docker image\n3. Push to registry\n4. Update deployment\n5. Verify health check",
"confidence": 0.87,
"provenance": [
{
"sourceType": "artifact",
"sourceId": "art-deploy-001",
"artifactType": "task_schema",
"evidenceCount": 12
}
]
},
{
"section": "failures",
"content": "Known issue: deployment can fail if Redis connection pool is exhausted. Recovery: restart Redis and increase pool size.",
"confidence": 0.92,
"provenance": [...]
}
],
"budget": {
"limit": 8000,
"used": 3200,
"compressionRatio": 12.5,
"entriesIncluded": 5,
"entriesDropped": 2
}
}
}
The budget object shows how the token budget was spent. compressionRatio indicates how heavily the raw knowledge was compressed to fit, and entriesDropped shows how many lower-ranked entries were excluded to stay within the limit.
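Client code typically groups the entries by section and inspects the budget block for signs that the token budget was too tight. A sketch against the response shape shown above (`group_by_section` and `budget_pressure` are hypothetical helpers; the sample response is abbreviated):

```python
# Group a Synthesize response's entries by section, and compute what
# fraction of ranked candidates were dropped for lack of budget.
def group_by_section(response):
    grouped = {}
    for entry in response["data"]["entries"]:
        grouped.setdefault(entry["section"], []).append(entry)
    return grouped

def budget_pressure(response):
    b = response["data"]["budget"]
    total = b["entriesIncluded"] + b["entriesDropped"]
    return b["entriesDropped"] / total if total else 0.0

# Abbreviated version of the example response above.
response = {
    "ok": True,
    "data": {
        "entries": [
            {"section": "procedures", "content": "...", "confidence": 0.87},
            {"section": "failures", "content": "...", "confidence": 0.92},
        ],
        "budget": {"limit": 8000, "used": 3200,
                   "entriesIncluded": 5, "entriesDropped": 2},
    },
}
sections = group_by_section(response)
pressure = budget_pressure(response)  # 2 of 7 candidates dropped
```

A persistently high pressure value suggests raising maxTokens or narrowing the query, as noted under Best Practices.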
How Retrieval Works
The retrieval engine evaluates each candidate memory against 8 signals:
- Semantic similarity to the query
- Temporal recency (recent memories score higher)
- Frequency of the pattern across events
- Salience of the source events
- Provenance quality (how well-supported the knowledge is)
- Entity overlap with the query
- Session relevance (same session context scores higher)
- Confidence of the compiled artifact
Results are sorted by combined score, then packed into sections until the token budget is exhausted. Higher-scoring entries are included first.
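The rank-then-pack behavior can be sketched as a weighted sum over the eight signals followed by a greedy fill of the token budget. This is an illustrative sketch only, not the actual engine: the signal weights, the equal weighting, and the per-candidate token counts are all invented for the example.

```python
# Illustrative sketch of 8-signal ranking and budget packing.
SIGNALS = ["similarity", "recency", "frequency", "salience",
           "provenance", "entity_overlap", "session", "confidence"]
WEIGHTS = {s: 1.0 / len(SIGNALS) for s in SIGNALS}  # equal weights, invented

def score(candidate):
    """Combined score: weighted sum of the eight signal values (each 0-1)."""
    return sum(WEIGHTS[s] * candidate[s] for s in SIGNALS)

def pack(candidates, max_tokens):
    """Sort by score, then greedily keep entries that fit the budget."""
    packed, used = [], 0
    for c in sorted(candidates, key=score, reverse=True):
        if used + c["tokens"] <= max_tokens:
            packed.append(c)
            used += c["tokens"]
    return packed, used

def cand(tokens, value):
    """Toy candidate with the same value on every signal."""
    return {"tokens": tokens, **{s: value for s in SIGNALS}}

high, mid, low = cand(300, 0.9), cand(500, 0.6), cand(400, 0.2)
packed, used = pack([low, high, mid], max_tokens=800)
# The two highest-scoring candidates fill the 800-token budget;
# the lowest-scoring one is dropped.
```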
Best Practices
- Set appropriate token budgets. Account for your model's context window minus your prompt and expected output.
- Use section filters when you only need specific types of context (e.g., only procedures and failures for a deployment task).
- Check the budget response. If entriesDropped is high, consider increasing maxTokens or narrowing the query.
- Use provenance for debugging. The sourceId links back to the artifact that produced each entry, which traces further to source events.