# Context Management

Token tracking, automatic summarization, and context compactification with `@supyagent/sdk/context`.

The `@supyagent/sdk/context` sub-package provides a `ContextManager` that tracks token usage across LLM calls and automatically summarizes older messages when the context window fills up.
## Setup

```ts
import { createContextManager } from '@supyagent/sdk/context';
import { anthropic } from '@ai-sdk/anthropic';

const contextManager = createContextManager({
  maxTokens: 128_000,
  summaryModel: anthropic('claude-haiku-4-5-20251001'),
});
```

## Options
| Option | Type | Default | Description |
|---|---|---|---|
| `maxTokens` | `number` | `128000` | Model's context window size |
| `softThreshold` | `number` | `0.75` | Fraction of `maxTokens` at which background summarization is triggered |
| `hardThreshold` | `number` | `0.90` | Fraction of `maxTokens` at which blocking compactification is triggered |
| `responseReserve` | `number` | `4096` | Tokens reserved for the model's response |
| `minRecentMessages` | `number` | `4` | Messages to always keep (never summarized) |
| `summaryModel` | AI SDK model | — | Model used for summarization (required unless `compactify` is provided) |
| `summaryPrompt` | `string` | — | Custom prompt for the summarizer |
| `compactify` | function | — | Custom compactification function (replaces the default summarizer) |
| `estimateTokens` | function | ~4 chars/token | Custom token estimation function |
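To make the defaults concrete, here is a self-contained sketch of how the documented numbers interact. The `usableBudget` helper and the exact reserve arithmetic are illustrative assumptions, not the SDK's actual implementation; only the default values come from the table above.

```typescript
// Sketch of the documented defaults (illustrative; not the SDK source).
type ContextOptions = {
  maxTokens: number;
  softThreshold: number;
  hardThreshold: number;
  responseReserve: number;
};

const defaults: ContextOptions = {
  maxTokens: 128_000,
  softThreshold: 0.75,
  hardThreshold: 0.9,
  responseReserve: 4096,
};

// Default-style estimator: roughly 4 characters per token.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Hypothetical helper: the budget usable for input, leaving room
// for the model's response.
const usableBudget = (o: ContextOptions): number =>
  o.maxTokens - o.responseReserve;

console.log(estimateTokens('hello world')); // 3 (11 chars / 4, rounded up)
console.log(usableBudget(defaults)); // 123904
```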
## How It Works

The context manager uses a two-threshold system:

```
0%                    75%              90%           100%
├─────── Normal ──────┼── Summarize ───┼── Compact ───┤
                     soft             hard
```

- Below the soft threshold (75%): normal operation, no action needed
- Between the soft and hard thresholds (75–90%): `shouldSummarize()` returns `true`; trigger background summarization
- Above the hard threshold (90%): `shouldCompactify()` returns `true`; block and compactify before the next LLM call
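The decision logic above reduces to a pure function of the usage ratio. This `contextAction` helper is a hypothetical name for illustration; the SDK itself exposes `shouldSummarize()` and `shouldCompactify()` instead.

```typescript
// Map a usage ratio (0 to 1) to the action described above.
// Hypothetical helper; mirrors the documented thresholds.
function contextAction(
  usageRatio: number,
): 'normal' | 'summarize' | 'compactify' {
  if (usageRatio >= 0.9) return 'compactify'; // blocking, before the next LLM call
  if (usageRatio >= 0.75) return 'summarize'; // background
  return 'normal';
}

console.log(contextAction(0.5)); // 'normal'
console.log(contextAction(0.8)); // 'summarize'
console.log(contextAction(0.95)); // 'compactify'
```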
When compactifying, the manager:

- Keeps the last `minRecentMessages` messages intact
- Summarizes all older messages into a single summary message
- Inserts the summary as an assistant message with `type: "context-summary"` metadata
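The shape of the resulting message list can be pictured with a minimal sketch. `compactSketch` and the `Msg` type are hypothetical; the real manager produces the summary text with `summaryModel` rather than taking it as an argument.

```typescript
type Msg = {
  role: 'user' | 'assistant';
  content: string;
  metadata?: { type: string };
};

// Minimal sketch of the compaction shape: one summary message
// followed by the untouched recent tail.
function compactSketch(
  messages: Msg[],
  summary: string,
  minRecentMessages = 4,
): Msg[] {
  if (messages.length <= minRecentMessages) return messages;
  const recent = messages.slice(-minRecentMessages);
  const summaryMsg: Msg = {
    role: 'assistant',
    content: summary,
    metadata: { type: 'context-summary' },
  };
  return [summaryMsg, ...recent];
}

const history: Msg[] = Array.from({ length: 10 }, (_, i) => ({
  role: i % 2 === 0 ? 'user' : 'assistant',
  content: `message ${i}`,
}));
const compacted = compactSketch(history, 'Earlier: the user asked about X.');
console.log(compacted.length); // 5 (1 summary + 4 recent)
```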
## ContextManager Interface

### getState()

Returns the current context state:

```ts
const state = contextManager.getState();
// {
//   totalInputTokens: number,
//   totalOutputTokens: number,
//   estimatedContextSize: number,
//   maxTokens: number,
//   usageRatio: number, // 0–1
//   softThresholdExceeded: boolean,
//   hardThresholdExceeded: boolean,
//   summaryCount: number,
// }
```

### recordUsage(usage)
Record token usage from an LLM step. Call it from `onStepFinish`:
```ts
contextManager.recordUsage({
  inputTokens: step.usage.inputTokens,
  outputTokens: step.usage.outputTokens,
});
```

### shouldSummarize(messages) / shouldCompactify(messages)
Check whether the conversation needs summarization or compactification:
```ts
if (contextManager.shouldCompactify(messages)) {
  messages = await contextManager.compactify(messages);
}
```

### compactify(messages)
Summarize older messages and return a compacted message list:
```ts
const compacted = await contextManager.compactify(messages);
```

### prepareMessages(messages, systemPrompt)
Prepares messages for the LLM call — finds the last summary message, drops everything before it, and injects the summary into the system prompt:
```ts
const { messages: prepared, systemPrompt: updatedPrompt } =
  await contextManager.prepareMessages(messages, systemPrompt);
```
### getMessageMetadata()

Returns metadata to attach to streamed responses (cumulative token counts and usage ratio):
```ts
return result.toUIMessageStreamResponse({
  messageMetadata: contextManager.getMessageMetadata(),
});
```

### reset()
Reset internal state (e.g., when starting a new chat):
```ts
contextManager.reset();
```

## Full Integration Example
```ts
import { streamText, type UIMessage } from 'ai';
import { createContextManager } from '@supyagent/sdk/context';
import { anthropic } from '@ai-sdk/anthropic';

const contextManager = createContextManager({
  maxTokens: 200_000,
  summaryModel: anthropic('claude-haiku-4-5-20251001'),
});

export async function POST(req: Request) {
  let { messages }: { messages: UIMessage[] } = await req.json();

  // Compactify if the context is too large
  if (contextManager.shouldCompactify(messages)) {
    messages = await contextManager.compactify(messages);
  }

  const result = streamText({
    model: anthropic('claude-sonnet-4-6-20250620'),
    system: 'You are a helpful assistant.',
    messages,
    // tools,  // your tool definitions, if any
    onStepFinish: ({ usage }) => {
      contextManager.recordUsage({
        inputTokens: usage.inputTokens,
        outputTokens: usage.outputTokens,
      });
    },
  });

  return result.toUIMessageStreamResponse({
    messageMetadata: contextManager.getMessageMetadata(),
  });
}
```

## React UI Components
The `@supyagent/sdk/react` package includes components for displaying context state in the UI:

- `ContextIndicator` — Shows a visual indicator of context usage (percentage bar)
- `SummaryMessage` — Renders summary messages with metadata (messages summarized, token savings)
- `isContextSummary(message)` — Check if a message is a context summary
See React Components for details.
## What's Next
- React Components — UI components for context indicators and tool rendering
- Full API Route Example — See context management in a complete endpoint