
Context Management

Token tracking, automatic summarization, and context compactification with @supyagent/sdk/context.

The @supyagent/sdk/context sub-package provides a ContextManager that tracks token usage across LLM calls and automatically summarizes older messages when the context window gets full.

Setup

import { createContextManager } from '@supyagent/sdk/context';
import { anthropic } from '@ai-sdk/anthropic';

const contextManager = createContextManager({
  maxTokens: 128_000,
  summaryModel: anthropic('claude-haiku-4-5-20251001'),
});

Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| maxTokens | number | 128000 | Model's context window size |
| softThreshold | number | 0.75 | Fraction of maxTokens; crossing it triggers background summarization |
| hardThreshold | number | 0.90 | Fraction of maxTokens; crossing it triggers blocking compactification |
| responseReserve | number | 4096 | Tokens reserved for the model's response |
| minRecentMessages | number | 4 | Messages to always keep (never summarized) |
| summaryModel | AI SDK model | | Model used for summarization (required unless compactify is provided) |
| summaryPrompt | string | | Custom prompt for the summarizer |
| compactify | function | | Custom compactification function (replaces the default summarizer) |
| estimateTokens | function | ~4 chars/token | Custom token estimation function |
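The default estimateTokens heuristic is roughly four characters per token. If your transcripts skew toward code or non-English text, you can supply your own estimator; a minimal character-based sketch (illustrative only, not the SDK's actual implementation):

```typescript
// Rough character-based estimate (~4 chars/token). Illustrative only;
// the SDK's built-in default may differ in detail.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Override the default:
// createContextManager({ maxTokens: 128_000, estimateTokens, ... });
```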

How It Works

The context manager uses a two-threshold system:

0%                    75%              90%            100%
├─────── Normal ──────┼── Summarize ───┼── Compact ───┤
                     soft            hard
  1. Below soft threshold (75%): normal operation, no action needed
  2. Between soft and hard (75-90%): shouldSummarize() returns true; trigger background summarization
  3. Above hard threshold (90%): shouldCompactify() returns true; block and compactify before the next LLM call
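The three zones can be expressed as a pure function. This is a hypothetical helper for illustration, not part of the SDK; the defaults match the options table above:

```typescript
type ContextZone = 'normal' | 'summarize' | 'compact';

// Classify token usage against the soft and hard thresholds.
function classifyUsage(
  usedTokens: number,
  maxTokens: number,
  softThreshold = 0.75,
  hardThreshold = 0.9,
): ContextZone {
  const ratio = usedTokens / maxTokens;
  if (ratio >= hardThreshold) return 'compact';
  if (ratio >= softThreshold) return 'summarize';
  return 'normal';
}
```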

When compactifying, the manager:

  • Keeps the last minRecentMessages messages intact
  • Summarizes all older messages into a single summary message
  • Inserts the summary as an assistant message with type: "context-summary" metadata
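In shape, the result looks like this. The sketch below takes the summary text as a given (the real manager generates it with summaryModel) and uses a simplified message type, so treat it as an illustration of the list transformation only:

```typescript
interface Msg {
  role: 'user' | 'assistant' | 'system';
  content: string;
  metadata?: { type?: string };
}

// Collapse everything older than the last `minRecentMessages` messages
// into a single assistant summary message.
function compactShape(messages: Msg[], summary: string, minRecentMessages = 4): Msg[] {
  if (messages.length <= minRecentMessages) return messages;
  const summaryMessage: Msg = {
    role: 'assistant',
    content: summary,
    metadata: { type: 'context-summary' },
  };
  return [summaryMessage, ...messages.slice(-minRecentMessages)];
}
```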

ContextManager Interface

getState()

Returns the current context state:

const state = contextManager.getState();
// {
//   totalInputTokens: number,
//   totalOutputTokens: number,
//   estimatedContextSize: number,
//   maxTokens: number,
//   usageRatio: number,        // 0–1
//   softThresholdExceeded: boolean,
//   hardThresholdExceeded: boolean,
//   summaryCount: number,
// }

recordUsage(usage)

Record token usage from an LLM step. Call from onStepFinish:

contextManager.recordUsage({
  inputTokens: step.usage.inputTokens,
  outputTokens: step.usage.outputTokens,
});

shouldSummarize(messages) / shouldCompactify(messages)

Check whether the conversation needs compactification:

if (contextManager.shouldCompactify(messages)) {
  messages = await contextManager.compactify(messages);
}
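Compactification blocks the request, but summarization between the soft and hard thresholds can run in the background. A common pattern is a single-flight guard so only one background summarization is in flight at a time; a self-contained sketch (the summarize callback stands in for whatever background work you wire up, e.g. a call to contextManager.compactify):

```typescript
// Fire-and-forget with a single-flight guard: repeated triggers while a
// summarization is already running are ignored instead of piling up.
function createBackgroundSummarizer(summarize: () => Promise<void>) {
  let inFlight = false;
  return (): void => {
    if (inFlight) return;
    inFlight = true;
    summarize().finally(() => {
      inFlight = false;
    });
  };
}
```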

compactify(messages)

Summarize older messages and return a compacted message list:

const compacted = await contextManager.compactify(messages);

prepareMessages(messages, systemPrompt)

Prepares messages for the LLM call — finds the last summary message, drops everything before it, and injects the summary into the system prompt:

const { messages: prepared, systemPrompt: updatedPrompt } =
  await contextManager.prepareMessages(messages, systemPrompt);
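The underlying transformation is straightforward: find the most recent summary, drop the messages it replaced, and fold its text into the system prompt. A self-contained sketch of that logic (illustrative, not the SDK source; whether the summary message itself is retained is an implementation detail, and this version removes it since its content moves into the prompt):

```typescript
interface Msg {
  role: 'user' | 'assistant' | 'system';
  content: string;
  metadata?: { type?: string };
}

function prepare(messages: Msg[], systemPrompt: string) {
  // Index of the most recent context-summary message, if any.
  const idx = messages.map((m) => m.metadata?.type).lastIndexOf('context-summary');
  if (idx === -1) return { messages, systemPrompt };
  return {
    messages: messages.slice(idx + 1),
    systemPrompt: `${systemPrompt}\n\nSummary of earlier conversation:\n${messages[idx].content}`,
  };
}
```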

getMessageMetadata()

Returns metadata to attach to streamed responses (cumulative token counts and usage ratio):

return result.toUIMessageStreamResponse({
  messageMetadata: contextManager.getMessageMetadata(),
});

reset()

Reset internal state (e.g., when starting a new chat):

contextManager.reset();

Full Integration Example

app/api/chat/route.ts
import { streamText, type UIMessage } from 'ai';
import { createContextManager } from '@supyagent/sdk/context';
import { anthropic } from '@ai-sdk/anthropic';

const contextManager = createContextManager({
  maxTokens: 200_000,
  summaryModel: anthropic('claude-haiku-4-5-20251001'),
});

export async function POST(req: Request) {
  let { messages }: { messages: UIMessage[] } = await req.json();

  // Compactify if context is too large
  if (contextManager.shouldCompactify(messages)) {
    messages = await contextManager.compactify(messages);
  }

  const result = streamText({
    model: anthropic('claude-sonnet-4-6-20250620'),
    system: 'You are a helpful assistant.',
    messages,
    onStepFinish: ({ usage }) => {
      contextManager.recordUsage({
        inputTokens: usage.inputTokens,
        outputTokens: usage.outputTokens,
      });
    },
  });

  return result.toUIMessageStreamResponse({
    messageMetadata: contextManager.getMessageMetadata(),
  });
}
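One caveat about the example above: the manager is created at module scope, so its token counts are shared by every request and every conversation hitting this route. For multi-user deployments you would typically keep one manager per conversation; a minimal in-memory registry sketch (hypothetical helper; in production you would likely persist this in a session store instead):

```typescript
// Lazily create and cache one value per conversation id.
// In practice the factory would be () => createContextManager({ ... }).
function createRegistry<T>(factory: () => T) {
  const managers = new Map<string, T>();
  return (conversationId: string): T => {
    let manager = managers.get(conversationId);
    if (!manager) {
      manager = factory();
      managers.set(conversationId, manager);
    }
    return manager;
  };
}
```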

React UI Components

The @supyagent/sdk/react package includes components for displaying context state in the UI:

  • ContextIndicator — Shows a visual indicator of context usage (percentage bar)
  • SummaryMessage — Renders summary messages with metadata (messages summarized, token savings)
  • isContextSummary(message) — Check if a message is a context summary

See React Components for details.

What's Next