Data Pipeline

Build an execution mode agent for batch data processing using supyagent batch.

This example builds an execution mode agent for batch data processing. Execution agents take a single input, process it, and return a structured output, with no interactive conversation. Combined with supyagent batch, they can process hundreds of items from JSONL or CSV files.

The Agent YAML

agents/data-processor.yaml
name: data-processor
description: Processes structured data inputs and returns formatted outputs
version: "1.0"
type: execution

model:
  provider: anthropic/claude-sonnet-4-5-20250929
  temperature: 0.1              # Very low temperature for consistent output
  max_tokens: 2048

system_prompt: |
  You are a data processing agent. You receive structured input,
  process it according to the instructions, and return structured output.

  Rules:
  - Process the input exactly as specified
  - Return only the result, no conversation
  - If the input is malformed, return an error description
  - Be consistent -- same input should produce same output

tools:
  allow: []                     # No tools needed for pure data transformation

will_create_tools: false

limits:
  max_tool_calls_per_turn: 0

Input/Output Formats

Single Task

# String input
supyagent run data-processor "Classify this text: The product arrived damaged"

# JSON input
supyagent run data-processor '{"text": "The product arrived damaged", "categories": ["positive", "negative", "neutral"]}'

# JSON output
supyagent run data-processor '{"text": "Great service!"}' --output json

Output with --output json:

{
  "ok": true,
  "data": "Category: positive\nConfidence: 0.95\nReason: Expression of satisfaction with service"
}
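
The envelope above can be consumed from a script. The following is a minimal sketch, assuming the "data" field holds free-form "Key: value" lines as in the sample output; the data-processor prompt does not fix a reply format, so this layout is not guaranteed. The helper name parse_envelope is hypothetical and not part of supyagent.

```python
import json


def parse_envelope(raw: str) -> dict:
    """Parse the {"ok": ..., "data": ...} envelope and split "Key: value" lines."""
    envelope = json.loads(raw)
    if not envelope.get("ok"):
        raise ValueError(f"agent reported failure: {envelope!r}")
    fields = {}
    for line in envelope["data"].splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore lines that are not in "Key: value" form
            fields[key.strip().lower()] = value.strip()
    return fields


# The sample output shown above, as it would arrive on stdout.
raw = r'{"ok": true, "data": "Category: positive\nConfidence: 0.95\nReason: Expression of satisfaction with service"}'
result = parse_envelope(raw)
print(result["category"], float(result["confidence"]))
```

For stricter parsing, an agent whose system prompt requires a JSON reply (such as the classifier later on this page) is likely a better fit than splitting text lines.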

Batch Processing from JSONL

Create an input file with one JSON object per line:

inputs.jsonl
{"text": "The product arrived damaged and customer service was unhelpful", "task": "classify_sentiment"}
{"text": "Excellent quality, fast shipping, would buy again!", "task": "classify_sentiment"}
{"text": "Average product, nothing special but works as described", "task": "classify_sentiment"}
{"text": "Terrible experience. Requesting a full refund.", "task": "classify_sentiment"}
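
Input files like this are usually generated rather than typed by hand. A minimal sketch in Python, using only the standard library; the helper names to_jsonl and write_jsonl are illustrative and not part of supyagent:

```python
import json
from pathlib import Path


def to_jsonl(records):
    """Serialize records as JSONL: one JSON object per line."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


def write_jsonl(path, records):
    """Write the records to a JSONL file."""
    Path(path).write_text(to_jsonl(records), encoding="utf-8")


reviews = [
    "The product arrived damaged and customer service was unhelpful",
    "Excellent quality, fast shipping, would buy again!",
]
records = [{"text": text, "task": "classify_sentiment"} for text in reviews]
print(to_jsonl(records), end="")

# To create the input file:
# write_jsonl("inputs.jsonl", records)
```

Serializing with json.dumps keeps each object on a single line, because newlines and quotes inside the text are escaped.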

Run the batch:

supyagent batch data-processor inputs.jsonl --output results.jsonl

Output:

-- Task 1/4: {"text": "The product arrived damaged and customer se... --
  done
-- Task 2/4: {"text": "Excellent quality, fast shipping, would bu... --
  done
-- Task 3/4: {"text": "Average product, nothing special but works... --
  done
-- Task 4/4: {"text": "Terrible experience. Requesting a full ref... --
  done

Processed 4 items (4 succeeded, 0 failed)
  Results written to results.jsonl

Batch Processing from CSV

reviews.csv
text,product_id
"Great product, love it!",SKU-001
"Broken on arrival",SKU-002
"Works fine but overpriced",SKU-003

Run the batch:

supyagent batch data-processor reviews.csv --format csv --output results.jsonl
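
This page does not show how a CSV row is presented to the agent. As an illustration only, the sketch below assumes the header row supplies the keys, so each row becomes an object comparable to one JSONL line. It also shows why the first review must be quoted: it contains a comma.

```python
import csv
import io
import json

CSV_TEXT = '''text,product_id
"Great product, love it!",SKU-001
"Broken on arrival",SKU-002
"Works fine but overpriced",SKU-003
'''


def csv_to_records(csv_text):
    """Turn each CSV row into a dict keyed by the header row (assumed mapping)."""
    return list(csv.DictReader(io.StringIO(csv_text)))


rows = csv_to_records(CSV_TEXT)
for row in rows:
    print(json.dumps(row))
```

The same function can be used to convert a CSV file to JSONL ahead of time if explicit control over the per-item input is needed.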

Specialized Data Processors

Text Classifier

agents/classifier.yaml
name: classifier
description: Classifies text into predefined categories
version: "1.0"
type: execution

model:
  provider: anthropic/claude-sonnet-4-5-20250929
  temperature: 0.0

system_prompt: |
  You are a text classifier. Given a text input, classify it into
  exactly one of the provided categories.

  Input format: {"text": "...", "categories": ["cat1", "cat2", ...]}

  Output format (return exactly this JSON):
  {"category": "chosen_category", "confidence": 0.0-1.0, "reasoning": "brief explanation"}

  Rules:
  - Choose exactly one category
  - Confidence should reflect how clearly the text fits the category
  - If none fit well, choose the closest match with low confidence
  - Return ONLY the JSON, no other text

tools:
  allow: []

Usage:

supyagent run classifier '{"text": "I need to cancel my subscription", "categories": ["billing", "technical", "product", "general"]}' --output json --quiet
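
Because the model is only instructed to follow the output format, it is worth validating replies before using them downstream. A minimal sketch, assuming the reply text has already been extracted (for example from the "data" field of the JSON envelope); validate_classification is a hypothetical helper and the sample reply is invented for illustration:

```python
import json


def validate_classification(raw_output, categories):
    """Check a classifier reply against the output format in the system prompt."""
    result = json.loads(raw_output)
    if not isinstance(result, dict):
        raise ValueError("reply must be a JSON object")
    missing = {"category", "confidence", "reasoning"} - result.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if result["category"] not in categories:
        raise ValueError(f"unknown category: {result['category']!r}")
    confidence = result["confidence"]
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return result


categories = ["billing", "technical", "product", "general"]
# Hypothetical reply, shaped like the format the system prompt asks for.
reply = '{"category": "billing", "confidence": 0.9, "reasoning": "Cancellation request"}'
checked = validate_classification(reply, categories)
print(checked["category"])
```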

Data Extractor

agents/extractor.yaml
name: extractor
description: Extracts structured data from unstructured text
version: "1.0"
type: execution

model:
  provider: anthropic/claude-sonnet-4-5-20250929
  temperature: 0.0

system_prompt: |
  You are a data extraction agent. Given unstructured text, extract
  the requested fields into structured JSON.

  Input format: {"text": "...", "fields": ["field1", "field2", ...]}

  Return a JSON object with the requested fields. Use null for fields
  that cannot be extracted. Return ONLY the JSON.

tools:
  allow: []
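
The input and output contract described in the extractor prompt can be sketched in Python as follows. The helper names, the sample text, and the sample reply are all hypothetical; the only parts taken from this page are the {"text", "fields"} input shape and the rule that fields which cannot be extracted are null.

```python
import json


def build_request(text, fields):
    """Build the JSON input the extractor prompt describes."""
    return json.dumps({"text": text, "fields": fields})


def normalize_extraction(raw_output, fields):
    """Keep only the requested fields, using None where a field is absent."""
    extracted = json.loads(raw_output)
    return {field: extracted.get(field) for field in fields}


fields = ["order_id", "customer", "date", "total"]
request = build_request("Order 1234 shipped to Jane Doe on 2024-03-01", fields)
print(request)

# Invented reply for illustration; a real reply comes from the model.
reply = '{"order_id": "1234", "customer": "Jane Doe", "date": "2024-03-01"}'
record = normalize_extraction(reply, fields)
print(record)
```

The request string can be passed as the task argument in the same way as the JSON input shown for supyagent run earlier on this page.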

Summarizer

agents/summarizer.yaml
name: summarizer
description: Produces concise summaries of input text
version: "1.0"
type: execution

model:
  provider: anthropic/claude-sonnet-4-5-20250929
  temperature: 0.3
  max_tokens: 1024

system_prompt: |
  Produce a concise summary of the input text. The summary should:
  - Be 2-3 sentences for short inputs, up to a paragraph for long inputs
  - Capture the key points and main conclusion
  - Preserve important numbers, names, and dates
  - Be written in the same language as the input

tools:
  allow: []

Usage:

# Single file
supyagent run summarizer --input article.txt

# Pipe from another command
curl -s https://example.com/article | supyagent run summarizer

# Batch processing
supyagent batch summarizer documents.jsonl --output summaries.jsonl

Pipeline Patterns

Chained Processing

Process data through multiple agents in sequence:

# Step 1: Extract data
supyagent batch extractor raw_data.jsonl --output extracted.jsonl

# Step 2: Classify extracted data
supyagent batch classifier extracted.jsonl --output classified.jsonl

# Step 3: Summarize each classified record
supyagent batch summarizer classified.jsonl --output summaries.jsonl
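
The three steps above can also be driven from a script that stops at the first failing stage. This is a sketch under one assumption that this page does not confirm: that supyagent batch exits with a non-zero status when a stage fails. The command lines themselves are exactly those shown above; the helper names are illustrative.

```python
import shlex
import subprocess

# (agent, input file, output file) for each stage, in order.
STAGES = [
    ("extractor", "raw_data.jsonl", "extracted.jsonl"),
    ("classifier", "extracted.jsonl", "classified.jsonl"),
    ("summarizer", "classified.jsonl", "summaries.jsonl"),
]


def build_command(agent, src, dst):
    """Build the argv list for one batch stage."""
    return ["supyagent", "batch", agent, src, "--output", dst]


def check_chain(stages):
    """Verify that each stage reads the file the previous stage wrote."""
    for (_, _, prev_out), (agent, src, _) in zip(stages, stages[1:]):
        if src != prev_out:
            raise ValueError(f"{agent} reads {src!r}, expected {prev_out!r}")


def run_chain(stages, runner=subprocess.run):
    """Run the stages in order; check=True raises on a non-zero exit status."""
    check_chain(stages)
    for agent, src, dst in stages:
        runner(build_command(agent, src, dst), check=True)


# Print the commands without executing them.
for stage in STAGES:
    print(shlex.join(build_command(*stage)))

# To execute the pipeline:
# run_chain(STAGES)
```

Whether the output of one stage is directly usable as the input of the next depends on the result file format, so it is worth inspecting extracted.jsonl before chaining.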

Shell Pipeline

cat raw_text.txt | supyagent run extractor --quiet | supyagent run classifier --quiet

With Secrets

Pass API keys for tools that need external access:

supyagent batch api-caller inputs.jsonl \
  --secrets API_KEY=sk-xxx \
  --secrets .env \
  --output results.jsonl

Orchestrated Workflows

For complex multi-step pipelines, use supyagent orchestrate:

workflows/process-reviews.yaml
name: process-reviews
steps:
  - agent: extractor
    task: "Extract product name, rating, and key issues from: {{input}}"
    output: extracted_data

  - agent: classifier
    task: "Classify the sentiment: {{extracted_data}}"
    depends_on: [extracted_data]
    output: classification

  - agent: summarizer
    task: "Summarize findings: {{extracted_data}} with classification {{classification}}"
    depends_on: [extracted_data, classification]

Run the workflow:

supyagent orchestrate workflows/process-reviews.yaml
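
To reason about the order that the depends_on entries imply, the workflow can be modeled as a dependency graph. The sketch below is not a description of how supyagent orchestrate resolves steps internally; it assumes each depends_on entry names the output of an earlier step, as in the workflow file, and uses Python's standard graphlib module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# The workflow above, transcribed by hand as plain data.
STEPS = [
    {"agent": "extractor", "output": "extracted_data", "depends_on": []},
    {"agent": "classifier", "output": "classification", "depends_on": ["extracted_data"]},
    {"agent": "summarizer", "output": None, "depends_on": ["extracted_data", "classification"]},
]


def execution_order(steps):
    """Return agent names in an order that satisfies every depends_on entry."""
    # Map each named output to the agent that produces it.
    producers = {s["output"]: s["agent"] for s in steps if s["output"]}
    # For each agent, collect the agents it must wait for.
    # An unknown output name raises KeyError here.
    graph = {s["agent"]: {producers[d] for d in s["depends_on"]} for s in steps}
    return list(TopologicalSorter(graph).static_order())


order = execution_order(STEPS)
print(order)
```

A cycle in the dependencies makes static_order raise graphlib.CycleError, which is a convenient way to catch a misconfigured workflow before running it.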

Performance Tips

  • Use temperature: 0.0 for maximum consistency in batch processing
  • Set max_tokens to the minimum needed for your output format
  • Use the --quiet flag to suppress status messages when piping output
  • For large batches, monitor progress on stderr while results go to stdout
  • The --output json flag ensures machine-parseable output