Web Tool
Reference for the web default tool: fetch_url for page fetching and http_request for full HTTP control.
Script: powers/web.py
Dependencies: pydantic, httpx, markdownify, beautifulsoup4
Fetch URLs and make HTTP requests. Converts HTML pages to clean markdown for easy consumption by agents. No browser needed -- uses HTTP directly.
For JavaScript-rendered pages (SPAs), use the browser tool instead.
fetch_url
Fetch a URL and return its content as clean markdown. Strips navigation, scripts, styles, and other noise. Follows redirects automatically.
Input
| Field | Type | Default | Description |
|---|---|---|---|
| url | str | required | The URL to fetch |
| include_links | bool | true | Include hyperlinks in the markdown output |
| max_length | int | 50000 | Maximum characters to return (truncates if longer) |
| timeout | int | 30 | Request timeout in seconds |
| headers | dict | null | Additional HTTP headers to send |
Output
| Field | Type | Description |
|---|---|---|
| ok | bool | true if fetch succeeded |
| content | str | Page content as clean markdown |
| title | str | Page title (from <title> tag) |
| url | str | Final URL after redirects |
| status_code | int | HTTP status code |
| content_type | str | Response content type |
| truncated | bool | true if content was truncated to max_length |
| error | str | Error message on failure |
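The relationship between max_length, content, and truncated can be sketched as follows. This is only an illustration, assuming truncation is a plain character slice; the actual logic in powers/web.py may differ (for example, by cutting at a word boundary), and truncate_content is a hypothetical helper name.

```python
def truncate_content(content: str, max_length: int = 50000) -> tuple[str, bool]:
    """Return (content, truncated), keeping at most max_length characters."""
    if len(content) <= max_length:
        return content, False
    # Assumption: a simple slice; the real tool may trim differently.
    return content[:max_length], True


text, truncated = truncate_content("a" * 60000)
print(len(text), truncated)
```

A caller that sees truncated set to true can retry with a larger max_length if the missing tail matters.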
Examples
// Fetch documentation
{"url": "https://docs.python.org/3/library/pathlib.html"}
// Fetch without links (cleaner text)
{"url": "https://example.com", "include_links": false}
// Custom timeout for slow sites
{"url": "https://slow-api.example.com/data", "timeout": 60}
// With custom headers
{
  "url": "https://api.example.com/page",
  "headers": {"Authorization": "Bearer token123"}
}
Content Type Handling
| Content Type | Behavior |
|---|---|
| text/html | Converted to clean markdown (scripts, styles, nav, footer removed) |
| application/json | Pretty-printed JSON |
| text/plain | Returned as-is |
| Binary types | Returns description: [Binary content: image/png, 45032 bytes] |
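The table above amounts to a dispatch on the response's media type. The sketch below shows one way such a dispatch could look, using only the standard library. render_body is a hypothetical name, the HTML-to-markdown step is left out, and the fallback for unparseable JSON is an assumption rather than documented behavior.

```python
import json


def render_body(content_type: str, raw: bytes) -> str:
    """Hypothetical dispatcher mirroring the content-type table."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()

    if media_type == "application/json":
        text = raw.decode("utf-8", errors="replace")
        try:
            return json.dumps(json.loads(text), indent=2)
        except json.JSONDecodeError:
            # Assumption: fall back to the raw text if the JSON is invalid.
            return text
    if media_type == "text/plain":
        return raw.decode("utf-8", errors="replace")
    if media_type == "text/html":
        # The real tool converts HTML to markdown here; omitted in this sketch.
        return raw.decode("utf-8", errors="replace")
    # Everything else is summarized instead of returned.
    return f"[Binary content: {media_type}, {len(raw)} bytes]"


print(render_body("image/png", b"\x00" * 45032))
```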
HTML Cleanup
The following tags are removed from HTML before conversion to markdown:
script, style, nav, footer, header, aside, noscript, iframe, svg
A browser-like User-Agent header is sent by default to avoid being blocked.
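To illustrate the tag removal step, here is a dependency-free sketch built on the standard library's html.parser. The tool itself lists beautifulsoup4 and markdownify as dependencies, so its real implementation presumably looks different. NoiseStripper and strip_noise are hypothetical names, and this version extracts plain text only rather than producing markdown.

```python
from html.parser import HTMLParser

# Tags listed above; everything nested inside them is discarded.
REMOVED_TAGS = {"script", "style", "nav", "footer", "header",
                "aside", "noscript", "iframe", "svg"}


class NoiseStripper(HTMLParser):
    """Collect text that is not nested inside any removed tag."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0  # number of removed tags currently open
        self._chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in REMOVED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in REMOVED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return "\n".join(self._chunks)


def strip_noise(html: str) -> str:
    parser = NoiseStripper()
    parser.feed(html)
    parser.close()
    return parser.text()


page = "<nav>Home</nav><h1>Title</h1><script>track()</script><p>Body text</p>"
print(strip_noise(page))
```

The depth counter is what makes nested markup inside a removed tag (a paragraph inside a header, say) disappear along with its parent.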
http_request
Make an HTTP request with full control over method, headers, and body. Use this for API calls, webhooks, and any HTTP interaction beyond simple page fetching.
Input
| Field | Type | Default | Description |
|---|---|---|---|
| method | str | "GET" | HTTP method: GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS |
| url | str | required | The URL to request |
| headers | dict | null | HTTP headers |
| body | str | null | Request body (string). For JSON, pass a JSON string |
| json_body | dict | null | Request body as JSON object (sets Content-Type automatically) |
| timeout | int | 30 | Request timeout in seconds |
| follow_redirects | bool | true | Follow HTTP redirects |
Output
| Field | Type | Description |
|---|---|---|
| ok | bool | true if request completed (regardless of HTTP status) |
| status_code | int | HTTP status code |
| headers | dict | Response headers |
| body | str | Response body (truncated at 100KB) |
| url | str | Final URL after redirects |
| error | str | Error message on connection failure |
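Because ok only reports that the request completed, a caller usually has to look at status_code as well. The sketch below shows one way to separate transport failures from HTTP-level errors. classify_result is a hypothetical helper, and the dict shape is taken from the output table above.

```python
def classify_result(result: dict) -> str:
    """Classify an http_request result dict."""
    if not result.get("ok"):
        # Connection-level failure: DNS error, refused connection, timeout.
        return f"transport error: {result.get('error')}"
    status = result["status_code"]
    if 200 <= status < 300:
        return "success"
    if 300 <= status < 400:
        return "redirect"
    # ok is still true here: the server answered, just not with a 2xx.
    return f"http error {status}"


print(classify_result({"ok": True, "status_code": 404, "body": "Not Found"}))
print(classify_result({"ok": False, "error": "connection refused"}))
```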
Examples
// Simple GET request
{"url": "https://api.github.com/repos/python/cpython"}
// POST with JSON body
{
  "method": "POST",
  "url": "https://httpbin.org/post",
  "json_body": {"key": "value", "count": 42}
}
// POST with form data
{
  "method": "POST",
  "url": "https://httpbin.org/post",
  "body": "username=alice&password=secret",
  "headers": {"Content-Type": "application/x-www-form-urlencoded"}
}
// DELETE with auth header
{
  "method": "DELETE",
  "url": "https://api.example.com/items/1",
  "headers": {"Authorization": "Bearer token123"}
}
// HEAD request (check if URL exists)
{"method": "HEAD", "url": "https://example.com/large-file.zip"}
// Don't follow redirects
{
  "url": "https://bit.ly/example",
  "follow_redirects": false
}
Notes on body vs json_body
- Use json_body when sending JSON -- it automatically sets Content-Type: application/json
- Use body for other content types (form data, XML, raw text)
- If both are provided, json_body takes precedence
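The precedence rule can be sketched as a small helper. resolve_body is a hypothetical name, and the choice to overwrite a caller-supplied Content-Type when json_body is used is an assumption; the tool may instead respect an explicit header.

```python
import json
from typing import Optional


def resolve_body(
    body: Optional[str] = None,
    json_body: Optional[dict] = None,
    headers: Optional[dict] = None,
) -> tuple[Optional[bytes], dict]:
    """Pick the request payload and headers from body / json_body."""
    out_headers = dict(headers or {})
    if json_body is not None:
        # json_body wins when both are given, and sets the content type.
        out_headers["Content-Type"] = "application/json"
        return json.dumps(json_body).encode("utf-8"), out_headers
    if body is not None:
        # body is sent as-is; the caller supplies Content-Type if needed.
        return body.encode("utf-8"), out_headers
    return None, out_headers


payload, hdrs = resolve_body(body="x=1", json_body={"a": 1})
print(payload, hdrs)
```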
Tips
- Use fetch_url for reading web pages -- it handles HTML-to-markdown conversion and noise removal
- Use http_request for API calls -- it gives you full control over method, headers, and body
- The response body is truncated at 100KB for http_request to prevent memory issues
- Both functions follow redirects by default
- Connection errors (DNS failure, refused connections) set ok: false with a descriptive error
- Timeout errors are reported clearly with the timeout duration