ToolEnvironment base class.
ToolEnvironment
ToolEnvironment extends MultiTurnEnvironment with built-in tool calling support. You define tools as Python functions, and the class handles schema generation, prompt injection, tool call parsing, execution, and result formatting.
Defining tools
Tools are regular Python functions with type hints and docstrings. func_to_tool_schema() generates a schema from each function: the type hints map to JSON Schema types (float → number, str → string, int → integer, bool → boolean), and the docstring becomes the tool description.
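To make the type-hint mapping concrete, here is a minimal sketch of how a schema could be derived from a function. `sketch_tool_schema` is a hypothetical stand-in written for this example, not Telescope's `func_to_tool_schema()`, and the output layout (`name`, `description`, `parameters`) is an assumption modeled on common tool-calling schemas.

```python
import inspect
from typing import Any, Callable, get_type_hints

# Type-hint to JSON Schema mapping described above.
_JSON_TYPES = {float: "number", str: "string", int: "integer", bool: "boolean"}


def sketch_tool_schema(func: Callable[..., Any]) -> dict:
    """Hypothetical stand-in for func_to_tool_schema(); not Telescope's code."""
    hints = get_type_hints(func)
    hints.pop("return", None)  # only parameters belong in the schema
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing hints fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def add(a: float, b: float) -> float:
    """Add two numbers and return the sum."""
    return a + b


schema = sketch_tool_schema(add)
```

Here `schema["parameters"]["properties"]["a"]` is `{"type": "number"}`, and the docstring of `add` becomes the description.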
How tool calls are processed
Each turn follows this cycle:
- Parse: The model’s completion is scanned for tool calls. By default, XML format is used with a JSON object inside the tags.
- Execute: Each parsed ToolCall is executed by calling the corresponding Python function with the parsed arguments.
- Format: Results are formatted as tool response messages and appended to the conversation.
- Check: is_final_answer() determines if the model is providing a final answer (no tool calls) or wants to continue using tools.
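The cycle above can be sketched end to end in plain Python. This is an illustration under stated assumptions, not the library's implementation: the `<tool_call>` and `<tool_response>` tag names, the `name`/`arguments` JSON keys, the `ToolCall` fields, and the `run_turn` helper are all hypothetical.

```python
import json
import re
from dataclasses import dataclass
from typing import Any, Callable

# Assumed tag name; the text only says XML tags wrapping a JSON object.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


@dataclass
class ToolCall:
    name: str
    arguments: dict[str, Any]


def parse_tool_calls(text: str) -> list[ToolCall]:
    """Parse: collect every well-formed tool call in the completion."""
    calls = []
    for body in TOOL_CALL_RE.findall(text):
        try:
            payload = json.loads(body)
        except json.JSONDecodeError:
            continue  # this sketch silently skips malformed calls
        if isinstance(payload, dict) and "name" in payload:
            calls.append(ToolCall(payload["name"], payload.get("arguments", {})))
    return calls


def execute_tool(call: ToolCall, tools: dict[str, Callable[..., Any]]) -> Any:
    """Execute: call the matching Python function with the parsed arguments."""
    return tools[call.name](**call.arguments)


def format_tool_result(result: Any) -> str:
    """Format: wrap the result in an (assumed) XML response tag."""
    return f"<tool_response>{json.dumps(result)}</tool_response>"


def run_turn(completion: str, tools: dict[str, Callable[..., Any]]):
    """Run one turn; returns (tool response messages, is_final)."""
    calls = parse_tool_calls(completion)
    messages = [
        {"role": "tool", "content": format_tool_result(execute_tool(c, tools))}
        for c in calls
    ]
    is_final = not calls  # Check: no tool calls means a final answer
    return messages, is_final


def add(a: int, b: int) -> int:
    return a + b


completion = 'Let me compute. <tool_call>{"name": "add", "arguments": {"a": 1, "b": 2}}</tool_call>'
messages, is_final = run_turn(completion, {"add": add})
```

A completion without any tool call yields an empty message list and `is_final == True`, which ends the loop.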
Building a ToolEnvironment
The minimal subclass needs load_dataset and compute_reward.
Override points
| Method | Default behavior | When to override |
|---|---|---|
| parse_tool_calls(text) | Parses XML-tagged tool calls | Custom tool call format |
| execute_tool(tool_call) | Calls the matching Python function | Tools need side effects, async I/O, or sandbox execution |
| format_tool_result(result) | Formats as XML tool response | Custom result formatting |
| is_final_answer(completion, state) | True if no tool calls found | Custom completion detection (e.g., <answer> tags) |
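As an example of the last row, the sketch below shows completion detection based on `<answer>` tags. It is written as standalone functions so it runs on its own; in practice the logic would live in an `is_final_answer` override on a ToolEnvironment subclass. The `state` type and the `extract_answer` helper are assumptions made for this example.

```python
import re

# Matches the first <answer>...</answer> pair, including multi-line answers.
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def is_final_answer(completion: str, state: dict) -> bool:
    """Treat the turn as final only when an <answer> tag pair is present."""
    return ANSWER_RE.search(completion) is not None


def extract_answer(completion: str):
    """Hypothetical helper: return the answer text, or None if absent."""
    match = ANSWER_RE.search(completion)
    return match.group(1).strip() if match else None
```

With this rule, a completion that contains neither tool calls nor an `<answer>` tag is not treated as final, so the environment can prompt the model to continue.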
Tool metrics
get_tool_metrics(state) returns a dict with usage stats from the trajectory. The returned stats feed into sample_metrics. See Metrics for details on how sample metrics are tracked and displayed in the UI.
Sandbox execution
For environments that need to execute code (not just call Python functions), Telescope provides a pluggable sandbox system, configured through SandboxConfig.
Supported providers
Telescope is agnostic to which sandbox provider is used: any provider that implements the SandboxProvider interface (create, execute, upload_bytes, upload_file, destroy) will work. For convenience, the following providers come pre-configured:
| Provider | Description | Credentials |
|---|---|---|
| prime | Prime infrastructure | PRIME_API_KEY env var or prime login |
| modal | Cloud-based sandboxes with fast cold starts | MODAL_TOKEN_ID env var or Modal SDK auth |
| daytona | Self-hosted sandbox environments | DAYTONA_API_KEY or DAYTONA_JWT_TOKEN env var |
| e2b | Cloud sandboxes for prototyping and development | E2B_API_KEY env var |
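To illustrate what implementing the SandboxProvider interface might involve, here is a toy provider backed by temporary directories. Only the five method names come from the text above; every signature, the synchronous style, and the `LocalTempDirProvider` class are assumptions. A temp directory offers no isolation, so this is for illustration only.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Protocol


class SandboxProvider(Protocol):
    """Assumed shape: method names from the docs, signatures hypothetical."""

    def create(self) -> str: ...
    def execute(self, sandbox_id: str, command: list[str]) -> str: ...
    def upload_bytes(self, sandbox_id: str, path: str, data: bytes) -> None: ...
    def upload_file(self, sandbox_id: str, local_path: str, remote_path: str) -> None: ...
    def destroy(self, sandbox_id: str) -> None: ...


class LocalTempDirProvider:
    """Toy provider: each 'sandbox' is a temp directory (NOT isolated)."""

    def __init__(self) -> None:
        self._roots: dict[str, Path] = {}

    def create(self) -> str:
        root = Path(tempfile.mkdtemp(prefix="sandbox-"))
        self._roots[root.name] = root
        return root.name

    def execute(self, sandbox_id: str, command: list[str]) -> str:
        # Run the command with the sandbox directory as working directory.
        result = subprocess.run(
            command, cwd=self._roots[sandbox_id],
            capture_output=True, text=True, timeout=30,
        )
        return result.stdout

    def upload_bytes(self, sandbox_id: str, path: str, data: bytes) -> None:
        target = self._roots[sandbox_id] / path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)

    def upload_file(self, sandbox_id: str, local_path: str, remote_path: str) -> None:
        self.upload_bytes(sandbox_id, remote_path, Path(local_path).read_bytes())

    def destroy(self, sandbox_id: str) -> None:
        shutil.rmtree(self._roots.pop(sandbox_id), ignore_errors=True)


provider: SandboxProvider = LocalTempDirProvider()
sandbox_id = provider.create()
provider.upload_bytes(sandbox_id, "hello.py", b"print('hello from sandbox')")
output = provider.execute(sandbox_id, [sys.executable, "hello.py"])
provider.destroy(sandbox_id)
```

Typing the interface as a `Protocol` means any class with matching methods satisfies it without inheriting from a base class, which fits the provider-agnostic design described above.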
Using sandboxes in environments
A typical sandbox environment follows this pattern:
- Create sandboxes in create_initial_state() with concurrency control via semaphores
- Execute commands in env_response() by parsing tool calls and running them in the sandbox
- Clean up in a destroy hook when the rollout completes
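The three steps above can be sketched with `asyncio`. `FakeSandbox` and `rollout` are hypothetical stand-ins so the example runs without a real provider; the comments mark which environment method each step would belong to. The limit of 2 concurrent creations is an arbitrary value chosen for the demonstration.

```python
import asyncio


class FakeSandbox:
    """Hypothetical stand-in for a real sandbox handle."""

    live = 0  # sandboxes currently being created
    peak = 0  # highest number of concurrent creations observed

    @classmethod
    async def create(cls) -> "FakeSandbox":
        cls.live += 1
        cls.peak = max(cls.peak, cls.live)
        await asyncio.sleep(0.01)  # simulate slow provisioning
        cls.live -= 1
        return cls()

    async def execute(self, command: str) -> str:
        return f"ran: {command}"

    async def destroy(self) -> None:
        self.destroyed = True


async def rollout(task_id: int, semaphore: asyncio.Semaphore) -> str:
    # create_initial_state(): the semaphore bounds concurrent creation.
    async with semaphore:
        sandbox = await FakeSandbox.create()
    try:
        # env_response(): run the command taken from a parsed tool call.
        return await sandbox.execute(f"echo {task_id}")
    finally:
        # Destroy hook: always clean up, even if execution fails.
        await sandbox.destroy()


async def main() -> list[str]:
    semaphore = asyncio.Semaphore(2)  # at most 2 creations in flight
    return await asyncio.gather(*(rollout(i, semaphore) for i in range(6)))


results = asyncio.run(main())
```

Holding the semaphore only during creation, rather than for the whole rollout, limits pressure on the provider's provisioning API without capping how many sandboxes run at once. Whether that is the right trade-off depends on the provider's quotas.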
Multi-turn configuration for agentic tasks
Key config parameters for tool-using and agentic environments include priority scheduling and interleaved_rollouts.
Priority scheduling is important for multi-turn environments: it prioritizes continuing rollouts that are already in progress over starting new ones, so later turns do not queue behind a flood of first-turn requests.
interleaved_rollouts (enabled by default) reuses token IDs from previous turns exactly, avoiding subtle tokenization differences that could corrupt logprob computation across turns.
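The effect of priority scheduling can be illustrated with a toy queue that serves later-turn requests first. The source does not describe how priorities are actually computed, so `TurnPriorityQueue` and its turn-index policy are assumptions for this example only.

```python
import heapq
import itertools


class TurnPriorityQueue:
    """Toy scheduler: requests for later turns are served first."""

    def __init__(self) -> None:
        self._heap: list[tuple[int, int, str, int]] = []
        self._counter = itertools.count()  # FIFO tie-breaker within a turn

    def submit(self, rollout_id: str, turn: int) -> None:
        # heapq is a min-heap, so negate the turn index to pop later turns first.
        heapq.heappush(self._heap, (-turn, next(self._counter), rollout_id, turn))

    def next_request(self) -> tuple[str, int]:
        _, _, rollout_id, turn = heapq.heappop(self._heap)
        return rollout_id, turn


queue = TurnPriorityQueue()
for i in range(3):
    queue.submit(f"new-{i}", turn=0)  # a burst of first-turn requests
queue.submit("in-progress", turn=2)   # arrives last, but is further along

order = [queue.next_request() for _ in range(4)]
```

Even though the `in-progress` request is submitted last, it is served first, and the first-turn requests keep their arrival order among themselves.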
