AI agents often rely on sequential tool calling, where each step requires a full round-trip to the large language model. This architecture introduces significant latency and increases token consumption as intermediate results pass through the model context repeatedly.
Programmatic tool calling (PTC) offers a more efficient alternative. By shifting logic from the model to a sandboxed execution environment, architects can reduce overhead and improve the performance of complex agentic workflows.
In short
- Programmatic tool calling reduces latency by executing multi-step logic in a sandbox rather than forcing the LLM to reason through every intermediate tool result.
- This pattern lowers token costs by minimizing the amount of data passed back and forth between the model and the execution environment.
- Architects should prioritize PTC for workflows involving large data processing or multi-step orchestration where raw data privacy is a concern.
Moving Beyond Sequential Round-Trips
In a standard tool-calling loop, the model invokes a tool, waits for the output, and then processes that output before deciding on the next step. This cycle repeats for every action. For complex tasks, this creates a bottleneck: every tool call adds another model invocation, and the agent spends more time on round-trips and I/O waits than on actual reasoning.
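The cost of this loop can be illustrated with a small simulation. This is only a sketch: `get_order_total` is a hypothetical tool, the "model" is reduced to a counter, and the `context` list stands in for the conversation history that a real agent would resend on every call.

```python
import json


def get_order_total(order_id: int) -> dict:
    """Hypothetical tool: returns one order record (illustrative data only)."""
    return {"order_id": order_id, "total": 10.0 * order_id}


def sequential_loop(order_ids: list[int]) -> tuple[int, list[str]]:
    """Simulate a tool-calling loop with one model round-trip per tool call."""
    context: list[str] = []  # stands in for the model's conversation history
    model_calls = 0
    for order_id in order_ids:
        model_calls += 1                     # model decides to call the tool
        result = get_order_total(order_id)   # tool executes
        context.append(json.dumps(result))   # raw result re-enters the context
    model_calls += 1                         # final call to produce the answer
    return model_calls, context


calls, context = sequential_loop([1, 2, 3, 4, 5])
print(calls, len(context))
```

In this toy setup, five tool calls cost six model invocations, and all five raw results sit in the context. Both numbers grow linearly with the number of steps.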
PTC changes this by having the model generate code, such as Python, that encapsulates multiple tool calls. This code runs in a secure, sandboxed environment. The model is sampled once to produce the logic, and the execution environment handles the iteration, filtering, and aggregation. Only the final, processed result returns to the model context.
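A minimal sketch of that flow, under stated assumptions: `GENERATED_CODE` stands in for what the model might emit in a single sampling step, `get_order_total` is a hypothetical tool, and the built-in `exec` is used only to keep the example short. `exec` is not a sandbox, so a real deployment would run the code in an isolated environment instead.

```python
def get_order_total(order_id: int) -> dict:
    """Hypothetical tool: returns one order record (illustrative data only)."""
    return {"order_id": order_id, "total": 10.0 * order_id}


# Stand-in for code the model generates once. It loops over the tool,
# aggregates the results, and leaves only a compact summary in `result`.
GENERATED_CODE = """
totals = [get_order_total(i)["total"] for i in order_ids]
result = {"count": len(totals), "sum": sum(totals), "max": max(totals)}
"""


def run_generated(code: str, order_ids: list[int]) -> dict:
    """Execute generated code with the tools exposed as plain functions.

    WARNING: exec() provides no isolation; it is a placeholder for a sandbox.
    """
    namespace = {"get_order_total": get_order_total, "order_ids": order_ids}
    exec(code, namespace)
    return namespace["result"]


summary = run_generated(GENERATED_CODE, [1, 2, 3, 4, 5])
print(summary)
```

Only `summary` would be returned to the model context. The five intermediate order records never leave the execution environment.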
Implementation and Trade-offs
Implementing PTC requires an execution environment. Options range from self-hosted Docker containers on platforms like ECS for full control, to managed services like the Bedrock AgentCore Code Interpreter. The choice depends on the team's capacity to manage infrastructure versus the need for specific security guardrails.
While PTC improves performance, it shifts the burden of error handling to the execution environment. If the generated code fails or hits a runtime error, the agent must be equipped to handle the exception without crashing the entire workflow. Architects should ensure that the sandbox environment is strictly isolated to prevent unauthorized access to system resources during code execution.
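One way to keep a failing script from crashing the workflow is to run it in a separate process and turn every failure into a structured result the agent can inspect. The helper below is a hypothetical sketch using Python's standard `subprocess` module. A child process with a timeout is not a security boundary, so it stands in for, and does not replace, a properly isolated container or managed interpreter.

```python
import subprocess
import sys


def run_in_subprocess(code: str, timeout_s: float = 5.0) -> dict:
    """Run generated code in a child interpreter and never raise to the caller."""
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode, ignores user site/env
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "output": "", "error": f"timed out after {timeout_s}s"}

    if proc.returncode != 0:
        # Keep only the last traceback line so the model sees a short error.
        lines = proc.stderr.strip().splitlines()
        error = lines[-1] if lines else "unknown error"
        return {"ok": False, "output": proc.stdout, "error": error}

    return {"ok": True, "output": proc.stdout.strip(), "error": None}


good = run_in_subprocess("print(sum(range(5)))")
bad = run_in_subprocess("print(1 / 0)")
print(good)
print(bad)
```

The agent can pass the `error` string back to the model so that it can revise the script, rather than letting the exception end the run.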
Source
Implementing programmatic tool calling on Amazon Bedrock
https://aws.amazon.com/blogs/machine-learning/implementing-programmatic-tool-calling-on-amazon-bedrock


