Control plane for production agents

Stop runaway Python agents before they burn money.

Run one guarded agent call locally first. Add hosted control later when you need retained incidents, alerts, remote kill, and team visibility.

  • Stop loops, retries, and runaway spend
  • Start with the free SDK

Install with pip install agentguard47. Inspect the SDK on GitHub.

Fastest first proof

Run one guarded call locally.

Install the SDK, set one budget, verify the run locally, then add the hosted control plane later.

pip install agentguard47 openai

Run this first

One package. One guarded call. One local proof before you wire any hosted service.

pip install agentguard47 openai

import agentguard
from openai import OpenAI

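agentguard.init(
    service="repo-coder",        # service name for this agent
    budget_usd=5.00,             # dollar budget for the run
    trace_file="traces.jsonl",   # local trace file for this run
    local_only=True,             # local proof only; no hosted service wired yet
)

# OpenAI() reads OPENAI_API_KEY from the environment by default.
client = OpenAI()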
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "Review this function and return one bug risk."}
    ],
)

print(response.choices[0].message.content)
See quickstart, doctor, and starter paths
OpenAI, LangChain, LangGraph, CrewAI, Plain Python
0 SDK runtime dependencies
MIT SDK license
14-day Dashboard trial
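
To check the local proof, you can read the trace file back after the run. This is a minimal sketch, assuming the SDK writes one JSON object per line to traces.jsonl (the file extension suggests that, but the event schema is not shown here, so the sketch only counts and prints raw events). The helper name read_trace_events is hypothetical and is not part of the SDK.

```python
import json
from pathlib import Path


def read_trace_events(path):
    """Return the JSON objects in a JSON Lines file, skipping blank lines.

    Hypothetical helper: assumes one JSON object per line.
    """
    events = []
    trace_path = Path(path)
    if not trace_path.exists():
        return events  # no run has written a trace file yet
    with trace_path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                events.append(json.loads(line))
    return events


if __name__ == "__main__":
    events = read_trace_events("traces.jsonl")
    print(f"{len(events)} events recorded")
    for event in events[:5]:  # show the first few raw events
        print(event)
```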

Production failure mode

The first buyer problem is not observability. It is stopping the bad run.

When an agent gets stuck on a broken path, retries pile up, budget climbs, and the team wakes up with no clean incident trail.

913 retries

A broken tool call keeps firing because the agent never exits the bad path.

$423 overnight

Budget keeps climbing while nobody is watching the run.

0 shared records

The team has no retained incident trail when they come back to debug it.

AgentGuard response

The SDK stops the run locally. The hosted control plane keeps the incident, alerts the team, and gives you remote control afterward.

What buyers actually care about

Stop the bad run. Keep the record. Give the team control.

Hard runtime budgets

Put a dollar boundary on a run before a bad agent burns through budget.
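
The idea of a dollar boundary can be sketched in plain Python. This is an illustration of the concept only, not AgentGuard's implementation: the BudgetGuard class, the BudgetExceeded exception, and the per-call cost of $0.75 are all hypothetical.

```python
class BudgetExceeded(RuntimeError):
    """Raised when recorded spend passes the configured limit."""


class BudgetGuard:
    """Hypothetical guard that tracks spend against a hard dollar limit."""

    def __init__(self, limit_usd):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def record(self, cost_usd):
        """Add the cost of one call; raise once the total passes the limit."""
        self.spent_usd += cost_usd
        if self.spent_usd > self.limit_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.2f} of ${self.limit_usd:.2f} budget"
            )


guard = BudgetGuard(limit_usd=5.00)
calls_made = 0
try:
    for _ in range(1000):            # a runaway loop that would otherwise keep going
        guard.record(cost_usd=0.75)  # hypothetical cost of one model call
        calls_made += 1
except BudgetExceeded as stop:
    print(f"stopped after {calls_made} calls: {stop}")
    # prints: stopped after 6 calls: spent $5.25 of $5.00 budget
```

The guard raises instead of logging a warning, so the loop cannot continue past the boundary unless the caller explicitly catches the exception and carries on.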

Loop and retry stops

Cut repeated tool calls and runaway retry paths before they compound.
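
One simple way to catch a repeated tool call is to count identical calls and stop once a cap is passed. The sketch below assumes that "identical" means the same tool name with the same arguments. LoopGuard, LoopDetected, and the fetch_url tool are hypothetical names used for illustration, not SDK APIs.

```python
from collections import Counter


class LoopDetected(RuntimeError):
    """Raised when the same tool call repeats too many times."""


class LoopGuard:
    """Hypothetical guard that caps identical (tool, arguments) calls."""

    def __init__(self, max_repeats=3):
        self.max_repeats = max_repeats
        self.seen = Counter()

    def check(self, tool_name, arguments):
        """Count this call and raise once it exceeds max_repeats."""
        # Sort the items so argument order does not change the key.
        key = (tool_name, repr(sorted(arguments.items())))
        self.seen[key] += 1
        if self.seen[key] > self.max_repeats:
            raise LoopDetected(
                f"{tool_name} called {self.seen[key]} times with the same arguments"
            )


guard = LoopGuard(max_repeats=3)
attempts = 0
try:
    while True:  # an agent stuck on a broken path
        guard.check("fetch_url", {"url": "https://example.com/broken"})
        attempts += 1
except LoopDetected as stop:
    print(f"stopped after {attempts} attempts: {stop}")
    # prints: stopped after 3 attempts: fetch_url called 4 times with the same arguments
```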

Retained incident trail

Keep the incident, the alert, and the shared record instead of one local file on one machine.

Hosted remote control

Use the control plane for alerts, retained history, service rollups, and remote kill.
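
A remote kill only works if the agent loop checks for the signal between steps. The sketch below shows that pattern with threading.Event standing in for a signal delivered by a hosted service. The run_agent function and the step functions are hypothetical, and how AgentGuard actually delivers a kill is not described here.

```python
import threading


def run_agent(steps, kill_signal):
    """Run steps in order, stopping as soon as the kill signal is set."""
    completed = []
    for step in steps:
        if kill_signal.is_set():  # check before every step
            break
        completed.append(step())
    return completed


kill_signal = threading.Event()  # stand-in for a remotely delivered signal


def operator_kills_run():
    # In practice the signal would be set from outside the loop;
    # it is set inside a step here to keep the example deterministic.
    kill_signal.set()
    return "kill requested"


steps = [
    lambda: "plan",
    lambda: "call tool",
    operator_kills_run,
    lambda: "never runs",
]
completed = run_agent(steps, kill_signal)
print(completed)  # prints: ['plan', 'call tool', 'kill requested']
```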

Comparison

Why this instead of tracing or workflow tools?

AgentGuard is built to stop the bad run first.

AgentGuard

Stops bad runs locally, then keeps the incident record and remote controls in the hosted control plane.

LangSmith

Great for tracing and evals. Not the same thing as incident control.

CrewAI / n8n

Useful for orchestration and workflow automation. That is a different job from stopping a runaway production agent.

Question                | AgentGuard                 | LangSmith         | CrewAI              | n8n
Primary job             | Runtime control            | Tracing + evals   | Framework + AMP     | Workflow automation
Start local first       | Yes                        | Partial           | Partial             | Self-hosted workflows
Stops a bad run locally | Yes                        | Not documented    | Partial guardrails  | Not the primary product
Remote kill signal      | Yes                        | Not documented    | Not documented      | Workflow stop, not agent kill
Best fit                | Production agent incidents | LLM observability | Agent orchestration | Broad automations

Based on primary documented product surfaces as of April 9, 2026. We use “Not documented” or “Partial” when the docs do not support a hard “No.”

Trust the boundary

Zero SDK runtime dependencies

One package to audit. No runtime dependency sprawl in the SDK path.

Only what you send

The dashboard stores only the traces and events you choose to send.

Hosted only when needed

Local proof first. Hosted later for retained incidents, alerts, and team operations.

Free to prove. Paid when the incident needs a team.

Free SDK

Local proof

Run one guarded workflow first.

SDK first
$0 forever
  • MIT-licensed Python SDK
  • Zero SDK runtime dependencies
  • Local enforcement
Best path: Start here
Hosted: Optional later
Start with the SDK

Pro control plane

First production workflow

Retained incidents, alerts, and remote kill.

14-day hosted trial
$39 / mo
  • 500,000 events / month
  • 30-day retention
  • Remote kill and alerts
5 API keys, 1 user
Start hosted trial

Team control plane

Shared operations

Same control path, sized for a real team.

Hosted control
$79 / mo
  • 5,000,000 events / month
  • 90-day retention
  • Shared team workflows
20 API keys, 10 users
Start hosted trial
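
To estimate which hosted tier covers a workload, multiply events per run by runs per day and compare the monthly total with the included events above. This is a rough sketch: the workload numbers are hypothetical, a 30-day month is assumed, and it compares event volume only, ignoring the differences in retention, API keys, and users.

```python
# (name, monthly price in USD, included events per month), from the pricing above
TIERS = [
    ("Pro", 39, 500_000),
    ("Team", 79, 5_000_000),
]


def smallest_tier(events_per_run, runs_per_day, days_per_month=30):
    """Return (tier name, price, monthly events) for the smallest tier that fits."""
    monthly_events = events_per_run * runs_per_day * days_per_month
    for name, price_usd, included_events in TIERS:
        if monthly_events <= included_events:
            return name, price_usd, monthly_events
    return None, None, monthly_events  # volume exceeds both listed tiers


print(smallest_tier(events_per_run=40, runs_per_day=200))
# prints: ('Pro', 39, 240000)
print(smallest_tier(events_per_run=40, runs_per_day=2_000))
# prints: ('Team', 79, 2400000)
```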

Questions people ask first

Do I need the control plane to start?
No. Start with the free SDK. Add the hosted control plane when you want retained incident history, alerts, remote kill, and shared team visibility.
What does the control plane add?
Retained incidents, alerts, remote kill, and team workflows. The SDK still owns local enforcement.
Is this generic observability?
No. It is a control plane for stopping bad runs and keeping the incident record.

Run one guarded agent path first.

SDK first. Hosted later.