Lab Notes

May 2026

Policy-as-Code for AI Agents

How production teams can move agent safety rules out of prompts and scattered tool code into explicit, reviewable runtime policy.

Policy-as-code for AI agents is not just "write a Python function that blocks a tool." The useful version is an implementation pattern: normalize every agent action into a structured event, evaluate policy at the action boundary, return a decision, and leave an audit trail that engineers and reviewers can inspect later.

This matters because agent failures rarely involve one isolated tool. A support agent might read CRM data, retrieve policy docs, issue a refund, and send an email. A data analyst agent might query a warehouse, write to a notebook, export a CSV, and call a model to summarize results. A coding agent might read files, open GitHub issues, run shell commands, and deploy. Policy has to follow the workflow, not just one function.

Security guidance is moving in the same direction. OWASP's agentic AI work frames agents as systems with new threat surfaces around tool use, memory, delegation, and runtime behavior. NIST's AI Risk Management Framework pushes teams toward mapped, measured, managed controls rather than one-time model choices. Recent agent-runtime research describes the attack surface shifting from build-time artifacts to inference-time dependencies such as retrieved context, memory, and tools. The engineering question is how to make those ideas concrete in code.

Brane's answer is the loop described in the docs: Capability + AgentAction + PolicyContext -> Policy -> Decision. A policy receives a PolicyContext, inspects the attempted action and runtime metadata, then returns a structured Decision. The examples below show what that looks like in realistic systems.
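To make the loop concrete before the longer examples, here is a minimal sketch of its shape in plain Python. The class names follow the vocabulary above, but every field, the `arg` helper, and the `evaluate` function are assumptions made for illustration; Brane's real classes and evaluation order may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

# Hypothetical stand-ins for illustration only; not Brane's actual classes.
@dataclass(frozen=True)
class Capability:
    name: str
    type: str
    risk: str

@dataclass(frozen=True)
class AgentAction:
    capability: Capability
    agent_id: str
    tenant_id: str
    input: dict[str, Any] = field(default_factory=dict)

@dataclass(frozen=True)
class PolicyContext:
    action: AgentAction

    def arg(self, key: str, default: Any = None) -> Any:
        """Look up one argument of the attempted action."""
        return self.action.input.get(key, default)

@dataclass(frozen=True)
class Decision:
    type: str  # "allow" or "deny"
    reason: str = ""

Policy = Callable[[PolicyContext], Decision]

def evaluate(policies: list[Policy], ctx: PolicyContext) -> Decision:
    """Return the first deny, or allow if every policy allows."""
    for policy in policies:
        decision = policy(ctx)
        if decision.type == "deny":
            return decision
    return Decision(type="allow")

def max_amount(ctx: PolicyContext) -> Decision:
    # Illustrative fixed limit; real limits would come from configuration.
    if ctx.arg("amount_usd", 0) > 100:
        return Decision("deny", "Refund exceeds USD 100 limit")
    return Decision("allow")

def refund_action(amount_usd: int) -> AgentAction:
    return AgentAction(
        capability=Capability("billing.refund_customer", "tool", "high"),
        agent_id="support_agent",
        tenant_id="tenant_acme",
        input={"amount_usd": amount_usd},
    )

result = evaluate([max_amount], PolicyContext(refund_action(250)))
```

The point of the sketch is the data flow: the action is plain data, the policy is a plain function, and the decision is a value that can be logged or asserted on.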

The common starting point: policy hidden in four places

Most teams do not begin with a policy runtime. They begin with a mix of prompt instructions, tool-specific conditionals, application authorization, and manual logs. That works for demos because the agent has a narrow path. It breaks down when capabilities multiply.

| Where teams put policy now | What it catches | Where it fails |
| --- | --- | --- |
| System prompt | Intent, tone, task framing, allowed behavior | Advisory only; cannot guarantee side effects |
| Tool implementation | Local invariants inside one function | Scatters rules across tools and frameworks |
| App auth and IAM | Identity, resource, and service permissions | Usually misses agent intent, arguments, and workflow context |
| Logs and dashboards | Incident review after the fact | Too late to stop a refund, deletion, email, or data export |

Brane does not replace these layers. Tool code should still validate hard invariants. IAM should still enforce least privilege. Prompts should still tell the model what good behavior looks like. Brane adds the missing enforcement point: the runtime decision before or after an agent capability executes.

Example 1: support agent with refunds, CRM, and email

A customer support agent has three capabilities: read a customer profile, issue a refund, and send a customer email. The failure is not only "the refund tool was called." The failure is that the agent can combine stale CRM data, an angry user message, and an overly broad refund capability into a financial side effect, then email a customer with internal notes.

A developer building this for the first time will usually protect the obvious danger points directly. They put refund guidance in the system prompt so the model knows the business rule, add an amount check inside the payment function, and add a lightweight content check before sending email. That is a sensible first pass because it is close to the code that performs each action and easy to ship with the agent.

Teams often start with something like this:

SYSTEM_PROMPT = """
You are a careful support agent.
Never refund more than $100.
Never include internal notes in customer emails.
"""

def refund_customer(customer_id, amount_usd):
    if amount_usd > 100:
        raise ValueError("Refund too high")
    # payments is a placeholder client for whatever payment provider the team uses
    return payments.refund(customer_id=customer_id, amount_usd=amount_usd)

def send_customer_email(to, body):
    if "internal" in body.lower():
        raise ValueError("Looks internal")
    return email.send(to=to, body=body)

This is reasonable defensive programming, but the policy is split across prompt text and two tool implementations. The refund check cannot see tenant-specific limits. The email check cannot see that the email follows a high-risk denied refund attempt. Neither check creates a consistent policy decision record.

The code also grows awkward as the product matures. Enterprise tenants may have different refund limits. Some support agents may be allowed to issue credits but not payment refunds. A denied refund may require a different customer email template. Once those rules live in separate tools, the developer has to keep prompt instructions, billing logic, support workflow logic, and audit logging in sync.

With Brane, each operation becomes a governed capability:

@runtime.capability(name="crm.read_customer", type="tool", risk="medium")
def read_customer(customer_id: str, tenant_id: str):
    return crm.get_customer(customer_id=customer_id, tenant_id=tenant_id)

@runtime.capability(name="billing.refund_customer", type="tool", risk="high")
def refund_customer(customer_id: str, amount_usd: int, tenant_id: str):
    return payments.refund(customer_id=customer_id, amount_usd=amount_usd)

@runtime.capability(name="email.send_customer", type="tool", risk="medium")
def send_customer_email(to: str, body: str, tenant_id: str):
    return email.send(to=to, body=body)

Then the business rules live together as runtime policies:

@runtime.before_capability("*", name="tenant_boundary", version="1.0.0", priority=0)
def tenant_boundary(ctx):
    requested_tenant = ctx.arg("tenant_id")
    if requested_tenant and requested_tenant != ctx.tenant_id:
        return Decision(type="deny", reason="Cross-tenant action blocked")
    return Decision(type="allow")

@runtime.before_capability("billing.refund_customer", name="refund_limit", version="1.0.0")
def refund_limit(ctx):
    amount = ctx.arg("amount_usd", 0)
    limit = get_refund_limit(ctx.tenant_id, ctx.principal_id)

    if amount > limit:
        return Decision(type="deny", reason=f"Refund exceeds USD {limit} limit")
    return Decision(type="allow")

@runtime.before_capability("email.send_customer", name="customer_email_content", version="1.0.0")
def customer_email_content(ctx):
    body = ctx.arg("body", "")
    blocked_terms = ["internal only", "do not share", "risk score"]

    if any(term in body.lower() for term in blocked_terms):
        return Decision(type="deny", reason="Customer email contains internal language")
    return Decision(type="allow")

The advantage is not only cleaner code. The advantage is a single control surface across CRM, billing, and email. A reviewer can ask: which policy denied the refund, what tenant limit was applied, which agent attempted it, and what email would have been sent next? The Refund Limit Recipe shows the same pattern in a smaller form.

The tool functions still do their jobs: billing still talks to the payment provider, email still sends email, and CRM still reads customer data. Brane does not ask the developer to turn those functions into a security framework. It lets the developer keep business enforcement in one policy layer where cross-capability rules can be reviewed, tested, and reused by every support workflow.
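The refund_limit policy above calls get_refund_limit without showing it. One possible shape is sketched below, assuming limits are held in simple in-memory tables; the table names, tenant IDs, principal IDs, and amounts are all hypothetical, and a real system would more likely read them from configuration or a database.

```python
# Hypothetical limit tables; in practice these would come from config or a database.
DEFAULT_REFUND_LIMIT_USD = 100
TENANT_LIMITS_USD = {"tenant_acme": 500}
PRINCIPAL_LIMITS_USD = {("tenant_acme", "user_trainee"): 50}

def get_refund_limit(tenant_id: str, principal_id: str) -> int:
    """Return the refund limit in USD for this tenant and principal.

    The tenant limit applies by default. A principal-level entry can only
    tighten it, never raise it above the tenant limit.
    """
    tenant_limit = TENANT_LIMITS_USD.get(tenant_id, DEFAULT_REFUND_LIMIT_USD)
    principal_limit = PRINCIPAL_LIMITS_USD.get((tenant_id, principal_id))
    if principal_limit is None:
        return tenant_limit
    return min(tenant_limit, principal_limit)
```

Keeping the lookup separate from the policy function means the limit data can change without touching the policy or bumping its version.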

Example 2: data analyst agent with SQL, retrieval, and CSV export

A data analyst agent often has more than one data capability. It can search a knowledge base, run SQL, call a model to summarize, and export a file. The dangerous path is a chain: retrieve a confidential metric definition, run an under-scoped query, summarize the result, then export a CSV outside the tenant boundary.

The natural developer instinct is to start with the database, because SQL is the visible risk. Teams add a read-only check, restrict the database user, and maybe block obvious keywords. That is useful, but the agent's data risk is broader than the SQL statement. It also depends on the retrieval namespace, the size and sensitivity of the returned rows, where exports are written, and whether the model summary is allowed to include raw values.

A typical first implementation protects only the SQL helper:

def execute_sql(query: str):
    normalized = query.strip().lower()
    if not normalized.startswith("select"):
        raise ValueError("Only SELECT queries are allowed")
    return warehouse.query(query)

def export_csv(rows, destination):
    return storage.write_csv(rows, destination)

This catches obvious writes, but it does not answer the operational questions: is the query scoped to the current tenant? Is the result set too large? Is the export destination internal? Did the retrieval step pull data from a restricted namespace? Those checks usually end up as scattered helper functions or post-hoc dashboard queries.

Developers can keep adding checks inside each helper, but that creates a hidden coupling problem. The SQL helper does not know whether the rows will be exported. The export helper does not know whether the rows came from a tenant-scoped query. The retrieval helper does not know whether the retrieved document will be used to generate a report for a different customer. The workflow risk lives between the tools.

With Brane, SQL, retrieval, model calls, and export are all capabilities:

@runtime.capability(name="warehouse.execute_sql", type="tool", risk="high")
def execute_sql(query: str, tenant_id: str):
    return warehouse.query(query)

@runtime.capability(name="retrieval.search_docs", type="retrieval", risk="medium")
def search_docs(query: str, namespace: str, tenant_id: str):
    return vector_store.search(query=query, namespace=namespace)

@runtime.capability(name="files.export_csv", type="filesystem", risk="high")
def export_csv(rows, destination: str, tenant_id: str):
    return storage.write_csv(rows, destination)

The policies now govern the workflow, not just the SQL string:

@runtime.before_capability("warehouse.execute_sql", name="read_only_sql", version="1.0.0")
def read_only_sql(ctx):
    query = ctx.arg("query", "").strip().lower()
    if not query.startswith("select"):
        return Decision(type="deny", reason="Only SELECT queries are allowed")
    # query was lowercased above, so lowercase the expected filter before comparing
    if f"tenant_id = '{ctx.tenant_id}'".lower() not in query:
        return Decision(type="deny", reason="SQL query must be tenant-scoped")
    return Decision(type="allow")

@runtime.before_capability("retrieval.search_docs", name="namespace_scope", version="1.0.0")
def namespace_scope(ctx):
    namespace = ctx.arg("namespace", "")
    if not namespace.startswith(f"tenant/{ctx.tenant_id}/"):
        return Decision(type="deny", reason="Retrieval namespace is outside tenant scope")
    return Decision(type="allow")

@runtime.after_capability("warehouse.execute_sql", name="result_size_limit", version="1.0.0")
def result_size_limit(ctx):
    rows = ctx.output or []
    if len(rows) > 1000:
        return Decision(type="deny", reason="Result set too large for agent workflow")
    return Decision(type="allow")

@runtime.before_capability("files.export_csv", name="export_destination", version="1.0.0")
def export_destination(ctx):
    destination = ctx.arg("destination", "")
    if not destination.startswith(f"s3://internal-exports/{ctx.tenant_id}/"):
        return Decision(type="deny", reason="CSV export destination is not approved")
    return Decision(type="allow")

The advantage is that the policy layer can express data governance across multiple capabilities: retrieval namespace, SQL tenant filter, result-size control, and export destination. That is much closer to how the agent actually fails. The SQL Read-Only Recipe is the narrow version; production systems usually need the surrounding retrieval and export rules too.

This also makes rollout safer. A team can begin with read-only SQL, then add namespace checks, then add export controls as the agent gains more autonomy. Each rule has a name and version, so denials are debuggable. When an analyst asks why a report failed, the answer is a policy decision, not a generic database exception buried in one tool.
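One detail in read_only_sql is worth tightening as the rollout progresses: the tenant filter is an exact substring match, so a query written as tenant_id='...' without spaces would be denied even though it is scoped correctly. A slightly more tolerant check is sketched below using the standard re module. The helper name has_tenant_filter is hypothetical, and the check is still a heuristic: it does not understand OR clauses, comments, or subqueries, so a SQL parser or database-level row security would be a stronger control.

```python
import re

def has_tenant_filter(query: str, tenant_id: str) -> bool:
    """Heuristic: does the query contain tenant_id = '<tenant_id>'?

    Tolerates whitespace around '=' and keyword case. The tenant ID is
    escaped so regex metacharacters in it are matched literally.
    """
    pattern = rf"\btenant_id\s*=\s*'{re.escape(tenant_id)}'"
    return re.search(pattern, query, flags=re.IGNORECASE) is not None
```

The word boundary before tenant_id keeps a column such as other_tenant_id from satisfying the check by accident.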

Example 3: coding agent with GitHub, shell, files, and deploy

Coding agents are powerful because they can combine repository access, issue trackers, file edits, shell commands, package managers, test runners, and deployment systems. The failure mode is a chain of individually plausible steps: read a GitHub issue, edit a migration, run a destructive command, push to a protected branch, then deploy.

Developers usually harden coding agents incrementally. First they add sandboxing. Then they block a few dangerous shell substrings. Then they restrict deploys to staging. Then they add protected branch rules in GitHub. Each control helps, but each one lives in the adapter where the problem first appeared. The result is a patchwork of enforcement points that are hard to reason about as one system.

Without a central policy layer, teams often patch each adapter:

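import subprocess

def run_shell(command: str):
    # Compare against a lowercased copy so both substring checks are case-insensitive
    lowered = command.lower()
    if "rm -rf" in lowered or "drop database" in lowered:
        raise ValueError("Blocked command")
    return subprocess.run(command, shell=True)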

def deploy_service(service: str, environment: str):
    if environment == "prod":
        raise ValueError("Agents cannot deploy prod")
    return deploy(service, environment)

This is fragile because it is string-based, local to each adapter, and hard to audit across the workflow. It also misses context: a shell command that is fine in a sandbox may be unacceptable in production; a deploy might be acceptable only after tests passed and a human approval token is present.

The deeper issue is that the risky unit is not always the command. It is the command plus the working directory, target environment, branch, file path, agent scope, and preceding workflow state. A migration file edit may be fine in a feature branch but not in a generated hotfix. A deploy may be fine for a preview service but not for production. Those are policy questions, not shell-wrapper questions.
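Even the command-level check benefits from structure. The substring test misses trivial variants such as a doubled space or reordered flags. The sketch below tokenizes the command with the standard shlex module and inspects the flags instead. The function name is hypothetical, and the check is deliberately narrow: it only looks at the first command, so it would not catch rm behind sudo or after &&.

```python
import shlex

def is_recursive_force_rm(command: str) -> bool:
    """Return True if the command is an rm with both recursive and force flags.

    Commands that cannot be tokenized (for example, unbalanced quotes) are
    treated as suspicious and also return True.
    """
    try:
        argv = shlex.split(command)
    except ValueError:
        return True
    if not argv or argv[0] != "rm":
        return False

    options = [arg for arg in argv[1:] if arg.startswith("-")]
    # Collect short flags (-rf, -r -f) and long flags (--recursive) separately.
    short_flags = "".join(opt[1:] for opt in options if not opt.startswith("--"))
    long_flags = {opt for opt in options if opt.startswith("--")}

    recursive = "r" in short_flags or "R" in short_flags or "--recursive" in long_flags
    force = "f" in short_flags or "--force" in long_flags
    return recursive and force
```

A check like this still belongs in policy rather than in the shell wrapper, because whether a recursive delete is acceptable depends on the working directory and environment, not only on the flags.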

With Brane, coding surfaces become capabilities with risk metadata:

@runtime.capability(name="repo.write_file", type="filesystem", risk="medium")
def write_file(path: str, contents: str, repo: str):
    return repo_client.write(path, contents)

@runtime.capability(name="shell.run", type="tool", risk="high")
def run_shell(command: str, cwd: str, environment: str):
    return shell.run(command=command, cwd=cwd)

@runtime.capability(name="github.open_pr", type="tool", risk="medium")
def open_pr(repo: str, branch: str, title: str):
    return github.open_pr(repo=repo, branch=branch, title=title)

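@runtime.capability(name="deploy.service", type="tool", risk="high")
def deploy_service(service: str, environment: str, approval_id: str = ""):
    # approval_id is read by policy before execution; the deploy call itself does not use it
    return deploy(service=service, environment=environment)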

Then policies can encode the operational rules directly:

@runtime.before_capability("shell.run", name="dangerous_shell_commands", version="1.0.0")
def dangerous_shell_commands(ctx):
    command = ctx.arg("command", "").lower()
    blocked = ["rm -rf /", "drop database", "kubectl delete namespace"]
    if any(fragment in command for fragment in blocked):
        return Decision(type="deny", reason="Dangerous shell command blocked")
    return Decision(type="allow")

@runtime.before_capability("repo.write_file", name="protected_paths", version="1.0.0")
def protected_paths(ctx):
    path = ctx.arg("path", "")
    protected = ["infra/prod/", ".github/workflows/", "migrations/"]
    if any(path.startswith(prefix) for prefix in protected) and not ctx.agent_has_scope("repo:protected-write"):
        return Decision(type="deny", reason="Agent lacks protected path scope")
    return Decision(type="allow")

@runtime.before_capability("deploy.service", name="prod_deploy_gate", version="1.0.0")
def prod_deploy_gate(ctx):
    if ctx.arg("environment") == "prod" and not ctx.arg("approval_id"):
        return Decision(type="deny", reason="Production deploy requires approval")
    return Decision(type="allow")

The advantage is that the deploy policy does not have to live inside the deploy adapter, the path policy does not have to live inside every file-writing tool, and the shell policy does not have to be duplicated across agent frameworks. The same runtime pattern can protect the OpenAI Agents SDK, LangGraph, CrewAI, a custom agent loop, or an MCP adapter as long as the capabilities are normalized.

For a developer, this keeps the coding agent flexible without giving it an unstructured blast radius. The agent can still plan, edit, test, and open pull requests. The runtime decides which paths, commands, and environments are allowed for this agent in this context. That separation is especially useful when a team wants different policies for local experimentation, CI, staging, and production.

Example 4: MCP-connected agent with external tools

MCP makes tool discovery easier, but it also changes trust boundaries. OWASP's MCP tool poisoning writeup describes the risk: an agent connects to a tool server that looks normal, then tool responses carry instructions that influence future tool calls. The core problem is runtime trust. A server that was acceptable to connect to is not automatically safe to trust for every response and every downstream action.

A developer adopting MCP often starts from the integration surface: connect to a server, list tools, add them to the agent, and let the model decide when to call them. That is exactly why MCP is useful. It lowers the cost of adding capabilities. But the same property means a production agent may receive tools, descriptions, resources, and responses from systems with different owners and trust levels.

The common first version treats MCP tools like ordinary local tools:

tools = await mcp_client.list_tools(server_url)
agent = Agent(tools=tools + [read_file, send_email, query_database])

# The model sees the MCP tool descriptions and decides what to call.

That makes integration easy, but it blurs internal and external privilege. An untrusted MCP result can land in the same model context as a privileged internal database or file tool.

The risky pattern is not only malicious tooling. It can be an ordinary external system returning unexpected shapes, stale permissions, or content that should not be allowed to steer internal actions. A CRM MCP server might be trusted for account lookup but not for creating customer-facing messages. A GitHub MCP server might be allowed to open issues but not to mutate deployment workflows.

With Brane, MCP calls can be normalized as capabilities:

@runtime.capability(name="mcp.salesforce.search_account", type="mcp_tool", risk="medium")
def search_account(account_name: str, server_id: str, tenant_id: str):
    return mcp.call_tool(server_id, "search_account", {"account_name": account_name})

@runtime.capability(name="mcp.github.create_issue", type="mcp_tool", risk="medium")
def create_issue(repo: str, title: str, body: str, server_id: str):
    return mcp.call_tool(server_id, "create_issue", {"repo": repo, "title": title, "body": body})

Then policies can separate server trust, tool risk, and downstream effects:

TRUSTED_MCP_SERVERS = {"internal-salesforce", "internal-github"}

@runtime.before_capability("*", name="trusted_mcp_server", version="1.0.0")
def trusted_mcp_server(ctx):
    if ctx.capability.type != "mcp_tool":
        return Decision(type="allow")

    server_id = ctx.arg("server_id")
    if server_id not in TRUSTED_MCP_SERVERS:
        return Decision(type="deny", reason="Untrusted MCP server")
    return Decision(type="allow")

@runtime.after_capability("*", name="mcp_response_shape", version="1.0.0")
def mcp_response_shape(ctx):
    if ctx.capability.type != "mcp_tool":
        return Decision(type="allow")

    output = ctx.output or {}
    if not isinstance(output, dict) or "data" not in output:
        return Decision(type="deny", reason="MCP response failed schema check")
    return Decision(type="allow")

The advantage is explicit trust zoning. MCP can still be used, but external tool responses do not silently become authority to call internal tools. Brane's MCP Tool Governance guide covers the category-level version of this pattern.

This lets a developer keep MCP's ergonomics while adding runtime boundaries around it. Tool discovery can remain dynamic, but execution is still governed. Server allowlists, schema checks, tenant checks, and downstream privilege rules all become policy decisions instead of assumptions baked into the agent prompt.
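Server-level trust can be refined to tool-level trust, which matches the earlier point that a CRM server may be trusted for account lookup but not for other actions. The sketch below is one way to express that as data. The mapping and the helper name are hypothetical; the server and tool names are reused from the examples above.

```python
# Hypothetical per-server tool allowlist; anything not listed is denied.
ALLOWED_TOOLS_BY_SERVER = {
    "internal-salesforce": {"search_account"},
    "internal-github": {"create_issue"},
}

def is_mcp_tool_allowed(server_id: str, tool_name: str) -> bool:
    """Allow a tool only if its server is known and the tool is listed for it."""
    return tool_name in ALLOWED_TOOLS_BY_SERVER.get(server_id, set())
```

Because unknown servers resolve to an empty set, a newly discovered server or tool is denied until someone adds it to the table, which keeps dynamic discovery from widening privileges on its own.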

Example 5: model calls as governed capabilities

Many teams treat model calls as harmless compared with tools. In production, model calls are capabilities too. They carry cost, latency, data residency, privacy, and retention implications. An agent that routes every small classification to an expensive frontier model, or sends regulated data to a provider that is not approved for that tenant, is still creating a production incident.

Developers often start by choosing a model inline because it is the fastest way to make the workflow work. A ticket classifier calls one model, a summarizer calls another, and a planning step calls the best available model. That is fine when there is one tenant and one environment. It becomes hard to govern when model choice depends on tenant contracts, data class, latency budget, task type, or monthly spend.

A typical first implementation chooses a model inline:

def classify_ticket(ticket_text: str):
    return openai.responses.create(
        model="frontier-large",
        input=f"Classify this ticket: {ticket_text}",
    )

That is simple, but the policy is invisible. There is no central place to ask whether the tenant permits that provider, whether the text contains regulated data, or whether a cheaper model should be required for this action class.

Moving this into ad hoc gateway code helps, but only if every agent uses the same gateway correctly. In practice, teams end up with model choices scattered across chains, evaluators, routers, and one-off helper functions. The policy question is not just "which model was called?" It is "was this model allowed for this tenant, data type, purpose, and cost profile?"

With Brane, model calls can be represented as capabilities:

@runtime.capability(name="model.classify_ticket", type="model", risk="medium")
def classify_ticket(ticket_text: str, tenant_id: str, provider: str, model: str):
    return model_gateway.call(
        provider=provider,
        model=model,
        input=ticket_text,
        tenant_id=tenant_id,
    )

Then policy can enforce data and cost constraints before the call:

@runtime.before_capability("model.classify_ticket", name="approved_model_provider", version="1.0.0")
def approved_model_provider(ctx):
    provider = ctx.arg("provider")
    if provider not in approved_providers_for_tenant(ctx.tenant_id):
        return Decision(type="deny", reason="Model provider is not approved for tenant")
    return Decision(type="allow")

@runtime.before_capability("model.classify_ticket", name="cheap_model_for_classification", version="1.0.0")
def cheap_model_for_classification(ctx):
    model = ctx.arg("model")
    if model not in {"fast-small", "fast-medium"}:
        return Decision(type="deny", reason="Ticket classification must use approved low-cost model")
    return Decision(type="allow")

@runtime.before_capability("model.classify_ticket", name="regulated_data_boundary", version="1.0.0")
def regulated_data_boundary(ctx):
    text = ctx.arg("ticket_text", "")
    if contains_ssn(text) and not tenant_allows_sensitive_model_calls(ctx.tenant_id):
        return Decision(type="deny", reason="Sensitive data cannot be sent to this model provider")
    return Decision(type="allow")

The advantage is that model governance joins the same policy plane as tool governance. Teams can review model calls, SQL calls, exports, MCP tools, and emails as one agent action stream instead of stitching together five separate logs.

That makes cost and data rules operational instead of aspirational. A policy can deny expensive models for low-value classification, block sensitive input from leaving an approved provider boundary, and explain the decision in audit records. The model gateway remains the execution mechanism; Brane supplies the action-level governance around it.
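The policies above rely on two helpers that are not shown, contains_ssn and approved_providers_for_tenant. A rough sketch of both follows. The regular expression only detects the hyphenated U.S. SSN layout and is an assumption about what the helper might do; the provider table and its contents are hypothetical. Production systems would likely use a dedicated data-classification service instead.

```python
import re

# Matches the hyphenated layout NNN-NN-NNNN only; unhyphenated numbers are not detected.
_SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def contains_ssn(text: str) -> bool:
    """Heuristic check for SSN-shaped strings in free text."""
    return _SSN_LIKE.search(text) is not None

# Hypothetical tenant-to-provider approvals.
APPROVED_PROVIDERS = {"tenant_acme": {"provider_a"}}

def approved_providers_for_tenant(tenant_id: str) -> set[str]:
    """Return approved providers; unknown tenants get an empty set (deny by default)."""
    return APPROVED_PROVIDERS.get(tenant_id, set())
```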

What Brane changes in the implementation

The important difference is not syntax. The difference is where the control sits. With ad hoc checks, each tool decides for itself what it knows and what it logs. With a runtime policy layer, every attempted action can be evaluated against the same set of questions:

  • Which agent is acting?
  • Which principal and tenant is it acting for?
  • Which capability is being attempted?
  • Is the capability a tool, model call, retrieval call, file operation, MCP tool, or external API?
  • What arguments are being passed?
  • Is this production, staging, or local development?
  • Which policy allowed or denied the action?
  • What reason was recorded?

Those questions map directly to Brane's PolicyContext, before-capability policies, and after-capability policies.
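As an illustration of how those questions could turn into an audit trail, the sketch below defines one record per policy decision and serializes it as a JSON line. The AuditRecord class and its field names are assumptions chosen to mirror the list above, not Brane's actual record format.

```python
import json
from dataclasses import asdict, dataclass

@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical audit record: one field per question in the list above."""
    agent_id: str
    principal_id: str
    tenant_id: str
    capability: str
    capability_type: str
    args: dict
    environment: str
    policy: str
    policy_version: str
    decision: str
    reason: str

record = AuditRecord(
    agent_id="support_agent",
    principal_id="user_123",
    tenant_id="tenant_acme",
    capability="billing.refund_customer",
    capability_type="tool",
    args={"amount_usd": 250},
    environment="prod",
    policy="refund_limit",
    policy_version="1.0.0",
    decision="deny",
    reason="Refund exceeds USD 100 limit",
)

# One JSON object per line is easy to ship to most log pipelines.
audit_line = json.dumps(asdict(record), sort_keys=True)
```

Arguments may themselves contain sensitive values, so a real pipeline would probably redact or hash selected fields before writing the record.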

Advantages by example

| Scenario | Normal implementation | Brane-style implementation |
| --- | --- | --- |
| Refund + email | Prompt rule plus checks inside each function | Tenant, amount, approval, and content policies across both capabilities |
| SQL + export | Regex inside SQL helper and bucket permissions elsewhere | Read-only, tenant scope, result size, and export destination in one policy layer |
| Coding agent | Adapter-specific deny lists and protected branch rules | Capability risk, environment, path scope, and approval rules evaluated before action |
| MCP tools | Trust tools after connect-time discovery | Server allowlists, response-shape checks, and privilege separation at runtime |
| Model calls | Provider and model selected inline in app code | Provider approval, data boundaries, and cost constraints as reviewable policy |

Testing policy code

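A major advantage of policy-as-code is that policies can be tested without the model in the loop. You can construct an AgentAction, wrap it in PolicyContext, and assert whether the decision allows or denies the action. Helpers that a policy calls, such as get_refund_limit, need a stub or fixture so the test stays deterministic.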

def test_refund_policy_denies_large_amount():
    action = AgentAction(
        action_type="tool_call",
        capability=Capability(name="billing.refund_customer", type="tool", risk="high"),
        agent_id="support_agent",
        input={"amount_usd": 250, "tenant_id": "tenant_acme"},
        environment="prod",
        tenant_id="tenant_acme",
        principal_id="user_123",
    )
    ctx = PolicyContext(action=action, args=action.input)

    decision = refund_limit(ctx)

    assert decision.denied
    assert "exceeds" in decision.reason

That kind of test is impossible if the rule only exists as natural language in a prompt. It is also easier to run in CI, review in pull requests, and promote across environments. The docs explain the policy function shape in Writing Policies.
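Boundary cases are where table-driven tests pay off. The sketch below exercises the export destination rule from Example 2 against several near-miss destinations. To stay self-contained it uses a hypothetical make_ctx stand-in that exposes only tenant_id and arg(), and it returns a bool instead of a Decision; neither is Brane's API.

```python
from types import SimpleNamespace

def make_ctx(args: dict, tenant_id: str = "tenant_acme"):
    """Hypothetical stand-in exposing only what the rule reads: tenant_id and arg()."""
    return SimpleNamespace(
        tenant_id=tenant_id,
        arg=lambda key, default=None: args.get(key, default),
    )

def export_destination_allowed(ctx) -> bool:
    """Same prefix rule as export_destination, returning a bool for brevity."""
    destination = ctx.arg("destination", "")
    return destination.startswith(f"s3://internal-exports/{ctx.tenant_id}/")

# (destination, expected) pairs, including near misses around the tenant prefix.
CASES = [
    ("s3://internal-exports/tenant_acme/report.csv", True),
    ("s3://internal-exports/tenant_acme-archive/report.csv", False),
    ("s3://internal-exports/tenant_other/report.csv", False),
    ("s3://public-bucket/tenant_acme/report.csv", False),
    ("", False),
]

def test_export_destination_cases():
    for destination, expected in CASES:
        ctx = make_ctx({"destination": destination})
        assert export_destination_allowed(ctx) is expected, destination
```

The second case is the interesting one: the trailing slash in the prefix is what stops a sibling path such as tenant_acme-archive from passing.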

Design rules for production agent policy

  • Normalize every external action as a capability, not only classic tools.
  • Keep tool invariants in tool code, but put workflow policy in runtime policy.
  • Use global policies for universal boundaries such as tenant isolation.
  • Use capability-specific policies for business rules such as refund limits.
  • Use after-capability policies for result size, output shape, and sensitive output checks.
  • Name and version policies so audit records are meaningful.
  • Test policies with constructed contexts instead of relying only on agent evals.
  • Start in observation for broad policies, then enforce once denial behavior is understood.
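The last rule, observing before enforcing, can be sketched as a wrapper that records would-be denials but lets the action proceed. The observe_only helper, the Decision stand-in, and the dict-based context are all hypothetical; whether Brane offers a built-in observation mode is not something this sketch assumes.

```python
import functools
from dataclasses import dataclass

@dataclass(frozen=True)
class Decision:
    """Hypothetical stand-in for a policy decision."""
    type: str  # "allow" or "deny"
    reason: str = ""

def observe_only(policy, observed: list):
    """Wrap a policy so denials are recorded in `observed` but not enforced."""
    @functools.wraps(policy)
    def wrapper(ctx):
        decision = policy(ctx)
        if decision.type == "deny":
            observed.append({"policy": policy.__name__, "reason": decision.reason})
            return Decision(type="allow", reason=f"observe-only: {decision.reason}")
        return decision
    return wrapper

def block_large_results(ctx) -> Decision:
    # ctx is a plain dict here to keep the sketch small.
    if ctx["row_count"] > 1000:
        return Decision("deny", "Result set too large for agent workflow")
    return Decision("allow")

observed: list = []
shadow_policy = observe_only(block_large_results, observed)
shadow_decision = shadow_policy({"row_count": 5000})
```

Reviewing the observed list for a week or two shows how often the rule would have fired, which makes the switch to enforcement a measured decision instead of a guess.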

External references

For background on the broader risk model, see OWASP's Agentic AI Threats and Mitigations, OWASP's MCP Tool Poisoning writeup, NIST's AI Risk Management Framework, and the 2026 arXiv systematization on agentic supply-chain runtime risk. Brane is a concrete policy runtime for applying those ideas to agent actions in application code.

This is the kind of problem we're solving with Brane — policy-as-code for AI agents. Write Python policies that run before agent actions execute. Block or allow high-risk capability use, with an audit-ready decision trace.