
LLM agent secrets and credential injection explained

Harbor Platform shipped a billing-reconciliation agent that called Stripe, NetSuite, and an internal ledger API. Engineers seeded the system prompt with STRIPE_SECRET_KEY=sk_live_... so the model could “authenticate properly.” On a failed refund attempt, the model echoed the full key inside the refund_payment tool argument JSON. That span was exported to Datadog, appeared in a support engineer’s screen-share recording, and was forwarded to Stripe’s partner error webhook. Security rotated the key within hours, but 14% of production runs in the prior month had at least one secret substring in stored traces or chat logs.

Credential injection keeps secrets out of prompts, model weights, and tool argument payloads the LLM authors. Instead, the runtime resolves opaque credential_ref handles at execution time inside a trusted broker, attaches short-lived tokens to outbound HTTP, and redacts secret material before anything hits observability or audit stores. Harbor replaced inline env vars with a credential broker and scoped delegation tokens; secret leakage in traces dropped to zero over six weeks while agent task success held flat. This guide covers why agents leak secrets, broker architecture, scoped token design, tool schema patterns, integration with sandboxes and permission gates, the Harbor Platform refactor, a technique decision table, pitfalls, and a production checklist.

Why agents are a secret-leak surface

Traditional services read secrets from environment variables at process start and, when written carefully, keep them out of application logs. Agents add three new exfiltration paths:

Prompt and context retention — anything in the system prompt or retrieved documents can be paraphrased back to users or echoed in tool args.
Tool argument authoring — the model chooses JSON fields; if the schema includes api_key, it will eventually fill it.
Observability defaults — traces and debug dumps often record full request/response bodies unless you redact aggressively.

Prompt injection makes this worse: an attacker who can influence retrieved text may instruct the model to repeat secrets into a public channel. Treat every secret the model can see as already compromised for practical purposes.

Classes of credentials agents touch

Long-lived API keys — Stripe, SendGrid, cloud provider admin keys
OAuth refresh tokens — per-user Google, Salesforce, GitHub access
Database connection strings — especially dangerous in text-to-SQL agents
Session cookies and JWTs — common in browser-automation agents
Signing keys — webhook HMAC, JWT issuers, on-chain hot wallets

Credential broker pattern

A credential broker is a trusted sidecar or middleware layer that sits between the agent runtime and external APIs. The model never receives raw secret material — only stable references the broker understands.

Request flow

Model calls refund_payment(customer_id, amount_cents, credential_ref: "stripe_prod_ro")
Runtime validates the ref against the active permission manifest
Broker fetches the live secret from Vault / KMS / cloud secret manager
Broker issues the outbound HTTP call; span logs show credential_ref only
Tool result returned to model contains status and IDs, not Authorization headers
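
The five steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Harbor's implementation: MANIFEST, FAKE_VAULT, Span, send_http, and run_tool are hypothetical names, the vault is an in-memory dict standing in for Vault / KMS, and the outbound HTTP call is stubbed.

```python
from dataclasses import dataclass, field

# Hypothetical permission manifest: agent role -> allowed credential refs.
MANIFEST = {"billing_agent": {"stripe_prod_ro"}}

# Stand-in for Vault / KMS; a real broker would fetch over an authenticated channel.
FAKE_VAULT = {"stripe_prod_ro": "sk_test_placeholder"}


@dataclass
class Span:
    """What observability gets to see for one tool call."""
    attributes: dict = field(default_factory=dict)


def send_http(url: str, headers: dict, body: dict) -> dict:
    """Stubbed outbound call; pretends the upstream API accepted the request."""
    assert "Authorization" in headers
    return {"status": "succeeded", "id": "re_123"}


def run_tool(role: str, args: dict, span: Span) -> dict:
    ref = args["credential_ref"]
    # Step 2: validate the ref against the active manifest.
    if ref not in MANIFEST.get(role, set()):
        return {"error": "credential_not_authorized"}
    # Step 3: resolve the secret inside the broker only.
    secret = FAKE_VAULT[ref]
    # Step 4: attach the secret to the outbound request; the span records the ref only.
    span.attributes["credential_ref"] = ref
    payload = {k: v for k, v in args.items() if k != "credential_ref"}
    upstream = send_http(
        "https://api.example.invalid/refunds",
        {"Authorization": f"Bearer {secret}"},
        payload,
    )
    # Step 5: return only status and IDs to the model.
    return {"status": upstream["status"], "id": upstream["id"]}
```

The point of the shape is that the secret exists only as a local variable inside run_tool; neither the returned dict nor the span ever holds it.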

Broker responsibilities

Resolve refs to secrets — never pass resolved values back to the LLM
Enforce scope — ref stripe_prod_ro cannot call DELETE endpoints
Rotate without agent redeploy — update vault version; refs stay stable
Emit audit events — which run used which ref (not the secret itself)
Rate-limit and anomaly-detect — burst refunds page finance even if the model is compromised
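
Scope enforcement and audit emission can live in one small check. The sketch below is an assumption-laden example: REF_SCOPES, AUDIT_LOG, and authorize are hypothetical, and the method and path rules shown for stripe_prod_ro are invented for illustration rather than taken from Harbor's manifest.

```python
import time

# Hypothetical scope table: which HTTP methods and path prefixes each ref may use.
REF_SCOPES = {
    "stripe_prod_ro": {
        "methods": {"GET", "POST"},
        "path_prefixes": ("/v1/charges", "/v1/refunds"),
    },
}

# In-memory stand-in for an audit sink.
AUDIT_LOG: list = []


def authorize(ref: str, method: str, path: str, run_id: str) -> bool:
    """Return True if the ref may make this call; always record an audit event."""
    scope = REF_SCOPES.get(ref)
    allowed = (
        scope is not None
        and method.upper() in scope["methods"]
        and path.startswith(scope["path_prefixes"])
    )
    # The audit event names the run and the ref, never the resolved secret.
    AUDIT_LOG.append({
        "ts": time.time(),
        "run_id": run_id,
        "ref": ref,
        "method": method.upper(),
        "path": path,
        "allowed": allowed,
    })
    return allowed
```

Denied calls are logged as well as allowed ones, which is what lets a rate limiter or anomaly detector see a burst of rejected attempts.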

For user-delegated OAuth, the broker stores refresh tokens encrypted per tenant and exchanges them for access tokens with TTL under five minutes. The model sees user_google_calendar:acct_4821, not a bearer string.

Scoped short-lived tokens

Long-lived keys still have to live in the broker's vault, but each tool invocation should use a delegation token minted for that run and that action wherever the downstream API allows it.

Token shape

sub — agent run_id + tool_call_id
scope — e.g. stripe:refund:customer_cus_abc
exp — 60–300 seconds; single-use where APIs allow
policy_hash — ties to the permission manifest version
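
One way to realize this token shape is an HMAC-signed claims blob. The following is a sketch, assuming a symmetric signing key held only by the broker; BROKER_KEY, mint_token, and verify_token are hypothetical names, and the wire format (base64 body, dot, hex signature) is an illustrative choice, not a standard.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical signing key held only by the broker; in practice it would come from a KMS.
BROKER_KEY = b"demo-signing-key"


def mint_token(run_id, tool_call_id, scope, policy_hash, ttl_s=120, now=None):
    """Build a signed delegation token carrying sub, scope, exp, and policy_hash."""
    now = time.time() if now is None else now
    claims = {
        "sub": f"{run_id}:{tool_call_id}",
        "scope": scope,
        "exp": int(now) + ttl_s,
        "policy_hash": policy_hash,
    }
    body = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    sig = hmac.new(BROKER_KEY, body, hashlib.sha256).hexdigest()
    return f"{body.decode()}.{sig}"


def verify_token(token, required_scope, policy_hash, now=None):
    """Return the claims if signature, expiry, scope, and policy hash all check out."""
    now = time.time() if now is None else now
    body, _, sig = token.rpartition(".")
    expected = hmac.new(BROKER_KEY, body.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison; the body is decoded only after the signature passes.
    if not hmac.compare_digest(sig, expected):
        return None
    claims = json.loads(base64.urlsafe_b64decode(body))
    if claims["exp"] <= now or claims["scope"] != required_scope:
        return None
    if claims["policy_hash"] != policy_hash:
        return None
    return claims
```

Single-use enforcement is not shown; it would need the verifier to remember spent sub values until exp passes.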

Why TTL matters

If a trace or chat log leaks a delegation token instead of a root API key, blast radius stays bounded. Harbor mints Stripe-scoped tokens only after tier-2 approval for refunds above $500. Tokens are rejected at the payment API if the approval attestation is missing from the broker’s internal ledger.

Downstream API support

Prefer platforms with native scoped keys (Stripe restricted keys, AWS STS, GCP service account impersonation, GitHub fine-grained PATs). Where only root keys exist, put a narrow internal gateway in front and give the broker gateway credentials instead.

Tool schema design: references, not values

Tool JSON schemas teach the model what to send. Any field named password, token, secret, or api_key is an invitation to leak.

Do

Accept credential_ref enums bound to the manifest (stripe_prod_ro | netsuite_sandbox)
Document that refs are resolved server-side; model must not invent values
Validate refs in middleware before the broker call — unknown ref hard-fails
Return structured errors (credential_not_authorized) the model can reason about

Do not

Include optional authorization_header “for flexibility”
Pass through user-supplied URLs with embedded basic auth
Let the model choose which vault path to read

For browser automation, run sessions inside an isolated profile where cookies are injected by the sandbox launcher — never as CLI arguments the model types.

Runtime injection and redaction

Injection points

Layer	What gets injected	Model visibility
HTTP tool middleware	Authorization header from broker	None
SQL tool gateway	Read-only DB role per tenant	Connection alias only
MCP server	Server-side env the host holds	Tool metadata, not env
Subagent spawn	Child-scoped ref subset	Child manifest names only
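
For the subagent row, the key property is that a child can never hold a ref its parent lacks. A small sketch of that rule, with child_manifest as a hypothetical helper name:

```python
def child_manifest(parent_refs: set, requested: set) -> set:
    """Grant a child only refs that are a subset of what the parent already holds."""
    denied = requested - parent_refs
    if denied:
        # Fail closed instead of silently dropping refs, so misconfiguration is visible.
        raise PermissionError(f"refs not delegable: {sorted(denied)}")
    return set(requested)
```

Raising on an over-broad request, instead of quietly intersecting the two sets, is a design choice made here for the example; either approach keeps the child from widening its access.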

Redaction before observability

Apply the same rules as PII redaction to secrets: regex for sk_live_, AKIA, JWT three-part blobs, and Bearer prefixes. Redact at span export time; do not rely on engineers remembering to scrub tickets. Harbor blocks deploy if integration tests detect known secret patterns in fixture traces.
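
A redaction pass over span attributes might look like the following. The patterns are illustrative approximations of the four families named above, not a vetted detection list, and SECRET_PATTERNS, redact, and redact_value are hypothetical names.

```python
import re

# Illustrative patterns only; a real deployment would maintain a broader, tested list.
SECRET_PATTERNS = [
    re.compile(r"sk_live_[A-Za-z0-9]+"),          # Stripe live secret keys
    re.compile(r"AKIA[0-9A-Z]{16}"),              # AWS access key IDs
    re.compile(r"eyJ[\w-]+\.[\w-]+\.[\w-]+"),     # JWT-like three-part blobs
    re.compile(r"Bearer\s+[A-Za-z0-9._~+/=-]+"),  # Bearer authorization values
]


def redact(text: str) -> str:
    """Replace every match of every pattern with a fixed marker."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


def redact_value(value):
    """Walk dicts and lists, redacting every string; other types pass through."""
    if isinstance(value, str):
        return redact(value)
    if isinstance(value, dict):
        return {k: redact_value(v) for k, v in value.items()}
    if isinstance(value, list):
        return [redact_value(v) for v in value]
    return value
```

Calling redact_value on the attribute dict inside the span exporter, and running the same patterns over fixture traces in CI, gives one rule set for both the runtime scrub and the deploy gate.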

Retries and idempotency

When a retry replays a tool call, the broker must mint a fresh delegation token. It must not cache Authorization headers in checkpoints, because the model might later read them through exports of durable execution state.
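
A retry loop that follows this rule mints inside the loop and checkpoints only the ref. The sketch below assumes a list as the checkpoint store and a counter-based stand-in for the broker; mint_delegation_token and call_with_retries are hypothetical names.

```python
import itertools
import json

_counter = itertools.count(1)


def mint_delegation_token(ref: str) -> str:
    """Hypothetical stand-in for the broker; each call yields a fresh, distinct token."""
    return f"dlg_{ref}_{next(_counter)}"


def call_with_retries(ref, send, checkpoint, max_attempts=3):
    """Mint a new token per attempt and persist only the ref and attempt number."""
    for attempt in range(1, max_attempts + 1):
        token = mint_delegation_token(ref)  # never reused across attempts
        # The checkpoint records what is needed to resume, not the token itself.
        checkpoint.append(json.dumps({"credential_ref": ref, "attempt": attempt}))
        try:
            return send(token)
        except ConnectionError:
            continue
    raise RuntimeError("tool call failed after retries")
```

Because the token never enters the checkpoint, an export of durable state exposes at most the ref name.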

Harbor Platform refactor

Before the broker, Harbor stored integration secrets in Kubernetes secrets mounted as env vars, copied key names into the system prompt, and logged full tool JSON in traces. OAuth refresh tokens lived in the same Redis keyspace as chat history.

Changes shipped

Central credential-broker service with Vault backend; agents receive zero env secrets
Permission manifest lists allowed credential_ref per agent role; CI fails on schema drift
Tool middleware strips any argument key matching /secret|token|password|api_key/i before execution
Span processor redacts secret patterns; weekly scanner replays random traces
User OAuth moved to encrypted per-tenant table; broker is sole reader
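
The argument-stripping middleware from the list above can be sketched with the same regular expression. This is an assumed shape, not Harbor's code: strip_sensitive_args is a hypothetical name, and recursing into nested dicts is an extension the original description does not spell out.

```python
import re

# Same pattern the text describes, applied case-insensitively to argument key names.
SENSITIVE_KEY = re.compile(r"secret|token|password|api_key", re.IGNORECASE)


def strip_sensitive_args(args: dict, _path: str = ""):
    """Return (cleaned_args, dropped_key_paths), dropping any key whose name matches."""
    cleaned, dropped = {}, []
    for key, value in args.items():
        path = f"{_path}.{key}" if _path else key
        if SENSITIVE_KEY.search(key):
            dropped.append(path)  # record the key name only, never the value
            continue
        if isinstance(value, dict):
            value, nested = strip_sensitive_args(value, path)
            dropped.extend(nested)
        cleaned[key] = value
    return cleaned, dropped
```

Returning the dropped key paths lets the runtime count how often the model tries to author secret fields without ever logging what it put in them.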

Outcomes

Secret substrings in stored traces: 14% of runs → 0% over six weeks
Unauthorized API calls blocked at broker: 3.2% of attempted tool calls (mostly misconfigured refs)
Mean tool latency: +18 ms for the broker hop, accepted as a trade-off against rotation incidents
Key rotation now zero-downtime: update Vault; no agent image rebuild

Technique decision table

Approach	Best for	Weak when
Secrets in system prompt / env exposed to model	Local prototypes, read-only public APIs	Any production mutating tool; compliance; prompt injection risk
Credential broker + refs	Production agents with multiple integrations	Extreme low-latency edge without sidecar budget
Per-user OAuth via broker	Calendar, email, CRM agents acting on behalf of users	Batch jobs with no user context (use service refs)
Pre-authenticated sandbox only	Browser agents, code execution with network	Agents that must call diverse external APIs per turn
Human pastes secret each run	One-off admin scripts	Automation, scale, audit requirements

Common pitfalls

Teaching the model where secrets live — “use the key from env STRIPE_SECRET_KEY” guarantees eventual leakage.
Optional secret fields in tool schemas — the model fills optional fields when confused.
Logging full HTTP in tool errors — upstream 401 responses echo Authorization headers.
Sharing refs across tenants — stripe_prod_ro must map per-tenant vault paths.
Checkpointing bearer tokens — durable state becomes a secret store readable by support tools.
Subagent inherits parent vault — child agents need strict ref subsets, not root access.
Redaction only in UI — raw spans still export to S3; scrub at export.
Ignoring vendor logs — tool args sent to third parties may be retained under their policies.
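
The cross-tenant pitfall is easiest to avoid when the broker derives the vault path from the authenticated tenant instead of from anything the model supplies. A sketch, assuming a hypothetical path layout (secret/tenants/<tenant>/<ref>) and a hypothetical vault_path helper:

```python
import re

# Tenant IDs come from the authenticated session, but are still validated defensively.
_TENANT_ID = re.compile(r"[a-z0-9_]{1,64}")


def vault_path(tenant_id: str, credential_ref: str, allowed_refs: set) -> str:
    """Map (tenant, ref) to a tenant-scoped path; the layout here is illustrative."""
    if not _TENANT_ID.fullmatch(tenant_id):
        raise ValueError("invalid tenant id")
    if credential_ref not in allowed_refs:
        raise PermissionError("credential_not_authorized")
    return f"secret/tenants/{tenant_id}/{credential_ref}"
```

The same ref name then resolves to a different secret per tenant, and a path-traversal string in tenant_id fails validation before any lookup happens.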

Production checklist

Zero raw secrets in system prompts, RAG corpora, or few-shot examples.
Tool schemas use credential_ref enums, not free-text secret fields.
Credential broker resolves refs; model never receives resolved values.
Permission manifest binds refs to agent roles and environments.
Short-lived delegation tokens per mutating tool call where supported.
Span and audit exporters redact secret patterns before persistence.
Weekly automated scan of traces and chat logs for leak patterns.
OAuth refresh tokens encrypted; broker is sole accessor.
Retries re-mint tokens; checkpoints store refs only.
Subagents receive narrowed ref sets via delegation policy.
Rotation runbook updates vault without redeploying agent images.
Incident playbooks cover trace exposure and partner notification.

Key takeaways

If the model can see a secret, assume it will leak — via tools, traces, or injection.
Harbor cut trace secret exposure from 14% of runs to 0% with a broker and ref-only schemas.
Inject at runtime in trusted middleware, not in prompts.
Scoped TTL tokens bound blast radius when redaction fails.
Redact at export — observability is a secret store unless proven otherwise.