Guide
LLM agent secrets and credential injection explained
Harbor Platform shipped a billing-reconciliation agent that called Stripe, NetSuite,
and an internal ledger API. Engineers seeded the system prompt with
STRIPE_SECRET_KEY=sk_live_... so the model could “authenticate
properly.” On a failed refund attempt, the model echoed the full key inside
the refund_payment tool argument JSON. That span was exported to Datadog,
appeared in a support engineer’s screen-share recording, and was forwarded to
Stripe’s partner error webhook. Security rotated the key within hours, but
14% of production runs in the prior month had at least one secret
substring in stored traces or chat logs.
Credential injection keeps secrets out of prompts, model weights,
and tool argument payloads the LLM authors. Instead, the runtime resolves opaque
credential_ref handles at execution time inside a trusted broker,
attaches short-lived tokens to outbound HTTP, and redacts material before anything
hits observability or audit stores.
Harbor replaced inline env vars with a broker + scoped delegation tokens; secret
leakage in traces dropped to zero over six weeks while agent task
success held flat. This guide covers why agents leak secrets, broker architecture,
scoped token design, tool schema patterns, integration with sandboxes and permission
gates, the Harbor Platform refactor, a technique decision table, pitfalls, and a
production checklist.
Why agents are a secret-leak surface
Traditional services typically read secrets from environment variables at process start and, when written carefully, keep them out of application logs. Agents add three new exfiltration paths:
- Prompt and context retention: anything in the system prompt or retrieved documents can be paraphrased back to users or echoed in tool args.
- Tool argument authoring: the model chooses JSON fields; if the schema includes api_key, it will eventually fill it.
- Observability defaults: traces and debug dumps often record full request/response bodies unless you redact aggressively.
Prompt injection makes this worse: an attacker who can influence retrieved text may instruct the model to repeat secrets into a public channel. Treat every secret the model can see as already compromised for practical purposes.
Classes of credentials agents touch
- Long-lived API keys — Stripe, SendGrid, cloud provider admin keys
- OAuth refresh tokens — per-user Google, Salesforce, GitHub access
- Database connection strings — especially dangerous in text-to-SQL agents
- Session cookies and JWTs — common in browser-automation agents
- Signing keys — webhook HMAC, JWT issuers, on-chain hot wallets
Credential broker pattern
A credential broker is a trusted sidecar or middleware layer that sits between the agent runtime and external APIs. The model never receives raw secret material — only stable references the broker understands.
Request flow
- Model calls refund_payment(customer_id, amount_cents, credential_ref: "stripe_prod_ro")
- Runtime validates the ref against the active permission manifest
- Broker fetches the live secret from Vault / KMS / cloud secret manager
- Broker issues the outbound HTTP call; span logs show credential_ref only
- Tool result returned to model contains status and IDs, not Authorization headers
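The request flow above can be sketched in a few lines. This is a minimal illustration, not Harbor's implementation: the in-memory VAULT and MANIFEST dictionaries, the billing-agent role name, the fake_stripe_refund stand-in, and the placeholder key value are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical in-memory stand-ins; a real broker would query Vault or KMS.
VAULT = {"stripe_prod_ro": "sk_live_example_not_real"}
MANIFEST = {"billing-agent": {"stripe_prod_ro"}}


class CredentialNotAuthorized(Exception):
    """Raised when a role asks for a ref outside its manifest."""


@dataclass
class ToolResult:
    status: str
    refund_id: str


def fake_stripe_refund(headers: dict, customer_id: str, amount_cents: int) -> dict:
    # Placeholder for the outbound HTTP call; only the broker sees the headers.
    assert headers["Authorization"].startswith("Bearer ")
    return {"status": "succeeded", "id": f"re_{customer_id}_{amount_cents}"}


def refund_payment(role: str, customer_id: str, amount_cents: int,
                   credential_ref: str, span_log: list) -> ToolResult:
    # Validate the ref against the permission manifest.
    if credential_ref not in MANIFEST.get(role, set()):
        raise CredentialNotAuthorized(credential_ref)
    # Resolve the secret inside the broker only.
    secret = VAULT[credential_ref]
    # Make the call; the span records the ref, never the secret.
    response = fake_stripe_refund({"Authorization": f"Bearer {secret}"},
                                  customer_id, amount_cents)
    span_log.append({"tool": "refund_payment", "credential_ref": credential_ref})
    # Return status and IDs only.
    return ToolResult(status=response["status"], refund_id=response["id"])
```

The secret exists only as a local variable inside the broker function, so neither the span log nor the value handed back to the model can contain it.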
Broker responsibilities
- Resolve refs to secrets: never pass resolved values back to the LLM
- Enforce scope: ref stripe_prod_ro cannot call DELETE endpoints
- Rotate without agent redeploy: update vault version; refs stay stable
- Emit audit events: which run used which ref (not the secret itself)
- Rate-limit and anomaly-detect: burst refunds page finance even if the model is compromised
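Scope enforcement and audit logging from the list above might look roughly like this. The REF_SCOPES table and its method sets are hypothetical; the source only states that stripe_prod_ro cannot call DELETE endpoints.

```python
from datetime import datetime, timezone

# Hypothetical scope table: which HTTP methods each ref may use.
REF_SCOPES = {
    "stripe_prod_ro": {"GET", "POST"},
    "netsuite_sandbox": {"GET"},
}

AUDIT_LOG: list = []


def authorize(run_id: str, credential_ref: str, method: str) -> bool:
    """Check the ref's scope and record an audit event without secret material."""
    method = method.upper()
    allowed = method in REF_SCOPES.get(credential_ref, set())
    AUDIT_LOG.append({
        "run_id": run_id,
        "credential_ref": credential_ref,
        "method": method,
        "allowed": allowed,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return allowed
```

Denied attempts are logged as well as allowed ones, which is what lets anomaly detection notice a compromised run probing for broader access.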
For user-delegated OAuth, the broker stores refresh tokens encrypted per tenant and
exchanges them for access tokens with TTL under five minutes. The model sees
user_google_calendar:acct_4821, not a bearer string.
Scoped short-lived tokens
Long-lived keys in a broker vault are necessary, but each tool invocation should prefer delegation tokens minted for that run and that action.
Token shape
- sub: agent run_id + tool_call_id
- scope: e.g. stripe:refund:customer_cus_abc
- exp: 60–300 seconds; single-use where APIs allow
- policy_hash: ties to the permission manifest version
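As a rough sketch of the token shape above, the following mints and verifies an HMAC-signed token with those four claims. The encoding (base64 JSON plus a hex signature), the SIGNING_KEY value, and the function names are assumptions for illustration; a production broker would more likely use a standard JWT library or the downstream platform's native scoped keys.

```python
import base64
import hashlib
import hmac
import json
import time

SIGNING_KEY = b"broker-only-demo-key"  # hypothetical; would live in a KMS


def mint_token(run_id, tool_call_id, scope, policy_hash, ttl_s=120, now=None):
    """Build a signed token carrying sub, scope, exp, and policy_hash."""
    now = time.time() if now is None else now
    claims = {
        "sub": f"{run_id}:{tool_call_id}",
        "scope": scope,
        "exp": int(now) + ttl_s,
        "policy_hash": policy_hash,
    }
    body = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    sig = hmac.new(SIGNING_KEY, body, hashlib.sha256).hexdigest()
    return f"{body.decode()}.{sig}"


def verify_token(token, required_scope, policy_hash, now=None):
    """Accept only an unexpired, untampered token with the exact scope and policy."""
    now = time.time() if now is None else now
    body, _, sig = token.rpartition(".")
    expected = hmac.new(SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False
    claims = json.loads(base64.urlsafe_b64decode(body))
    return (claims["exp"] > now
            and claims["scope"] == required_scope
            and claims["policy_hash"] == policy_hash)
```

Single-use enforcement is not shown; it would need server-side state, such as a set of already-consumed sub values.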
Why TTL matters
If a trace or chat log leaks a delegation token instead of a root API key, blast radius stays bounded. Harbor mints Stripe-scoped tokens only after tier-2 approval for refunds above $500. Tokens are rejected at the payment API if the approval attestation is missing from the broker’s internal ledger.
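The approval rule described above reduces to a small check before minting. This sketch assumes the broker's internal ledger can be modelled as a dictionary keyed by tool_call_id and that an approved entry carries the value "tier2_approved"; both are invented for the example.

```python
APPROVAL_THRESHOLD_CENTS = 50_000  # $500, the threshold stated in the text


def can_mint_refund_token(amount_cents: int, tool_call_id: str,
                          attestations: dict) -> bool:
    """Allow small refunds directly; larger ones need a recorded tier-2 approval."""
    if amount_cents <= APPROVAL_THRESHOLD_CENTS:
        return True
    # Hypothetical ledger lookup: a missing entry means no approval was recorded.
    return attestations.get(tool_call_id) == "tier2_approved"
```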
Downstream API support
Prefer platforms with native scoped keys (Stripe restricted keys, AWS STS, GCP service account impersonation, GitHub fine-grained PATs). Where only root keys exist, put a narrow internal gateway in front and give the broker gateway credentials instead.
Tool schema design: references, not values
Tool JSON schemas teach the model what to send. Any field named password, token, secret, or api_key is an invitation to leak.
Do
- Accept credential_ref enums bound to the manifest (stripe_prod_ro | netsuite_sandbox)
- Document that refs are resolved server-side; model must not invent values
- Validate refs in middleware before the broker call; an unknown ref hard-fails
- Return structured errors (credential_not_authorized) the model can reason about
Do not
- Include optional authorization_header “for flexibility”
- Pass through user-supplied URLs with embedded basic auth
- Let the model choose which vault path to read
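One way to apply the Do and Do not rules above is a schema whose only credential field is an enum, plus a validator that rejects unknown fields outright. The schema layout follows a generic function-calling shape rather than any specific vendor's API, and the error names other than credential_not_authorized are made up for the sketch.

```python
REFUND_TOOL_SCHEMA = {
    "name": "refund_payment",
    "description": ("Refund a customer. credential_ref is resolved server-side; "
                    "never invent values or pass raw keys."),
    "parameters": {
        "type": "object",
        "properties": {
            "customer_id": {"type": "string"},
            "amount_cents": {"type": "integer", "minimum": 1},
            "credential_ref": {
                "type": "string",
                "enum": ["stripe_prod_ro", "netsuite_sandbox"],
            },
        },
        "required": ["customer_id", "amount_cents", "credential_ref"],
        "additionalProperties": False,  # no room for an authorization_header
    },
}


def validate_args(args: dict):
    """Return a structured error dict, or None when the call may proceed."""
    params = REFUND_TOOL_SCHEMA["parameters"]
    unknown = sorted(set(args) - set(params["properties"]))
    if unknown:
        return {"error": "unexpected_argument", "fields": unknown}
    missing = [name for name in params["required"] if name not in args]
    if missing:
        return {"error": "missing_argument", "fields": missing}
    allowed = params["properties"]["credential_ref"]["enum"]
    if args["credential_ref"] not in allowed:
        return {"error": "credential_not_authorized",
                "credential_ref": args["credential_ref"]}
    return None
```

Returning an error object instead of raising gives the model something it can read and correct on the next turn.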
For browser automation, run sessions inside an isolated profile where cookies are injected by the sandbox launcher — never as CLI arguments the model types.
Runtime injection and redaction
Injection points
| Layer | What gets injected | Model visibility |
|---|---|---|
| HTTP tool middleware | Authorization header from broker | None |
| SQL tool gateway | Read-only DB role per tenant | Connection alias only |
| MCP server | Server-side env the host holds | Tool metadata, not env |
| Subagent spawn | Child-scoped ref subset | Child manifest names only |
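The subagent row in the table can be illustrated with a narrowing check at spawn time. The function name and the choice to fail loudly on out-of-scope requests, rather than silently dropping them, are assumptions.

```python
def child_manifest(parent_refs: set, requested_refs: set) -> set:
    """Grant a child only refs the parent already holds; refuse anything broader."""
    denied = requested_refs - parent_refs
    if denied:
        raise PermissionError(f"refs outside parent scope: {sorted(denied)}")
    # Copy so later changes to the request cannot widen the child's grant.
    return set(requested_refs)
```

The child receives manifest names only, so it can reference a credential without ever being able to read it.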
Redaction before observability
Apply the same rules as PII redaction to secrets: regex for sk_live_, AKIA, JWT
three-part blobs, and Bearer prefixes. Redact at span export time; do not rely on
engineers remembering to scrub tickets. Harbor blocks deploy if integration tests
detect known secret patterns in fixture traces.
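A span redactor along the lines described above could be sketched as follows. The exact patterns are heuristics chosen for the example (for instance, the JWT rule assumes the header segment starts with eyJ), and the [REDACTED] marker and function names are hypothetical; real deployments would tune the patterns against their own traces.

```python
import re

# Heuristic patterns for the secret families named in the text.
SECRET_PATTERNS = [
    re.compile(r"sk_live_[A-Za-z0-9]+"),                                # Stripe live keys
    re.compile(r"AKIA[0-9A-Z]{16}"),                                    # AWS access key IDs
    re.compile(r"eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),   # three-part JWTs
    re.compile(r"Bearer\s+[A-Za-z0-9._~+/=-]+"),                        # bearer headers
]


def redact(text: str) -> str:
    """Replace every match of a known secret pattern with a marker."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


def redact_span(value):
    """Walk a span's nested dicts and lists, redacting every string value."""
    if isinstance(value, str):
        return redact(value)
    if isinstance(value, dict):
        return {key: redact_span(item) for key, item in value.items()}
    if isinstance(value, list):
        return [redact_span(item) for item in value]
    return value
```

Running redact_span inside the exporter, rather than in the UI, means the scrubbed copy is the only one that reaches storage.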
Retries and idempotency
When retries replay tool calls, the broker must re-mint delegation tokens rather than cache Authorization headers in checkpoints the model might later read via durable execution state exports.
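A retry loop that follows this rule mints a fresh token on every attempt and writes only the ref into the checkpoint. The mint and send callables and the checkpoint fields are placeholders for whatever the runtime actually provides.

```python
def run_with_retries(tool_call: dict, mint, send, max_attempts: int = 3) -> dict:
    """Retry a tool call, re-minting the token each time; checkpoint holds refs only."""
    checkpoint = {
        "tool": tool_call["tool"],
        "credential_ref": tool_call["credential_ref"],
        "attempts": 0,
    }
    for attempt in range(1, max_attempts + 1):
        checkpoint["attempts"] = attempt
        # Fresh token per attempt, kept in a local variable and never persisted.
        token = mint(tool_call["credential_ref"])
        if send(tool_call, token):
            checkpoint["status"] = "ok"
            return checkpoint
    checkpoint["status"] = "failed"
    return checkpoint
```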
Harbor Platform refactor
Before the broker, Harbor stored integration secrets in Kubernetes secrets mounted as env vars, copied key names into the system prompt, and logged full tool JSON in traces. OAuth refresh tokens lived in the same Redis keyspace as chat history.
Changes shipped
- Central credential-broker service with Vault backend; agents receive zero env secrets
- Permission manifest lists allowed credential_ref per agent role; CI fails on schema drift
- Tool middleware strips any argument key matching /secret|token|password|api_key/i before execution
- Span processor redacts secret patterns; weekly scanner replays random traces
- User OAuth moved to encrypted per-tenant table; broker is sole reader
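The argument-stripping middleware from the list above might be approximated like this. The regex is the one quoted in the text; the recursion into nested objects and the returned list of dropped keys are additions made for the sketch.

```python
import re

# Same pattern as quoted above, case-insensitive.
SENSITIVE_KEY = re.compile(r"secret|token|password|api_key", re.IGNORECASE)


def strip_sensitive_args(args: dict):
    """Return (clean_args, dropped_key_paths) with matching keys removed."""
    clean, dropped = {}, []
    for key, value in args.items():
        if SENSITIVE_KEY.search(key):
            dropped.append(key)  # record the key name only, never the value
        elif isinstance(value, dict):
            nested_clean, nested_dropped = strip_sensitive_args(value)
            clean[key] = nested_clean
            dropped.extend(f"{key}.{name}" for name in nested_dropped)
        else:
            clean[key] = value
    return clean, dropped
```

Note that credential_ref does not match the pattern, so legitimate refs pass through untouched.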
Outcomes
- Secret substrings in stored traces: 14% of runs → 0% over six weeks
- Unauthorized API calls blocked at broker: 3.2% of attempted tool calls (mostly misconfigured refs)
- Mean tool latency rose 18 ms from the broker hop, accepted as the price of avoiding rotation incidents
- Key rotation now zero-downtime: update Vault; no agent image rebuild
Technique decision table
| Approach | Best for | Weak when |
|---|---|---|
| Secrets in system prompt / env exposed to model | Local prototypes, read-only public APIs | Any production mutating tool; compliance; prompt injection risk |
| Credential broker + refs | Production agents with multiple integrations | Extreme low-latency edge without sidecar budget |
| Per-user OAuth via broker | Calendar, email, CRM agents acting on behalf of users | Batch jobs with no user context (use service refs) |
| Pre-authenticated sandbox only | Browser agents, code execution with network | Agents that must call diverse external APIs per turn |
| Human pastes secret each run | One-off admin scripts | Automation, scale, audit requirements |
Common pitfalls
- Teaching the model where secrets live: “use the key from env STRIPE_SECRET_KEY” guarantees eventual leakage.
- Optional secret fields in tool schemas: the model fills optional fields when confused.
- Logging full HTTP in tool errors: upstream 401 responses can echo Authorization headers.
- Sharing refs across tenants: stripe_prod_ro must map to per-tenant vault paths.
- Checkpointing bearer tokens: durable state becomes a secret store readable by support tools.
- Subagent inherits parent vault: child agents need strict ref subsets, not root access.
- Redaction only in UI: raw spans still export to S3; scrub at export.
- Ignoring vendor logs: tool args sent to third parties may be retained under their policies.
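For the cross-tenant pitfall above, the broker can derive the vault path from the tenant and the ref together, so the same ref name never resolves to another tenant's secret. The tenants/{tenant}/{ref} layout and the allowed character set are invented for this sketch and are not a Vault convention.

```python
import re

_SAFE_COMPONENT = re.compile(r"[a-z0-9_]+")


def vault_path(tenant_id: str, credential_ref: str) -> str:
    """Build a per-tenant path; reject components that could escape the prefix."""
    for part in (tenant_id, credential_ref):
        if not _SAFE_COMPONENT.fullmatch(part):
            raise ValueError(f"unsafe path component: {part!r}")
    # Hypothetical layout: one subtree per tenant.
    return f"tenants/{tenant_id}/{credential_ref}"
```

The tenant_id should come from the authenticated run context, never from a model-authored argument.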
Production checklist
- Zero raw secrets in system prompts, RAG corpora, or few-shot examples.
- Tool schemas use credential_ref enums, not free-text secret fields.
- Credential broker resolves refs; model never receives resolved values.
- Permission manifest binds refs to agent roles and environments.
- Short-lived delegation tokens per mutating tool call where supported.
- Span and audit exporters redact secret patterns before persistence.
- Weekly automated scan of traces and chat logs for leak patterns.
- OAuth refresh tokens encrypted; broker is sole accessor.
- Retries re-mint tokens; checkpoints store refs only.
- Subagents receive narrowed ref sets via delegation policy.
- Rotation runbook updates vault without redeploying agent images.
- Incident playbooks cover trace exposure and partner notification.
Key takeaways
- If the model can see a secret, assume it will leak — via tools, traces, or injection.
- Harbor cut trace secret exposure 14% → 0% with a broker and ref-only schemas.
- Inject at runtime in trusted middleware, not in prompts.
- Scoped TTL tokens bound blast radius when redaction fails.
- Redact at export — observability is a secret store unless proven otherwise.
Related reading
- Permission scoping and approval gates — which refs and tools each agent may use
- Sandbox execution — isolated environments for browser and code tools
- Agent audit trail and compliance — logging ref usage without secret material
- HashiCorp Vault — common backend for broker secret storage