Guide

LLM agent scheduled cron job trigger systems explained

Harbor Finance wired a month-end close agent to run at 0 2 1 * * in America/New_York — reconcile ledgers, draft variance commentary, and post accrual journals through approved write tools. The first production month looked fine. November broke: daylight saving time ended on a Sunday, the scheduler fired the job twice in the ambiguous 1:00–2:00 AM hour, and a mid-run deploy restarted the worker process, which re-enqueued the same tick because no overlap lock existed. Finance closed the books with duplicate accrual entries on 19% of subsidiary ledgers before anyone traced the bug to cron semantics, not model hallucination.

Scheduled cron triggers are how agents run without a human in the loop — nightly reports, hourly health scans, compliance attestations, and data-pipeline maintenance. They look simpler than webhook ingress but fail in subtler ways: timezones, missed ticks, overlapping long runs, and duplicate enqueue on process restart. This guide covers schedule registries, cron vs interval triggers, timezone and DST policy, missed-window catch-up rules, per-tick idempotency and overlap guards, integration with durable checkpoints, the Harbor Finance refactor, a technique decision table, pitfalls, and a production checklist alongside dead letter handling for poisoned scheduled jobs.

When cron beats webhooks and when it does not

Cron triggers answer: “run this agent at predictable wall-clock times.” Webhooks answer: “run when an external system pushes an event.” Most production platforms need both.

  • Cron fits — batch reconciliation, scheduled digests, SLA scans, cache warming, model evaluation sweeps, and any job whose inputs are time-bounded (“all transactions since last successful tick”).
  • Webhooks fit — ticket created, PR opened, payment succeeded. Latency-sensitive, event-shaped payloads.
  • Hybrid — cron enqueues a sweep; individual records still arrive via webhook and dedupe against the same idempotency ledger.

The mistake Harbor made was treating cron as “just call the agent” without the same durability guarantees as webhook ingress: signature verification becomes schedule-version verification; dedup becomes per-tick idempotency keys; the queue topology is identical.

Schedule registry: versioned, tenant-scoped definitions

Store schedules as data, not hard-coded crontab lines scattered across services. A minimal registry row:

{
  "schedule_id": "harbor-close-monthly",
  "agent_id": "ledger-close-v3",
  "cron": "0 2 1 * *",
  "timezone": "America/New_York",
  "enabled": true,
  "version": 7,
  "missed_policy": "skip",
  "overlap_policy": "forbid",
  "tenant_id": "harbor-finance-prod"
}

Every deploy that changes prompt, tools, or cron expression bumps version. Workers embed schedule_id + version + tick_timestamp in the idempotency key so a restarted process cannot double-fire the same logical run. Multi-tenant schedulers must bind tenant_id on the worker before any tool executes — see tenant isolation for credential scoping.

Support three trigger shapes in one registry:

  • Cron expression — calendar semantics with timezone (use a library that understands IANA zones, not fixed UTC offsets).
  • Fixed interval — every N minutes from last successful completion (not from enqueue) for jobs where wall-clock alignment matters less than spacing.
  • One-shot — run at run_at timestamp for delayed human approvals or maintenance windows.

Timezone, DST, and the ambiguous hour

Cron bugs cluster around daylight saving transitions. Rules Harbor now enforces:

  • Never store schedules as raw UTC offsets — always IANA timezone strings.
  • Spring forward (missing hour) — skip ticks that fall in nonexistent local times; log SKIPPED_DST_GAP.
  • Fall back (repeated hour) — fire at most once per logical tick; key idempotency on UTC instant, not local clock label. Harbor’s duplicate close used local hour as key and fired twice.
  • Document tenant preference — some finance teams want “2 AM local always”; others want “09:00 UTC globally.” Make the choice explicit in the registry UI.

Run a quarterly test suite that simulates DST edges for every active timezone in the registry. This is cheaper than a month-end restatement.

Missed-window policy: catch-up vs skip

Schedulers pause during outages. When the worker returns, the cron library may offer to fire all missed ticks. For agent jobs with write tools, that is dangerous.

Policy Behavior Use when
Skip Missed ticks are logged and dropped; next fire is the next valid cron match. Month-end close, daily digest, any job where duplicate logical periods cause harm.
Coalesce At most one catch-up run with backfill_from=last_success_tick in the payload. Metrics aggregation, read-only scans, idempotent sweeps.
Replay each Enqueue one job per missed tick in order. Rare; only when each interval is independent and writes are keyed by interval ID.

Harbor Finance moved month-end close to skip with manual “run now” for operators. Hourly read-only anomaly scans use coalesce so a 20-minute outage does not leave a gap in coverage.

Overlap guards and long-running agents

Agent runs can exceed the cron period — a 90-minute close job on an hourly schedule. Without overlap policy, ticks stack until the queue saturates.

  • forbid — if a run for schedule_id is active, drop or defer the new tick; emit metric cron_overlap_skipped.
  • allow_parallel — only for read-only agents with no shared mutable state; still cap concurrency per schedule.
  • cancel_previous — almost never appropriate for agents mid-tool-chain; prefer forbid.

Persist run state in checkpoints so a deploy mid-run resumes instead of restarting from scratch — Harbor’s second duplicate came from a fresh enqueue on boot rather than resume.

Per-tick idempotency and the side-effect ledger

Every scheduled enqueue should carry:

idempotency_key = hash(schedule_id, version, tick_utc_instant)
payload = { "period_start": "...", "period_end": "...", "trigger": "cron" }

The worker checks a side-effect ledger before executing write tools. If idempotency_key already reached COMPLETED, return cached summary. If IN_PROGRESS beyond lease TTL, escalate to DLQ triage instead of starting a parallel run.

Pair with retry policy: cron jobs should not blindly retry write tools on ambiguous timeout. The next cron tick is not a safe retry if the period boundary overlaps.

Harbor Finance refactor

Engineering shipped four changes after the duplicate close incident:

  1. UTC-based idempotency keys for all ticks, eliminating DST double-fire.
  2. Overlap lock with 4-hour lease on month-end close; mid-run deploys call resume, not re-enqueue.
  3. Missed policy = skip for all write-capable schedules; coalesce only on read-only scans.
  4. Schedule version in audit trail tied to compliance logs so auditors see which prompt/tool manifest ran for each period.

Duplicate journal rate on scheduled closes fell from 19% to 0.6% (remaining cases were manual reruns without new idempotency keys). Mean time to detect scheduler bugs dropped because overlap and skip metrics dashboarded alongside agent success rate.

Technique decision table

Approach Strengths Weaknesses Best for
In-process cron library Simple, low ops Dies with process; DST and overlap bugs Dev/staging only
Dedicated scheduler + queue (recommended) Durable enqueue, horizontal workers, uniform with webhooks Requires registry and idempotency discipline Production agents with write tools
External orchestrator (Airflow, Temporal) Rich DAG, backfill UI, mature observability Heavier; agent-specific payload mapping Multi-step pipelines mixing agents and ETL
Webhook-only (no cron) Event-exact timing No periodic sweeps; gaps if events drop Purely reactive workflows

Common pitfalls

  • Treating cron as fire-and-forget HTTP — scheduled agents need the same queue, dedup, and DLQ path as webhooks.
  • Local-time idempotency keys — DST fall-back duplicates every job that keys on hour-of-day.
  • Catch-up storms after outage — replaying 48 hourly ticks floods APIs and duplicates writes.
  • No overlap guard on long jobs — stack depth grows until workers OOM or rate limits trip.
  • Restart without resume — process boot re-enqueues active runs unless lease + checkpoint exist.
  • Silent timezone defaults — UTC vs tenant local causes reports to close on the wrong business day.
  • Cron for sub-minute latency — use webhooks or streaming; cron libraries are not real-time triggers.

Production checklist

  • Store schedules in a versioned registry with tenant binding.
  • Use IANA timezones; test DST spring and fall edges quarterly.
  • Define missed-window policy per schedule (skip / coalesce / replay).
  • Set overlap policy; forbid parallel runs on write-capable agents.
  • Generate idempotency keys from schedule_id + version + UTC tick.
  • Enqueue through the same durable queue as webhook ingress.
  • Resume from checkpoint on deploy; never blind re-enqueue.
  • Attach period boundaries (start/end) to scheduled payloads.
  • Dashboard overlap skips, missed ticks, and duplicate-key rejects.
  • Route exhausted scheduled runs to DLQ with tick metadata for replay.

Key takeaways

  • Cron triggers need the same durability as webhooks — registry, queue, idempotency, DLQ.
  • DST and overlap are the top duplicate-run sources — key on UTC instants and forbid overlapping writes.
  • Missed-window policy must be explicit — skip for period-bound closes; coalesce for read sweeps.
  • Harbor Finance cut duplicate closes from 19% to 0.6% with UTC keys, overlap locks, and skip-on-miss.
  • Checkpoint resume beats restart-enqueue on deploy for long scheduled agent runs.

Related reading