Reference · May 18, 2026

Dead man's switch, explained for developers (and how to actually build one)

Topics: Monitoring
A dead man's switch is the developer pattern for “alert me when the silence is the problem.” This post is the plain-English explainer, five real scenarios, and a walkthrough you can ship in one lunch break.

A dead man's switch is the developer pattern for "alert me when the silence is the problem." Most of your monitoring stack watches for things going wrong: a 500, a slow request, a queue backing up. A dead man's switch watches for nothing happening when something was supposed to. Your nightly backup script that never logged today. The worker that crashed at 3am. The cron that the platform silently throttled. This post is the developer-flavored explainer for what a dead man's switch is, when you actually need one, and how to cover it with Crontap (plus when to add a dedicated absence monitor).

For the heartbeat vs uptime distinction (they catch different failures), start with Cron heartbeat vs uptime monitor. For the full Healthchecks.io pairing walkthrough, see Pair Crontap with Healthchecks.io for end-to-end monitoring.

What a dead man's switch is

The literal mechanical version

The name comes from railways. A dead man's switch on a train requires the driver to hold a lever or pedal. If the driver becomes incapacitated and releases it, the brakes engage automatically. The safety mechanism is tied to continuous human (or machine) presence, not to detecting a specific failure mode.

The software version

In software, the same idea shows up as a heartbeat or dead-man check: something that should happen on a schedule pings a monitor URL. If the ping does not arrive inside a tolerance window, the monitor pages you. The alert is about absence, not about an error response.
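The tolerance-window rule above fits in a few lines. This is a minimal sketch of the watcher's decision, not how any particular heartbeat service implements it; the function name `is_overdue` and the period and grace values are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def is_overdue(last_ping: datetime, period: timedelta,
               grace: timedelta, now: datetime) -> bool:
    """Return True when the next expected ping is late by more than the grace window."""
    deadline = last_ping + period + grace
    return now > deadline


# Hypothetical daily job that last checked in at 02:00 UTC.
last_ping = datetime(2026, 5, 18, 2, 0, tzinfo=timezone.utc)
period = timedelta(days=1)
grace = timedelta(minutes=30)

# Ten minutes after the next expected ping: still inside the window.
on_time = is_overdue(last_ping, period, grace,
                     datetime(2026, 5, 19, 2, 10, tzinfo=timezone.utc))

# An hour after the next expected ping: the window has closed, so alert.
missed = is_overdue(last_ping, period, grace,
                    datetime(2026, 5, 19, 3, 0, tzinfo=timezone.utc))
```

Note that the check never inspects an error response. The only inputs are the clock and the time of the last ping, which is what makes it an absence check.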

"Fail deadly" and why it is the same idea

"Fail deadly" is the security framing: the system defaults to the dangerous outcome unless something actively proves it is still healthy. A dead man's switch inverts that for ops: unless your job actively checks in, you assume it failed. Hosted heartbeat services phrase it as monitoring "did this thing happen?" rather than "did this HTTP call return 200?"

When to use one (five developer scenarios)

  1. Nightly backup or ETL. The job runs at 2am. If it does not run, nobody notices until someone asks for yesterday's data. A dead-man check pages you when the success ping never arrives.

  2. Cron job that pushes data into a warehouse. Stripe sync, HubSpot export, Shopify inventory pull. The dashboard looks fine because it still shows last week's numbers. Silence is the bug.

  3. Long-running worker process. A queue consumer or sidecar that should heartbeat every N minutes. Process death is invisible to an uptime URL check on the public site.

  4. CI or scheduled pipeline heartbeat. GitHub Actions, Render cron, or an external scheduler fires a workflow. You want an alert when the workflow never started, not only when it failed mid-run.

  5. Personal or small-team "sanity" jobs. A script that emails you a digest, rotates logs, or renews a token. Low traffic, high consequence if it stops.

For scheduled HTTP work on Crontap, see monitoring heartbeats and uptime monitoring.

How a dead man's switch works in practice

The two halves: the thing that should fire vs the watcher

Every implementation has the same shape:

[Scheduler]  →  runs job  →  job succeeds  →  ping monitor URL
                   └── job fails or never runs  →  no ping  →  alert

The scheduler owns the clock (system cron, Crontap, GitHub Actions, platform cron). The job does the work and, on success, hits the monitor. The watcher (a hosted heartbeat service, or a second schedule that pings a dead-man URL) holds the tolerance window and fires when the ping is late or missing.

Why uptime monitoring is not the same thing

An uptime monitor asks: "Is this URL up right now?" It does not know your backup was supposed to run at 2am. A dead-man check asks: "Did I hear from this job inside the window I expected?"

Crontap uptime is built for the first question: paste a public URL, pick a probe interval, get email when probes fail. That is the right tool when customers hit your API and you need a green/red chart. It is not a substitute for calendar-aware absence detection on a background job.

Crontap schedule failure alerts cover another slice: the schedule fired, and the HTTP call returned 4xx, 5xx, or timed out. They do not cover the silent case: the schedule was paused, deleted, or never ran, so there is no failed run to alert on. That gap is what dead-man monitoring fills. We walk through all three layers in Cron heartbeat vs uptime monitor.

What Crontap covers out of the box

Most teams start here before adding a second tool.

1. Fire the job and alert on failure

Create the schedule at Crontap. Paste your job URL, set the cron expression and IANA timezone (for example 0 2 * * * in America/New_York for a 2am backup). Route failure alerts to email, Slack, Discord, or Telegram in the integrations panel.
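Why the IANA timezone matters: the same local 2am lands on a different UTC hour depending on the season. The sketch below uses Python's standard `zoneinfo` module to show the shift; the helper name `local_run_in_utc` and the dates are assumptions for illustration, and it says nothing about how Crontap evaluates schedules internally.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

NEW_YORK = ZoneInfo("America/New_York")


def local_run_in_utc(year: int, month: int, day: int, hour: int,
                     tz: ZoneInfo) -> datetime:
    """Convert a local wall-clock run time to the UTC instant it corresponds to."""
    local = datetime(year, month, day, hour, 0, tzinfo=tz)
    return local.astimezone(timezone.utc)


# 2am in New York during standard time (UTC-5) is 07:00 UTC.
winter = local_run_in_utc(2026, 1, 15, 2, NEW_YORK)

# 2am in New York during daylight time (UTC-4) is 06:00 UTC.
summer = local_run_in_utc(2026, 7, 15, 2, NEW_YORK)
```

A schedule pinned to a fixed UTC hour would drift by an hour against the local calendar twice a year. One more wrinkle: on the spring-forward date in the US, the local hour between 2:00am and 3:00am does not exist, so check how your scheduler treats that day before relying on a 2am slot.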

This covers the loud failure mode: Crontap fired, the endpoint misbehaved, you get paged. Most platform schedulers (Vercel Cron, GitHub Actions, Heroku Scheduler) stop at firing the job or make failure alerts harder to wire.

2. Uptime on public URLs

If you also need "is this URL responding 200 for customers?", add an uptime monitor on /uptime. Same account, same alert channels. Pro unlocks 1-minute probes and 90-day history.

Uptime does not prove your nightly job ran. It proves the endpoint answers when probed. Keep both patterns when you have customer-facing URLs and calendar-driven background work.

3. When silence is the bug, add a dead-man ping

If you need an alert when nothing ran at all, something has to watch for missing success pings. Crontap does not raise an alert on absence today; there is no failed run to surface.

The pattern we recommend: Crontap fires the job; your job pings a dead-man URL on success only. Many teams use Healthchecks.io for that watcher (free tier, cron-mode checks, familiar integrations). The click-by-click recipe is in Pair Crontap with Healthchecks.io for end-to-end monitoring.

At the end of the success path:

generate-and-send-report.sh && curl -fsS -m 10 --retry 5 -o /dev/null "https://hc-ping.com/<uuid>"

For WordPress or PHP, call wp_remote_get with 'blocking' => false after the email sends, so the request does not wait on the ping. The && matters: if the report fails, the ping must not fire, or the dead-man monitor records the run as healthy.
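If the job is launched from Python rather than a shell, the same success-only rule might look like the sketch below. The helper names `http_ping` and `run_then_ping` are hypothetical, and the demo at the bottom swaps the real HTTP call for a list append so it runs without network access.

```python
import subprocess
import sys
import urllib.request
from typing import Callable, List, Sequence


def http_ping(url: str, timeout: float = 10.0) -> None:
    """Send the success ping; raises on network errors or HTTP error statuses."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()


def run_then_ping(command: Sequence[str], ping: Callable[[], None]) -> int:
    """Run the job and ping only when it exits 0, mirroring `job && curl`."""
    result = subprocess.run(command)
    if result.returncode == 0:
        ping()
    return result.returncode


# Demo with stand-in jobs: one exits 0, one exits 3.
pings: List[str] = []
ok = run_then_ping([sys.executable, "-c", "raise SystemExit(0)"],
                   lambda: pings.append("success job pinged"))
failed = run_then_ping([sys.executable, "-c", "raise SystemExit(3)"],
                       lambda: pings.append("failed job pinged"))
```

In real use the second argument would be something like `lambda: http_ping(ping_url)`, with `ping_url` holding your check's URL. Only the job that exited 0 leaves an entry in `pings`, which is the behavior the watcher depends on.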

You only need this third layer when a missed run is as bad as a failed run. If failure alerts are enough, stop at step 1.

Fix this in 60 seconds with Crontap. Free forever tier. Three schedules. No credit card. Schedule your first job →

Common gotchas

Clock drift and timezone bugs. The scheduler and the dead-man check must agree on the calendar. Use the same IANA timezone on Crontap and on the heartbeat check. A weekly job at "5am Monday" in Europe/London needs the watcher configured for Monday 5am London, not UTC.

The alert system is what failed. If your only pager is Slack and Slack is down, you may not see the dead-man alert. For critical jobs, route to email and a second channel.

Pinging on failure. If you ping the dead-man URL even when the job fails, the monitor thinks everything is fine. Ping only after successful completion.

Too-tight grace windows. A 5-minute grace on a daily 2am job causes false positives when the job legitimately finishes and pings at 2:07am. Set grace slightly larger than the job's normal runtime plus its usual variance, not larger than the cadence itself.
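A worked version of the 2:07am case, as a rough sketch. The function `would_alert` is a hypothetical simplification that judges a run after the fact; a real watcher makes the same comparison live when the deadline passes.

```python
from datetime import datetime, timedelta
from typing import Optional


def would_alert(scheduled: datetime, ping_at: Optional[datetime],
                grace: timedelta) -> bool:
    """True when no ping arrived by scheduled + grace."""
    if ping_at is None:
        return True  # the job never checked in at all
    return ping_at > scheduled + grace


scheduled = datetime(2026, 5, 18, 2, 0)
finished = datetime(2026, 5, 18, 2, 7)  # healthy run, seven minutes long

# 5-minute grace: the healthy run still trips the alert (false positive).
tight = would_alert(scheduled, finished, timedelta(minutes=5))

# 30-minute grace: the healthy run passes.
roomy = would_alert(scheduled, finished, timedelta(minutes=30))

# No ping at all: alerts regardless of how generous the grace is.
never = would_alert(scheduled, None, timedelta(minutes=30))
```

Widening the grace only delays the alert for a truly missing run; it never hides one.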

Secrets in ping URLs. Treat heartbeat URLs like bearer tokens. Do not commit them to public repos or log them in plaintext.
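One way to keep the ping URL out of source control and logs is to read it from the environment and redact it before printing. This is a sketch under assumptions: the variable name `HEARTBEAT_PING_URL` and both helper functions are made up for illustration.

```python
import os
from urllib.parse import urlsplit


def load_ping_url(env_var: str = "HEARTBEAT_PING_URL") -> str:
    """Read the ping URL from the environment instead of hard-coding it."""
    url = os.environ.get(env_var, "")
    if not url:
        raise RuntimeError(f"{env_var} is not set")
    return url


def redact(url: str) -> str:
    """Keep scheme and host for logs; hide the path that carries the secret."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/<redacted>"


# Placeholder URL, not a real check.
example = redact("https://hc-ping.com/not-a-real-uuid")
```

Logging `redact(url)` still tells you which service was called without exposing the token-like path.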

FAQ

Is a dead man's switch the same as a watchdog timer?

Close. A watchdog timer in embedded systems resets a counter when fed periodically; if the counter expires, the system resets or halts. A dead-man check is the hosted-ops version: an external service expects a ping on cadence and alerts when the ping stops.

Can I build one without a service?

Yes, with trade-offs. You can cron a script that checks the timestamp on last_success.txt and emails you if it is stale. You then maintain the checker, the alert routing, and the edge cases (DST, paused crons, duplicate alerts). Hosted heartbeats exist because that script becomes another thing to monitor.
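The core of such a do-it-yourself checker could be as small as the sketch below. It assumes the job touches or rewrites last_success.txt on every successful run; the function name `is_stale` and the one-hour threshold are illustrative, and the email step is left out on purpose.

```python
import os
import tempfile
import time
from typing import Optional


def is_stale(path: str, max_age_seconds: float,
             now: Optional[float] = None) -> bool:
    """True when the marker file is missing or older than max_age_seconds."""
    now = time.time() if now is None else now
    try:
        modified = os.path.getmtime(path)
    except OSError:
        return True  # a marker that was never written counts as silence
    return now - modified > max_age_seconds


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    marker = os.path.join(tmp, "last_success.txt")
    missing = is_stale(marker, 3600)                        # no file yet
    with open(marker, "w") as handle:
        handle.write("ok")
    fresh = is_stale(marker, 3600)                          # just written
    old = is_stale(marker, 3600, now=time.time() + 7200)    # two hours later
```

Everything around this function is the part you end up maintaining: the cron entry that runs the checker, the alert delivery, and deduplication so a stale file does not email you every minute.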

What is the minimum interval I should monitor?

Match the job's cadence plus margin. Hourly job: 5-10 minute grace. Daily job: 30-60 minutes. Weekly job: a few hours is fine. Too tight creates noise; too loose delays real incidents.

Does Crontap uptime replace a dead-man check?

No. Uptime probes a URL from the outside. A dead-man check waits for your job to phone home on a calendar. Use Crontap uptime for public URL health; use a dead-man ping when the fear is "the cron never fired."

Does Crontap replace Healthchecks.io?

For scheduling, failure alerts, and URL uptime, often yes, and on one bill. For absence detection ("the run never happened"), Healthchecks (or a similar heartbeat service) still fills the gap until Crontap grows native absence alerts. Pair both when silence is unacceptable; see the pairing post or Crontap vs Healthchecks.io if you are comparing vendors.

Where does this fit with cron job monitoring as a product category?

Dead-man checks are one layer of cron job monitoring: schedule health, run logs, failure alerts, and heartbeat absence together. Crontap ships the first three plus uptime; dedicated heartbeat tools specialize in the last.
