"Zero downtime deployments" sounds like a feature request from a CTO PowerPoint. In practice it's a mostly-solved problem on any modern PaaS — but the specific configuration matters, and database migrations have a way of unmaking the whole thing if you don't think carefully.
What zero-downtime actually means
Two distinct things:
- No request is dropped during a deploy. A user mid-API-call shouldn't see a 502.
- No request returns errors because of a deploy. A user shouldn't see "column not found" because the new schema isn't applied yet.
The first is solved by the platform; the second is solved by you. We'll cover both.
How the platform handles the deploy
A modern container PaaS performs a deploy roughly like this:
- Build the new image (this can take minutes; the old container keeps serving).
- Start the new container alongside the old one.
- Wait for the new container's healthcheck to pass.
- Switch the proxy to route new requests to the new container.
- Allow in-flight requests on the old container to drain (typically 30–60 seconds).
- Stop the old container.
If any step fails, the deploy rolls back automatically: the old container keeps serving, so a failed deploy shouldn't take your app down either.
What you need to do
The platform handles most of this, but a few things are on you:
1. Implement a real healthcheck
Most platforms hit / by default to verify the container is alive. Don't settle for that default: it will succeed even if your database is unreachable. Add an explicit /healthz route that checks:
- The application can reach Postgres.
- The application can reach Redis (if present).
- Any other "without this we serve garbage" dependency.
Return 200 if everything's good, 503 otherwise. The platform will refuse to switch traffic to a new container that's failing its healthcheck.
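As a sketch, the healthcheck logic can be factored into one function that runs every dependency check and maps the result to a status code. The commented wiring below assumes hypothetical pg and ioredis clients; substitute whatever "without this we serve garbage" checks your app actually has.

```javascript
// Run all dependency checks; 200 only if every one passes.
async function healthStatus(checks) {
  try {
    await Promise.all(checks.map((check) => check()));
    return 200;
  } catch {
    return 503; // the platform will not switch traffic to this container
  }
}

// Example wiring (hypothetical pool/redis clients, Express-style route):
// app.get("/healthz", async (req, res) => {
//   const status = await healthStatus([
//     () => pool.query("SELECT 1"), // Postgres reachable?
//     () => redis.ping(),           // Redis reachable?
//   ]);
//   res.sendStatus(status);
// });
```

Keeping the checks as an array makes it cheap to add the next "without this we serve garbage" dependency later.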
2. Drain in-flight requests
Most languages and frameworks support graceful shutdown out of the box. Verify yours does. In Node:
```javascript
const server = app.listen(3000);

process.on("SIGTERM", () => {
  // Stop accepting new connections, then exit once in-flight requests finish.
  server.close(() => process.exit(0));
});
```
Without this, the platform's "stop the old container" step will hard-kill in-flight requests, and users see 502s during the cutover.
3. Don't run schema migrations on the start path
This is the migration trap. If your application runs prisma migrate deploy (or equivalent) during container start, you have a race:
- Old container is still serving requests against the old schema.
- New container starts, runs migrations, schema changes.
- Old container's queries now fail because columns moved.
The fix is to run migrations as a separate step before the new container starts. On Launchverse this is the pre-deploy hook — runs once, against the existing infrastructure, before the new container is started.
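The resulting pipeline order looks roughly like this. The hook name is Launchverse's; the YAML shape below is illustrative, not its actual config syntax, so check the platform docs for the real format:

```yaml
# Illustrative deploy config — not real Launchverse syntax.
predeploy: npx prisma migrate deploy  # runs once, before the new container starts
start: node server.js                 # no migrations here; the start path stays clean
healthcheck: /healthz
```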
4. Make migrations backwards-compatible
Even with a pre-deploy hook, there's still a window where the old code runs against the new schema. Make every migration safe in both directions:
- Adding a column? Always nullable. Old code ignores it.
- Renaming a column? Don't. Add the new one, copy data, deploy code that uses the new one, then drop the old one in a second deploy.
- Dropping a column? Stop reading it in code first, deploy, then drop.
- Changing a type? Same as renaming — stage it across deploys.
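The rename staging above, sketched as raw SQL migrations (table and column names are made up for illustration):

```sql
-- Deploy 1: expand. Old code never reads the new column, so this is safe.
ALTER TABLE users ADD COLUMN full_name text;  -- nullable, so old writes still succeed
UPDATE users SET full_name = name;            -- backfill existing rows

-- Deploy 2: ship application code that reads and writes full_name only.

-- Deploy 3: contract, once nothing references the old column.
ALTER TABLE users DROP COLUMN name;
```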
This sounds laborious, and it is. The alternative is scheduled downtime windows, which are worse for users and, in practice, harder to arrange than staging the migration properly.
Connection draining and websockets
If your app uses long-lived connections (websockets, SSE), the default 30-second drain timeout will cut them off. Either:
- Raise the timeout (Launchverse exposes this as Graceful Shutdown Timeout in project settings), or
- Implement client-side reconnect logic; most websocket libraries have this built in.
Both work, but the second is the pattern most resilient large products use: reconnect logic also covers network blips and server restarts, not just deploys.
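If your library doesn't ship reconnect logic, the core of it is a capped exponential backoff. The delay function below is a minimal sketch; the commented WebSocket wiring is illustrative, not any particular library's API:

```javascript
// Capped exponential backoff: 250ms, 500ms, 1s, 2s, ... up to 10s.
function reconnectDelay(attempt, baseMs = 250, capMs = 10_000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Illustrative browser-side wiring:
// function connectWithRetry(url, attempt = 0) {
//   const ws = new WebSocket(url);
//   ws.onopen = () => { attempt = 0; }; // reset backoff on success
//   ws.onclose = () => {
//     setTimeout(() => connectWithRetry(url, attempt + 1), reconnectDelay(attempt));
//   };
//   return ws;
// }
```

The cap matters: without it, a client that has been offline for a while can end up waiting minutes to reconnect after a 30-second deploy.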
Verifying it works
Run wrk or k6 against your production URL during a deploy. If you see any 502s, your healthcheck or graceful shutdown is broken. Fix until you don't.
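If you'd rather not reach for a load-testing tool, a plain Node script polling the URL in a loop works too. This is a sketch (the URL and duration are placeholders; it uses the global fetch available in Node 18+):

```javascript
// Count responses that indicate a dropped or failed request.
function countFailures(statuses) {
  return statuses.filter((s) => s >= 500).length;
}

// Poll a URL in a tight loop while you trigger a deploy.
async function pollDuringDeploy(url, seconds = 120) {
  const statuses = [];
  const stopAt = Date.now() + seconds * 1000;
  while (Date.now() < stopAt) {
    try {
      const res = await fetch(url); // global fetch, Node 18+
      statuses.push(res.status);
    } catch {
      statuses.push(599); // connection refused counts as a failure too
    }
  }
  return countFailures(statuses); // should be 0 on a clean deploy
}

// Usage (placeholder URL):
// pollDuringDeploy("https://example.com/healthz").then(console.log);
```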
What about background jobs?
Cron jobs and queue workers need their own deploy story. The simplest pattern: a separate "worker" service running in the same project. Deploy it the same way: the platform replaces the container, both old and new workers consume the queue during the overlap, and nothing is lost, provided the worker (like the web process) finishes its current job on SIGTERM.
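A graceful worker follows the same shape as the web process: on SIGTERM it stops picking up new jobs but finishes the one in hand. A minimal sketch, where queue and handle stand in for your real queue client and job handler:

```javascript
// Flip a flag on SIGTERM instead of exiting immediately.
let shuttingDown = false;
process.on("SIGTERM", () => { shuttingDown = true; });

async function runWorker(queue, handle) {
  while (!shuttingDown) {
    const job = queue.shift();     // stand-in for "reserve the next job"
    if (job === undefined) break;  // a real worker would block or poll here
    await handle(job);             // an in-flight job always runs to completion
  }
}
```

The loop only checks the flag between jobs, so a SIGTERM never interrupts a job mid-run; make sure your job durations fit inside the platform's drain timeout.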