Zero-Downtime Laravel Deployments on AWS


The first time I deployed a Laravel app to AWS "properly," the deploy took the site down for about 45 seconds. Composer install, npm build, migrations, cache clear, all happening while the app was supposed to be serving requests. Forty-five seconds doesn't sound like much, but if you're running anything that people actually depend on, it's an eternity.

Getting to true zero-downtime deployments took me a few iterations. Here's the strategy I've settled on, using AWS CodeDeploy with EC2 instances behind a load balancer.

The Strategy

The core idea is straightforward: prepare everything the new release needs before switching traffic to it. Don't install dependencies while serving requests. Don't run builds while serving requests. Do all of that in the background, and only flip the switch when the new code is fully ready.

With CodeDeploy, you define lifecycle hooks in an appspec.yml file. Each hook runs a script at a specific phase of the deployment. Here's the one I use:

version: 0.0
os: linux
files:
  - source: /
    destination: /var/www/releases/current

hooks:
  BeforeInstall:
    - location: scripts/before-install.sh
      timeout: 60
      runas: root

  AfterInstall:
    - location: scripts/after-install.sh
      timeout: 300
      runas: deploy

  ApplicationStart:
    - location: scripts/application-start.sh
      timeout: 120
      runas: deploy

  ValidateService:
    - location: scripts/validate-service.sh
      timeout: 60
      runas: deploy

Each hook has a timeout, and if the script doesn't finish in time, the deployment fails and rolls back. Here's what each script does.

BeforeInstall: Clean Up

This runs before CodeDeploy copies the new files. I use it to set up the release directory and clean up old releases:

#!/bin/bash
set -e

RELEASE_DIR="/var/www/releases/current"
SHARED_DIR="/var/www/shared"

# If the live site is still being served out of this directory, move the
# running release aside so it keeps serving requests while we build the new one
if [ "$(readlink -f /var/www/active 2>/dev/null)" = "$RELEASE_DIR" ]; then
    PREVIOUS="/var/www/releases/release-$(date +%Y%m%d%H%M%S)"
    mv "$RELEASE_DIR" "$PREVIOUS"
    ln -nfs "$PREVIOUS" /var/www/active
fi

# Create the release directory fresh
rm -rf "$RELEASE_DIR"
mkdir -p "$RELEASE_DIR"

# Ensure the shared storage directory and the shared .env file exist
# (.env is a file, so touch it rather than mkdir it)
mkdir -p "$SHARED_DIR/storage"
[ -f "$SHARED_DIR/.env" ] || touch "$SHARED_DIR/.env"

# Keep only the last 5 releases for rollback
cd /var/www/releases
ls -t | tail -n +6 | xargs -r rm -rf

The shared directory pattern is important. Things like storage files, the .env file, and any uploaded content shouldn't live inside the release directory. They need to persist across deployments. We'll symlink them in the next step.

AfterInstall: The Heavy Lifting

This is where most of the work happens, and it's where zero downtime lives or dies. The old code is still serving requests at this point; we haven't switched anything yet. We're just preparing the new release.

#!/bin/bash
set -e

RELEASE_DIR="/var/www/releases/current"
SHARED_DIR="/var/www/shared"

cd "$RELEASE_DIR"

# Symlink shared resources
ln -nfs "$SHARED_DIR/.env" .env
ln -nfs "$SHARED_DIR/storage" storage

# Install PHP dependencies
composer install --no-dev --no-interaction --prefer-dist --optimize-autoloader

# Install and build frontend. Build tools (Vite, etc.) live in
# devDependencies, so do a full install; node_modules is only needed
# at build time, never at runtime.
npm ci
npm run build

# Cache configuration and routes
php artisan config:cache
php artisan route:cache
php artisan view:cache
php artisan event:cache

# Run migrations
php artisan migrate --force

A couple of critical things are happening here. First, composer install with --no-dev skips development dependencies (you don't need PHPUnit on your production server). The --optimize-autoloader flag generates a class map for faster autoloading.

Second, migrations. This is the trickiest part of zero-downtime deployments. The old code is still running against this database, so your migrations can't break backward compatibility. You can't rename a column that the old code is still using. You can't drop a table that the old code queries.

My rule: every migration must be backward-compatible with the previous release. If I need to rename a column, I do it in two deployments. First deployment adds the new column and writes to both. Second deployment removes the old column after confirming nothing reads from it. It's more work, but it's the only way to do this safely.
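As a sketch of that first, expand-only deployment — using a hypothetical users table where name is becoming full_name, and assuming Laravel 9+'s anonymous migration classes — the migration adds and backfills without touching the old column:

```php
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Schema;

return new class extends Migration
{
    public function up(): void
    {
        // Expand: add the new column alongside the old one.
        // Old code keeps reading and writing `name` and never notices.
        Schema::table('users', function (Blueprint $table) {
            $table->string('full_name')->nullable();
        });

        // Backfill so the new code finds data immediately
        DB::table('users')
            ->whereNull('full_name')
            ->update(['full_name' => DB::raw('name')]);
    }

    public function down(): void
    {
        Schema::table('users', function (Blueprint $table) {
            $table->dropColumn('full_name');
        });
    }
};
```

Dropping name is its own migration in a later deploy, shipped only once nothing reads from it.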

ApplicationStart: Flip the Switch

This is the moment of truth. Everything's prepared, and now we switch traffic to the new code:

#!/bin/bash
set -e

RELEASE_DIR="/var/www/releases/current"

# Update the symlink that nginx/Apache points to
ln -nfs "$RELEASE_DIR" /var/www/active

# Reload PHP-FPM to clear opcache and pick up new code
sudo systemctl reload php8.4-fpm

# Restart queue workers so they pick up the new code
php /var/www/active/artisan queue:restart

# Restart Horizon if you're using it
# php /var/www/active/artisan horizon:terminate

The key here is reload, not restart, for PHP-FPM. A reload gracefully finishes existing requests before spawning new worker processes. A restart kills everything immediately. That's the difference between zero downtime and "oops, 200 users just got 502 errors."
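One prerequisite for that reload: the script runs as the deploy user, so sudo has to allow that single command without a password prompt. Assuming a user named deploy and the php8.4-fpm unit name from the script above, a sudoers drop-in might look like this (always edit it with visudo so a syntax error can't lock you out):

```shell
# /etc/sudoers.d/deploy-php-fpm
# (edit with: visudo -f /etc/sudoers.d/deploy-php-fpm)
# Let the deploy user reload PHP-FPM, and nothing else, without a password
deploy ALL=(root) NOPASSWD: /usr/bin/systemctl reload php8.4-fpm
```

The systemctl path may differ by distro; check it with `command -v systemctl` before writing the rule.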

The queue:restart command does the same graceful thing for queue workers. It signals them to finish their current job and then exit. Supervisor (or whatever process manager you're using) will start new workers that load the new code.
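For reference, a minimal Supervisor program for those workers might look like this — the program name, user, paths, and worker count are all assumptions to adapt to your setup:

```ini
[program:laravel-worker]
; Point at the /var/www/active symlink so restarted workers load the new release
command=php /var/www/active/artisan queue:work --sleep=3 --tries=3
user=deploy
numprocs=2
process_name=%(program_name)s_%(process_num)02d
autostart=true
autorestart=true
; Give workers time to finish their current job after queue:restart signals them
stopwaitsecs=60
```

The stopwaitsecs value should be longer than your slowest job, or Supervisor will kill workers mid-job during a deploy.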

ValidateService: Trust but Verify

After the switch, CodeDeploy runs a validation step. I use this to hit a health check endpoint and make sure the app is actually working:

#!/bin/bash
set -e

# Hit the health check endpoint
HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" http://localhost/health)

if [ "$HTTP_STATUS" != "200" ]; then
    echo "Health check failed with status $HTTP_STATUS"
    exit 1
fi

echo "Application is healthy"

If this script exits with a non-zero code, CodeDeploy marks the deployment as failed and rolls back to the previous version. This is your safety net. I've had deployments where everything installed fine but the app threw a 500 because of a missing environment variable or a config typo. The validation step caught it before any real users were affected.

Make sure your health check endpoint actually tests something meaningful. Don't just return 200 from a route. Check the database connection, check Redis, check that critical services are reachable. I usually have a /health route that does all of this.
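A sketch of such a route, assuming the standard Laravel facades and that Redis is actually part of your stack (drop any check that doesn't apply):

```php
<?php
// routes/web.php

use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Redis;
use Illuminate\Support\Facades\Route;

Route::get('/health', function () {
    try {
        // Fails fast if the database is unreachable or credentials are wrong
        DB::connection()->getPdo();

        // Fails if Redis is down or misconfigured
        Redis::ping();
    } catch (\Throwable $e) {
        return response('unhealthy: ' . $e->getMessage(), 503);
    }

    return response('ok', 200);
});
```

Returning the exception message makes a failed deploy's logs immediately useful; if this route is publicly reachable, consider logging the detail instead and returning a bare 503.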

The Migration Problem, Revisited

I glossed over this earlier, but it deserves more attention because it's the hardest part of zero-downtime deployments. The window between "migrations run" and "new code serves requests" is where things can break.

Here's my approach:

Adding columns is safe. The old code ignores columns it doesn't know about.

Removing columns is not safe. Remove them in a follow-up deployment after the new code is running.

Renaming columns is not safe. Add the new column, copy data, deploy code that uses the new column, then drop the old column in a subsequent deploy.

Adding tables is safe. Removing tables follows the same "two deployment" pattern as columns.

For most applications, this "expand and contract" pattern works well. It's a bit more work than just running whatever migration you wrote, but it means you literally never have to take the site offline for a deploy.

What About Blue/Green?

CodeDeploy also supports blue/green deployments, where you spin up an entirely new set of instances, deploy to them, and then switch the load balancer. It's conceptually cleaner but significantly more expensive (you're running double the instances during deployment) and slower (spinning up new instances takes time).

I use in-place deployments for most projects. Blue/green is nice when you need absolute rollback guarantees: you can switch the load balancer back to the old instances instantly. But the in-place approach with the ValidateService hook gives me enough confidence for most situations.

That's It, Really

Zero-downtime deployments aren't magic. They're just careful sequencing: prepare everything first, switch quickly, verify it works. The CodeDeploy lifecycle hooks give you natural places to put each step, and the rollback capability means a bad deploy doesn't have to be a crisis.

The hardest part honestly isn't the deployment pipeline. It's the discipline around backward-compatible migrations. Once you get that part down, everything else is just bash scripts and YAML.