Monitor WordPress Site Health with Shell Scripts, Cron Jobs, and Email Alerts • HelloAdmin

A WordPress site health monitoring stack built from shell scripts and cron jobs provides lightweight alerting without SaaS subscriptions and with no dependencies beyond standard command-line tools. It covers HTTP availability, response time, disk usage, MySQL replication lag, and PHP error log growth.

The core of each health check is a curl command that measures the HTTP status code and time to first byte: curl -o /dev/null -s -w "%{http_code} %{time_starttransfer}" https://example.com. A non-200 status or a response time above a threshold triggers an email alert via mail or sendmail. Disk usage checks use df -h with awk to extract the usage percentage for the partitions holding the WordPress files and the MySQL data directory; alerting at 80% and 90% gives enough lead time to clean up before the disk fills. PHP error log monitoring uses wc -l to count lines in /var/log/php_errors.log, stores the count in a state file between runs, and alerts when more than 50 new lines appear in a 5-minute window. MySQL replication lag is checked with mysql -e "SHOW REPLICA STATUS\G" and grep Seconds_Behind_Source; a lag above 30 seconds triggers an alert for the replica setup described in the replication post.

All check scripts write to a structured log file in /var/log/wp-monitor/ with ISO-8601 timestamps, and each alert email includes the hostname, timestamp, check name, current value, and threshold for immediate context. The scripts are owned by root, are not world-writable, and run as a dedicated wpmonitor system user. That user has read-only MySQL access through a separate monitor@localhost account that holds only the REPLICATION CLIENT privilege.

Cron scheduling uses standard crontab syntax: */5 * * * * for the availability and error-log checks, 0 * * * * for disk checks, and */1 * * * * for replication lag. Rate-limiting alerts with a lockfile prevents flooding the inbox during extended outages: a check sends a new email only if more than 30 minutes have passed since the last alert for the same check.

Problem: WordPress site outages, slow response times, and disk-full conditions go undetected for hours because no automated monitoring is in place, so the first notification comes from an angry customer rather than from an alert system.

Solution: Write shell scripts that check HTTP status, response time, disk usage, and PHP error log growth, schedule them with cron, send email alerts with context when thresholds are exceeded, and rate-limit alerts with a lockfile to prevent inbox flooding.

#!/usr/bin/env bash
# /usr/local/bin/wp-health-check.sh
# Checks HTTP status and TTFB; sends email alert on failure.

set -euo pipefail

SITE_URL="https://example.com"
ALERT_EMAIL="admin@example.com"
TTFB_LIMIT=3.0          # seconds
LOCK_DIR="/var/run/wp-monitor"
LOG_FILE="/var/log/wp-monitor/health.log"
HOSTNAME_=$(hostname -f)
NOW=$(date --iso-8601=seconds)

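# Both directories must be writable by the user running the script.
# /var/run is cleared on reboot, so the lock directory is recreated here.
mkdir -p "$LOCK_DIR" "$(dirname "$LOG_FILE")"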

# Measure HTTP status and TTFB
read -r HTTP_CODE TTFB <<< "$(
    curl -o /dev/null -s -w "%{http_code} %{time_starttransfer}" \
         --max-time 10 --connect-timeout 5 "$SITE_URL"
)"

echo "$NOW  status=$HTTP_CODE  ttfb=${TTFB}s" >> "$LOG_FILE"

send_alert() {
    local subject="$1" body="$2" lock="$LOCK_DIR/alert_${3}.lock"
    # Suppress if alerted within the last 30 minutes
    if [[ -f "$lock" ]] && (( $(date +%s) - $(stat -c %Y "$lock") < 1800 )); then
        return
    fi
    touch "$lock"
    echo -e "Host: $HOSTNAME_\nTime: $NOW\n\n$body" | mail -s "$subject" "$ALERT_EMAIL"
}

# HTTP status check
if [[ "$HTTP_CODE" != "200" ]]; then
    send_alert "[WP ALERT] $HOSTNAME_: HTTP $HTTP_CODE" \
               "Site returned HTTP $HTTP_CODE (expected 200).\nURL: $SITE_URL" \
               "http_status"
fi

# TTFB check (compare as floats using awk)
if awk -v ttfb="$TTFB" -v limit="$TTFB_LIMIT" 'BEGIN{exit !(ttfb > limit)}'; then
    send_alert "[WP ALERT] $HOSTNAME_: Slow TTFB ${TTFB}s" \
               "Time to first byte is ${TTFB}s (limit: ${TTFB_LIMIT}s).\nURL: $SITE_URL" \
               "ttfb"
fi

#!/usr/bin/env bash
# /usr/local/bin/wp-disk-check.sh
# Alerts when any mounted filesystem reaches 80% (warning) or 90% (critical) usage.

ALERT_EMAIL="admin@example.com"
WARN_PCT=80
CRIT_PCT=90
NOW=$(date --iso-8601=seconds)
HOSTNAME_=$(hostname -f)
LOCK_DIR="/var/run/wp-monitor"
mkdir -p "$LOCK_DIR"

# Check all mounted filesystems
df -h --output=pcent,target | tail -n +2 | while read -r pct mount; do
    pct="${pct%\%}"                         # strip the trailing % sign
    [[ "$pct" =~ ^[0-9]+$ ]] || continue    # skip entries with no usage figure ("-")
    if (( pct >= CRIT_PCT )); then
        level="CRITICAL"; lock_key="disk_crit_${mount//\//_}"
    elif (( pct >= WARN_PCT )); then
        level="WARNING";  lock_key="disk_warn_${mount//\//_}"
    else
        continue
    fi

    lock="$LOCK_DIR/${lock_key}.lock"
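    # Suppress repeats for just under an hour so a few seconds of cron jitter cannot skip the next hourly run
    if [[ -f "$lock" ]] && (( $(date +%s) - $(stat -c %Y "$lock") < 3300 )); then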
        continue
    fi
    touch "$lock"
    printf 'Host: %s\nTime: %s\nMount: %s\nUsage: %s%%\n' \
        "$HOSTNAME_" "$NOW" "$mount" "$pct" | \
        mail -s "[WP $level] Disk ${pct}% on $mount ($HOSTNAME_)" "$ALERT_EMAIL"
done
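
The PHP error log check described in the overview has no script of its own here. A minimal sketch of one way to write it follows. The script name wp-errorlog-check.sh, the state directory /var/lib/wp-monitor, and the function name new_lines_since_last_run are hypothetical and not part of the original setup. To stay short, the sketch leaves out the lockfile rate limit used in wp-health-check.sh.

```shell
#!/usr/bin/env bash
# /usr/local/bin/wp-errorlog-check.sh  (hypothetical name)
# Usage: wp-errorlog-check.sh /var/log/php_errors.log
# Alerts when more than MAX_NEW lines were appended since the previous run.

ALERT_EMAIL="${ALERT_EMAIL:-admin@example.com}"
MAX_NEW="${MAX_NEW:-50}"
STATE_DIR="${STATE_DIR:-/var/lib/wp-monitor}"   # assumed path; must be writable by the cron user

# Print how many lines LOG gained since the count saved in STATE,
# then save the current count for the next run.
new_lines_since_last_run() {
    local log="$1" state="$2" current=0 previous
    if [ -r "$log" ]; then current=$(( $(wc -l < "$log") )); fi
    if [ ! -f "$state" ]; then
        # First run: record a baseline instead of alerting on old entries.
        echo "$current" > "$state"
        echo 0
        return 0
    fi
    previous=$(cat "$state")
    case "$previous" in ''|*[!0-9]*) previous=0 ;; esac   # ignore a corrupt state file
    echo "$current" > "$state"
    # A smaller count means the log was rotated or truncated: count from zero.
    if [ "$current" -lt "$previous" ]; then previous=0; fi
    echo $(( current - previous ))
}

# Run the check only when a log path is given on the command line.
if [ "$#" -ge 1 ]; then
    mkdir -p "$STATE_DIR"
    new=$(new_lines_since_last_run "$1" "$STATE_DIR/php_errors.count")
    if [ "$new" -gt "$MAX_NEW" ]; then
        host=$(hostname -f)
        printf 'Host: %s\nTime: %s\nCheck: php_error_log\nNew lines: %s\nThreshold: %s\n' \
            "$host" "$(date --iso-8601=seconds)" "$new" "$MAX_NEW" |
            mail -s "[WP ALERT] $host: $new new PHP log lines" "$ALERT_EMAIL"
    fi
fi
```

Scheduled as */5 * * * * /usr/local/bin/wp-errorlog-check.sh /var/log/php_errors.log, this approximates the 5-minute window. Note that it counts lines rather than errors, so multi-line stack traces inflate the figure.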

# Crontab entries (add via: crontab -e)
# */5 * * * *  /usr/local/bin/wp-health-check.sh
# 0   * * * *  /usr/local/bin/wp-disk-check.sh
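
The replication lag check can be sketched in the same style. This is an illustration under assumptions rather than the setup from the replication post: the script name wp-replication-check.sh, the helper names parse_lag and classify_lag, and the credentials file /etc/wp-monitor/monitor.cnf are hypothetical. SHOW REPLICA STATUS and Seconds_Behind_Source are the names used from MySQL 8.0.22 onward; older MySQL servers use SHOW SLAVE STATUS and Seconds_Behind_Master instead. Rate limiting is again left out for brevity.

```shell
#!/usr/bin/env bash
# /usr/local/bin/wp-replication-check.sh  (hypothetical name)
# Usage: wp-replication-check.sh /etc/wp-monitor/monitor.cnf
# The argument is a MySQL option file holding the monitor account's credentials.

ALERT_EMAIL="${ALERT_EMAIL:-admin@example.com}"
LAG_LIMIT="${LAG_LIMIT:-30}"    # seconds

# Read "SHOW REPLICA STATUS\G" output on stdin and print the lag field's value.
parse_lag() {
    awk '$1 == "Seconds_Behind_Source:" { print $2; exit }'
}

# Map a lag value and a limit to: ok, lagging, stopped (NULL), or unknown (no usable value).
classify_lag() {
    case "$1" in
        NULL)         echo stopped ;;
        ''|*[!0-9]*)  echo unknown ;;
        *)            if [ "$1" -gt "$2" ]; then echo lagging; else echo ok; fi ;;
    esac
}

# Run the check only when an option file is given on the command line.
if [ "$#" -ge 1 ]; then
    lag=$(mysql --defaults-extra-file="$1" -e 'SHOW REPLICA STATUS\G' | parse_lag)
    state=$(classify_lag "$lag" "$LAG_LIMIT")
    if [ "$state" != "ok" ]; then
        host=$(hostname -f)
        printf 'Host: %s\nTime: %s\nCheck: replication_lag\nValue: %s\nThreshold: %ss\n' \
            "$host" "$(date --iso-8601=seconds)" "${lag:-none}" "$LAG_LIMIT" |
            mail -s "[WP ALERT] $host: replication $state" "$ALERT_EMAIL"
    fi
fi
```

A matching crontab entry would be */1 * * * * /usr/local/bin/wp-replication-check.sh /etc/wp-monitor/monitor.cnf. Treating NULL and a missing value as alert conditions is a design choice: a replica whose threads have stopped reports no lag figure at all, which is usually worse than a large one.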

NOTE: Make sure sendmail or a compatible MTA (msmtp, Postfix) is configured on the server before relying on the mail command. Many cloud VPS providers block outbound SMTP on port 25 by default, so for reliable alert delivery use msmtp with an authenticated SMTP relay (Gmail, SendGrid, Mailgun) and point /etc/mail.rc at it.