Why Is My Website Down?

Your website isn't working and you need to fix it now. This guide covers the 10 most common reasons websites go down, from server crashes and DNS failures to sneakier problems like broken deploys and "up-but-broken" failures. Each section explains how to diagnose, fix, and prevent the issue.

  • 10 most common causes of website downtime
  • Diagnosis steps for each cause
  • How to prevent each type of failure

Infrastructure causes

Server and infrastructure failures

1. Server crash or overload

Symptoms: 500, 502, or 503 errors. Connection timeouts. Site completely unreachable.

Common causes: Traffic spike beyond server capacity, out-of-memory errors, crashed application process, database connection pool exhaustion, disk space full.

How to fix: Check your hosting provider's status page first. Then SSH into your server and check process status (systemctl status <service>), memory (free -m), disk (df -h), and application logs. Restart the application process if it crashed. For traffic spikes, scale up temporarily.

Prevention: Set up server monitoring with alerts for CPU, memory, and disk usage. Configure auto-scaling if your hosting supports it.
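
As a rough illustration of the alerting idea above, the sketch below reads disk and memory usage and reports which metrics have crossed a limit. The 90% limits and the function names are assumptions made for this example, and the memory helper only applies to Linux hosts that expose /proc/meminfo.

```python
import shutil

# Hypothetical alert limits (percent used); tune these for your own servers.
LIMITS = {"disk": 90.0, "memory": 90.0}


def disk_percent_used(path="/"):
    """Return the percentage of space used on the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def memory_percent_used(meminfo_text):
    """Compute memory usage from the text of Linux's /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        if rest.strip():
            fields[name.strip()] = int(rest.split()[0])  # values are in kB
    total = fields["MemTotal"]
    available = fields["MemAvailable"]
    return 100.0 * (total - available) / total


def breached(metrics, limits=LIMITS):
    """Return the sorted names of metrics that are at or above their limit."""
    return sorted(name for name, value in metrics.items()
                  if value >= limits.get(name, 100.0))
```

On a Linux server you might pass the contents of /proc/meminfo to memory_percent_used, combine the result with disk_percent_used("/"), and send an alert whenever breached(...) returns a non-empty list.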

2. DNS failure

Symptoms: ERR_NAME_NOT_RESOLVED. Site unreachable by domain name but works with IP address. "Server not found" browser error.

Common causes: Domain expired, DNS records deleted or misconfigured, DNS provider outage, recent nameserver change that hasn't propagated.

How to fix: Run dig yoursite.com to check if DNS resolves. If it doesn't, log into your domain registrar and check: is the domain active? Are nameservers correct? Are A/AAAA records pointing to the right IP?

Prevention: Enable auto-renewal for your domain. Use a reliable DNS provider. Monitor DNS resolution as part of your monitoring stack.
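
A DNS check can also be scripted so it runs on a schedule. This is a minimal sketch using Python's standard resolver call. The function names are made up for this example, and because it goes through the operating system's resolver (including any local hosts file or cache), results can differ from what dig reports against a specific nameserver.

```python
import socket


def resolve(hostname):
    """Return the sorted, de-duplicated IP addresses for `hostname`, or [] if it does not resolve."""
    try:
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def points_to(hostname, expected_ip):
    """Return True if `hostname` resolves and one of its addresses is `expected_ip`."""
    return expected_ip in resolve(hostname)
```

An empty list from resolve("yoursite.com") corresponds to the ERR_NAME_NOT_RESOLVED case, while points_to returning False suggests the A/AAAA records point somewhere unexpected.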

3. SSL/TLS certificate expired

Symptoms: Browser shows "Your connection is not private" or ERR_CERT_DATE_INVALID. Site loads on HTTP but not HTTPS.

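Common causes: Certificate expired and auto-renewal failed. Let's Encrypt renewal cron job broken. Certificate issued for the wrong domain (this produces a name-mismatch error such as ERR_CERT_COMMON_NAME_INVALID rather than a date error).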

How to fix: Check certificate expiry: click the padlock in your browser or run echo | openssl s_client -servername yoursite.com -connect yoursite.com:443 2>/dev/null | openssl x509 -noout -dates. Renew the certificate through your provider or, if you use Certbot, run certbot renew.

Prevention: Use auto-renewing certificates (Let's Encrypt). Monitor certificate expiry dates with at least 14 days' warning.
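
The 14-day warning can be sketched with Python's standard ssl module. The function names and defaults below are assumptions for this example. Note that fetch_not_after verifies the certificate, so a certificate that has already expired raises ssl.SSLCertVerificationError instead of returning a date.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_not_after(host, port=443, timeout=10):
    """Return the certificate's notAfter string, e.g. 'Jan  1 00:00:00 2030 GMT'."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after, now=None):
    """Whole days between `now` (default: current UTC time) and the expiry time."""
    now = now or datetime.now(timezone.utc)
    expires = ssl.cert_time_to_seconds(not_after)
    return int((expires - now.timestamp()) // 86400)


def needs_renewal(not_after, warning_days=14, now=None):
    """Return True when fewer than `warning_days` days remain."""
    return days_until_expiry(not_after, now) < warning_days
```

A scheduled job could call needs_renewal(fetch_not_after("yoursite.com")) once a day and alert when it returns True or when the fetch raises a verification error.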

Hosting and delivery

Hosting, CDN, and network issues

4. Hosting provider outage

Symptoms: All sites on the same host are down. Provider status page shows incidents. No SSH access.

Common causes: Data center outage, network failure, provider maintenance gone wrong, hardware failure.

How to fix: Check your provider's status page. There's often nothing you can do but wait. For critical sites, have a failover plan — a secondary hosting provider or a static fallback page served from a CDN.

Prevention: Choose hosting with strong SLA guarantees. Consider multi-region or multi-provider deployment for critical sites.

5. CDN failure or misconfiguration

Symptoms: Site works in some regions but not others. Assets load from origin but not from CDN edge. Stale content served. 403 errors on static files.

Common causes: CDN provider outage, misconfigured cache rules, origin shield failure, WAF blocking legitimate requests, certificate mismatch between CDN and origin.

How to fix: Test from multiple regions. Check CDN provider status. Purge the CDN cache. Verify cache rules and origin configuration. Check WAF logs for blocked requests.

Prevention: Monitor from multiple regions. Set up CDN cache validation. Have a "bypass CDN" fallback plan.

6. DDoS attack

Symptoms: Extreme traffic spike in analytics. Server responding very slowly or not at all. High bandwidth usage. Traffic from unusual geographic locations.

Common causes: Targeted DDoS attack, accidental traffic flood (viral content + no caching), aggressive bot crawling.

How to fix: Enable DDoS protection through your CDN or cloud provider (for example Cloudflare or AWS Shield). Enable rate limiting. If you are under active attack, turn on Cloudflare's "Under Attack" mode or your provider's equivalent. Contact your hosting provider.

Prevention: Use a CDN with DDoS protection. Configure rate limiting. Set up traffic monitoring with anomaly alerts.
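
Rate limiting is usually configured at the CDN or web server rather than written by hand, but the underlying idea is simple. The sketch below shows one common approach, a token bucket. The class name and parameters are illustrative and do not correspond to any particular product's settings.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts of up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def allow(self):
        """Consume one token and return True, or return False if none are left."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        # Refill in proportion to the time elapsed, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In practice you would keep one bucket per client, for example in a dictionary keyed by IP address, and reject or delay a request whenever allow() returns False.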

The most common cause

Deploy regressions and code issues

7. Broken deploy

Symptoms: Site breaks immediately after or within hours of a deploy. Server returns 200 OK but the page doesn't work. JavaScript errors in browser console. Unstyled page.

Common causes: Build failed silently, asset filenames changed but CDN cache wasn't purged, environment variables missing in production, database migration failed.

How to fix: Roll back to the previous deploy immediately. Then investigate: check build logs, compare deployed assets with expected assets, verify environment variables, check database state.

Prevention: This is the single most impactful prevention step you can take: set up post-deploy verification that checks your site after every deploy. Deploy hooks that trigger website checks catch these failures in minutes instead of hours.
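
A post-deploy check does not need to be elaborate. The following sketch, which a deploy hook could run, requests a few pages and confirms that each returns 200 and contains a piece of text you expect. The function names and the (url, expected_text) check format are assumptions for this example.

```python
import urllib.error
import urllib.request


def fetch(url, timeout=10):
    """Return (status, body_text) for `url`; status is 0 if the request fails outright."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        return err.code, ""
    except OSError:  # covers URLError, timeouts, and connection failures
        return 0, ""


def verify_deploy(checks, fetch=fetch):
    """Run each (url, expected_text) check and return a list of failure messages."""
    failures = []
    for url, expected_text in checks:
        status, body = fetch(url)
        if status != 200:
            failures.append(f"{url}: expected status 200, got {status}")
        elif expected_text not in body:
            failures.append(f"{url}: missing expected text {expected_text!r}")
    return failures
```

If verify_deploy returns any failures, the hook can exit with a non-zero status so the pipeline flags the deploy or triggers a rollback.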

8. "Up but broken" — asset-level failures

Symptoms: Server responds 200 OK. Uptime monitor says "all good." But users report broken pages, non-functional buttons, missing styles, or empty screens.

Common causes: JS bundle returns 404 (filename hash changed), CSS served with wrong MIME type (browser blocks it), third-party script CDN is down, CDN serving stale/corrupted asset version.

How to fix: Open browser dev tools → Network tab. Look for failed asset requests (red entries). Check MIME types in response headers. Compare asset URLs in HTML with what's actually deployed.

Prevention: This is the failure type that website monitoring is designed to catch. Traditional uptime tools can't detect these by design — the server is responding. You need monitoring that validates every asset on the page.
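
To make asset-level validation concrete, here is a minimal sketch that extracts the external scripts and stylesheets referenced by a page and judges each asset response by its status code and Content-Type. The class and function names, and the short list of accepted MIME types, are assumptions for this example, not a description of how any monitoring product works.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Accepted MIME types per asset kind (a deliberately small, assumed list).
EXPECTED_TYPES = {
    "script": ("text/javascript", "application/javascript"),
    "stylesheet": ("text/css",),
}


class AssetCollector(HTMLParser):
    """Collect (kind, absolute_url) for every external script and stylesheet."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.assets = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "script" and attrs.get("src"):
            self.assets.append(("script", urljoin(self.base_url, attrs["src"])))
        elif tag == "link" and attrs.get("href") and "stylesheet" in rel:
            self.assets.append(("stylesheet", urljoin(self.base_url, attrs["href"])))


def find_assets(html, base_url):
    """Return the assets referenced by `html`, in document order."""
    collector = AssetCollector(base_url)
    collector.feed(html)
    return collector.assets


def asset_problem(kind, status, content_type):
    """Describe what is wrong with an asset response, or return None if it looks fine."""
    if status != 200:
        return f"HTTP {status}"
    mime = (content_type or "").split(";")[0].strip().lower()
    if mime not in EXPECTED_TYPES[kind]:
        return f"unexpected Content-Type {mime!r}"
    return None
```

A checker would request each URL from find_assets and pass the response status and Content-Type header to asset_problem. A JS bundle that returns 404, or a stylesheet served as text/html, then shows up as a failure even though the page itself returned 200 OK.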

Configuration problems

Configuration and application issues

9. Database connection failure

Symptoms: 500 errors on dynamic pages. Static pages work fine. "Connection refused" or timeout errors in application logs. Intermittent failures.

Common causes: Database server down, connection pool exhausted, credentials changed, firewall rule change blocking database port, database disk full.

How to fix: Check database server status. Test connection directly: mysql -h hostname -u user -p or psql -h hostname -U user. Check connection pool settings. Verify credentials. Check disk space on database server.

Prevention: Monitor database health separately. Set up connection pool monitoring. Use read replicas for resilience. Automate database backups.
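
Before debugging credentials or pool settings, it can help to confirm that the database port is reachable at all from the application server. This sketch only tests TCP reachability and does not authenticate or run a query. The function name is made up for this example.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or name did not resolve
        return False
```

MySQL listens on port 3306 and PostgreSQL on 5432 by default. If port_open returns False from the application server but True from the database host itself, a firewall rule or network change is a likely suspect.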

10. Redirect loop or misconfiguration

Symptoms: ERR_TOO_MANY_REDIRECTS. Page keeps loading and redirecting. Specific pages broken while homepage works fine.

Common causes: HTTP→HTTPS redirect conflicting with CDN settings, www→non-www redirect loop, force-HTTPS at server level + CDN level creating a loop, trailing slash redirect conflicts.

How to fix: Use curl -I -L yoursite.com to follow the redirect chain and find the loop. Check redirects at each layer: DNS/CDN (Cloudflare page rules), web server (nginx/Apache config, .htaccess), application (middleware, route rules).

Prevention: Configure redirects at only one layer. Test redirect behavior after every configuration change. Monitor specific pages, not just the homepage.
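
The redirect chain can also be traced in a script, which makes it easier to test after each configuration change. In this sketch, next_hop issues a single HEAD request and trace_redirects stops as soon as a URL repeats. Both names and the 20-hop cap are assumptions for this example.

```python
import http.client
from urllib.parse import urljoin, urlsplit


def next_hop(url, timeout=10):
    """Send one HEAD request; return the absolute redirect target, or None if not a redirect."""
    parts = urlsplit(url)
    conn_class = (http.client.HTTPSConnection if parts.scheme == "https"
                  else http.client.HTTPConnection)
    conn = conn_class(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        location = response.getheader("Location")
        if response.status in (301, 302, 303, 307, 308) and location:
            return urljoin(url, location)  # Location may be relative
        return None
    finally:
        conn.close()


def trace_redirects(url, next_hop=next_hop, max_hops=20):
    """Follow redirects from `url` and return (chain, looped)."""
    chain = [url]
    while len(chain) <= max_hops:
        target = next_hop(chain[-1])
        if target is None:
            return chain, False
        if target in chain:
            return chain + [target], True  # the repeated URL closes the loop
        chain.append(target)
    return chain, False  # gave up after max_hops without seeing a repeat
```

The returned chain shows which hop sends the request back to a URL already visited, which usually points at the layer that needs fixing.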

Never be surprised again

How to prevent downtime

Website monitoring

Go beyond uptime pings. Monitor every asset on your pages — JS, CSS, images, third-party scripts. Catch "up but broken" failures that uptime tools miss.

Post-deploy checks

Many outages happen right after a deploy. Set up deploy hooks that trigger website checks after every release.

Multi-region monitoring

Your site can be down in Europe while working fine in the US. Monitor from multiple regions to catch CDN and routing issues.

Smart alerting

Get notified on the channels you actually check — Slack, Discord, email. Alerts should include the root cause, not just "your site is down."
