DNS Monitoring: Why Your Domain Needs Constant Watching

Your DNS is the foundation of your entire web presence. If DNS fails, nothing works: no website, no email, no API. Yet many teams don't monitor DNS at all, and they find out about DNS problems only when customers start complaining.

How DNS Works (Quick Refresher)

When a user types yoursite.com, their browser needs to find the server's IP address:

  1. Browser checks its cache → miss
  2. OS resolver checks its cache → miss
  3. Query goes to recursive resolver (ISP or 8.8.8.8)
  4. Resolver asks root nameserver → "Ask .com nameservers"
  5. Resolver asks .com nameserver → "Ask ns1.yourdns.com"
  6. Resolver asks your authoritative nameserver → "It's 203.0.113.42"
  7. IP cached, connection established

This chain has multiple points of failure, and each one can take your site offline.
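To make steps 3 to 6 concrete, here is a minimal sketch of the query a resolver sends, built with only the Python standard library. The function names (encode_name, build_query, send_query) are illustrative, not part of any existing tool, and the sketch covers only a single-question query over UDP with no retries or response parsing.

```python
import secrets
import socket
import struct

QTYPE_A = 1    # record type code for an A (IPv4 address) record
QCLASS_IN = 1  # the Internet class


def encode_name(name: str) -> bytes:
    """Encode 'yoursite.com' as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, qtype: int = QTYPE_A, recursion_desired: bool = True) -> bytes:
    """Build a DNS query packet containing one question."""
    query_id = secrets.randbits(16)                  # random 16-bit transaction ID
    flags = 0x0100 if recursion_desired else 0x0000  # RD (recursion desired) bit
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, QCLASS_IN)
    return header + question


def send_query(server_ip: str, name: str, timeout: float = 3.0) -> bytes:
    """Send the query to port 53 over UDP and return the raw response bytes."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server_ip, 53))
        response, _ = sock.recvfrom(4096)
        return response
```

Calling send_query("8.8.8.8", "yoursite.com") would exercise the recursive path; pointing it at one of your authoritative nameservers instead would exercise step 6 on its own.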

What Can Go Wrong

1. Nameserver Outage

If your authoritative nameservers go down, no new DNS lookups can resolve your domain. Cached responses will work until their TTL expires (usually 5 minutes to 24 hours), then your site becomes unreachable.

Real-world example: The 2016 Dyn DDoS attack took down Twitter, Netflix, Reddit, and hundreds of other sites — not by attacking the sites themselves, but by attacking their DNS provider.

2. DNS Record Misconfiguration

A typo in a DNS record, an accidental deletion, or a failed migration can silently break your site:

  • Wrong A record → points to the wrong server or a dead IP
  • Missing CNAME → subdomain stops working
  • Wrong MX record → email delivery fails
  • Incorrect CAA record → SSL certificate renewal fails

3. DNS Propagation Issues

After changing DNS records, the old values are cached across the internet. Propagation can take minutes to 48 hours depending on TTL values. During this window, some users see the old IP and some see the new one.
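The window in which users may still see the old value can be estimated from the TTL that was in effect before the change. The sketch below shows that arithmetic; the function names are illustrative, and it assumes caches honor the TTL (some resolvers hold records longer, which is where the longer tail comes from).

```python
from datetime import datetime, timedelta, timezone


def propagation_deadline(changed_at: datetime, old_ttl_seconds: int) -> datetime:
    """Latest moment a TTL-respecting cache can still serve the old record."""
    return changed_at + timedelta(seconds=old_ttl_seconds)


def may_see_old_value(now: datetime, changed_at: datetime, old_ttl_seconds: int) -> bool:
    """True while some caches may legitimately still hold the old value."""
    return now < propagation_deadline(changed_at, old_ttl_seconds)


# Example: a record with a one-hour TTL changed at noon UTC.
changed = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
deadline = propagation_deadline(changed, 3600)
```

This is also why lowering the TTL well before a planned change shortens the mixed old/new window.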

4. Domain Expiry

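This sounds too basic to happen, but it does, even to large companies. Microsoft accidentally let passport.com expire in 1999. In 2015, google.com was briefly sold to a member of the public through Google Domains before the sale was cancelled, a reminder that registration status deserves watching even when the domain has not technically expired.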

5. DNS Hijacking

Attackers can:

  • Compromise your registrar account and change nameservers
  • Perform BGP hijacking to intercept DNS traffic
  • Use cache poisoning to inject false records

6. DNSSEC Validation Failures

If you've enabled DNSSEC (you should), signature validation failures can make your domain unresolvable for resolvers that enforce DNSSEC.
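A validating resolver typically reports a failed signature check as SERVFAIL. One common way to tell a DNSSEC failure from an ordinary outage is to compare a normal response with one requested with checking disabled (the CD bit): if only the normal one fails, validation is the likely cause. The sketch below decodes the relevant bits from the 12-byte header of a raw DNS response; the function names are illustrative, and how you obtain the two responses is left open.

```python
import struct

# A few common response codes; others are reported by number.
RCODE_NAMES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def parse_header(response: bytes) -> dict:
    """Extract the response code, AD flag, and answer count from a DNS response."""
    if len(response) < 12:
        raise ValueError("DNS response is shorter than the 12-byte header")
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    rcode = flags & 0x000F  # low four bits of the flags field
    return {
        "rcode": RCODE_NAMES.get(rcode, str(rcode)),
        "authenticated": bool(flags & 0x0020),  # AD bit: resolver validated the data
        "answers": ancount,
    }


def looks_like_dnssec_failure(normal: dict, checking_disabled: dict) -> bool:
    """SERVFAIL normally, but a clean answer with validation switched off."""
    return normal["rcode"] == "SERVFAIL" and checking_disabled["rcode"] == "NOERROR"
```

This heuristic only narrows the cause; confirming which signature or DS record is broken still needs a proper DNSSEC debugging tool.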

What to Monitor

Record Accuracy

Regularly verify that your critical DNS records return the expected values:

  Record Type   What to Check
  -----------   -----------------------------------
  A / AAAA      Correct IP address
  CNAME         Correct target domain
  MX            Mail server priority and hostname
  TXT           SPF, DKIM, DMARC records intact
  NS            Nameservers haven't changed
  CAA           Certificate authority restrictions
  SOA           Serial number increases on updates
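The comparison itself is simple once the records have been fetched. The sketch below assumes you already have the observed values from whatever lookup tool you use, and represents both sides as a dictionary keyed by (name, record type); that data shape and the function name diff_records are assumptions made for illustration.

```python
def diff_records(expected: dict, observed: dict) -> list:
    """Return human-readable problems for records that are missing or wrong.

    Both arguments map (name, record_type) to a set of record values.
    """
    problems = []
    for (name, rtype), want in expected.items():
        got = observed.get((name, rtype))
        if got is None:
            problems.append(f"{name} {rtype}: missing")
        elif set(got) != set(want):
            problems.append(
                f"{name} {rtype}: expected {sorted(want)}, got {sorted(got)}"
            )
    return problems


expected = {
    ("yoursite.com", "A"): {"203.0.113.42"},
    ("yoursite.com", "MX"): {"10 mail.yoursite.com."},
}
observed = {
    ("yoursite.com", "A"): {"198.51.100.7"},  # points at the wrong server
}
problems = diff_records(expected, observed)
```

Comparing sets rather than lists keeps the check from firing when a nameserver merely returns the same records in a different order.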

Resolution Time

DNS resolution should typically take under 100 ms. If it consistently takes longer, your nameservers may be overloaded or geographically distant from your users.
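A rough way to track this is to time lookups and judge the median of several samples, so a single slow query does not trigger an alert. The sketch below uses socket.gethostbyname by default, which goes through the operating system's resolver and its cache, so it reflects what a client on that machine experiences rather than raw nameserver latency. The function names and the use of the median are illustrative choices.

```python
import socket
import statistics
import time


def time_lookup(hostname: str, resolve=socket.gethostbyname):
    """Resolve a hostname and return (address or None, elapsed milliseconds)."""
    start = time.perf_counter()
    try:
        address = resolve(hostname)
    except OSError:  # socket.gaierror and timeouts are OSError subclasses
        address = None
    elapsed_ms = (time.perf_counter() - start) * 1000
    return address, elapsed_ms


def is_consistently_slow(samples_ms, threshold_ms: float = 100.0) -> bool:
    """Flag slowness only when the median sample exceeds the threshold."""
    return statistics.median(samples_ms) > threshold_ms
```

The resolve parameter exists so a different lookup function can be swapped in, for example one that queries a specific nameserver directly.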

Nameserver Health

Monitor each of your nameservers independently. If you have ns1.example.com and ns2.example.com, check both — don't assume that if one works, both do.
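The point is that every nameserver gets its own verdict. A minimal sketch, assuming a hypothetical probe callable that queries one nameserver and raises OSError (for example a timeout) when it does not answer:

```python
from typing import Callable, Dict, Iterable


def check_nameservers(
    nameservers: Iterable[str], probe: Callable[[str], object]
) -> Dict[str, str]:
    """Probe every nameserver and record a separate status for each one."""
    results = {}
    for ns in nameservers:
        try:
            probe(ns)
        except OSError as exc:  # covers timeouts and connection errors
            results[ns] = f"failed: {exc}"
        else:
            results[ns] = "ok"
    return results
```

Because a failure on one server does not stop the loop, a dead ns2 is still reported when ns1 answers normally.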

DNSSEC Chain

If you use DNSSEC, validate the entire chain from root to your zone. A broken DNSSEC chain is worse than no DNSSEC at all, because validating resolvers will reject your records outright instead of returning them.

Domain Registration

Monitor your domain's expiry date and WHOIS status. Set up alerts at 90, 30, and 7 days before expiry.
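The alert thresholds translate into a small date calculation. The sketch below assumes you already know the expiry date (for example from your registrar's dashboard); the function name and message wording are illustrative.

```python
from datetime import date
from typing import Optional

ALERT_THRESHOLDS_DAYS = (90, 30, 7)


def expiry_alert(expires_on: date, today: date) -> Optional[str]:
    """Return an alert message once the domain is within a threshold, else None."""
    days_left = (expires_on - today).days
    if days_left < 0:
        return "expired"
    crossed = [t for t in ALERT_THRESHOLDS_DAYS if days_left <= t]
    if not crossed:
        return None
    # Report the tightest threshold that has been crossed.
    return f"expires in {days_left} days (within {min(crossed)}-day threshold)"
```

Passing today explicitly, rather than reading the clock inside the function, makes the check easy to test and to run against a list of domains.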

DNS Monitoring Best Practices

Check from Multiple Locations

DNS infrastructure is distributed. A record that resolves correctly from Frankfurt might fail from Tokyo due to:

  • Different recursive resolvers
  • Geo-based DNS routing misconfiguration
  • Regional ISP caching issues
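Once answers have been collected from several vantage points, spotting the odd one out is a grouping problem. The sketch below assumes a dictionary mapping a location name to the list of addresses it received, or None when the lookup failed; the location names and function names are illustrative. If you deliberately use geo-based routing, differing answers are expected and this check would need per-region expectations instead.

```python
from collections import defaultdict


def group_by_answer(answers_by_location: dict) -> dict:
    """Group locations by the set of addresses they received (None = failure)."""
    groups = defaultdict(list)
    for location, answer in answers_by_location.items():
        key = frozenset(answer) if answer is not None else None
        groups[key].append(location)
    return dict(groups)


def find_outliers(answers_by_location: dict) -> list:
    """Return locations whose answer differs from the most common one."""
    groups = group_by_answer(answers_by_location)
    if len(groups) <= 1:
        return []
    majority = max(groups.values(), key=len)
    return sorted(
        loc for locs in groups.values() if locs is not majority for loc in locs
    )


answers = {
    "frankfurt": ["203.0.113.42"],
    "virginia": ["203.0.113.42"],
    "tokyo": None,  # lookup failed from this location
}
outliers = find_outliers(answers)
```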

Use Low Check Intervals

DNS changes can take effect within seconds. Check every 1-5 minutes to catch issues quickly.

Monitor Both Authoritative and Recursive

  • Authoritative checks verify your nameservers directly
  • Recursive checks verify what actual users experience

Alert on Any Change

DNS records shouldn't change unexpectedly. Any change you did not make yourself could indicate a security incident, so alert on every change and confirm it against your planned updates.
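One simple way to detect any change is to store a fingerprint of the last known record set and compare it on every check. The sketch below assumes records are kept as a dictionary mapping a string such as "yoursite.com NS" to a list of values; that shape and the function names are illustrative.

```python
import hashlib
import json


def fingerprint(records: dict) -> str:
    """Hash a record set so that ordering differences do not affect the result."""
    canonical = json.dumps(
        {key: sorted(values) for key, values in records.items()}, sort_keys=True
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint: str, records: dict) -> bool:
    """True when the current records differ from the stored fingerprint."""
    return fingerprint(records) != previous_fingerprint
```

A fingerprint tells you that something changed but not what; keeping the previous record set alongside it lets the alert say which record moved.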

Keep TTLs Reasonable

  • Too low (< 60s): More DNS traffic, higher load on nameservers
  • Too high (> 3600s): Slow failover, long propagation on changes
  • Recommended: 300s (5 minutes) for most records, lower during migrations
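These bounds are easy to turn into an automated audit. A minimal sketch using the thresholds from the list above; the function names and the dictionary of TTLs are illustrative.

```python
TTL_TOO_LOW = 60      # seconds; below this, nameserver load goes up
TTL_TOO_HIGH = 3600   # seconds; above this, failover and changes are slow


def classify_ttl(ttl_seconds: int) -> str:
    """Classify a TTL as 'too low', 'ok', or 'too high'."""
    if ttl_seconds < TTL_TOO_LOW:
        return "too low"
    if ttl_seconds > TTL_TOO_HIGH:
        return "too high"
    return "ok"


def audit_ttls(ttls_by_record: dict) -> dict:
    """Return only the records whose TTL falls outside the recommended range."""
    return {
        record: classify_ttl(ttl)
        for record, ttl in ttls_by_record.items()
        if classify_ttl(ttl) != "ok"
    }
```

During a migration you would expect intentionally low TTLs, so the audit is most useful as a reminder to raise them again afterwards.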

Incident Response for DNS Issues

Symptoms

  • Site unreachable, but server is running fine
  • "DNS_PROBE_FINISHED_NXDOMAIN" errors in browser
  • Email delivery failures (MX record issues)
  • SSL certificate renewal failures (CAA record issues)

Immediate Actions

  1. Check DNS records with dig or nslookup from multiple locations
  2. Verify nameserver health
  3. Check registrar account for unauthorized changes
  4. If hijacked: contact registrar, enable registrar lock, change passwords

Prevention

  • Enable registrar lock to prevent unauthorized transfers
  • Use two-factor authentication on your registrar account
  • Deploy DNSSEC to protect against cache poisoning
  • Use multiple DNS providers for redundancy
  • Monitor DNS records continuously

Conclusion

DNS is the invisible infrastructure that everything depends on. When it works, nobody thinks about it. When it fails, nothing works. Set up DNS monitoring now — before you learn about a DNS problem from an angry customer email that may not even reach you because your MX records are broken too.