DNS Monitoring: Why Your Domain Needs Constant Watching
Your DNS is the foundation of your entire web presence. If DNS fails, nothing works — no website, no email, no API. Yet most teams don't monitor DNS at all. They find out about DNS problems when customers start complaining.
How DNS Works (Quick Refresher)
When a user types yoursite.com, their browser needs to find the server's IP address:
- Browser checks its cache → miss
- OS resolver checks its cache → miss
- Query goes to recursive resolver (ISP or 8.8.8.8)
- Resolver asks root nameserver → "Ask .com nameservers"
- Resolver asks .com nameserver → "Ask ns1.yourdns.com"
- Resolver asks your authoritative nameserver → "It's 203.0.113.42"
- IP cached, connection established
This chain has multiple points of failure, and each one can take your site offline.
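The referral chain above can be modelled with a toy in-memory delegation table. This is only a sketch, not a real resolver: the `ZONES` table, the server labels, and `resolve_iteratively` are hypothetical names invented for illustration, and only the hostname and IP address come from the steps above.

```python
# Toy model of iterative resolution: each "server" either refers the
# resolver to a more specific server or returns the final answer.
# Server labels and the ZONES table are hypothetical, for illustration only.
ZONES = {
    "root": {"com": ("referral", "com-nameserver")},
    "com-nameserver": {"yoursite.com": ("referral", "ns1.yourdns.com")},
    "ns1.yourdns.com": {"yoursite.com": ("answer", "203.0.113.42")},
}


def resolve_iteratively(hostname, zones=ZONES, max_hops=10):
    """Follow referrals from the root until a server returns an answer.

    Returns (ip, path), where path lists the servers asked, in order.
    Raises LookupError if a server in the chain is missing or has no
    matching entry, which models a failure at that step.
    """
    server, path = "root", []
    labels = hostname.split(".")
    for _ in range(max_hops):
        path.append(server)
        table = zones.get(server)
        if table is None:
            raise LookupError(f"{server} is unreachable")
        # Find the longest suffix of the hostname this server knows about.
        for i in range(len(labels)):
            suffix = ".".join(labels[i:])
            if suffix in table:
                kind, value = table[suffix]
                break
        else:
            raise LookupError(f"{server} has no data for {hostname}")
        if kind == "answer":
            return value, path
        server = value  # referral: ask the next server down the chain
    raise LookupError("too many referrals")
```

Removing any one entry from the table makes the lookup fail at that hop, which is the point of the model: every link in the chain is a separate thing that can break.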
What Can Go Wrong
1. Nameserver Outage
If your authoritative nameservers go down, no new DNS lookups can resolve your domain. Cached responses will work until their TTL expires (usually 5 minutes to 24 hours), then your site becomes unreachable.
Real-world example: The 2016 Dyn DDoS attack took down Twitter, Netflix, Reddit, and hundreds of other sites — not by attacking the sites themselves, but by attacking their DNS provider.
2. DNS Record Misconfiguration
A typo in a DNS record, an accidental deletion, or a failed migration can silently break your site:
- Wrong A record → points to the wrong server or a dead IP
- Missing CNAME → subdomain stops working
- Wrong MX record → email delivery fails
- Incorrect CAA record → SSL certificate renewal fails
3. DNS Propagation Issues
After changing DNS records, the old values are cached across the internet. Propagation can take minutes to 48 hours depending on TTL values. During this window, some users see the old IP and some see the new one.
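The length of that window follows from simple arithmetic on the old record's TTL. The sketch below assumes resolvers honor TTLs (some do not), and `latest_stale_time` and `may_be_stale` are illustrative helper names, not part of any library.

```python
from datetime import datetime, timedelta, timezone


def latest_stale_time(changed_at, old_ttl_seconds):
    """Latest moment a TTL-respecting resolver may still serve the old value.

    A resolver that cached the old record just before the change keeps it
    for the full old TTL, so the window closes at changed_at + old TTL.
    """
    return changed_at + timedelta(seconds=old_ttl_seconds)


def may_be_stale(now, changed_at, old_ttl_seconds):
    """True while some resolvers could still legitimately hold the old record."""
    return now < latest_stale_time(changed_at, old_ttl_seconds)
```

Note that the TTL that matters is the one in effect before the change, which is why lowering TTLs ahead of a migration shortens the window.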
4. Domain Expiry
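This sounds too basic to happen, but it does, even to large companies. Microsoft accidentally let passport.com expire in 1999. In 2015, google.com was briefly bought by a third party through Google Domains before the sale was cancelled, a registration slip rather than a true expiry, but a reminder that even the largest domains are not immune.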
5. DNS Hijacking
Attackers can:
- Compromise your registrar account and change nameservers
- Perform BGP hijacking to intercept DNS traffic
- Use cache poisoning to inject false records
6. DNSSEC Validation Failures
If you've enabled DNSSEC (you should), signature validation failures can make your domain unresolvable for resolvers that enforce DNSSEC.
What to Monitor
Record Accuracy
Regularly verify that your critical DNS records return the expected values:
| Record Type | What to Check |
|---|---|
| A / AAAA | Correct IP address |
| CNAME | Correct target domain |
| MX | Mail server priority and hostname |
| TXT | SPF, DKIM, DMARC records intact |
| NS | Nameservers haven't changed |
| CAA | Certificate authority restrictions |
| SOA | Serial number increases on updates |
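A record-accuracy check boils down to comparing what you expect with what a lookup returns. The sketch below shows only the comparison step. How the observed records are fetched is left out, and `diff_records` and its data shape are assumptions made for illustration.

```python
def diff_records(expected, observed):
    """Compare expected and observed DNS records.

    Both arguments map (name, record_type) to a set of record values.
    Returns a list of human-readable problems; an empty list means every
    expected record matched exactly.
    """
    problems = []
    for key, want in sorted(expected.items()):
        name, rtype = key
        got = observed.get(key, set())
        missing = want - got
        unexpected = got - want
        if missing:
            problems.append(f"{name} {rtype}: missing {sorted(missing)}")
        if unexpected:
            problems.append(f"{name} {rtype}: unexpected {sorted(unexpected)}")
    return problems
```

Comparing sets rather than single values matters for types such as MX, NS, and TXT, where a name usually carries several records at once.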
Resolution Time
DNS resolution should take under 100ms. If it consistently takes longer, your nameservers may be overloaded or geographically distant from your users.
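One rough way to measure this is to time a lookup through the operating system's resolver. The sketch below uses Python's standard `socket.getaddrinfo`; the function names and the choice of the median as the test for "consistently" slow are assumptions, and cached answers will make repeat lookups look faster than a cold one.

```python
import socket
import statistics
import time


def resolution_time_ms(hostname, resolve=socket.getaddrinfo):
    """Time one lookup, in milliseconds.

    `resolve` is injectable so the timing logic can be exercised offline.
    With the default, the lookup goes through the OS resolver and its caches.
    """
    start = time.perf_counter()
    resolve(hostname, None)
    return (time.perf_counter() - start) * 1000.0


def is_consistently_slow(samples_ms, threshold_ms=100.0):
    """True if the median of the samples exceeds the threshold."""
    return statistics.median(samples_ms) > threshold_ms
```

Using the median rather than a single sample keeps one slow outlier from triggering an alert.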
Nameserver Health
Monitor each of your nameservers independently. If you have ns1.example.com and ns2.example.com, check both — don't assume that if one works, both do.
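Checking each nameserver separately means sending the query straight to that server's IP address instead of going through a recursive resolver. The sketch below builds a minimal DNS query by hand using only the standard library. The function names are illustrative, it handles UDP only, and it inspects just the response header, so treat it as a starting point and not as a complete health check.

```python
import random
import socket
import struct


def build_query(name, qtype=1, query_id=0):
    """Build a minimal DNS query packet (qtype 1 = A, 2 = NS, 6 = SOA)."""
    # Header: ID, flags (0 = standard query, recursion not requested),
    # 1 question, 0 answer/authority/additional records.
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class 1 = IN


def parse_header(response):
    """Return (query_id, rcode, answer_count) from a DNS response header."""
    query_id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return query_id, flags & 0x000F, ancount


def check_nameserver(server_ip, name, timeout=2.0):
    """Ask one nameserver directly; True if it replies NOERROR with answers."""
    query_id = random.randrange(65536)
    query = build_query(name, query_id=query_id)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server_ip, 53))
            response, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable hosts
            return False
    if len(response) < 12:
        return False
    got_id, rcode, ancount = parse_header(response)
    return got_id == query_id and rcode == 0 and ancount > 0
```

Calling `check_nameserver` once per nameserver IP, and alerting on each result separately, avoids the trap of one healthy server masking a dead one.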
DNSSEC Chain
If you use DNSSEC, validate the entire chain from root to your zone. A broken DNSSEC chain is worse than no DNSSEC at all, because validating resolvers reject the responses outright instead of returning an answer.
Domain Registration
Monitor your domain's expiry date and WHOIS status. Set up alerts at 90, 30, and 7 days before expiry.
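The alert schedule reduces to a date comparison once you have the expiry date. The sketch below assumes the date has already been obtained (for example from your registrar or a WHOIS lookup, which is not shown), and `expiry_alert` is an illustrative name.

```python
from datetime import date

ALERT_DAYS = (90, 30, 7)  # thresholds from the text above


def expiry_alert(expiry, today, thresholds=ALERT_DAYS):
    """Return the tightest threshold (in days) that has been crossed.

    Returns None if the expiry is further away than every threshold,
    and 0 if the domain has already expired.
    """
    days_left = (expiry - today).days
    if days_left < 0:
        return 0
    crossed = [t for t in thresholds if days_left <= t]
    return min(crossed) if crossed else None
```

For example, a domain with 20 days left has crossed the 90-day and 30-day thresholds, so the function reports 30.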
DNS Monitoring Best Practices
Check from Multiple Locations
DNS infrastructure is distributed. A record that resolves correctly from Frankfurt might fail from Tokyo due to:
- Different recursive resolvers
- Geo-based DNS routing misconfiguration
- Regional ISP caching issues
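Once results from several vantage points are collected, a simple consensus check can surface the odd one out. This sketch covers only the comparison. The location labels and `find_outliers` are hypothetical, and because geo-routed records legitimately differ by region, an outlier is a prompt to investigate, not proof of a fault.

```python
from collections import Counter


def find_outliers(answers_by_location):
    """Return locations whose answer set differs from the most common one.

    `answers_by_location` maps a location label to the set of values
    resolved from that location.
    """
    if not answers_by_location:
        return []
    counts = Counter(frozenset(a) for a in answers_by_location.values())
    consensus, _ = counts.most_common(1)[0]
    return sorted(
        loc
        for loc, answers in answers_by_location.items()
        if frozenset(answers) != consensus
    )
```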
Use Low Check Intervals
DNS changes can take effect within seconds. Check every 1-5 minutes to catch issues quickly.
Monitor Both Authoritative and Recursive
- Authoritative checks verify your nameservers directly
- Recursive checks verify what actual users experience
Alert on Any Change
DNS records shouldn't change unexpectedly. Any unauthorized change could indicate a security incident.
Keep TTLs Reasonable
- Too low (< 60s): More DNS traffic, higher load on nameservers
- Too high (> 3600s): Slow failover, long propagation on changes
- Recommended: 300s (5 minutes) for most records, lower during migrations
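These thresholds are easy to turn into an audit over your zone's records. The sketch below hard-codes the limits from the list above, and `classify_ttl` and `audit_ttls` are illustrative names. Adjust the limits if your traffic or failover needs differ.

```python
def classify_ttl(ttl_seconds, low=60, high=3600):
    """Classify a TTL against the thresholds suggested above."""
    if ttl_seconds < low:
        return "too low"
    if ttl_seconds > high:
        return "too high"
    return "ok"


def audit_ttls(ttl_by_record):
    """Return only the records whose TTL falls outside the suggested range.

    `ttl_by_record` maps a record label (any hashable) to its TTL in seconds.
    """
    return {
        record: classify_ttl(ttl)
        for record, ttl in ttl_by_record.items()
        if classify_ttl(ttl) != "ok"
    }
```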
Incident Response for DNS Issues
Symptoms
- Site unreachable, but server is running fine
- "DNS_PROBE_FINISHED_NXDOMAIN" errors in browser
- Email delivery failures (MX record issues)
- SSL certificate renewal failures (CAA record issues)
Immediate Actions
- Check DNS records with dig or nslookup from multiple locations
- Verify nameserver health
- Check registrar account for unauthorized changes
- If hijacked: contact registrar, enable registrar lock, change passwords
Prevention
- Enable registrar lock to prevent unauthorized transfers
- Use two-factor authentication on your registrar account
- Deploy DNSSEC so validating resolvers can reject forged records from cache poisoning
- Use multiple DNS providers for redundancy
- Monitor DNS records continuously
Conclusion
DNS is the invisible infrastructure that everything depends on. When it works, nobody thinks about it. When it fails, nothing works. Set up DNS monitoring now — before you learn about a DNS problem from an angry customer email that may not even reach you because your MX records are broken too.