IPv6 and Hidden Downtime: How My Friend's Phone Exposed a Problem I Didn't Know I Had
A few months ago my buddy Marcus texted me: "Dude, your server status page is dead." I was sitting at my desk, the page was open in a browser tab right in front of me, loading fine. I refreshed. Still fine. I told him it was working. He sent me a screenshot of a timeout error.
We went back and forth for about ten minutes. I checked from my laptop, my phone on Wi-Fi, my other laptop. All fine. He tried on his phone and his girlfriend's phone. Dead for both. Another friend in the same Discord channel said it was down for him too. He was also on mobile data.
That's when it clicked. Marcus was on T-Mobile. Our other friend was on Verizon. Both on cellular. I was on my home broadband. We were literally looking at the same URL and getting completely different results.
The problem turned out to be IPv6. My server had an AAAA record pointing to an address that wasn't actually listening anymore. I'd changed my server config a week earlier and forgot to update the IPv6 side. My home ISP was using IPv4 to reach the site, so everything was fine for me. Marcus and everyone on mobile carriers were hitting IPv6 first and getting nothing back.
I'd been running a broken site for a week and had no idea. My monitoring? All green. 100% uptime. Because my monitoring, just like my home internet, was checking over IPv4.
Why Mobile Carriers Are Mostly IPv6 Now
I didn't really understand this until I started digging into it after the Marcus incident. Here's the short version: the world ran out of IPv4 addresses. There are only about 4.3 billion of them, and they've been allocated for years. When mobile carriers needed to connect hundreds of millions of new smartphones, they couldn't get enough IPv4 addresses to give one to each device. So they went with IPv6.
The scale of this is bigger than I expected:
- T-Mobile in the US runs over 90% of its traffic on IPv6. When Marcus connected to my server from his phone, he was almost certainly going over IPv6.
- Verizon and AT&T are both above 70% IPv6 traffic. Most smartphone users on these networks default to IPv6 connections.
- Reliance Jio in India serves hundreds of millions of subscribers on an IPv6-only network. If you have mobile users in India, a large share of them are on IPv6.
- Deutsche Telekom, Sky UK, and other big European ISPs have rolled out IPv6 to residential and mobile customers at scale.
- Brazil's Claro and Vivo, plus carriers across Japan, Vietnam, Thailand, and Malaysia all have significant IPv6 deployments.
Google's own stats show that over 45% of all connections to their services globally use IPv6. In the US it's above 50%. In India during peak mobile hours it goes above 70%. These aren't predictions. These are measured right now.
What happens when an IPv6-only user visits your site
Most of these carriers run what's called an IPv6-only network with NAT64. The user's phone only has an IPv6 address. When they visit a site that's IPv4-only, the carrier's gateway translates the traffic. It works, but it adds latency and one more thing that can break.
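To make the translation step a bit more concrete, here is a minimal sketch of how an IPv4 address can be embedded in an IPv6 address for NAT64. It assumes the "well-known prefix" 64:ff9b::/96 from RFC 6052; a given carrier may use its own prefix instead, and the function name is just something I made up for illustration.

```python
import ipaddress

# RFC 6052 well-known NAT64 prefix. Assumption: real carriers may use a
# network-specific prefix instead of this one.
WELL_KNOWN_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")


def synthesize_nat64(ipv4: str,
                     prefix: ipaddress.IPv6Network = WELL_KNOWN_PREFIX
                     ) -> ipaddress.IPv6Address:
    """Embed an IPv4 address in the low 32 bits of a /96 NAT64 prefix."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(prefix.network_address) | int(v4))


if __name__ == "__main__":
    # 192.0.2.33 is a documentation address, not a real server.
    print(synthesize_nat64("192.0.2.33"))  # prints 64:ff9b::c000:221
```

The phone talks to that synthesized IPv6 address, and the carrier's gateway pulls the IPv4 address back out of the last 32 bits and forwards the traffic.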
There are three scenarios:
- Your site has both IPv4 and IPv6 and both work. The user connects natively over IPv6. Fast, direct, no translation. This is the happy path.
- Your site is IPv4-only (no AAAA record). The user connects through NAT64 translation. It's slower but it works. Not ideal, but functional.
- Your site has a broken IPv6 endpoint. The AAAA record exists but the server doesn't respond on IPv6. The user's phone tries IPv6 first and, unless the app implements a fast fallback such as Happy Eyeballs (RFC 8305), waits for a timeout (10-30 seconds) before falling back to IPv4 through NAT64. This is the worst case. Your site feels completely broken even though it eventually loads.
My situation was that third one. The AAAA record was there from my original setup, pointing to an IPv6 address that wasn't configured on my new server setup. So every mobile user was waiting for a timeout before the fallback kicked in. Most of them probably just gave up and closed the tab.
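The three scenarios boil down to a small decision table. Here is a rough sketch of it; the enum and function names are hypothetical, and the three inputs are things you would have to measure yourself (for example with dig and curl).

```python
from enum import Enum


class Path(Enum):
    NATIVE_IPV6 = "native IPv6 (happy path)"
    NAT64_ONLY = "IPv4-only, reached through NAT64"
    BROKEN_IPV6 = "broken IPv6 endpoint (timeout, then fallback)"
    UNREACHABLE = "unreachable"


def classify(has_aaaa: bool, ipv6_responds: bool, ipv4_responds: bool) -> Path:
    """Describe what a user on an IPv6-only mobile network would experience."""
    if has_aaaa and ipv6_responds:
        return Path.NATIVE_IPV6
    if has_aaaa:
        # AAAA is published but nothing answers on IPv6: the client has to
        # give up on IPv6 before it can try the IPv4 path.
        return Path.BROKEN_IPV6 if ipv4_responds else Path.UNREACHABLE
    # No AAAA record: the client goes straight to the translated IPv4 path.
    return Path.NAT64_ONLY if ipv4_responds else Path.UNREACHABLE


if __name__ == "__main__":
    # My situation: stale AAAA record, dead IPv6, healthy IPv4.
    print(classify(has_aaaa=True, ipv6_responds=False, ipv4_responds=True).value)
```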
Why I Didn't Catch It (And Why You Probably Won't Either)
Here's what really bugged me about this whole thing. I had monitoring set up. I had a service pinging my site every minute. It was showing 100% uptime. And it was technically correct. Over IPv4, my site was up 100% of the time. The monitor just never tried IPv6.
How monitoring tools usually work
When you give a monitoring tool a domain name, it resolves the DNS and connects. Most monitoring servers run on Linux in data centers, and they default to IPv4 connections. Some tools only look up the A record and never even query for AAAA. Unless you've gone out of your way to set up a separate IPv6 monitor, you're only testing the IPv4 path. That's it.
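To see why a green check says nothing about the other protocol, it helps to look at what "forcing" an address family means. This is a minimal sketch using Python's standard socket module, not how any particular monitoring product works; can_connect and check_both are names I picked for illustration.

```python
import socket


def can_connect(host: str, port: int, family: int, timeout: float = 5.0) -> bool:
    """Try a TCP connection using only the given address family."""
    try:
        # AF_INET restricts resolution to A records, AF_INET6 to AAAA records.
        infos = socket.getaddrinfo(host, port, family=family,
                                   type=socket.SOCK_STREAM)
    except socket.gaierror:
        return False  # no address of this family, or resolution failed
    for fam, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(fam, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return True
        except OSError:
            continue  # timeout, refused, unreachable: try the next address
    return False


def check_both(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Run the same TCP check once per protocol and report each separately."""
    return {
        "ipv4": can_connect(host, port, socket.AF_INET, timeout),
        "ipv6": can_connect(host, port, socket.AF_INET6, timeout),
    }
```

Calling check_both("yourdomain.com") from a machine that actually has IPv6 connectivity would return something like {"ipv4": True, "ipv6": False} for the failure I had. From a probe host without IPv6 connectivity, the IPv6 result is False no matter what, which is exactly the blind spot.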
The timeline of invisible downtime
Think about how this plays out in real life:
- You change your server config, break IPv6, don't realize it.
- Your monitoring checks the site every minute over IPv4. Every check comes back green.
- Morning comes. Mobile users in India start trying to reach your site. Timeouts.
- A few hours later, US users on T-Mobile try during their commute. Same timeouts.
- Your support team starts their day, checks the dashboard. All green. But the ticket queue is growing.
- Someone finally thinks to test from a phone on cellular data and discovers the problem.
In my case it was a week. A full week of broken IPv6 before Marcus happened to mention it. That's not hypothetical downtime cost. That's real users bouncing off a broken page.
The "works for me" trap
Even after Marcus told me it was broken, my first instinct was to test it myself. I loaded it on my laptop. Fine. My phone on Wi-Fi. Fine. My partner's phone on our home Wi-Fi. Fine. I almost told Marcus it was on his end. Every test I ran was on my home network, which is IPv4. Of course it worked for me. For more on this problem, check out our guide on checking if your site is down for everyone or just you.
All the Ways IPv6 Can Break Independently
After I fixed my issue (updated the AAAA record to point to the right address and made sure my server was actually listening on IPv6), I went down a rabbit hole reading about all the different ways IPv6 can fail while IPv4 stays healthy. It's a longer list than you'd expect.
DNS-level failures
- Stale AAAA record. This was my problem. The record existed but pointed to an old address. Users try IPv6, timeout, slow fallback.
- Missing AAAA record entirely. Not great for performance, but at least users go straight to NAT64 without the timeout.
- DNS server not reachable over IPv6. If your authoritative nameserver can't be queried over IPv6, some resolvers won't get your AAAA records at all.
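A stale AAAA record is easy to spot once you compare what DNS publishes with what the server actually has configured. Here is a small sketch of that comparison. It assumes you collect the two lists yourself, for example from dig AAAA yourdomain.com +short and from ip -6 addr show on the server; the function name and the 2001:db8:: documentation addresses are placeholders.

```python
import ipaddress
from typing import Iterable, List


def _parse(addr: str) -> ipaddress.IPv6Address:
    # Accept "2001:db8::20/64" as printed by `ip -6 addr show` by dropping
    # the prefix length before parsing.
    return ipaddress.IPv6Address(addr.strip().split("/")[0])


def stale_aaaa(published: Iterable[str], configured: Iterable[str]) -> List[str]:
    """Return published AAAA addresses that are not configured on the server."""
    live = {_parse(a) for a in configured}
    stale = {_parse(a) for a in published} - live
    return [str(a) for a in sorted(stale)]


if __name__ == "__main__":
    # DNS still points at ::10, but the new server only has ::20.
    print(stale_aaaa(["2001:db8::10"], ["2001:db8::20/64"]))
```

Parsing with ipaddress instead of comparing strings matters here, because the same IPv6 address can be written in several equivalent ways.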
Firewall problems
- ip6tables not configured. You set up your iptables rules perfectly but forgot that IPv6 has its own firewall. If the IPv6 default policy is DROP, or your firewall tooling only generated allow rules for IPv4, all IPv6 packets get dropped.
- Cloud security groups missing IPv6 rules. On AWS, GCP, and Azure, security groups need explicit IPv6 CIDR entries (`::/0`). If you only added IPv4 rules, IPv6 traffic is blocked.
- DDoS protection dropping IPv6. Some DDoS mitigation services don't handle IPv6 filtering well. During an attack, legitimate IPv6 traffic gets dropped with the bad traffic.
Web server misconfigurations
- Missing IPv6 listener. Your Nginx config has `listen 443 ssl;` but not `listen [::]:443 ssl;`. IPv4 works, IPv6 gets connection refused.
- SSL certificate not served on IPv6. The listener exists but the certificate isn't attached to it. IPv6 users get SSL errors.
- Virtual host mismatch. IPv6 requests land on the default virtual host instead of yours. Wrong content or a 404.
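The missing-listener case is simple enough to catch with a quick lint. The sketch below scans Nginx config text for listen directives and reports ports that have an IPv4 listener but no IPv6 one. It is a rough heuristic under several assumptions: it does not follow include files, it does not understand options like ipv6only=off, and the function name is made up.

```python
import re
from typing import List

# Matches "listen <args>;" and captures the arguments.
LISTEN_RE = re.compile(r"\blisten\s+([^;]+);")


def ports_missing_ipv6(config_text: str) -> List[int]:
    """Return ports that have an IPv4 listen directive but no IPv6 one."""
    text = re.sub(r"#.*", "", config_text)  # drop comments (heuristic)
    v4_ports, v6_ports = set(), set()
    for match in LISTEN_RE.finditer(text):
        target = match.group(1).split()[0]
        if target.startswith("unix:"):
            continue  # UNIX sockets are neither IPv4 nor IPv6
        if target.startswith("["):
            # "[::]:443" or "[2001:db8::1]:443"; a bare "[::]" means port 80
            port = target.rsplit("]:", 1)[1] if "]:" in target else "80"
            v6_ports.add(int(port))
        else:
            if ":" in target:
                port = target.rsplit(":", 1)[1]   # "0.0.0.0:8080"
            elif target.isdigit():
                port = target                      # "443"
            else:
                port = "80"                        # address only
            v4_ports.add(int(port))
    return sorted(v4_ports - v6_ports)


if __name__ == "__main__":
    example = """
    server {
        listen 443 ssl;   # IPv4 only
        listen 80;
        listen [::]:80;
    }
    """
    print(ports_missing_ipv6(example))  # prints [443]
```

An empty list does not prove IPv6 works end to end. It only rules out this one mistake, so an actual connection test over IPv6 is still needed.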
CDN and load balancer issues
- CDN not configured for IPv6. The CDN published an AAAA record but the edge doesn't serve your content on IPv6. Our article on IPv6 and CDN failures goes deeper on this.
- Load balancer IPv6 listener disabled. Cloud load balancers sometimes need separate configuration for IPv6.
- Health checks only run on IPv4. The load balancer checks backends over IPv4 only, so an IPv6-specific backend failure goes undetected.
Routing and transit problems
- ISP peering issues. Your server's IPv6 route to certain ISPs can be down while IPv4 routes are healthy.
- BGP route leaks. An IPv6 routing incident can make your server unreachable from specific networks or regions while IPv4 is fine.
- Tunnel failures. Some networks still use IPv6 tunneling, which adds another point of failure.
What This Actually Costs You
When I first hit this issue, I thought of it as a minor technical thing. But the more I looked into it, the more I realized the business impact is real, even for a hobby project like mine.
You're losing mobile visitors
Mobile accounts for the majority of web traffic globally, and in a lot of markets those mobile users are predominantly on IPv6. If your site is broken for them, you're losing the biggest chunk of your visitors. For e-commerce, that's lost sales. For SaaS, that's failed logins and users who assume you're unreliable and never come back.
Search engines notice
Googlebot has been crawling over IPv6 for years. If your IPv6 endpoint is intermittently broken, Googlebot encounters errors when it tries to crawl via IPv6. That can mess with your crawl budget and indexing frequency. Google hasn't said outright that IPv6 reachability affects rankings, but crawl errors from any source reduce how often and how thoroughly your site gets crawled.
Your uptime numbers are a lie
If you report uptime based on IPv4-only monitoring, you might be telling customers you have 99.99% availability while a significant chunk of them are experiencing something very different. When a customer says "your service was down for 3 hours" and you say "our monitoring shows 100% uptime," nobody wins that argument. You just lose trust.
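A back-of-the-envelope calculation shows how far off the reported number can be. This sketch weights each protocol's uptime by the share of users who prefer it. The inputs are illustrative assumptions, not measurements from my site: a 45% IPv6 share and a 7-day IPv6 outage in a 30-day month. It also ignores users who eventually get through after a slow fallback.

```python
def effective_availability(ipv4_uptime: float, ipv6_uptime: float,
                           ipv6_share: float) -> float:
    """Weight per-protocol uptime by the share of users on each protocol."""
    if not 0.0 <= ipv6_share <= 1.0:
        raise ValueError("ipv6_share must be between 0 and 1")
    return (1.0 - ipv6_share) * ipv4_uptime + ipv6_share * ipv6_uptime


if __name__ == "__main__":
    # Assumed numbers: IPv4 never went down, IPv6 was down 7 of 30 days,
    # and 45% of visitors prefer IPv6.
    seen_by_users = effective_availability(1.0, 23 / 30, 0.45)
    print(f"Dashboard: 100.0%, users: {seen_by_users:.1%}")
```

Under those assumptions the dashboard reports 100% while the user-weighted figure is about 89.5%.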
Slow incident response
Without IPv6 monitoring, you find out about IPv6 failures when users complain. If the affected users are in a different timezone, it could be hours before someone reports it. Compare that to automated monitoring that catches the failure in minutes and pings your phone. Use our Downtime Cost Calculator to see what that delay actually costs.
How I Fixed This With UptyBots
After the Marcus incident, I set up dual-stack monitoring. I wasn't going to let this happen again. UptyBots made this straightforward because it's built for exactly this kind of protocol-specific checking.
| Monitoring feature | How it helps detect IPv6-only downtime |
|---|---|
| Separate IPv4 and IPv6 monitors | Each protocol is checked independently. An IPv6 failure triggers an alert even when IPv4 is healthy. |
| Multi-location probes | Checks run from multiple regions, detecting IPv6 failures that only affect specific ISPs or geographies. |
| Response time history | Compare IPv4 vs IPv6 performance over time. Spot degradation before it becomes a full outage. |
| Multi-channel alerts | Get notified via email, Telegram, or webhook the moment IPv6 goes down. No waiting for user reports. |
| SSL checks per protocol | Detect certificate mismatches or handshake failures that only occur on IPv6 connections. |
I now have two monitors for every endpoint I care about: one for IPv4, one for IPv6. If either one fails, I get a Telegram message within a minute. No more week-long invisible outages. See our step-by-step guide to creating separate IPv4 and IPv6 monitors if you want the full walkthrough.
Quick Test: Is Your IPv6 Working Right Now?
After everything I went through, I check this regularly now. You can do it in about 30 seconds from a terminal:
# Check if your domain has an AAAA record
dig AAAA yourdomain.com +short
# Test IPv6 HTTP connection
curl -6 -I https://yourdomain.com
# Test IPv6 ping, 4 packets (use ping6 on systems where ping has no -6 flag)
ping -6 -c 4 yourdomain.com
# Compare IPv4 vs IPv6 response times
curl -4 -w "IPv4: %{time_total}s\n" -o /dev/null -s https://yourdomain.com
curl -6 -w "IPv6: %{time_total}s\n" -o /dev/null -s https://yourdomain.com
If the curl -6 command fails but curl -4 works, you have exactly the problem I had. If the AAAA dig returns an address but the curl times out, you've most likely got a stale record pointing to nowhere or a firewall silently dropping IPv6. Our guide on diagnosing IPv6 failures with ping, traceroute, and DNS tools walks through the full troubleshooting process.
What I'd Tell Past Me (And What You Should Do Today)
If I could go back to before the Marcus incident, here's what I'd do differently:
- Check what your monitoring actually tests. Log into your monitoring dashboard right now. Look at each monitor. Is it checking IPv4 only? If you're not sure, it almost certainly is. That means you have a blind spot.
- Add IPv6 monitors for everything that matters. For every endpoint you monitor over IPv4, create a matching IPv6 monitor. Same check type (HTTP, ping, port, SSL), same interval. It takes a few minutes.
- Verify your infrastructure end to end. Check your web server config, your firewall rules, your load balancer, your CDN. Make sure IPv6 traffic actually reaches your application. Don't assume it works because you turned it on once.
- Set up alerts that reach you fast. IPv6 failures should get to your on-call team immediately, not through the slow drip of customer complaints hours later.
- Re-check after every change. Any infrastructure change (firewall rules, load balancer config, CDN settings, server migrations) can silently break IPv6 while IPv4 stays fine. Make verifying IPv6 part of your post-change checklist.
I spent a week serving a broken page to a big chunk of my users and had no idea. My monitoring dashboard said everything was perfect. The only reason I found out was because Marcus happened to mention it in a text. That's not a monitoring strategy. That's luck.
IPv6 adoption is only going up. More carriers, more users, more regions going IPv6-first or IPv6-only. The gap between what your IPv4 monitoring shows and what your users actually experience is getting wider, not smaller. The fix is simple: monitor both protocols. Catch the failures your current setup is blind to.
For more background, read our article on why monitoring IPv4 and IPv6 separately matters.
See setup tutorials or get started with UptyBots monitoring today.