I have an idea: because DNS requests are made from a resolver close to the user, the TLD should use a GeoIP table in order to return the two closest DNS servers. Kinda like anycast without having to configure routing/BGP sessions.
Yes. But most domains, e.g. websites, don't have anycast. And anycast is expensive if you just have a private website or blog. And anycast services have poor coverage; it's only Cloudflare that has decent coverage, but they only offer proper DNS service to enterprise customers.
> But most domains, e.g. websites, don't have anycast.
Are you talking about the (mathematical) "domain" in the DNS specs, or the popular domain i.e. the web server?
The latter is arguably true, in which case the GeoIP proposition is moot: there is only one web server. Maybe you mean the web server has multiple addresses instead of being anycast. OK, yes, that happens; and some DNS servers do use GeoIP to tailor replies to try and hand out the closest address. Here is a passage from the BIND ARM:
"By default, if a DNS query includes an EDNS Client Subnet (ECS) option which encodes a non-zero address prefix, then GeoIP ACLs will be matched against that address prefix. Otherwise,they are matched against the source address of the query"
Regarding the former, does anyone have info on how many DNS providers use anycast? I think a lot; or maybe I should say that a lot of domains are hosted on anycast, and the DNS isn't as distributed as it used to be. If you're using DNS as a distributed key/value store, I hope you're doing a better job thinking about externalities (leakage) than e.g. the antivirus companies, in terms of locating authoritatives and how you update them opaquely.
Personally I think stub resolvers are stuck in the 1980s. They could do a lot more by monitoring traffic health and editing DNS replies. Due to peering arrangements you could be in the same IX as someone else, but that might not be the best route. Traceroutes, SYN exchanges, and (IP) TTLs might be better signals for determining the health of a particular path. I'd never thought about it until this thread started; maybe the stub resolver could use netflow analysis to inform editing the responses it returns to the applications.
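The SYN-exchange idea can be sketched in a few lines: a toy reorderer (not a real stub resolver) that times a TCP connect to each candidate address and sorts the reply list accordingly, pushing dead addresses to the back. The port and timeout are arbitrary choices for illustration.

```python
import socket
import time

def rank_addresses(addrs, port=443, timeout=2.0):
    """Order candidate addresses by measured TCP handshake time.

    A stub resolver could apply something like this to the address
    list in a DNS reply before handing it to the application: the
    SYN exchange measures the actual path, unlike a geo guess.
    """
    results = []
    for addr in addrs:
        start = time.monotonic()
        try:
            with socket.create_connection((addr, port), timeout=timeout):
                rtt = time.monotonic() - start
        except OSError:
            rtt = float("inf")  # unreachable or timed out: sort it last
        results.append((rtt, addr))
    results.sort()
    return [addr for _, addr in results]
```

In practice you'd cache these measurements rather than probe on every lookup, since a full connect per address adds latency to the first query.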
DNS getting less distributed is a problem, as public DNS services generally do not hold a cache for long. They also give up if they're unlucky and try a DNS server that is down!
So my case for top-level (TLD) GeoIP: I have several DNS servers for my web addresses/domains: three in the EU and two in the US. The problem is that when the TLD servers send the list of DNS servers, it's randomized. Instead I want them to return the list in geographic order (and also network-health order), so that the recursive resolver asks the best/closest DNS server first. The worst-case scenario is that a recursive resolver in the EU tries a DNS server in the US, which happens to be down, and then gives up.
The best-case scenario is that it tries the closest server in the EU.
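The geo ordering I want from the TLD is basically a sort by great-circle distance. A toy sketch with made-up nameserver names and coordinates; a real TLD server would look the resolver (or ECS prefix) up in a GeoIP table rather than take coordinates as input:

```python
import math

# Hypothetical nameservers: name -> (latitude, longitude)
NAMESERVERS = {
    "ns1.eu.example.net": (52.52, 13.40),    # Berlin
    "ns2.eu.example.net": (48.86, 2.35),     # Paris
    "ns3.eu.example.net": (59.33, 18.07),    # Stockholm
    "ns1.us.example.net": (40.71, -74.01),   # New York
    "ns2.us.example.net": (37.77, -122.42),  # San Francisco
}

def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points in km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2)
         * math.sin((lon2 - lon1) / 2) ** 2)
    return 6371 * 2 * math.asin(math.sqrt(h))

def ns_in_geo_order(resolver_pos):
    """Return the NS set sorted closest-first instead of randomized."""
    return sorted(NAMESERVERS,
                  key=lambda ns: haversine_km(resolver_pos, NAMESERVERS[ns]))
```

Distance is only a proxy for latency, of course, which is why I'd also want a health/RTT component in the ordering.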
Trying to solve my problem, I've tried the top 10 DNS providers (ranked by uptime and query speed), all of which use anycast. Only two could be used as secondaries/slaves, and both of them took over two days to propagate an update (they did not honor the TTL).
The reason I need fast updates is Let's Encrypt, which requires DNS challenges for wildcard SSL/TLS certificates.
About anycast use: the root servers have been using anycast for a while now. Some TLDs use anycast, I think (I haven't actually checked). Most web hosts and ISPs do not use anycast. ISPs, however, have their DNS servers very close to the end users and are good at caching, which is the second reason I'm against DNS centralization. Querying e.g. 8.8.8.8 is often 10x slower than using the ISP's DNS (assuming the ISP has the query cached).
Anycast, although proven to work nicely for the root servers, which it makes harder to DDoS, doesn't actually work that great. I argue they could just list the servers in geographic order instead of configuring BGP routes.
When I evaluated the "top 10" Anycast DNS providers, sometimes my amateur setup got lucky (eg test server from US got the US IP first and vice versa) and thus beat the Anycast network in query performance/latency.
> The worst-case scenario is that a recursive resolver in the EU tries a DNS server in the US, which happens to be down, and then gives up.
The recursive resolving algorithm for caching servers is actually addressed in the RFCs: it /should/ be trying all of them and using its own measurements to prefer the best-performing one(s). But it doesn't know about anycast; if a recursive resolver were switching between anycast nodes (with the same address), that would imply that routes were flapping. :-(
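The "use its own findings" part is typically a smoothed-RTT ranking: keep a decayed latency estimate per nameserver address, usually query the lowest, and occasionally re-probe the others so a recovered server gets another chance. A minimal sketch; the decay factor, timeout penalty, and exploration rate are my own guesses, not any real implementation's values:

```python
import random

class ServerSelector:
    """Smoothed-RTT nameserver selection, roughly how caching
    resolvers pick among a zone's NS addresses."""

    TIMEOUT_PENALTY = 5.0   # seconds charged for a dropped query
    ALPHA = 0.3             # weight given to the newest sample
    EXPLORE = 0.05          # chance to re-probe a non-best server

    def __init__(self, servers):
        self.srtt = {s: 0.0 for s in servers}  # optimistic start

    def pick(self):
        if random.random() < self.EXPLORE:
            return random.choice(list(self.srtt))
        return min(self.srtt, key=self.srtt.get)

    def record(self, server, rtt=None):
        """Feed back a measured RTT, or None for a timeout."""
        sample = self.TIMEOUT_PENALTY if rtt is None else rtt
        self.srtt[server] = ((1 - self.ALPHA) * self.srtt[server]
                             + self.ALPHA * sample)
```

Note this is exactly the mechanism anycast defeats: all nodes share one address, so the resolver maintains a single estimate for what may be several very different paths.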
Interesting data points.
I think the elephant in the room is the Universal Terrestrial Radio Access Network (UTRAN), a.k.a. "mobile". I don't work with provisioning much, so all I can say is that I suspect that if mobile is your concern, you just prostrate yourself to the UTRAN masters and co-locate wherever they tell you to.
SSL/TLS cert management is a fiasco in my opinion. It's a shame that DNSSEC hasn't achieved market dominance, so that if you owned a domain you could sign certs for it yourself, automatically. (Then we wouldn't need CA lists in browsers and OSes either.)
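The "sign certs for your own domain" mechanism already exists on paper as DANE/TLSA (RFC 6698); it just depends on DNSSEC being deployed. A common form is a "3 1 1" record, whose association data is simply the SHA-256 of the key's DER-encoded SubjectPublicKeyInfo. A sketch of computing that digest; the input bytes here stand in for a real SPKI blob:

```python
import hashlib

def tlsa_3_1_1(spki_der: bytes) -> str:
    """DANE-EE(3) / SPKI(1) / SHA-256(1): the TLSA certificate
    association data is the hex SHA-256 of the public key's DER-
    encoded SubjectPublicKeyInfo."""
    return hashlib.sha256(spki_der).hexdigest()
```

A validating client would fetch that record (DNSSEC-signed) and compare it against the key the server presents, with no CA in the loop.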