DNS has its own load balancing at several levels (and several different kinds):

Nameserver (NS) records used to locate a resource are served by other nameservers. NS records are chosen from among those offered in response to a query (RRs), and all of them should be tried if necessary to elicit a response. The algorithm isn't strictly specified: some nameservers will shuffle the order in which they return RRs in their answers; some won't, assuming the stub resolver or app will do it. The same applies to A and AAAA records (which return IP addresses for names), and this has long been used as a quick and easy form of load balancing/failover, except that it doesn't really fail over very well unless your app is coded to try all of the different answers (and the stub resolver returns them all to your app).
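
Concretely, "coded to try all of the different answers" looks something like this minimal Python sketch (host and port are placeholders, not anything from a real app):

    import socket

    def connect_any(host, port, timeout=5):
        """Try each address the resolver returns until one accepts a TCP connection."""
        last_err = None
        for family, socktype, proto, _, sockaddr in socket.getaddrinfo(
                host, port, type=socket.SOCK_STREAM):
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            try:
                sock.connect(sockaddr)
                return sock  # first address that answers wins
            except OSError as err:
                sock.close()
                last_err = err  # remember the failure, try the next RR
        raise last_err or OSError("no addresses returned")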

Nameservers querying other nameservers (caching/recursive resolvers) are supposed to compile metrics on response times when they make upstream requests and pick the fastest upstreams once they learn them.
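
A toy sketch of that bookkeeping (the smoothing factor is made up here; real resolvers have their own SRTT algorithms):

    SRTT_DECAY = 0.7  # made-up smoothing factor; real resolvers tune their own

    class UpstreamPicker:
        """Track a smoothed response time per upstream and prefer the fastest."""

        def __init__(self, servers):
            self.srtt = {s: 0.0 for s in servers}  # start at zero so each gets probed

        def pick(self):
            return min(self.srtt, key=self.srtt.get)

        def record(self, server, elapsed):
            # exponentially weighted moving average of observed response times
            self.srtt[server] = (SRTT_DECAY * self.srtt[server]
                                 + (1 - SRTT_DECAY) * elapsed)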

Stub resolvers (running on your device) typically query nameservers in the order you specified them in your network config, but not always.

From the foregoing, you can probably see that running a caching/recursive resolver close to your devices is supposed to be desirable, by design.

So far, so good. ;-)

As specified (and this has never changed), DNS tries UDP first. "OK," you think, "that must mean it will fall back to TCP." But that's not actually true: it only tries TCP if it receives a UDP response with TC=1 (flagged as truncated). If the UDP response gets fragmented and it doesn't get all the frags, or if it never gets a UDP response at all, it /never/ tries TCP.
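
Here's that behavior sketched with dnspython (assuming you have it installed); note that the only path to TCP is the TC bit:

    import dns.flags
    import dns.message
    import dns.query

    def resolve(name, rdtype, server, timeout=3):
        q = dns.message.make_query(name, rdtype)
        # UDP first; a lost or fragmented-and-dropped response just times out
        resp = dns.query.udp(q, server, timeout=timeout)
        if resp.flags & dns.flags.TC:
            # retry over TCP only because the response was flagged truncated
            resp = dns.query.tcp(q, server, timeout=timeout)
        return resp

    # resolve("ycombinator.com", "A", "8.8.8.8")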

You're mixing two very different environments above: 1) a datacenter with (let's just assume) VPCs and 2) a web browser.

In case #2 I'll match your ante and raise you an overloaded segment that's dropping UDP packets, in which case stuff may fail to resolve at all. Oh look, I drew a wildcard: traditionally browsers have used the device's stub resolver, but since they've pushed ahead with DoH they've had to implement their own. People think I'm a DNS expert (what do they know?), and the conventional wisdom among my peers and me is that UDP should perform better than TCP; anecdotally, though, people claim that DoH and DoT perform better for them than their stub resolver. "Must be your ISP messing with you," says someone. "Yeah right, that's gotta be it." Me: "Did you try running your own local resolver?" Them: "wuut?"

So here's where I confess that the experts aren't always right, because I run my own local resolver and I have the same problem: when the streaming media devices are running, DNS resolution on the wifi-connected laptop sucks, and if I run a TCP forwarder it starts working! (https://github.com/m3047/tcp_only_forwarder)
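
For the curious, the idea is just to take stub queries on local UDP and relay them upstream over TCP. This isn't the linked forwarder itself, just a minimal sketch of the same idea, with an assumed upstream:

    import socket
    import struct

    UPSTREAM = ("8.8.8.8", 53)  # assumed upstream resolver; substitute your own

    def read_exact(sock, n):
        """Read exactly n bytes from a TCP socket."""
        buf = b""
        while len(buf) < n:
            chunk = sock.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("upstream closed mid-message")
            buf += chunk
        return buf

    def serve(listen=("127.0.0.1", 5353)):
        udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        udp.bind(listen)
        while True:
            query, client = udp.recvfrom(4096)
            # DNS over TCP frames each message with a 2-byte length prefix (RFC 1035)
            with socket.create_connection(UPSTREAM, timeout=5) as tcp:
                tcp.sendall(struct.pack("!H", len(query)) + query)
                (length,) = struct.unpack("!H", read_exact(tcp, 2))
                answer = read_exact(tcp, length)
            udp.sendto(answer, client)

    # serve()  # then point your stub resolver at 127.0.0.1:5353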

Now to case #1, the datacenter. I hope you're running your own authoritative and caching servers, and you should read about views in your server's config guide; using EDNS to pass subnet info is a kludge. If you're writing datacenter apps, you should consider doing your own resolution and using TCP (try the forwarder, I dare you), and provisioning accordingly (because DNS servers assume most requests will come in via UDP).

If you want load balancing, "you know, like nginx," I've got news for you: BIND comes with instructions for configuring nginx as a reverse TCP proxy. Oh! Looks like I've got a straight in a single suit: nginx provides SSL termination, so I've got DoT for free!
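
From the client side, that "free" DoT is just the same length-prefixed DNS-over-TCP stream wrapped in TLS on port 853. A sketch with dnspython (assuming dnspython 2.x; the public resolver here is purely an example):

    import dns.message
    import dns.query

    q = dns.message.make_query("ycombinator.com", "A")
    resp = dns.query.tls(q, "1.1.1.1", port=853)  # TLS-wrapped DNS over TCP
    print(resp.answer)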




I am not really talking about load balancing the DNS traffic, I'm talking about interpreting the response of the DNS query. (The reliability at the network level seems to be handled by moving everything to DNS-over-HTTPS or something, and is a debate for another day.)

For example, consider the case where you resolve ycombinator.com. You get:

    ycombinator.com.        59      IN      A       13.225.214.21
    ycombinator.com.        59      IN      A       13.225.214.51
    ycombinator.com.        59      IN      A       13.225.214.81
    ycombinator.com.        59      IN      A       13.225.214.73
Which of those hosts should I open a TCP connection to, to begin speaking TLS/ALPN/HTTP2? The standard doesn't say. I would like a standard that says what to do. (The more interesting case: say I pick 13.225.214.21 at random and it doesn't respond. What do I do now? Tell the user ycombinator.com is down? Try another one? All of this could be defined by a standard ;)


Perfect example. :-) There's not enough information to make a considered response, unless you've got a history of opening TCP connections to them to base a decision on.

Don't get me wrong, I think stub resolver logic is stuck in the 1980s!

If your app or device doesn't have such a history, and no way to obtain it, then maybe the server can do it based on what it knows about its history with IP addresses "close" to yours (the EDNS kludge).
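
The kludge in question, sketched with dnspython (the prefix below is illustrative only): the resolver attaches a truncated client subnet so the authoritative side can answer based on what's "close" to you:

    import dns.edns
    import dns.message
    import dns.query

    q = dns.message.make_query("ycombinator.com", "A")
    # send only a /24, not the full client address
    q.use_edns(options=[dns.edns.ECSOption("203.0.113.0", 24)])
    resp = dns.query.udp(q, "8.8.8.8", timeout=3)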

> It doesn't respond. What do I do now? Tell the user ycombinator.com is down? Try another one? All of this could be defined by a standard

I would argue the DNS is clear about this from its own behavior: it tries another one.

Although it's not clear from `pydoc3 socket.create_connection`, it's pretty clear from https://docs.python.org/3/library/socket.html#creating-socke... that socket.create_connection() will "...try to connect to all possible addresses in turn until a connection succeeds."

So I would say that the correct action would be to try all possible addresses until one succeeds.
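
That is, the stdlib already does the loop for you:

    import socket

    # walks getaddrinfo()'s answers in order, returns the first that connects
    conn = socket.create_connection(("ycombinator.com", 443), timeout=5)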


> Which of those hosts should I open a TCP connection to to begin speaking TLS/ALPN/HTTP2? The standard doesn't say. I would like a standard that says what to do.

Well, there was an RFC (found it, RFC 3484) that told you to pick the one closest to your network (which wouldn't make a difference in this case, unless you were in say 13.225.214.0/27 or so). But that's not actually helpful, because given two destination IPs, one in the same /8 as me, and one not, I don't have any information that would help me determine which is a better choice.
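
The rule itself is easy to state, even if it's unhelpful; a sketch of the longest-matching-prefix comparison with the stdlib ipaddress module (the source address is made up):

    import ipaddress

    def common_prefix_len(a, b):
        # number of leading bits two IPv4 addresses (as ints) share
        return 32 if a == b else 32 - (a ^ b).bit_length()

    def rank_destinations(source, candidates):
        src = int(ipaddress.ip_address(source))
        return sorted(candidates,
                      key=lambda c: common_prefix_len(src, int(ipaddress.ip_address(c))),
                      reverse=True)

    print(rank_destinations("13.225.214.5",
                            ["13.225.214.21", "13.225.214.51", "198.51.100.7"]))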

From experience, most browsers will try a couple IPs before showing an error message, but that's not standard. If you have a fancy authoritative server, a lot of traffic, and a bunch of server IPs, you can get OK balancing by telling some clients some IPs and some clients other IPs; but it depends on having enough diversity in recursive servers; if all of your users are coming from one mobile ISP, chances are you won't get a lot of balancing.

(And I'm sure you already know all this :)

Better to have clients with a bit of intelligence. :)


I think the main problem with great ideas like this is that some clients will do a really bad job at implementing the spec correctly and one of those clients will be the default browser on a very popular OS or device.


Isn't that what SRV records are for? If there's one for _http specifying ycombinator.com as the name, then any of those IP addresses should accept a connection on port 80 speaking HTTP. Without independent names, they should all be treated equally and your app (like a browser) gets to try just one or all of them.
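
For example, with dnspython and the usual _service._proto owner form (this exact record may not actually exist):

    import dns.resolver

    answers = dns.resolver.resolve("_http._tcp.ycombinator.com", "SRV")
    # RFC 2782: lowest priority first; weight is meant for weighted-random
    # selection within a priority (plain sorting is a simplification)
    for rr in sorted(answers, key=lambda r: (r.priority, -r.weight)):
        print(rr.target, rr.port)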

If you're talking about subprotocols/versions of HTTP like HTTP2 then you can define subservices, so you could have _http2._http. But no one has proposed that yet :)

Of course with anycast, multiple A records can be redundant :)



