I think you're misunderstanding. If I have a lookup for foo.com configured to return 1.2.3.4 half the time and 5.6.7.8 the other half of the time, I want half the traffic to go to each load balancer. Maybe I'm running different software on each, or behind each. Maybe I'm using different networking gear. There are a ton of cases where getting close to 50/50 is what I need to get good experimental data.

Why would you think that it doesn't matter that the traffic doesn't match the ratios I picked? I picked them for a reason.




I think charcircuit suggests returning the inverse of the actual measured traffic: if the target is 50/50 and backend A gets 70% of the traffic while backend B gets 30%, one solution could be to return A in only 30% of the requests and B in 70% of them, leading to an actual 50/50 distribution.
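
A minimal sketch of that correction, assuming you can measure each backend's traffic share (all names and numbers here are illustrative):

    import random

    TARGET = {"1.2.3.4": 0.5, "5.6.7.8": 0.5}     # desired split
    OBSERVED = {"1.2.3.4": 0.7, "5.6.7.8": 0.3}   # measured traffic share

    def answer_weights(target, observed):
        # Weight each backend by target/observed, then normalize, so the
        # backend that is over target gets returned less often.
        raw = {ip: target[ip] / max(observed[ip], 1e-9) for ip in target}
        total = sum(raw.values())
        return {ip: w / total for ip, w in raw.items()}

    weights = answer_weights(TARGET, OBSERVED)    # ~30% for A, ~70% for B
    ips = list(weights)
    answer = random.choices(ips, weights=[weights[ip] for ip in ips], k=1)[0]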


I am suggesting that the returned servers are the servers with the least load. Yes, there may be uneven load being assigned to these servers, but clients are being given the choice of selecting a server with low load.


I'm suggesting to use more horizontal scaling. So if you have 1.2.3.0/24 and you advertise 2 servers at a time, you can start with 1.2.3.1 and 1.2.3.2 and then cycle out servers as they increase in load. If one server is getting less traffic, it just takes longer to cycle out.

People without cached records will always get the returned servers with the lowest load.
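
A rough sketch of that rotation, assuming some way to read per-server load and to publish records (get_load and publish_records are placeholders, not a real provider API):

    import itertools

    # Advertise two servers from the 1.2.3.0/24 pool; once one crosses a load
    # threshold, swap in the next pool member. A lightly loaded server simply
    # stays advertised longer before it cycles out.
    POOL = itertools.cycle(f"1.2.3.{i}" for i in range(1, 255))
    LOAD_LIMIT = 0.8

    def get_load(ip):
        # Placeholder: query your metrics system for this server's load.
        return 0.0

    def publish_records(ips):
        # Placeholder: push these A records through your DNS provider.
        print("advertising", ips)

    def rotate(advertised):
        # Keep servers under the limit; replace overloaded ones.
        return [ip if get_load(ip) < LOAD_LIMIT else next(POOL)
                for ip in advertised]

    advertised = [next(POOL), next(POOL)]   # start with 1.2.3.1 and 1.2.3.2
    publish_records(rotate(advertised))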

I was referring to load balancing and not splitting traffic for an experiment.


You're going to tend to have unevenness there because you're returning results to recursive DNS servers, not clients; so maybe you get unlucky and return A to more large ISPs; their recursive servers serve more clients per lookup and you've got lumpy results.

Then, if you return both IPs, you're very likely to eventually come across some people who sort multiple returned A/AAAA records and will prefer one that's "closer" to their current IP. Because if you're in 1.0.0.0/8, it's certainly going to be better to connect to 1.2.3.4 instead of 5.6.7.8. Same thing happens in v6 land, of course.
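
For illustration only (a made-up client at 1.9.9.9, IPv4 only), that "closer" preference amounts to sorting the returned addresses by longest common prefix with the client's own address, roughly what RFC 6724 destination selection does:

    import ipaddress

    def common_prefix_len(a, b):
        # Number of leading bits two IPv4 addresses share.
        diff = int(ipaddress.ip_address(a)) ^ int(ipaddress.ip_address(b))
        return 32 - diff.bit_length()

    client_ip = "1.9.9.9"
    answers = ["5.6.7.8", "1.2.3.4"]
    answers.sort(key=lambda ip: common_prefix_len(client_ip, ip), reverse=True)
    # -> ['1.2.3.4', '5.6.7.8']: the address in the client's own /8 wins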


If you get unlucky, A's load will rise quickly and the DNS will no longer return a record for it. As the TTL expires, resolvers will get a new set of servers with low load.

Lumpy results just means the times for an IP to be rotated out will be lumpy, and not that the server load will be lumpy.


You're suggesting that the DNS server(s) get realtime load feedback and adjust appropriately. That's possible, but not always available.


Yes, that is how you scale DNS load balancing. In practice there is a pretty low limit on the number of records you are allowed to return. Returning 100 A records works with 1.1.1.1, but will break other resolvers on the internet (the response overflows UDP size limits, forcing truncation and a TCP retry that not every resolver or middlebox handles).

>That's possible, but not always available.

If you build it, then it's always available.


Sure, but a lot of people use other people's DNS services. The number of times I've been allowed to build a DNS service for my employer is zero. I've used static DNS, where you just put in your X number of A records and hope; dynamic services where you put in X and have them serve only Y of them in any request and hope. I've used services with load feedback, but those are always a much higher tier.


>Sure, but a lot of people use other people's DNS services

There are many DNS services that offer an API for querying and updating records.

It is very easy to write a service / script that just finds the X least-loaded servers and then calls an API to set those as the available records. In practice you will also want some monitoring to confirm it is actually working.
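
As a hedged sketch of such a script (the metrics source, provider endpoint, token, and zone name are all hypothetical; substitute your actual DNS provider's API):

    import socket
    import requests  # assumes an HTTP record-update API

    ZONE_API = "https://dns.example/api/zones/foo.com/records"  # hypothetical
    TOKEN = "your-api-token"
    X = 2   # number of A records to advertise

    def server_loads():
        # Placeholder: pull current per-server load from your metrics system.
        return {"1.2.3.1": 0.8, "1.2.3.2": 0.2, "1.2.3.3": 0.5}

    def update_records(ips):
        # Replace the A records for foo.com with the least-loaded servers.
        resp = requests.put(
            ZONE_API,
            headers={"Authorization": f"Bearer {TOKEN}"},
            json={"type": "A", "name": "foo.com", "ttl": 30, "values": ips},
            timeout=10,
        )
        resp.raise_for_status()

    def looks_healthy(expected):
        # Basic monitoring: the name should resolve to a subset of what we
        # set (stale answers are possible until the old TTL expires).
        _, _, answered = socket.gethostbyname_ex("foo.com")
        return set(answered) <= set(expected)

    loads = server_loads()
    least_loaded = sorted(loads, key=loads.get)[:X]
    update_records(least_loaded)
    print("resolves as expected:", looks_healthy(least_loaded))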



