I'm more and more leaning towards "run dnsmasq on every single machine and be done with it"; it feels like the sweet spot for getting both flexibility and predictability. That said, systemd-resolved is not as unpredictable and frustrating as resolvconf or NetworkManager.
Inaccurate information in the man pages doesn’t help. The one for resolved.conf claims that, if the DNS entry is not present in its configuration, systemd-resolved will use the DNS servers in /etc/resolv.conf. But strace showed me that, in fact, it is ignoring that file and using the resolver provided by my ISP (as resolved’s status output confirms). After copying the entries (for 8.8.8.8 and a few backups) into resolved’s configuration, it’s doing what I want it to, now.
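For anyone else hitting this, the fix was roughly the following in /etc/systemd/resolved.conf (the server addresses are just the ones I happen to use):

    [Resolve]
    DNS=8.8.8.8 8.8.4.4
    FallbackDNS=1.1.1.1

Then restart with `systemctl restart systemd-resolved` and check `resolvectl status` to confirm the servers actually took effect.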
I’ve suspected for a while, and am now pretty sure, that this system has become so byzantine that the (incredibly talented) people assembling distros, writing the software, and maintaining the man pages, don’t know how it works.
> I’ve suspected for a while, and am now pretty sure, that this system has become so byzantine that the (incredibly talented) people assembling distros, writing the software, and maintaining the man pages, don’t know how it works.
Which is... precisely what a lot of people were worried about re: systemd :(
I don't know about deterministic, but there's something to the flakiness.
The old stuff was byzantine, but it was decades old, which meant that it had decades of people beating on it, figuring out all the ways that it could flake, and fixing those flakes. And that's not something to sneeze at.
That doesn't make the old stuff any less byzantine. And systemd really is a lot more coherent than most of what it replaced. But systemd doesn't have decades of people finding all the ways it can fall over.
I agree. My comment wasn’t a complaint about systemd per se. The old init system was nothing to fall in love with. I think this is an improvement. The problem is that it hasn’t “replaced” all the older mechanisms, but is one more layer that adds to the confusing mess, with sedimentary layers of conflicting systems, mysterious files all over the filesystem, and obsolete and conflicting documentation.
What I’ve enjoyed in particular: systemd timers are an improvement over cron; and I’ve been making great use of systemd-nspawn, which is very convenient and powerful.
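As an illustration of the timer point, a daily cron job becomes a pair of units along these lines (the unit names and script path here are hypothetical):

    # /etc/systemd/system/backup.timer
    [Unit]
    Description=Daily backup

    [Timer]
    OnCalendar=daily
    Persistent=true

    [Install]
    WantedBy=timers.target

    # /etc/systemd/system/backup.service
    [Unit]
    Description=Backup job

    [Service]
    Type=oneshot
    ExecStart=/usr/local/bin/backup.sh

Enable it with `systemctl enable --now backup.timer`. `Persistent=true` catches up on runs missed while the machine was off, which plain cron never did.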
I agree that it is another layer of complexity, but when it works well it's a really good layer of abstraction for user space programs to interact with the kernel. It's only when issues arise that the extra layer brings in more complexity. And like GP stated, that's mostly an issue of time needed to eliminate the most serious bugs and for edge cases to be well understood and documented.
There were plenty of growing pains at first, but the transition period for any project looking to replace the old init system was going to be a headache no matter what. It does take some time to grok, and it feels monolithic at first, but I actually think that's a good thing. It's composed of many separate binaries that individually conform to the unix philosophy, but they're all distributed, maintained, and released together. You still have tons of configuration options, but you can be confident that standard systemd components are always going to play nicely with each other.
The only issue I've really had are the binary logs. If the indexes get messed up it's a pain to access the logs, even though the actual files rarely have issues with corruption. But it's fairly easy to forward messages to syslog if you need.
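If I remember right, the forwarding is a one-line change in journald's config (you also need a syslog daemon running to receive the messages):

    # /etc/systemd/journald.conf
    [Journal]
    ForwardToSyslog=yes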
The problem that I can see is not that systemd and other new systems are complex. They deal with complex problems. The problem is that the old stuff stays around, so a distribution like Ubuntu keeps growing in complexity and obscurity. My Ubuntu laptop and the couple of Debian servers that I take care of have systemd, which is fine, and I’m happily using it. But they also still have an init.d directory, are still running cron, etc., etc. It’s not coherent. When I have to use strace to spy on my system so I can figure out how to configure it, something’s wrong. And it’s not just the init system. My laptop’s sound is much more reliable after I removed PulseAudio, because it seemed to be fighting with alsa, or something else.
I'd say the vast majority of systemd makes things more consistent and easier to understand at a glance. For some reason resolved in particular is one of the pieces that happens to have a wacky take on its implementation choices, ON TOP OF all of the other services that want to make things "just work" screwing with how it's supposed to work.
If it were just resolved or just the other tools doing wacky things DNS stuff would be a lot less messy to configure.
Don't even get me started on systemd/resolved. I know the RFC does not give any of the name servers listed priority, but that is how it has been used since inception. Here comes resolved, so now if your first name server ever fails to resolve, it will switch to the second and not switch back (unless that one also fails). This causes all sorts of issues with private DNS servers on lans, VPCs, and VPNs where you want a failover to a public DNS while resolving.
> so now if your first name server ever fails to resolve, it will switch the second and not switch back (unless it also fails)
Exactly as it should.
> This causes all sorts of issues with private DNS servers on lans, VPCs, and VPNs where you want a failover to a public DNS while resolving.
This is broken; you should have per-interface defined DNS. For internal lans, vpns et cetera you can define which zones these DNS servers handle. For public resolvers (designated with handling the ~. zone), just having multiple entries provides the fallback you want.
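With systemd-resolved, that per-interface split looks roughly like this (the interface names, domain, and addresses are just examples):

    $ resolvectl dns tun0 10.8.0.1
    $ resolvectl domain tun0 '~corp.example'
    $ resolvectl dns eth0 1.1.1.1 8.8.8.8
    $ resolvectl domain eth0 '~.'

Queries under corp.example go to the VPN resolver; everything else goes to the public resolvers on eth0, which also act as fallbacks for each other.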
This post is simply wrong. Even on a plain old Debian without any of those programs (NetworkManager, ...), the file /etc/resolv.conf will probably be overwritten at boot time. I don't want to draw a political comparison to Brexit, but before engaging in "taking back control", one should learn how the current situation works and what the change will bring.
The main missing concept in the article is DHCP. Most computers use it to get their IP address and their DNS servers. For example, for as long as I can remember the default Debian way to handle the network was ifupdown, where the `dhcp` method runs a DHCP client after an interface comes up, and by doing so the file resolv.conf is updated. If resolv.conf is manually controlled, there is no DHCP to update it, and then you have to declare your DNS servers yourself. But the DNS server that you can query from home may not be reachable from a public wifi. And at work, you won't have access to the intranet unless you add some private DNS servers at the top of your resolv.conf. So totally manual handling of resolv.conf is a bad idea, at least for a laptop.
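For reference, the classic ifupdown stanza that kicks off that DHCP hook is just this (interface name is an example):

    # /etc/network/interfaces
    auto eth0
    iface eth0 inet dhcp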
The article also never asks why all these programs try to "take control" of resolv.conf. Some of them just try to update it when they receive DHCP data about DNS. Others, like resolvconf, try to separate the manual part from the DHCP-generated part (which is what "manual control" should aim for). Other programs add features, like a local cache or automatic knowledge of the local network (containers and such).
Sometimes you don't want to blindly trust dhcp. My isp is set up so that the internet-facing host gets an IP through dhcp. I don't want to use the ISP's dns however.
I use OpenBSD on that machine. I think it was "ignore domain-name-servers" in dhclient.conf that fixed that for me.
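From memory, the relevant line is just this (check dhclient.conf(5) on your release for the exact form):

    # /etc/dhclient.conf
    ignore domain-name-servers;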
Careful. Even if you point at a different dns resolver, your ISP still gets to see this traffic, and it can still MITM it. This is not theoretical, it is trivial. Many ISPs do it.
Confidentiality isn't the only form of trust, many are simply annoyed with "features" such as wildcard catch-alls that direct you to ad/search pages instead of saying the domain doesn't resolve.
I've seen no evidence that Virgin Media intercepts UDP/TCP port 53 traffic, however. It just runs rigged proxy DNS servers and farms them out over DHCP.
I've seen several reports that the Virgin Media "click here to disable" mechanism is a placebo that actually does nothing at all, including in the comment that message was a reply to.
I use non-ISP dns servers for a variety of reasons. I've tested and found 1.1.1.1 faster than my ISP dns in the past.
Some people use e.g. opendns because they want to use that for keeping track of what dns names are being queried from inside their networks.
I actually do use my ISP's dns for some names (particularly netflix), because only they know the names of their internal cache boxes. I have a dnsmasq configuration that sends queries that end in `netflix.com` to the ISP dns and others to 8.8.8.8/1.1.1.1 or whatever.
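In dnsmasq terms that's a couple of server= lines, something like the following (the ISP resolver address is a placeholder):

    # /etc/dnsmasq.conf
    server=/netflix.com/203.0.113.53
    server=8.8.8.8
    server=1.1.1.1

The /netflix.com/ entry matches that domain and all its subdomains, so the ISP's cache-box names resolve via the ISP while everything else goes to the public resolvers.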
Actually unbound. I forget why I chose this over unwind. I had a reason once. Set it up years ago and didn't touch the config files. Maybe the initial setup predated unwind and I had something working already...?
Totally off topic, but I was thinking of replacing it with something I wrote myself. I had some downtime earlier this year and I wrote a forwarding dns server that can speak TLS on both ends. Was simple to do. But needs some polishing before I throw it on github or use it at home.
dnsmasq, unwind, and others all support DNSSEC.
If you can get a DNS path with DNSSEC, this bypasses the ISP's ability to manipulate your DNS.
Of course, DNS over HTTPS is another solution to the same, for browsers.
Of course one could also tunnel through.
It doesn't matter how much of your own software supports DNSSEC: if the sites you talk to on the Internet don't explicitly support it, DNSSEC does nothing for you. This is a problem, because most sites on the Internet don't support it. Several of the most important sites on the Internet, with some of the largest security teams in the industry, have said they don't intend ever to support it.
On the other hand, DNS over HTTPS defeats ISP DNS interception regardless of who supports it, which is why so many more people use it than use DNSSEC, which is moribund.
Here is what I do to take back control of host resolution which was probably the intended goal.
First of all, /etc/nsswitch.conf sets the priorities. For instance the line `hosts: files myhostname mymachines dns` would exclude systemd-resolved (aliased to `resolve`) and try to identify containers before querying DNS. See the man page.
I used to use resolvconf in order to merge my local rules with those set automatically by DHCP. Nowadays I have a dnsmasq daemon running as a DNS proxy. This way I can set rules so that some domains go to a specific DNS server (e.g. *.debian.org should be resolved through 1.1.1.1 instead of the default that DHCP set). An added benefit is the filtering of ad domains.
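A sketch of that kind of config (the file name and the blocked domain are just examples):

    # /etc/dnsmasq.d/custom.conf
    server=/debian.org/1.1.1.1
    address=/doubleclick.net/0.0.0.0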
This article is a great condemnation of how badly Unix networking has gone in the last 10 years. For many, many, many years we used /etc/resolv.conf to configure name service. Easy and simple, but not very flexible.
All the tools this article talks about turning off are trying to solve problems. They're trying to let resolv.conf be configured with DHCP, or handle VPNs correctly, or roam across networks. You need software to do that, not just a static config file. The problem is every Linux distro has a different tool. And those tools behave opaquely or do bizarre things like use 127.0.0.53 as the loopback address. No one built the One True Tool that got it right. And so we have a mess that has the sysadmin throwing up their hands, disabling all the tools, and making immutable config files just to get back to basic static file functionality.
> No one built the One True Tool that got it right.
Finding the One True Tool takes time. Developers need to experiment to explore the various corners of the problem space. Eventually we'll settle on the best way to solve the problem for the majority of use cases, and distributions will in time settle on that solution.
In the meantime, sysadmins who only need the simple 20 year old use case can continue to disable the modern tooling and get what they want. Those who need the newer use cases can do so, provide feedback, and the ecosystem can slowly develop and settle on the best solution.
This is open development of Free Software in action. Tools can evolve incrementally or new tools can take big leaps, taking advantage of both the lessons learnt and the code produced in older tools. It's how the ecosystem gets better. I don't see it as a condemnation at all.
> And so we have a mess that has the sysadmin throwing up their hands, disabling all the tools, and making immutable config files just to get back to basic static file functionality.
I’m not a big fan of sysadmins coasting on their existing knowledge, and this sentiment sums up what I object to precisely. Complexity has increased somewhat since the Slackware days, and it doesn’t always seem to be for the best, but that’s no excuse for disabling it.
Expressing this viewpoint in an interview would materially harm a sysadmin candidate’s chances: it suggests that they are inflexible, unable to adapt to change, and prefer to simply sweep issues under a rug rather than document them and resolve them.
I empathize with the desire to set it all aside and go back to bare metal, but it’s not a position I would take as much pride in.
In my opinion, the current mess with resolv.conf is a case of accidental complexity. I don't think there's anything wrong with disabling badly designed systems that aren't useful for the problem at hand, assuming the admin is managing an environment where none of those tools are actually needed (which is not at all uncommon in an enterprise setting).
Sometimes there's a point when you're just fed up with crap messing with your config and spending time hunting it down every other week to fix it and you just want it to stop.
I'm lazy, and I've been doing this on my own computers for over a decade. I wouldn't do it on a box that anyone else uses. And I've had to do the normal fixes for some VPN software when the network blackholes any DNS traffic not going to their server IPs. Otherwise it's worked out fine for me
For example, some daemon in the future might take lots of CPU, spinning and logging errors about not being able to write to a file it's configured to manage.
Let's be honest: next to no service is designed to expect immutable files. At best they handle read-only filesystems. If they document how to configure them not to write to a file, and instead of doing that the user marks the file immutable, I wouldn't blame the app design.
Maybe someone later has some legitimate edit they want to make to the file, which fails because it's immutable, which is in turn very hard to figure out. (E.g., I didn't know that there exists an "immutable" attribute.)
Put a comment in the file with the exact commands to turn on and off immutability. This is just good practice for most things. Having a maintenance text file with all the system stuff is a secondary good practice along with the comment.
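Something along these lines (the nameserver is whatever you actually use):

    # This file is immutable. To edit it:   chattr -i /etc/resolv.conf
    # When done, re-lock it with:           chattr +i /etc/resolv.conf
    nameserver 1.1.1.1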
systemd-resolved explicitly (as in, it is documented behaviour) will not touch /etc/resolv.conf if it is a plain file.
It relies on /etc/resolv.conf being a symbolic link to its generated version in /run/systemd/resolve/resolv.conf to do its thing. If it isn't linked, it assumes that is intentional and leaves your resolv.conf alone.
I'm pretty sure that just replacing the /etc/resolv.conf symlink with a regular file will prevent all of the major network configuration tools from modifying it. I've verified this for resolvconf and systemd-resolved at least:
> To make the resolver use this dynamically generated resolver configuration file the administrator should ensure that /etc/resolv.conf is a symbolic link to /run/resolvconf/resolv.conf. This link is normally created on installation of the resolvconf package. The link is never modified by the resolvconf program itself.[0]
> To improve compatibility, /etc/resolv.conf is read in order to discover configured system DNS servers, but only if it is not a symlink... [1]
NetworkManager is the one exception I'm aware of here, but I don't think any server distros use it. Using NM is also the best option for configuring desktop environments, so I wouldn't bother disabling its resolv.conf management functionality.
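So for the non-NM case, "replace the symlink with a regular file" is literally just this (addresses are examples):

    $ sudo rm /etc/resolv.conf
    $ printf 'nameserver 1.1.1.1\nnameserver 8.8.8.8\n' | sudo tee /etc/resolv.conf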
resolv.conf is a blessing and a curse. On the one hand, you've got a file that nearly every flavour of Linux will use for DNS resolution, with a standardised format. On the other hand, you've got four hundred network-managing programs that all need to be told separately that you don't want to use them to manage the file for you because you want to set it up yourself.
Makes me wonder if the file format should be extended to contain a config option indicating which program manages the file, so that others know to keep their hands off. Right now there are comments in there, but those comments are often hardly machine-readable.
$ cat /etc/resolv.conf
# DO NOT EDIT THIS FILE. This file is currently under the management
# of NetworkManager. It's automatically generated according to the
# DNS configuration of the active network connection in NetworkManager.
# Any change to this file will be overwritten.
#
# If you are trying to change your DNS resolver, it's officially recommended
# to set it within NetworkManager on a per-connection basis via "nmtui", "nmcli",
# or alternatively, by editing files under "/etc/NetworkManager/system-connections/".
#
# To stop NetworkManager from managing it altogether, edit
# /etc/NetworkManager/NetworkManager.conf, set the following and
# restart NetworkManager, before you edit /etc/resolv.conf.
#
# [main]
# dns=none
#
# If you are trying to set up a stub DNS resolver, it's recommended to
# invoke NetworkManager's builtin support. Set "dns=dnsmasq" or "dns=unbound"
#
# See "man 5 NetworkManager.conf" for more details.
I recommend reading RFC 1034 section 5.3.1 first, and learning what a "stub resolver" actually is. It's actually the thing that is reading /etc/resolv.conf.
Reading up about glibc's nsswitch would also help.
The problem with the RFC 1034 section 5.3.1 idea is that it is insufficient if you have more than one interface and the DNS resolvers available over them might provide different answers for some zones (example: you have a VPN connection and the DNS resolver on the other side of the link has ACLs for the zones it is handling, providing you with answers that public DNS won't provide); that's why NetworkManager+dnsmasq/systemd-resolved/whatever are used.
Linux is not the only system that has to handle /etc/resolv.conf being inadequate; but it is funny that we see articles like this only for Linux. I have yet to see an article "How to take back control of /etc/resolv.conf on MacOS".
That's because there actually is a comment in /etc/resolv.conf on MacOS telling you that it isn't used by the libraries there. This is ironic, given the context here.
#
# Mac OS X Notice
#
# This file is not used by the host name and address resolution
# or the DNS query routing mechanisms used by most processes on
# this Mac OS X system.
#
# This file is automatically generated.
#
/etc/resolv.conf hasn't been the place for this for the past 17 years, this having changed in MacOS 10.3. It's in the system preferences plist now, modifiable with the networksetup tool.
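E.g., assuming your network service is named "Wi-Fi" (check with `networksetup -listallnetworkservices`):

    $ networksetup -setdnsservers Wi-Fi 1.1.1.1 8.8.8.8
    $ networksetup -getdnsservers Wi-Fi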
No, I've never seen cleanbrowsing and its ilk; at least in this case it is commercial companies trying to force the system to work with their product, not a part of the community supposedly "fixing" the system and breaking it in the process. They also do not force resolv.conf, only using the new configuration mechanism to make their changes stick.
Unbound & co are of course not stub resolvers; however, the classic stub resolvers (like glibc's nss dns module) are kind of obsolete too. All modern systems have a service that does the resolving and caching in a separate process, and the nss just asks that process. Just like unbound & co. With properly configured nsswitch (e.g. with systemd-resolved and nss-resolve), resolv.conf is not used either, just like on MacOS. Resolv.conf pointing to a local resolver is only for processes that ignore nsswitch and would break otherwise.
The handling of the resolv.conf file is embedded in the libc, and the issue is bigger than Linux. This whole mess is a failure of the Linux distribution community to come up with coherent standards for managing a simple text file on a system using DHCP. The solutions today are horribly complicated and appear to optimize for laptop users on simple, naive networks. I have no idea why I have to mess with any of this garbage on my servers.
No, we need to get back to software like it was in the early 2000s when people cared about consistent, long-term behavior instead of re-inventing everything because laptops.
It makes sense to me. It solves the problem for simple networks and roaming users - so basically situations where things are expected to "just work" with minimal knowledge. It's our (sysadmins) job to configure the system correctly in other places. Especially in case of weird networks.
But simple networks and roaming users are perfectly fine with a regular static file. I have no idea what user base this overengineered crap is catering to, but it seems equally useless both on (my) servers and laptops.
A significant number of hotels do the login portals using their own DNS servers, so if you ignore those, you can't use that network. Pretty much every corp environment forces local DNS servers.
The web portal isn't the only way on to a hotel network, just the most convenient. The Nintendo Switch console doesn't support captive portals either, so I've gotten in the habit of dialing the front desk any time I stay at a hotel and asking to be transferred to the network team. You tell them you need a device whitelisted and provide your room number, duration of stay, and the MAC addresses of any devices.
I recently installed Ubuntu 20.04 and configured a home gateway. I had not done something like that in a while (3+ years). My experience was not much different this time than any other time before as far as I remember.
It took me an hour of reading the official Ubuntu documentation and a few man pages on systemd, interfaces, networkd, resolved, netplan, etc. to figure out how stuff works on this distro.
After that I was able to simply do 'systemctl status' and look at all the running processes, then stop and disable all the ones I didn't want.
I wanted to run unbound as my DNSSEC-verifying, TLS-forwarding, blocklist/filtering resolver. I was able to do that without any difficulty – stopped resolved, enabled and started unbound, and changed the entry in resolv.conf. I also brought up dhcpd for my network and was able to serve that DNS server to the devices on my network.
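A rough sketch of that unbound setup (the addresses and paths are examples; check unbound.conf(5) for your distro's defaults):

    # /etc/unbound/unbound.conf
    server:
        interface: 192.168.1.1
        access-control: 192.168.1.0/24 allow
        auto-trust-anchor-file: "/var/lib/unbound/root.key"   # DNSSEC validation
        tls-cert-bundle: /etc/ssl/certs/ca-certificates.crt
    forward-zone:
        name: "."
        forward-tls-upstream: yes
        forward-addr: 1.1.1.1@853#cloudflare-dns.com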
Then I had to spend a bit of time figuring out pppd, policy routing, and nftables to do a bit of fancy dual-WAN routing and traffic filtering. I found netplan to be quite good at what it does (still incomplete for more advanced scenarios – but then so were all the previous tools of this kind).
In all this, I actually found systemd quite helpful. I think those who complain don't take the few minutes it takes to read the man pages and familiarize themselves with the tools on the OS.
Of course things change, and you will have to learn new things. Expecting things to stay constant is unrealistic. As long as I'm able to depend on man pages and official distro documentation, I'm not objecting to change.
Haha I had a pi for toy projects and somehow dns kept breaking. I didn't have time to find the root cause so I made a cronjob to overwrite the file every minute. Maybe top 10 dirtiest things I've ever done but it worked fine for months
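The crime in question was something like this crontab line (the known-good copy's name is from memory):

    * * * * * cp /root/resolv.conf.good /etc/resolv.conf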
Sometimes I’m baffled that setting a static value for DNS servers is harder than updating resolv.conf itself.
Not to mention the fact that the proper steps for this simple task are often buried among other network configuration information.
I think SUSE had the best approach: if a file doesn’t have a certain header (a comment stating something like “automatically created by x”) it means management scripts shouldn’t touch it. Not sure if it’s still like that.
"The primary purpose of adding 127.0.0.53 to resolv.conf is for client software that wants to do DNS resolution by itself instead of using NSS -- most notable example is Google Chrome, and third-party software which is statically linked (e. g. Go)."
All this mess just to work around Chrome? That's annoying.
I love FreeBSD. Are you using it in a laptop? If so, which one? I've been looking for a friendly laptop to check out the latest FreeBSD. It's been a while.
I've used it with much success on the ThinkPad X220 and more recently the ThinkPad W540 (the latter of which I dual-boot into Ubuntu).
I'm having a few issues with the docking station on the W540, which I'm currently putting down to the docking station being faulty, as I had no such issues docking the X220.
I plan to test a different docking station in the next few days and see if the glitches go away (USB mouse sometimes needs resetting, and X loses the external monitor until it's restarted).
What is the real world impact of every host doing their own recursive lookups?
The popular TLD nameservers would receive a lot of traffic but the amount depends on average TTLs in the wild — metrics which hopefully someone here will be knowledgeable about.
Presumably handling the traffic is nothing compared to handling www.google.com, but the TLDs can’t afford the same scale of infra that FAANG can?
Several billion people resolving their favourite websites every 86400 seconds — it’s either a lot, or one of those things the C10k folks love to show can be handled by a Pentium III.
There's been several articles and HN posts about chromium's nxdomain/intranet detection feature and its impact on root DNS traffic that may give a hint.
A lot of this depends on where we want to go from here, too. If you use traditional UDP DNS, it's one of those things the C10k folks love to show can be handled by a Pentium III. You could probably get a single modern server to handle more than a billion UDP DNS requests per second.
But if you want to add a TLS handshake to each request and that sort of thing, not so much.
I've spent hours figuring out DNS-related issues in Ubuntu and always wonder how such a basic part of the OS can become such a huge mess... impossible for average (non-techie) people to figure out.
I like to use Linux network namespaces for fun and profit. You can, for example, run a program with its own IP/network.
Network namespaces don't, however, work with network managers, systemd-resolved, et al., because network namespaces get separate, per-namespace mounts of /etc/resolv.conf.
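The mechanism, as I understand it: `ip netns exec` bind-mounts files from /etc/netns/NAME/ over /etc/, so each namespace can see its own resolv.conf (the namespace name and address here are examples):

    $ ip netns add blue
    $ mkdir -p /etc/netns/blue
    $ echo 'nameserver 1.1.1.1' > /etc/netns/blue/resolv.conf
    $ ip netns exec blue curl https://example.com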
This article offers the wrong advice in disabling resolvconf. You shouldn't be doing that, since it's resolvconf that acts as an arbiter of resolv.conf writes. rdnssd in Debian, unfortunately, bypasses it, and that's a bug in the package.
Oh, I had that fun once when Ubuntu moved to the systemd resolver: I'd managed to get two or three programs conflicting. I think I used `lsof` to track which programs were opening resolv.conf.
Yeah, it does actually do its job pretty well in a large number of circumstances (especially compared to its competition). But if you don't have the problem it's solving (which is basically the problem of dealing with dynamically connecting to a wide variety of different networks, i.e. a laptop), then there are nicer options (there's basically no reason to ever use it on a server, for example).
I ended up writing this series to exorcise the demons:
https://zwischenzugs.com/2018/06/08/anatomy-of-a-linux-dns-l...
https://zwischenzugs.com/2018/06/18/anatomy-of-a-linux-dns-l...
https://zwischenzugs.com/2018/07/06/anatomy-of-a-linux-dns-l...
https://zwischenzugs.com/2018/08/06/anatomy-of-a-linux-dns-l...