This is pretty interesting, and I'd never heard of it before today. However, I am a little worried that this seems to allow anyone who enables this option to track all users who make DNS lookups. And it doesn't say if there's a way to disable the behavior, short of switching to a DNS server that doesn't do this.
We've thought about letting people turn it on or off, but have held off for now. We're only sending the edns-option to authoritative DNS providers who also handle the HTTP request, and thus would have seen the entire src_ip anyways.
David, I hope you'll forgive me for asking questions before I'm done
reading the draft, but I'm curious about one thing. Are there any known
effects on client-side caching? I can't think of any, but the rise in
high latency links on the last hop (particularly mobile, but to a
lesser degree WiMax, satellite, ...) is one of the places where
client-side caching (including dns caching) can make a big difference.
If you've ever seen the painfully common DNS issues on Verizon Wireless,
you'd know why I'm asking. ;)
Usually the company that now gets the first 3 octets of your IPv4 address is the same company that will get your full IP address anyway when you hit their web page.
Your statement is true, but unfortunately you're not thinking it
through completely, so you don't see the worrying implications. Most
modern web browsers do "DNS prefetch", which is a fancy way to
say domain names found within web pages are resolved to IP addresses
before they are actually needed. This means the new GIS (Global
Internet Speedup) garbage will send the first three octets of your IP
address to every evil spamming/tracking site that has a link on the page
you're viewing, whether or not you follow the link, because the browser
does a DNS prefetch.
If you run your own DNS server, then the browser DNS prefetch is
already allowing you to be tracked, and by full IP address rather than
just the first three octets. On the other hand, most people do not run
their own DNS server, so GIS is a reduction of privacy.
The DNS prefetching was exactly what I was thinking of. With GIS, if my site shows up in Google (or elsewhere), I can now track exactly how many people see my site, and from where, without them ever clicking on a link.
You are correct in a general sense but it seems you're stretching the
truth a little bit. Since you only get three octets from GIS, it is
impossible to know "exactly" how many people visit your site. As for the
"from where," you could get a rough idea of location through GeoIP on
three octets, but the result would be generalized. The generalized data
would still be useful, but it would be lacking in resolution and
reliability compared to GeoIP on the full four octet IP address.
The approaching exhaustion of IPv4 in the coming years, and how, in
practice, it is handled could make a real mess of GIS. If your ISP
starts handing out IPv4 addresses in the private address space to
customers and does transparent PNAT, then GIS breaks badly for all
customers of said ISP. In the case of large ISPs, GIS could actually
make things slower.
The part I have no clue about is how GIS works with IPv6. I haven't read
the IETF draft, so I'll just shut up and hope someone more knowledgeable
chimes in here.
> The generalized data would still be useful, but it would be lacking in resolution and reliability compared to GeoIP on the full four octet IP address.
How is that? ARIN doesn't delegate IP address space to users or ISPs in smaller segments than 1024K IP addresses. So it seems that 3 octets is enough to map to a physical location. How does the additional octet give you additional geolocational abilities?
Good question. The answer is in understanding the details. The RIRs
(Regional Internet Registries - ARIN, RIPE, APNIC, ...) do allocate large
blocks as you state, but those large blocks are divided into subnets.
When you realize the subnets have routers and routers often provide
their GPS coordinates, you can see how geolocation can become more
accurate with more address bits. That's just one of the ways. Another
is that subnet assignments are often public, and the company or
organization holding the assignment has a known location.
Still another approach is looking up locations based on AS/ASN. And
yet another is GPS reporting (think mobile Android/iOS). There are
probably other ways that I don't know. The important part to realize is
how all of the various methods are both employed and combined to build
out geolocation databases. Geolocation by IP is far from perfect, but
often it can be surprisingly accurate.
Even with the full IP address, GeoIP won't necessarily tell you much. My location has been reported as York, Cambridge, City of London and the Netherlands, all of which are 100+ miles away from my actual location.
I'm a network neophyte, so go slow, but can you explain how?
Is it this?
"Basically, when your browser makes a DNS request, the DNS server will now forward the first three octets (123.45.67) of your IP address to the target web service."
So say you search for something on google; google returns its search results page, your browser gets the page, looks at all the links, asks DNS for the IP's to all those links' addresses, and DNS auto-sends YOUR (truncated) IP to all those addresses' servers?
I guess I'm unclear on why it would do that. If the truncated IP coming to a CDN isn't coming with an actual request, how do they know that at some time later your actual request is from your truncated IP? (I also don't understand why a CDN would use some sort of DNS address as a geolocation strategy, but I guess that's another discussion.)
> So say you search for something on google; google returns its search results page, your browser gets the page, looks at all the links, asks DNS for the IP's to all those links' addresses, and DNS auto-sends YOUR (truncated) IP to all those addresses' servers?
Yes.
> I guess I'm unclear on why it would do that.
The DNS prefetching done by the browser exists to save your time.
Instead of waiting to do a DNS lookup until you click on a link in the
current page, the browser does DNS lookups on all links in the page as
soon as the page is loaded. By the time you're done deciding which link
to follow, the browser is already done with the initial step required to
follow any link on the page.
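To make the prefetch step concrete, here is a minimal sketch of what "look up every link on the page" amounts to. It is not how any particular browser implements it; the class and function names are made up for illustration, and the resolver is passed in so the sketch does not depend on a real network.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkHostCollector(HTMLParser):
    """Collects the distinct hostnames referenced by <a href=...> tags."""

    def __init__(self):
        super().__init__()
        self.hosts = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Relative links have no hostname and are skipped.
                host = urlparse(value).hostname
                if host and host not in self.hosts:
                    self.hosts.append(host)


def hosts_to_prefetch(html):
    """Return the hostnames a prefetching browser would look up for this page."""
    collector = LinkHostCollector()
    collector.feed(html)
    return collector.hosts


def prefetch(html, resolve):
    """Resolve every linked hostname up front; `resolve` is supplied by the caller."""
    return {host: resolve(host) for host in hosts_to_prefetch(html)}
```

With a real resolver you could call `prefetch(page_html, socket.gethostbyname)`. The point is that every hostname in the returned dict has triggered a DNS lookup, whether or not a link is ever clicked.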
> If the truncated IP coming to a CDN isn't coming with an actual request, how do they know that at some time later your actual request is from your truncated IP? (I also don't understand why a CDN would use some sort of DNS address as a geolocation strategy, but I guess that's another discussion.)
You seem to have misread the description.
A CDN is a group of multiple servers and all of them could, in theory,
respond to your request for a specific web page. The servers in the
group are spread out all over the globe, but all of them share the same
domain name. When you look up the IP address of the shared domain name,
this new GIS draft sends your truncated IP address to the DNS server of
the CDN so it can choose the server in the group that is "closest" to
you.
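For the curious, here is roughly what "sends your truncated IP address" looks like on the wire. This is a sketch under an assumption: the option code (8) and field layout below are those of the later standardized form of this option (RFC 7871), and the draft discussed here may differ, for example in the option code. The function name is made up.

```python
import ipaddress
import struct

# Assumption: option code and layout follow the later standardized form
# of the option (RFC 7871); the draft in this thread may use other values.
OPTION_CODE_CLIENT_SUBNET = 8


def client_subnet_option(client_ip, prefix_len=24):
    """Build the option bytes carrying only the first `prefix_len` bits of the address."""
    addr = ipaddress.ip_address(client_ip)
    family = 1 if addr.version == 4 else 2  # 1 = IPv4, 2 = IPv6
    # strict=False zeroes the host bits, e.g. 123.45.67.89/24 -> 123.45.67.0
    network = ipaddress.ip_network(f"{client_ip}/{prefix_len}", strict=False)
    # Send only as many address octets as the prefix needs.
    n_octets = (prefix_len + 7) // 8
    address = network.network_address.packed[:n_octets]
    # FAMILY (2 bytes), SOURCE PREFIX-LENGTH (1), SCOPE PREFIX-LENGTH (1, zero in queries)
    payload = struct.pack("!HBB", family, prefix_len, 0) + address
    # OPTION-CODE (2 bytes), OPTION-LENGTH (2 bytes), then the payload
    return struct.pack("!HH", OPTION_CODE_CLIENT_SUBNET, len(payload)) + payload
```

For 123.45.67.89 with a /24 prefix, only the octets 123, 45 and 67 end up in the option; the last octet never leaves the resolver.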
> The DNS prefetching done by the browser exists to save your time. Instead of waiting to do a DNS lookup until you click on a link in the current page, the browser does DNS lookups on all links in the page as soon as the page is loaded. By the time you're done deciding which link to follow, the browser is already done with the initial step required to follow any link on the page.
My apologies; I was unclear. I (think I) get the DNS prefetching idea (your browser asks DNS for all the IP's on a page in the hope that one will be hit, and it won't have to spend time to do it later when a link is actually clicked), but why would DNS send anything to the site that it's getting an address for? (And under what protocol?)
When my browser asks DNS for an IP for "www.foo.com", why does "www.foo.com" need to know I asked for it?
I think the phrase "target web service" from the article is misleading. This is about passing part of the client's IP address to the authoritative nameserver for the target web service. From my understanding, the following is an example. Let's say I'm on the east coast, I'm using Google DNS on the west coast as my DNS server, and I want to load foo.edgecast.com. foo.edgecast.com has two servers, one on the east coast and one on the west coast. When I perform a lookup for foo.edgecast.com, I talk to Google's recursive resolver, which then talks to Edgecast's authoritative nameserver. Without the EDNS option, Edgecast doesn't get any information about me; it just knows the request came from Google on the west coast, so it gives out the IP address of its west coast server. With the EDNS option, Edgecast gets enough of my IP address from Google to know that I'm on the east coast, so it gives out its east coast server's IP address.
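The east/west example can be sketched as a toy authoritative nameserver. Everything here is hypothetical: the network that counts as "east", the server addresses (taken from documentation ranges), and the function names. A real CDN would use a far richer mapping.

```python
import ipaddress

# Hypothetical data: which client networks count as "east", and one server per region.
EAST_NETWORKS = [ipaddress.ip_network("123.45.0.0/16")]
SERVERS = {"east": "192.0.2.10", "west": "198.51.100.10"}


def region_of(network):
    """Classify a network as east if it falls inside a known east-coast block."""
    if any(network.subnet_of(east) for east in EAST_NETWORKS):
        return "east"
    return "west"


def answer(resolver_ip, client_subnet=None):
    """Pick a server from the client subnet if one was sent, else from the resolver's IP."""
    if client_subnet is not None:
        seen = ipaddress.ip_network(client_subnet)
    else:
        # Without the option, the resolver's own address is all the nameserver has.
        seen = ipaddress.ip_network(resolver_ip + "/32")
    return SERVERS[region_of(seen)]
```

Calling `answer("8.8.8.8")` hands out the west coast server because only the resolver is visible, while `answer("8.8.8.8", "123.45.67.0/24")` hands out the east coast one.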
AAaaahhhh... THIS makes sense. SO it's not the actual web server that's getting my truncated IP, it's the web server's provider's NAMESERVER. So if I'm hosting a website, but not its nameserver (say I'm using godaddy or whathaveyou for that), only godaddy's nameserver would get the truncated IP if my site shows up on google's search page; not my actual web server.
RequestPolicy users can disable Link prefetching and DNS prefetching from the Advanced tab in the RequestPolicy preferences. Everyone else can search for "prefetch" in about:config and do it from there.
Also, Prefetching is disabled by default if the page containing the link is opened over HTTPS.
I was concerned about privacy, too, but heck, your IP is already logged by the primary site and logged by the CDN. This is just adding some logic to the routing.
The article exaggerates the most likely effects of this, which are a mild improvement in latency and very little in throughput. If you're in a country with very poor international connectivity, it might be possible that your 20 Mbit connection is downloading at Kbit speeds due to choosing the wrong CDN site, but that won't be the case for most people. The common case is that throughput is not really affected: typical downloads from well-provisioned sites on the wrong side of the Atlantic max out my DSL anyway. When they don't, it's almost always a problem with the particular mirror's last-mile connectivity, not with the transatlantic cable.
I think you misunderstand how this works. Poor geo-targeting results in increased latency which is directly correlated with throughput and it increases congestion.
I've never seen a significant difference in throughput on a home connection due to a US/EU mistargeting, at least, certainly not on the level of Mbits being reduced to Kbits. I just tested now, and downloading from a California mirror is able to max out my DSL quite easily (I'm in Denmark). I do get lower latency from European mirrors, but not higher throughput, because both can saturate the link.
That's the real issue - previously these DNS servers did not work well with CDNs because they didn't send the location of the client making the request to the origin server. This extension fixes that problem.
If you aren't using these DNS servers then CDNs probably route your requests properly.
I am using Google's, but that isn't really the point; the point is that in real-world usage, the example in the article of a multi-Mbit connection ending up with Kbit-level throughput due to being routed to the wrong CDN is very unlikely if you live in the U.S. or Europe. It would require either a really broken TCP stack, or gigantic satellite-internet-level latencies for latency to constrain throughput that much.
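A back-of-the-envelope way to check this: a single TCP connection can have at most one window of data in flight per round trip, so throughput is bounded by window size divided by RTT. The window sizes and RTTs below are illustrative assumptions, not measurements.

```python
def max_tcp_throughput_mbit(window_bytes, rtt_seconds):
    """Upper bound for one TCP connection: at most one window per round trip."""
    return window_bytes * 8 / rtt_seconds / 1_000_000


# Assumed figures: a 64 KiB window (no window scaling) and a 1 MiB scaled window,
# over a 150 ms transatlantic-style RTT and a 600 ms satellite-style RTT.
for window in (64 * 1024, 1024 * 1024):
    for rtt in (0.150, 0.600):
        bound = max_tcp_throughput_mbit(window, rtt)
        print(f"window={window} bytes, rtt={rtt * 1000:.0f} ms -> {bound:.2f} Mbit/s")
```

Even the unscaled 64 KiB window at 150 ms gives a bound of about 3.5 Mbit/s, which is why a mistargeted US/EU mirror usually costs latency rather than dropping a download to Kbit speeds.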
Yes, you are probably right about the throughput. The latency problems are very valid though, and make websites seem remarkably slower.
I live in Australia at the end of a very long trans-pac pipe, and properly configured CDNs make a huge difference. A great example is how a few cheap ISPs here route based on price, not latency. That meant that when Amazon opened their Singapore datacenter, a visitor from Australia (using one of these ISPs) could be routed via the US West Coast.
The title is misleading. It's not going to make the web faster for everyone. Content hosted on CDNs that don't support the proposed DNS extension (http://tools.ietf.org/html/draft-vandergaast-edns-client-ip-...), e.g. Akamai, is probably going to be slower (higher latency).
In fact, the worst case is that you will be mapped to a server that is not allowed to serve your IP. CDNs get a lot of free traffic by helping ISPs with peering imbalances, but this free traffic is restricted to specific IP blocks.
I thought the parent was trying to say that accessing content via Akamai will be slower when using Google Public DNS/OpenDNS, because Akamai doesn't support the proposed DNS extension and so the request is routed to a non-local server instead of a local one?
(Akamai is my main reason for not using OpenDNS to this day; otherwise I would have switched long ago.)
Ahh, this is possible. Not as likely in the US, but possible. We're working on it. At least it's fixed for Youtube, all Google properties, a bunch of other CDNs now.
But yeah, some big guys still remain! We'll get 'em.
Indeed. What I'm reading is, "Google Public DNS and OpenDNS are relatively centralized, meaning CDNs might deliver you content from a cluster in Great Britain rather than your home country Germany because your DNS server is located in London. Now, you will get content from Germany again." The article makes it out to be something fantastic and exclusive to these services when the news is actually that these services used to have a big problem in this regard, and now they're working to fix it.
While this is true for most consumer users who use their provider's DNS, for networks such as enterprise/corporate locations it's common to find that they all share one internal DNS resolver, which may be in a single physical location while the clients are spread all over the world.
I think this does give a bit more power to the geo-aware DNS services (such as Cisco GSS), if they implement it. But it's a long way off, as many ISPs and networks would need to upgrade their resolvers to support this extension.
I have found setting custom DNS breaks many coffeeshop and airport wifi hotspots, and so I don't recommend it for the general user. I can't imagine there are that many people using Google's DNS or OpenDNS.
You could change the DNS settings on your home router, so when your machines pull the DHCP data you will receive the modified DNS. But when out and about, everything will work as expected.
The latter; e.g. I'd assume a competent ISP with a large customer presence on the west coast would have DNS servers for their customers on the west coast, or just use anycast...
Yeah, not sure. I think most ISPs use just a couple of DNS servers for their entire network -- or at least that's the case here in the UK. But as you say, anycast might mean that those two IP addresses are actually represented by many servers.
I guess we'll need the input of someone who works at an ISP :)
There is a good reason why this feature is not part of the official DNS spec, as it breaks DNS caching: Once you cache results those three octets are pretty useless.
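One way to picture the caching tension is a resolver cache keyed by name plus client subnet instead of name alone. This is only a toy sketch under stated assumptions (IPv4 only, a fixed /24, no TTLs, made-up names), not how any real resolver does it.

```python
import ipaddress


class SubnetAwareCache:
    """Toy resolver cache keyed by (name, client /24) instead of name alone."""

    def __init__(self):
        self.entries = {}
        self.upstream_queries = 0

    def lookup(self, name, client_ip, ask_authoritative):
        # Assumption: IPv4 clients and a fixed /24, matching the "three octets" above.
        subnet = ipaddress.ip_network(client_ip + "/24", strict=False)
        key = (name, subnet)
        if key not in self.entries:
            # Cache miss for this subnet: one more query to the authoritative server.
            self.upstream_queries += 1
            self.entries[key] = ask_authoritative(name, subnet)
        return self.entries[key]
```

Two clients in the same /24 share an entry, but every new /24 costs another upstream query for the same name, so the cache still works but holds more entries and hits less often than a name-only cache.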
No. ChromeOS is actually Gentoo Linux (surprise! Didn't you know?) and it uses a relatively standard method of grabbing DNS nameservers from DHCP.
You can't even change this without mucking around in the internal read-only filesystem. You can certainly assign new DNS nameservers after you've DHCP'd, but in that case, why not pick faster DNS servers than Google's?