The DNS prefetching was exactly what I was thinking of. With GIS, if my site sho...

jcr · on Aug 30, 2011

You are correct in a general sense but it seems you're stretching the truth a little bit. Since you only get three octets from GIS, it is impossible to know "exactly" how many people visit your site. As for the "from where," you could get a rough idea of location through GeoIP on three octets, but the result would be generalized. The generalized data would still be useful, but it would be lacking in resolution and reliability compared to GeoIP on the full four octet IP address.
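To make the "three octets" point concrete, here is a minimal sketch of what the truncation amounts to. It assumes the truncation is equivalent to keeping the /24 network; the function name `truncate_to_three_octets` is made up for illustration and is not taken from the draft.

```python
import ipaddress


def truncate_to_three_octets(addr: str) -> str:
    """Return the /24 network containing addr (illustrative helper, not from the draft)."""
    # strict=False lets us pass a host address and have the host bits zeroed out.
    network = ipaddress.ip_network(f"{addr}/24", strict=False)
    return str(network)


def addresses_sharing_prefix(addr: str) -> int:
    """How many IPv4 addresses are indistinguishable after truncation."""
    return ipaddress.ip_network(f"{addr}/24", strict=False).num_addresses


print(truncate_to_three_octets("123.45.67.89"))  # 123.45.67.0/24
print(addresses_sharing_prefix("123.45.67.89"))  # 256
```

Every address in the same /24 collapses to the same value, which is why visitors can't be counted "exactly" from the truncated form alone.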

The approaching exhaustion of IPv4 in the coming years, and how, in practice, it is handled could make a real mess of GIS. If your ISP starts handing out IPv4 addresses in the private address space to customers and does transparent PNAT, then GIS breaks badly for all customers of said ISP. In the case of large ISPs, GIS could actually make things slower.

The part I have no clue about is how GIS works with IPv6. I haven't read the IETF draft, so I'll just shut up and hope someone more knowledgeable chimes in here.

jcrites · on Aug 30, 2011

> The generalized data would still be useful, but it would be lacking in resolution and reliability compared to GeoIP on the full four octet IP address.

How is that? ARIN doesn't delegate IP address space to users or ISPs in smaller segments than 1024K IP addresses. So it seems that 3 octets is enough to map to a physical location. How does the additional octet give you additional geolocational abilities?

jcr · on Aug 30, 2011

Good question. The answer is in understanding the details. The RIRs (Regional Internet Registries: ARIN, RIPE, APNIC, ...) do allocate large blocks as you state, but those large blocks are divided into subnets. When you realize the subnets have routers and routers often provide their GPS coordinates, you can see how geolocation can become more accurate with more address bits. That's just one of the ways. Another is that subnet assignments are often public, and the company/organization holding the assignment has a known location. Still another approach is looking up locations based on AS/ASN. And yet another is GPS reporting (think mobile Android/iOS). There are probably other ways that I don't know. The important part to realize is how all of the various methods are both employed and combined to build out geolocation databases. Geolocation by IP is far from perfect, but often it can be surprisingly accurate.

pwaring · on Aug 31, 2011

Even with the full IP address, GeoIP won't necessarily tell you much. My location has been reported as York, Cambridge, City of London and the Netherlands, all of which are 100+ miles away from my actual location.

michaelcampbell · on Aug 30, 2011

I'm a network neophyte, so go slow, but can you explain how?

Is it this?

"Basically, when your browser makes a DNS request, the DNS server will now forward the first three octets (123.45.67) of your IP address to the target web service."

So say you search for something on google; google returns its search results page, your browser gets the page, looks at all the links, asks DNS for the IP's to all those links' addresses, and DNS auto-sends YOUR (truncated) IP to all those addresses' servers?

I guess I'm unclear on why it would do that. If the truncated IP coming to a CDN isn't coming with an actual request, how do they know that at some time later your actual request is from your truncated IP? (I also don't understand why a CDN would use some sort of DNS address as a geolocation strategy, but I guess that's another discussion.)

jcr · on Aug 30, 2011

> So say you search for something on google; google returns its search results page, your browser gets the page, looks at all the links, asks DNS for the IP's to all those links' addresses, and DNS auto-sends YOUR (truncated) IP to all those addresses' servers?

Yes.

> I guess I'm unclear on why it would do that.

The DNS prefetching done by the browser exists to save your time. Instead of waiting to do a DNS lookup until you click on a link in the current page, the browser does DNS lookups on all links in the page as soon as the page is loaded. By the time you're done deciding which link to follow, the browser is already done with the initial step required to follow any link on the page.
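A rough sketch of the idea, in case it helps. This is not how any particular browser implements it (real browsers do this asynchronously and with limits); the names `LinkHostCollector` and `prefetch_dns` are invented for the example, and the resolver is passed in so the sketch can be tried without touching the network.

```python
import socket
from html.parser import HTMLParser
from typing import Callable, Dict
from urllib.parse import urlparse


class LinkHostCollector(HTMLParser):
    """Collects the hostnames of all <a href="..."> links in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                host = urlparse(value).hostname  # None for relative links
                if host:
                    self.hosts.add(host)


def prefetch_dns(html: str,
                 resolve: Callable[[str], str] = socket.gethostbyname) -> Dict[str, str]:
    """Resolve every linked hostname up front and return a host -> IP cache."""
    collector = LinkHostCollector()
    collector.feed(html)
    cache: Dict[str, str] = {}
    for host in sorted(collector.hosts):
        try:
            cache[host] = resolve(host)
        except OSError:
            # Lookup failed; a later click would simply retry the lookup.
            pass
    return cache
```

When a link is eventually clicked, its address is already in the cache, so the lookup step is skipped.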

> If the truncated IP coming to a CDN isn't coming with an actual request, how do they know that at some time later your actual request is from your truncated IP? (I also don't understand why a CDN would use some sort of DNS address as a geolocation strategy, but I guess that's another discussion.)

You seem to have misread the description.

A CDN is a group of multiple servers and all of them could, in theory, respond to your request for a specific web page. The servers in the group are spread out all over the globe, but all of them share the same domain name. When you look up the IP address of the shared domain name, this new GIS draft sends your truncated IP address to the DNS server of the CDN so it can choose the server in the group that is "closest" to you.

I hope that makes more sense.

michaelcampbell · on Aug 31, 2011

> The DNS prefetching done by the browser exists to save your time. Instead of waiting to do a DNS lookup until you click on a link in the current page, the browser does DNS lookups on all links in the page as soon as the page is loaded. By the time you're done deciding which link to follow, the browser is already done with the initial step required to follow any link on the page.

My apologies; I was unclear. I (think I) get the DNS prefetching idea (your browser asks DNS for all the IP's on a page in the hope that one will be hit, and it won't have to spend time to do it later when a link is actually clicked), but why would DNS send anything to the site that it's getting an address for? (And under what protocol?)

When my browser asks DNS for an IP for "www.foo.com", why does "www.foo.com" need to know I asked for it?

sciurus · on Aug 30, 2011

I think the phrase "target web service" from the article is misleading. This is about passing part of the client's IP address to the authoritative nameserver for the target web service. From my understanding, the following is an example. Let's say I'm on the east coast, I'm using Google DNS on the west coast as my DNS server, and I want to load foo.edgecast.com. foo.edgecast.com has two servers, one on the east coast and one on the west coast. When I perform a lookup for foo.edgecast.com, I talk to Google's recursive resolver, which then talks to Edgecast's authoritative nameserver. Without EDNS, Edgecast doesn't get any information about me; it just knows the request came from Google on the west coast, so it gives out the IP address of its west coast server. With EDNS, Edgecast gets enough of my IP address from Google to know that I'm on the east coast, so it gives out its east coast server's IP address.
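Here is a toy model of that flow, as a sketch only. All addresses are from the reserved documentation ranges, the "geolocation database" is a two-entry dict, and every name (`REGION_BY_PREFIX`, `authoritative_answer`, `recursive_lookup`) is hypothetical; no real DNS traffic is involved.

```python
import ipaddress
from typing import Optional

# Hypothetical web servers for the CDN hostname.
SERVER_BY_REGION = {"east": "192.0.2.10", "west": "192.0.2.20"}

# Toy geolocation database: prefix -> region (made-up data).
REGION_BY_PREFIX = {
    ipaddress.ip_network("203.0.113.0/24"): "east",   # the client's network
    ipaddress.ip_network("198.51.100.0/24"): "west",  # the recursive resolver's network
}


def region_of(addr_or_net: str) -> str:
    """Map an address or prefix to a region using the toy database."""
    net = ipaddress.ip_network(addr_or_net, strict=False)
    for prefix, region in REGION_BY_PREFIX.items():
        if net.subnet_of(prefix):
            return region
    return "west"  # arbitrary default for unknown networks


def authoritative_answer(resolver_ip: str, client_subnet: Optional[str] = None) -> str:
    """The CDN's nameserver: use the client subnet if present, else the resolver's IP."""
    hint = client_subnet if client_subnet is not None else resolver_ip
    return SERVER_BY_REGION[region_of(hint)]


def recursive_lookup(client_ip: str, resolver_ip: str, forward_subnet: bool) -> str:
    """The recursive resolver: optionally forwards the client's first three octets."""
    subnet = None
    if forward_subnet:
        subnet = str(ipaddress.ip_network(f"{client_ip}/24", strict=False))
    return authoritative_answer(resolver_ip, subnet)


# East-coast client using a west-coast resolver:
print(recursive_lookup("203.0.113.45", "198.51.100.53", forward_subnet=False))  # 192.0.2.20 (west)
print(recursive_lookup("203.0.113.45", "198.51.100.53", forward_subnet=True))   # 192.0.2.10 (east)
```

Note that only the nameserver function ever sees the truncated client address; the web servers themselves are just values in the answer.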

michaelcampbell · on Aug 31, 2011

AAaaahhhh... THIS makes sense. SO it's not the actual web server that's getting my truncated IP, it's the web server's provider's NAMESERVER. So if I'm hosting a website, but not its nameserver (say I'm using godaddy or whathaveyou for that), only godaddy's nameserver would get the truncated IP if my site shows up on google's search page; not my actual web server.

jpablo · on Aug 30, 2011

The direction browsers are going is to prefetch the entire page ahead of time anyway, so this is a moot point.

mike-cardwell · on Aug 30, 2011

Firefox users:

RequestPolicy users can disable Link prefetching and DNS prefetching from the Advanced tab in the RequestPolicy preferences. Everyone else can search for "prefetch" in about:config and do it from there.

Also, prefetching is disabled by default if the page containing the link is opened over HTTPS.