The privacy concerns notwithstanding, I'm puzzled how ISPs are supposed to actua...

m0nty · on Nov 25, 2016

> implement that load of bollocks

They are retrospectively making legal things which have been going on for years.

https://www.theguardian.com/uk-news/2014/jan/28/gchq-mass-su...

Programmatic · on Nov 25, 2016

But it's more than that. If it's illegal to spy, that means you can't disseminate the fruits of that spying far and wide. You need to resort to parallel construction and carefully safeguard your sources.

This allows a massive expansion in the scope of capture and use of that information to more agencies in a "legitimate" manner. At least when it was illegal they had to contain the "conspiracy" lest it get out.

Toenex · on Nov 25, 2016

Indeed, one way to devalue this information would be to swamp ISP servers with 'fake' data; hide our real activities in the noise. What we need is someone to release a modified version of Chrome/IE/Firefox that spends all your browsing downtime accessing 'dodgy' sites. If everyone started using it, this information would soon become either impossible to store or pointless, since everyone is a criminal according to the data.

3chelon · on Nov 25, 2016

Maybe people will actively sign up to be part of botnets launching DDoS attacks continuously, just to generate enough noise.

mfukar · on Nov 25, 2016

> The privacy concerns notwithstanding, I'm puzzled how ISPs are supposed to actually implement that load of bollocks.

DPI products have been doing that for years. No biggie.

The law is still stupid, but not for technical reasons, imo.

datenwolf · on Nov 25, 2016

The problem I see is not the computational power required for implementing the DPI, but the storage capacity and bandwidth required to implement retention for up to a year. Let's assume that ISPs were applying a data cap of, say, 200GiB/month. MTU for Ethernet is 1500 octets, with PPPoE it's 1492. IPv4 headers are at least 20 octets, so around 1% overhead for an optimally utilized connection. In that situation this gives about 2GiB/month of IP header data. Even if you strip that down to just the source/destination addresses (8 of those 20 octets), that would still leave you with 800MiB/(customer·month) of data. That's the bottom baseline you have to provision for.

Of course your typical TCP stream is highly redundant and even simple RLE compression will cut that. But ISPs have to provision for the worst case. Currently there are about 60M internet users in the UK.

That would amount to about 536PiB/year of retention data to be provisioned for (worst case). And even if redundancies let you compress that down in practice, that's still a lot of hard disks (about 100k HDDs) to keep around just to store the bare minimum (who with whom, but without context) of a whole country's internet traffic metadata.
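
For anyone who wants to check the arithmetic, here is a rough sketch in Python. The inputs are the figures assumed in this comment (200GiB/month cap, roughly 1% header overhead, 800MiB per customer per month, 60M users, 12 months). The 6 TB disk size is an assumption that is not stated in the thread; it is used only because it lands near the "about 100k HDDs" figure. The function names are made up for illustration.

```python
import math

MIB = 2**20
PIB = 2**50

def header_gib_per_month(cap_gib=200, header_fraction=0.01):
    """IP header volume per customer, using the ~1% overhead figure."""
    return cap_gib * header_fraction

def yearly_retention_pib(per_customer_mib=800, users=60_000_000, months=12):
    """Worst-case address-only retention for all users, in PiB."""
    return per_customer_mib * MIB * users * months / PIB

def disks_needed(total_pib, disk_bytes=6 * 10**12):
    """Disk count, assuming 6 TB drives (an assumption, not from the thread)."""
    return math.ceil(total_pib * PIB / disk_bytes)

if __name__ == "__main__":
    total = yearly_retention_pib()
    print(f"headers:   {header_gib_per_month():.1f} GiB per customer per month")
    print(f"retention: {total:.0f} PiB per year")
    print(f"disks:     {disks_needed(total)}")
```

Using the unrounded 0.8GiB (819.2MiB) instead of 800MiB gives roughly 549PiB, so the order of magnitude does not change. Redundancy, replication and compression are ignored here, as in the estimate itself.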

That's a significant investment that's expected from ISPs to be implemented in a very short timespan.

dorfsmay · on Nov 25, 2016

What about SSL?

If they don't terminate SSL a la NSA/google, then all they know is that you're talking to a lot of CDNs and cloud providers.

I guess they can try to cross-match that with your DNS queries, but that still is fairly generic.

viraptor · on Nov 25, 2016

When you make an ssl connection you're sending the domain name in the clear. They don't need to match on dns.

EvilTerran · on Nov 25, 2016

This whole business has got me wondering if it's even theoretically possible to prevent the site identity being visible to middlemen. Like you said below, even without SNI the cert is sent in the clear, and I can't think of a way around that. You'd need to somehow set up a secure channel before communicating site identity, but encryption without authentication is insecure in the face of MITM, and you need to establish site identity before you can authenticate the server.

I suppose, with IPv6, we could do away with shared-IP virtual hosting, and hence SNI at least; and perhaps we could even devise a system whereby the domain is omitted from the cleartext-transmitted handshake, say by using the IPv6 address as the cert's DN instead... but then that numeric address would serve as a surveillable site identifier, and you can still be tracked.

Is there any active research in this area? Is it provably impossible? Anyone know?

movedx · on Nov 25, 2016

HTTP over SSL/TLS? No, the domain is not visible.

The domain (hostname) you request is inside the encrypted communications between you and the remote server. Only the TCP information is visible (IP, source port, destination IP, and destination port.)

It's the DNS request which reveals the domain you requested.

viraptor · on Nov 25, 2016

Have a look at https communication in Wireshark for example. What you wrote is incorrect. Https reveals the domain at least one time these days. First, ssl extension SNI (https://en.m.wikipedia.org/wiki/Server_Name_Indication) is sent, which reveals the domain you're requesting. This happens before the keys are exchanged.

Then, the matching certificate is sent (again in plaintext) from the server so that you can verify it and extract the keys. It will contain the domain again, although it may be a partial one like *.example.com

So no, the domain is public. The full URL path is encrypted though.
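
You can see this without Wireshark, too. The following is a minimal sketch using Python's standard `ssl` module with in-memory BIOs, so no network connection is made. It captures what a client sends as its first handshake message and checks whether the hostname appears in it unencrypted. It assumes an ordinary handshake with plain SNI, as described above; `first_flight` is just an illustrative name.

```python
import ssl

def first_flight(hostname: str) -> bytes:
    """Return the bytes a TLS client sends before any keys are agreed."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    conn = ctx.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        conn.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: the client is now waiting for the server's reply
    return outgoing.read()

if __name__ == "__main__":
    hello = first_flight("example.com")
    # 0x16 is the TLS record type for handshake messages.
    print("handshake record:", hello[0] == 0x16)
    # The SNI extension carries the hostname as plain ASCII.
    print("hostname visible:", b"example.com" in hello)
```

Anything sitting on the path can do the same substring match on the first packet of the connection, which is all a DPI box needs to log the domain.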

movedx · on Nov 26, 2016

Thanks for the info! I hadn't considered some of those aspects of the connection process.

mfukar · on Nov 25, 2016

Disclaimer: I won't claim to have read the law or even to care about what happens in the UK.

From reading related articles, I get the idea its requirements can be implemented in terms of a browsing history, which could point to a date in the internet archive for all the legislator cares. Hint: that's how you compress browsing habits for > quadrillions of requests.

I don't see why one would need complete packet traces of the whole thing.

datenwolf · on Nov 25, 2016

> From reading related articles, I get the idea its requirements can be implemented in terms of a browsing history, which could point to a date in the internet archive for all the legislator cares.

Good luck doing that with a TLS secured connection. All you see is the TCP stream between the two peers. And thanks to PFS enforced on the server side you can't even go around and force people to escrow their keys.

> I don't see why one would need complete packet traces of the whole thing.

Because that's the only thing an ISP is able to see of a properly encrypted connection.

mfukar · on Nov 25, 2016

So ISPs can now be coerced, by law, to allow for MITM in TLS connections. Another reasonable expectation from buyers of DPI products.

Like I said, nothing technically absurd about this law. It is its profound disregard for privacy that we should be discussing, instead of spending our time on technical issues which are solved.

viraptor · on Nov 25, 2016

But it's useless to save the encrypted bytes, so nobody will do it. It doesn't matter that the connection is encrypted - you still get the following information:

Source IP, mapped to customer. Timestamp. Target domain (from SNI or the certificate). Passive system identification (os, browser).

The only thing they're additionally interested in is the link, and that's the only thing that encryption hides. I'm not sure they even care about cookies and headers in ICRs (Internet Connection Records).
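
To make the size of such a record concrete, here is a hypothetical sketch of what one retained entry could look like, built only from the four fields listed above. The class name, field names, sample values and the tab-separated format are all invented for illustration; nothing here reflects how any ISP or the law actually defines the record.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class ConnectionRecord:
    customer_id: str     # hypothetical: ISP's mapping from source IP to subscriber
    source_ip: str
    timestamp: datetime
    target_domain: str   # taken from SNI or the server certificate
    fingerprint: str     # passive OS/browser guess

    def to_line(self) -> str:
        """Serialize as one tab-separated log line."""
        return "\t".join([
            self.customer_id,
            self.source_ip,
            self.timestamp.isoformat(),
            self.target_domain,
            self.fingerprint,
        ])

if __name__ == "__main__":
    rec = ConnectionRecord(
        customer_id="cust-0001",
        source_ip="192.0.2.10",
        timestamp=datetime(2016, 11, 25, 12, 0, tzinfo=timezone.utc),
        target_domain="example.com",
        fingerprint="linux/firefox",
    )
    line = rec.to_line()
    print(line)
    print(len(line.encode("utf-8")), "bytes")
```

The point is that one such line is written per connection rather than per packet, which is a very different storage problem from keeping raw headers.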

anonymousab · on Nov 25, 2016

They'll get around to government mandated certs, backdoors or other such stuff eventually.

datenwolf · on Nov 25, 2016

Certificate pinning anyone?

Also, a few years ago DJB proposed to make a system's hostname the nonce of a key/signature and use secured DNS (DNSSEC or DNSCurve) as a means for establishing a web of trust; a CNAME would be used for translating www.example.com into ${NONCE}.example.com.

Since DNSSEC (and DNSCurve) allow for signature verification against a small number of root keys (ATM a single digit number), it'd be trivial to ensure an unbroken chain of trust for name resolution, which essentially completely mitigates a state level MitM attack on DNS.

So by combination of securing DNS and nonceing the hostname into TLS certificates you can throw quite a spanner into the works of state level crypto circumvention. Of course the critical problem is rolling out all the necessary protocol changes and implementation. And of course DNSSEC is used only homeopathically ATM (and yes, I'm guilty of not having implemented it for my stuff as well).
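
As a loose illustration of the idea (not DJB's actual scheme, whose details are not given here), the sketch below derives a hostname label from a hash of the server's public key and lets a client check that the key it is presented with matches the name it resolved. The choice of SHA-256 and base32, the function names and the sample key bytes are all assumptions made for the example.

```python
import base64
import hashlib
import hmac

def key_label(public_key: bytes) -> str:
    """Derive a DNS-safe label from a public key (SHA-256, base32, lowercase)."""
    digest = hashlib.sha256(public_key).digest()
    return base64.b32encode(digest).decode("ascii").rstrip("=").lower()

def pinned_name(public_key: bytes, zone: str) -> str:
    """Build the key-derived hostname a CNAME could point at."""
    return f"{key_label(public_key)}.{zone}"

def key_matches_name(hostname: str, presented_key: bytes) -> bool:
    """Check that the leftmost label of the hostname commits to this key."""
    label = hostname.split(".", 1)[0].lower()
    expected = key_label(presented_key)
    return hmac.compare_digest(label.encode("ascii", "replace"), expected.encode("ascii"))

if __name__ == "__main__":
    name = pinned_name(b"demo-public-key", "example.com")
    print(name)
    print(key_matches_name(name, b"demo-public-key"))
    print(key_matches_name(name, b"attacker-key"))
```

With a name like this, a MitM who swaps in a different key no longer matches the hostname, so the attack has to move to the DNS answer itself, which is where the signed chain of trust comes in.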

chinathrow · on Nov 25, 2016

They don't need to implement that - GCHQ has that already. One month buffer, at least. Scale to 6, done.

loup-vaillant · on Nov 25, 2016

> We're talking DPI here, applied as a dragnet on each and every connection.

Didn't Tempora¹ achieve something very similar? This law sounds eerily similar. Still stupid and ludicrous, though.

[1] https://en.wikipedia.org/wiki/Tempora

Create · on Nov 25, 2016

Create 291 days ago | parent | on: UC Berkeley profs lambast new “black box” network ...

Transparent monitoring for your protection

In keeping with this spirit, here is a reminder of how we monitor (your) CERN activities. We monitor all network Traffic coming into and going out of CERN.

Our new analysis infrastructure will be able to cope with the automatic live analysis of about one terabyte of data every day. All this data is stored for one year.

http://cds.cern.ch/journal/CERNBulletin/2016/05/News%20Artic...

datenwolf · on Nov 25, 2016

That's traffic analysis, not traffic metadata retention. The problem is not the computational load, but the storage capacity and bandwidth requirements.

And what is stored at CERN is the analysis results of the data, not the data history itself. Also it's one TiB/day in total for the whole of CERN.

Create · on Nov 25, 2016

_All_ this data is stored for one year.

datenwolf · on Nov 25, 2016

> _All_ this data is stored for one year.

Which refers to the result of the analysis. If CERN retained all the data that crosses their network, or even just the metadata, they'd have to roll in truckloads of HDDs each day.