Detecting Tor Communication in Network Traffic

moxie · on April 6, 2013

A few years ago, I wrote a client (tortunnel) that connects directly to Tor exit nodes and pretends to be a full circuit, allowing you to use Tor exit nodes as one-hop proxies. I'm sure that things have changed since then, but I got to know the protocol while writing that, and my recollection was that Tor traffic would be easy to detect:

1) Most people aren't using bridge nodes, and are instead connecting to Tor nodes publicly listed in the consensus. No DPI necessary.

2) Tor clients/servers used to signal their intent by including "special" cipher suite combinations in the TLS handshake. IIRC, they later switched to doing a "normal" looking TLS handshake with an immediate TLS renegotiation once the outer handshake was complete. That's a very distinctive traffic pattern.

3) All writes were translated into cells that were padded to 512 bytes. So by design, all Tor traffic looks the same.

4) The circuits were much longer lived than a standard TLS connection.
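
Point 1 can be sketched as a simple lookup against relay addresses taken from a consensus document. This is a minimal Python sketch, not how any particular tool does it. It assumes router status lines start with `r ` and end with the IP, ORPort, and DirPort fields; the function names and the sample data (documentation-range addresses) are hypothetical.

```python
def parse_relays(consensus_text):
    """Collect (ip, or_port) pairs from router status ("r ") lines."""
    relays = set()
    for line in consensus_text.splitlines():
        if not line.startswith("r "):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        # Assumption: the last three fields are IP, ORPort, DirPort.
        ip, or_port = fields[-3], fields[-2]
        if or_port.isdigit():
            relays.add((ip, int(or_port)))
    return relays


def is_known_relay(dst_ip, dst_port, relays):
    """True if the connection's destination is a publicly listed relay."""
    return (dst_ip, dst_port) in relays


# Made-up sample in the assumed format; not real relay data.
SAMPLE = """\
r ExampleRelay AAAAAAAAAAAAAAAAAAAAAAAAAAA BBBBBBBBBBBBBBBBBBBBBBBBBBB 2013-04-06 12:00:00 192.0.2.10 9001 9030
s Fast Running Valid
r OtherRelay CCCCCCCCCCCCCCCCCCCCCCCCCCC DDDDDDDDDDDDDDDDDDDDDDDDDDD 2013-04-06 12:00:00 198.51.100.7 443 0
"""
RELAYS = parse_relays(SAMPLE)
```

A lookup like this needs no packet inspection at all, which is the point being made: only the destination address and port of each connection are examined. It would not catch bridges, which are not listed in the consensus.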

My sense was that Tor was originally designed to use TLS for innocuous egress filtering compatibility, rather than for explicit censorship resistance.

peripetylabs · on April 6, 2013

Yes, at the moment points 1 and 3 are enough to distinguish Tor traffic from other HTTPS traffic. They're working on a few solutions though:

https://www.torproject.org/docs/pluggable-transports.html.en

apawloski · on April 6, 2013

Meh, there's a reasonably large amount of research on disguising Tor traffic nowadays. Two recent interesting papers were StegoTorus [1], which conceals Tor traffic via steganography in a way robust to the above statistical analysis, and -- more exotically -- SkypeMorph [2], which obfuscates Tor traffic to mimic Skype traffic.

[1] http://www.owlfolio.org/media/2010/05/stegotorus.pdf

[2] http://www.cypherpunks.ca/~iang/pubs/skypemorph-ccs.pdf

D9u · on April 6, 2013

I find it interesting that the article uses the term "TOR" even though the Tor FAQ clearly states the following:

Note: even though it originally came from an acronym, Tor is not spelled "TOR". Only the first letter is capitalized. In fact, we can usually spot people who haven't read any of our website (and have instead learned everything they know about Tor from news articles) by the fact that they spell it wrong.

swinglock · on April 6, 2013

They're just not doing it right, that's all. Few are. Nothing interesting about it.

D9u · on April 6, 2013

Had the author used "Tor" instead of "TOR," I wouldn't be wondering what else they may have done wrong.

mpyne · on April 7, 2013

Kind of like the story about Van Halen and brown M&M's?

runn1ng · on April 7, 2013

Also Linux should be written GNU/Linux.

ibotty · on April 7, 2013

you are, of course, right. but it's irrelevant to this point. there is a strong consensus, that it's 'Tor' and not 'TOR'. there is _no consensus at all_ on 'gnu/' or not 'gnu'.

runn1ng · on April 7, 2013

I was trying to be sarcastic, but OK.

andreyf · on April 6, 2013

tl;dr: TOR traffic looks like https, but can be detected via traffic analysis. There is more info on how traffic analysis can chip away at TOR users' anonymity in this paper: http://www.cl.cam.ac.uk/~sjm217/papers/oakland05torta.pdf

GoofballJones · on April 6, 2013

That's a rather old paper, though. It's 8 years old, and I could have sworn it was shown not to be effective in detecting current TOR traffic. From what I understand, you need massive traffic-shaping capabilities to even begin to "chip away"; it's just not a viable or efficient solution, as it takes huge resources. They're not going to devote that amount of time and money just to shut down a few nefarious drug resellers or other places.

andreyf · on April 6, 2013

Good call, I didn't notice that. Thanks for pointing it out.

pdog · on April 6, 2013

Detected but not decrypted, unless you're an exit node. TOR is still fundamentally sound, especially in the past few years.

moxie · on April 6, 2013

That depends on why you're interested in Tor. While my sense is that it was originally designed as an anonymity tool, it seems to have really exploded for use as a censorship circumvention tool.

Most of the latter types of users are probably much less interested in whether the destination website can identify them, and potentially much more interested in Tor traffic going undetected by censors.

marshray · on April 6, 2013

TL;DR: Tor is a peer-to-peer anonymity system providing TCP-like connections over SSL/TLS on TCP port 443.

Anonymity can be used for good. It can be used for evil. A botnet has been seen utilizing Tor.

There's a program called "Caploader" with a checkbox labeled "Identify protocols". Checking the box can identify traffic speaking the current vanilla Tor protocol.

X-Istence · on April 6, 2013

What specifically about TOR's TLS stream allows it to be identified as TOR traffic? The article simply says to load pcap files into the tool ...

Makes me feel like reading the article was a waste of time. I want technical details.

nitrogen · on April 6, 2013

Dead comment from h72a (brand new account; possibly double-posted and deleted the wrong one?):

h72a 14 minutes ago | link [dead]

Tor's TLS handshake exhibits a number of peculiarities which distinguish it from HTTPS. The cipher list inside the TLS client hello used to be (almost?) unique (see http://www.cs.kau.se/philwint/static/gfc/ ) and the SNI contains a random bogus domain.
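
A cipher-list check of the kind described above could be sketched roughly as follows. This is a hedged Python illustration, not the method any real tool is known to use. It assumes the client hello has already been parsed into an ordered list of cipher suite IDs, and the "known" list below is made-up placeholder data rather than an actual Tor fingerprint.

```python
import hashlib


def cipher_fingerprint(suite_ids):
    """Hash the ordered cipher suite list; order is part of the fingerprint."""
    joined = "-".join(str(s) for s in suite_ids)
    return hashlib.sha256(joined.encode("ascii")).hexdigest()


# Placeholder values for illustration only, not a real Tor cipher list.
KNOWN_FINGERPRINTS = {cipher_fingerprint([49162, 49172, 57, 56])}


def matches_known(suite_ids):
    """True if this client hello's cipher list matches a stored fingerprint."""
    return cipher_fingerprint(suite_ids) in KNOWN_FINGERPRINTS
```

Because the list order is hashed along with its contents, a client that offers the same suites in a different order produces a different fingerprint. That is what makes an unusual cipher list distinctive, and also why changing the list defeats this kind of check.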

cwu225 · on April 6, 2013

Packet sizes and inter-packet timings. This paper might pique your interest: http://cacr.uwaterloo.ca/techreports/2012/cacr2012-08.pdf . It tries to obfuscate the network traffic by morphing it so that it statistically looks like Skype traffic.

They even open sourced their code at http://crysp.uwaterloo.ca/software/CodeTalkerTunnel.html

nitrogen · on April 6, 2013

My guess is that the timing, relative sizes, and/or destinations of packets sent distinguish one from the other.

jlgaddis · on April 6, 2013

Exactly, there's nothing really useful in this article.

For 900 EUR, however, you can buy yourself a copy of their tool.

pudquick · on April 6, 2013

100% agree about the uselessness of the article.

If I were to take a stab in the dark about how the tool is doing it, though - based on their "statistical" analysis comment, my guess is they're measuring sustained traffic levels / TCP connection duration. Your average encrypted web session won't look anything like a command-and-control bot calling home over Tor to some IRC server (which is their example usage for the tool). Possibly including "known" Tor node IP addresses, as well.
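
The guess above (connection duration plus how uniform the traffic looks) can be sketched as a toy heuristic. This is a minimal Python sketch under stated assumptions, not a description of what the tool actually does. It assumes each connection is available as a list of (timestamp, payload size) records, and the thresholds and the 512-byte sample size are arbitrary placeholders.

```python
from collections import Counter


def flow_features(packets):
    """packets: list of (timestamp_seconds, payload_bytes) for one TCP connection.

    Returns (duration, share of packets that have the most common size).
    """
    times = [t for t, _ in packets]
    sizes = Counter(size for _, size in packets)
    duration = max(times) - min(times)
    dominant_share = sizes.most_common(1)[0][1] / len(packets)
    return duration, dominant_share


def looks_long_and_uniform(packets, min_duration=600.0, min_share=0.8):
    """Flag connections that are both long-lived and dominated by one size.

    The thresholds are hypothetical and would need tuning on real captures.
    """
    if not packets:
        return False
    duration, share = flow_features(packets)
    return duration >= min_duration and share >= min_share


# Synthetic examples: a long, uniform flow and a short, varied web fetch.
uniform_flow = [(i * 10.0, 512) for i in range(100)]
web_flow = [(0.0, 300), (0.5, 1460), (1.0, 900)]
```

A heuristic this crude would produce false positives on other long-lived, fixed-size protocols, so in practice it would presumably be combined with other signals such as the destination address.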

In addition, there was that Ethiopian DPI filtering project against Tor that happened last summer (https://blog.torproject.org/blog/update-censorship-ethiopia), with the Tor Project thinking they'd somehow fingerprinted some aspect of their TLS handshake. Maybe this knowledge is spreading.

rz2k · on April 6, 2013

I suppose this points out a social function that malicious software can fill.

The discovery of methods to identify TOR traffic in the pursuit of reining in malicious software should encourage the TOR network to become less easily detectable before authoritarian governments manage to shut it down more effectively.

marshray · on April 7, 2013

Seems unlikely that the antimalware scene is driving the Tor detection research. China seems to be putting lots of effort behind it for censorship purposes.

See: How Governments have tried to Block Tor https://www.youtube.com/watch?v=GwMr8Xl7JMQ

rz2k · on April 8, 2013

I was partly responding to my own irritation with the underlying premise of the article. If Tor is being used maliciously to deliver or receive an encrypted payload on your computer or network, it isn't a problem caused by Tor. Furthermore, Tor has an enormous social benefit.

In other words, my first reaction was that it is harmful to attack the technology, but I realized that is a silly argument for obscurity. Publishing a vulnerability, and having more people publicly search for vulnerabilities, is a good thing, since authoritarian actors will just exploit what they find without any disclosure.

cjbprime · on April 6, 2013

Yes, the Tor developers are working on it. See e.g. https://www.torproject.org/projects/obfsproxy.html.en

rsync · on April 7, 2013

... with US government funding.

http://cryptome.org/2012/07/tor-exits-usg-funds.htm