CSRF protection bypass due to Google analytics and weird server cookie parsing (hackerone.com)
97 points by amenghra on May 4, 2015 | 39 comments



It might be interesting to some people that the "peculiarity" noted in some web servers is actually requested by the standard. It is noted in both the grammar and in the text (where it is described as for "future compatibility").

> cookie = "Cookie:" cookie-version 1*((";" | ",") cookie-value)

> Note: For backward compatibility, the separator in the Cookie header is semi-colon (;) everywhere. A server should also accept comma (,) as the separator between cookie-values for future compatibility.

https://www.ietf.org/rfc/rfc2109.txt

(I believe part of the issue was that the MIME definition of headers describes how to deal with multiple headers with the same name, namely by concatenating them with commas, and that spec was the basis for the definition of HTTP headers.)
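To make the consequence of that grammar concrete, here is a minimal sketch in Python (not any particular server's parser) comparing a lenient parser that accepts both separators with a strict one that splits on semicolons only. The cookie names `csrf` and `tracker` and the header value are made up for illustration.

```python
import re


def parse_cookie_header_lenient(header_value: str) -> dict:
    """Split a Cookie header value on both ';' and ',' (RFC 2109 style)."""
    cookies = {}
    for part in re.split(r"[;,]", header_value):
        name, sep, value = part.strip().partition("=")
        if sep and name:
            cookies[name] = value  # a later duplicate overwrites the earlier one
    return cookies


def parse_cookie_header_strict(header_value: str) -> dict:
    """Split on ';' only, so a comma stays inside the cookie value."""
    cookies = {}
    for part in header_value.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep and name:
            cookies[name] = value
    return cookies


# Hypothetical header: the tracking cookie's value contains an
# attacker-influenced comma followed by what looks like another cookie.
header = "csrf=legit; tracker=abc,csrf=evil"

print(parse_cookie_header_lenient(header))  # csrf ends up as "evil"
print(parse_cookie_header_strict(header))   # csrf stays "legit"
```

With the lenient parser, whoever controls part of one cookie's value can effectively set another cookie, which is the behaviour the report relies on.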


If I understand it correctly, the CSRF protection bypass is just an interesting consequence of the real culprit, which is an injection vulnerability in the data serialization format of the GA __utmz cookie.
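A rough sketch of what closing that kind of injection could look like on the producer side, assuming a pipe-delimited `key=value` layout. The layout and field names below are hypothetical and are not the real `__utmz` format. The idea is simply to percent-encode every value before it is embedded, so delimiters such as `,`, `;` and `|` cannot appear in the serialized string.

```python
from urllib.parse import quote, unquote


def serialize_fields(fields: dict) -> str:
    """Join key=value pairs with '|' after percent-encoding every value."""
    # safe="" means even '/', ',', ';', '|' and '=' are encoded.
    return "|".join(
        f"{key}={quote(str(value), safe='')}" for key, value in fields.items()
    )


def deserialize_fields(serialized: str) -> dict:
    """Reverse serialize_fields: split on '|' and decode each value."""
    result = {}
    for pair in serialized.split("|"):
        key, _, value = pair.partition("=")
        result[key] = unquote(value)
    return result


# Hypothetical attacker-influenced path taken from a referrer.
fields = {"source": "example.com", "path": "/x,csrf=evil; y|z"}
encoded = serialize_fields(fields)
print(encoded)
print(deserialize_fields(encoded) == fields)
```

The sketch assumes the keys themselves are fixed by the application and never attacker-controlled.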


I wonder which party should pay the bounty in this case. He seems to have reported it to Twitter, however wouldn't it be Google's responsibility to fix this?

Seeing as how it's a combination of an issue in the web server parsing cookies, and the client setting malicious cookies, I'm not sure which party should be responsible for the bounty.


They should both pay a bounty. But if you had to assign blame only one place it would be ECMAscript. Setting the value of an unintended cookie should not be so easy to do by accident.


Nitpick: strictly speaking setting cookies is not part of ECMAScript, it's part of the DOM api.


Cookies are, without a doubt, the worst "API" in the DOM. Here's the entire API: document.cookie

That's it. That one property. To add cookies, modify cookies, delete cookies, whatever, you do an assignment. To read all the cookies, you do a read.

That's it. That's the API.


This is correct, and it's a source of madness.

One more thing: Cookies are limited to roughly 4KB each in practice. The limit comes from the cookie specifications and browser implementations rather than from HTTP/1.1 itself.


Hi, I am the author of this report.

I sent information about this vulnerability to: 1) Google (2 fixes in Google Analytics) 2) Django / Python (https://hg.python.org/cpython/rev/270f61ec1157) 3) Twitter (changed CSRF protection) 4) Instagram [Facebook] (sent me to Django)


You are visiting this page because we detected an unsupported browser. Your browser does not support security features that we require. We highly recommend that you update your browser. If you believe you have arrived here in error, please contact us. Be sure to include your browser version.


I encounter those types of messages a lot due to my unusual and changing user-agent. My usual response is to click the back button since I can usually find the information on another site.


What browser are you using?


Same issue here.

Google Chrome Version 42.0.2311.135 m on Windows 7 64bit


This vulnerability would appear to be mitigated by using a CSRF secret in a server-side session rather than solely cookie based.

For example, the cookie carries SHA1(salt + secret); the CSRF-TOKEN header value is then taken, the salt extracted from it, hashed with the server-side secret, and the result compared.
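A minimal sketch of that idea in Python, under a few assumptions: the secret lives only in the server-side session, the token sent to the client is `salt:digest`, and HMAC-SHA256 is swapped in for the plain SHA1(salt + secret) mentioned above. The function names are illustrative.

```python
import hashlib
import hmac
import secrets


def issue_token(session_secret: bytes) -> str:
    """Create a fresh salted token derived from the server-side secret."""
    salt = secrets.token_hex(16)
    digest = hmac.new(session_secret, salt.encode(), hashlib.sha256).hexdigest()
    return f"{salt}:{digest}"


def verify_token(session_secret: bytes, token: str) -> bool:
    """Recompute the digest from the salt in the token and compare."""
    salt, sep, digest = token.partition(":")
    if not sep:
        return False
    expected = hmac.new(session_secret, salt.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes avoids timing leaks.
    return hmac.compare_digest(expected.encode(), digest.encode())


secret = secrets.token_bytes(32)  # would be stored in the server-side session
token = issue_token(secret)
print(verify_token(secret, token))
```

Because the secret never leaves the server, overwriting a cookie in the browser does not help an attacker produce a token that verifies.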


Solely cookie based CSRF is the only way to do logged-out csrf. Once you are logged-in, you want to do session based to prevent a MITM-over-http from doing a CSRF on the https connection.

HSTS partially prevents the MITM-over-http, but not 100%, since the person doing the MITM potentially controls the clock (NTP is unauthenticated). The exception is if you do HSTS by whitelisting the domain in the browser.

Security is hard :(


Logged-out users could have sessions too, potentially. (Although logged out users don't need CSRF protection on many websites)


Logged-out csrf is an issue if you are serving as an oauth provider. There might be other use cases that I haven't thought about. I would err on the side of caution and implement logged-out csrf.

If you create a session for the logged-out user, you must be careful about not recycling the same session once logged in or you expose yourself to session fixation.

http://stephensclafani.com/2011/04/06/oauth-2-0-csrf-vulnera... and http://homakov.blogspot.com/2014/01/two-severe-wontfix-vulne... explain the issue pretty well.


What's the advantage of still having the cookie with some value in it when we use a random token as a hidden form value plus this token being stored on the server? Your suggestion seems overly complex to me, so maybe I'm missing an attack vector that can be mitigated by throwing salts, hashing and a cookie at the issue.


There are a TON of different CSRF schemes and once you have billions of requests per day, storing and distributing each CSRF token for each user can be a large burden.

Most advanced CSRF schemes use some crypto primitives in a manner like client-side sessions work (Enc+Mac on the server, give token to user, Validate Integrity+Dec on the server when you get it back), or HMAC some user meta-data. This allows you to distribute the key and then any server with the key can validate the token without needing to know the token in advance.
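One way the "HMAC some user meta-data" variant might look, as a hedged Python sketch: the token is a timestamp plus an HMAC over the user id and that timestamp, so any server holding the shared key can validate it without storing tokens. The function names, token layout and one-hour lifetime are assumptions for illustration.

```python
import hashlib
import hmac
import time


def make_token(key: bytes, user_id: str, now=None) -> str:
    """Return 'issued_at.mac' where mac = HMAC(key, user_id|issued_at)."""
    issued_at = int(time.time() if now is None else now)
    # Assumes user_id never contains '|', otherwise the message is ambiguous.
    message = f"{user_id}|{issued_at}".encode()
    mac = hmac.new(key, message, hashlib.sha256).hexdigest()
    return f"{issued_at}.{mac}"


def check_token(key: bytes, user_id: str, token: str, max_age=3600, now=None) -> bool:
    """Validate the MAC and the age of a token without any stored state."""
    issued_text, sep, mac = token.partition(".")
    if not sep or not (issued_text.isascii() and issued_text.isdigit()):
        return False
    current = int(time.time() if now is None else now)
    if current - int(issued_text) > max_age:
        return False  # token expired
    message = f"{user_id}|{issued_text}".encode()
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), mac.encode())


key = b"k" * 32  # placeholder; a real key would be random and kept secret
token = make_token(key, "alice")
print(check_token(key, "alice", token))
print(check_token(key, "bob", token))
```

The trade-off is that tokens cannot be revoked individually; they simply expire, or all of them become invalid when the key is rotated.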

I've written a blog post about a bunch of types of CSRF schemes and their pros and cons, but never published it. Is this interesting enough for me to publish?


> I've written a blog post about a bunch of types of CSRF schemes and their pros and cons, but never published it. Is this interesting enough for me to publish?

Definitely. Having looked myself, I can tell you there aren't many resources online that get this right.


It is. I want to finally understand why and how I can secure a login form against CSRF (i.e. when I don't yet have a session for the visitor).


How we do it is a little wonky. Our web app calls GET /login which returns a CSRF cookie and an anonymous session cookie and an object that says there's no active session. The app then uses that CSRF token to POST to /login. Once the auth is successful, the anonymous session is destroyed and a new CSRF secret is generated on the authenticated session.
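Roughly, the flow described above could be sketched like this in Python, with an in-memory dict standing in for the real session store. The function names and the `password_ok` flag are placeholders, and the actual cookie handling is left out.

```python
import hmac
import secrets

SESSIONS = {}  # session_id -> {"user": name or None, "csrf": token}


def start_anonymous_session():
    """What GET /login would do: hand out an anonymous session and CSRF token."""
    session_id = secrets.token_urlsafe(32)
    SESSIONS[session_id] = {"user": None, "csrf": secrets.token_urlsafe(32)}
    return session_id, SESSIONS[session_id]["csrf"]


def login(session_id, csrf_token, username, password_ok):
    """What POST /login would do. Returns (new_session_id, new_csrf) or None."""
    session = SESSIONS.get(session_id)
    if session is None:
        return None
    if not hmac.compare_digest(session["csrf"].encode(), csrf_token.encode()):
        return None  # CSRF token does not match the anonymous session
    if not password_ok:
        return None
    # Destroy the anonymous session and issue a fresh id and CSRF secret,
    # so a session id planted before login is useless afterwards.
    del SESSIONS[session_id]
    new_id = secrets.token_urlsafe(32)
    SESSIONS[new_id] = {"user": username, "csrf": secrets.token_urlsafe(32)}
    return new_id, SESSIONS[new_id]["csrf"]


sid, csrf = start_anonymous_session()
print(login(sid, "wrong-token", "alice", True))  # rejected
print(login(sid, csrf, "alice", True) is not None)
```

Replacing the session id at login is also what guards against session fixation.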


We serve static HTML and the site is driven entirely by JavaScript, with all communication with the server being solely through a REST API. There's no mechanism to add a value to a form field, thus we use a cookie.

Really the difference is that you compare the user's response to a known value on the server rather than two values in the same request body which cannot be independently verified.


How do you bind the value stored on the server to the logged-in user? If it's a global value (same for all users), you aren't correctly protecting against CSRF. If it's tied to the logged-in user, storing a value is essentially the same as deriving a value using HMAC (or some other one way function).


The CSRF token is generated on login and then stored in the user's session. We accept the risk of not having a per-form token for pure developer/user convenience reasons.


This is exactly what the parent suggests doing.

Keep in mind that if you don't change the client's view of the token on every page load (using some kind of salt), you are potentially vulnerable to compression side-channel attacks such as CRIME/BREACH.
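A small Python sketch of one way to change the client's view of the token on every page load: XOR the stored token with a fresh random pad and send the pad alongside it. The stored token stays the same, but the bytes that appear in each response differ. The function names and the encoding are assumptions, not any framework's actual scheme.

```python
import base64
import secrets


def mask_token(token: bytes) -> str:
    """Return a per-response encoding of token: base64(pad + (pad XOR token))."""
    pad = secrets.token_bytes(len(token))
    masked = bytes(a ^ b for a, b in zip(pad, token))
    return base64.urlsafe_b64encode(pad + masked).decode()


def unmask_token(masked_token: str) -> bytes:
    """Recover the stored token from a value produced by mask_token."""
    # Raises binascii.Error on malformed input; callers should treat that
    # as a failed check.
    raw = base64.urlsafe_b64decode(masked_token.encode())
    half = len(raw) // 2
    pad, masked = raw[:half], raw[half:]
    return bytes(a ^ b for a, b in zip(pad, masked))


stored = secrets.token_bytes(32)  # the per-session token kept on the server
first = mask_token(stored)
second = mask_token(stored)
print(first != second)                 # different on every page load
print(unmask_token(first) == stored)   # but both unmask to the same token
```

On submission the server unmasks the value and compares it with the stored token using a constant-time comparison.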


The comments mention this is also a problem with Django's CSRF cookie when using GA?


The default Django CSRF protection is based on a hidden field, not cookies. Thus, the default mechanism does not have this flaw.

One can always replace it with something that is vulnerable, so there are probably Django sites out there with this problem. But if you never thought deeply about it, your site is fine... That's a feature that I really love about Django: you'll see that same phrase, "if you never thought deeply about it, your site is fine", for lots and lots of security problems.

What I don't get is how it is possible to do CSRF protection with cookies. Aren't cookies always shared between simultaneous browser sessions? Isn't the entire point of CSRF protection to avoid that?


> The default Django CSRF protection is based on a hidden field, not cookies.

This is not true. The default Django CSRF protection uses cookies for the session store[1].

> What I don't get is how is it possible to do CSRF with cookies?

The same as any other CSRF mechanism :)

1. You provide the user a (preferably per-request vs. BREACH) token, often in a hidden form field. The field being hidden is primarily for UX reasons.

2. You store a copy of that token in a session store. This is often a cookie due to the convenience of cookies (no server state required), but can be in a server-side store (memory, Redis, RDBMS, et al.).

3. Upon any unsafe, state-changing HTTP method (anything that's not GET/HEAD/OPTIONS) you compare the token in the submitted form to the value stored in the session.

The benefit is that you don't have to maintain server state. The downside is that you're transmitting your token over a channel at risk of MITM. Using authenticated cookies helps, as any attempt to modify both the form value AND cookie value (i.e. so they match) should fail when you verify the MAC on form submission.
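The three steps plus the MAC check could be sketched like this in Python. It is a simplified illustration with made-up function names, not Django's implementation: the cookie carries `token.mac`, and a request is accepted only if the MAC verifies and the form token equals the token in the cookie.

```python
import hashlib
import hmac
import secrets


def sign_cookie(key: bytes, token: str) -> str:
    """Build the cookie value 'token.mac' with mac = HMAC(key, token)."""
    mac = hmac.new(key, token.encode(), hashlib.sha256).hexdigest()
    return f"{token}.{mac}"


def verify_request(key: bytes, cookie_value: str, form_token: str) -> bool:
    """Accept only if the cookie's MAC is valid and the form token matches."""
    token, sep, mac = cookie_value.rpartition(".")
    if not sep:
        return False
    expected = hmac.new(key, token.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected.encode(), mac.encode()):
        return False  # cookie was forged or overwritten without the key
    return hmac.compare_digest(token.encode(), form_token.encode())


key = secrets.token_bytes(32)       # server-side signing key
token = secrets.token_urlsafe(32)   # value placed in the hidden form field
cookie = sign_cookie(key, token)

print(verify_request(key, cookie, token))
# An attacker who overwrites the cookie cannot produce a valid MAC:
print(verify_request(key, "evil." + "0" * 64, "evil"))
```

Note that this sketch does not bind the token to a particular user, so it only covers the "overwrite both values so they match" case discussed above.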

By default, Django's cookies are authenticated.

[1] https://github.com/django/django/blob/master/django/middlewa...


So if I understand this correctly, people that use ad blockers that include the blocking of Google Analytics have been safe from this attack vector for months, while others have not, correct?


And people who use NoScript are safe from any kind of javascript attack vectors.


This is hard to exploit if the CSRF token is a few characters from the cookie that identifies the user. Setting a new value for the cookie would log the user out.


Due to a security risk, until further notice all corporate firewalls should be blocking Google Analytics.


A lot of sites break if Google Analytics is blocked (on purpose).


It's not that bad. I have it blocked, and it only affects a few sites. For those sites, there are alternatives.


And for years NoScript has provided a surrogate script that makes those sites think GA isn't being blocked.


Is that so? I've had GA blocked with HTTP Switchboard since I installed it, and never come across a site which required me allowing it. Some sites need me to turn off Chameleon so that they recognise Flash or something, but GA's never been problematic.


Really?

I've had the GA domains redirected to 0.0.0.0 via HOSTS file for... ever since GA started to exist, and I haven't seen any specific messages to that effect. It'd make sense that some would use JS to detect this, but I have that disabled by default.

Maybe it's because I see a lot of other "broken" sites in my daily browsing due to that, and I don't really need to find out why - if a site isn't giving the info I need, I go back to the search results and to the next one.


It may only break those using advanced features, like A-B variations, but not for regular tracking.


Interesting point of view



