PRISM fears give private search engine DuckDuckGo its best week ever (venturebeat.com)
459 points by tchalla on June 13, 2013 | 206 comments



It's not safe to assume the NSA doesn't log DDG searches. Look at the PRISM logo - it's a beam splitter. Read the slide, look at the "Upstream" portion.

http://commons.wikimedia.org/wiki/File:Upstream_slide_of_the...

They're logging all your URLs and headers. How much are you willing to bet they can't decrypt HTTPS? I don't understand why all the hubbub is focused solely on direct server access (the bottom half of the slide), when "Upstream" access is just as big a concern.

EDIT: rephrased my concern about direct vs upstream


"How much are you willing to bet they can't decrypt https?"

I'd bet quite a bit, though not "my life", that they do not have a generalized "read everything" ability for all forms of SSL. They may have what cryptographers would call "a crack", but that's a low bar, and doesn't prove they have a practical attack.

However, DDG is currently using 128-bit RC4, which is very weak. [1] I wouldn't care to bet anything that the NSA doesn't have an RC4 cipher crack that is practical to run on wide swathes of traffic.

RC4 is very popular, which I believe is because some people claimed it was a defense against the BEAST attack. I researched this for work, and I couldn't find anyone I trusted saying that was a good mitigation. The people I trusted merely observed that RC4 was not vulnerable, but never said you should switch to it; only secondary sources ever suggested that.

My conclusion was that there was a reason the primary sources never suggested it: in response to a theoretical break of the rest of SSL, the correct move was not to switch to a cipher with far more practical known attacks than what BEAST demonstrated.

But now it's even sillier; BEAST has been entirely or almost entirely mitigated in browsers (there's no server-side defense against BEAST, but there is a client-side one, and browsers now ship it). As far as I can tell, RC4 should be abandoned and we should resume using stronger ciphers for SSL. Anyone still concerned about BEAST should update their browser.

[1]: http://nakedsecurity.sophos.com/2013/03/16/has-https-finally...
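If you want to audit this yourself, Python's ssl module (which wraps OpenSSL) will show which ciphers a context is willing to offer. A minimal sketch that builds a context with RC4 explicitly excluded; the cipher string uses OpenSSL's own syntax:

```python
import ssl

# Build a client context and exclude RC4 explicitly.
# "HIGH:!RC4:!aNULL" is OpenSSL cipher-string syntax:
# strong ciphers only, no RC4, no unauthenticated suites.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ctx.set_ciphers("HIGH:!RC4:!aNULL")

# List what the context would actually offer on the wire.
offered = [c["name"] for c in ctx.get_ciphers()]
assert not any("RC4" in name for name in offered)
```

Modern OpenSSL builds have dropped RC4 entirely, but the explicit `!RC4` costs nothing and documents the intent.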


OTOH, if they can get a CA - any CA - to cooperate, they can MITM anyone without having to break SSL.


Not without someone noticing. Some sites have pinned certs in Chrome, which would stop this, and even without that you would expect some knowledgeable techie at Facebook or Github or something to be using their home laptop and say, "Wait a sec, this isn't my company's public cert!"

Not having seen any blog posts screaming, "OMG, my site is being hijacked wholesale," I can only assume that the NSA isn't doing this (or has managed to squelch by legal order every single person privy to the real cert at MITM'ed sites, which is absurd and would raise the question: why not obtain the private key from those people the same way?).


Could they do this selectively, and only MITM people on watchlists?


Do they need to MITM? If they have a copy of the private key, can't they just use it to decrypt the data .. even old data for which they've only just acquired the key?


Having the root CA's private key doesn't give them access to the end entity's private keys. When you ask a CA for a cert, you only provide them with your public key (in the form of a CSR) for them to sign. The CSR does not contain the private key.
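This is easy to verify for yourself with the (third-party) `cryptography` package; the hostname below is hypothetical. Generate a keypair, build a CSR, and confirm the CSR carries only the public half:

```python
from cryptography import x509
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.x509.oid import NameOID

# Generate a keypair and build a CSR for a hypothetical host.
key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
csr = (
    x509.CertificateSigningRequestBuilder()
    .subject_name(x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "example.com")]))
    .sign(key, hashes.SHA256())
)

pem = csr.public_bytes(serialization.Encoding.PEM)
# The CSR carries the subject and public key, never the private key.
assert b"PRIVATE KEY" not in pem
assert csr.public_key().public_numbers() == key.public_key().public_numbers()
```

The CA signs exactly what's in the CSR, so compromising the CA gives you the ability to issue certs, not the ability to decrypt existing traffic.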


But getting an employee to hand over the private key and giving him a gag order afterwards is an option of course.


https://en.wikipedia.org/wiki/Perfect_forward_secrecy

https://en.wikipedia.org/wiki/ECDHE

Google is using it, and a few other sites are too, though they are in the minority. OpenSSL has supported it since version 1.0.0, released in March 2010.
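For the curious, restricting a server to forward-secret suites is a one-line cipher-string change. A sketch using Python's ssl module (which wraps OpenSSL):

```python
import ssl

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
# Offer only ECDHE key exchange, so each session negotiates an
# ephemeral key and recorded traffic can't be decrypted later
# even if the server's long-term private key is compromised.
ctx.set_ciphers("ECDHE+AESGCM")

names = [c["name"] for c in ctx.get_ciphers()]
assert any("ECDHE" in n for n in names)
```

(TLS 1.3 suites, which are always forward-secret, remain enabled regardless of the cipher string.)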


True, but they would have to do this for every single web server they would want to collect information from. Not impossible, but it'd be a lot of work.


To your latter question: no, not with the right ciphers.

http://en.wikipedia.org/wiki/Perfect_forward_secrecy


Perfect forward secrecy doesn't apply if the NSA has broken the key exchange algorithm and has your session keys.


They have to set up impersonating SSL certs for every connection they want to MITM. While there'd clearly be value in them inserting or subverting network hops between "the great unwashed" and gmail/facebook/aim servers, there's very little chance the NSA have access to hops along the path between my (Australian) adsl connection and my vps (located in Australia).

For internal (or routed-through) US traffic - while Verizon's lack of interest in protecting customer data is probably shared by major backbone providers - I _strongly_ doubt even the NSA has enough gear hanging off backbones to actively MITM any significant proportion of the firehose that'd represent. Even the AT&T "secret room" probably doesn't house enough gear to create fake (signed) certs and MITM every SSL connection for millions or more simultaneous users browsing every https site under the sun.

Having said that, I'd bet good money they _do_ target specific SSL traffic - has anyone checked the SSL connections to Tor entry and exit points recently? That'd be one spectacularly obvious place to try "speculative MITM attacks".


Only for targeted traffic though. They can't record, go back, and break it.


This is why I tell people that actual fingerprint check is much better than any CA.
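For reference, a certificate fingerprint is just a hash over the cert's DER encoding. A minimal Python sketch; the PEM below is a structural dummy, not a real certificate (in practice you'd feed in the output of `ssl.get_server_certificate()`):

```python
import hashlib
import ssl

def fingerprint(pem_cert):
    """SHA-256 fingerprint of a PEM certificate, colon-separated hex."""
    der = ssl.PEM_cert_to_DER_cert(pem_cert)
    digest = hashlib.sha256(der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))

# Dummy PEM body ("AAAA" decodes to three zero bytes), just to
# exercise the code path; not a parseable certificate.
dummy = "-----BEGIN CERTIFICATE-----\nAAAA\n-----END CERTIFICATE-----\n"
fp = fingerprint(dummy)
assert len(fp) == 95  # 32 bytes -> 64 hex chars + 31 colons
```

Comparing that string against one obtained out-of-band sidesteps the CA entirely, which is the parent's point.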


Yes, but how do you get the fingerprint to check against?


The EFF's "HTTPS Everywhere" extension is a great place to start.

    https://www.eff.org/https-everywhere


I assume he also implies that fingerprints aren't any safer


Who?


Thanks for pointing this out. I'm looking into updating our ciphers very soon.


Heh, I was wondering if that might get noticed. :)

On the one hand, don't take my word for it; I also have not found anyone I trust who has verified my explanation directly. On the other hand, I did do my best to read the primary sources very carefully, both for what they say and what they don't say, and I was confident enough to implement more conventionally strong ciphers on the services I'm responsible for, so my money is where my metaphorical mouth is.


I might bet my life on HTTPS depending on what my alternative choices are. I'm loath to bet my life on anything, though.


Thanks for the link, that was an awesome article. Well written and technically substantial. How did you come across it?


I googled for something like "rc4 weak crack" or something like that. I was just trying to substantiate the claim that RC4 is weak.


I know this is paranoid, but I keep coming back to this post that hit HN a few days ago: http://www.cypherspace.org/adam/hacks/lotus-nsa-key.html

When you combine that with the beam splitter logo, things get a bit scary.


Spoiler alert: They can't decrypt it. That's why they have to ask these companies for the info, rather than just take it from the ISPs.


It could just as easily come down to simple costs.

If it's less expensive to ask someone to hand over the data (in bulk) rather than burn CPU cycles cracking SSL (again: in bulk), then go for it.

Even if it's feasible to crack SSL for a few crucial messages, it's likely not so for the volumes of data the NSA are capturing.


If they can decrypt some SSL (maybe low-bit-strength ciphers), it is likely a very intense process requiring vast hardware resources, so even if they can do it, it is unlikely to be done for all traffic, though it could be applied to some.


It doesn't necessarily mean they don't have the ability to decrypt; it just means they have a process in place so that in case shit ever hits the fan (like it just did) they can come back and say, "What's the big deal? We have a process in place."


Okay, tinfoil.


If you're using HTTPS, they can't even see your URLs or headers. That's sort of the point.


Email is unencrypted in transit.


That's not universally true - there's a remarkable amount of TLS/SSL-encrypted email in transit, either via STARTTLS ESMTP commands or SSL over port 465 (and 993/995 for IMAP and POP3).

I don't think there's a way to guarantee your mail always travels over TLS/SSL secured connections, but I suspect more of it does than you think.


There's a straightforward way to make sure your email is always encrypted in transit: encrypt it before you send. No promises about making sure your email can always be read by the recipient, though...


And here's the problem. Email needs to be able to be read by the recipient, so until a significant portion of email recipients can handle encrypted mail - the NSA doesn't need to attack my encrypted email storage, because enough of my correspondence ends up in cleartext in gmail/hotmail/yahoo et al.

This is a hard one to solve. GPGmail seems to get broken with every Mac Mail.app release. Vast numbers of people rely on webmail - which'd need server-side or in-browser GPG decryption. My Mom's not going to use command-line gpg tools. How the hell do we bootstrap our way up to ubiquitous encrypted email?


Hm, should be easy enough to have some browser plugin that lets you select a text/data field and recipient list field and encrypt it with the appropriate key; and to do something similar for recognition and decryption of fields.

If I implement this, will I become famous?


If you get it right you will.

I think there are complications though - you need to be very sure that rogue javascript can't dig around in your plugin and extract my private key. I'm not sure how securely sandboxed plugins can be.


What's the normal procedure for making a call whose output depends on a file that must be kept secret? Is there a typical OS API pattern that's seen in the various programs like ssh, scp, and so on?


I think one of the problems is that software like GPG and OpenSSL goes to a lot of trouble to make sure private keys don't hang around in memory for any longer than absolutely required - minimising the risk of having the OS preempt the executing code and write the key out to swap (or having malicious code slurp it up out of RAM). The bare-metal hoop-jumping required to get that right might not be possible in the context of a browser plugin.
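In C you'd reach for mlock(2) and explicit zeroing; from a high-level language, about the best you can do is keep the key in a mutable buffer and wipe it promptly. A best-effort Python sketch (this does nothing to stop swapping, which is precisely the problem described above):

```python
import contextlib

@contextlib.contextmanager
def ephemeral_key(key_bytes):
    """Hold key material in a mutable buffer and zero it on exit.

    Best-effort only: Python exposes no mlock(), so the OS may
    still swap the buffer out while it's alive.
    """
    buf = bytearray(key_bytes)
    try:
        yield buf
    finally:
        for i in range(len(buf)):  # overwrite in place
            buf[i] = 0

with ephemeral_key(b"super-secret") as k:
    used = bytes(k)            # pretend we sign/decrypt with it here
assert all(b == 0 for b in k)  # wiped once the block exits
```

A bytearray is used (rather than bytes or str) because immutable objects can't be overwritten and may be copied around freely by the runtime.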


See also, mlock(2)


There was FireGPG but it was discontinued http://blog.getfiregpg.org/2010/06/07/firegpg-discontinued/


What does that have to do with using a search engine? I'm pretty sure you aren't emailing them your queries.


Maybe that's all Stallman's queries


In transit, it's actually a pretty good bet that it's encrypted. At rest, it's an entirely different matter.


...unless you encrypt it.


While I haven't heard explicitly about Amazon being part of PRISM, the DuckDuckGo IPs all point to Amazon's EC2 platform.


PRISM doesn't matter that much, anyway - the NSA doesn't need a workflow automation system in place to send DDG or Amazon an NSL or a FISA order.


The point is that DDG doesn't log the information—even in the face of a court order, they have nothing to hand over.


More concretely, they don't have to rely on decrypting HTTPS. An approved NSA or FISA order to DDG will give the government an en clair "wiretap" on your DDG searches for up to a year. They may not be able to get the searches you did before the wiretap began, but that's about the only limit.


> I dont understand all the hubbub about direct server access

It's upsetting. It's a breach of trust by both the provider and the government. It's more a matter of principle than a technical one.

Both the "upstream" and "direct" methods are upsetting, just in different ways and for different reasons.


There's no question that they're monitoring upstream traffic. In fact they may still be doing the old ECHELON trick in which the US eavesdrops on non-Americans, the rest of the world spies on Americans (among others) - and then everyone swaps the data received.

But in the light of the PRISM documents it's even more likely than it was before that the NSA doesn't have the ability to decrypt HTTPS, or at the minimum that the US considers it too important to risk giving it away by using it on routine Top Secret signals intelligence. (And/or maybe too resource-intensive to use for that.) The strongest evidence for this is that we haven't heard anything about such a capacity yet from Snowden, Greenwald et al., who all have the full PRISM deck (along with other documents) in their possession and would surely tell us about it if they knew of it. So either 1) the PRISM slides do mention the ability to decrypt SSL or SSH streams but Snowden and the journalists haven't picked up on it (not impossible given the apparent incompetence they displayed over "direct access"), 2) it's too sensitive to mention in a self-aggrandising Top Secret overview of upstream and "direct collection" Internet signals intelligence, which probably means it's not in use (or at least not in regular use) for upstream collection or 3) they really don't have it.

A supporting reason to think that they don't have it, or hardly ever use it, is the apparent emphasis on "direct collection" in the PowerPoint. Why go to the hassle of dancing the frenemy minuet with Google and other fairly-anti-surveillance Silicon Valley firms when you can just get what you want from upstream collection at the apparently more-accommodating telcos? This isn't conclusive because even if you could understand all the traffic into and out of someone's Facebook account you'd still like to be able to see the internal state of the account, in particular so that you'd know what they'd been doing before the upstream surveillance began. But I think it's at least as likely that the whole new focus on direct collection is a workaround for the fact that, thanks to SSL and SSH, upstream collection just isn't what it used to be back in the days of ECHELON.

As the slide said, You Should Use Both: direct collection to give you access to US-company servers, probably bypassing the HTTPS problem, and upstream access to give you data, probably only unencrypted data (email!), that passes through the US without going to a US-company server.

(If you want an exotic alternative theory, you could speculate that the PRISM document is a fake, a limited hangout http://en.wikipedia.org/wiki/Limited_hangout by the US spooks, maybe precisely to direct attention away from their ability to decrypt HTTPS streams. But this now seems unlikely, for example because DNI Clapper would surely have to have approved a managed release of a set of documents that both gave away the Verizon metadata surveillance and so also implicated him in perjury.)


I can't remember which interview it was, if on Democracy Now, or his MIT lecture video, but Bill Binney stated that the NSA in fact does decrypt HTTPS.


If Bill Binney said that, and if he is right, I'd assume the most likely explanation is that NSA can push over some low-security SSL connections of the type jerf describes above https://news.ycombinator.com/item?id=5877362 , but has to rely on "direct access" to get around most or all high-quality (but still widely-used) SSL encryption. (Or, again, that it also has the capacity to break high-grade HTTPS connections, but it's holding that back for really important occasions.)


With the history of the gov/NSA being effective crypto gods, my money is on them being ahead on decrypting SSL and HTTPS, and even if it's not real-time, they regularly store streams from target endpoints for slower offline decryption.


I wonder why so many people believe this. Many simple and weak ciphers have been around for decades and - although they are considered to be very insecure by cryptographers - certainly can't be decrypted in real-time (!) on this scale (!).


If a CA is compromised, then it's just open sesame. So....


This has been talked about many times now. All compromising a CA lets them do is create believable certificates to man-in-the-middle connections, but they can't be doing that for a large number of connections because it's resource-intensive and detectable.

They still don't have the private keys of the sites if they break into the CA.


What are the odds that Google, Yahoo, et al. handed over their private keys, I wonder.


I find this quite plausible: with or without the knowledge of Page, Zuckerberg et al., the NSA might very well have the private keys of these companies. I would not be surprised if the CEOs of these companies chose to be ignorant of the NSA's methods so as not to have to lie to the public, shareholders and Congress.

Also, given that the world's best engineers work at either high-tech companies or the NSA, there will be some who have switched between the two, giving the NSA/CIA a head start in getting any information these companies hold through old-fashioned spy tactics.


What about YaCy? (http://yacy.net) I am not sure if the queries inside a peer network can be decrypted as easily as HTTP requests (I am not a networking specialist though; it is just an opinion).


Google wasn't passively spied on via a beam splitter or whatever you call it.

They were doing the spying and actively sending the data out.


I ceased using Google search except as a last resort when this story broke, and I had no idea what I had been missing out on with DDG: Excellent keyboard navigation. Also, DDG's results compared to a year ago are night-and-day. It seems to listen to my keywords better than Google did too, a growing annoyance I had. If you haven't, you really should try out DDG for a week.


> Also, DDG's results compared to a year ago are night-and-day.

Still looks 2nd rate. I replicated one of my last searches (learning rails): rails find if element is in array

First hit on google is the stackexchange answer with .include? (which I was spacing-out on)

DDG yields the Array docs, which is correct but is a helluva lot of info when I'm looking for a concise answer.


This impression must partially be due to how you write your queries and the quality of the natural language parsing in DDG/Bing (your query leans heavily on that). I searched "ruby element in array", Rails not being a programming language, and I get the Ruby Array docs as the first result (which is what I would want) and the StackOverflow result third. On Google, the StackOverflow item is first and the Array docs are third for the same query.

In this case you and I think that our respective search engines are providing the better result.


Don't forget about DDG's great bang shortcuts.

If you ever find yourself wanting to fall back on Google's results, just throw !sp at the front of a DDG search for StartPage's proxied Google service (or !g if you absolutely must go to Google). DDG has many, many shortcuts for directly searching StackOverflow, GitHub, Amazon, Wikipedia, etc.
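There's nothing magical under the hood: a bang just travels as part of the ordinary `q` query parameter. A sketch of the URL that actually gets requested:

```python
from urllib.parse import urlencode

def ddg_url(query, bang=None):
    """Build a DDG search URL; a bang is just text inside the q parameter."""
    q = "!{} {}".format(bang, query) if bang else query
    return "https://duckduckgo.com/?" + urlencode({"q": q})

# '!so' routes the query to Stack Overflow's own search.
assert ddg_url("beast attack", bang="so") == \
    "https://duckduckgo.com/?q=%21so+beast+attack"
```

Because the bang is plain query text, it works identically from the search box, the address bar, or any client that can build a URL.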


This is most likely because Google has more user behavior data than DDG. If enough people use DDG and click on the StackExchange link for that query (or similar queries), DDG will be able to move it to the top.

On the other hand, doesn't DDG just use the Bing API, with only Blekko crawling the web? Or am I getting my search engines mixed up?


That's correct, blekko does have our own multi-billion page crawl and index. And we're private, too.

Every web search engine depends on one of these indexes: google, bing, blekko, yandex, baidu.


Don't forget Gigablast, Procog, Yioop and Samuru which also have their own crawled index.


If you think those are viable sources of results, then by all means include them.


Personally I don't, but it should be noted that there are other indexes besides what you have mentioned. That said Samuru and Procog are pretty interesting.


DDG actually uses a handful of search engines (including Bing), though I think that Yahoo! is the primary source.


Yahoo search results are already 'powered by' Bing.


For programming questions, I almost always append site:stackoverflow.com to my search term... generally the answer is on SO, but I prefer the search engine's results over SO's search... on both DDG and Google.


If you're using DDG, just prepend your query with !so. Using bang notation is like having every search engine in your search bar at once.

...and of course, I had no idea if !so would work before I tried it. The obvious one always seems to work. :)


He said "I prefer the search engine's results over SO's search..."

DDG's bang notation uses SO's search rather than the search engine's results.

Note that DDG's bang notation is also redundant with Chrome and most other browsers that let you set your own search engine keywords. Chrome's is especially nice, just type the first few letters of the site's URL and hit Tab and you're in a site specific search. It works automatically for any site implementing OpenSearch (or you can add it yourself by right-clicking the search box), and you don't have to memorize any keywords, so that's two advantages over DDG right there.


That was discussed here some time ago and many people suggested explicitly omitting SO and StackExchange sites when learning new languages/technologies. The argument was that while the answers on SO generally get the job done, they rarely provide the context, depth and coverage the docs give. I happen to agree - I only hit SO when I don't know what exactly I need to search the docs for. It's absolutely great for this. For serious learning not so much.


http://www.samuru.com is better for concise answers.


Concise, but wrong: It keyed off of "find", so it produced a top hit of items related to that method (which means in rails: find in the database)

Now you could argue that I should have omitted "find", and there was a time where that was how search engines were used, but the fact that google got the semantics right is why it is first rate.


I've done the same. Also switched to Firefox with full Adblock and tracking bugs blocking.

It's not much, but at least I'm no longer leaving a huge slimy trail behind me online.


You still very well may be leaving a trail behind; it's just likely that such a trail won't be correlated among multiple different domains you visit (from ad tracking bugs of a single company embedded into multiple websites you browse).


I'm using Ghostery on Chrome. Seems to do the job.


I really like how the goodies have improved (or at least their discoverability). There's some really useful / geeky stuff in there.

https://duckduckgo.com/goodies#Science

https://duckduckgo.com/?q=how+much+magnesium+is+in+43+cubic+...


I second the observation that the results have improved considerably over the last year. There are still holes where I have to resort to using the !g option and use google (non-English searches would be the most prominent example) but I've set all my default search options to DDG and haven't regretted it!


And they are always there in your IM client

http://ddg.gg


Plaintext, though. I wish DDG over XMPP would do OTR.


Isn't OTR usually done client-side?


I guess he was referring to the ddg bot answering your questions over XMPP.


I'm all for using DDG on principle, but google search results are still far superior.


I started using DDG a few days ago and I am not disappointed so far.


That makes two of us. While lots of people say "well Google won't miss you", I think we still have the duty to do what's right.

The USA was founded as the most free, exemplary democracy in history, and what we saw in this scandal is precisely the opposite of the ideals that made it possible for Google to exist. Google betrayed us all and has been in bed with the government in the most sinister way possible since 2009.

Honestly? I'd rather give someone else a chance. Then if DuckDuckGo spies on us in the future, I'll switch again.


This is great news for DuckDuckGo and I'm all for it. However, DuckDuckGo isn't completely private, right? From what I see, DuckDuckGo obtains much of its data from third parties, such as Bing/Microsoft and Yahoo. http://help.duckduckgo.com/customer/portal/articles/216399-s...

"While our indexes are getting bigger, we do not expect to be wholly independent from third-parties. Bing and Google each spend hundreds of millions of dollars a year crawling and indexing the deep Web. It costs so much that even big companies like Yahoo and Ask are giving up general crawling and indexing. Therefore, it seems silly to compete on crawling and, besides, we do not have the money to do so. Instead, we've focused on building a better search engine by concentrating on what we think are long-term value-adds -- having way more instant answers, way less spam, real privacy and a better overall search experience."


While we do use 3rd parties to fulfill some of our organic results we always make those calls from our machines. We never pass along IP addresses. This means that while our other sources might see your queries, they are not tied with PII (personally identifiable information).


I misunderstood. Excellent.


I think you're not quite understanding how it works. They retrieve the data from e.g. Bing behind the scenes and the 3rd parties have no way to connect a query to an individual user.

That said, maybe there'd be an issue if you enter personally identifiable information as a query. But who does that?


Why would anyone believe that DuckDuckGo isn't already penetrated?


It's a fair question, I'm ashamed by the snarky answers.

If Google can't stop the NSA from installing backdoors (which, by the way, remains to be demonstrated), DuckDuckGo can't stop them either.

Best luck to the team of DuckDuckGo, it's a nice project.


They probably didn't install back doors and are just pulling in the data from the internet backbone.

Tapping off the fiber.


Duckduckgo redirects all requests to HTTPS and uses only HTTPS; it is highly unlikely the NSA or anyone would be able to decrypt that traffic, unless of course they force DDG to divulge their SSL private key. Which I suppose is plausible.


Have a look-see at this section of their Privacy Policy.

https://duckduckgo.com/privacy#s2

Of course, they could be completely full of shit or unknowingly compromised.

My concern is this: even though they don't _collect_ information, they could still forward it on to authorities and not violate this policy.

However, they have a freaking TOR node, so it's almost a moot point.


Why would anyone believe his own computer isn't already penetrated?


"Why would anyone believe DDG isn't already penetrated?" should be read as, "Why would anyone believe that DDG isn't already penetrated and that their own computers are not already penetrated?" Your trust chain has to start somewhere.


Why bother infiltrating PCs when you can monitor the traffic?


SSL for example.


Because I am running GNU/Linux.


Why do you trust any binaries you've got? Where did your first-use/bootstrapping compiler come from?

And even if you wrote your own OS and compiler from the ground up - who wrote your BIOS? Your network card firmware? Your disk controller software? Your CPU microcode?

We _all_ abdicate our trust-chain _somewhere_.


This is why it's important to look at PRISM as a political issue and not merely a technical one, like I see a ton of people doing now. The best solution to government spying isn't to tell everyone to use Linux and DuckDuckGo, it's to change the spying itself.


There's no reason you can't apply both tactics.

Shifting use away from, as Bruce Schneier puts it, feudal architectures, both puts the Government on notice that its methods aren't appreciated, and creates a damaged class (the SAAS feudal lords: Google, Facebook, AWS, Apple, Salesforce, and others) who can petition the government to lay off the tactics as it's hurting business. https://www.schneier.com/blog/archives/2013/06/more_on_feuda...

Hell, push this hard enough and a feasible decentralized VoIP might become common enough to put the wireless carriers out of the voice business, relegated to carrying encrypted bits. They might know your handset location, your data usage, and the Tor entry point you're using, but that's it. It's something I've been giving thought to.


Indeed - but the political changes (if we get them at all) will take time - time probably measured in years or political terms.

The "merely technical" solutions are going to be important in the meantime. DuckDuckGo, encfs, Tarsnap, GPG, Tor, ForceSSL - things like that will (probably) help, especially if we can convince "regular users" to use them, as will encouraging places like DDG to implement TLS ciphers that provide forward secrecy.


How do you go about determining whether or not you're pwned?

Asking for a friend.


Because look at the logo and the design, that duck is cool! And it says it's private! /s


Or because their entire business differentiation from the beginning has been privacy.

This does not mean that it is not compromised, of course. But it's why people would believe it isn't compromised. And it's a much better reason than your sarcastic imitation.


"Or because their entire business differentiation from the beginning has been privacy."

The same is true of Hushmail. How did that work out?


I have no idea. I (as I said) was not trying to assert that DuckDuckGo is not compromised, I was asserting that the reason that people believe it to be secure is because they've differentiated on privacy since the beginning.

Mostly, I don't like seeing condescending, inaccurate statements.


Well, the Hushmail story is pretty famous:

http://www.wired.com/threatlevel/2007/11/encrypted-e-mai/

There is nothing inaccurate about claiming that people believe that DDG is protecting their privacy because of how the website presents itself and the claims made by the company. That is exactly why people believed (and many continue to believe) that Hushmail is protecting their privacy. The way companies advertise themselves is not necessarily reflective of reality.


I can't take their privacy stuff quite seriously, considering they are partnered with Microsoft.


Just like Google and Microsoft "take your privacy very seriously"? Not being based in the US has become an undeniable feature.


If ducks were the secret to being cool, Aflac would be cool by now ;)

https://en.wikipedia.org/wiki/Aflac


Aflac is pretty cool. That duck has serious swag.


Exactly. I think what we might need is a non-US-based alternative (perhaps in Iceland or NZ). The moment DuckDuckGo becomes relevant enough (i.e. significant traffic), it's game over IMO.


Ask Kim Dotcom about how well NZ's liberal laws worked out in practice when the US copyright police showed up asking the local cops to wildly overstep their legal authority…

I mean _seriously?_ Helicopters, silenced assault rifles, security dogs, and 72 cops - sent in against someone accused of _copyright infringement?_ And then a Hollywood showreel of the raid gets produced and publicised?

I _like_ New Zealand; they talk the talk, but when it comes to walking the walk, they're led around by the nose to do whatever the US wants.


Because it's too small for them to care about.


Really. If I understood correctly, the NSA had direct lines to the fiber. It doesn't matter what search engine you use.


The issue is with HTTPS. To read that traffic, the NSA collects private keys from a limited number of [big] companies, as collecting them from all companies would be a very public affair. Thus, from the NSA's price/performance point of view, smaller enterprises may still be off the hook (for now).


They could get a valid key pair from a CA and MITM the connection. It could be detected if the user knows what the public key should be and compares it with what they received, but that seems pretty unlikely.


>It could be detected if the user knows what the public key should be and compares it with what they received, but that seems pretty unlikely.

There is no need to know what the public key should be - only that there are several [more than expected] different keys in circulation. Any distributed organization (including Google itself, which can be fully expected to monitor which certs its users receive, especially after the Iran/DigiNotar story) could notice it and thus identify the MITM. So Google must be on top of it. Thus there's no need to involve extra certs from a CA, though of course I'm not arguing against the NSA's ability to do that.
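A rough sketch of the kind of cross-check being described: record the certificate fingerprint you saw from a trusted network path, then flag any later connection that presents a different one. The pinned value and helper names here are hypothetical; this is an illustration, not a hardened pinning implementation.

```python
import hashlib
import socket
import ssl

# Hypothetical pinned value - in practice you'd record the fingerprint
# observed from a network path you trust and compare against it later.
PINNED_SHA256 = "0" * 64

def cert_fingerprint(der_bytes: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate."""
    return hashlib.sha256(der_bytes).hexdigest()

def fetch_server_cert(host: str, port: int = 443) -> bytes:
    """Fetch the leaf certificate the server (or a MITM) presents."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert(binary_form=True)

def looks_intercepted(host: str) -> bool:
    """True if the presented cert differs from the pinned fingerprint."""
    return cert_fingerprint(fetch_server_cert(host)) != PINNED_SHA256
```

An org running this from many vantage points would notice a MITM as an unexpected extra fingerprint, even without knowing in advance which key is the "real" one.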


Things like PRISM makes me completely want to back out of the Google ecosystem.

Part of that would be replacing Gmail. That can be done, but what good (free) options exists for a webmail solution?

I'd also love this instant to cut gtalk (or "hangouts" which it is called now. hopeless), but Google just declared hate on XMPP, so setting up your own node will land you on your own tiny island.

The trend is clear though: Google is stuffing the exit-holes while the US government is requiring more and more of Google's data.

If you haven't started moving out yet, you better get started. And for the love of God, ditch Chrome. Support someone who supports the open web and respects your privacy.


An observation: Ghostery blocked 22 trackers on this article.

Does it somehow make tracking my every move online okay if it's done for profit?

EDIT: phrasing.


With all the tracking by advertisers it makes one wonder if the Feds have approached them for access to those databases.


I know, it's ridiculous. But people do want to see how many likes the story has, and cross-service comments are pretty next-wave, so you have to have widgets now for 5 social sites, 7 comment systems, global analytics, live analytics, sitewide ad, contextual ad, site cookie and maybe add two more because I haven't thought of them... that's 19 right there!

We really need something better than having these MASSIVE amounts of callouts. It's like those pictures of Internet Explorer totally taken over by toolbars, except it's a different set on every single site on the web. Bah!


It turns out that if you block all the trackers and 3rd party widgets, the website continues to function as the user desires, and nothing of value is lost.


For some sites, yes. Space.com and Business Insider are two that I've noticed tend not to work if some of their trackers are blocked.


Try that with Noscript on, and it gets cut down to 3 trackers.


Everyone was OK with Google's tracking for ads and commercial purposes; even though it felt kinda weird, we had a feeling it wasn't evil somehow.

Now I can't even look at Google anymore, they're like a spouse that cheated on you. You knew the spouse may have been watching you, but at least they weren't fucking with someone else while you trusted them.


Anyone going to say it? DuckDuckGo searches are still really low quality. I WANT them to be a legitimate competitor in the space (in fact, I've been having similar hopes for Bing for years) but it's just not there yet.


Glad to see them succeeding, but personally the privacy of my web searches doesn't bother me - as long as they aren't being passed along with personally identifying information. I'm far more worried about emails, messaging, video, storage etc.

Can someone explain to me (or point me in the direction of something that explains) what Google and Bing store in terms of tracking when you are not logged in?

Obviously you can use VPNs or TOR to be really safe, but do you need to go that far if you want an untracked search on Google and Bing?


They have access to IP+time, your search query, and cookies for correlation to other requests. It's valuable information and Google openly documents that they are keeping and using it, along with anything else you leak to them: http://www.google.com/policies/privacy/

IP+time is enough to get your personal identity information from your ISP (physical location of the endpoint, billing information), I have no idea if Google's relationship with ISPs is good enough to buy that or if it's only available to cops.


  > They have access to IP+time, your search query, and
  > cookies for correlation to other requests. It's
  > valuable information
Valuable for blackmail, but not really useful for anything else; the commercial value of information rapidly degrades over time. Knowing I want to buy a new fridge today is very valuable, knowing I wanted to buy one last month is nearly worthless.

  > IP+time is enough to get your personal identity
  > information from your ISP (physical location of the
  > endpoint, billing information), I have no idea if
  > Google's relationship with ISPs is good enough to
  > buy that or if it's only available to cops.
Despite what the RIAA think, a user agent's IP is nowhere near accurate enough for use as identification.

I hope that ISPs do not release personal identity or billing data to arbitrary third parties. I know my ISP (sonic.net) claims they don't[1], and even privacy-insensitive companies such as AT&T have privacy policies that would forbid them from selling personal data[2].

Even if it were possible for random companies to obtain personal data from an ISP, I doubt that Google would have any interest in participating.

[1] https://wiki.sonic.net/wiki/Category:Policies#Privacy

[2] http://www.att.com/gen/privacy-policy?pid=2506


I love targeted advertising because it is so blatantly obvious and hilariously over-optimistic.

I search for a lot of random crap with more curiosity than intent to buy. I looked up the price for several windmills, the late 19th/early 20th century style (~$1000, by the way). For weeks or months afterwards, I saw windmill ads on a sizable fraction of the websites I visited.

To be fair, it's far more likely that I am going to buy a windmill than a random ad viewer, but the probability is still staggeringly low. There had to be a hundred other products I was more likely to buy than the windmill, that would be more valuable to show me. But no! I had viewed their product and I! Must! Be! Targeted!

I really wonder what the set of products that do well from targeted ads looks like.


It does well because the probability of you buying a windmill multiplied by profit per sale is still more valuable than something non targeted like tampons (some of the advertisers get the numbers wrong but not the ones at scale.)

Interestingly, from the CPMs I've seen re-targeted/re-marketed ads perform on par or below contextually targeted ads. No one even comes close to Google for contextually targeted ad inventory (unless you are operating in a narrow niche and you are selling inventory directly, but lots of time and money to even match them.)


"Shop for venereal disease online!"


    >>Valuable for blackmail, but not really useful
    >>for anything else; the commercial value of
    >>information rapidly degrades over time.
    >>Knowing I want to buy a new fridge today is very
    >>valuable, knowing I wanted to buy one last month is
    >>nearly worthless.
I think that blackmail is already bad enough. Considering which topics somebody might want to learn about on the internet, various diseases for example.

    >>I hope that ISPs do not release personal
    >>identity or billing data to arbitrary third parties.
I hope that too. But this information is gathered and stored somewhere in flawed systems, operated by humans who might decide to follow their own interests more than the interests of the customers. I know of at least one story where an employee of a search engine used his privileges to stalk other people.


> I hope that ISPs do not release personal identity or billing data to arbitrary third parties.

My ISP, Time Warner/RoadRunner, claims that they will do so. It's kind of ambiguous because their declaration combines several services and kinds of data covered by different laws. I think the applicable part for their cable ISP service is:

  In the course of providing Time Warner Cable Services 
  to you, we may disclose your personally identifiable 
  information to [...] consumer and market research firms,
  credit reporting agencies and authorized representatives
  of governmental bodies.
Selling their DHCP logs and customer records to a commercial data aggregator (who could then sell it to anyone) appears to be compliant with their privacy policy.

http://help.twcable.com/twc_privacy_notice.html


"Valuable for blackmail, but not really useful for anything else; the commercial value of information rapidly degrades over time. Knowing I want to buy a new fridge today is very valuable, knowing I wanted to buy one last month is nearly worthless."

That depends on what they can get out of the data, besides the obvious. I'm thinking of the story about Target knowing that a girl was pregnant before even her father did[1]. Even longer trends can probably be derived, regarding personality traits, income, etc. That information is worth a lot even months or years after it was captured.

[1]: http://www.forbes.com/sites/kashmirhill/2012/02/16/how-targe...


"Despite what the RIAA think, a user agent's IP is nowhere near accurate enough for use as identification."

Maybe not when I'm coming from our company's NAT (700+ employees behind a single IP address) - but the number of people on my Comcast connection is limited.


The IP address from which you send your request to a search engine's website may be regarded as personally identifying information. In case that information becomes publicly available, the connection between your search terms and your IP address will be visible. In fact, this has happened in the past, and there was a searchable database online with the leaked information where you could look up search terms.

I don't know what data Google and Bing are collecting, but here is one quote from the wikipedia entry on internet privacy concerning the AOL search engine:

A search engine takes all of its users and assigns each one a specific ID number. Those in control of the database often keep records of where on the Internet each member has traveled to. AOL’s system is one example. AOL has a database 21 million members deep, each with their own specific ID number. The way that AOLSearch is set up, however, allows for AOL to keep records of all the websites visited by any given member. Even though the true identity of the user isn’t known, a full profile of a member can be made just by using the information stored by AOLSearch. By keeping records of what people query through AOLSearch, the company is able to learn a great deal about them without knowing their names.

Source: http://en.wikipedia.org/wiki/Internet_privacy


Based on the leaked PRISM presentation, they could send your search log to NSA, and NSA can identify you based on your IP address that your ISP provides.


TOR is not that safe. If the exit node is compromised you’re f*cked for good.

As for Google the thing is that they’re also an advertising network so basically they track you all around the web.


"If the exit node is compromised you’re f*cked for good."

It's not that simple, otherwise there wouldn't be any value in using an Onion architecture. Assuming you're using HTTPS, which every decent search engine supports, they either also need to create a fake but acceptable certificate for the domain, or to also control entry nodes and match the entering requests with the exit ones.

The NSA might be able to do it, but it's not just a matter of controlling an exit node.


"TOR is not that safe. If the exit node is compromised you’re f*cked for good."

Not if you are using TLS, which you should be using regardless of Tor.


What threat are you referring to here? Takeovers through the browser via HTML injected by the exit node?


I like DDG, but has it ever been mentioned how this TRACKING data is used?

http://duckduckgo.com/l/?kh=-1&uddg=http%3A%2F%2Fwww.dmv.org... (every time you click on a search result you actually click on a link like this, which redirects you to the actual page)

Is it just for pagerank?
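For what it's worth, the destination is just percent-encoded in the `uddg` query parameter, so you can recover it yourself. A small sketch (the example link here is illustrative, since the one above is abbreviated):

```python
from urllib.parse import parse_qs, urlparse

def ddg_redirect_target(link: str) -> str:
    """Pull the real destination out of a DDG /l/ redirect link.
    parse_qs percent-decodes the parameter value for us."""
    query = parse_qs(urlparse(link).query)
    return query["uddg"][0]

link = "https://duckduckgo.com/l/?kh=-1&uddg=http%3A%2F%2Fwww.dmv.org%2F"
print(ddg_redirect_target(link))  # http://www.dmv.org/
```

So the redirect hides your search query from the destination site's referrer logs; whatever else DDG does with the click is down to their privacy policy.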


You can turn that on/off in the settings dialog, essentially it is meant to hide the search queries from the target website.


It might be used internally for additional ranking signals. But the privacy policy states it can never be tied back to you as an individual so nothing to worry about.

From memory, the main reason they do this is to allow downstream websites to determine if a user was referred to them by DuckDuckGo without the actual search term. I.e. you know they came from DDG, but with no leakage.

I run searchcode.com (which provides a lot of the code doco and sample results) and since this was done I can now determine how much referral traffic actually comes from DDG but have no idea what you were searching for when you click through.


This irritated me, as it makes me somewhat skeptical of the "we don't track" claim.

However, I noticed that if you use their HTML version (i.e., use duckduckgo.com/html/<query> instead), that they don't do the click-tracking. The only downside I've noticed is that there's no infinite-scrolling mode, you have to hit "next".

Whether they're still tracking the search queries, though, I have no idea...


DDG just claims they don't store your history. Presumably they could use redirected clicks as quality signal, without tying it to your browser cookies.


Here is a non-US-based alternative: https://startpage.com


Just because they say they don't collect your info, it doesn't mean that google doesn't get it.

Look at the source of a search page and you'll notice that they include scripts directly from google.com...


Google assuredly gets info, just not from the user. There aren't any google requests from Firebug's net tab, either, so perhaps you are mistaken.


Google knows what you were searching for, when, and for how long...


Hum, neither NoScript nor a search in their source reveals such scripts. Can you point me to them?


Open up the source and search for google.com...


You can see the stats here: http://duckduckgo.com/traffic.html


It's been around for a long time, but I really love the TTY mode of DDG: https://duckduckgo.com/tty/


Whenever I think about DDG, I think "Oh, you think Gabriel Weinberg suddenly cares about your privacy after he sold a truckload of user information to Classmates.com?"


Not that this is going to help. The NSA is probably tapping the fiber at the ISP's backbone in front of Google.

Why do you think it is called PRISM? It's probably named for the way they are splitting the fiber and recording everything.


In the case of DDG, that would be difficult. DDG uses SSL. If you make a mistake and type "duckduckgo.com" instead of "https://duckduckgo.com", it will automatically redirect you to the secure page. Unfortunately, that redirect gives a man-in-the-middle an opportunity to hijack your connection, even with SSL; however, that's tricky enough that it's hard to imagine anyone pulling it off without ever being noticed.


HSTS allows a site to indicate that in the future it should always be loaded over a secure connection, so you only have an interceptable connection the very first time you visit that site. Both Firefox and Chrome allow sites to add themselves to a list to "preload" HSTS enforcement, so even that initial connection which is man-in-the-middle-able doesn't happen.

I don't see them in the current lists, so DDG should contact Mozilla and Google to get added to their preloaded HSTS lists[1][2] so all connections will automatically happen only over HTTPS.

[1] http://dev.chromium.org/sts

[2] https://blog.mozilla.org/security/2012/11/01/preloading-hsts...
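For anyone curious what serving the header looks like, here's a minimal WSGI-style sketch. The directive values are illustrative; `preload` is the directive that signals eligibility for the browsers' preloaded lists.

```python
# Illustrative HSTS value: one year, covering subdomains, opting into
# the browser preload lists.
HSTS_VALUE = "max-age=31536000; includeSubDomains; preload"

def add_hsts(app):
    """WSGI middleware that appends a Strict-Transport-Security
    header to every response the wrapped app produces."""
    def wrapped(environ, start_response):
        def sr(status, headers, exc_info=None):
            headers = list(headers) + [
                ("Strict-Transport-Security", HSTS_VALUE)
            ]
            return start_response(status, headers, exc_info)
        return app(environ, sr)
    return wrapped
```

Once a browser has seen this header over HTTPS (or has the site preloaded), it will refuse to make the interceptable plain-HTTP request at all.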


>Unfortunately, that redirect gives a man-in-the-middle and opportunity to hijack your connection, even with SSL

As long as the SSL cert isn't compromised, I don't see how this is possible.


The initial request/redirect response is insecure. So a MITM can intercept the redirect response and replace it with his own content. That content could be, for example, a 200 response status and HTML pulled from the attacker's HTTPS connection to the target site.

So rather than being redirected to a secure connection, I happily communicate with the attacker instead.
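A toy model of that interception, with all names hypothetical - the point is just that the attacker sees the plaintext redirect and can replace it before the victim ever upgrades to TLS:

```python
# Simplified sslstrip-style scenario: the victim's very first request
# is plain HTTP, so a MITM can swallow the redirect-to-HTTPS response.

def honest_server(path: str):
    """The real site: redirect every plain-HTTP request to HTTPS."""
    return 301, {"Location": "https://duckduckgo.com" + path}, b""

def mitm_proxy(path: str):
    """The attacker on the wire: intercept the redirect and answer
    with content of its own (which it could fetch over its own HTTPS
    connection to the real site), so the victim stays on HTTP."""
    status, headers, body = honest_server(path)
    if status == 301 and headers.get("Location", "").startswith("https://"):
        # Victim never sees the redirect and never gets the padlock.
        return 200, {"Content-Type": "text/html"}, b"<html>looks legit</html>"
    return status, headers, body
```

This is why the HSTS mechanism mentioned elsewhere in the thread matters: it removes that first plaintext round trip entirely.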


But a redirect would change the status bar, right? So presumably it would still be pretty noticeable.


If you have an hour to kill.

http://www.youtube.com/watch?v=MFol6IMbZ7Y


They don't need the existing SSL cert. The "beauty" of SSL is that they can use a cert generated by any CA trusted by your browser - or even a second one from the same CA -, even if there's already a cert issued by one.


You are assuming they are only looking at search requests. Email is unencrypted, and so is DNS. The rest they store in case they can break it later.

They were doing this in 2007 at AT&T/WorldCom; I expect it has only gotten better over the past 6 years.



It could also be named PRISM as a form of misdirection, to make people think that the codename referred to upstream-collection operations. (It's beyond doubt that upstream collection is still ongoing too, though.) Or it could be that spy organisations just like optical metaphors. FWIW the You Should Use Both slide seems to use PRISM to refer specifically to the "direct collection" and not the upstream collection capability.


Or it could be an API for issuing FISA warrants and collecting the requested data.

(The NSA may very well be doing both.)


They are probably doing both.


I have a private writing app and I can confirm that this NSA thing has been good for me too, and I'm betting for lots of other services concerned with privacy as well. I saw a 5X increase in users in the past week, and reviews in blogs are getting more numerous.

I wonder though if now all services regardless of their real focus will start marketing privacy as a feature and muddy the waters, making it hard for consumers to discern who is really about privacy and who just uses it as a marketing ploy.


Anyone else wondering if DDG will come out with their version of email? Maybe something that uses end to end encryption (maybe working together with Mozilla Foundation)?


I would prefer if someone else did it. Not because I do not trust DDG, but because I don't like the idea of one company providing everything (or too much). Why step into the footsteps of the dinosaurs?


Fair enough.

>Why step into the footsteps of the dinosaurs?

I'd think adding E2EE to email would be like what pagerank did to the search engine. Why build from the ground up when you can build on the shoulders of the giants?


If only someone solved the problem of bringing PGP to the masses. Maybe the need has to reach a critical mass.


I agree, and that's why I was thinking that a company that is still growing and is known for their privacy practices could be good (beach-head?) at doing this (at least being able to advertise it on their own services to get some traction and feedback).

It's not like Google or anyone else is going to do it. And looking at DDG's traffic, it seems like a growing need. Then again, how do you monetize encrypted emails? Contextual encrypted ads? ;)


This is where they would kill it. Search would be supplementary, but if they pivoted and focused on email (eventually rebranding it), then that's a home run in my book. I'd like to see this, and would be interested in beta testing if/when it happens.


I'd prefer they became more competitive on search first: I try DDG a few times a year and it's a pretty big self-handicap when everything takes minutes rather than seconds to find.


Startpage/Ixquick is coming out with an e-mail service called StartMail: https://startmail.com.


And as far as browsers are concerned, I would recommend going with Firefox vs Google or Safari. Mozilla is a non-profit, and I feel I can trust it the most.


I take it you mean Firefox over Chrome or Safari?

(I agree. The only thing I use Chrome for is Facebook)


Even better, use a Private Window in Firefox. Chrome is only good for its debugger, so I'd rather use Canary.


The only safe thing to do would be to compile Firefox or Chromium yourself.


Well the safest of all would be to write your own highly secure browser and use TOR.


I've been using DDG for nearly a year now, and have rarely turned back to Google. I only do so when I really cannot find what I'm looking for on DDG, and I try to tell DDG about the bad results (which I've actually seen get fixed). Getting bad results happens infrequently though, so the benefit of using DDG really outweighs the occasional inconvenience.


One thing that I prefer about DDG is that it doesn't try to guess what language I want to search in other than by my input. It is ridiculous that google forces me to go through worse results based solely on my location, it shouldn't matter where you are from.


I tried DuckDuckGo and it did not work as well as Google, yet.

I especially dislike its name; it's odd and too long to type. And again, a duck 'walks' slowly - not a good sign.

Can this be renamed to something better, and shorter? Sometimes the name does matter.


If the NSA is intercepting the communications, why would switching sites matter?


1. The NSA isn't the only threat to privacy on the net.

2. DDG is SSL, so there's some hope that the traffic is not visible to a passive observer, even the NSA.


Google is also SSL. There was a post on here a day or two ago with the theory that PRISM is really about intercepting and cracking SSL certs


DDG has been my primary search engine for about a year now, and I do what I can to proselytize for it. Their results aren't always as comprehensive as the bigger engines', but I can usually find what I need.


It really only took me about a week to get used to DuckDuckGo, and now using Google just feels wrong.


Is there a way to make ddg the default on the iPhone?


Chrome integration possible?


You can integrate any search engine into Chrome by right-clicking in the search field and picking "Add as Search Engine".

Then you can go to Settings and make it your default.


this is a joke... right?


Haha, I hope so. You need to use Firefox and not even Safari to be totally safe. Even better: Private Window browsing mode in Firefox.


At least you can't blame them for missing a PR opportunity: https://plus.google.com/+JeffJarvis/posts/5X7nHcjijsC


Somewhat related, but I've been canceling my subscriptions (Office 365 in my case) to services I can't properly secure.

I'm also in the process of selling my Surface Pro and going back to a Linux laptop (OneNote 2013 sucks with touch on the desktop for me, and Microsoft's Windows 8 version won't allow you to not use SkyDrive).

Also, I pay Microsoft, so why won't they let me save my OneNote docs in a secure way using their Windows 8 apps?



