WhatsApp – Security of End-to-End Encrypted Backups [pdf] (whatsapp.com)
129 points by FiloSottile on Sept 10, 2021 | 86 comments



Helpfully given in the introduction, here is some useful context for this change in case some people miss this part:

> Since 2016, all personal messages, calls, video chats and media sent on WhatsApp have been end-to-end encrypted. […]

> WhatsApp’s backup management relies on mobile device cloud partners, such as Apple and Google, to store backups of the WhatsApp data (chat messages, photos, etc ) in Apple iCloud or Google Drive. Prior to the introduction of end-to-end encrypted backups, backups stored on Apple iCloud and Google Drive were not protected by WhatsApp’s end-to-end encryption. Now we are offering the ability to secure your backups with end-to-end encryption before they are uploaded to these cloud services.


I'm pretty sure both Apple and Google are very happy with the current state of affairs. This system works great to keep people locked into iOS or Android, as exporting your data is super hard (there were a number of expensive, sketchy-looking apps that claimed to be able to do this).


https://wabetainfo.com/how-to-migrate-your-chat-history-from...

This is possible now (in one direction, so far).


Slight Nitpick: This is only available for Apple to Samsung devices.

However,

> Unfortunately, it’s still not possible to migrate your chat history [from iOS] to a different Android phone, but WhatsApp is planning this in the future. It’s also not possible to migrate your chat history from Android to iOS right now.


It used to be encrypted before upload to Google, and then … one day it just wasn’t (but came with the “candy” that it no longer counts against your account quota). I could never find any explanation for this; the best hypothesis I found is that it’s a backdoor for law enforcement without admitting it.

I would be surprised, given everything happening in the world today, if the new system does not somehow allow law enforcement to get access (possibly indirectly, through the app giving the key in some weird back channel)


Fwiw, that “encryption” never used your own key or password. Facebook held the key, Google held the encrypted blob, and I doubt the extra warrant to get data from both companies was a huge hurdle.

Definitely was not E2EE before.


That's correct. And yet ... back then, Google couldn't read your messages; Facebook couldn't read your backup; And hackers or law enforcement would need to both get a copy of the blob from Google and the key from Facebook - not insurmountable, but requiring a lot more work and more likely to raise suspicion.

Whereas ... after that change, Google can read your messages, hackers/law-enforcement only need to talk to Google, and yet -- you yourself can't get that backup without impersonating the WhatsApp app to Google Drive; The local backup is still encrypted with a key that only Facebook knows (and would give you, but only if you impersonate the WhatsApp app when talking to Facebook).

I've looked for the logic and failed. The only reasons I can find are that FB wants to let Google index your messages behind your back (unlikely) or some plausibly deniable legal backdoor.


If you read Texas's antitrust lawsuit against Google, you will see that FB and Google signed an exclusive agreement, where Google gets access to WhatsApp messages stored on Drive in exchange for [redacted].

https://www.theverge.com/2020/12/17/22180258/google-whatsapp...


Ah so that's how it worked? I heard that concept once and thought it was a really interesting way to ensure a user wouldn't lose their backup while preventing the company from accessing it.


It's not solely law enforcement. It's a commercial deal.

https://www.theverge.com/2020/12/17/22180258/google-whatsapp...


Deduplication could be a thing


Incredibly unlikely. The backup is AFAIK an SQLite file that contains the text messages (and indexes, etc). The bigger files - images, videos and voice messages - are not included in that backup, and are backed up independently to Google Photos (and always have been).


And that's why I kept saying "no" to the backup requests in WhatsApp.


Doesn't matter; everyone else you talk to on WhatsApp is uploading those same conversations to Apple and Google effectively unencrypted.


I mean that's the problem of any protocol in general. Your opsec can be great, but if it relies on someone else's opsec...


What do you do instead? Replies like this I may mistakenly interpret as "doesn't matter, so I'll do nothing about it" which is just an excuse.


I deleted my Facebook, WhatsApp, and Instagram accounts, as well as iMessage/iCloud (same issue, unencrypted backups).

I am only reachable via email and Signal. I got my contacts to switch to Signal.


How does that solve the problem you mentioned? People on the other side still may not be careful with mail/Signal data privacy.


Mail is beyond redemption, but Signal for example only offers encrypted local backups and no cloud backups.


And that is why I don't use WhatsApp. Self-hosted Matrix is super awesome.


Why is this being called E2EE? If you're uploading them, shouldn't they be encrypted at rest? Why would we want it decrypted on the other end? I just want to upload an encrypted file that can only be decrypted by my app. No other end.


In this case both ends are you. You could back up on one phone and restore on another.

You're right that "E2E" is slightly ambiguous. But "encryption at rest" is even worse in my opinion, since it could just mean that Apple/Google's datacenters have disk encryption with a key they can access.


I am neither the authority to define "encryption at rest", nor am I saying "encryption at rest" is good wording in this case.

But from my understanding, "encryption at rest" is not disk encryption.

If you have a database with disk encryption, once the disk crypto key is entered an attacker could try to "do hacker stuff" and exfiltrate the WAL files.

If you have "encryption at rest", the WAL files are written encrypted and are decrypted on read. An attacker may get your WAL files, but they are still encrypted.


There are likely multiple definitions. This Azure definition disagrees with you:

>Encryption at rest is designed to prevent the attacker from accessing the unencrypted data by ensuring the data is encrypted when on disk. If an attacker obtains a hard drive with encrypted data but not the encryption keys, the attacker must defeat the encryption to read the data.

https://docs.microsoft.com/en-us/azure/security/fundamentals...

So with this definition encryption at rest has the threat model of an attacker who can physically steal a hard drive but not the hard drive's encryption key.


Well if Facebook is encrypting I'm hoping Apple/Google can't decrypt on their datacenters. That would be really weird. Just with the key that I hold (aka my app account). I've always understood E2EE as an in transit thing or public/private key where it is encrypted till the other end. I definitely want my backups sitting at rest on some server where I have no ends and it is just a loop.


>Well if Facebook is encrypting I'm hoping Apple/Google can't decrypt on their datacenters.

I was referring to the prior backup scheme where Facebook wasn't encrypting. It could accurately be described as encrypted at rest I think, but yet Google and Apple could read the contents I think.

>no ends and it is just a loop.

The terminology is getting pretty confusing at this point. I usually think of the ends as the intended parties to look at the data. If there are no ends, then no one will look at the data, meaning the data is useless. Actually if there's no beginning end, there's no one to create the data, so thus the data doesn't even exist.


Next step is who owns the key, if it is FB that kills E2E story. If it is user, they'll forget/lose it, so it doesn't scale to 2B users. Currently Apple/Google are set as the custodians of the key.


From the pdf, it sounds like the HSM owns the key, and will only give it up if the correct password is provided.


The data could probably be exfiltrated via WhatsApp web?


Still not totally encrypted, but getting there.


The worst part is even if you disable automatic backups, which you should, the app will nevertheless force the creation of a backup every day at 2am. And keep 7 days' worth of backups at a time. Of every single thing it can get its hands on. The amount of storage and processing that globally occurs daily due to this, that people neither want nor need, is probably jaw dropping.

Many non-tech people I know that are not aware of this have just come to terms with the fact that phone storage just runs out quicker than it did before, and old phones just lag at 2am for mysterious reasons.


Automatic backups do not include any images or videos. They remain just unencrypted.

> Many non-tech people I know that are not aware of this have just come to terms with the fact that phone storage just runs out quicker than it did before, and old phones just lag at 2am for mysterious reasons.

Non-tech people get pissed when one smiley message is lost. So for all non-hn people automatic backups are a boon. I know this as we run a free repair-cafe - and help people migrate data from old phones.


Not to mention that the solution to not uploading over cellular (on iOS at least) reads:

> To avoid excessive data charges, connect your phone to Wi-Fi or disable cellular data for iCloud. iPhone Settings > Cellular > iCloud Drive > OFF.

Like I have to be inconvenienced when I simply want to grab a pdf from iCloud, just to avoid having a few GB of my data cap used if I happen to be out at 2am.


They may say all the right words, but given how Facebook has been consistently behaving with respect to people's privacy, all this e2e goodness amounts to nothing less than an extremely disingenuous and misleading charade. So, yeah, good to know. But, no, still have zero trust in FB's implementation of it and won't touch it with a long pole.


WhatsApp has been pretty consistent with their track record, not every Facebook product is the same but if there's one part of the company that's doing really well in terms of security and privacy for its users that's the one.


Last week they were fined $270m by the EU for claiming they were anonymizing user data like phone numbers when they weren't.


I can't remember the source so take this as you will, but WhatsApp are appealing such a large fine because the privacy policy was in the middle of being updated during a transition. The policy was correct after the fact and ever since.


If I renege on a contract with my bank because I was moving house I would still be sued into bankruptcy. Multibillion dollar companies have the onus to keep their legal documents (terms of service, privacy policy) up to date.


that's news to me, had to find a source: https://www.theverge.com/2021/9/2/22653747/whatsapp-fine-amo...

looks like WhatsApp is appealing, so not a closed case.

> noting that WhatsApp did not properly inform EU citizens how it handles their personal data, including how it shares that information with its parent company.

I'm not sure I understand these kinds of claims to begin with. WhatsApp is Facebook, why would they have to warn users that the data is shared?


They did correct their policy to no longer lie to users after they were fined. I'm not sure that counts as "doing really well in terms of security and privacy for its users".


I think that's a bit disingenuous, who reads these policies anyway? And how much does this really matter compared to features like end-to-end encryption?


This wasn't a trivial technicality. They said users' phone numbers were being anonymized and they weren't.

How they handle private data, especially if they lie about what they're doing, does really matter. End-to-end encryption doesn't mean anything if they secretly keep a key able to decrypt it, which is basically what they were getting fined for.


I'd argue it is because it's buried in some policy text that no user ends up reading anyway.


You have to take WhatsApp's word for all this and you can't, because it is a Facebook property.

Facebook doesn't think twice about doing highly unethical stuff, covering it up and then lying when it surfaces.

Fish rots from its head and the head is fundamentally rotten.


I must say, it is unclear to me why this is being downvoted -- it mirrors my exact reaction.

The old saying "Actions speak louder than words" has never been more apt. It was just two days ago that Ars & others ran "WhatsApp "end-to-end encrypted" messages aren't that private after all" [1]. Yet, here we are.

It's a strong "No thanks" from me.

[1] https://arstechnica.com/gadgets/2021/09/whatsapp-end-to-end-...


I don’t trust Facebook’s intentions, but WhatsApp has demonstrated consistency in bringing encryption to users.

The ProPublica article that the ones you saw are based on was flawed, and has been updated. https://twitter.com/propublica/status/1436054877663375372


Thanks for linking that, I had not actually seen the update to it. Of course, if one of the parties in E2EE shares the message it doesn't constitute a 'break' in E2EE. However, what I think was important from the Ars article I linked was this statement:

>An "end-to-end" encrypted messaging platform could choose to, for example, perform automated AI-based content scanning of all messages on a device, then forward automatically flagged messages to the platform's cloud for further action. Ultimately, privacy-focused users must rely on policies and platform trust as heavily as they do on technological bullet points.

Which doesn't break E2EE technically, but it certainly breaks it in spirit. And yes, I understand that really any application could feasibly implement something like this, it's not in many people's threat models, etc. However, if I had to bet on which company would implement such a feature, it would be FB.

It just felt sort of funny, seeing this only a few days after all of those articles were written. Of course there is no way FB weaved the whole system and documentation together in two days, but I can't help but roll my eyes slightly at the timing of their release.


Your concerns seem reasonable and well-grounded; it’s just odd to insinuate a conspiracy in how these articles were released. It probably was a reaction, but it is a perfectly reasonable thing to do. WhatsApp is committed to being transparent, and this is a part of it. If you are highly principled about privacy or doing sketchy things, yeah… don’t trust any software from for-profit companies.


Isn't the rollout of this encrypted backup functionality an "action"? And isn't the consistent availability of E2E encryption in WhatsApp an "action"? Whereas it seems to me like the idea that WhatsApp shouldn't be trusted just because of who they answer to is merely "words".


That link doesn't show Facebook broke E2E encryption. It shows Facebook built a possibility of the other secure end forwarding your message voluntarily to Facebook for review.

E2E is only as secure as the other end.


>it is unclear to me why this is being downvoted

I would tell you why, but you're not allowed to according to site rules (it rhymes with 'billing')


This. Exactly the reason why I use Signal and even though I encounter some bugs once in a while, it is the only messaging app I trust in respecting my privacy.


Yes! Signal isn't perfect either, so I keep my eyes peeled.

When WhatsApp wouldn't let me create a group chat without uploading my contacts I was like "yeah.....".


We’re sorry that we have accidentally introduced a bug, which allowed us to mine data and peep into everything.


> To decrypt the backup, the key K is needed Thus, to safeguard K in the HSM-based Backup Key Vault, the client performs a registration of K with WhatsApp.

> The key to encrypt the backup is secured with a user-provided password. The password is unknown to WhatsApp, the user’s mobile device cloud partners, or any third party. The key is stored in the HSM Backup Key Vault to allow the user to recover the key in the event the device is lost or stolen. The HSM Backup Key Vault is responsible for enforcing password verification attempts and rendering the key permanently inaccessible after a certain number of unsuccessful attempts to access it. These security measures provide protection against brute force attempts to retrieve the key.

> Additionally, the users have a choice to use a 64-digit encryption key instead of a password, which would require them to remember the encryption key themselves or store it manually as in this case the key is not sent to the HSM Backup Key Vault

So they do allow not storing the key on their servers, which is the only way I know to ensure encrypted backups can't be decrypted, but they make it inconvenient by forcing the key to be 64 digits, for a strength of 10^64.

They could make "no store" keys much easier by allowing the key to be characters, so that people could use a sentence or other sequence of words as a key and not have to write down or remember 64 digits. Using just letters (ignoring case), you'd need at least 46 to get equivalent (12x actually) strength. With uppercase, lowercase, and digits, you'd only need 36 to get 3x the strength of 64 digits.
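
The length figures above can be checked with a few lines of integer arithmetic. This is only a sketch of the keyspace comparison: the helper name `min_length` and the alphabet labels are made up for illustration, and it assumes every character is chosen uniformly at random, which a memorable sentence would not be.

```python
def min_length(alphabet_size: int, target_keyspace: int) -> int:
    """Smallest length n such that alphabet_size ** n >= target_keyspace."""
    length, keyspace = 0, 1
    while keyspace < target_keyspace:
        keyspace *= alphabet_size
        length += 1
    return length


TARGET = 10 ** 64  # keyspace of a 64-digit decimal key

# Alphabet sizes for the two cases discussed: letters only (case ignored),
# and uppercase + lowercase + digits.
ALPHABETS = {
    "letters, case ignored": 26,
    "upper + lower + digits": 62,
}

for name, size in ALPHABETS.items():
    n = min_length(size, TARGET)
    ratio = size ** n / TARGET  # big-int true division, result is a float
    print(f"{name}: {n} characters, about {ratio:.1f}x the 64-digit keyspace")
```

Because the comparison uses exact integers, there is no rounding risk near the boundary between 45 and 46 characters.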

If users already need to create a password to secure the random key stored on WhatsApp servers, it seems the strength of that password is really the strength of the whole system. In that case, they could just derive a key from the password and use that directly as the encryption key. Assuming they actually want to protect the backup that is.
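
A minimal sketch of what "derive a key from the password" could look like, using PBKDF2 from Python's standard library. This is not WhatsApp's scheme (the whitepaper describes a random key held in an HSM vault); the function name, salt size, and iteration count are assumptions picked for illustration.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Stretch a password into a 32-byte key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


# The salt is not secret; it would be stored next to the encrypted backup
# so the same key can be re-derived on a new device.
salt = os.urandom(16)
key = derive_key("a long sentence only I would ever think of", salt)
print(len(key), "byte key derived")
```

With this approach nothing limits how many guesses an attacker holding the backup and salt can make, so the password itself has to carry all of the entropy.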

Disclaimer: I have never used WhatsApp, but am author of HashBackup which does not store your key on any servers.


> it seems the strength of that password is really the strength of the whole system

Not quite. If you trust the HSMs that WhatsApp is using, they provide a defense against brute-force attacks that is infeasible with a mathematical key derivation function alone. For example, even with a weak password you could limit an attacker to 10 attempts, after which the key is wiped. This isn't something that you can do if your key is only protected by math. With a random 4-digit PIN and 10 attempts, an attacker guesses it only 0.1% of the time (10 out of 10,000). With a key derived from a password alone, you can brute-force it offline until you get it (of course a password with sufficient entropy is probably still out of reach).
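
A toy model of the attempt-limiting idea, for illustration only. The class `AttemptLimitedVault` and the helper `guess_probability` are hypothetical names; a real HSM vault would never hold the password in the clear (the whitepaper describes the OPAQUE protocol for that), and the limit of 10 is an assumed value.

```python
import hmac


class AttemptLimitedVault:
    """Toy stand-in for an HSM-backed key vault (not a real HSM API)."""

    def __init__(self, password: str, key: bytes, max_attempts: int = 10) -> None:
        self._password = password.encode("utf-8")
        self._key = key
        self._max_attempts = max_attempts
        self._remaining = max_attempts

    def recover(self, guess: str):
        """Return the key for a correct guess, None for a wrong one."""
        if self._key is None:
            raise PermissionError("key was wiped after too many failed attempts")
        if hmac.compare_digest(guess.encode("utf-8"), self._password):
            self._remaining = self._max_attempts  # success resets the counter
            return self._key
        self._remaining -= 1
        if self._remaining == 0:
            self._key = None  # permanently inaccessible from here on
        return None


def guess_probability(pin_digits: int, attempts: int) -> float:
    """Chance of hitting a uniformly random PIN within `attempts` distinct guesses."""
    return min(1.0, attempts / 10 ** pin_digits)


vault = AttemptLimitedVault("4821", key=b"\x00" * 32, max_attempts=10)
for pin in range(10):  # attacker tries 0000..0009 and burns every attempt
    vault.recover(f"{pin:04d}")

print(guess_probability(4, 10))  # 10 guesses out of 10,000 possible PINs
```

After the loop, even the correct PIN no longer recovers the key, which is the property a pure key derivation function cannot provide.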

Of course trusting their HSMs is a huge if. There are also concerns about refreshing the attempt count (you don't want a brute force attack to wipe your key!) and synchronizing the attempt count across the distributed HSMs. (just enforcing the limit on each is likely to be sufficient though)


I think WhatsApp's proposed solution here is sensible and achieves both objectives of protecting user privacy whilst also preventing users from accidentally shooting themselves in the foot with "password123".

Introduction of friction can add security. For example, bitcoin wallets that are self-custody will often involve elaborate, un-skippable "write these 24 words down, repeat it one by one" processes to ensure users properly back up the seed words.


> I think WhatsApp's proposed solution here is sensible and achieves both objectives of protecting user privacy whilst also preventing users from accidentally shooting themselves in the foot with "password123".

My understanding is they create a random encryption key K and store it in their vault protected by a user-selected password. Knowing the password gets you the encryption key K. I don't see any restriction on a user picking "password123" as their password to the HSM vault, so how does this HSM setup prevent them from "accidentally shooting themselves in the foot with 'password123?'"


Best comment so far.


It seems that End-to-end (encryption) is now firmly established as a buzzword.

I'm not really a cryptographer, but from what I've gathered from the whitepaper, it's just an encrypted backup with a fancy system that allows users to safely store encryption keys on WhatsApp servers. But of course they have to call it end-to-end because users know it is safe.


Saving encrypted stuff on a server is more properly known as client side encryption[1]. Any instance of cryptography used to protect the contents of anything in any way is commonly referred to as end to end encryption these days. Fortunately, the misuse of the term can serve to identify an entity with poor understanding of the technology they are trying to sell you.

[1] https://en.wikipedia.org/wiki/Client-side_encryption


I don't agree, if you were to define end-to-end encrypted backup this is what it would be.


End-to-end encryption is when two entities communicate and establish an encrypted connection between them.

In this case one device makes a backup while another might not be even made yet.

(Edit: Rephrased for better clarity)


I'm not sure what you mean by "while another is not yet even made"


I mean it literally. It might be not yet even assembled at a factory, not delivered to its destination country and not sold to a user.


Ah, well that doesn't really matter, you can still see them as two separate participants in an asynchronous protocol.


End-to-end encryption is when, on one end, you encrypt data for every remote end that is supposed to decrypt this data. That's why it is called end-to-end: all ends are known, and nobody can tamper with correctly established communication with a correctly verified recipient. That's how all E2EE protocols work: OTR, OMEMO/Signal, etc.

If you do not know which end is going to decrypt it, it is just encryption with a key/password. Anybody who has the credentials can access the data.

These WhatsApp backups could be restored by 50 different 'ends', so using e2e in this context is incorrect.


Yeah sure, you're saying that the model for full disk encryption would be more relevant here. But at the same time, there is a third-party server in the middle of the protocol, and so I'm not sure if that model would be more relevant.


End-to-end encryption should be as secure as the underlying encryption technology; this is only as secure as a user's password, which 99% of the time is trivially crackable.

It’s like equating Fort Knox and a locked car. Fort Knox might not be impenetrable, but they really don’t provide similar levels of protection.


>this is only as secure as a users password which 99% of the time is trivially crackable

Are you talking about the WhatsApp backup scheme? I think the HSM should theoretically make even a weak password mostly uncrackable.


> even a weak password mostly uncrackable.

That’s not what the HSM provides; it’s based on this:

OPAQUE: An Asymmetric PAKE Protocol Secure Against Pre-Computation Attacks: https://eprint.iacr.org/2018/163.pdf

The point of end-to-end encryption is that you don’t need to trust anyone else: as long as you have verified each party’s public key, it’s safe.

This fails that test. The threat model of government surveillance includes both up-front sniffing of the passwords provided to the server, and copying the HSM and doing an offline check of every possible password. Both vulnerabilities are significant issues which allow hackers and state actors to read users’ backups and thus their messages.


>up front sniffing of the passwords provided to the server

I haven't read either PDF fully, but it seems to me no password is provided to the server (though I'm actually not sure what you mean by password or server). The OPAQUE protocol means the HSM can verify the user has the password without ever seeing the password. So the password is never provided to the HSM or any other server. It's asymmetric. And for the encryption key, it's stored in the HSM yes, but when sending it to the HSM, it's unsniffable because it's encrypted with the HSM's public key.

>doing an offline check of every possible

The HSM should prevent that by limiting the number of attempts. From the WhatsApp pdf:

>The HSM Backup Key Vault is responsible for enforcing password verification attempts

Of course, if there's a hardware vuln in the HSM, then the verification attempts can be bypassed, and the backup is only secure if the password is quite high entropy. It comes down to how likely there is to be a hardware vuln in the HSM. I think in practice HSMs tend to have hardware vulns somewhat frequently, which is why I said "theoretically" in my previous comment. If we theorize the HSM has no hardware vuln, then it's safe with a weak password. We also have to assume there's no backdoor built into the HSM, or the HSM keypair computation or distribution process.
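
To put rough numbers on "quite high entropy": if the attempt limit is bypassed, the backup is only as strong as an offline search over the password space. The sketch below is a back-of-the-envelope estimate; the guess rate of one million per second and the example password shapes are assumptions, not figures from the whitepaper, and the entropy formula only holds for uniformly random passwords.

```python
import math


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy of a uniformly random string over the given alphabet."""
    return length * math.log2(alphabet_size)


def expected_years(bits: float, guesses_per_second: float) -> float:
    """Average time to find the secret, searching half the space."""
    seconds = 2 ** (bits - 1) / guesses_per_second
    return seconds / (365.25 * 24 * 3600)


GUESSES_PER_SECOND = 1e6  # assumed attacker speed against the offline check

CANDIDATES = {
    "4-digit PIN": (10, 4),
    "8 lowercase letters": (26, 8),
    "16 mixed-case letters and digits": (62, 16),
    "64-digit key": (10, 64),
}

for name, (alphabet, length) in CANDIDATES.items():
    bits = entropy_bits(alphabet, length)
    years = expected_years(bits, GUESSES_PER_SECOND)
    print(f"{name}: {bits:.1f} bits, about {years:.3g} years on average")
```

Human-chosen passwords are far from uniform, so real-world figures for the shorter entries would be worse than this estimate suggests.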


Your complete chat history with everyone on WhatsApp, to date, has been provided in basically unencrypted form to Apple and Google by your conversation partners, which means that it is available on demand and without a warrant to US federal authorities via FAA Section 702 (commonly known as PRISM, or FISA).

This means that even if you stop using it today, there is a huge wealth of information about your habits, travel, personal identifiers, social graph, location history, and personal thoughts and opinions that will be permanently stored associated with your name.

Enabling e2e on backups won't purge this information, especially if it has already been downloaded by USG from Apple/Google.

If you want to mitigate this, you basically have to move, replace all your friends/contacts, never go back to the same venues/restaurants/cities, etc., because your existing pattern of life is already archived.

Too little, too late.


I think the expectations of e2ee have been greatly stretched in this case. e2ee means that the data is encrypted from device to device only and that's it, from one end to another end. If someone backs up their device in an unencrypted way then that's out of scope for WhatsApp - that's not what e2ee is about.

People that expected full at rest encryption (which is what a backup system would include) despite the app never being advertised that way would have always needed a large kick to realise that isn't the case. Encryption is complicated and you can't expect everybody to fully understand what e2ee/at rest/etc really means. This whole situation is a learning experience for everyone and I wouldn't blame WhatsApp for it either. They now know that advertising encryption needs a little more explanation.


WhatsApp currently handles local backups entirely incompetently and infuriatingly despite claiming (IMO dishonestly) that the feature exists, providing inaccurate and incomplete documentation. This is nice to see, but far too little too late for me to trust the app for longevity.

I recently had the issue for the second time of losing over a year of messages due to dysfunctional WhatsApp backups, about which I wrote a blog post of complaints/rants [1]. The user, as far as I can tell at least on Android, currently has no viable option besides uploading their messages, unencrypted, to Google.

[1] https://vinayh.com/posts/2021-08-28/


Does anyone have an NSA address users can just send their backups to and cut out the middleman?


That’s actually hilarious. If you lose all 3 of your backup sources just FOIA the NSA for their copy!


That doesn't work for them, they want you to think you have rights and stuff, it's more fun that way.


Genuine question, context first. I've never backed up any chat, I don't use WhatsApp anymore, I used to keep the photos I liked and deleted the rest.

What's the point? I've never felt the need to go back and read any messages I've previously sent. I have no idea why you'd keep them. And also can you imagine if someone got hold of your life's worth of messages?


I frequently search old chats to see when something happened or get some other information I know has been talked about.


Cool. At least now we can pretend the e2e didn't exist till now on WhatsApp. According to them only.

https://jknewsline.com/parras-email-whatsapp-data-to-be-acce...

Here is how political vendetta is taken against people. This news is just a few months old.

I am not on WhatsApp for a couple of ideological reasons, this being one of them.


"encrypted"


Is this really end to end encryption?


To me it is just an encryption, which isn't bad, but still.


This is ridiculous: they block people's accounts for no reason, making them lose years of messages, and now they come up with encrypted backups... They should focus on improving their support. They have only an email address for support. Try to get your account unblocked if their AI decides to block you. Good luck.


Taking bets on how much of this is an ego trip from Zuck to stick it to the Apple people about their child protection controversy

"See? We're not like them"



