"Federated Learning enables mobile phones to collaboratively learn a shared prediction model while keeping all the training data on device, decoupling the ability to do machine learning from the need to store the data in the cloud."

So I assume this would help with privacy, in the sense that you can train a model on user data without transmitting it to the server. Is this in any way similar to what Apple calls 'Differential Privacy' [0]?

"The key idea is to use the powerful processors in modern mobile devices to compute higher quality updates than simple gradient steps."

"Careful scheduling ensures training happens only when the device is idle, plugged in, and on a free wireless connection, so there is no impact on the phone's performance."

It's crazy what the phones of the near future will be doing while 'idle'.

------------------------

[0] https://www.wired.com/2016/06/apples-differential-privacy-co...




While I think you can definitely draw some parallels, differential privacy seems more targeted at metric collection: you have to be able to perturb the data so that it becomes non-identifying, without corrupting the answer in aggregate. Apple would still do all their training in the cloud.

In contrast, what Google's proposing is more like distributed training. In regular SGD, you'd iterate sequentially over a bunch of tiny mini-batches drawn from your whole training set. Sounds like Google's saying each device becomes its own mini-batch: it beams up the result, and Google averages them all out in a smart way (I didn't read the paper, but this was the gist I got from the article).

Both ideas are in the same spirit; the implementations are just very different.
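
As a rough sketch of that averaging step (toy code with made-up names; plain SGD on a linear model stands in for whatever Google actually runs on-device):

    import numpy as np

    def local_update(global_w, X, y, lr=0.01, steps=10):
        # Each phone refines the current global weights on its own data only.
        w = global_w.copy()
        for _ in range(steps):
            grad = X.T @ (X @ w - y) / len(y)   # squared-error gradient
            w -= lr * grad
        return w, len(y)                        # updated weights + local sample count

    def federated_average(device_results):
        # The server only ever sees locally trained weights, never raw data,
        # and combines them with a sample-size-weighted average.
        total = sum(n for _, n in device_results)
        return sum(w * (n / total) for w, n in device_results)

    # One communication round:
    # results = [local_update(global_w, X_i, y_i) for (X_i, y_i) in device_data]
    # global_w = federated_average(results)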


Differential Privacy is much more than what Apple's PR department says; differentially private SGD is already a thing.
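
The usual recipe there (clip each example's gradient, then add Gaussian noise before the update) looks roughly like this; a sketch with made-up parameter values, not a drop-in implementation:

    import numpy as np

    def dp_sgd_step(w, per_example_grads, lr=0.1, clip=1.0, noise_mult=1.1):
        # Clip each example's gradient so no single user can move the model too far...
        clipped = [g * min(1.0, clip / (np.linalg.norm(g) + 1e-12))
                   for g in per_example_grads]
        # ...then add Gaussian noise calibrated to that clipping bound.
        noisy_sum = sum(clipped) + np.random.normal(0.0, noise_mult * clip, size=w.shape)
        return w - lr * noisy_sum / len(per_example_grads)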


Well, forget Apple for a moment (that was just an example, since the parent asked about them specifically): my point was that what Google's describing is separate from differential privacy. There's no controlled noise or randomness being applied.

They even say at the end of the paper: "While federated learning offers many practical privacy benefits, providing stronger guarantees via differential privacy, secure multi-party computation, or their combination is an interesting direction for future work." So the "practical privacy benefits" here refer to the dimensionality reduction from running the raw data through the LSTM.


This is different from differential privacy (which, btw, isn't just an Apple thing). Differential privacy essentially says some responses will be lies, but that we can still get truthful aggregate information. The canonical example is the following process: flip a coin; if it's heads, tell me truthfully whether you're a communist. If it's tails, flip another coin: if that one comes up heads, tell me you're a communist, and if it's tails, tell me you're not.

From one run, you can't tell if any individual is telling the truth, but you can still estimate the number of communists from the aggregate responses.
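
If it helps, that coin-flip mechanism and the aggregate estimate fit in a few lines of code (toy sketch; the 30% figure below is just a made-up ground truth):

    import random

    def randomized_response(is_communist):
        # First coin: heads -> answer truthfully.
        if random.random() < 0.5:
            return is_communist
        # Tails -> answer according to a second, independent coin flip.
        return random.random() < 0.5

    def estimate_true_fraction(responses):
        # P(yes) = 0.5 * p_true + 0.25, so invert that to recover p_true.
        observed = sum(responses) / len(responses)
        return 2 * observed - 0.5

    population = [random.random() < 0.3 for _ in range(100_000)]   # 30% communists
    responses = [randomized_response(x) for x in population]
    print(estimate_true_fraction(responses))   # ~0.3, though no single answer can be trusted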

Federated learning, by contrast, does local model training and sends only the model updates, instead of the raw data that would usually be used for training.


Chrome used differential privacy well before Apple did. See the RAPPOR paper.


Here's Google doing this in November 2015:

http://download.tensorflow.org/paper/whitepaper2015.pdf



