
Lots of talk about fault tolerance, not a lot of talk about trusting peers and preventing them from introducing bad data into your presumably precious model...

So if you're forced to trust all of the peers, how is this better than a cloud? Who out there is training models for purely benevolent reasons (i.e., not seeking profit) and can trust random nodes? If not for purely benevolent reasons, who out there is going to donate CPU time to training your model, essentially writing you a blank check?





That's a damned timely shameless plug :). I'll add it to the reading list.


Thanks :-)


Generally, from what I've read about SETI@home, the way these systems work is that they run the same calculations on multiple computers. It's still possible to fool the system, but it gets harder the smaller the fraction of computers on the network you own (assuming everybody else is running an honest computer).
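
Roughly, that redundancy check looks something like this (a toy sketch: peers are modeled as plain callables returning hashable results, and the replication/quorum numbers are made up, not anything SETI@home actually uses):

    import random
    from collections import Counter

    def verify_by_redundancy(work_unit, peers, replication=3, quorum=2):
        # Send the same work unit to several randomly chosen peers and
        # accept a result only if at least `quorum` of them agree on it.
        chosen = random.sample(peers, replication)
        results = [peer(work_unit) for peer in chosen]
        value, votes = Counter(results).most_common(1)[0]
        if votes >= quorum:
            return value   # consensus result
        return None        # disagreement: re-issue the work unit elsewhere

The obvious cost is that every work unit gets computed `replication` times.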


In the case of neural network training, the cost of verifying that a gradient submitted by a peer reduces the cost function should be significantly less than the cost of generating that gradient, so you wouldn't even need to burn 2x effort to guard against cheaters.
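
A rough sketch of what that check could look like (toy linear model with MSE and made-up names; the point is just that verification is two forward passes, no backward pass):

    import numpy as np

    def loss(params, batch):
        # Placeholder: in practice this is a forward pass of the real model.
        x, y = batch
        preds = x @ params                      # toy linear model
        return float(np.mean((preds - y) ** 2))

    def accept_gradient(params, grad, batch, lr=0.01):
        # Cheap plausibility check: does applying the peer's gradient
        # actually reduce the loss on this batch?
        before = loss(params, batch)
        after = loss(params - lr * grad, batch)
        return after < before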


Is it possible to submit a falsified gradient which still reduces the cost, just less than the actual gradient would, but in a way that manipulates how the network behaves?

Say, for example, by using a different label for some of the images in the batch when computing the gradient, while still using the right labels for most of the images in the batch?
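
Something like this toy case, I mean (linear model with MSE, made-up names):

    import numpy as np

    def poisoned_gradient(params, x, y, flip_idx, wrong_label):
        # Gradient of MSE on a batch where a few targets have been swapped.
        # Most of the batch is honest, so the result usually still points
        # downhill on the true loss, while nudging the model toward the
        # attacker's preferred behaviour on the flipped examples.
        y_poisoned = y.copy()
        y_poisoned[flip_idx] = wrong_label
        preds = x @ params
        return 2 * x.T @ (preds - y_poisoned) / len(y)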


Subsequent gradient updates would probably wipe out the manipulation.


For trusting peers, I thought about adding certain questions whose answers you already know into the dataset, and seeing which nodes give you the right answer and which nodes give you garbage.
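
Conceptually something like this (names and data structures are made up for illustration):

    def audit_peer(peer_results, honeypots, tolerance=1e-3):
        # peer_results: item id -> value the peer reported
        # honeypots:    item id -> value the coordinator already knows
        for item_id, expected in honeypots.items():
            reported = peer_results.get(item_id)
            if reported is None or abs(reported - expected) > tolerance:
                return False   # peer failed an audit question
        return True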


You are right. Those concerns are the focus of some recent research (and some of it is the topic of my in-progress thesis).


The resulting model is shared, so everyone who gives their compute time will benefit from it.

Neural networks error-correct automatically, so they will be robust to some amount of corruption.


How well has this been characterised when some participants are actively malicious rather than wrong?

Reason I ask is, you can absolutely bet that states will attempt to cause interesting failure modes in other states’ A.I. — imagine if self-driving cars had a literal blind spot for the fifteen senators most aggressive towards [rolls dice] Agrabah?


Let people create accounts and introduce a reputation system.

There could be a ranking system. Rank I verifies 100% of submitted jobs. Rank II verifies 50% of submitted jobs. Rank III verifies 25% of submitted jobs. Rank IV verifies 12.5% of submitted jobs. Rank V verifies 6% of submitted jobs.

After 250 jobs you go to Rank I, after 500 to Rank II, after 1,000 to Rank III, and so on...

If you submit a job with incorrect results then you lose your account, and all unverified jobs submitted by that account are then verified. If you're an honest person you'll just create a new account; if you're a malicious actor then you just wasted a lot of money on nothing, because pulling a bait and switch will result in your malicious jobs being discarded.

There is still an opportunity for denial of service by creating lots of reputable accounts and then letting them go malicious all at once. You'll have a large backlog of jobs to verify.
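
Roughly, the rank-to-verification part could look like this (the thresholds for Ranks IV and V are my own guess, just continuing the doubling pattern, and 6.25% stands in for the 6% above):

    import random

    # Fraction of submitted jobs that gets re-verified at each rank.
    VERIFY_FRACTION = {1: 1.0, 2: 0.5, 3: 0.25, 4: 0.125, 5: 0.0625}

    # Jobs needed to reach each rank; IV and V are assumed by doubling.
    RANK_THRESHOLDS = {1: 250, 2: 500, 3: 1000, 4: 2000, 5: 4000}

    def rank_of(jobs_completed):
        rank = 1  # new accounts are fully verified until they reach Rank I
        for r, needed in sorted(RANK_THRESHOLDS.items()):
            if jobs_completed >= needed:
                rank = r
        return rank

    def should_verify(jobs_completed):
        # Randomly decide whether to re-verify this job; the probability
        # shrinks as the account's track record grows.
        return random.random() < VERIFY_FRACTION[rank_of(jobs_completed)]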


Sounds like the kind of thing governments would do, although perhaps I'm listening to too many works of fiction?


Small to moderate groups of friends that all know each other and are excited about deep learning?

edit: Ah, just saw the "what it isn't for" section - apparently not.



