That's really no different than somebody uploading proprietary code they don't o...

megous · on June 23, 2022

Ok, takedown requests exist. Say Qualcomm finally wises up and asks GitHub to take down a copy of the millions of lines of their super proprietary 4G modem firmware implementation. Will GitHub retrain the model after each such takedown? :D

If not, then it's kinda stupid to argue the point about the lack of knowledge, since having or lacking knowledge clearly doesn't matter. GitHub will happily continue using confidential code, even from trigger-happy companies like Qualcomm, for Copilot.

redox99 · on June 23, 2022

I guess they would add some kind of filter to Copilot's output that removes results that clearly come from code that was DMCAd.

It's kind of like an employee who worked at Qualcomm and has seen the code. Do you retrain him (aka hit his head until he forgets) after he leaves the company?

The comparison might seem silly, but as AI advances I expect more and more arguments (especially in court) to come from analogies between humans and AIs.

megous · on June 23, 2022

What kind of filter? I thought Copilot doesn't output the input data verbatim.

Creating an output filter based on millions of lines of DMCAd code that would not at the same time completely cripple Copilot's output sounds like one of those hard problems. Especially if there's no agreed-upon definition of copyright "violation" here.
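
To make the difficulty concrete, here is a rough sketch of what a naive version of such a filter might look like. It is purely illustrative and says nothing about how Copilot works internally: the function names, the 8-token window, and the 0.5 threshold are all assumptions made up for this example. It only catches near-verbatim token runs, which is exactly why the choice of threshold (and what counts as a "violation") is the hard part.

```python
import re
from typing import Iterable, List, Set, Tuple

Shingle = Tuple[str, ...]


def tokenize(code: str) -> List[str]:
    # Split into identifiers, digit runs, and single non-space characters,
    # so whitespace and line-break changes do not affect matching.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def shingles(code: str, n: int = 8) -> Set[Shingle]:
    # Every run of n consecutive tokens; empty if the code is shorter than n tokens.
    toks = tokenize(code)
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}


def build_blocklist(taken_down_files: Iterable[str], n: int = 8) -> Set[Shingle]:
    # Hypothetical index over the taken-down sources (assumed to be available as text).
    blocked: Set[Shingle] = set()
    for src in taken_down_files:
        blocked |= shingles(src, n)
    return blocked


def overlap_ratio(suggestion: str, blocked: Set[Shingle], n: int = 8) -> float:
    # Fraction of the suggestion's n-token runs that also appear in the blocklist.
    s = shingles(suggestion, n)
    if not s:
        return 0.0  # too short to judge; a real filter would need a policy here
    return len(s & blocked) / len(s)


def should_suppress(suggestion: str, blocked: Set[Shingle],
                    n: int = 8, threshold: float = 0.5) -> bool:
    # The threshold is arbitrary: too low and common boilerplate gets blocked,
    # too high and lightly edited copies slip through.
    return overlap_ratio(suggestion, blocked, n) >= threshold
```

Renaming a few variables already defeats this, and lowering the window or threshold to compensate starts flagging ordinary idioms that appear in everyone's code.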