"jethrodaniel" does not appear to have the copyright to offer that license, but it's hard for Github to determine that in general, so I doubt they would be liable for the error.
Even if it's somehow available under an MIT license (which is questionable on jethrodaniel's part), there's still infringement. MIT isn't public domain; it still requires:
> The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
Replicating it without complying with those terms is still infringement.
People are being willfully blind here, like cult members staring dead-eyed at their leader and chanting "this is great" as they drink the Kool-Aid.
And from Microsoft no less, once an outcast for mass poisoning.
Actually, the legal system is evidence based. Microsoft has evidence that the code they are reproducing is licensed under MIT, as far as they can reasonably know; there's no definitive way to know who actually owns the original copyright. I could grant permission to use my repo, but maybe I got that code from someone else, who got it from someone else, and so on. It's a similar situation with stolen goods: if you unknowingly purchase stolen goods, you usually cannot be charged with theft as long as there aren't obvious signs they were stolen, such as being priced far below market value.
Microsoft has evidence that the code they are reproducing is MIT licensed, so are they intentionally violating that license or does this AI thing include the license and attribution in every snippet it generates?
Major aspects of copyright infringement are strict liability, like a lot of civil actions around damages. It doesn't matter if you thought it was OK, there's still a damaged party that needs compensation according to the law. At best you'll simply avoid the criminal and punitive penalties.
No, PornHub doesn't have liability in a lot of cases because of 17 § 512, but has still had to deal with liability in general, which is why they nuked some 80% of their library not backed by verified individuals a while back.
A huge part of 17 § 512 is the DMCA takedown process, mainly in 17 § 512(c)(3). Does Microsoft even have the ability to truly remove training data from the model? Or do they have to retrain after each DMCA takedown?
I personally don't want to have to upload proof of identity to GitHub and a signed document swearing that I own the copyright to all the code I upload to GitHub, or proof that I coded it. We need to be careful what we wish for.
> THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
If they had a reasonable basis for believing they had a license, they're in the clear. "I didn't know" might not be enough, but "I had good reasons to think otherwise" is.
I'm not a lawyer, but my understanding is that these are torts, so all you have to prove is that Microsoft is liable. I think that would be easy to prove given the way neural networks work, since the model is effectively just a way of performing a search.
Since it's a tort, I don't think you have to prove they should have known it would return copyrighted code; the fact that it does is enough to establish liability.