Wondering what everyone's thoughts are regarding open sourcing your own code that you've kept under lock and key for a long time, with the rise of AI and inability to enforce licenses or control whether code is used for training.
You aren't going to have a choice - anything you make available on the web will be consumed by AI regardless of the license. That is essentially going to be the primary purpose of the web going forward: to provide training data for AI.
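To be concrete about how little control you have: the only opt-out mechanism anyone offers is robots.txt, and it's purely advisory. You can list the crawlers that have publicly announced themselves (GPTBot, CCBot, and Google-Extended are real tokens; anything beyond that is guesswork), but nothing compels them, or any scraper that doesn't announce itself, to honor it:

    User-agent: GPTBot
    Disallow: /

    User-agent: CCBot
    Disallow: /

    User-agent: Google-Extended
    Disallow: /

A license can't even do that much; robots.txt at least gets read by the well-behaved crawlers.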
I wonder what this means for SEO. If someone develops content and someone else then generates very similar content using AI, how do the search engines know who to rank higher?
I suspect both SEO and search engines as a concept will be made obsolete. SEO doesn't even make sense when machines are creating the content and consuming it. AI is going to be the primary UI for software applications in general, so you won't even really use a browser; you'll talk to an AI (probably integrated directly with your OS) and it will generate content or dynamic applications on the fly. You won't ever even interact with the web directly, much less with human-created content.
Everything evolves. The internet existed before the web, and we navigated its content. The web existed before search engines did, and we navigated its content. Content and ways to navigate it will continue to exist once search engines are no longer relevant. Maybe SEO simply doesn't matter?
You'll need to clarify what your specific goals and concerns are. Most of my code is open source because I want people to build derivatives. There are lots of license variations to consider.
The question is specifically about the problem with licenses, if I understand correctly. AI is now a big license-laundering machine, so there are not "lots of license variations to consider": AI just removes the license, whatever it is.
For the legally paranoid, this is already a "fruit of the poisonous tree" situation.
Some people are informally "banned" from working on whole categories of open source projects because they've had provable exposure to closed source code in the same domain.
That's part of the motivation for maintaining pseudonymity online. If no one knows who you are, and no one knows you've done kernel work at Microsoft, then, hypothetically, neither you nor the open source kernel project you're contributing to can be sued by Microsoft for "inspiration".
There are legal questions here for which there's never been precedent, so nobody knows where the line is -- and this is all before LLMs ever entered the picture. In countries whose courts don't rely on precedent, it's even more of a minefield, since the legal outcome of a case is always undefined behavior.
A lot of companies avoid using GPL code at all, because they don't want to even accidentally find their own proprietary code subject to being released under the GPL.
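That paranoia is why license audits are standard corporate due diligence. Here's a minimal sketch of the idea in Python, assuming an environment where installed packages self-report their licenses in package metadata (they often don't, or do so inaccurately, which is part of why the paranoia persists):

    # Flag installed packages whose declared license mentions the GPL.
    # Metadata is self-reported, so a clean scan proves nothing; LGPL
    # and AGPL will also match, which for a paranoid audit is a feature.
    from importlib.metadata import distributions

    for dist in distributions():
        license_field = (dist.metadata.get("License") or "").upper()
        classifiers = [c.upper() for c in dist.metadata.get_all("Classifier") or []]
        if "GPL" in license_field or any("GPL" in c for c in classifiers):
            print(f"{dist.metadata['Name']}: possibly GPL-licensed, review before shipping")

Real audits use dedicated scanners that inspect the actual source files, precisely because self-reported metadata can't be trusted.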
I can imagine that some similar licensing concept could be applied to copyrightable works with respect to AI training. Use a legally restricted work for AI training, and your entire AI training set is subject to free public release?
No such licensing concept could work. The AI companies' argument is that what they're doing counts as fair use, so they don't need to abide by any license at all. If that argument holds, such a clause would do nothing; if it doesn't, such a clause is unnecessary, because they're already violating even the most lax licenses (e.g., MIT) by failing to provide the required attribution.
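For reference, the MIT license's only real condition is one sentence:

    The above copyright notice and this permission notice shall be
    included in all copies or substantial portions of the Software.

A model that emits a substantial portion of MIT-licensed code without that notice is already out of compliance; no exotic anti-AI clause is required.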
Fair use can be a bit ambiguous, but I think there are definitely grounds to claim that what the AI companies are doing is not fair use.
I imagine it will have to be tried in court for anyone to say for certain. (And I imagine whatever court ruling happens would be contested. We may actually need multiple court rulings...)
> with the rise of AI and inability to enforce licenses or control whether code is used for training.
The inability to reliably enforce licenses has always been a problem with Open Sourcing your code. I don't understand how someone would see AI as a dealbreaker in the real world of Open Source pragmatism.
Even when violators were called out, it took them years to respond. And this was before LLMs; even then we knew that releasing things under Open Source is mostly a good-faith social contract.
These "is it still okay to x because AI exists" questions are besides the point. If you weren't considering the worst-case scenario outcome of your actions already then maybe you should. If AI forces your hand, so be it. But you can always Open Source your code, as long as you've accepted the same risks that existed for the past 20-30 years.
I think this is the wrong question. It assumes that the AI is derivative of your code. The reality is that a day will come, sooner than you'd like, when AI can replicate the application you wrote without ever having seen your code. In that world, what does it matter which code is open and which is closed?
FWIW, learning from copyrighted data should not be a copyright violation. This applies equally to both humans and machines. I see no reason to differentiate.
Well, I hope no one will listen to you and your Luddite friends because generative AI is amazing and one of the best things humans have created in my lifetime. Take all the available data and make amazing and fun things with it! That's what the Internet was created for.
> generative AI is amazing and one of the best things humans have created in my lifetime
Admittedly, humans haven't created a ton of great things in the last couple of decades. The evolution of Tech in the last 20 years is depressing. I guess I can understand how you may consider it "amazing", though I would expect that a teenager could understand how their life is going to get a lot harder in the future, and that Tech is a big part of the problem.
Damn it, I thought I was a miserable prick. It's statistically likely that I'm older than you; just not as bitter yet.
I am optimistic and cheerful considering the mind-blowing advancements in generative AI. Given my good fortune to be alive during this time, I'm pleased to see these developments unfold, and I try my best to disregard the haters who are disparaging one of the greatest achievements in my lifetime.
I, a teenager who created their account here in 2016, instructed a locally run LLM to write the above message.
Sorry, I didn't mean that you were necessarily a teenager. I was just saying that given the last 20 years in Tech, a teenager has only known it getting worse. You could be 60 and think that generative AI is one of the most amazing things invented in your lifetime, for all I know.
Kind of. Making stuff is great; having an AI make it for you is not so great. Or it's kind of great. The art is soulless; the code is wonky. It enables people to do more, but, if anything, I think what we've learned is that taking shortcuts to do more stuff is bad. Hey, let's let people download modules of code other people wrote right into their codebase so it's easier for them to do more stuff! Enter left-pad. Enter a flood of crapware. Enter dependency build issues. Enter more bloated software than you've ever seen before. Enter needing more RAM than ever to compile AUR packages during hour-long updates (assuming no dependency errors happen).
I am a luddite. I didn't use to be, and I don't want to be, and it puts me on the wrong side of history, consigning me to a diverging function of increasing bitterness, but I feel forced into it by the direction tech is going. I think we've bitten the apple in our greed, and now we have to worry about our cars spying on us because they're equipped with an autonomous and intelligent agent of oppression where, just a few years ago, that wasn't possible. "Progress", huh?
Anyway, I had to say that before saying I totally agree with you about your open source philosophy. I hate Microsoft, but, before, Microsoft employees could read my code, take inspiration, and write stuff with it. Hell, they could even paste snippets of it in. I have no way of knowing. Now, I'm uploading my code directly to Microsoft (okay: this fucks with me), and a Microsoft virtual mind is reading my code, taking inspiration, and/or pasting snippets of it in. It's a petty difference. The idea of free software is that anyone, man or machine, can see it and learn from it.
If you truly believe in the idea of free software, you should de facto be okay with AIs, Nazis, terrorists, $MEGACORP_YOU_DESPISE, ANYONE using your code.
The biggest differences are the increased probability that someone else will now be using your code and that you won't be credited or your GPL license will be violated, but I consider those pretty ethereal issues.
On the one hand, "working in tech" has become so accessible that titles like "prompt engineer" no longer sound completely ridiculous. Those people don't seem interested in understanding how technology works, but rather in making a profit off whatever low-hanging fruit they find.
On the other hand, actually competent engineers get paid a ton to ignore any kind of ethics. They just get rich while having fun building tech that the first group will use.
The combination of both (unethical tech and not-so-competent "engineers" building on top of it) is destroying the world.