In a world with 8 billion people and billions before you, if you have to ask “Am I the only one that _____”, just quietly whisper to yourself “no, I’m not”, and move along.
Do you only care about things that affect you directly, personally, and negatively? There are a lot of things that don't affect me personally that I still care about.
I don’t write open source software under a restrictive license, but I understand why some people do. I respect that, and I probably use a lot of that software without realizing it. I can understand why they’d be bothered by theft of their work, and I could see downstream effects, such as them stopping their work, that would eventually affect me.
No, you are not the only one. There are unfortunately plenty of people who don't care about massive corporations asserting ownership over the work of other people whilst jealously guarding their own IP, and even some apologists for them.
Agreed, it's in the same vein as self-driving cars.
Sure, it could put a lot of programmers out of a job in theory.
In practice, the emergency brake alert goes off for cars waiting to turn in the oncoming lane, the blind spot warnings trigger on cars that are ahead of you, the lane departure warnings are usually wrong, and so on.
I'm not worried, and I prefer to avoid the uncertainty even if it means outsourcing a bit less effort to the magic AI pixies.
I am ignoring it for the time being, and maybe I am misunderstanding its usefulness. I'm having a hard enough time with less experienced engineers generating large blocks of code and pushing it on me to play find-the-errors; the idea of an AI doing the same as part of my workflow doesn't sound very appealing.
Seriously! Some of the most frustrating coworkers I've had to deal with over the years are the ones who just churn out reams of lightly-considered code, as though they are getting paid by the line.
I would be far more impressed with an AI which could spot hidden redundancies and suggest opportunities for code reduction.
I trialled it. I found that most of the code it wrote was wrong, so I didn't bother paying for a subscription. It was like pairing with a super enthusiastic developer who is eager to write code at every opportunity but doesn't understand what we're doing.
My problem is that the Copilot (actually Codex) training data and model aren't available. If the model was derived from open source code, that derivative work must be released.
I'm bewildered why nobody else has brought this up.
If it just took patterns out of open source and isn't overfit (there is evidence that it is overfit, though), it seems OK. Viewing code doesn't actually taint you; clean-room reimplementations are done in a strict manner at corporations to remove all doubt legally, but they aren't an actual legal requirement.
Why should I care if you care? If you value your time, you'll maybe be interested in having it write unit tests for you (It worked for me™). If you value protecting intellectual property, you might care that it's sucking up a lot of it and making it available to people who do care. I think it's a losing game, because there are enough people willing to give it their intellectual property for free that by the time you get your stuff removed, it's already found something similar somewhere else (Drainage, my boy!)
> I think it's a losing game, because there are enough people willing to give it their intellectual property for free that by the time you get your stuff removed, it's already found something similar somewhere else (Drainage, my boy!)
I don't think many would find that objectionable; I don't think people who object to the manner in which GitHub (and OpenAI, etc.) has gone about this generally object to the existence of Copilot (or large language models or image-generating AIs) itself, so I don't think we'd be "losing a game". I personally would see it as a great result if these companies were respecting licensing in a rigorous way, or otherwise ensuring that this was more of an opt-in system (and I'm not a supporter of intellectual property broadly).
Ahh yeah, two points got mushed together. I agree they should fight for their IP. I meant more that the value of their snippet of code will go down as others fill in with their own IP.
I think most of the authors of open source code only care that their code is being stripped of credit, not that it's being used.
It's not a lot to ask.
If it IS somehow an impossible technical challenge that this commercial entity can't manage, but they still want to pursue the potential of this technology, they are free to pay other commercial software developers for licences that grant this usage explicitly, just like any other commercial use/redistribution license.
If that pool of software is too small to be as useful, and too expensive to be practical, so what? Tough shit.
I see no reason to excuse how they are currently not even bothering with either option, or addressing the issue in any other way; they are simply outlaws right now.
I care, very much. I will no longer host any new personal projects on GitHub, and I will not use GitHub at companies I lead. I'm looking for alternatives, the major one at the moment being GitLab.
For me the most interesting thing about using large language models is they offer a kind of conversation with the average of the human data they were trained on. They surprise me by telling me when I'm doing something boring.
When Copilot guesses the next method name or comment I was going to type, it's doing that by saying "this would be the most boring, average string of tokens to come next, so here you go," and it's fascinating how often that's right -- how often I'm wrong about how surprising the next line was. It's like how terrible humans are at generating unique passwords, except for everything I type. Copilot doesn't help by knowing things I don't, because it doesn't know anything, but it does help by guessing what I was obviously going to do next without me having to call out to memory.
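You can poke at the same idea with a toy sketch like the one below. To be clear, this is not Copilot's actual model or API; it's just greedy next-token decoding with the public GPT-2 model via the Hugging Face transformers library, which makes the "most boring, average continuation" behavior concrete:

    # Toy sketch, not Copilot's real model: greedy next-token decoding
    # with the public GPT-2 model via Hugging Face transformers.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")

    prompt = "def get_user_by_id(user_id):\n    "
    # do_sample=False picks the single most probable token at each step,
    # i.e. the most "boring, average" string of tokens to come next.
    out = generator(prompt, max_new_tokens=20, do_sample=False)
    print(out[0]["generated_text"])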
Once I have access to that average-of-humanity information for a while, I start to want it for the rest of my life too. OK, fine, that's the next method name I was going to write. [tab, autocomplete]. OK, fine, that's how I was going to close out my email. [tab, autocomplete]. Well, huh, I wonder if it knew what I was going to type next on the command line? [yes, probably]. I wonder if it knew which things I was going to buy in the grocery store? [yes, probably]. It starts to feel limiting to not have access to what the average next step in the sequence would be.
And then it turns out that average-of-humanity models have all kinds of potential impacts on political power and labor and property law and so on, so all of that is pretty interesting too. But for me it starts with just poking at the model and going, oh, hey, it's ... everyone, how are you all doing?
I find the legal and ethical implications more interesting than the technology as it stands. Maybe I just don't do the kind of work it's suited to, but I didn't find it terribly useful when I gave it a go. I had an experience that went "oh wow, that turned out a lot of code fast and it looks pretty good" followed by "oh, it made the exact subtle mistakes everyone makes when they write this kind of thing". No surprise given the nature of ML.
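To illustrate the kind of subtle mistake I mean, here's a hypothetical example (my own sketch, not actual Copilot output): a completion that builds SQL by string formatting looks perfectly reasonable and works on happy-path input, but it's injectable.

    import sqlite3

    # Hypothetical illustration, not actual Copilot output.
    def find_user_plausible(conn, name):
        # Looks fine and works on normal input -- but is open to SQL injection.
        return conn.execute(f"SELECT * FROM users WHERE name = '{name}'").fetchall()

    def find_user_correct(conn, name):
        # Parameterized query: what you actually want.
        return conn.execute("SELECT * FROM users WHERE name = ?", (name,)).fetchall()

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice')")
    print(find_user_correct(conn, "alice"))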
On the other hand, I find myself torn between "information wants to be free and this is one more nail in the coffin of our odd and ahistorical concepts of 'intellectual property' and 'plagiarism'" and "oh great, another way for giant corporations to reap all the benefits from work done by individuals and smaller businesses".
I don't think I'll ever use the thing, and I have ~0 power over the societal implications, so overall - yeah, can't get that exercised over it either.
Systems like Copilot and Dall-E and so on turn their training data into anonymous common property. Your work becomes my work. This may appeal to naive people (students, hippies, etc.), for whom socialist/communist ideas are attractive, but it's poison in the real world.
These systems are a mechanism that can regurgitate (digest, remix, emit) without attribution all of the world's open code and all of the world's art.
With these systems, you're giving everyone the ability to plagiarize everything, effortlessly and unknowingly. No skill, no effort, no time required. No awareness of the sources of the derivative work.
My work is now your work. Everyone and his 10-year-old brother can "write" my code (and derivatives), without ever knowing I wrote it, without ever knowing I existed. Everyone can use my hard work, regurgitated anonymously, stripped of all credit, stripped of all attribution, stripped of all identity and ancestry and citation.
It's a new kind of use not known (or imagined?) when the copyright laws were written.
Training must be opt in, not opt out.
Every artist, every creative individual, must EXPLICITLY OPT IN to having their hard work regurgitated anonymously by Copilot or Dall-E or whatever.
If you want to donate your code or your painting or your music so it can easily be "written" or "painted", in whole or in part, by everyone else, without attribution, then go ahead and opt in.
But if an author or artist does not EXPLICITLY OPT IN, you can't use their creative work to train these systems.
All these code/art-washing systems that absorb and mix and regurgitate the hard work of creative people must be strictly opt-in.
If you don't care about this, it's naivete, or a lack of foresight, or apathy as these companies pillage the commons. Not something to be proud of.
Microsoft and OpenAI (and others) are robbing us and you should care.
Personally I think the modern hero-worshipping auteurship culture is far more harmful than anything else. You don't matter, your name doesn't matter. Culture is commons almost by definition; all works are derivative and stand on the shoulders of anonymous giants.
"All works are derivative" does not imply that just any kind of derivation is or should be OK. I can't write a Harry Potter novel and profit off of it (without being sued), for example, so clearly there are legal and ethical frameworks in place to prevent some kinds of "standing on the shoulders of giants".
Frankly, it almost seems like you're saying that since all works are derivative I shouldn't care how my own works get used -- it should all be thrown in the commons for anyone, including giant corporations, to use as they wish. Is that the case? It's not obvious to me why.
IP rights are a cancer on society and have held back progress for generations.
From drugs to technology, I would say IP rights have done more harm than good, especially when you consider the number of patent trolls that waste countless hours and money.
Humans are not islands, nothing you make is yours. It is likely a remix of someone else's idea or influence.
I believe in property rights, but patenting ideas leads to such perverse incentives that the benefits rarely outweigh the costs. Oftentimes the person isn't even the first to come up with the idea! They're just the first to patent it.
Look at the field of psychedelics for example, where there's a mad scramble to patent things as absurd as the process of administering the psychedelic.
With open source software you receive something very valuable: social capital, credit, respect.
Communism is where you do the work (or don't) and everybody is rewarded equally.
That's the Copilot model: you work hard to write the code and receive nothing in return, because now everyone and their 10-year-old brother can write it simply by asking for it... and your name never appears.
I wasn't robbed; I gave it away on purpose. I don't care if my name appears; credit was never the point. Sharing back to the community which has shared so much with me is enough.
The reason I'm thinking maybe it's not so bad is this: which snapshot does it use? How does it know which snapshot of an algorithm is correct and which has a bug? What if you have code that's perfectly fine, but then you add several new branches, each one with a different critical security vulnerability added intentionally and malevolently? Will the AI be able to detect these vulnerabilities and choose the correct version? It seems so easy to sabotage this thing. Why would anyone be stupid enough to use it? This thing seems like the perfect stupidity detector.
Here is the MIT License. Note the very important requirement for attribution ("you must give me credit for my hard work") italicized below:
Copyright <YEAR> <COPYRIGHT HOLDER>
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
*The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.*
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
At least from my perspective as a web developer, I just never found it all that useful. When I finally got an invite to it a few months ago, I used it for a bit, but it always just felt like more of a toy than a real tool. If you're someone who's learning to code, then Copilot looks like this fantastical tool that can write all your code for you, and that's why I think it gets so much attention. But from my perspective, Copilot just completely breaks down inside a large, established codebase. It's great if you're standing up an app for the first time, but once you lay down standards, common services and components, etc., Copilot just starts getting in your way.
I was excited when it first came out but I'm just over it now.
I tried it out on the kind of work I do (Spark data pipelines, machine learning), and found it thoroughly useless. Getting it to generate code that was remotely close to what I wanted was pointlessly difficult. I tried throwing it at a parsing function in a date library I maintain and it spit out nonsense that, I suppose, looked like it was doing something reasonable (but wasn't, which is worse).
Maybe it's good for some things, but I wasn't impressed. My job's safe for a little while at least.
I gave it a try in PyCharm about a year ago. Not once did it give me a good suggestion. It gave lots of suggestions, to be sure, but they were all worthless.
The way it felt to me, it samples all kinds of other people's code (probably from GitHub; who owns the copyright here?) and pastes it in. Except, what's the quality of that code? I'm by no means a top developer, but the recommendations were always trash and not what I wanted.
I think if you write code for a living then you should care and see what it does. Some folks might find it gives them a productivity boost. Some might have serious misgivings about the output quality / IP / security aspects. It might do nothing for you, as it does nothing for me, and you move on. But I guess I think it's something one should keep an eye on.
Yeah, it's basically concerning whether you have code on GitHub or you want to play with Copilot. I'd love a reply from anyone who uses Copilot in a business setting, because to me it seems like a nightmare to use given the probability of it being a liability.
It was available to everyone who asked; now it costs a monthly fee. I requested to try it out, but by the time they said yes, it had transitioned to paid. Since I don't seriously want to use a tool like this, I am not interested.
It's common for young people not to care what Microsoft does when they haven't experienced a different world. Apple doesn't look much different from the MS of today, so it seems perfectly reasonable to see software slowly weaponized against developers. Microsoft has viciously abused its market share in the past, and it's likely they will do so again. How does that affect you? It doesn't seem to right now (unless you're unlucky, right?). Check back in 20 years.
It's basically thinking "this doesn't affect me now, so it will never affect me" combined with "as long as it isn't my problem, I refuse to care about it".
I have no interest in it either. I am fascinated, though, by how software developers have played themselves again. No wonder management and corporations treat most of them like children.