It costs money to run a huge language model with low latency, in the loop with you - charging $10/month is reasonable. You need multiple GPUs just to load a single copy of the model. Copilot adds something beyond the original code: it selects a recommendation from the whole corpus while taking the surrounding context into account and adapting to your variable names.
And in reality 99.9% of the generated code shares no long n-grams with the training set - it's already original. All they need to do is enforce that the model never emits output identical to the training set, something that can be implemented with a Bloom filter; then the generated code is impossible to attribute and should pose no legal problems.
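To make the Bloom-filter idea concrete, here is a minimal sketch. It indexes every n-gram of a (toy) training corpus in a Bloom filter and then checks whether generated code contains any n-gram that might appear in the training data. All names here (`BloomFilter`, `overlaps_training`, the toy corpus) are illustrative, not from any real Copilot implementation; a production filter would use a much larger bit array and longer n-grams.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hash positions over a fixed-size bit array."""
    def __init__(self, size=1 << 20, num_hashes=5):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = bytearray(size // 8)

    def _positions(self, item):
        # Derive k independent positions by salting the hash input.
        for i in range(self.num_hashes):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all((self.bits[pos // 8] >> (pos % 8)) & 1
                   for pos in self._positions(item))

def ngrams(tokens, n=4):
    # Sliding window of n consecutive tokens (n=4 only for this toy demo).
    return (" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Index every n-gram of the "training corpus" (a single toy line here).
training_code = "def add(a, b): return a + b".split()
bf = BloomFilter()
for g in ngrams(training_code):
    bf.add(g)

def overlaps_training(generated, bf, n=4):
    """True if any n-gram of the generated code may occur in training data.

    Bloom filters can give false positives but never false negatives,
    so a False answer guarantees no verbatim n-gram overlap."""
    return any(g in bf for g in ngrams(generated.split(), n))
```

In a generation loop you would call something like `overlaps_training` on each candidate completion and resample (or trim) any that trips the filter; because the filter never gives false negatives, everything that passes is guaranteed free of verbatim n-gram overlap with the indexed corpus.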
In the end, what do models like Copilot do? They act like culture - absorbing and replicating memes. They free up knowledge and make it reusable. They can act as a general-purpose NLP tool for information extraction, classification, and text generation. You can implement your ideas faster with them, and you don't need to label much data.
It works even with just a prompt. Try OpenAI Codex on extracting a receipt to see what I'm talking about - it gives you the output in JSON. It's a new tool and a new interface to the computer. There are going to be plenty of open-source implementations as well; some are already in training.
You are incorrect. The code it generates is substantially the same (complete with comments) as the input, which is often sourced without permission and in violation of its license.
And it offers nothing back to those authors in return.
Thank you for this. I would never have been able to articulate it better - people are just annoyed that someone is making money and they aren't, without considering why that is.