Just a random thought: since they are already earning money by scraping data, would it not be nice if they paid the site owners a share of the money they earn?
That sounds like an argument against any bot visiting a monetized website.
Some people publish content freely on the Internet as a form of note taking, publicity, public discourse or for the betterment of like-minded individuals, akin to why we’re here commenting on HN.
I guess I assume public content defaults into this “for the benefit of the world” category, where it’s up to the publisher to gate content as desired.
No, that's an argument against any bot visiting any website for the purpose of repackaging and redistributing information, regardless of the motivation the website was conceived for.
Public content is still mostly published with attribution to a named or anonymous individual/organization, and it gains visibility based on its value and the effort made to get it seen. That individual/organization is still motivated by the visibility, popularity, acceptance and approval of that content.
We can summarize that people are motivated by a reaction. What do you think will happen when you remove or decrease reaction to knowledge/content providers?
Some would argue that it allows for cutting edge research that could potentially massively benefit humanity. Whether or not that pans out is to be determined, but that is the stated goal. The point, as advocates for this technology believe, is the betterment of civilization: training a neural network on this knowledge will make that knowledge more accessible to the rest of us.
One could of course debate whether or not OpenAI would be the best stewards of that knowledge or aligned with the best interests of humanity. However, it is important to recognize that building a successful business is key to funding the research; H100s aren't cheap. It's also important to note that, as with all things tech, the price of hardware will go down, and OSS models continue to get more capable every week.
Nobody is doubting that accessible and correct information is good for humanity. I am questioning how that will affect knowledge/content providers.
I repeat, what will be the motivation of an individual to share or provide valuable information if you decrease or eliminate any control of where and how that information appears?
> I repeat, what will be the motivation of an individual to share or provide valuable information if you decrease or eliminate any control of where and how that information appears?
Forum users don’t seem to mind. Reddit, HN, Twitter, Facebook, etc. are all examples of users freely providing valuable content without expectation or control.
I suppose it’s also not too different from a listener summarizing a speech. When you speak publicly you don’t get to control who hears it or how they will interpret it.
Very late to reply. Still, you take part in a community, you have an identity, and you get likes, karma, followers and credibility. Motivation is present just like in monetized systems, and you choose where your content appears and it's tied to your account.
In the meantime, I did realize I might have overblown the consequences of what a product collecting and summarizing knowledge might cause.
Even something as simple as a blog accrues the author some small reputational benefit.
Of course if people do conclude that there’s no benefit to sharing knowledge and stop doing so then those who do share knowledge will have an outsized impact on AI training. In the extreme: the opportunity to create truth. Thus the incentive to publish in order to stop those people is created.
Oh gee, an opportunity for volunteers to create truth versus resourceful companies and organisations eager to share their own version of truth, on a system whose workings barely anyone can comprehend, run by a for-profit company whose actions anyone can predict.
Sorry for the sarcasm, but your comment is essentially "Let's remove motivation for those who had incentive to share valuable information and see how it turns out."
Then you will find people who will share low-value (but easy to create) content that is used everywhere, a bit like these "isOdd", "isEven" NPM packages.