
"Stealing data" seems pretty strong. Web scraping is legal. If you put text on the public Internet other people can read it or do statistical processing on it.

What do you mean he was "stealing data"? Was he hacking into somewhere?




In a lot of ways, the statistical processing is a novel form of information retrieval. So the issue is somewhat like if, 20 years ago, Google had been indexing the web and then decided to just rehost all the indexed content on its own servers and monetize the views instead of linking to the original source of the content.


It’s not anything like rehosting though. Assume I read a bunch of web articles, synthesize that knowledge, and then answer a bunch of questions on the web. I am performing some form of information retrieval. Do I need to pay the folks who wrote those articles even though they provided them for free on the web?

It seems like the only difference between me and ChatGPT is the scale at which ChatGPT operates. ChatGPT can memorize a very large chunk of the web and keep answering millions of questions, while I can memorize a small piece of the web and only answer a few questions. And maybe, due to that, it requires new rules, new laws, and new definitions for the good of society. But it’s nowhere near as clear-cut as the Google example you provide.


I love this argument.

"Seems like only difference between me and ChatGPT is absolutely everything".

You can't be flippant about scale not being a factor here. It absolutely is a factor. Pretending that ChatGPT is like a person synthesizing knowledge is an absurd legal argument; it is absolutely nothing like a person, it's a machine at the end of the day. Scale absolutely matters in debates like this.


Why?


Why not? A fast piece of metal is different from a slow piece of metal, from a legal perspective.

You can't just say that "this really bad thing that causes a lot of problems is just like this not-so-bad thing that hasn't caused any problems, only more so." Or at least it's not a correct argument.

When it is the scale that causes the harm, stating that the harmful thing is the same as the harmless thing except for the scale is, well, weird.


>> A fast piece of metal is different from a slow piece of metal, from a legal perspective.

I’d like to hear more about this legal distinction because it’s not one I’ve ever heard of before.



So there isn’t a legal distinction regarding fast/slow metal after all. Well, that revelation certainly makes me question your legal analysis about copyright.


[deleted]


“Slow” doesn’t show up when I do a ctrl+F, so again, it seems like you’re just confused about how the law works?


So in your view, when a human does it, he causes a minute amount of harm, so we can ignore it, but ChatGPT causes a massive amount of harm, so we need to penalize it. Do you realize how radical your position is?

You’re saying a human who reads free work that others put out on the internet, synthesizes that knowledge, and then answers someone else’s question is committing a minute amount of evil that we can ignore. This is beyond weird, I don’t think anyone on earth/history would agree with this characterization. If anything, the human is doing a good thing, but when ChatGPT does it at a much larger scale it’s no longer good, it becomes evil? This seems more like thinly veiled logic to disguise anxiety that humans are being replaced by AI.


> This is beyond weird, I don’t think anyone on earth/history would agree with this characterization

Superlatives are a slippery slope in argumentation, especially if you invoke the whole of humanity, the whole of the earth, and the whole of history. I do understand bmaco's theory, and while not a lawyer, I’d bet whatever you want that there’s more than one jurisdiction that sees scale as an important factor.

Often the law is imagined as an objective, cold, indifferent knife, but often there are also a lot of "reality" aspects like common practice.


> So in your view, when a human does it, he causes a minute amount of harm, so we can ignore it, but ChatGPT causes a massive amount of harm, so we need to penalize it. Do you realize how radical your position is?

Yes, that's my view. No, I don't think that this is radical at all. For some reason or another, it is indeed quite uncommon. (Well, not in law; our politicians are perfectly capable of making laws based on the size of a danger/harm.)

However, I haven't yet met anyone who was able to defend the opposite position, e.g. slow bullets = fast bullets, drawing someone = photographing someone, memorizing something = recording something, and so on. Can you?


Don’t obfuscate: your view is that the Stack Overflow commenter, the Quora answer writer, the blog writer, in fact anyone who did not invent the knowledge he’s disseminating, is committing a small amount of evil. That is radical and makes no sense to me.


> Don’t obfuscate: your view is that the Stack Overflow commenter, the Quora answer writer, the blog writer, in fact anyone who did not invent the knowledge he’s disseminating, is committing a small amount of evil.

:/ No, it's not? I wrote "hasn't caused any problems" and "harmless". You've changed that to "small harm", a change that I indeed missed.

I don't think that things that don't cause any problems are evil. That's a ridiculous claim, and I don't understand why you would want me to say that. For example, I think 10 billion pandas living here on Earth with us would be bad for humanity. Does that mean that I think that 1 panda is a minute amount of evil? No, I think it's harmless, maybe even a net good for humanity. I think the same about Quora commenters.


Yes, that dichotomy is present everywhere in the real world.

You need lye to make proper bagels. It is not merely harmless, but beneficial in small amounts for that purpose. We still must make sure food businesses don't contaminate food with it; it could cause severe, possibly fatal, esophageal burns. The "a little is beneficial but a lot is deleterious" principle also applies to many vitamins… water… cops?

Trying to turn this into an “it’s either always good or always bad” dichotomy serves no purpose but to make straw men.


Not a nice interpretation of what is being said.

Clearly there is nuance: society compromises on certain things that would be problematic at scale because they benefit society. Sharing learned information disadvantages people who make a career of creating and compiling that information, but, you know, humans need to learn to get jobs and acquire capital to live and then, eventually, die, and that information dies along with them.

Or framing the issue another way, people living isn’t a problem, but people living forever would be. Scale/time matters.

Here again I’ve fallen for the HN comment section. Defend your viewpoint if you like; I have no additional commentary on this.


Some webpages force you to agree to an EULA that may preclude web scraping. The NYTimes is such a webpage, which is why OpenAI was sued. This is evidence that OpenAI didn't care about the law. Someone with internal communications about this could completely destroy the company!!!





