So, I looked at the table appendix you're referencing and I think you're overstating your case a bit.
Among books within copyright, GPT-4 can reproduce Harry Potter and the Sorcerer's Stone with 76% accuracy. This is, apparently, the highest accuracy GPT-4 achieved among all tested copyrighted books with 1984 taking a distant 2nd place at 57%.
With this in mind, we can verifiably say that GPT-4 is unusually good at specifically reproducing the first Harry Potter book. An unscrupulous book thief may very well be able to steal the first entry in the series... assuming that they're able to get past one quarter of the book being an AI hallucination.
You misread. They did not find 76% reproduction of the book. The task was to fill in a masked name within a passage, e.g. prompt: "Stay gold, [MASK], stay gold." and expected response: "Ponyboy". GPT-4 got the name right 76% of the time.
> You misread. They did not find 76% reproduction of the book. The task was to fill in a masked name within a passage, e.g. prompt: "Stay gold, [MASK], stay gold." and expected response: "Ponyboy". GPT-4 got the name right 76% of the time.
What is the temperature / top_p setting producing that 76%? The default? If you dial down the randomness, would that number go up?
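For what it's worth, here is a rough sketch of why the sampling setting could matter. It is not the paper's evaluation code, and I don't know which settings they used. The logits below are made-up numbers for a single passage, and `name_cloze_accuracy` and `softmax` are hypothetical helper names. The first function scores a fill-in-the-name task by exact match; the second shows how temperature reshapes the probability of sampling the model's top candidate.

```python
import math

def name_cloze_accuracy(predictions, answers):
    """Fraction of masked names guessed exactly (case-insensitive)."""
    hits = sum(p.strip().lower() == a.strip().lower()
               for p, a in zip(predictions, answers))
    return hits / len(answers)

def softmax(logits, temperature=1.0):
    """Turn logits into sampling probabilities at a temperature > 0."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Assumption: invented logits for one passage; index 0 is the correct name.
candidates = ["Ponyboy", "Johnny", "Dally"]
logits = [2.0, 1.0, 0.5]

for t in (1.0, 0.5, 0.1):
    p_correct = softmax(logits, t)[0]
    print(f"temperature={t}: P(sample '{candidates[0]}') = {p_correct:.3f}")
```

In this toy setup, lowering the temperature concentrates probability on the top-ranked name, so accuracy rises on passages where the model's top guess is already correct. On passages where the top guess is wrong, the same change pushes accuracy down, so the net effect on the 76% depends on how often the most likely name is the right one.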