Right, and whether you call it "General AI" or "trivial Python script", my complaint stands: it's a misfeature for the user, the novice reader, the English-as-a-foreign-language user, who relies on a machine review that tells them reading Joyce is "easy English". That would seriously suck if it happened to someone, though I assume it's statistically unlikely (particularly given that Joyce-is-difficult-English is a widely known meme). It'd be an unpleasant experience, like being told glue is tasty on a pizza.
I *get* that my opinion is an unpopular, minority one, so I accept the downvotes and ridicule, fine. This is the minority viewpoint I hold, stubbornly stand by, and will die on this hill for: that it's disrespectful to users to inject unvetted machine scoring into book reviews; it's a malfeature and should not be a socially accepted practice. Treat the human user with awed respect; where you can help them, help, and where you don't know, say nothing—don't let loose some talking Python script. The user doesn't know the limitations of your script; the user doesn't know the language you posted on your page isn't authoritative language and is prone to major errors.
> That it's disrespectful to users to inject unvetted machine scoring into book reviews
Very, very far from being unvetted. This algorithm has been used, unchanged, for the 50 years since Flesch–Kincaid was developed. I've used this metric for my entire life as a rough indicator of difficulty, and it is widely accepted. But it's a limited metric: it has only two factors, rating text as more difficult when it has more words per sentence and more syllables per word. It's a good heuristic, but as with all heuristics, there will be edge cases, and Ulysses is one of them.
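To make concrete how simple the metric is, here's a rough sketch of the Flesch Reading Ease formula (the standard constants are 206.835, 1.015, and 84.6; the syllable counter below is my own naive vowel-group heuristic, not what any production tool uses, so treat the numbers as approximate):

```python
import re

def count_syllables(word: str) -> int:
    # Naive heuristic: count vowel groups, with a crude silent-e
    # adjustment. Real tools use dictionaries or better rules.
    groups = re.findall(r"[aeiouy]+", word.lower())
    n = len(groups)
    if word.lower().endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def flesch_reading_ease(text: str) -> float:
    # Higher score = easier text; ~90+ is "very easy", ~30 and
    # below is "very difficult".
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))

print(round(flesch_reading_ease("The cat sat on the mat."), 1))  # → 116.1
```

That's the whole mechanism: two ratios and three constants. It never looks at vocabulary, allusion, or syntax, which is exactly why a stream-of-consciousness text full of short common words can score as "easy".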
As I do with all critiques, I guess I'd ask you to make a better suggestion for Standard Ebooks. Given their resources, and the available alternative of "have a panel of diverse humans read every book and grade its difficulty", your position is dangerously close to letting the perfect be the enemy of the good. Is your argument that Standard Ebooks would be a better product if they didn't include reading ease metrics? If so, I respectfully disagree.
> Treat the human user with awed respect; where you can help them, help, and where you don't know, say nothing—don't let loose some talking Python script. The user doesn't know the limitations of your script; the user doesn't know the language you posted on your page isn't authoritative language and is prone to major errors.
I don't think this is fair. Reading ease has flaws, but is widely accepted (although seemingly poorly understood, despite its simplicity). The guy who runs readable.com (DaveChild) responded to a post on Reddit about reading scores a few years back (that thread was also filled with tons of misinformation about how this is some black-box AI algorithm that's making everyone stupid), but his comment was quite well-grounded:
> Readability scores are fairly crude, almost by design, because they were all created at a time when they had to be worked out without computers. But they do give a decent idea of the overall readability of a piece, and that helps you to see if your content is too wordy. They are not, by themselves, an indicator of quality. They are not a substitute for proofreading and editing. But they are a useful tool to have in your arsenal.
This is a balanced, practical opinion. Life is filled with proxy metrics that are flawed, from insurance risk and credit ratings to SAT scores and the ability to do whiteboard-coding. In context, I think Standard Ebooks made exactly the right choice to incorporate some measure of reading ease in their offering, even if it doesn't get it 100% right 100% of the time.
I see several people calling this an edge case. That might well be, but how about giving us something to compare it to, in the realm of early- or pre-20th century novels?
- "Very, very far from being unvetted. This algorithm has been used, unchanged, for the 50 years since Flesch–Kincaid was developed."
I mean that the instance is unvetted: the machine score is generated automatically and placed on the website automatically, with no human in the loop checking whether it's reasonable. I don't mean that the general algorithm is unreviewed.
- " But they do give a decent idea of the overall readability of a piece, and that helps you to see if your content is too wordy. They are not, by themselves, an indicator of quality. They are not a substitute for proofreading and editing. But they are a useful tool to have in your arsenal."
This is very fair.
- "Life is filled with proxy metrics that are flawed, from insurance risk and credit ratings"
And in the EU, a lot of them are very rightly illegal to score algorithmically for important decisions without manual oversight, because of the possibility of egregious and unaccountable machine error. The trend of abdicating human agency is not, overall, a wholesome one.
I'm coming from a place where I do read books (despite the fact that I write HN comments like an illiterate stoned baboon; I'm trying my hardest, really I am), and they come lovingly edited by obsessed people who put probably thousands of hours into editing each one, individually, with commentary essays that run 50-100 pages, fastidiously crafted to guide the novice explorer. Standard Ebooks is neither a publisher nor attempting to replace publishers. But: it's viscerally disturbing to me to see robots taking the hallowed place of human scholars in annotating (in this narrow example, scoring) books, and when they go badly wrong, as with this Joyce example, it's very upsetting, and makes me (irrationally?) think there's some terribly dangerous cultural normalization of replacing authentic human intelligence with fake, stupid, hopelessly lost machine imitations. And we'll lose many valuable things, and our humanity, in the process.
I sincerely apologize to anyone I've annoyed with this (I infer I've annoyed a lot of people). I'm just very upset with seeing fake machine stuff everywhere.