
No, Twain was satirising the whole ridiculous idea that trying to regularise the spelling of English is a good idea.

English has homonyms and homophones in part because it is always open to taking words from other languages. In fact "We don't just borrow words; on occasion, English has pursued other languages down alleyways to beat them unconscious and rifle their pockets for new vocabulary."

Any attempt to force English to be strictly 'pronounced as written' will fail in the future even if it succeeds in the present, because someone will import a word from another language that sounds like an existing word, or invent a new one unaware that a word of the same pronunciation already exists, or just attach a new meaning to an existing word.

See https://wiki.c2.com/?PurityOfEnglish and https://library.conlang.org/blog/?p=116




That argument makes no sense. English is far from alone in borrowing heavily from other languages. Loanwords are one of the most obvious cases of linguistic contact and can be found everywhere throughout the world. There are multiple ways how a language can deal with such loanwords: they can keep the original spelling, making those words an irregular exception to the pronunciation rules, or they can adapt the spelling and pronunciation to the borrowing language, or anything in between. Often it starts as the former and at some point migrates towards the latter.

Meanwhile, there is absolutely no reason why English couldn't in principle have a much more phonetic or regular spelling system, except maybe dialectal variation (which you also do have in other languages with more regularised spelling, though).

Now, the bigger reason a huge spelling reform would fail is that it would be a massive break in continuity: people would have to relearn, new spellings would look totally unnatural, past text would at some point become incomprehensible, etc., so I don't think there is any practical way of getting there.


> There are multiple ways how a language can deal with such loanwords: they can keep the original spelling, making those words an irregular exception to the pronunciation rules...

Isn't that the point? IIRC, English was originally about 50/50 Saxon and French - most of the fancy / unintuitive / rule-breaking words are French loan words (e.g., "rendezvous" & "accomplice" are French, "loan" is Old Norse, "word" is Old English). The combination happened when Britain spent a while under Norman rule. The commoners were largely Saxon & Norse, the rulers were French. Eventually the languages simply merged, with grammar rules and vocabulary taken from both sides.

I believe the first edition of the Webster dictionary in the 1800s was when the spelling was standardized. At that point, Webster looked at the original language of the words to both define them and figure out a correct spelling.

Edit: additional recollections, formatting.

Edit2: I'm being downvoted?


The "English creole" hypothesis has been discussed a lot and it is true that English is a bit special within Europe in having absorbed a lot of material from other languages such as French and Norse, but I still think that it is very recognisable as a Germanic language (both grammatically as well as from the perspective of "core vocabulary"), so I don't think that the languages "merged" in the same way as this maybe happened in the case of some recognised creoles. (See e.g. here for a summary: https://en.m.wikipedia.org/wiki/Middle_English_creole_hypoth...)

There is still no reason in principle why English couldn't have chosen a more regular system of spelling, especially at a time where literacy was low anyway. The idea that the spelling is irregular because you have several competing systems (French vs. Anglo-Saxon vs. Norse) might sound compelling, but I don't think it holds water, cf. homographs such as "sow" (female pig) and "sow" (to plant) which both have Old English origins.

(Also I don't think that "rendez-vous" is a good example. I don't have hard evidence, but I would find it entirely implausible that this is a word of Norman origin, it's much more likely to be a late import from the 18th or 19th century, when French was fashionable, not at all comparable to something like "pork".)


> There is still no reason in principle why English couldn't have chosen a more regular system of spelling, especially at a time where literacy was low anyway.

I mentioned that farther downthread. The original Webster's dictionary served exactly this purpose - in the early 1800s, his stated goal (IIRC) was to give the new USA a standardized language to help differentiate it from the country they'd split off from. Also mentioned downthread, he based his standardization on the languages the words were loaned from.


You ignored the part of my comment where I showed that English spelling is inconsistent even between words that have purely Old English origins. I really don't think your hypothesis holds water. The chaotic spelling of English is a historical coincidence and not some necessary consequence of how the English spoken language developed.


There has to be a rule for rules to be broken. English orthography is practically irregular (although it's vaguely, chaotically regular if you subject each word to an analysis based on most probable language origin). Figuring out how to pronounce a word you haven't seen before in English is only slightly easier than trying to figure out how to pronounce a word you haven't seen before in Mandarin.

> Isn't that the point? IIRC, English was originally about 50/50 Saxon and French - most of the fancy / unintuitive / rule-breaking words are French loan words (e.g., "rendezvous" & "accomplice" are French, "loan" is old Norse).

Also, it's far worse than this. The Normans changed the spelling of English words that were unpronounceable to them by adding a bunch of letters. One I remember is that the reason the "-shire" suffix is confusing is because before the Normans it was just "-scr".

Also, British English tends to Anglicize French pronunciations, like "herb," "valet," etc. French borrowings aren't the major thing making English difficult to read and write. French orthography is also pretty bad (and Portuguese.)

obligatory: http://zompist.com/spell.html


> There has to be a rule for rules to be broken. English orthography is practically irregular (although it's vaguely, chaotically regular if you subject each word to an analysis based on most probable language origin.)

I mean, yeah. The rules governing English are a combination of the rules governing a couple other languages. Divide up the vocabulary according to the applicable rules, and I'm pretty sure you end up with a fragment each of the French, Saxon, Norse, etc. languages, and within those fragments the rules make as much sense as they do in the full versions of the respective languages. E.g.: put "rendezvous" & "accomplice" in one pile, "loan" in another, and the rules of each pile would be internally consistent, and consistent with the rules of French and Norse respectively...accounting for language drift, of course.

A large fraction of English follows French rules - might as well ask the French to fix their spellings to make phonetic sense. The work put into that would translate directly into fixing a lot of English.

> Also, it's far worse than this. The Normans changed the spelling of English words that were unpronounceable to them by adding a bunch of letters. One I remember is that the reason the "-shire" suffix is confusing is because before the Normans it was just "-scr".

Personally, given the etymology of English, I think it makes a lot of sense to preserve the original, applicable rules of the languages English is composed of. Otherwise, it would be like rewriting either Norse or French to fit the other - as you mention, that's even worse than the combination of languages in the first place.

Edit for clarity.


Many more phonetic writing systems re-write loanwords to match local pronunciation, at least once the words are established enough.

For example, in Romanian, recent borrowings do typically preserve the original spelling (e.g. English 'computer','mall' -> Romanian 'computer', 'mall'; plausible Romanian phonetic spelling 'compiutăr','mol'). But older loan-words that have become established take on a phonetic spelling (e.g. French 'bureau' -> Romanian 'birou'; English 'interview', 'tramway', 'jam' -> Romanian 'interviu', 'tramvai', 'gem').

So it is possible to absorb large amounts of loan-words into a language and completely disregard the orthography of the original language. I don't think you lose much by doing that, in fact.


That's true. I guess what I didn't really say explicitly is that I don't think there is a 'local pronunciation' in the case of English. French and Saxon seem to have merged on equal terms, where neither was sufficiently dominant to determine how words were re-written, and neither vocabulary would be counted as loan words.

I think English would claim both as the original languages.

I, think, English, would, both = Old English

claim, original, language = Old French


> French orthography is also pretty bad (and Portuguese.)

Actually, French orthography is a bit complex, but pretty regular (if you exclude some old names). Also, French grammar would be more complex if the spelling were more in line with pronunciation, as the grammar rules have changed more slowly than pronunciation has.

If you went by current pronunciation, the feminine or sometimes plural forms of many words, and the tenses of many verbs, would add random consonants, while the current spelling shows that they simply "revive" a consonant that is now elided, but has stayed part of the root of the word (e.g. 'present/presente' pronounced something like 'prezan/prezant'; 'mis/mise' pronounced something like 'mi/miz').

French spelling is also an interesting showcase of what happens when a phonetic spelling is frozen while pronunciation changes (Old French was almost 1:1 with today's spelling, but pronunciation has changed dramatically).


As a non-native English speaker, the bigger problem for me is not homophones, but homographs that aren't homophones. Or, more generally, the same groupings of letters being pronounced differently. That aside, there's a bunch of words that have been imported from English into Polish that mess me up (I get tripped up by the Polish pronunciation, which bleeds into how I say the word). Grammar is actually great to learn (same goes for Spanish, based on my limited experience with it).



