I'm a native English speaker. I studied a little French and speak (and can read/...

blahedo · on Aug 21, 2011

There is a certain amount of misinformation in this post that I'd like to clear up:

1) In the 10th century, (Old) English and (Old) German were already somewhat divergent, and not really mutually intelligible, but closely related.

2) The same goes for Norman and French. While in the same language family, these were not quite the same language.

3) Have you read any Chaucer? I'm not sure I'd call Middle English "mostly comprehensible". It takes a lot of work and a good ME dictionary at hand. By contrast, Early Modern English (think Shakespeare) is more in the line of "mostly comprehensible".

4) It's not just the fact there was no central authority that caused English to massively change. There still isn't a central authority, but change has been slow and non-radical for the last 500 years or so. Having a central authority doesn't really prevent change, either, although it might help to slow it down.

5) It's not at all clear what you mean by "arbitrary" in your claim about language change and state control, but you certainly have not backed up any claim about languages under "state control" being more "regular".

6) The alphabet switch for Turkish did not succeed because the Latin alphabet is "simple and phonetic". The Arabic alphabet is also simple and phonetic, as is (or as could be) any alphabet. It was not a great fit for the Turkish language, however, and the Latin alphabet (with a few additions) mapped better.

7) But much more important than any facts about the Latin alphabet was the fact that Kemal Ataturk was a dictator who pushed through a massive literacy programme on the population (who, it should also be said, was basically supportive of this goal). Had he done an alphabet reform instead---adding a few letters to the Arabic alphabet to make it a better fit for Turkish---and accompanied it with the same literacy policy, it would have done just as well. (Better, arguably, because it would have left the writings of the Ottomans much more accessible to the literate modern population.)

You might be right about computers arising in alphabet countries (I'd broaden that to any non-ideographic writing system, including Greek, Korean, and those of the Indian subcontinent and southeast Asia), but it might be more accurate to say that if computers had arisen first in China they would have looked very, very different. (Even in Japan, they might have gone the route of kana-only systems first, in a similar way to how early western machines had ALL CAPS interfaces.)

vacri · on Aug 22, 2011

On the simplification of roman script, I've just travelled through Vietnam, and learned that Ho Chi Minh did a similar thing to Ataturk - he mandated that Vietnamese be written in Roman script (as set down by a Portuguese bloke several centuries ago) and not in the traditional Chinese script. His reasoning was that the easier it is to learn, the better for the general population.

I can't say what Vietnam looked like before the change, but certainly today there's writing blazed over everything in the cities - and there's very little in the way of images or pictographs to indicate what a shop might sell to an illiterate (I may have been unaware of other indicators). They do make prodigious use of diacritics to adapt Roman script to Vietnamese, though.

It was while puzzling over these diacritics that I finally realised that English uses (needs?) these as well, we just don't write them down - wind (moving air) and wind (make a coil) are pronounced differently, but without diacritics, someone has to tell you how to do so.

devijvers · on Aug 22, 2011

Be careful who you call a dictator. While Ataturk was President of Turkey for 15 years (until his death in 1938), he encouraged a multi-party system. However, during his lifetime several parties were formed and then either dissolved themselves or were dissolved after an assassination attempt on Ataturk was uncovered. It was only in 1945, after Ataturk's death, that the multi-party system in Turkey took off for real.

eru · on Aug 23, 2011

From what I can tell Ataturk was mostly a benevolent dictator.

(And like a good wine, he gets better with every passing year since his death. When I was in Ankara in 2008, they had pictures / flags of him on the high rise buildings covering five storeys.)

mortenjorck · on Aug 22, 2011

Actually, isn't the concept of a kana-only character system how early, domestically-marketed microcomputers in Japan did in fact work? The only example I can think of is a home entertainment console, but the Nintendo Family Computer generally used kana with a few highly common kanji.

w1ntermute · on Aug 22, 2011

Yes, early computers in Japan used katakana only. They were half-width (normally they are written in a square box) so as to be compatible with the Latin alphabet. This is also how telegrams were sent, starting from when Japan began modernizing in 1868.
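For anyone curious how this survives today: modern Unicode still carries the half-width katakana as compatibility characters, next to the usual full-width ones. The snippet below is only a small sketch of that relationship using Python's standard `unicodedata` module. The helper name `to_fullwidth` is made up for illustration, and none of this is meant to show how the early machines actually stored their text.

```python
import unicodedata

# "kataka-na" spelled with half-width katakana code points.
HALF = (
    "\N{HALFWIDTH KATAKANA LETTER KA}"
    "\N{HALFWIDTH KATAKANA LETTER TA}"
    "\N{HALFWIDTH KATAKANA LETTER KA}"
    "\N{HALFWIDTH KATAKANA LETTER NA}"
)


def to_fullwidth(text):
    """Fold compatibility characters (including half-width kana) via NFKC."""
    return unicodedata.normalize("NFKC", text)


FULL = to_fullwidth(HALF)

# East Asian Width property: "H" means half-width, "W" means wide.
print(unicodedata.east_asian_width(HALF[0]))
print(unicodedata.east_asian_width(FULL[0]))
```

Note that NFKC folds other compatibility characters as well, so it is a blunt tool if you only want to touch kana.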

theli0nheart · on Aug 21, 2011

First: I lived in China for a year studying Mandarin, and spent 3 years studying at school.

You're wrong in that there is no way to describe characters in Chinese to other speakers. There are certain words you use to describe strokes. The equivalent in English would be something like "there is a cross on the left and a flower on the right." Chinese speakers do this all the time, so I'm kind of surprised you jump to this conclusion when it's clear you aren't knowledgeable on the issue.

cletus · on Aug 21, 2011

Care to explain how you'd explain this [1] to someone else such that they could reproduce it in a readable fashion? Or one of these [2]?

[1] http://necromanc.blogspot.com/2006/05/most-complicated-chine...

[2] http://www.chinese-forums.com/index.php?/topic/437-most-comp...

fortheoccasion · on Aug 21, 2011

A challenge!

First of all, [1] is ridiculous. Second of all, you just do it recursively. I would describe the first example roughly as follows:

Walking radical (162); cave top (116); to the left a left-right combination with "moon" on the left (74) and a thread radical (52) above "long" (168) on the right; in the middle "speech" (149) above "horse" (187); to the right, a left-right combination with thread above "long" on the left, and a knife radical (18) on the right; under all of that, a heart radical (61).

Numbers refer to the chart here

http://www.yellowbridge.com/chinese/radicals.php

In practice, it is very common for two Chinese people to meet for the first time and explain to each other which characters their names are written with. This normally does not involve pulling out pencil and paper.
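To make the "do it recursively" idea concrete, here is a rough sketch in Python. It models a character as a tree of named components and spells out one fragment of the description above ("moon" beside thread-above-"long"). The class names, the two layout labels, and the bracketed output format are all invented for illustration; real decomposition schemes have more layout types than this.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Component:
    name: str                      # e.g. "moon"
    radical: Optional[int] = None  # radical number from the chart, if known


@dataclass
class Layout:
    arrangement: str               # "left-right" or "top-bottom" (assumed labels)
    parts: list                    # Components or nested Layouts


JOINERS = {"left-right": " beside ", "top-bottom": " above "}


def describe(node):
    """Recursively turn a component tree into a spoken-style description."""
    if isinstance(node, Component):
        if node.radical is None:
            return node.name
        return f"{node.name} ({node.radical})"
    inner = JOINERS[node.arrangement].join(describe(p) for p in node.parts)
    return f"[{inner}]"


thread_over_long = Layout(
    "top-bottom", [Component("thread", 52), Component("long", 168)]
)
left_block = Layout("left-right", [Component("moon", 74), thread_over_long])

print(describe(left_block))
# [moon (74) beside [thread (52) above long (168)]]
```

The whole character is just more of the same nesting, which is why a long description is tedious rather than hard.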

brown · on Aug 21, 2011

That's the equivalent of explaining how to spell "supercalifragilisticexpialidocious" -- yeah, it's an English word, but it's ridiculous and nobody really expects you to know how to spell it.

Most simplified characters can be explained with a handful of strokes. Furthermore, many of them can be broken down into "radicals" which are commonly repeated patterns.

nwomack · on Aug 22, 2011

I know #1 looks completely insane, but I have been studying Chinese for about 2 years, and I just started writing 3 or 4 months ago, and the breakdown of it is actually fairly simple. There's actually a somewhat limited set of characters that are reused over and over again, and I have already learned to write every one of the pieces (*edit: In that particular character). It's already been broken down in a post above me so I won't do it again...

Some characters would be harder for me to remember than this, because this particular character is made up of commonly used components.

est · on Aug 22, 2011

齉 == nose left, bag right.

龘 == three dragons stack

驫 == three horses stack

It's pretty simple and straightforward.

w1ntermute · on Aug 21, 2011

> IMHO (1) is incredibly important. In Mandarin, if I want to tell you a new character I have to show you. There is no way to describe it.

Have you actually studied hanzi? It is very easy to describe a character verbally, and if you live in Asia for any period of time, you will see that people do this quite often. There are only 214 Kangxi radicals[0] (plus some variations based upon how much space is available). Clearly not the same as having 26 letters, but not unmanageable by any stretch of the imagination.

The second difference is that characters are "spelled" in 2 dimensions. Once again, there is a set of rules for radical placement, and if you're familiar with these (as you would be if you'd studied Chinese or Japanese), it is very straightforward.

0: http://en.wikipedia.org/wiki/Kangxi_radicals

chc · on Aug 21, 2011

This is true, but I've noticed that in practice, Chinese or Japanese trying to identify characters to each other (rather than look up an unfamiliar character in a dictionary) tend to draw rather than list the components.

Natsu · on Aug 22, 2011

I think it depends mostly on how hard the explanation is or how available paper is.

I had a friend of mine show me how to explain how to write her name in Japanese. There are even special names for the different radical forms (e.g. ninben vs. ningen).

wisty · on Aug 22, 2011

I think some of the radicals are homonyms of other radicals. Most of the radicals have a lot of near-homonyms. Also, those radicals can change their shape depending on what position they are in.

Yes, it's possible, but nowhere near as easy.

tty · on Aug 21, 2011

>English in the 10th century was basically the same language as German (Althochdeutsch or Old High German, to be precise). In 1066, the Normans conquered England, bringing French which became the official and court language of England for several centuries.

Old English and Old High German being "basically the same language" is a very ignorant thing to say and categorically wrong. Old High German might have been much closer to Old English than modern standard German is to English, but they weren't mutually intelligible and they were clearly different languages even at that time. Major differences between the two had arisen many centuries earlier. Take for example the High German consonant shift. Now, would you please explain why you would say that both were "basically the same language"?

billswift · on Aug 21, 2011

Old English (aka Anglo-Saxon) was a Low German dialect, closer to Dutch in many ways, and to Danish in others, than to High German. Which isn't surprising when you consider the Angles, Saxons, and Jutes came from the North Sea coast region from southern Denmark to the eastern Netherlands.

anghyflawn · on Aug 22, 2011

No, it was not a Low German dialect. It was what is called an Ingvaeonic dialect, most closely related to Old Saxon and Old Frisian. Old High German is what is sometimes called Istvaeonic (and the Franconian dialects that were to become Dutch are also part of that group). There are a bunch of important differences between Ingvaeonic and Istvaeonic, and both are quite, quite different from North Germanic.

buss · on Aug 21, 2011

> In Mandarin, if I want to tell you a new character I have to show you. There is no way to describe it.

This isn't strictly true. Some characters are simple combinations of radicals and other characters and can be sufficiently described as saying "the radical for x and the character for y." For example, the word for hungry (饿, è) is a combination of the radical for "eat" or "food" (饣, shí) and the character for "self" or "me" (我, wǒ).

I'm only a beginning student of the language, so I can't claim that many other characters are as simple to describe.

nikcub · on Aug 22, 2011

Chinese characters may seem intimidating until you remember that we read and memorize English words, not letters, when reading, and Chinese characters are much the same. If you don't know a character you can slow down and figure it out based on its composition.

The Chinese composition just has many more building blocks than the English alphabet (it is a few hundred, IIRC).

_kdhr · on Aug 21, 2011

Interesting post and interesting linked article. One part strikes me as especially noteworthy:

“Lieberman, Michel, and colleagues expect that some 15 of the 98 modern irregular verbs they studied -- although likely none of these top 10 -- will regularize in the next 500 years.”

Now, is this based on previous evolution of verbs? Because the past thousand years have had a certain feasibility of communication, locally, nationally and globally. I think that after the past 100 years, with the advent of radio, television and now the internet, language is really going to evolve at a rate previously unseen in history.

The whole globe is connected, textually, verbally and visually, and it's immediate and constant. For the past thousand years, the only way to spread your novel usage of a particular word or grammatical construct was to either go to some venue and talk, send a letter, or write a book. Now you can spread your literary love everywhere, constantly and with a wide audience. And not only to people with your local dialect, but every dialect. What a melting pot.

I'm quite a lover of language evolution. I moved to Italy a year ago in a very multilingual office, and my French and Italian colleagues noted how nice it is in English that you can verb nouns. It hadn't occurred to me that this wasn't possible in French or Italian. I expressed that though English is quite liberal and almost anything goes in a lot of areas, I still wish that people were more accepting of linguistic novelties. People scorn you if you play with language, or actively drop old ways, or invent new words, with the exception of high school kids who, in my experience, are the most inventive English speakers I've seen. When I was in school the amount of new language and idioms introduced every week was overwhelming.

I'm quite descriptivist, though. I like dictionaries that are extremely up to date, like [Wordnik](http://www.wordnik.com/), that encourage people to just use words freely, and take 3 seconds to explain to their partner in conversation what the word means, without fear that their new word isn't cromulent! (I just added "cromulent" to Chrome's dictionary.) Some words I like to use when talking to myself (hey, kids do it, so sue me), are words that don't exist already but are the 'root' of existing words, like inane (“That's quite ane.”), edible (“I think I'll ed some peanut butter sarnies”), etc.

Anyway, I'm rambling, too.

eru · on Aug 23, 2011

On the other hand, prescription has a far bigger reach today than even thirty years ago. Trivial example: When everything you write gets spell checked automatically, new orthography develops slower.

Also immediate communication can slow down a language, and homogenize it. Radio and TV certainly brought the German dialects closer together.

It depends on the patterns of communication. The internet allows lots of small groups to interact with each other all over the world. That has a different effect than the few to many pattern you get with traditional mass media.

You might enjoy http://verben.texttheater.net/Englisch and if you know German, you might enjoy http://verben.texttheater.net/ even more. On the German version they are doing stuff like inane, ane, or overwhelmed, underwhelmed, whelmed.

jodrellblank · on Aug 23, 2011

Chromulant: a fake word legitimised on a personal basis by adding it to Chrome's dictionary. (verb: Chromulate).

Cushman · on Aug 21, 2011

I don't know if it's completely true that English has less character variation than Arabic— as you say, in English there is a choice of upper or lower-case, with occasional changes in meaning. In Arabic, the form of a letter is completely determined by the letter it follows. It's purely a display difference, not a separate character set. There are the diacritics to think about, but outside of the Quran they are simply ignored.

That's still a big problem for a universal language of the internet though, since written Arabic is highly non-phonetic.

idiopathic · on Aug 21, 2011

A couple of corrections I feel I have to make as an Arab.

> There are the diacritics to think about, but outside of the Quran they are simply ignored.

They are not ignored - when they are present, attention is paid to them. I think what you mean is that Arabic speakers, knowing what the diacritics are, do not bother writing them down. That is not because they ignore them, it is because we have paid such close attention to them when learning Arabic that we no longer need to be reminded of them.

> That's still a big problem for a universal language of the internet though, since written Arabic is highly non-phonetic.

Arabic is highly phonetic, and if being phonetic were a criterion for being the universal language of the internet, English should be disqualified immediately.

On arrival in England at the age of 10, I had no idea how English people knew how to pronounce their words. Now I know that non-Arabic speakers may think the same way about Arabic because we do not write down the diacritics by default... but it is easy to buy books that have these diacritics written, and thus to crack the code.

But English seemed designed to trap foreigners into mispronunciations, to the great amusement of my classmates. (Traveling to America after college, it was mostly place names that tripped me up.)

philwelch · on Aug 21, 2011

American place names are a constant source of confusion and amusement even among Americans, largely because many of them are adapted from American Indian words. You might be a perfectly normal English-speaking American, but if you've never been to the state of Washington before you won't know how to pronounce "Puyallup" or "Sequim" just by reading them.

Cushman · on Aug 21, 2011

Yeah, I'm sorry, I was highly unclear. What you said is what I meant :) I was saying that the Arabic character set is actually simpler, but vocalized Arabic becomes harder again.

hetman · on Aug 22, 2011

Someone more familiar with Arabic may need to correct me, however I was under the impression that the shape of a letter is not affected at all by the letter it follows, but by its position in the word.

Arabic letters look different depending on whether they are in the initial, medial, or final position in a word (as well as having a 4th form when they appear in isolation). However, there are patterns of similarity so it's not as difficult as having to learn 4 completely random shapes for each letter.

Cushman · on Aug 22, 2011

That would be me :) No, the preceding letter matters. Some letters (د،ذ،ر،ز،ؤ) have no medial form-- they take the terminal form instead, and the following letter takes the initial form.

There are patterns (17 by my count?) among which letters differ only minorly, so it's not as bad as learning four forms for each letter, but there are some additional gotchas too. For example "ل-ا" ("laa") is always written as a single character "lamalif": لا. (Bonus knowledge: that's also the word for "no". You can see it at the beginning of the Shahada: ...لا إله إلا الله <- "There is no god but God..." etc.)

idlewords · on Aug 22, 2011

Not quite. There are several letters that don't connect to the letter that follows them, so a letter that follows them will appear in the 'initial' form even though it's not at a word boundary.
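The rule described in the last few comments is mechanical enough to sketch in a few lines of Python. This is a toy model only, under some loud assumptions: the set of non-connecting letters below is an illustrative subset (alef, dal, thal, reh, zain, waw), every other letter is assumed to join on both sides, and diacritics, hamza variants and the lam-alef ligature are ignored. The function name `positional_forms` is made up for the example; real text shaping is done by the font and rendering engine.

```python
# Letters assumed never to join to the letter that follows them.
# Illustrative and incomplete; written with \N{...} names to avoid
# bidirectional-text confusion in the source.
NON_CONNECTORS = {
    "\N{ARABIC LETTER ALEF}",
    "\N{ARABIC LETTER DAL}",
    "\N{ARABIC LETTER THAL}",
    "\N{ARABIC LETTER REH}",
    "\N{ARABIC LETTER ZAIN}",
    "\N{ARABIC LETTER WAW}",
}


def positional_forms(word):
    """Return the form each letter takes: isolated, initial, medial or final."""
    forms = []
    for i, letter in enumerate(word):
        # A letter joins backwards only if its predecessor can join forwards.
        joins_prev = i > 0 and word[i - 1] not in NON_CONNECTORS
        # It joins forwards only if it is not last and is itself a connector.
        joins_next = i < len(word) - 1 and letter not in NON_CONNECTORS
        if joins_prev and joins_next:
            forms.append("medial")
        elif joins_prev:
            forms.append("final")
        elif joins_next:
            forms.append("initial")
        else:
            forms.append("isolated")
    return forms


# "kitab" (book): kaf, teh, alef, beh
KITAB = (
    "\N{ARABIC LETTER KAF}"
    "\N{ARABIC LETTER TEH}"
    "\N{ARABIC LETTER ALEF}"
    "\N{ARABIC LETTER BEH}"
)

print(positional_forms(KITAB))
# ['initial', 'medial', 'final', 'isolated']
```

The last letter comes out isolated even though it ends the word, because the alef before it does not connect forwards. So position in the word and the preceding letter both matter.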

hessenwolf · on Aug 21, 2011

There are some differences in the way the characters are laid out in English too, unless you use a monospace font.

burgerbrain · on Aug 21, 2011

Those differences are not to my knowledge as a native English speaker used to convey any sort of meaning, and as demonstrated by the existence of monospaced fonts have no particular importance. They are merely an artifact of the glyph geometry.

blacksmith_tb · on Aug 22, 2011

They don't convey any sort of meaning in Arabic either - letters just change their shapes depending on where they occur in a word. Cursive English handwriting does the same thing, to a lesser degree, and for the same reasons.

roel_v · on Aug 22, 2011

Like where? In cursive, at best what changes is that some ending curve goes up or down a bit more. The existence of cursive computer fonts shows that even in cursive writing, the shape of letters can be the same across a text. Minor changes are caused by the speed of writing the cursive.

On the contrary, and as I understand it, in Arabic there are rules on how to change the shape of letters in certain contexts. In Arabic it's part of the writing system, in Western European languages it's not.

hessenwolf · on Aug 23, 2011

I give you Typographic Ligature: http://en.wikipedia.org/wiki/Typographic_ligature

roel_v · on Aug 23, 2011

Exactly, ligatures don't have any grammatical meaning. They're merely that - 'typographic'.

hessenwolf · on Aug 24, 2011

And the Arabic letters don't have any grammatical meaning. They are also merely typographic.

I think we are saying the same thing here.