This is almost certainly written in literary classical Chinese (even if written recently). Basically, in Chinese, words that were once pronounced differently merged in pronunciation in modern Mandarin, though since the writing system is quite divorced from sound, the written forms never changed (at least, not until Simplified Chinese under Mao).
What that means is that a lot of classical Chinese literature has a bit of this problem, in that if you read it aloud, it is tricky to fully comprehend. (This poem/story takes it to an extreme, obviously.) For a long time, up through the early 20th century, it was fairly common for written things to be written in this literary Chinese even if the writer would say their ideas completely differently. A "write as you speak" movement has largely changed that: in modern written Mandarin, many words are two or three syllables long, and written with two or three characters. Etymologically, the words may have derived from compounding monosyllabic words, but in the modern spoken (and now also written) language, the sub-word syllables are simply no longer words in any meaningful sense.
(An analogy on a much smaller scale in English: many dialects of English merge /ɪ/ and /ɛ/ before a nasal sound, so that "pin" and "pen" sound identical. Speakers of those dialects will often say "straight pin" or "ink pen" (or "pig pen" or "cow pen") to distinguish which kind of pin/pen they mean; they may or may not make the distinction in writing. So the writing is unambiguous, the speaking is unambiguous, but reading a written thing aloud is ambiguous. When it's just one or two words it's no big deal, but imagine that nearly every word suffered from that problem....)
One solution to the "many characters sounding the same" problem was turning Chinese into a tonal language, which I guess technically makes them sound different. (Modern Mandarin has 4 tones; older stages of the language had more; Cantonese reportedly still has 9.)
Even with tones, though, there are still too many characters that sound exactly the same. In a Chinese dictionary (you might want to pause for a moment and imagine what a Chinese dictionary looks like: if you come across an unknown English word in writing, you can look it up by spelling, but what's the equivalent for a Chinese character?), under any one sound specification, you can easily find 10-20 characters.
Where I disagree is this: yes, classical written Chinese is tricky to understand, but not because its words are mostly one syllable long. It is tricky in the same sense that Shakespearean (or Chaucerian) English is tricky for modern English speakers: unfamiliar vocabulary and old-style grammar. If this opinion is valid, I further propose that when the Chinese talk, they might rely more heavily on semantics to parse sentences in real time, since based on sound alone there might be too many character candidates. In other words, they use what's being talked about to do some pretty heavy-handed pruning as they process incoming syllables.
On Chinese dictionaries. I actually don't think there is any difference from English ones, at least in the way I use them.
When I use an English dictionary I use the word's spelling. I don't check how the word is pronounced when I do this.
When I use a Chinese dictionary I use the number of strokes that make up the character, which puts all the characters in a sort of order not unlike an alphabetical one. I also do not need to know how the word is pronounced when I do this.
There is no such thing as "turning Chinese into a tonal language." Chinese is tonal to begin with. For native speakers, a different tone does sound as distinct as a different phoneme. The problem the poem exemplifies is the sort we run into when we reduce the language into only the phonetics -- a problem I'm not convinced is unique to Chinese. In fact, just the post above pointed to an example of such in English.
The overloading of the shi sound in Mandarin approaches absurd levels.
I've always liked:
四 是 四 , 十 是 十 , 十 四 是 十 四 , 四 十 是 四 十 , 四 十 四 只 石 狮 子 是 死 的
sì shì sì
shí shì shí
shí sì shì shí sì
sì shí shì sì shí
sì shí sì zhī shí shī zǐ shì sǐ de.
Translation: 4 is 4, 10 is 10, 14 is 14, 40 is 40, 44 stone lions are dead.
Then get a southerner with, say, a Sichuan or Yunnan accent to say this (in Mandarin). They cannot properly pronounce shi (they say it like si), so the above just sounds like an angry bee. Given that shi and si are used a lot, this becomes a real pain for a non-native speaker.
It makes things tough when you're buying something that costs 44 RMB and you can't tell whether they said "is 14" or "44" or what.
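To make the "is 14" versus "44" confusion concrete, here is a minimal Python sketch. It assumes a deliberately crude model of the accent in which an initial "sh" is simply pronounced "s" and nothing else changes; the function names are made up for illustration, and real regional accents are more varied than this.

```python
import unicodedata

def merge_sh_into_s(syllable: str) -> str:
    """Crude accent model (an assumption): initial 'sh' is pronounced 's'."""
    syllable = unicodedata.normalize("NFC", syllable)
    return "s" + syllable[2:] if syllable.startswith("sh") else syllable

def with_accent(phrase: str) -> str:
    """Apply the merger to every space-separated pinyin syllable."""
    return " ".join(merge_sh_into_s(s) for s in phrase.split())

is_fourteen = "shì shí sì"   # 是十四, "is 14"
forty_four = "sì shí sì"     # 四十四, "44"

# Tones are kept, yet the two phrases still come out identical.
print(with_accent(is_fourteen))
print(with_accent(forty_four))
print(with_accent(is_fourteen) == with_accent(forty_four))
```

Note that plain "14" (shí sì) and "40" (sì shí) stay distinct under this model because their tone patterns differ; it is the leading shì that collides with sì.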
My impression (from very brief study of Mandarin) is that Mandarin has a lot more homonyms than most languages, e.g. lǐ has a ton of meanings: http://en.wiktionary.org/wiki/l%C7%90 It also seems to me that a lot of Mandarin phonemes are much closer together than most languages, e.g. ch and q are both a similar "ch" sound (likewise sh and x). Or maybe these sounds just seem similar to me because of my English upbringing?
Do linguists have a way to quantitatively measure how close together the sounds and words are in a language? Some sort of Shannon entropy measure, maybe? Or a way to measure how spread out words are in "phonetic space". I couldn't find anything, but I'd like to know if there's a way to measure these things objectively.
On the first part: See my other post for more, but basically, much (not all) of the homonymy is from older and/or written-only forms; and yes, ch and q don't sound any closer to a native Mandarin speaker than, say, sh and s do to a native English speaker or u and ou to a native French speaker.
On the second part: no defined measure that I know of. It would be a little tricky in that language is a moving target, everyone speaks it slightly differently, and even for a single speaker the "location" of a particular phone is more of a probability distribution even after you factor out varying context. That said, there definitely are charts that map out the space and take a stab at identifying the prototype location of each phone in the sound space, so it's not entirely implausible that you could summarise that with a distance measure. I strongly suspect that the value of the measure would not vary much among languages with similar-size phoneme inventories, though.
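For what it's worth, the entropy half of the question is easy to sketch, even if it doesn't answer the "distance in phonetic space" half. The Python below computes the Shannon entropy of a syllable distribution, with and without tone marks. The sample is just the tongue twister from elsewhere in this thread, not a real corpus, so the numbers only illustrate the method; the helper names are mine.

```python
import math
import unicodedata
from collections import Counter

def shannon_entropy(tokens) -> float:
    """Entropy in bits of the empirical distribution over tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def strip_tones(syllable: str) -> str:
    """Drop combining marks, e.g. 'shí' -> 'shi' (this would also flatten ü)."""
    decomposed = unicodedata.normalize("NFD", syllable)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Toy sample: the tongue twister quoted in this thread, one syllable per token.
toned = ("sì shì sì shí shì shí shí sì shì shí sì "
         "sì shí shì sì shí sì shí sì zhī shí shī zǐ shì sǐ de").split()
toneless = [strip_tones(s) for s in toned]

print("with tones:   ", round(shannon_entropy(toned), 3), "bits per syllable")
print("without tones:", round(shannon_entropy(toneless), 3), "bits per syllable")
```

Throwing away the tones can only merge categories, so the second figure can never exceed the first. To compare languages you would need large, comparable corpora, and this says nothing about how acoustically close two sounds are.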
In real life speech, people don't say such things in Chinese, just as English speakers don't say "Suzie sells sea shells by the sea shore". If they do, they'll say it slower to avoid ambiguities, maybe rephrase it as well, e.g. "You remember Suzie? She's into sea shells. She sells them by the beach."
If we remove the adjectives formed from proper nouns, turn the nouns into "people", turn the verbs into "bully", and add a pronoun and an adjective, we get
people whom other people bully bully people.
This makes sense.
But when we remove the pronoun and adjective and we get
people people bully bully people
which is, as far as I can tell, gibberish (without changing meanings such that they don't correspond with the original sentence). It can't have anything to do with many people persons, because in the original there's an adjective there.
If we assume that, just as there are socially adept people, whom we call "people people", there are also socially adept buffalo called "buffalo buffalo", we can get "Buffalo buffalo buffalo buffalo buffalo". But to go beyond that is to claim an incoherent sentence missing many words is coherent and grammatical.
It seems to me that if you replace "whom other" with "that" you end up with a similar meaning:
people that people bully bully people
I've found it fairly common to hear people drop the "that" when speaking - so in this case:
people people bully...
the first people refers to the people being bullied while the second people refers to the people doing the bullying. It's a bit clearer if you replace some of the nouns - for instance:
dogs people bully bully cats
which I believe is a perfectly legal English sentence.
It's interesting to compare the original version in Classical Chinese with the vernacular Chinese one.
The Classical Chinese version just sounds (err, reads) so much more awesome, like a smooth-talking wisecracker who can't be bothered to spend an extra breath, while the vernacular version reads like a seven-year-old's essay.
I always had the impression that Classical Chinese was painstakingly crafted and preserved by a small group of elitists who were obsessed with aesthetics and abhorred ease of learning. The ultra-minimalist syntax seems very "inorganic", but it makes every line poetry and an invitation for puns. As a result, most people were illiterate, but if you were well-read, boy, the fun you could have with words.
While it's true that native Cantonese speakers can generally be identified, it's not the case that they lack the ability to speak Mandarin. They just speak Mandarin with a southern regional accent. Native Mandarin speakers from Taiwan are also easily distinguished from Beijingers.
Generally, Cantonese speakers actually pick up Mandarin a lot more easily than Mandarin speakers pick up Cantonese.
Adopting an accent can be a lot harder than adopting a dialect because 1. your brain doesn't parse the new sounds well if you didn't grow up hearing them and 2. you have no experience creating these sounds, and it's a lot harder to learn to produce them. There's a lot of literature about native accent adoption in second language acquisition that suggests an age dependency due to increased neural plasticity in youth (even though there's a lot of dispute about that critical period for language (not accent) acquisition).
True, it is also good to have an idea which tone means what. As far as I know it's: high-level, rising, falling-and-then-rising and falling (the shape of the dashes over i gives the hint) http://en.wikipedia.org/wiki/Mandarin_phonology#Tones
Agree on the last point -- at least, for Taiwan. In fact, those idealized graphs caused me a bit of a headache early on. Native Mandarin speakers in Taiwan tend to drop the end of the 3rd tone, so it falls slowly, holds at the bottom for a moment, then rises slightly or even just cuts off.
I've asked a few Japanese people to say "She sells sea shells by the sea shore..." and it comes out sounding a lot like that... Tricky because Japanese uses Shi and Chi sounds but not Si on its own.
Not quite true: although the tongue is not as obviously involved as it is in stop sounds (like /t/ or /k/), friction between the tongue and the air is what makes the sound /ʂ/ (here transcribed as "sh"), while during a vowel sound (/i/), the air passage is open. So the tongue does have to move a little relative to the roof of the mouth; just not very much.
I really don't move my tongue when I say these in a row. Is this because I'm speaking with a Beijing accent? Apparently what's coming out for me is: ʂ̺ɻ̩. Or should that combination also force me to move my tongue?
When I lay off the accent and move the i to the front of my mouth a bit, I do seem to move my tongue a little. I'm not sure I'm doing it right though, the i's in non-Beijing Mandarin are actually always a bit stressful for me, but I'm pretty sure I'm pronouncing my "shi" correctly for Beijing.
It's likely that your tongue isn't moving much. It's also possible that the tongue isn't moving front-to-back (as it would for the English word "she") or that it isn't moving with respect to your lower jaw or that it isn't moving very much compared to nearly any other set of sounds you might utter. But the tongue must be moving if you are making any differentiated sound at all there.
"This thread is useless without" mp3s. In all seriousness, I'd love it if somebody took a crack at this, because I'd love to hear it. (And I'm sure the karma would flow.)
That's unlikely. How do Chinese people write a sentence on the computer? Old systems would have you write the letters, e.g. "shi", then display a long list of all the possible characters with that sound (for all 4 tones). Newer systems use context to figure out the right characters for the whole sentence at once (probably using naive Bayes; see Peter Norvig's spell checker), so that I can just type "Wo shi Jianada ren" and the output would be the correct 6 Chinese characters. I never need to specify the tones.
So I would assume that voice recognition could do even better by analyzing tonal information.
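Here is a toy Python sketch of the "use context to pick the characters" idea. It is not how any real input method is implemented: it swaps in a character bigram model with a Viterbi search rather than naive Bayes, the candidate lists and probabilities are invented for the example, and the input is assumed to be already split into toneless syllables.

```python
import math

# Hypothetical candidate characters per toneless syllable (tiny on purpose).
CANDIDATES = {
    "wo": ["我", "窝"],
    "shi": ["是", "十", "石"],
    "jia": ["加", "家"],
    "na": ["拿", "那"],
    "da": ["大", "打"],
    "ren": ["人", "任"],
}

# Made-up bigram probabilities P(next | previous); "<s>" marks sentence start.
BIGRAM = {
    ("<s>", "我"): 0.5,
    ("我", "是"): 0.4,
    ("是", "加"): 0.1,
    ("是", "家"): 0.05,
    ("加", "拿"): 0.6,
    ("家", "那"): 0.05,
    ("拿", "大"): 0.6,
    ("大", "人"): 0.3,
}
FLOOR = 1e-4  # probability assumed for any pair not listed above

def decode(syllables):
    """Viterbi search: most probable character sequence for the syllables."""
    # best maps a character to (log-probability, path) of the best path ending there.
    best = {"<s>": (0.0, [])}
    for syl in syllables:
        step = {}
        for cand in CANDIDATES[syl]:
            score, path = max(
                (prev_score + math.log(BIGRAM.get((prev, cand), FLOOR)), prev_path)
                for prev, (prev_score, prev_path) in best.items()
            )
            step[cand] = (score, path + [cand])
        best = step
    return "".join(max(best.values())[1])

print(decode("wo shi jia na da ren".split()))
```

No tones are supplied anywhere; the neighbouring characters do all the disambiguating, which is the point being made above. A real system would use word-level models trained on large corpora.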
EDIT: It turns out there's a Wikipedia page on the poem (naturally, should've checked that first). A lot of this is covered there: http://en.wikipedia.org/wiki/Lion-Eating_Poet_in_the_Stone_D...