
Japanese, since Unicode is set up to display Japanese wrong (CJK unification: theoretically you can configure your application to display Chinese wrong instead, but no one does that).



Interesting, I didn't know that.

But I read up on it, and it sounds like Unicode is widely used in Japan and the issue is mainly academic, affecting old texts. It also seems to have been mitigated with variation selectors:

> Since the Unihan standard encodes "abstract characters", not "glyphs", the graphical artifacts produced by Unicode have been considered temporary technical hurdles, and at most, cosmetic. However, again, particularly in Japan, due in part to the way in which Chinese characters were incorporated into Japanese writing systems historically, the inability to specify a particular variant was considered a significant obstacle to the use of Unicode in scholarly work. For example, the unification of "grass" (explained above), means that a historical text cannot be encoded so as to preserve its peculiar orthography. Instead, for example, the scholar would be required to locate the desired glyph in a specific typeface in order to convey the text as written, defeating the purpose of a unified character set. Unicode has responded to these needs by assigning variation selectors so that authors can select grapheme variations of particular ideographs (or even other characters).

From: https://en.m.wikipedia.org/wiki/Han_unification
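
To make the variation-selector mechanism a bit more concrete, here is a minimal Python sketch. It assumes that U+845B (葛) followed by U+E0100 or U+E0101 is a registered ideographic variation sequence (I believe these are in the Adobe-Japan1 collection, but treat that as an assumption). Whether the two forms actually look different depends entirely on the font; `strip_variation_selectors` is a hypothetical helper, not a standard library function.

```python
import unicodedata

BASE = "\u845b"       # 葛, a unified ideograph (example choice, assumed)
VS17 = "\U000E0100"   # first selector in the supplementary range
VS18 = "\U000E0101"   # second selector in the supplementary range

# A variation sequence is just the base character followed by a selector.
variant_a = BASE + VS17
variant_b = BASE + VS18


def strip_variation_selectors(text: str) -> str:
    """Drop variation selectors so variants compare equal to the base."""
    return "".join(
        ch
        for ch in text
        if not (0xFE00 <= ord(ch) <= 0xFE0F or 0xE0100 <= ord(ch) <= 0xE01EF)
    )


print(len(variant_a))              # number of code points in the sequence
print(unicodedata.name(VS17))      # official name of the selector
print(variant_a == variant_b)      # the two sequences are distinct strings
print(strip_variation_selectors(variant_a) == BASE)
```

The practical upshot is that the distinction lives in plain text, but search and comparison code has to decide whether to ignore the selectors.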


> the issue is mainly academic when it comes to old texts.

If it were just an academic concern, there could have been other solutions. It is more about how much people care about the exact glyph on display, and Japanese users were particularly pedantic in my experience (even more so than Chinese users).

Han unification itself was a well-intentioned but ultimately misguided effort to fit the massive Han character repertoire into 16 bits. It is more evidence that a 16-bit character set was naive.
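
As a rough illustration of the 16-bit limit: Unicode now defines ideographs beyond the first 65,536 code points, and UTF-16 has to spend two 16-bit units on each of them. A quick Python check, using U+20000 (the first code point of CJK Extension B) and U+8349 (草, "grass") as my own example choices:

```python
BMP_LIMIT = 2 ** 16          # what a flat 16-bit character set can address

ext_b_first = "\U00020000"   # first code point of CJK Extension B
bmp_char = "\u8349"          # 草, which does fit in 16 bits


def utf16_units(ch: str) -> int:
    """Number of 16-bit code units UTF-16 needs for this text."""
    return len(ch.encode("utf-16-be")) // 2


print(ord(ext_b_first) >= BMP_LIMIT)  # outside the 16-bit range
print(utf16_units(bmp_char))          # one unit
print(utf16_units(ext_b_first))       # a surrogate pair
```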


It’s not an academic or legacy-text issue at all. Systems and apps that deal with Japanese text were usually configured to render Chinese wrong, and vice versa, which worked until operating system images became universal.

The Chinese Hanzi codespace and the Japanese Kanji codespace are effectively the same code points, reused for each other’s sets. That’s all it is.
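
A small Python sketch of what that reuse means in practice: the stored text is byte-for-byte identical, and only out-of-band metadata (here an HTML `lang` attribute) tells the renderer which regional glyph to pick. U+76F4 (直) is my choice of example, since it is commonly cited as being drawn differently in Chinese and Japanese fonts; the `tag` helper is hypothetical.

```python
from html import escape

CHAR = "\u76f4"  # 直, one code point shared by Chinese and Japanese text


def tag(text: str, lang: str) -> str:
    """Wrap text in a span carrying a language hint for the renderer."""
    return f'<span lang="{escape(lang, quote=True)}">{escape(text)}</span>'


japanese = tag(CHAR, "ja")
chinese = tag(CHAR, "zh-Hans")

# The character data is the same either way; only the markup differs.
print(CHAR.encode("utf-8").hex())
print(japanese)
print(chinese)
```

Without such a hint, the renderer falls back to whatever font order the system was configured with, which is the "configured to do the other language wrong" situation described above.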



