
I'm still trying to find one valid use for the length of a string in Unicode characters. What one usually needs to know is the length of the string as rendered by some output device, which is not related to the count of Unicode characters in any useful way. Even with fixed-width fonts you can have glyphs that are composed of multiple Unicode characters, or characters whose glyphs occupy two consecutive positions.
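To make the two cases above concrete, here is a rough Python sketch. The helper name approx_cell_width is made up for illustration, and the width rule (combining marks take zero cells, East Asian Wide/Fullwidth characters take two, everything else one) is a simplifying assumption, not how any particular terminal actually measures text.

```python
import unicodedata


def approx_cell_width(s: str) -> int:
    """Very rough estimate of terminal cells used by s (hypothetical helper)."""
    width = 0
    for ch in s:
        if unicodedata.combining(ch):
            # Combining marks are drawn on top of the previous character.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            # Wide and Fullwidth characters usually take two cells.
            width += 2
        else:
            width += 1
    return width


accented = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
kanji = "漢字"

for sample in (accented, kanji):
    print(repr(sample), "code points:", len(sample),
          "approx cells:", approx_cell_width(sample))
```

Both samples have two code points, yet under these assumptions the first needs one cell and the second needs four.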



Twitter has a limit of 140 "codepoints". Not bytes. Not glyphs.


That's weird, I thought its limit was deliberately low enough to fit into an SMS message, which has a limit of 140 octets (160 characters in some 7-bit encoding GSM uses). Do they actually allow, say, 140 kanji?
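The arithmetic behind those numbers can be checked with a small sketch. It only counts sizes under different encodings; it says nothing about what Twitter actually accepts, and the variable names are made up for illustration.

```python
SMS_PAYLOAD_OCTETS = 140

# 140 octets are 1120 bits, which hold 160 seven-bit characters.
seven_bit_chars = SMS_PAYLOAD_OCTETS * 8 // 7

# With a 16-bit encoding such as UCS-2, the same payload holds 70 code units.
sixteen_bit_chars = SMS_PAYLOAD_OCTETS // 2

# 140 kanji: one code point each, but more than 140 octets in common encodings.
text = "漢" * 140
code_points = len(text)
utf8_octets = len(text.encode("utf-8"))
utf16_octets = len(text.encode("utf-16-be"))

print(seven_bit_chars, sixteen_bit_chars)
print(code_points, utf8_octets, utf16_octets)
```

So a message of 140 kanji is 140 code points but 420 octets in UTF-8 and 280 in UTF-16, which would not fit in a single 140-octet payload.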



That post basically just says go look at this wiki page: https://twitterapi.pbworks.com/Counting-Characters

Why not link to that in the first place?



