
Ruby's character encoding and Unicode support is pretty strong. I'm intrigued how you think it's half-ass, partial or just plain wrong (Really! If there's something really borked with it, it's in my interest to know :-)). Every string has full encoding support and it's baked right in to the language.



tchrist's OSCON Unicode talks:

http://98.245.80.27/tcpc/OSCON2011/index.html

Specifically the third talk:

http://98.245.80.27/tcpc/OSCON2011/gbu.html http://98.245.80.27/tcpc/OSCON2011/gbu.pdf

Excerpts:

  Its String functions like upcase or capitalize won’t even look at
  anything but ASCII. 

  It’s completely missing a whole lot of critical Unicode
  functionality:

    casemapping & -folding
    grapheme support
    normalization
    collation
    text segmentation, &c &c &c. 


  Every Ruby string carries around its encoding, instead of sanely
  unifying into Unicode internally like nearly everything else does.

Also:

  > baked right in to the language

is not synonymous with "intelligently implemented."

Note that I wasn't implying that "half-ass," "partial," and "just plain wrong" necessarily all apply to Python and/or Ruby's implementations. Some may apply to some areas while others may not, and really this extends outside of just Python and Ruby, but I'm trying to stay in context here.
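
To make the "every Ruby string carries around its encoding" point concrete, here is a minimal sketch for Ruby 1.9 or later. It is only an illustration, not code from the talk, and `describe` is a hypothetical helper name. It transcodes one string and shows that the two copies hold the same characters but different bytes and different encoding tags.

```ruby
# Hypothetical helper (not from the talk): report what a string carries with it.
def describe(str)
  { encoding: str.encoding.name, chars: str.length, bytes: str.bytesize }
end

utf8   = "caf\u00E9"                 # the \u escape yields a UTF-8 string
latin1 = utf8.encode("ISO-8859-1")   # same characters, transcoded to Latin-1

p describe(utf8)     # encoding tag, character count, byte count
p describe(latin1)

# Each string keeps its own tag, so equality only holds after
# transcoding both sides to a common encoding.
p utf8 == latin1
p utf8 == latin1.encode("UTF-8")
```

The first comparison should come out false and the second true, which is the practical consequence of not unifying to a single internal representation: the caller has to transcode before comparing or concatenating non-ASCII data.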


This is interesting stuff - thanks for sharing, I'll be checking it out. The upcase/downcase stuff definitely checks out so far :-)
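
For anyone who wants to repeat the upcase check, here is a rough sketch. `upcases_non_ascii?` is a hypothetical helper name, and the result depends on the interpreter: the Ruby versions current when the talk was given left non-ASCII letters untouched, while Ruby 2.4 and later apply Unicode case mapping, so run it on your own version rather than trusting either answer.

```ruby
# Hypothetical helper: true if this Ruby's String#upcase changes a
# non-ASCII letter (U+00E9, e with acute accent).
def upcases_non_ascii?
  "\u00E9".upcase != "\u00E9"
end

puts RUBY_VERSION
puts "caf\u00E9".upcase      # the ASCII letters are upcased on any version
puts upcases_non_ascii?      # whether the accented letter was upcased too
```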



