
That's more a proof that regexes can't do everything than a proof that email is bad.



Pick a language of your choice, and fully implement the spec. I bet it'll still be long.

Edit: The following is completely wrong

For example, the Python module to parse an email address is around 500 lines and repeatedly warns that it'll be very hard to follow without a copy of the spec in front of you. It contains code for parsing multiple timezone formats and cites a follow-up spec addressing a bug in the initial treatment of negative timezones...

https://github.com/python/cpython/blob/master/Lib/email/_par...


The timezone isn't for parsing addressing headers, it's for parsing date headers. And actually, that file isn't for parsing email addresses, it's for parsing addressing headers in mail messages.
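
To make the distinction concrete, here is a minimal sketch using Python's public `email.utils` helpers. The header values are made up for illustration, and this only shows the address-header versus date-header split, not how the linked file is organized internally.

```python
from email.utils import getaddresses, parseaddr, parsedate_to_datetime

# An addressing header (To, Cc, From, ...) can carry several mailboxes,
# each with an optional display name. These values are invented examples.
to_header = 'Alice Example <alice@example.com>, bob@example.org'
pairs = getaddresses([to_header])   # list of (display name, address) tuples

# parseaddr handles a header value that holds a single mailbox.
single = parseaddr('Alice Example <alice@example.com>')

# The timezone handling belongs to date headers, not to addresses.
date_header = 'Mon, 20 Nov 1995 19:12:08 -0500'
parsed = parsedate_to_datetime(date_header)
offset = parsed.utcoffset()         # a timedelta of minus five hours

print(pairs)
print(single)
print(parsed.isoformat(), offset)
```

Note that none of these calls validates a bare email address; they all take header values as input.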

The code I wrote for parsing email headers is here: https://github.com/jcranmer/jsmime/blob/emailutils/headerpar... . A decent chunk of it is building a full lexer for email headers, and trying to cope with supporting internationalization only in the few cases where it needs to be supported. And the corner cases for that i18n support are really nasty.
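
For a rough idea of why a lexer is needed rather than a regex, here is a simplified sketch in Python. It is not the jsmime code: `tokenize` and `SPECIALS` are hypothetical names, and it ignores folding, obsolete syntax, and all of the i18n cases. It only shows that comments nest (so a depth counter is required) and that quoted strings can hide separators such as commas.

```python
# Characters treated as standalone tokens in this sketch (an assumption
# loosely based on the "specials" of the mail header grammar).
SPECIALS = set('()<>[]:;@\\,."')
WHITESPACE = ' \t\r\n'

def tokenize(header):
    """Split a header value into (kind, text) tokens. Simplified sketch."""
    tokens = []
    i, n = 0, len(header)
    while i < n:
        ch = header[i]
        if ch in WHITESPACE:
            i += 1
        elif ch == '(':
            # Comments may nest, so track depth instead of matching a pattern.
            depth, start = 0, i
            while i < n:
                if header[i] == '\\' and i + 1 < n:
                    i += 2          # skip an escaped character
                    continue
                if header[i] == '(':
                    depth += 1
                elif header[i] == ')':
                    depth -= 1
                    if depth == 0:
                        i += 1
                        break
                i += 1
            tokens.append(('comment', header[start:i]))
        elif ch == '"':
            # Quoted strings run to the next unescaped double quote.
            start = i
            i += 1
            while i < n and header[i] != '"':
                i += 2 if header[i] == '\\' and i + 1 < n else 1
            i += 1                  # consume the closing quote
            tokens.append(('quoted', header[start:i]))
        elif ch in SPECIALS:
            tokens.append(('special', ch))
            i += 1
        else:
            start = i
            while i < n and header[i] not in SPECIALS and header[i] not in WHITESPACE:
                i += 1
            tokens.append(('atom', header[start:i]))
    return tokens

tokens = tokenize('"Doe, John" (work (old)) <john@example.com>')
for kind, text in tokens:
    print(kind, text)
```

The comma inside the quoted string and the nested comment both come out as single tokens, which is the part a flat pattern match tends to get wrong.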


Once again, you prove me completely wrong. Thanks for the detailed correction.



