
Just adding an @ to the string match would make it a bit more robust. (Would still be vulnerable to jim@my.domain.their.domain, so add a $ on the end if it’s a regexp.)
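For anyone following along, here is a rough Python sketch of the three levels of strictness being discussed. It assumes the original check was a plain substring test, which is a guess since the code itself isn't shown; `TARGET` and the function names are made up for illustration.

```python
import re

TARGET = "my.domain"  # hypothetical stand-in for the trusted domain

def naive_contains(email: str) -> bool:
    # Assumed original check: passes if the domain appears anywhere in the string.
    return TARGET in email

def with_at(email: str) -> bool:
    # Requiring the "@" directly before the domain rejects look-alikes
    # such as jim@evilmy.domain.
    return "@" + TARGET in email

def with_at_anchored(email: str) -> bool:
    # Anchoring at the end also rejects addresses where the trusted domain
    # is followed by an attacker-controlled suffix.
    # \Z is used rather than $ because Python's $ also matches just before
    # a trailing newline.
    return re.search(r"@my\.domain\Z", email) is not None
```

With these definitions, jim@evilmy.domain gets past `naive_contains` only, jim@my.domain.their.domain gets past `with_at` but not `with_at_anchored`, and jim@my.domain passes all three.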

But even with the most rudimentary web-dev languages you can replace the inner string match with a lowercase transform, split on @ and perform an exact string compare. Insanely simple stuff. Probably still a one-liner in any sane/productive framework.




Frameworks usually have some sort of email parser. Email parsing is nontrivial. But I agree matching .*?@domain.com$ would probably work fine.


Definitely use a real email address parser if it’s available and easy, or if you’re dealing with unknown email addresses. But absent any strange circumstances there’s also nothing wrong with basic string manipulation if it’s done properly (e.g. split on @ and test for an exact string match, case insensitive). As a personal preference, I’d choose that over regexp.


Hackers deliberately create strange circumstances; it's the primary way to find exploits. Any code that relies on a lack of strange circumstances is a time bomb.


There aren't too many strange circumstances for a properly written split/test routine. Described more precisely:

  1. Split on @
  2. Get last string from array
  3. Convert to lowercase
  4. Perform exact string compare against target domain

It's possible that there's some window for obscure unicode hijinks, but I'd posit that a regexp parser or a "proper" email parsing library is just as much at risk. Possibly more so, as those would be significantly more complicated and involve significantly more code.
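The four steps above could be sketched in Python roughly like this. The function name and parameters are hypothetical, and the guard for a missing @ is an addition that isn't in the list: without it, a bare string equal to the domain would pass step 4.

```python
def is_from_domain(email: str, target_domain: str) -> bool:
    """Return True if the part after the last "@" equals target_domain."""
    parts = email.split("@")        # 1. split on @
    if len(parts) < 2:
        # Added guard (not in the list above): no "@" means no domain part.
        return False
    domain = parts[-1]              # 2. get last string from array
    # 3 and 4: lowercase both sides, then exact compare.
    return domain.lower() == target_domain.lower()
```

So `is_from_domain("Jim@MY.DOMAIN", "my.domain")` is true, while jim@my.domain.their.domain and jim@their.domain.my.domain are both rejected because the comparison is on the whole domain, not a suffix or substring.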


What is the purpose of the ? here?


It makes the preceding * less "greedy". I don't think it has any effect on the set of strings matched by this regexp, though, which is a simple string suffix check.
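A quick way to convince yourself, sketched in Python with a handful of made-up sample strings (not a proof, just a spot check). Both patterns are copied as written in the thread, unescaped dot included.

```python
import re

# Lazy and greedy versions of the same suffix check.
LAZY = re.compile(r".*?@domain.com$")
GREEDY = re.compile(r".*@domain.com$")

def verdicts(s: str) -> tuple:
    """Return (lazy matched, greedy matched) for one input string."""
    return (LAZY.search(s) is not None, GREEDY.search(s) is not None)

# Hypothetical sample inputs.
SAMPLES = [
    "jim@domain.com",
    "jim@domain.com.evil.example",
    "a@b@domain.com",
    "no-at-sign",
    "",
]

# The accept/reject decision is the same for every sample.
for s in SAMPLES:
    lazy_ok, greedy_ok = verdicts(s)
    assert lazy_ok == greedy_ok

# Laziness only changes what a capture group grabs, not whether a match exists.
lazy_prefix = re.match(r"(.*?)@", "a@b@domain.com").group(1)
greedy_prefix = re.match(r"(.*)@", "a@b@domain.com").group(1)
```

Here `lazy_prefix` ends up as "a" and `greedy_prefix` as "a@b", which is the only place the ? makes a visible difference.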


I agree, but the dot should be escaped because it matches any character, so "@domain\.com$" should work.


Or use [.] so it's super clear on the a-human-is-reading-it parse.


I don't think that's clearer, because for [.] I need to remember that . does not need to be escaped in character classes, whereas \. is quite clearly an escaped literal character without any advanced regexp knowledge.
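For comparison, a small Python sketch of the three spellings of the dot that came up in this subthread. The sample addresses and the `mail.` pattern are invented for illustration.

```python
import re

UNESCAPED = re.compile(r"@domain.com$")    # "." matches any character except newline
ESCAPED = re.compile(r"@domain\.com$")     # backslash form: literal dot
BRACKETED = re.compile(r"@domain[.]com$")  # character-class form: literal dot

def accepts(pattern, email: str) -> bool:
    """Return True if the pattern matches somewhere in the address."""
    return pattern.search(email) is not None

# The unescaped dot lets an arbitrary character stand in for the ".".
loose_hit = accepts(UNESCAPED, "jim@domainxcom")

# With a multi-label domain (hypothetical pattern), the stray match can be
# an ordinary-looking domain under .com.
registrable_hit = re.search(r"@mail.domain.com$", "jim@mailxdomain.com") is not None
```

Both `\.` and `[.]` reject jim@domainxcom and accept jim@domain.com, so the choice between them is purely about readability.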


I dunno, but I've seen a bug like this in prod while consulting.


Non-greedy match (match what's necessary, nothing more).

The default is greedy... match match match nom nom nom!



