Even though he has a good point: strings are lists, and erlang is good at lists ...

jerf · on Nov 28, 2009

Frankly, I think "native regexps" has proved a mistake, not a virtue. If you're using so many regexps in your code that you actually care whether the syntax is optimized for it, the odds of you Doing It Wrong (TM) are very, very high. (And it has to be direct use, too... if you want something like a regexp-based dispatch map like Django uses, Perl doesn't even have a significant character advantage over Python; r"" vs qr//.) Making it easy to do the wrong thing and harder to do the right thing (than the wrong thing) has certainly wrecked up a lot of Perl code I've had to deal with.

The majority of my professional programming is in Perl with many other programmers, and the number of times I see people do something like $settings =~ /read/ over a string containing settings, often complete with more than one setting that has the substring "read" even though they're only looking for one particular one... oi. Makes me sick.
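
To make the failure mode concrete, here is a minimal sketch in Python rather than Perl. The settings format (`key=value` pairs separated by `;`) and every name in it are assumptions made up for illustration, not anything from a real codebase. An unanchored match for "read" succeeds on any key that merely contains the substring, while splitting the string into a dict and looking up the exact key does not.

```python
import re

# Hypothetical settings string; the keys are invented for illustration.
SETTINGS = "readonly=0;read=1;thread_count=4"


def has_read_loose(raw):
    """Unanchored match: true if "read" appears anywhere in the string."""
    return re.search(r"read", raw) is not None


def parse_settings(raw):
    """Split 'key=value;key=value' into a dict (assumed format)."""
    return dict(item.split("=", 1) for item in raw.split(";") if item)


def has_read_strict(raw):
    """True only if the exact key "read" exists and is set to "1"."""
    return parse_settings(raw).get("read") == "1"


# No "read" setting here at all, yet the loose check still fires,
# because "readonly" and "thread_count" both contain the substring.
without_read = "readonly=0;thread_count=4"
print(has_read_loose(without_read), has_read_strict(without_read))
print(has_read_loose(SETTINGS), has_read_strict(SETTINGS))
```

The point is not that the regexp is slow or wrong as a regexp; it is that it answers a different question than the one being asked.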

aaronblohowiak · on Nov 28, 2009

i think this is a cultural thing. ruby is at least as convenient to deal with regexes as perl, but we don't have the same kinds of problems. i think this is because in ruby, the culture is to use the hippest tool for the job. frankly, regular expressions aren't very hip. YAML or JSON is much more hip and therefore likely to fill that niche. Perl users made regular expressions something that we all expect each other to know. Did they go too far? Seems that way (just like java users' obsession with introducing abstraction layers helped the programming community learn DP.)

Your point implies that exposing a powerful mechanism that is easily abused should be avoided. I disagree. I think this is a matter of culture and not "law".

adamc · on Nov 28, 2009

Picking what approach to use based on whether it is "hip" is just a terrible way to write software.

ellyagg · on Nov 28, 2009

Aside from the fact that regex are implemented as a library, can you elaborate how support for them is not that good? I think that's wrong:

http://www.erlang.org/doc/man/re.html

mahmud · on Nov 28, 2009

Lisp doesn't have native regexps, but of the two main Lisp regexp libraries, one, cl-irregsexp, is 3 times faster than Perl, 6 times faster than Python, 7 times faster than C, and 8 times faster than Ruby.

Scroll to the bottom of the page http://common-lisp.net/project/cl-irregsexp/

If you have been following some recent papers on regexp performance, there is consensus that things could be a lot faster with better algorithms. I expect the game to change dramatically soon.

alxv · on Nov 28, 2009

The cited benchmark is completely bogus.

The benchmark compares the different implementations based on a single trivial regular expression: /indecipherable|undecipherable/. You simply cannot claim one regex engine is faster than another from such a poor experiment. A plain Boyer–Moore string search will obviously outshine any regex engine on that pattern, since it is just two literal strings.
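
A small Python sketch of why that pattern says little about a regex engine: it is an alternation of two literals, so the same question can be answered with plain substring search and no regex machinery at all. The function names and sample strings are assumptions for illustration, and no timings are claimed here.

```python
import re

# The pattern from the benchmark under discussion.
PATTERN = re.compile(r"indecipherable|undecipherable")
WORDS = ("indecipherable", "undecipherable")


def match_with_regex(text):
    """Answer the question with the regex engine."""
    return PATTERN.search(text) is not None


def match_with_substring(text):
    """Answer the same question with plain substring search."""
    return any(word in text for word in WORDS)


samples = [
    "the note was indecipherable",
    "an undecipherable scrawl",
    "perfectly legible text",
    "",
]

# Both approaches agree on every sample.
for text in samples:
    print(match_with_regex(text) == match_with_substring(text))
```

A benchmark meant to compare engines would need patterns that substring search cannot handle, such as character classes, repetition, or backtracking-heavy alternations.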

mahmud · on Nov 28, 2009

Ouch! Indeed, it's a lousy benchmark; I only recommended it from memory because it blew me away the first time I saw it.

FWIW, if anybody can recommend a good benchmark, I would be happy to do a write-up since I am proud of the regex performance of the other CL library (cl-ppcre).

rjurney · on Nov 28, 2009

This is true - looked at making a very concurrent webcrawler in Erlang, and the regex bit was painful.

babo · on Nov 28, 2009

That changed as of 12Bx: regexps are now enjoyable, but still not as rich as Perl's.

tlack · on Nov 28, 2009

Maybe someone should write a parse transform for dealing with them?

rjurney · on Nov 28, 2009

Cool, I'll look again. Thanks.

kscaldef · on Nov 28, 2009

Hmm... I also wrote a concurrent webcrawler in Erlang, and at no point was I tempted to use regular expressions.

rjurney · on Nov 28, 2009

Then we were crawling for different purposes - imagine that! :)

aaronblohowiak · on Nov 28, 2009

instead of being snarky, can you please provide the reason why regular expressions were required?

rjurney · on Nov 28, 2009

I needed to do pattern matching to pull data out of many different formats, then clean the data before processing it. We were parsing radio station song feeds, and the data is varied, chaotic and often unavailable. This was also a port of a perl POE app, and so it was regex intensive in its original implementation and I don't see how you could effectively deal with such noisy text without regexes.
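
A rough sketch of that kind of job, in Python rather than Erlang or Perl. The two feed formats ("Artist - Title" and "Now playing: Title by Artist"), the function names, and the cleanup rule are all assumptions invented for illustration; the real feeds were far messier. Each pattern is tried in turn, and the captured fields are normalised before use.

```python
import re

# Hypothetical feed formats, most specific first.
PATTERNS = [
    re.compile(
        r"^\s*now playing:\s*(?P<title>.+?)\s+by\s+(?P<artist>.+?)\s*$",
        re.IGNORECASE,
    ),
    re.compile(r"^\s*(?P<artist>.+?)\s+-\s+(?P<title>.+?)\s*$"),
]


def clean(value):
    """Collapse runs of whitespace and trim the ends."""
    return re.sub(r"\s+", " ", value).strip()


def parse_song(line):
    """Return {'artist', 'title'} for a recognised line, else None."""
    for pattern in PATTERNS:
        match = pattern.match(line)
        if match:
            return {
                "artist": clean(match.group("artist")),
                "title": clean(match.group("title")),
            }
    return None  # unknown format or feed unavailable


for line in ["  Miles Davis  -  So   What ",
             "Now Playing: So What by Miles Davis",
             "station offline"]:
    print(parse_song(line))
```

Adding a new station then means adding one more pattern to the list, which is roughly why the original was so regex intensive.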

As to snark - his reply was snark, I just replied in kind. Regexes are incredibly useful for all kinds of things, most especially in parsing data from web services. The fact that he didn't need them isn't 'funny,' it means we were doing different things.

In any case - using Erlang for something like this is so much win. The POE and threaded implementations got real ugly real fast as we scaled them up. I knew Erlang could do it - across boxen, without a problem. It sounds like the regex libs have improved, and I look forward to using Erlang again in the future.

Happy? :D