Regex Golf

josephlord · on Dec 20, 2013

  ^(?!(..+)(\1)+$)

Why does that work on primes? I got it by mistake when fiddling with the parenthesis locations but I was expecting to have to deal with xx separately.

surreal · on Dec 20, 2013

Nice find. It works because it rejects "2 or more x's" repeated "2 or more times". So xx doesn't get rejected, but any multiple of that (xxxx, xxxxxx, ...) will be. The same way xxx doesn't get rejected, but any multiple of that (xxxxxx, xxxxxxxxx, ...) will be.

You've solved it using the actual definition of prime numbers, no trickery needed. Well played.

FYI, you don't need brackets around the \1, so can score 286.

Aissen · on Dec 20, 2013

More interesting than the definition of primes, it's almost the definition of multiplication that is embedded in this regex. We have two numbers(of occurrences) being multiplied:

- the first one is represented by the group (..+) it represents the number of occurrences n between 2 and +∞

- the second one is represented by (\1)+. We will repeat the first number m times, between 1 and +∞ times.

So the result of the multiplication is n*(m+1), which cannot be a prime. We just have to take the opposite with negative lookahead. It's very beautiful indeed.

See http://regex101.com/r/qN2fQ8 or http://www.regexper.com/#^%28%3F!%28..%2B%29\1%2B%24%29 to follow the above explanations.

vijucat · on Dec 21, 2013

Thanks for the web site links! Both are pretty interesting and I actually learned something from the detailed description(s) that regex101 provides.

(I learned that for (\1)+, "Note: A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data")

vijucat · on Dec 21, 2013

Later, I realized that in a regular program / on the command line, the negative lookahead can be avoided by using the !~ (doesn't match) operator,i.e., we just check that the number is not a non-prime using a simpler regex:

  DB<63> print "Matches!" if (("x" x 31) !~ /^(..+)\1+$/)

Matches!

  DB<64> print "Matches!" if (("x" x 18) !~ /^(..+)\1+$/)

The Regex Golf site only asserts matches, i.e., it's using =~. That's why the negation using negative lookahead was needed.

(The simpler regex merely looks for non-primes by matching any number of characters which are a multiple of two numbers, n x m, i.e., those which can be factorized. n comes from (..+), m comes from \1+).

josephlord · on Dec 20, 2013

Thanks, I was going for the definition of primes but was just going a bit loopy about the negation. I didn't want to trim surplus characters until I understood it but yes I can see those brackets are unnecessary.

mistermcgruff · on Dec 20, 2013

This is amazing.

chaz · on Dec 20, 2013

Shouldn't the objective be to get the lowest score if it's called "golf?"

danceonfire · on Dec 20, 2013

The goal in code golf is lower character count, not lower score - which is also the case here.

talmand · on Dec 20, 2013

Therefore, the score it provides should be a running tally of how many characters you've used to make it match the scoring system of golf; which is the lowest number of strokes wins.

The scoring system for this is incremental which is the opposite of golf.

A proper scoring system with this would provide a character limit (par) for each section and the goal would be to write a shorter regex formula to complete the task. Final score would be how many characters under or over the total character limits (course par) you scored.

Seems this is more like Regex Darts or something like that.

But it's fun nonetheless.

danceonfire · on Dec 20, 2013

I don't get what this is all about.

The title was (most probably) derived from Code Golf, which is a competition in coding something with as few characers as possible. Code Golf was derived from Golf, where you want to use as few turns as possible.

The score going up and not down, which is done because you get more points the more objectives you fulfill, does not change the objective of this game or what it is based on.

rschmitty · on Dec 20, 2013

> The score going up and not down, which is done because you get more points the more objectives you fulfill, does not change the objective of this game or what it is based on.

Golf scoring penalizes you with more "points" by how many strokes you take.

If you are playing a Par 4 and it takes you 6 swings to get in the cup you just got _penalized_ +2

If your partner gets in the cup in 2 swings he is awarded -2

Therefor "Regex Golf" has its scoring reversed

talmand · on Dec 20, 2013

From my quick read, Code Golf uses a similar scoring system as to the sport of golf, the lower the score the better.

This game is the opposite, meaning the higher the score the better.

This has nothing to do with the objectives of the games in question, just the scoring method.

To make the sport of golf have a similar scoring system as this game you would grant ten points for every stroke under par, deduct ten points for every stroke over par, and deduct one point for every penalty stroke.

maxerickson · on Dec 21, 2013

Alternatively, it's a loose analogy and the main point of the game is to save keystrokes (with the positive scoring system added to reward partial effort; if it came down to a scored competition the leaders would anyway all probably have full credit for solving the problem).

Regex golf is to code golf as paintball golf is to normal golf. Or something.

dfritsch · on Dec 21, 2013

The golf part is that you can only get the most points by writing the shortest regex possible. However, I think the game avoids just giving a score based on the number of characters. To do that, you would be required to match all on the left and none on the right (the equivalent of actually getting the golf ball in the hole). This game lets you pick up your ball whenever you want and scores you based on distance to the hole. For the golfers, think of it more like a closest-to-the-hole competition than the more common stroke play.

davidbrent · on Dec 20, 2013

I have the most trouble with regex, and something about seeing my matches as I type made this incredibly useful for me!

Inversechi · on Dec 20, 2013

The best online tool I've seen is http://www.debuggex.com/

Spittie · on Dec 20, 2013

I'll add http://www.regexper.com/ to the various sites everyone posted, I find that this one output the nicer graph.

tokenizerrr · on Dec 20, 2013

http://regexpal.com/ or for a paid desktop program http://www.regexbuddy.com/, which is most excellent

rschmitty · on Dec 20, 2013

Regexbuddy is great. Well worth the coin I paid way back when

mcescalante · on Dec 20, 2013

I've used the previously mentioned regexpal quite often. http://www.rubular.com is another one for Ruby only.

ToastyMallows · on Dec 20, 2013

While people are recommending online regex tools.. http://regex101.com/

danielweber · on Dec 20, 2013

Many times I would match the exact opposite opposite of what I wanted. Is there a general rule for inverting regexps? ^ and ?! don't seem general purpose.

josephlord · on Dec 20, 2013

You need to pin the match to the start and end of the string with ^ and $ respectively otherwise the negation just matches an empty string or other irrelevant string.

danielweber · on Dec 20, 2013

Just to follow my thought pattern.

I start with

(.)(.)\2\1

Now to invert I change it to

(?!(.)(.)\3\2)

As you say, the negation matches an empty string. So I put on anchors:

^(?!(.)(.)\3\2)$

But now it matches nothing. I'm not even sure what that regexp says. The entire string is a negation? Would anything match that regexp?

I see elsewhere on this page that the answer involves putting in an extra dummy character, putting that new negation-and-dummy-character in parens, and then requiring that, between the anchors, there be 0-or-more of negation-and-dummy-character.

^((?!(.)(.)\3\2).)∗$

Two questions:

1. Why are my backrefs still \3 and \2? I added another pair of parens. (I thought ?! might not count, but it counted in my second example above.

2. Why does abba no longer match? It has no matches to the negation-and-dummy character construct, which ∗ should match, right?

NB: I used ∗ as my asterisk to avoid bb-code.

chingjun · on Dec 20, 2013

  >  ^(?!(.)(.)\3\2)$
  >  But now it matches nothing.

Things like (?!..) are called assertions. I like to think of assertions as patterns that match the space between two characters. So (?!a) means that it matches the space, where the next character is not 'a'.

So in this case, your regex matches an empty string (since it doesn't match any character at all between ^ and $, only spaces.)

  >  ^((?!(.)(.)\3\2).)∗$
  >  1. Why are my backrefs still \3 and \2?

Well, you're referencing the second and third opening parenthesis.

  >  but it counted in my second example above

Maybe that's because your second example is wrong? I mean, I don't really know what you are trying to achieve in your second example.

By the way, my solution for abba is

  ^(?!.*(.)(.)\2\1)

I guess you'll will be able to figure out what this regex means by now :)

surreal · on Dec 20, 2013

Remember

  (?!....)

is a lookaround, so it matches 0 characters. So

  ^(?!....)$

will only match an empty string! (Think about it). Try

  ^(?!....$)

instead (the $ anchor inside the lookaround)

josephlord · on Dec 20, 2013

  ^(?!.*(.)(.)\2\1)

You may also need to fill in the places where it could be anything. The above worked on abba for me.

BTW a double space indent then formats as code on HN I think.

Procrastes · on Dec 20, 2013

Your approach scores higher, but it only matches the (imaginary) space before the "good" words.

I went with:

^(?!.(.)(.)\2\1).$

with the thought that if I really wanted those matches I would want the whole strings.

Fun game!

Procrastes · on Dec 20, 2013

oops,

  ^(?!.*(.)(.)\2\1).*$

With proper formatting hopefully.

criswell · on Dec 20, 2013

Yeah, I couldn't figure out how to invert the abba one (.)(.)\2\1 without help.

elwell · on Dec 20, 2013

I figured out Ranges!

  abac|accede|adead|babe|bead|bebed|bedad|bedded|bedead|bedeaf|caba|caffa|dace|dade|daff|dead|deed|deface|faded|faff|feed

Edit: /s

abus · on Dec 20, 2013

Prime:

    ^x{2,3}$|^x{5}$|^x{7}$|^x{11}$|^x{13}$|^x{17}$|^x{19}$|^x{23}$|^x{29}$|^x{31}$|x{33}

jonahx · on Dec 20, 2013

josephlord's solution is much nicer:

^(?!(..+)(\1)+$)

abus · on Dec 20, 2013

Look at the parent.

criswell · on Dec 20, 2013

[a-f]{4,} works as well... but I do like your solution!

danielsamuels · on Dec 20, 2013

[a-f]{4} is shorter.

josephlord · on Dec 20, 2013

  ^[a-f]*$

Is the same length too.

hadrel · on Dec 20, 2013

Glob (333) without cheating (replace ⁕s with asterisks, they get turned into italics):

^(\⁕?)(\w⁕)(\⁕?)(\w⁕)(\⁕?)(\w⁕) .⁕ ((.(?!\1))+|\1)\2((.(?!\3))+|\3)\4((.(?!\5))+|\5)\6$

Edit: ((.(?!\1))+|\1) is used to conditionally match .+ iff a * has been found. .(?!\1) Matches any character if it is followed by \1. When * has been found then it matches no character, when * is not found it matches every character.

Edit 2: Formatting to avoid the *s becoming italics :/

josephlord · on Dec 20, 2013

  le[^*]|co|ito|dr|^p|su|gi|nr|hw|fa|[eo]b|ide

Glob 376 although it isn't pretty and better may be possible.

Indent two spaces with a blank line above to avoid code mangling.

falsedan · on Dec 21, 2013

Glob 380

  ^([wlpb]|c[hor]|do|re|mi|\*[pifvt]|\*er)

GregP91 · on Dec 20, 2013

en(?!c)|rr|il|eat|^(do|p|b|c(?!a))

Glob 386 although not the cutest way to do it.

sbirch · on Dec 21, 2013

An interesting bit on the computational complexity of solving this problem (with a slightly different scoring function):

http://cstheory.stackexchange.com/questions/1854/is-finding-...

ColinDabritz · on Dec 21, 2013

Hrm, on number 8 "Four" using:

    (.)(.*\1){3,}

I got all but the "do not match" for "Ternstroemiaceae"

The challenge appeared to be to match words with four instances of the same letter. "Ternstroemiaceae" contains four 'e's, and thus should be in the "match" column, instead of the "don't match" column, no? Did I miss something?

denkfaul · on Dec 21, 2013

Look closer, there's something different about the matches and this word.

amix · on Dec 20, 2013

JavaScript solver: https://gist.github.com/amix/8063003 ;-)

daGrevis · on Dec 20, 2013

I like the idea, but words there seems to be pretty random. I can't figure out the pattern, not even talking about writing regex... :(

maxerickson · on Dec 20, 2013

I got a 3 character pattern for the first puzzle and a 4 character pattern for the second puzzle (and reading a comment here I see there is a 2 char pattern for the second puzzle).

Only used 1 special char between them (anchoring something). So the first two sets have repeated 3 character sequences, and in one of them they follow an additional pattern.

For the third sequence, 8 character solution, lots of specials.

rplnt · on Dec 20, 2013

Read a) name of the puzzle b) the little help below the score. It gives out the pattern you should (not) look for.

edit: Not in Glob. What is Glob about?

schoen · on Dec 21, 2013

It's about implementing * as a wildcard character:

https://en.wikipedia.org/wiki/Globbing

I haven't figured out how to solve this with the parts of ERE and PCRE that I know. (I definitely don't know the entirety of PCRE.)

It's straightforward for me to write a substitution using regular expressions to create a pattern-matcher for a given glob (just anchor the ends and replace literal ? with . and literal * with .*) but here we have to do it inside a single regular expression.

I don't think there's a BRE solution if the number of stars is unbounded because I don't think this is a regular language.

jpsim · on Dec 20, 2013

Gist with my answers: https://gist.github.com/jpsim/8057500

If you look at the revisions, you'll see my 1st iteration was mostly identifying patterns, then with more and more cheating (and looking at this thread) to squeeze every point possible.

galen_tyrol · on Dec 20, 2013

https://gist.github.com/jonathanmorley/8058871

3121 points

chingjun · on Dec 20, 2013

  5.  193  ^(?!.*(.)(.)\2\1)
  11. 379  ^\*(er|[fiptv])|^([blpw]|c[hor]|do|mi|re)

JadeNB · on Dec 21, 2013

What are the rules? That is, are these Perl regexes, POSIX regexes, …? (Come to that, what is this site? Going up one level to alf.nu gives me a lot of suggestions for what I can do by modifying the address, but no clue of who's doing it on my behalf.)

hyp0 · on Dec 21, 2013

challenge: use machine learning to find the best solutions.

They might improve on those intended by exploiting accidental regularity in the corpus - though charmingly, the golf-cost of regex length helps combat this overfitting. They might also find genuinely cleverer solutions.

hadem · on Dec 20, 2013

Reminds me of Vim Golf.

http://vimgolf.com/

The_Double · on Dec 20, 2013

Does anybody know how to do math with regex? (triples)

And conditionals don't seem to work?

dudus · on Dec 20, 2013

From: http://quaxio.com/triple/

    ^([0369]|[258][0369]*[147]|[147]([0369]|[147][0369]*[258])*[258]|[258][0369]*[258]([0369]|[147][0369]*[258])*[258]|[147]([0369]|[147][0369]*[258])*[147][0369]*[147]|[258][0369]*[258]([0369]|[147][0369]*[258])*[147][0369]*[147])*$

raverbashing · on Dec 21, 2013

The idea is as thus:

Remember division rules? If the sum of digits is divisible by 3, then the number is divisible by 3

So, if you have 0, 3, 6, 9, it's as you can remove them, the value won't change

Remaining are the 2, 3, etc digit numbers divisible by 3

moron4hire · on Dec 20, 2013

569pts: ([^31]0|31|[017]2|[03]03|[^1]4|(900|01|7)5|6|[48]7|[57]8|09)$

chch · on Dec 20, 2013

Originally had 568, then saw this and improved. :)

580pts: 00(0$|3|6|9|12|15)|[^0]14|.53|^3[^38]|55|43|23|9.7

mbauman · on Dec 20, 2013

Nice. A smidge better at 582:

    5[54]|2[437]|00($|[369]|1[25])|^8[17]|^3[29]|9.7

jaytaylor · on Dec 21, 2013

One more smidgen, behold 584:

    5[54]|2[437]|00($|[369]|1[25])|^[83][1729]|9.7

grobie · on Dec 21, 2013

586:

    ^[378][12479]|00($|[369]|1[25])|5[45]|2[347]

nwellnhof · on Dec 21, 2013

589: ^[378][12479]|00($|[369]|1[25])|55|2[347]

danielweber · on Dec 20, 2013

My guess is that 147 must appear the same number of times as 258, but I'm not sure if that's even expressible in regexp.

johnlbevan2 · on Dec 20, 2013

Refined: [0369] can appear any number of times [147] and [258] must appear an equal number of times, or for any remaining: [147] must appear a multiple of 3 times [258] must appear a multiple of 3 times

recursive · on Dec 20, 2013

No. 111 is a multiple of 3.

johnlbevan2 · on Dec 20, 2013

Sadly that wouldn't work for: 140091876 147 = 4 times 258 = 1 time

tcfunk · on Dec 20, 2013

I got 187 with this

  [369](?![7238])|(?:0)12

It's sloppy and certainly not correct, but a decent score.

galen_tyrol · on Dec 20, 2013

^[01349][064] works for 197

johnlbevan2 · on Dec 20, 2013

Ternstroemiaceae contains four es; anyone else hit that issue / know what that's not in the valid results for Four? I'm guessing there's a pun in there that I didn't get :/

lsv1 · on Dec 20, 2013

Fun learning tool for regex.

chrismorgan · on Dec 20, 2013

Practise, maybe. Learning, probably not so much. Certainly fun, though, if you can cope with them!

zedadex · on Dec 21, 2013

Learning, kinda. I've often been motivated to learn something after being given a problem that can only be solved (or be solved much more easily) using it. Every other Excel trick I know is a result of that.

Trufa · on Dec 20, 2013

I like the concept but the word choice doesn't seem too "regular", it is more about catching all the particulars rather than finding a pattern as far as I can tell.

ZirconCode · on Dec 20, 2013

There are good patterns, the first regex for me was /foo/, the second /ick$/, for example. They're there.

cpeterso · on Dec 21, 2013

What purpose do the "plain strings" serve?

tareqak · on Dec 20, 2013

I got Four for 196 with

    (.).*\1.\1.*\1

and Order for 156 with

    ^a*b*c*d*e*f*g*h*i*j*k*l*m*n*o*p*q*r*s*t*u*v*w*x*y*z*$

gorhill · on Dec 20, 2013

Order for 198:

   ^[^o].....?$

Probably not what was wanted, but it works (or maybe it was to trick people onto a false path)

zedadex · on Dec 21, 2013

This right here is why test cases are always blackboxed...

XaspR8d · on Dec 20, 2013

Makes my similarly off-track score of 184 seem foolish:

  ^[a-g][^airn]|[fs].*y$|ot$

yogo · on Dec 20, 2013

Nice, I earned 197 with: ^[a-m].{1,5}$

I must say Regex golf is a lot of fun.

ekke · on Dec 20, 2013

Nice, turn it around for 199:

  ^.{5}[^e]?$

raverbashing · on Dec 21, 2013

or ^[^o].{1,5}$

198 points

acln · on Dec 20, 2013

Balance (194)

  ^<.*>$|^<<.*>>$|^<<<.*>>>$

nadinengland · on Dec 20, 2013

Balance (286)

  ^(<(<(<(<(<(<<>>)*>)*>)*>)*>)*>)*$

If we had named captures:

  ^(((?<P><)|(?<-P>>))*(?(P)(?!)))$

skeltoac · on Dec 21, 2013

Balance (288) with one bad match: ^(<(<(<(..)>)>)>)$

harpastum · on Dec 20, 2013

Balance (212)

[<>]{28}

chrismorgan · on Dec 20, 2013

Four: 199

    (.)(.\1){3}

Order: 168

    ^a*b*c*d*e*f*g*h*i*l*m*n*o*p*r*s*t*w*y*z*$

(You just didn't remove unused letters.)

chingjun · on Dec 20, 2013

Order for 195

    ^[a-g]*[^a-g]*$

ryanthejuggler · on Dec 20, 2013

Order for 170.

  ^([ab][c-gio][^d]|c[ehlo]|d|f[il]|gh|mo)

Not as clever as you guys though!

burke · on Dec 20, 2013

Order for 196:

    ^[a-i]*[l-z]*$

spystath · on Dec 20, 2013

Or just (.).\1.\1.\1 for Four (198)

MereInterest · on Dec 20, 2013

So long as we are going down this route, you can shave off another character with (.)(.\1){3} (199).

maxerickson · on Dec 20, 2013

Can chop out one check to save a character:

(.)...\1.\1

scott_karana · on Dec 20, 2013

For Abba...

Why doesn't (.)(.)\2[^\1] work?

I thought backreferences matched the captured literal, so negating it would match? But this looks the same as (.)(.)\2\1...

goldenkey · on Dec 20, 2013

You cannot negate a capture, only a character literal. A capture might only be a character but it is NOT a literal,

scott_karana · on Dec 20, 2013

Gotcha. That makes perfect sense. Thank you! :)

goldenkey · on Dec 21, 2013

You got it :-)

annasaru · on Dec 20, 2013

(.)(.)\2\1 ?

on Dec 20, 2013

[deleted]

shdon · on Dec 20, 2013

Anchors (208):

k$

Ranges (202):

  ^[a-f]+$

Backrefs (199):

  (\w{3}).*\1

gorhill · on Dec 20, 2013

Backrefs (201):

(...).*\1

selmnoo · on Dec 20, 2013

Wait, so what's with the 'point system' here? Why's shdon answer for backrefs 199 and yours 201? (and while you're at it, could you briefly explain (...).*\1 for the regex newbies?)

josso · on Dec 20, 2013

Regex explained:

    (...) # Match exactly 3 (the dots) characters and save them as a group (the parenthesis)
    .* # Match any character (the dot) 0 or more times (the asterisk)
    \1 # Reuse the first group

pyed · on Dec 21, 2013

backreferencing is new to me, but why this: (...).*(...)

isn't working !! ain't I back referencing the first group which is (...) ?

josso · on Dec 22, 2013

It isn't working because (...) is just matching 3 characters. What your regex is saying, is basically: Match any 3 characters, then any character 0 or more times, then any 3 characters.

You're reusing the regex and not the match. Back referencing essentially means that you'll match whatever is matched the first time, rather than reusing the regex pattern.

I hope that made a little sense. Grouping, capturing, matching and backreferences is part of what makes regex tricky and very powerful. You might want to use some of the online tools, which can help explain the regex visually.

pyed · on Dec 24, 2013

thanks for clearing that for me, I figured it out my self after giving it sometime.

taylorfausak · on Dec 20, 2013

"You are deducted one point per character you use, and ten if you match something you shouldn't."

ddevault · on Dec 20, 2013

I got ranges at 202 with

  [a-f]{4}

198 on plain strings with

  .\*foo[lt]?.*

Then I stopped trying.

gorhill · on Dec 20, 2013

Plain string: just 'foo' works.

easy_rider · on Dec 20, 2013

shdon · on Dec 20, 2013

Indent your regexes by two spaces and precede them with a blank line, that will preserve asterisks ;)

subbz · on Dec 20, 2013

This will earn you some more points: Anchors: k$ Ranges: [a-f]{4}

easy_rider · on Dec 20, 2013

"sick" :) one for anchors. I was being quite complex with \w+(ick)$ Maybe I should read from right to left.

That ranges one was pretty obvious though. Stuck at abba.

maxerickson · on Dec 20, 2013

Why publish them?

You only need 1 range.

sciolistse · on Dec 20, 2013

that's twice as long as it has to be.. not that it matters.

onaclov2000 · on Dec 20, 2013

Glob: 277, it didn't match all and not match the others, but it's reasonably high. ^([bcdlmpwr]|\*[efptv])

moron4hire · on Dec 20, 2013

378:

  ^(\*(er|[fiptv])|b|c(?!a)|do|le|mi|p|re|w)

chingjun · on Dec 20, 2013

379:

  ^\*(er|[fiptv])|^([blpw]|c[hor]|do|re|mi)

and it has "do re mi" in it!

moron4hire · on Dec 20, 2013

380:

  ^.[^bds].*[^e-kjotz] .* [^eiz].+[^lx]..$

ekke · on Dec 20, 2013

390:

  ^[lwp]|fa|r[ro]|[isytd]l|c$|de

3rd3 · on Dec 21, 2013

390:

    de|eat|fa|rr|ow|[rl]o|^p|[cd]$

jmallard · on Dec 21, 2013

Glob 392: ^p|c$|[wrbc][npbro]|ai|fa|il

nwellnhof · on Dec 21, 2013

Glob 396: ai|c$|ep|[bcnprw][bnopr]

nwellnhof · on Dec 21, 2013

Glob 397: ai|c$|^p|[bcnrw][bnopr]

ZirconCode · on Dec 20, 2013

For "6. A man, a plan", I thought it was impossible to match palindromes with regex, am I wrong?

quarterto · on Dec 20, 2013

It gives you hints below the score, for 6 it's:

  You're allowed to cheat a little, since this one is technically impossible.

No idea how you cheat...

EDIT, SPOILERS: I get 170 with ^(.?)(.)(.).?\3\2\1$

mryingster · on Dec 20, 2013

176 with ^(.)(.).*\2\1$

hyp0 · on Dec 20, 2013

  ^(.)[^p].*\1$           # 177, "cheat a little"

Yen · on Dec 21, 2013

You're pretty much correct. True regular expressions don't have the expressive power to decide whether arbitrary-length strings are or aren't palindromes.

That said, 'regular expressions', as used in most programming languages, have extensions that extend the expressive power. One such extension is the matched group & backreference, used in other commentor's answers.

From a theoretic stance, these aren't really 'regular expressions', but that's what we call them in practice.

endophage · on Dec 20, 2013

Seems very very very broken. The regex "[a]" apparently matches "crenel"

pomfpomfpomf3 · on Dec 20, 2013

It doesn't. The green ✔ indicates that you've completed the task successfully — that is, your regex does not match "crenel".

shawabawa3 · on Dec 20, 2013

It's a bit confusing. Ticks on the left means it did match, ticks on the right means it didn't

Pxtl · on Dec 20, 2013

I hit enter and nothing happens.

danielweber · on Dec 20, 2013

Use the yellow box, not the name box that gets your focus when you land on the page.

This confused me for a few minutes.

bencoder · on Dec 20, 2013

what's the pattern on "Abba"? I thought it was just to exclude doubled letters but I have doubles on two words on the left hand side as well (noisefully and effusive, in case the word lists are the same)

shdon · on Dec 20, 2013

Excluding doubled consonants that have the same vowel on both sides of the pair:

  ^((?!([aeiou])([^aeiou])\3\2).)*$

The following also works for the testcases and is shorter:

  ^((?!(.)(.)\3\2).)*$

surreal · on Dec 20, 2013

Can get 2 more points with:

  ef|^((?!(.)\2).)*$

But I reckon yours wins for having a pair of breasts in the middle

moron4hire · on Dec 20, 2013

well, if you are strictly interested in matching with the fewest characters, regardless of the apparent pattern:

  ^(.)(.).*\2\1$

danielweber · on Dec 20, 2013

Your anchors at the end make it match nothing.

moron4hire · on Dec 20, 2013

sorry, misreplied. This was for palindromes.

abus · on Dec 20, 2013

5. Abba (^unv|.u|z|ph|mi|st|vi|tan)

haha

joelanman · on Dec 20, 2013

am I being a bit slow? Why doesn't [^g-z] work on 'ranges'?

martinml · on Dec 20, 2013

Because for example "beam" matches [^g-z]. That is, it has a letter somewhere that is not between g and z (namely e and a). I came up with ^[a-f]+$ but I'm guessing it could be shorter :)

Edit: ah, every word in left column has 4 a-f letters. So [a-f]{4} is a shorter match.

joelanman · on Dec 20, 2013

ah you're right :) I was being slow, what I thought I wrote was 'words consisting only of letters that arent g-z'

bluedino · on Dec 20, 2013

Another dumb question - why does 'beam' not match [bdf][ae] ?

maxerickson · on Dec 20, 2013

It's complaining because it does match.

bluedino · on Dec 20, 2013

Oh. I told you it was a dumb question.

dsschnau · on Dec 20, 2013

Me and a coworker totaled 3079 points. Anyone beat it?

ekke · on Dec 20, 2013

Got hooked and to 3202, but ca 10-20 more points could be gained according to answers here and there: https://gist.github.com/jonathanmorley/8058871

Kudos to the author of the game, good job.

PS. http://regexcrossword.com/

ddebernardy · on Dec 21, 2013

Doesn't seem to do anything on an iPad...

xarien · on Dec 20, 2013

Why is Kesha an optimal answer? (#2 k$) ;)

jhight · on Dec 20, 2013

Spoiler alert (201 points): f[ao][no]

elwell · on Dec 20, 2013

You can just write: foo

jhight · on Dec 20, 2013

Oh, there's more than one step, I see.

easy_rider · on Dec 20, 2013

(.+|)foo(.+|)

Lol so awesome this. I oblige. Much love!

ryanthejuggler · on Dec 20, 2013

Powers:

    ^((((((((((x)\10?)\9?)\8?)\7?)\6?)\5?)\4?)\3?)\2?)\1?$

I feel like there's gotta be a sneakier way of doing this.

XaspR8d · on Dec 20, 2013

Completely not sneaky at all, but does reasonably well at 71 points:

  ^(x|xx|xxxx|x{8}|x{16}|x{32}|(x{64})+)$

rplnt · on Dec 20, 2013

Mine is a bit shorter, though a bit more "meh" as well.

    ^x{32}$|^(x{2}){1,8}$|^(x{64})+$|^x$

galen_tyrol · on Dec 20, 2013

improved, gives 80

    ^(x|(xx){1,9}|x{32}|(x{64})+)$

rplnt · on Dec 21, 2013

I tried to get rid of those redundant ^ and $ but it somehow didn't work. I probably forgot to put it all inside one group.

alanh · on Dec 22, 2013

They are not redundant. Any string of one or more exes is going to constrain a substring of exes of length 2^n (consider n=0 for a trivial proof), so you do need those anchors!

rplnt · on Dec 23, 2013

I meant my previous post where I had them in every single possibility.

cleaver · on Dec 20, 2013

I did similarly, but managed to squeeze out a few more points:

^(((((((((xx?)\9?)\8?)\7?)\6?)\5?)\4?)\3?)\2?)\1?$

(60 points)

willchang · on Dec 21, 2013

A non-cheating solution, 79 points.

  ^((xx?|xxxx)(\2{7})?)(\1{63})?$

danielweber · on Dec 20, 2013

I was working on this pattern:

^(x{1,2}){1,2}$

But it wedged my browser as I started adding more terms. :/

delluminatus · on Dec 20, 2013

58 points:

    ^x$|^(x+)\1$

sriharis · on Dec 21, 2013

^ answers all questions.