A column in some old programming magazine (I think it might have been Dr. Dobb's) had a similar ongoing contest. Some situations could cause a compiler's error-message generator to loop indefinitely, and some compilers would produce a fair amount of error text given a zero-length input, but ignoring these and only counting finite, nontrivial-input cases was still kinda fun.
256-byte C++ busy beavers are going to run longer than the universe has existed. Expecting to run the tests seems like a misguided way to do the judging.
Could you explain a bit more how the busy beaver problem applies to this context?
Are you proposing that the generated error messages are analogous to the tape in a busy beaver Turing machine, and that the 256 bytes of C++ are the machine's state-transition definition?