Some comments about Duff's Device from a compiler writer:
The biggest problem with this technique, from an optimization perspective, is that it creates an irreducible control-flow graph. In simple terms, this is a cycle in the graph that has two entry points from outside the cycle, and as a result, it is no longer considered a loop. This means you're going to disable any loop optimization--so Duff's Device is only worth it if you're going to write the optimal loop form already, and the compiler isn't going to arrive at that form.
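For reference, here's a modernized sketch of the device itself (Duff's original was pre-ANSI C and predates volatile; count is assumed positive). The switch jumps straight into the middle of the do/while body, and that jump is the second entry point into the cycle:

    void send(volatile short *to, short *from, int count) {
        int n = (count + 7) / 8;   /* number of trips through the loop */
        switch (count % 8) {       /* jump into the unrolled body */
        case 0: do { *to = *from++;
        case 7:      *to = *from++;
        case 6:      *to = *from++;
        case 5:      *to = *from++;
        case 4:      *to = *from++;
        case 3:      *to = *from++;
        case 2:      *to = *from++;
        case 1:      *to = *from++;
                } while (--n > 0);
        }
    }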
As for whether the compiler will arrive at that form on its own, optimization level matters: -Og and -O1 won't run loop optimizations, and -Os and -Oz might not. But if you're compiling with -Og or -O1, you don't actually care about performance in the first place. And with -Os or -Oz, the loop unrolling implied is of questionable utility (since it increases code size). If you really want it anyway... just add a pragma. All the major compilers support a loop-unrolling pragma.
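For instance, with GCC 8+ (#pragma GCC unroll) or Clang (#pragma clang loop unroll_count), something like this sketch asks for the unrolling without contorting the control flow; the function and its body here are just illustrative:

    void copy_out(volatile int *to, const int *from, int n) {
        /* Clang spelling: #pragma clang loop unroll_count(8) */
        #pragma GCC unroll 8
        for (int i = 0; i < n; i++)
            *to = from[i];
    }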
The benchmarks introduced to justify using Duff's device are... bad. In the original use, the loop was basically doing this:
    void duffs(volatile int *x, int *y, int N) {
        for (int i = 0; i < N; i++)
            *x = *y++;
    }
Here, what's being done instead is:
    void bad_duffs(int *x, int y, int N) {
        for (int i = 0; i < N; i++)
            *x += y++;
    }
That doesn't seem like much, but it makes the whole loop optimizable to the closed form *x += N * (N - 1) / 2; if the compiler can work out that x cannot point to y or N, and that y starts out at 0.
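Concretely, the loop can fold to a single closed-form store, something like this sketch (names mine; ignoring overflow):

    void bad_duffs_folded(int *x, int y, int N) {
        /* *x += y + (y+1) + ... + (y+N-1), computed in closed form */
        if (N > 0)
            *x += N * y + N * (N - 1) / 2;
    }

Once that happens, the benchmark isn't measuring the loop at all.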
The last example makes it even worse by doing ++*x; instead, which means it comes down entirely to whether the compiler can compute N in the first place. But to justify the technique, N is made obtusely hard to compute, and the "Duff's Device" version doesn't even increment it as many times as it's supposed to.
Correct. Duff's device is for MMIO which means it needs to do explicit memory writes. We are not doing arithmetic here, we're doing I/O. In C (and C++) you express that by labelling the variable volatile.
Today MMIO is very slow, so you just don't need unrolling. Or, from another perspective, everything else has gotten so much faster relative to MMIO that, again, you don't need unrolling. Your CPU can do some arithmetic, make a conditional jump, and still schedule the next MMIO write in plenty of time. Therefore Duff's device is now irrelevant.
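For contrast, the pattern the device was built for looks like this sketch (the register name and address are made up): each assignment through the volatile pointer is a store the compiler must actually emit, in order, once per byte.

    #define UART_TX ((volatile unsigned char *)0x10000000u)  /* hypothetical MMIO register */

    void uart_send(const unsigned char *buf, int n) {
        for (int i = 0; i < n; i++)
            *UART_TX = buf[i];  /* one real store per byte; not elided or merged */
    }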
The ++*x is completely toxic in the context Tom Duff wrote this for, where we want actual writes, because then it actually amounts to this:
    int tmp = read_mmio();
    tmp++;
    write_mmio(tmp);
And immediately you should be filled with dread: who said we could read this MMIO register? What happens if the contents of the MMIO register change while we're twiddling tmp? This all seems like a very bad idea.
What's more curious than the irreducibility issue is: how on earth does the syntax ever get accepted by the parser?
I mean, the lex/yacc grammar has to reflect the nestability of the while/switch [1], and ditto if you do recursive descent, so how is it possible this even parses?
Because the switch statement is not multiple conditions with a funny syntax, it is a computed goto with funny labels. Those labels are permitted to be anywhere at all inside its associated statement, including perverse things like
    switch (x) default: {
        if (y) case 1: case 2: return;
        /* more stuff with more labels */
    }
Ugh, very simply? Any STATEMENT is allowed to be, among other things, a LABELED_STATEMENT which is (<ident> ":" STATEMENT | "case" <constexpr> ":" STATEMENT | "default" ":" STATEMENT). Then in the semantic pass it's checked that labeled statements starting with case/default must be inside some switch. That's it.
This is a good question. The grammar of the C switch statement is not context-free. In practice this is easiest to resolve by just allowing case labels (which in the end are just decorated goto labels!) to appear anywhere, when it comes to parsing, and only check the existence of an enclosing switch during semantic analysis, which is necessarily context-sensitive anyway.
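A tiny illustration of that split (GCC's wording, roughly, is "case label not within a switch statement"):

    void f(int x) {
        case 1: ;   /* parses fine as a labeled statement; it's the semantic
                       pass that rejects it for lacking an enclosing switch */
    }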
Where’s the problem for context-freeness here? Just have two copies of every statement production, one that doesn’t allow for case labels and one that does, have function bodies start as the former and switch bodies unconditionally switch to the latter. It’d be thoroughly unpleasant to deal with in an implementation and not very helpful for the eventual user, but as a pure existence proof I don’t see any problems.
Yes, that's true. Maybe I should say that it's effectively context-sensitive. In any case, the grammar as specified by the standard is context-free and the restriction of where case labels can occur is a separate semantic requirement.
>> The biggest problem with this technique, from an optimization perspective, is that it creates an irreducible control-flow graph.
The technique is considered obsolete with today's compilers specifically because they can do loop optimizations like this without you having to write strange C code.
OTOH you never know what those wacky compiler guys are going to do. For example, it seems GCC hasn't been doing any vectorization on x86-64 at plain -O2, even though the ISA has required SSE2 as a baseline from day one. This is being fixed (to some extent) in the next release.
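If you want to check what your GCC does, the flags below are real (the -O2 default changed in GCC 12); the kernel is just a stand-in:

    /* gcc -O2 -fopt-info-vec -c saxpy.c
     *   older GCC: silence, i.e. no vectorization at -O2
     * gcc -O2 -ftree-vectorize -fopt-info-vec -c saxpy.c
     *   should report the loop below as vectorized */
    void saxpy(float * restrict y, const float * restrict x, float a, int n) {
        for (int i = 0; i < n; i++)
            y[i] += a * x[i];
    }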