My Favourite Diff

kazinator · on April 5, 2020

Recent GCC can diagnose the situation where a switch case falls through without a /* fallthrough */ comment.[1]

C++17 has a comment-like language feature for demarcating desired fallthrough.

Fallthrough in case processing is often useful; it's just not such a great default.

In the TXR Lisp tree-case[2] construct, any case which returns the : object falls through to the next case.

This is used all over the place in the library.

--

[2] https://gcc.gnu.org/onlinedocs/gcc-9.3.0/gcc/Warning-Options...

[1] https://www.nongnu.org/txr/txr-manpage.html#N-03D834A5

adrianmonk · on April 6, 2020

> not such a great default

This is a tangent, but it reminds me of a personal policy of mine: if I'm thinking about changing a default, I ask myself if there should be a default at all.

The consequences of having the wrong default were serious enough that it even came to my attention. So by changing the default, am I potentially exchanging one set of problems for another?

Sometimes it would be better overall if the user is forced to explicitly state which way they want it. Not always, of course, but I like to ask the question.

adrianmsmith · on April 6, 2020

Right! I had a bug in a SQL statement to fetch tasks “order by priority” and then execute them in that order. A stupid mistake, but I must have looked at that code tens of times and I just couldn’t spot it. The bug was that that statement found then executed the lowest priority items first.

It’s not clear to me that an “order by” should have a default direction. I don’t know why one direction is more likely to be used than the other. Without a default, I would have been forced to think about the direction and wouldn’t have had that bug.

Apanatshka · on April 6, 2020

A similar example is in a "filtering" operation. I mostly have this in functional programming where you pass a predicate to filter on. But what's the perspective of the filter? Are you filtering poison from your water, or are you sifting the water looking for gold nuggets. I prefer to use functions remove and retain (respectively) to be more explicit.

kazinator · on April 6, 2020

GNU Make has this naming problem too $(filter pat,list ...) and $(filter-out pat,list ...).

Someone · on April 6, 2020

That’s an ambiguity in English. “Foo is priority #1, bar priority #2, so foo has the higher priority” is perfectly valid English.

The only way to ‘solve’ that would be by banning priorities higher than 1, for example by numbering them 1, 0, -1,… or by using floats in the range (0,1] for priorities.

ken · on April 6, 2020

I would say this is a problem with type systems. Programming languages (including SQL) mostly just have one type for “integers”, or possibly a few types for different numbers of bytes of storage.

“Priority” and “array index” and “number of cantaloupes” and “age on 1-Jan-2020” may all be represented by integers but they’re all very different types.

Hungarian tried to solve this, sort of, but it’s at the wrong level. It’s not going to know which ones of these sort in which direction.

brigandish · on April 6, 2020

Isn't everything usually ABC and 123 by default?

I suppose countdowns aren't but there's a clue in the name…

adrianmsmith · on April 6, 2020

Right, priority was 1..5 where 5 was a higher priority. So “order by priority” delivered priority=1 items first, and processed them in that order, which was the opposite of what I wanted.

It was absolutely my mistake, but “order by priority” reads so well in English, so my eye just couldn’t see the mistake.

I mean I’d been programming for decades at that point, I knew the syntax, I know our data model, I stared at the code and I still couldn’t see it. (Eventually I did see it of course..)

Since then I’ve been of the belief it would be better if SQL forced you to put “asc” or “desc” when using “order by”. Then you’d be forced to think about it.

So I was agreeing with the previous poster, who said that defaults aren’t always a good idea.

gonzus · on April 6, 2020

This is also an example of how difficult naming can be. This is arguably a "ranking" (1=low, 5=high), not a "priority" (1=high, 5=low).

adrianmsmith · on April 6, 2020

That’s a very good point.

However people say things are “high priority” which, at least to my mind, implies higher numbers, so I don’t think ”priority” is necessarily the wrong term here.

andrey_utkin · on April 6, 2020

right, and you execute tasks in descending priority order, not just the priority order.

aasasd · on April 6, 2020

> “order by priority” reads so well in English

I had my suspicions that programming reads very different to native speakers, and I guess here's a case when it's better to have English as a second language: I heard a ding-a-ling in my head immediately upon seeing the absence of ‘desc’.

renox · on April 8, 2020

It's a good question but in this case, I wouldn't be surprised if 99% of the usage don't fallthrough.

johnny-lee · on April 6, 2020

My perl script which scans C/C++ source code for typos would emit a warning (case 19) for a case statement that fell through to the next case statement. The script handled the 'fall through'-comment situation as well.

You can read the 1999 Perl Conference paper for the perl script at https://sites.google.com/site/typopl2/afastandeasywaytofindb...

letientai299 · on April 6, 2020

_Often my biggest contribution isn’t being the only one who is able to fix something, but being the only one who realises something is fixable._

I feel relatable. At all the placed I've worked, people realize the problem, sometime find the solution, but most of the time, they won't fix it, unless it directly affect their code/project/product. As a result, technical debt keeps bubble up and we have more and more issues.

Right now, the only thing I can do is just to keep fixing issues as I encounter them. But that's for the code, I don't have any solution for the "people" problem.

xorcist · on April 6, 2020

I call this the person with taste. One of the first questions I try to answer in new setting is to identity the person "with taste" around. They are usually a good entry point if you want to get something done.

nicbou · on April 6, 2020

Many problems lack a Designated Person Who Cares. Everyone has something to do, a task to accomplish. There's never someone whose job is to improve the codebase and the tooling, even when it can benefit the team.

auxym · on April 6, 2020

I have not worked in software for long (a year, a while ago) but my experience with agile at that time was that even if you realize that you can and should fix something, you still have do political representation to the scrum master, team lead and product manager so that an issue (sorry, user story) is created and gets prioritized (over features and other bugs) in a future sprint.

Going rogue and spending an afternoon fixing stuff without an approved US (to charge time on) was not seen well.

julius_set · on April 6, 2020

Unfortunately this is often the case at larger companies, unless you find a really cool manager / PM to work under who just trusts you — note this can be achieved over time.

I’m talking from personal experience of going rogue :)

xelxebar · on April 6, 2020

This reminded me of Raymond Chen's blurb article on his blog, The Old New Thing:

https://devblogs.microsoft.com/oldnewthing/20110118-00/?p=11...

For whatever reason that heavily influenced me, and now I almost always submit patches together with my bug reports. Sometimes its a whole lot more work, but I feel like this practice pays dividends in the amount you learn as well as the comraderie it fosters with the upstream devs.

MrStonedOne · on April 5, 2020

One annoying thing about the war on case fallthrough at the language level is the lack of imagination.

`continue` is a thing, compliments switch's existing `break` statement usage and is overall a much better solution for explicitly specifying fallthrough then returning special objects or c#'s abuse of goto.

Spivak · on April 6, 2020

I think it's a pretty syntax idea but it means continue works opposite your intuition inside switch statements compared to everywhere else.

I mean you could kinda argue that switch could conceptually mean something like

    for matcher, block in cases:
        if matcher.matches(subject) or continued:
            eval block

but then continue still feels magic since it bypasses the test.

JadeNB · on April 5, 2020

Since there's not much context here: the article, which I enjoyed, ends with "Be the change you want to see in the world", and shows how the author exemplified this by coding a fix to a persistent source of PHP bugs at $WORK.

thunderrabbit · on April 6, 2020

And it doesn't show an actual diff.

aasasd · on April 5, 2020

As a technical aside (I know it's not the main point, but still): with the ubiquity of Git workflows employing pre-merge checking on the platform of choice (e.g. Github or a CI tool), this is rather easily done via a rule in an off-the-shelf linter. No `break` in a `case`? Can't merge it.

As a language-design aside, `switch` in general is stinky stuff. Not just with the fall-through: it also violates the regular C-style syntax for no particular reason, having a mini-syntax of its own instead.

But the most perverse thing I've seen done with `switch` is using it as `if`:

    switch (true) {
        case ($a == $b): ...
        case ($c == $d): ...
        case (itsFullMoonToday()): ...
        default: ...
    }

Someone · on April 5, 2020

If that’s the most perverse you’ve seen, I guess you haven’t seen Duff’s device, which mixes it with a do…while loop (code copied from http://www.catb.org/~esr/jargon/html/D/Duffs-device.html):

   register n = (count + 7) / 8;      /* count > 0 assumed */

   switch (count % 8)
   {
   case 0:        do {  *to = *from++;
   case 7:              *to = *from++;
   case 6:              *to = *from++;
   case 5:              *to = *from++;
   case 4:              *to = *from++;
   case 3:              *to = *from++;
   case 2:              *to = *from++;
   case 1:              *to = *from++;
                      } while (--n > 0);
   }

tzs · on April 6, 2020

I have no idea if this still works in modern C, but here's something kind of similar I've seen used in C circa 1986:

  switch (foo)
  {
    case 1:
      ...code just for case 1...
      if (0)
      {
    case 2:
        ...code just for case 2...
      }
      ...code for both case 1 and case 2...
      break;
    case 3:
      ...

  }

used when two cases have a common tail, and you have sufficient space constraints that you do not want to duplicate the common tail code, and you have sufficient time constraints (and maybe space constraints on the stack) that you don't want to put the common tail code in a subroutine and call it from both cases.

fisherjeff · on April 6, 2020

Oh my god. I’ve wondered about a way to do this but... this is just devious.

X-Istence · on April 6, 2020

This is goto with slightly more code ;-)

aasasd · on April 6, 2020

That's not perversity, that's suicide by art.

jwatzman · on April 5, 2020

Post author here.

One of the nice advantages we had in using our static analysis tool for this, with real code intelligence, instead of a simpler linter, was that we could do a much more clever check. The actual rule was "every case must be terminal", and we had a rich analysis for "terminal", so stuff like this would be considered legal:

    case blah:
      if (cond) {
        throw blah;
      }
      if (cond2) {
        some_noreturn_function();
      } else {
        return 42;
      }

etc etc. My example is a bit contrived, and the wisdom of doing such complicated things in switch/case is questionable, but it was useful when dealing with an existing codebase.

Another advantage was that our tool was designed for extremely fast analysis over the entire codebase, so errors were given to programmers as they were writing code, immediately. (200ms response time to any change. At that level you can run it on every keystroke, which we did.) Traditional linters can in principle do this, but in practise are often not written with this sort of performance requirements in mind.

(And yes, PHP allowing arbitrary expressions in case labels was... probably not a good idea.)

muglug · on April 6, 2020

FYI, phpcs now has something similar (since roughly 2016), but I imagine your implementation predates that by a significant margin (and yours obviously supports noreturn too).

> PHP allowing arbitrary expressions in case labels was... probably not a good idea.

Yeah, possibly the worst offender is the (somewhat common) "switch (true)" pattern: https://psalm.dev/r/4c168a9c5d

Sadly I don't have the latitude you all do to govern how code gets written by users of my own static analysis tool – fall-through has to work there, too: https://psalm.dev/r/1d4ee59bd6

edflsafoiewq · on April 5, 2020

A switch is a kind of computed goto. It's a fantastically useful construct when you need it. You don't usually need it. I'm not sure what mini-syntax you're referring to, the syntax is roughly what I would have guessed it would look in C if I hadn't seen it and you just told me its semantics.

Most people treat switch as a peculiar form of match (or cond or whatever), probably because this would usually be much more useful to have than a computed goto, and in many derivative languages they've put in weird additions that seem to be an attempt to drag it halfway to that, things that don't really make any sense in C, like allowing dynamic expressions in the cases (such as your example), or forcing all cases to be at the top level.

(I feel like I write a lot of "in defense of switch" comments on HN...)

Supermancho · on April 6, 2020

> Most people treat switch as a peculiar form of match (or cond or whatever)

Completely made up switch illustrates how I use it in every language if I can...Go has a succinct version of this.

    switch(True) {
        case ($x == 1):
            $x++;
            break;
        case ($x > 3):
            $x--;
            break;
        default:
            break;
    }

edflsafoiewq · on April 6, 2020

Why? It's longer, more complex, and more highly indented than a normal if-elif-else.

benhoyt · on April 5, 2020

This is actually a feature in Go, but you leave off the "true" condition entirely: https://tour.golang.org/flowcontrol/11

I quite like it, as it cuts down on the amount of brace and if/else if noise.

kazinator · on April 5, 2020

That could be the work of a Lisp coder rebelling against having cond taken away, and having to fend with if/else/elif.

aasasd · on April 5, 2020

Oh, no—that was the guy who loved MS and .NET despite working in a Linux-based shop, drove a Chinese-Russian SUV when not riding a roadbike in tight pants, supported Putin and swore about every ten words in both speech and chats, that's all while looking hella dorky. One thing I can't picture him doing is pining for Lisp.

Now, grinding everyone's gears with his abuse of `switch`—that was right up his alley, what with already grinding them with his political commentary outside of code.

RealityVoid · on April 5, 2020

What language is that? It's not C, because it wouldn't compile.

JadeNB · on April 6, 2020

Looks kind of like Perl to me, except that Perl uses `given … when …`.

naniwaduni · on April 6, 2020

The third example is valid C.

userbinator · on April 5, 2020

On the contrary, I've heard of a story where a team decided to apply some new tool to their codebase which warned about a missing break, inserted the break so the warning disappeared, "fixed" the failing tests that now resulted, and then upon the next release was subsequently screamed at by numerous customers for the unwanted behaviour change.

but in a very high-level language like PHP, there’s really no reason for switch to fall through at all

I disagree. Switch fallthrough is very useful because it can reduce code duplication, especially when the logic has a ladder-like structure. Trying to impose increasingly draconian and such arbitrary rules is only going to lead to a self-fulfilling-prophecy where all the intelligent developers will get fed up and leave, and what's left are those which will continue to create tons of bugs some other way instead.

jwatzman · on April 5, 2020

Author here.

Yeah, you can't just mechanically insert the breaks :) I carefully went through each instance where my rule tripped, looked at the surrounding code, and figured out the right fix. Trying to automate that would indeed have lead to disaster -- and the lack of ability to automate this is why no one had done this in the past, it sounded like too much work. (It wasn't that much work.)

And I agree switch fallthrough can be useful! We just required it be annotated from now on, to convey the intent, as opposed to doing it by accident.

nicbou · on April 6, 2020

I don't think I would quit a company over a more stringent linter.

To me, this is the same attitude that leads to safety hazards in other industries. "I don't need safety measures because -I- am not an idiot".

Even the smartest developers make dumb mistakes. If you can eliminate some of those early with minimal friction, it's worth it. The earlier you catch a mistake, the cheaper it is. Linter rules are less of a hassle than bugs in production.

earthboundkid · on April 6, 2020

The kind of person who complains about the linter is the exact reason your company needs a linter.

Cyberdog · on April 6, 2020

It reduces code duplication at the cost of increasing complexity and error-prone-ness, and there are ways to reduce the duplication. For example, given the following snippets:

    switch ($foo) {
      case 123: 
        // do 1
      case 23:
        // do 2
      default:
        // do 3
    }
    
    if ($foo == 123) {
      do1();
      do2();
      do3();
    }
    else if ($foo == 23) {
      do2();
      do3();
    }
    else {
      do3();
    }

Yes, the second one is more verbose and duplicates code, but I think it's also far more clear in terms of intention.

dalemyers · on April 6, 2020

You can even take it a step further:

    if ($foo == 123) {
      do1();
    }
    
    if ($foo == 123 || $foo == 23) {
      do2();
    }
    
    do3();

evmar · on April 6, 2020

I did a similar thing for Chrome back in the day, to have the compiler check an `override` annotation:

http://neugierig.org/software/chromium/notes/2011/01/clang.h...

vortico · on April 6, 2020

Just a reminder that in most compiled and JIT optimized languages with switches, this

  switch (c) {
    case 4: return 30;
    case 5: return 70;
    default: return 0;
  }

compiles to the same instructions and therefore same performance as

  if (c == 4) {
    return 30;
  }
  else if (c == 5) {
    return 70;
  }
  else {
    return 0;
  }

As a side note, it's easier to see accidental fallthroughs if you write switch statements like this.

  switch (c) {
    case 4: {
      return 30;
    } break;
    case 5: {
      return 70;
    } break;
    default: {
      return 0;
    } break;

But in my opinion, the `if` statement is more clear in both cases, and only one level of indentation instead of 2.

comex · on April 6, 2020

Huh. To my eyes, the switch version is "obviously" much more clear! It's significantly more compact, with less visual noise you have to mentally ignore. And having the possible inputs and outputs lined up vertically (especially with nothing else between them) makes it easy to understand the logic. Plus, using a switch indicates at a glance that the variable being tested is the same in all cases, whereas with an if-else-if chain there could be any weird expression hiding in one of the conditions.

Eye of the beholder, indeed.

vortico · on April 6, 2020

Perhaps a better comparison to the switch statement is

  if (c == 4) return 30;
  if (c == 5) return 70;
  return 0;

comex · on April 6, 2020

Yep, that's a big improvement in my eyes, making it comparable from a readability perspective. In my opinion, that leaves more minor pros and cons of each approach, depending on the language. For example, in C and C++, as kazinator mentioned elsewhere in the thread, GCC and Clang now have an option to require all fallthroughs to be explicitly annotated, which effectively removes one of the main downsides of switch. And switch is better at detecting duplicate cases, as well as checking for exhaustivity if you're matching against enum variants. On the other hand, if you change the example so that the cases don't return (perhaps they assign to a variable instead), with switch you would need to add a `break;` after each one, increasing visual noise. If statements don't have that problem. Or if you change the example so that one of the values `c` is compared to is not a constant, some languages (C, C++, Java but not JavaScript, PHP) don't allow that with switch statements...

bla3 · on April 6, 2020

That's only true if there are <= 3 "clusters" of switch case values, else compilers do smarter things. See https://youtu.be/gMqSinyL8uk

vortico · on April 6, 2020

Is this what you mean by more clusters? Looks like clang compiles the two functions to roughly the same "tree search" code. https://godbolt.org/z/7M_3PU

enriquto · on April 6, 2020

Are you sure? I'd guess that if the cases are an interval of consecutive integers it may do a "jmp [c]" sometime.

KMag · on April 6, 2020

You're correct. Many compilers will emit jump tables if the domain is dense enough. Also, even if compiling to a bunch of conditional jumps, they'll usually emit them to require O(log N) conditional jumps (equivalent to a binary tree of nested ifs) instead of the O(N) linear arrangement suggested by the GP.

vortico · on April 6, 2020

Clang compiles both switch statements and `if` statements to the same tree search code. E.g. https://godbolt.org/z/7M_3PU My point is that the optimizations you're describing for switch statements are also applied to a bunch of `if` blocks, at least in LLVM-based languages with an LLVM version from the last several years.

KMag · on April 6, 2020

Sure, I didn't mean to imply that compilers never (or commonly didn't) detect that long if-chains could be optimized to jump tables, lookup tables, or if-trees. I was just answering the more narrow question the GGP was asking: if switch statements were always lowered to if-chains and never compiled to jump tables.

vortico · on April 6, 2020

Ah I see, gotcha.

vortico · on April 6, 2020

Looks the same to me. https://godbolt.org/z/CfhD5k Switches were invented mostly because they were an easy hint to the compiler to reduce to a tree search, but because compilers are smart today, they're obsolete unless you just prefer their syntax.

Even if the cases are consecutive, most compiler optimizers are smart enough to determine that the ASTs are equivalent.

enriquto · on April 6, 2020

> they're obsolete unless you just prefer their syntax.

This is a good point.

Sometimes I prefer the (ugly) syntax of switches than that of conditionals. The reason is that a chain of conditionals is non-symmetrical (there's no condition for the first "case"), while a switch is formally invariant to permutation of the cases. I have even seen shit like this:

    if (false) ;
    else if (c == 1) { first case ; }
    else if (c == 2) { second case; }
    ...

just to preserve the symmetry along the cases.

vortico · on April 6, 2020

Are you talking about the `else` being non-symmetric? You don't need it at all. I do this.

  if (c == 1) { first case ; }
  if (c == 2) { second case; }

enriquto · on April 6, 2020

...and hereby I give back my programming license and go back to civil life.

pintxo · on April 7, 2020

This might result in another fallthrough scenario though, wouldn't it?

vortico · on April 7, 2020

No, because `c` can't be 1 and 2 at the same time.

pintxo · on April 8, 2020

In this specific example. But you cannot rule it out in general.

joatmon-snoo · on April 6, 2020

Google applies ErrorProne in the default javac toolchain to mitigate this for Java builds: https://errorprone.info/bugpattern/FallThrough

dlbucci · on April 6, 2020

Recently, my coworker basically asked if he should set a TTL on a password for a job we were running. We said yeah, but our code would need changes to accommodate it before it expired and we didn't have time to make them.

I joked that we'd make a JIRA ticket that we'd repeatedly deprioritize until our code mysteriously broke in 6 months, but then my coworker actually made an Outlook reminder for when we had a month left!

I thought that was really smart, so I guess I learned a lesson about actually solving problems instead of just pointing them out like a smart ass.

CGamesPlay · on April 6, 2020

What did you do in cases where the fall through was desired behavior? Did the resulting code (replacing a fall through with a nested if or something?) look better or worse as a result?

The typical way to handle this at companies that don't have a static-type-checker department is to require a comment that says /* falls through */ at the end of the switch. Why was such a simple solution inappropriate here?

mplewis · on April 6, 2020

Because code reviews are more fallible than automatic linters, and OP had the luxury of working somewhere that the noisy people didn't fight for an anti-feature.

CGamesPlay · on April 6, 2020

I wasn't talking about requiring it in code review though. The OP didn't make an automatic linter that issued a warning when you did this implicitly, he removed the ability to do it entirely. With a linter, you can add a comment, like /* falls through */ to disable the lint warning. With the OP's change you can... use a series of if-else blocks? The alternative isn't listed, which is why I asked.

Sorry about your last company, though. Sounds like it was annoying to work there!

jwatzman · on April 6, 2020

OP here.

Explicitly-annotated fallthrough was still allowed; I codified a particular annotation that the analysis tool understood and allowed. I removed implicit fallthrough, but explicit fallthrough can be quite useful sometimes.

GhostVII · on April 6, 2020

Sounds like that's what they did:

The cleanup diffs I’d written that afternoon mostly just annotated clearly-intentional fallthroughs

l0b0 · on April 7, 2020

Some of my favourite diffs were related to linters:

- Linting files different from origin/master in a pre-commit hook.

- Linting everything in CI.

- Making the linter rules stricter (mypy does a great job of having several orthogonal strictness flags, so you can pick the set appropriate for your familiarity with the tool).

- Implementing project-specific lint checks.

ChrisMarshallNY · on April 5, 2020

Cool story.

I use Swift. It does not have default switch fallthrough. After a couple of versions, I learned about the fallthrough statement.

I remember being frustrated by it, at first, but, like so many Swift oddities, it rapidly became second nature; thus, proving out that one of the goals of Swift is to train developers to write code properly.

tcbasche · on April 6, 2020

Go also has the same behaviour, with the fallthrough keyword. I find myself using a switch so rarely anyway...

earthboundkid · on April 6, 2020

Go has a fallthrough keyword, but you can also do case "a", "b": for simple things where two cases are the exact same.

ChrisMarshallNY · on April 6, 2020

Swift allows the same thing. In fact, Swift switch statements are powerful as all git-go. You can do things like specify ranges and whatnot.

https://littlegreenviper.com/miscellany/swiftwater/ranges-an...

https://docs.swift.org/swift-book/LanguageGuide/ControlFlow....

That said, I don't use them that often. They tend to take up a lot of real estate on the screen, and introduce a fair bit of CC.

saagarjha · on April 5, 2020

I thought the first release of Swift had fallthrough?

ChrisMarshallNY · on April 5, 2020

Maybe it did, but I didn’t learn about it for quite some time. I just learned to work around it.

Swift is a deep river. I still learn new stuff every day, and I write Swift, seven days a week.

hkai · on April 6, 2020

If the author discovered tslint/eslint he'd be surprised that this problem was long solved in the JS world :)

muglug · on April 6, 2020

I believe this happened before tslint or eslint existed.

unnouinceput · on April 6, 2020

So many cases about this and that, pro and cons of switch fall-through, while Pascal was King from beginning.