Hacker News
Tidy First? (henrikwarne.com)
153 points by ingve on Jan 10, 2024 | hide | past | favorite | 58 comments



Guard clauses: something so simple, yet so underrated. You never know how much you love them until you meet people from the single-return-per-function church.

It still amazes me how they can believe that putting all the guards in tens of nested ifs can be saner than a few extra returns at the top.

And don't get me started on those who say goto is bad, but think it's OK to put everything inside a do while (0) they can break out of and clean up at the end.

Sorry for the off-topic rambling, tidying is like "clean" code, everyone has their own definition, and sometimes changing code for the sake of making it tidy is more dangerous than just changing the behaviour, especially if it was already tidy by some other people's standards.


I don't favor single return per function myself, but in C code bases where that was the norm, I liked being able to add a debug print in one place which logs the return value.

  DEBUG("%s (%s) -> %d", __func__, str_arg, ret);
  return ret;
Single return doesn't necessarily mean nested ifs; it can be linear.

  bool success = true;

  if (condition(arg1)) {
    success = try_that(arg2);
  }

  if (success && condition2(arg3)) {
    success = other_thing(arg2);
  }

  ...

  return success;

Resource acquisition:

  widget_t *w = widget_create(arg);
  gadget_t *g = w ? gadget_create(w, arg2) : 0;
  char *gncopy = g ? strdup(gadget_name(g)) : 0;

  if (gncopy) {
    /* we must have g and w */
  }
Early returns can cause bugs; they can be premature. The coder thought that if a certain condition is true, none of the other cases matter. And maybe that was true, but now some of the cases (perhaps new ones) do matter. Someone wrote code to handle them, but didn't notice it's never reached because of that early return.


You can define your guard clauses as macros and have them always log a warning and still have your debug log before the final return. See e.g. g_return_if_fail/reached and g_return_val_if_fail/reached in glib/gmessages.h.
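As an illustration, a minimal stand-in for that kind of macro might look like the sketch below. The names are hypothetical; the real GLib versions in glib/gmessages.h are more elaborate.

```c
#include <stdio.h>

/* Simplified stand-in for glib's g_return_val_if_fail: log a warning
 * and bail out with `val` when `expr` is false. Hypothetical names. */
#define RETURN_VAL_IF_FAIL(expr, val)                          \
    do {                                                       \
        if (!(expr)) {                                         \
            fprintf(stderr, "%s: check '%s' failed\n",         \
                    __func__, #expr);                          \
            return (val);                                      \
        }                                                      \
    } while (0)

static int parse_digit(const char *s)
{
    RETURN_VAL_IF_FAIL(s != NULL, -1);
    RETURN_VAL_IF_FAIL(s[0] >= '0' && s[0] <= '9', -1);

    int ret = s[0] - '0';
    /* one place left for a final debug log before the "real" return */
    return ret;
}
```

Every failed guard logs through the same macro, and the function still ends with a single ordinary return that a DEBUG line can hang off.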

Early returns can definitely cause bugs, but here we're talking about basic sanity checks and argument validation, not about program logic. Their purpose is that if any of those checks fail, we are in a very bad state and there's no point in entering the function at all.


I agree. Guard clauses and chunk statements make code much more maintainable. They cost little to implement, but offer a huge return in readability and reduced complexity. Nested if-statements are often hard to read and easy to break.


"Goto is bad, throw an exception instead."

I meet them everywhere. It just goes to show how strong culture is.


You can mostly avoid running into the single return people, if you sneak into and out of the building by the rear entrance.


> It still amazes me how they can believe that putting all the guards in tens of nested ifs can be saner than a few extra returns at the top.

If I have a function that needs a group of guards like that, but local style guides demand a single exit point, I start with a boolean like “AllHunkyDory” that gets set to false where a return would be. Then you can keep the less nested structure and just wrap the main body in “if(AllHunkyDory){}”.

It is probably a touch less efficient, as that variable is going to take up a register at least throughout the checks, but a function with piles of guard clauses is hopefully not in a tight loop anyway, so that is unlikely to matter.


Sure, there are ways, but they are less readable and more error prone. And that's not what you see where this rule is enforced. You see nested ifs.


> It still amazes me how they can believe that putting all the guards in tens of nested ifs can be saner than a few extra returns at the top.

On the other hand, if you have ten early returns your function should most likely be split into smaller functions anyway. The advantage of sticking to structured programming is that it becomes obvious when it's time to do so.

Another advantage of pure structured programming (which implies a single exit point) is that you can easily add post-conditions to your functions, i.e. assertions which must hold true before the execution returns to the caller.
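That second point can be sketched in C: with a single exit, the post-condition is asserted in exactly one place. The function below is a hypothetical example, not from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: copy up to n-1 chars and always NUL-terminate.
 * The single exit point lets us state the post-condition once. */
static size_t bounded_copy(char *dst, const char *src, size_t n)
{
    size_t copied = 0;

    if (dst != NULL && src != NULL && n > 0) {
        while (copied < n - 1 && src[copied] != '\0') {
            dst[copied] = src[copied];
            copied++;
        }
        dst[copied] = '\0';
    }

    /* Post-condition, checked on the only way out of the function. */
    assert(copied < n || n == 0);
    return copied;
}
```

With early returns scattered through the body, that assert would have to be duplicated before each one (or bolted on with macros).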


I miss this in LISP. It might force me to better structure my programs so that a lot of this is dealt with in an up-front manner or simply can’t occur. I do it all the time in JS though and love it.


    (let/cc ret
      (when (= y 0) (ret nil)
      (/ x y))


Your snippet is not working.

  #lang racket

  (define (division x y)
    (let/cc return
      (when (= y 0) (return null))
      (/ x y)))


Does this church still exist?


Sure, I'll play devil's advocate!

> Guard clause – exit a function early if certain conditions are not met. This makes the rest of the function easier to write (no nested if-statements).

If you have:

    if(!isUserPort(strPort)) return;
You're gaining knowledge about the input, and then immediately throwing it away (assuming the 'happy' path).

Later when you use strPort, do you check it again? If you pass it to another function, do you check it there too?

"Parse, don't validate" is the name that's been given to the following:

    case parsePort strPort of
        NotAPort -> ...
        RootPort port -> ...
        UserPort port -> ...
It's more nested, but now you can pass your UserPort around and not check it too many or too few times.


Hm

Unless your parsePort returns a specialized type which can then be enforced by the compiler, and all your other APIs use the specialized representation, it's sort of a "six of one, half a dozen of the other" situation, because if you're throwing your strPort around, you'll be parsing it again and again and again.
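That "specialized type" idea can be sketched even in C, where a wrapper struct only the parser produces keeps unchecked strings out of downstream code. All names below are hypothetical, and ports below 1024 are treated as privileged.

```c
#include <stdbool.h>
#include <stdlib.h>

/* A distinct type that can only be obtained via the parser, so
 * downstream code never receives an unvalidated port. */
typedef struct { unsigned short value; } user_port_t;

/* Returns true and fills *out only for valid non-privileged ports. */
static bool parse_user_port(const char *s, user_port_t *out)
{
    char *end;
    long n = strtol(s, &end, 10);
    if (end == s || *end != '\0' || n < 1024 || n > 65535)
        return false;
    out->value = (unsigned short)n;
    return true;
}

/* Anything taking user_port_t need not re-validate the string. */
static unsigned short port_number(user_port_t p) { return p.value; }
```

C won't stop someone from constructing the struct by hand the way a stronger type system would, but it makes "parse once, pass the parsed value around" the path of least resistance.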

And I can achieve what you are showing me just the same by putting my guard clause up the stack, closer to the main program or whatever.

Not all data is so clear cut; your guard clause may be guarding against a numeric struct member whose value is out of range for this one function to operate on.

Which is why I usually have a few different wrenches for each size of nut I may encounter. Not dunking on your approach, let a hundred flowers bloom and all that.


I’ve definitely run into its members recently. They tend to have other strange beliefs, like “methods can’t have more than four lines” and “you can instantiate an object only in a factory service”.


I didn't know the Clean Code church was also preaching the single point of return. Single point of return used to be big when Structured Programming was this "newfangled stuff". I'd rather expect some CS professor types smelling of mothballs and shambling about in empty hallways.


I think it's still required by the MISRA-C standard, so if you write code for cars and some similar industries, you may get forcefully dragged into that church.


MISRA 2004, Rule 14.7: "A function shall have a single point of exit at the end of the function.", required - https://www.ibm.com/docs/en/rtr/9.0.0?topic=review-code-misr...

MISRA 2012, Rule 15.5: "Only one exit point should be defined in a function.", advisory - https://www.ibm.com/docs/en/rtr/9.0.0?topic=review-code-misr...

HN users cordenr and bfrog say it was dropped in MISRA 2023: https://news.ycombinator.com/item?id=38680587 and https://news.ycombinator.com/item?id=38704631 . Both in a recent thread on MISRA 2023 at https://news.ycombinator.com/item?id=38674158 .

FWIW, HN user FirmwareBurner in that thread links to the "Embedded System development Coding Reference guide" version 3.0 (2018) from the Software Reliability Enhancement Center, Japan at https://www.ipa.go.jp/publish/qv6pgp00000011mh-att/000065271... which says

  M3.1.5:
   A function shall end with one return statement.
   A return statement to return in the middle of processing shall
   be written only in case of recovery from abnormality.


I was not aware of the changes in MISRA-2023, thanks for informing me!

Also, that's great news to hear in case I ever want to go back to the automotive industry. :)


Well, aren't early returns in guard clauses "recovery from abnormality"?

You could flex some language lawyering to define "abnormality"...


I don't know.

Here is the only relevant example I found:

  p = X_MALLOC(sizeof(*p) * NUM);
  if (p == NULL) {
    return (MEM_NOTHING);
  }
  ...
  X_FREE(p);
  return (OK);
It's clearly abnormal.

Something like:

   if (get_size(obj) == 0) {
       return empty_case;
   }
   ... do more complicated code here ...
   return result;
does not feel "abnormal". Abnormal conditions are things you likely want to log, given:

   Logs should be output not only when an abnormal condition
   is detected, but also at the timing of, such as, data 
   communication with an external system.


I believe that's exactly it: an exception to the single return rule to allow for guard clauses. The wording is maybe a little too obscure, though.


Yep, it may come from that, or maybe, as often happens, someone wrote a best practice in the coding style rules, something reasonable like "try to avoid returning from the middle of a function".

People didn't understand that; it was too vague, hard to enforce. Someone else decided to make it clearer: "try to avoid multiple returns inside a function".

Then someone else comes along and decides to tidy up the coding style document and put it in imperative form: "All functions should have a single return statement".

Give it a couple of years, and people forget why the advice was there in the first place; now they just blindly apply an insane rule, teach it to new hires, and enforce it in reviews. All functions now look like big arrows, and every once in a while the max line length rule has to be relaxed a little to allow for all those levels of indent.

Maybe that's how it entered MISRA-C in the first place.


I'd argue that in languages with manual resource management, you should be very careful how you return in the middle. Say you allocate something, or initialize communications with a device before a loop begins. In the middle of the loop, you find your result early and want to return it.

But your "return" must also perform all the deallocations! And if you open communications to external devices, which tend to be notoriously stateful, you need to bring them back to a usable state by either completing the commands sent so far, or resetting them, or whatever is needed.

If there is nothing analogous to "defer" statements or "finally" clauses, you will be in a world of pain if you sprinkle return statements without paying attention to all details.

Thus I can see the point of such a rule being in MISRA-C or something like it. I guess it's better to enforce an easily checkable rule (prone to birthing monsters) than to educate everyone properly.
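A common compromise in C, sketched below with hypothetical names, is to funnel every exit through one cleanup label, so that a result found mid-loop cannot leak what was acquired before the loop:

```c
#include <stdlib.h>

/* Hypothetical: without defer/finally, routing every exit through a
 * single cleanup block keeps the allocation from leaking when the
 * result is found in the middle of the loop. */
static int find_first_even(const int *data, size_t len, int *out)
{
    int rc = -1;                /* pessimistic default */
    int *scratch = malloc(len * sizeof *scratch);
    if (scratch == NULL)
        goto cleanup;           /* nothing else acquired yet */

    for (size_t i = 0; i < len; i++) {
        scratch[i] = data[i];   /* stand-in for real work */
        if (scratch[i] % 2 == 0) {
            *out = scratch[i];
            rc = 0;
            goto cleanup;       /* "return" via the single exit path */
        }
    }

cleanup:
    free(scratch);              /* free(NULL) is a no-op */
    return rc;
}
```

It is still technically a single return, which may be why the goto-to-cleanup idiom survives even in shops that ban multiple returns.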


Notice I purposely used "reasonable" and "in the middle of a function". I know where this rule comes from. It's how it evolved into a blind prescription that is the problem.


Yep. When investigating a solution for a government-adjacent organisation, I was handed a standard which specified a single return per function.

A recommendation likely lifted from one of these sources.


Definitely. Let's wait and see if some adept shows up. I've met plenty IRL.

One specific case I have in mind: I think it came from the company mostly hiring fresh graduates, with seniors indoctrinating them with their blind and badly aged structured programming rules.


>Sorry for the off-topic rambling, tidying is like "clean" code, everyone has their own definition

That's blasphemy. There is only one true clean code, and it is what Uncle Bob himself preaches.


For anyone interested in reading full chapters of the book (all are relatively short!), here are three full ones [1], shared with the permission of the author (Kent Beck) and publisher (O’Reilly).

[1] https://newsletter.pragmaticengineer.com/p/dead-code-getting...


>move declaration and initialization together

Another, even better reason to do this is to prevent the declared object ever being in an invalid (e.g., undefined or null) state. If it's born in a valid state, and every method call leaves it in a valid state, it's harder for things to go wrong.

In the C++ world this approach is encouraged, finding its highest form as RAII: the convention of designing classes to be declared as objects (not references to them), with constructors that leave them fully formed, and destructors that silently and automatically clean them up (thanks to deterministic destructor calls -- one of the best decisions made by the language designers.)

Sadly Java prefers the "bean" convention of new-ing up a "blank" object and then calling setAbc(), ..., setXyz() on it. I hope every time you create an object, you remember to call set...() for every field! Including when you add a new field later!
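In plain C the tidying quoted above can be sketched as a before/after pair. The function and the "+ 2" overhead are hypothetical, purely for illustration.

```c
#include <string.h>

/* Before (hypothetical): the declaration sits far from its
 * initialization, so `len` is briefly indeterminate. */
static size_t header_len_before(const char *name)
{
    size_t len;                  /* not yet valid here */
    /* ...imagine unrelated setup code in between... */
    len = strlen(name) + 2;      /* "X-" prefix, hypothetical format */
    return len;
}

/* After the tidying: declared and initialized together, so the
 * variable is never observable in an invalid state, and can be const. */
static size_t header_len_after(const char *name)
{
    const size_t len = strlen(name) + 2;
    return len;
}
```

The const in the second version is a bonus the tidying unlocks: a variable born valid can often stay immutable for its whole life.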


Current Java practice seems to prefer the 'builder pattern', where you chain a few setter functions and then call a .build() function that returns the object. If a new field is required you can add an assertion in the builder.

The 'never an invalid state' pattern has one downside: there always seem to be exceptions in business logic and models, so it's very hard to require things to be there 100% of the time. I've seen a system being built over a few years that started with valid objects/hard requirements, which were partly removed as business needs and exceptions to the rule became clearer. Those changes caused bugs due to code not handling the null/doesn't-exist case. Handling that from the start would have been a lot less work.


Good point about the Builder pattern. Not a fan of the additional boilerplate but it does give the class developer the opportunity to detect forgotten fields and fail fast.


Am I the only one who thinks that code where declarations are mixed with statements is harder to read?


Nope!

Keep the declarations, ditch the statements.


Pure declarative programming.


> In many work places, there are high fixed costs (in time and effort) associated with PR reviews. The ideal solution for this, according to the author, is to not require PR reviews for only tidyings.

Well that sounds like a loaded footgun.


I think allowing reduced rigor for housekeeping changes is reasonable, but there should 100% be somebody in the loop doing manual reviews of all changes.


“One pile” vs “extract helper” - I like this distinction. This is the main mental model I’ve arrived at for reading Peter Norvig’s style of coding, a style which I can’t help but be really drawn to.

I’d understand “one pile” not as just throwing everything into one function, but as thoughtfully arranging code with surgical precision into purposeful blocks with no noise or distractions. I.e., it’s as much about getting rid of crap as it is about concentrating the line of execution.


I'm fairly certain I used one pile for functions that had been broken down into a multitude of poorly named ad-hoc helpers.

Bringing the logic into one place then enabled me to eliminate dead code and redundancy, and to pinpoint bugs that were hidden deep inside helpers but could only be recognized as bugs in the context of the rest of the algorithm.

Sometimes a veritable forest of helpers reduces to a single short, clear function.


Seems like mostly good advice that we all learn (well, most of us) but that is rarely written down.

I have to caveat this though:

> Therefore, reducing coupling will reduce the cost of change.

This seems like unwise advice. People are going to read it as "coupling = bad" and then try to eliminate coupling even between components that are still coupled, through things like dynamic typing, queues and so on. This only reduces the appearance of coupling.

For example say you have an API endpoint that is supposed to perform some action in another service. A lot of people would "decouple" them by having the endpoint place a request in some shared Kafka/MQTT queue or a file or whatever, and then the other service can poll that. Tada! Decoupled! Much better than a direct RPC call right?

Except they still depend on each other - you've just made the coupling less robust, so it can easily break when you make a mistake refactoring.


> - you've just made the coupling less robust

In my opinion this depends on the use case. If one service has a strong dependency on a different service, it may be a case of a "distributed monolith", and they should consider whether the functionality should be moved to the original service instead.

If two services need to communicate with each other, I don't see how a message queue makes the coupling less robust. What is the difference between sending a message and making an HTTP call? You still have coupling between the two services.

Message queues used right can actually increase operational robustness in my opinion. By placing a message on a queue you get a lot of functionality for free.

- You distribute the load automatically so that the risk of overloading a single instance is reduced

- You can limit the ingestion flow, so that services only process the data at a rate that they can handle

- If a service fails or shuts down, the message remains on the queue for the next instance to retry. You get a lot of retry logic for "free".

- You can easily monitor and scale resources based on the processing rate of the queue itself. Some queues can be allowed to grow large during peak hours, because they can be processed during off hours, while others may require scaling up the service capacity to meet a minimum latency requirement.

Misuse of message queues may of course lead to distributed monoliths and add a lot of complexity in understanding how data flows within the system. But that is not specific to queues; it's the same for direct HTTP calls. As with all things, we need to use the right tool for the right purpose.

Actually, one of my pet peeves is internal web hooks. Web hooks are good for notifying an external partner compared to the alternatives, but the complexity is too great to justify their use for internal services as well. Just use a message queue instead for your own services.


I read the book a few weeks ago and I don't remember the passages very clearly, but I definitely didn't get the sense it made blanket judgments on coupling; in fact it appeals to an awareness of weaker, perhaps implicit forms of coupling, like the one you suggested.


There is a chapter about coupling in the free sample [1].

And it is essentially your point.

[1] https://news.ycombinator.com/item?id=38944611 -- https://newsletter.pragmaticengineer.com/p/dead-code-getting...


Bought with work education fund. Read it in a few hours on a flight. Like others have mentioned, it is a lot of common sense, but seeing techniques on paper was useful to me. I learned a few new things and it reinforced the things I have already picked up in my career so far.


Kent is an amazing guy! I met him a year+ ago at a friend's house, and it was fantastic to discuss the initial idea he had for this book, which was based on tons of experience at various engineering firms. He has been posting parts of chapters or thoughts in his newsletter for a while now, so it's awesome to see it compiled in one place!

Newsletter if anyone wants to subscribe https://tidyfirst.substack.com/


Often when I read articles/books with this sort of advice I get halfway and think to myself "Hmmm... I really should write my own book about programming stuff".

I'm sure many other senior programmers have similar thoughts.


Would anyone else recommend the book?


Everything mentioned in the article is obvious to me ... however, these were lessons hard won over years and years and years.

Someone more freshly baked could probably get a real nice head start with the advice in the book.


On the pricing that the other comments mentioned, if you happen to have friends coming from India -- it is less than $10 here. ;-)


I would, under the condition that you have never read or heard about refactoring. If you’re expecting thorough techniques, you won’t find them. They’re literally single-page chapters where the advice might be “move functions near their argument declarations.” It’s very basic, but the content is quite evergreen. It’s something that should be included in every employee handbook, and it often feels like trying to change SWE culture to one of “fix then ship.” I’m really curious what Beck’s other books in this series will be.

I’m about halfway through, so I can’t comment on the last third, which is different and offers a newish mental model.

tldr: Good but expensive for the content amount. Should be $15 IMO.


It's funny. The last third is where it gets interesting. And then it stops – to be continued in the next volume. Except that Volume I is only 100 pages, and many of them blank.


That's really exciting to hear. I hope I didn't come across as negative in my view. I am enjoying the book immensely, but so far it doesn't seem too different from other refactoring books.


Bought it with work money, but I would be unhappy spending $38 of my own since it's pretty short and a lot of it is common sense. Some good nuggets of info in there though.


> a lot is common sense

Probably the best compliment one can make of such a book.

Common sense tends to be lacking when it comes to coding practices, and books about it don't really help. Instead, they tend to go with practices that are completely detached from reality, or things that require unreasonable effort, so that if you don't do them because you are too neurotypical for that, then "you are doing it wrong", and if it doesn't work, then it is not the author's fault.

And even if it is common sense, in the sense that it's what every experienced programmer is already doing, that's still great. First, because not every programmer is an experienced programmer, obviously. And even if you are, seeing it written down helps with consolidation.

So really, you are getting me very interested right now.


Perhaps a good metric for how much this should be worth would be to ask whether buying this book for a junior developer with your own money would save you that amount in the long term. Based on the article only, and the amount of times I've personally had to have similar discussions, I'd say it's worth that amount at least (and likely more).


I thought a lot of it was common sense too but I've been coding for 15+ years. I wonder how much of it is common sense to someone with less experience.


I haven't read it but it sounds like it gives names to and elucidates a lot of the stuff I am doing these days so I will read it soon.



