Book Review: "Tidy First?" By Kent Beck

edvardas · 2024-04-01T21:34:24 1712007264

> He gives some examples of ways that programmers, even after being taught in intro classes not to use magic numbers, still litter their code with constants like 404. But I wish he’d tell Python programmers to stop designing APIs where you write string constants like “r--” and “bs” to denote that your scatterplot should use red dashes and blue squares.

Intriguing. I assume the author would prefer a stronger-typed alternative like explicit parameters and enums. Yet I wonder if having a small DSL-like syntax is actually better for a scripting language. Most of these plots will be hacked together in a local notebook anyway.

What would be a better alternative to this terse DSL in such a case?

bckygldstn · 2024-04-02T00:11:54 1712016714

This line style API was included in MATLAB[0] (and perhaps designed elsewhere earlier) in the Olden Days where terseness was both more necessary (due to space and performance constraints) and more accepted. MATLAB development started in the 60s, though this DSL was likely added in the 80s.

Later, Python's matplotlib library started life as an emulator for MATLAB graphics in Python so naturally included the same plotting DSL.

Only later still did matplotlib morph into the de facto general Python plotting library. And then because plotting is so complex and matplotlib exposes so much control, most subsequent plotting libraries were based on matplotlib, opting to add value via high-level abstractions and better defaults, often exposing the underlying matplotlib objects to allow for fine-tweaking. And so the linestyle API leaks into those libraries too.

All of that to say, this DSL was likely invented by a scientist in a lab in 1981 and has survived through inertia and "jumping hosts" a couple times, rather than careful design.

I think the DSL is bad. And the matplotlib developers may agree, because while you can pass a combo like "ro--", you can also pass these parameters separately and more descriptively like

   color='#f00', linestyle='dashed', marker=matplotlib.markers.CARETDOWNBASE
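To make the contrast concrete, here is a minimal sketch of how a combined string such as "ro--" breaks down into separate, named settings. It is not matplotlib's own parser: the `split_fmt` function, the small lookup tables, and the descriptive marker names are assumptions made for illustration, and only a handful of characters are covered.

```python
# Hypothetical subset of the format-string DSL; not matplotlib's own parser.
COLORS = {"r": "red", "g": "green", "b": "blue", "k": "black"}
MARKERS = {"o": "circle", "s": "square", "^": "triangle_up"}  # descriptive names only
LINESTYLES = {"--": "dashed", "-.": "dashdot", ":": "dotted", "-": "solid"}


def split_fmt(fmt):
    """Split a string like 'ro--' into explicit color/marker/linestyle names."""
    result = {}
    i = 0
    while i < len(fmt):
        two, one = fmt[i:i + 2], fmt[i]
        # Try two-character line styles before single characters.
        if len(two) == 2 and two in LINESTYLES:
            key, value, step = "linestyle", LINESTYLES[two], 2
        elif one in LINESTYLES:
            key, value, step = "linestyle", LINESTYLES[one], 1
        elif one in COLORS:
            key, value, step = "color", COLORS[one], 1
        elif one in MARKERS:
            key, value, step = "marker", MARKERS[one], 1
        else:
            raise ValueError(f"unrecognized character {one!r} in {fmt!r}")
        if key in result:
            raise ValueError(f"{key} given twice in {fmt!r}")
        result[key] = value
        i += step
    return result


red_circles_dashed = split_fmt("ro--")
blue_squares = split_fmt("bs")
```

The point of the sketch is only that each character carries one independent setting, which is why the keyword form can spell out the same information more descriptively.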

[0] https://www.mathworks.com/help/matlab/creating_plots/specify...

trollerator23 · 2024-04-02T02:05:18 1712023518

Indeed, this comes from MATLAB, which is god-awful, and I say this as an enthusiastic, hardcore MATLAB user.

alanbernstein · 2024-04-02T00:14:35 1712016875

Is there another library besides matplotlib (and its derivatives) that uses this "linespec" format? And doesn't matplotlib's API derive from Matlab?

What's worse, using a terse linespec syntax and being stuck in Matlab-world, or using a terse linespec syntax in Python, and at least possibly being exposed to relatively modern and maintainable software practices?

petsfed · 2024-04-02T00:07:02 1712016422

I suspect that the plot formatting syntax is very old. I remember doing stuff like that with SuperMongo (popular amongst astrophysics ca 2002), which predates e.g. MatPlotLib. It may be as old as computer controlled plotters themselves.

mplewis · 2024-04-04T13:52:35 1712238755

Use enums and types to compose an API that shows you what options are supported and how you can combine them.
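A rough sketch of what that could look like, using only the standard library. The `Color`, `Marker`, `LineStyle`, and `PlotStyle` names are hypothetical and are not part of matplotlib or any other plotting library; the enum values simply reuse the characters from the terse DSL so the two forms can be compared.

```python
from dataclasses import dataclass
from enum import Enum


class Color(Enum):
    RED = "r"
    BLUE = "b"


class Marker(Enum):
    NONE = ""
    CIRCLE = "o"
    SQUARE = "s"


class LineStyle(Enum):
    NONE = ""
    SOLID = "-"
    DASHED = "--"


@dataclass(frozen=True)
class PlotStyle:
    """Hypothetical typed style; each field lists its supported options."""
    color: Color
    marker: Marker = Marker.NONE
    line: LineStyle = LineStyle.NONE

    def to_fmt(self) -> str:
        """Render the terse format string this style stands for."""
        return f"{self.color.value}{self.marker.value}{self.line.value}"


red_dashes = PlotStyle(Color.RED, line=LineStyle.DASHED)
blue_squares = PlotStyle(Color.BLUE, marker=Marker.SQUARE)
```

With this shape, an editor can autocomplete the available options and a typo fails at the point of use instead of silently producing a different plot. The cost is more typing, which is the trade-off raised earlier in the thread for notebook work.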

piokoch · 2024-04-02T08:09:35 1712045375

From this review, it looks as if this book is some version of the infamous YouTube "coaching" that gives you advice "to be strong", "hard working", "believe in yourself" and if you feel depressed, well, you should run 10 miles every morning.

My favorite fragment of this review:

"But then he [Kent Beck] spends 2/3 of the book talking about how to schedule time for tidying [...] code"

"And then when I asked him about this, he actually said that I’m right but should wait for his next book [...]"

Yuck. At least this Kent Beck guy is honest.

criddell · 2024-04-01T20:21:23 1712002883

I recognize most of the tidying patterns listed in the article and the associated Twitter feed. My problem is the unnecessary noise in version control. If I’m trying to see when some chunk of code was changed using blame, having a bunch of small edits can make it a lot more difficult.

Still, I tend to do this kind of work on days when I’m not feeling great. I work on a large, reasonably old codebase (28 years old) so tidying busywork sometimes leads me to someplace interesting.

jolmg · 2024-04-01T21:38:53 1712007533

If the noise is behind merges, you can also use `git blame --first-parent` and it should show the big merge commit with the comprehensive explanatory message, rather than the small commit in the feature branch. You can use `git show -m` to show the diff of those merges.

netghost · 2024-04-01T23:54:25 1712015665

For a little more background, this article explains how to keep a record of which revisions shouldn't show up: https://tekin.co.uk/2020/09/ignore-linting-and-formatting-co...

Used well, I think it should help with this problem.

thestoicattack · 2024-04-01T20:31:03 1712003463

git blame does include --ignore-rev and --ignore-revs-file, so maybe if people updated such a file when making small edits, it would make your life easier.

criddell · 2024-04-01T20:53:19 1712004799

Thanks for the idea. I’ll check it out.

We’re actually still using Subversion for our main codebase and mostly happy with it. Being able to use --ignore-revs-file is a reason we might want to switch some day.

david_allison · 2024-04-02T00:49:32 1712018972

    git config --local blame.ignoreRevsFile .git-blame-ignore-revs
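For anyone who wants to maintain such a file from a script, here is a small sketch. It assumes a SHA-1 repository (40-character hashes) and the `.git-blame-ignore-revs` file name used above; the helper names `render_ignore_revs` and `blame_argv` and the placeholder hash are made up for illustration. The file format is one full commit hash per line, with `#` lines treated as comments.

```python
def render_ignore_revs(entries):
    """Build the text of an ignore-revs file from (full_sha, reason) pairs."""
    lines = []
    for sha, reason in entries:
        # git expects full, unabbreviated hashes; 40 hex chars assumes SHA-1.
        if len(sha) != 40 or any(ch not in "0123456789abcdef" for ch in sha):
            raise ValueError(f"not a full lowercase hex commit hash: {sha!r}")
        lines.append(f"# {reason}")
        lines.append(sha)
    return "\n".join(lines) + "\n"


def blame_argv(path, ignore_file=".git-blame-ignore-revs"):
    """Argument list for a one-off blame that skips the listed revisions."""
    return ["git", "blame", "--ignore-revs-file", ignore_file, "--", path]


# Placeholder hash, not a real commit.
placeholder_sha = "0123456789abcdef0123456789abcdef01234567"
file_text = render_ignore_revs([(placeholder_sha, "Reformat with linter")])
argv = blame_argv("src/app.py")
```

The sketch only builds the file contents and the command line; writing the file and running git are left to the caller.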

barfbagginus · 2024-04-01T22:14:58 1712009698

It can be hard to review or blame small disordered commits. But you can rebase the commits to group and squash them for review. Then you may choose to squash the entire PR when you rebase it onto main.

This has pros and cons. Let's explore them after an example.

Scenario: Big & Noisy PR

We have a PR with 13 commits: 3 fix commits, 5 refactors, 2 doc updates, 1 CI modification, 1 feature commit, and 1 test update.

The commits are in random order, and some of the commits are revisions to earlier commits. It's hard to understand the narrative of the commits and review things accurately.

Solution Part 1: Tidy the PR

1. Reorder the commits by type and relevance: 3 fix -> CI -> 5 refactor -> 2 doc -> test -> feature.

2. Squash logically similar commits: fix -> CI -> 2 refactor -> doc -> test -> feature.

A squashed commit should have a bullet list detailing each changed module and scope of change:

"""

Fix(package a): Fix a, b, c

This patch fixes:

* (module b): fix [...]

* (module c): fix [...]

"""

3. Review and CI the PR

4. Add commits needed to complete the PR

The tidied PR was easy to understand: We fixed some pre-existing issues, beefed up the CI, then set the stage for the feature. The feature itself was clear and simple.

If you do just this, your history will be much cleaner.
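As a rough illustration of the reorder-and-squash steps, the sketch below turns a list of commits into the `pick`/`squash` lines you would write in an interactive rebase todo list. The `Commit` fields, the `group` label that decides which commits get squashed together, and the sample hashes are all hypothetical. Reordering real commits can still produce conflicts that have to be resolved by hand.

```python
from dataclasses import dataclass

# Order from the steps above: fixes, CI, refactors, docs, tests, feature.
TYPE_ORDER = ["fix", "ci", "refactor", "doc", "test", "feat"]


@dataclass(frozen=True)
class Commit:
    sha: str      # hypothetical abbreviated hash
    kind: str     # one of TYPE_ORDER
    group: str    # commits sharing a kind and group are squashed together
    subject: str


def rebase_todo(commits):
    """Return 'pick'/'squash' lines for an interactive rebase todo list."""
    # sorted() is stable, so commits keep their relative order within a group.
    ordered = sorted(commits, key=lambda c: (TYPE_ORDER.index(c.kind), c.group))
    lines, seen = [], set()
    for c in ordered:
        key = (c.kind, c.group)
        action = "squash" if key in seen else "pick"
        seen.add(key)
        lines.append(f"{action} {c.sha} {c.subject}")
    return lines


commits = [
    Commit("a1", "refactor", "rename", "Rename helpers"),
    Commit("b2", "fix", "pkg-a", "Fix a"),
    Commit("c3", "feat", "feature", "Big shiny feature"),
    Commit("d4", "fix", "pkg-a", "Fix b"),
    Commit("e5", "refactor", "rename", "Rename more helpers"),
]
todo = rebase_todo(commits)
```

Here the two fixes collapse into one commit, the two renames into another, and the feature lands last.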

Now I'm going to recommend something controversial:

Solution Part 2: Squash-rebase the PR onto main

Yup. Take all that work and mash it together. The final commit message should look clean and detailed:

"""

Big Shiny Feature

This patch implements [...]

Feat:

* (module d): Implement big shiny feature

Doc:

* (readme): update feature list

* (userguide): add tutorial for feature

CI:

* (workflow a): modify [...]

Fixes:

* (module b): fix [...]

* (module c): fix [...]

Refactor:

* (module a): rename [...]

* (module b): delete unused [...]

[...]

"""

Benefits:

- We drop 7x fewer commits onto main

- Project history is more legible

- Commit messages are detailed and useful

- Bisecting takes about log2(7) ≈ 2.8 fewer steps

Risks:

- File diffs can be illegible if feature work intersects with refactor or fix work

- There are 7x more defects per commit

- It is harder to uncover root cause if we bisect
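The bisect figure comes from bisection needing roughly log2(n) steps over n commits, so a history with 7x fewer commits saves about log2(7) steps. A quick check, with hypothetical history sizes chosen only to be 7x apart:

```python
import math


def bisect_steps(n_commits):
    """Approximate worst-case number of commits to test when bisecting."""
    return math.ceil(math.log2(n_commits)) if n_commits > 1 else 0


squashed, unsquashed = 100, 700  # hypothetical history sizes, 7x apart
saved = bisect_steps(unsquashed) - bisect_steps(squashed)
ratio_saving = math.log2(7)  # about 2.8, independent of history size
```

The saving is a constant number of steps no matter how long the history is, which is worth weighing against the coarser result a bisect over squashed commits gives you.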

Conclusion

Tidying PRs before review is a no brainer - it greatly improves our review and history.

Squashing PRs onto main loses some information, but can make history easier to navigate. Since we're disciplined and detailed in our commit messages, this is often much less of a footgun than it might seem. Each commit is now a logical and self-contained unit.

x3n0ph3n3 · 2024-04-02T04:29:40 1712032180

> He repeats the typical line of “Delete code that’s not used instead of commenting it out, because you can always recover it from VCS.” (That’s a view that I’ve [controversially] started to turn against, for the same reason that “you can recover it from backups” is not a compelling reason to delete files currently not in use.)

That's an interesting take. I've usually encouraged removing commented code because every line of code is a liability, even if commented out.

rjprins · 2024-04-02T09:28:53 1712050133

I agree that recovering from VCS is something that never actually happens. Mostly because the code is forgotten.

Still, commented-out code is generally worthless and should be deleted. Unless it is actively being worked on and only commented out to achieve some short-term goal. In which case it should also not be commented out but instead live on a branch.

Occasionally you might have two flows and you're not sure which is best. But in that case keep them in separate functions and one of those will not be used but can still be covered with tests. And all static analysis tools will work on it. Just needs a comment about why this unused code is there.
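A tiny sketch of that arrangement, with made-up function names: both flows live in the module as real functions, only one is called by production code, and a test keeps the unused one honest.

```python
def total_eager(prices):
    """Flow currently used by callers: sum the list in one call."""
    return sum(prices)


def total_running(prices):
    """Alternative flow, currently unused.

    Kept on purpose while we decide between the two flows. Because it is
    real code rather than a comment, tests, linters, and type checkers
    still see it.
    """
    total = 0
    for price in prices:
        total += price
    return total


def test_flows_agree():
    sample = [3, 4, 5]
    assert total_eager(sample) == total_running(sample) == 12
```

Unlike a commented-out block, the unused function breaks the build if a rename or signature change elsewhere would have silently invalidated it.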

pnt12 · 2024-04-02T09:25:47 1712049947

I can see his point: deleted code is really hard to discover. You can fetch it if you know it's there, but how do you find it in the 1st place?

With that being said, commented code is rarely useful and it can be a liability indeed.

aardvark179 · 2024-04-02T10:22:27 1712053347

Commented out code in downstream forks is particularly bad as it either generates spurious merge problems ( // style commenting means you’ll have to do manual resolution for every line in the commented out sections) or hides real changes altogether (/* */ comments allow whole new functions to be added inside the commented region and go unnoticed - which might cause a compilation failure, but might also just result in different method resolution).

Maybe put a comment about the removal if you want something that can easily lead back to the commit which did it.

hn_user82179 · 2024-04-01T20:08:37 1712002117

Good review, I enjoyed it. It does what I think I look for in technical book reviews, specifically:

- gave a detailed overview of the book

- gave an actual opinion, instead of just the summary

- included specific excerpts to support the opinion

- [bonus] talked to the author about specific questions the reviewer had

And I appreciated this last paragraph:

> So, if you’re considering buying this book, just purchase a subscription to his Substack instead. He’ll earn more and you’ll learn more. The only one who loses is the publisher.

Jtsummers · 2024-04-01T20:07:40 1712002060

> If you’d like more detailed criticism of the book, you can buy my raw unfiltered chapter-by-chapter notes. This will only be available until April 8th, and then I’ll take it down forever. Click here to purchase.

(Link goes to: https://mirdin.com/downloads/notes-on-tidy-first/)

Follow the link and you can buy the detailed criticism of the book for $25. Which is more than the cost of the book new on Amazon ($20, or $24.99 for the Kindle version). Seems somewhere between scummy and scammy.

jrpelkonen · 2024-04-01T23:58:29 1712015909

Unrelated, but this reminded me of this old chestnut that still makes me smile. It is from comp.lang.c FAQ, e.g. <https://www.cs.rpi.edu/courses/fall96/netprog/cfaq.html>:

    The cost is $130.00 from ANSI or $162.50 from Global.  Copies of
    the original X3.159 (including the Rationale) are still
    available at $205.00 from ANSI or $200.50 from Global.  Note
    that ANSI derives revenues to support its operations from the
    sale of printed standards, so electronic copies are _not_
    available.

    The mistitled _Annotated ANSI C Standard_, with annotations by
    Herbert Schildt, contains all but a few pages of ISO 9899; it is
    published by Osborne/McGraw-Hill, ISBN 0-07-881952-0, and sells
    in the U.S. for approximately $40.  (It has been suggested that
    the price differential between this work and the official
    standard reflects the value of the annotations.)

mtlynch · 2024-04-01T21:02:27 1712005347

I admit that my gut reaction is that there's something fishy about selling notes about someone else's book, but thinking about it more, what's the harm?

There's nothing deceptive or sneaky about it. There's value in distilling information from a bloated book down to its useful ideas.

There are lots of books that are just existing ideas repackaged for a different audience. For example, Atomic Habits is basically a repackaged version of BJ Fogg's research papers. And people see value in that, so why isn't it okay to do that in a more 1:1 way?

Jtsummers · 2024-04-01T21:13:57 1712006037

The short release period gives off "For a limited time only!" vibes, a scummy way to entice people to buy something now because they might not be able to later. It's also unclear that the notes even add value, they're described as "raw unfiltered chapter-by-chapter notes". Are they just random thoughts? Are they a better version of the chapters they line up with? Are they a companion to the book (the notes elaborating on the parts this author found to be too brief)? Are they responses to the contents? You don't know, until you spend more than the cost of the not-actually-bloated book that they cover.

> There's value in distilling information from a bloated book down to its useful ideas.

Yes, but it's not clear at all that this $25 limited-time-only set of notes actually does that, and they cost more than a book that is decidedly not bloated (it's 100 pages, 33 chapters, and each chapter is quite short).

mtlynch · 2024-04-01T21:28:45 1712006925

Yeah, I agree. The "limited time only" seems manipulative and totally unnecessary.

I'm less convinced by the criticism about the notes potentially not being worth the money. That's true of basically all products.

In this case, if the buyer chooses to buy these notes based on the minimal information the author has shared, then the buyer should be ready to accept the possibility that the notes won't be what they expected.

jbenoit · 2024-04-01T21:44:46 1712007886

He said it was unfiltered. If I was giving out some of my unfiltered thoughts, I don't think I'd want them floating around the Internet for anyone to read at any moment.

TylerE · 2024-04-01T22:11:28 1712009488

If you’re not comfortable with something “floating around on the internet”, how could you possibly be comfortable SELLING that thing to the same people?

jbenoit · 2024-04-01T22:21:42 1712010102

Of course. It means people genuinely interested in your content will have it, but people looking to score cheap Internet points on you won't. ("Did you know he once called chapter 5 of Bob Smith's book 'under-researched,' and Bob Smith's grandfather is Puerto Rican, so he's literally saying that people of color are stupid")

If you look at the link, that's what he actually says.

     Sometimes these notes are not things I want to share with the whole world, but might be helpful for a few people.

     So what do I do? I release them, but only for a limited time. I also put on a price tag — not because I expect to make any money, but so that only those genuinely interested will read it.

jjice · 2024-04-01T20:56:20 1712004980

Is it scummy and scammy? I don't see anything wrong with someone selling their notes on a subject, no matter the price. The purchaser can choose if it's worth it to them or if it's not.

I don't see the scummy or the scammy part here. Am I missing something?

fnl · 2024-04-02T04:37:40 1712032660

To me it implicitly means: “I am selling you my distilled copy of the book at nearly the price of the book. That lets me recoup my investment, and after a few days I will take it offline, so I don’t get a lawsuit from O’Reilly for copyright infringement.”

fnl · 2024-04-01T20:39:40 1712003980

Oh, wow, scummy to scammy, indeed! It’s all in the fine print… I didn’t bother reading this appendix on the first read of the blog post.

Disclaimer: I do really enjoy this book, as it reminds me of Uncle Bob’s Clean Code, but a short version, and with the focus on what to do after writing code, when you need to change it or want to understand it better.

Yes, it is structured very much like a collection of blog posts. Which is great for me, as I typically work on learning one “tidying” a day or less. So I am not yet done with the book, end-to-end.

mindaslab · 2024-04-02T10:11:42 1712052702

This guy should learn Clojure https://clojure-book.gitlab.io/

theonething · 2024-04-02T04:17:27 1712031447

tidying first sounds like a good way to go down a rabbit hole before you even start.

Jtsummers · 2024-04-02T05:01:22 1712034082

That’s why the title has a question mark. The book lays out what he means by tidying (structural changes, not behavioral changes) and factors to consider on when or even whether to tidy.

selimthegrim · 2024-04-01T22:05:17 1712009117

And here I was anticipating a pun on tidyverse and a journey into how agile techniques would apply to R

PaulHoule · 2024-04-01T20:08:12 1712002092

Isn't Beck the Dr. Oz of software management? He's built a career on cargo cult practices that destroy software developers. Pushing agile he's done more harm than Tony Hoare's billion dollar mistake.

ebiester · 2024-04-01T20:35:46 1712003746

I go through some of the alternatives to agile in project methodology here: https://www.ebiester.com/agile/2023/04/22/what-agile-alterna...

Go back to 2001. Which non-agile methodology would you have liked instead? RUP? How about we all end up in the PMBOK circa 2001?

We take releasing monthly, weekly, or faster for granted. We take continuous integration for granted. So many of these ideas came explicitly out of agile delivery or were popularized because of it.

legulere · 2024-04-01T20:59:19 1712005159

"No Silver Bullet" was published in 1986, advocating for incremental software development. Feedback cycles in development were always a target for optimization, but often had technological roadblocks.

ebiester · 2024-04-01T23:33:48 1712014428

That looked a lot more like Spiral development. Think about 3 month iterations where the design docs are built and approved before software development begins. It's a bunch of mini-waterfalls.

PaulHoule · 2024-04-01T21:43:10 1712007790

I've got no problem with continuous delivery. I do have a problem with sprints (instead of wrecking your project with a fake deadline every year, wreck your project with a fake deadline every month).

Granted to do sprints you have to have a better build process than a lot of shops had in the 1990s but taking a week or two to deliver just because the process says so... means a 1-day delay can snowball into a delay of weeks or months.

Also there are all the meaningless meetings that people dread. The standup that inexplicably happens first thing in the morning when you struggle to remember what you did the day before. The retrospective that I only want to answer with "I am so tired at the end of this sprint that I just want to go ride my bike in the hills or drink some beer or smoke some pot or play a videogame, not sit around uncomfortably in a group of people that are either disengaged and hiding it or doing a good job of pretending to be engaged"

---

To me it is fighting words to bring up the PMBOK because the PMBOK doesn't specify a particular process but it does enumerate the things that have to be managed to manage a project, in "agile the good parts" you are just addressing all of these on a weekly cycle instead of a yearly or longer cycle. I also see the rejection of PERT charts and other dependency management as a fatal flaw for A.I. and data science projects where model training might take 1/2 of the sprint so if you don't start building your model in the first half you blow. your. sprint. every. time. I first saw people make this mistake 12 years ago and they are still making it. By making people pay attention to a bunch of fake management metrics (story points) you distract them from paying attention to the metrics that matter for a particular app. I've even seen a lack of attention to dependencies be quite harmful to teamwork in more normal software projects because if a team really understood that getting Task A done means you can be efficient at Task B and Task C they might get as much real work done in one sprint as they wind up getting done in three.

layer8 · 2024-04-01T22:10:28 1712009428

Sprints are Scrum, they are neither Agile [0] nor Kent Beck’s Extreme Programming [1].

[0] https://agilemanifesto.org/

[1] https://en.wikipedia.org/wiki/Extreme_programming

bducycy · 2024-04-02T03:26:42 1712028402

https://en.m.wikipedia.org/wiki/Extreme_programming_practice...

Agile is just Scrum with fewer steps.

That said, the idea above that you can do software in year long release cycles is pretty insane.

Release cycles and amount of stakeholder feedback that should be involved in the development process is unique per business and industry needs.

ebiester · 2024-04-01T23:31:44 1712014304

> (instead of wrecking your project with a fake deadline every year, wreck your project with a fake deadline every month.)

I have worked in both (albeit as a junior dev), and the culture of the environment and the severity of the deadline meant much more than the space between deadlines. Deadlines at a tax company are always going to be worse than one with no high and low periods. Missing a back to school launch in edtech is going to be catastrophic, and I'd rather know sooner than later so that we can pull things out of the release.

In the massive integration days, everyone had already done most of the work and there were a lot of pieces that were 70% complete, and the integration dependencies made it brutally difficult to pull something out of a release. I will defend short iterations all day long.

Remember that "sustainable pace" also came out of the agile community. Kent Beck called it a "40 hour week" in XP.

> The standup that inexplicably happens first thing in the morning when you struggle to remember what you did the day before.

Does nobody else keep notes on their days? That said, what is important at a standup is "who needs help" or "who is waiting on something?" e.g. what is falling through the cracks? It helps to get this information on a daily basis, yes.

> The retrospective that I only want to answer with "I am so tired at the end of this sprint that I just want to go ride my bike in the hills or drink some beer or smoke some pot or play a videogame, not sit around uncomfortably in a group of people that are either disengaged and hiding it or doing a good job of pretending to be engaged"

I will defend retros, but only if they result in changes that make the team a better place. If I was in a retro that had low engagement, I'd explicitly call it out in the retro that the format is not generating change.

> To me it is fighting words to bring up the PMBOK because the PMBOK doesn't specify a particular process but it does enumerate the things that have to be managed to manage a project, in "agile the good parts" you are just addressing all of these on a weekly cycle instead of a yearly or longer cycle.

I'm not talking about 7th edition PMBOK here. Let's go back to the 2000 edition.

"Each project phase is marked by completion of one or more deliverables. A deliverable is a tangible, verifiable work product such as a feasibility study, a detail design, or a working prototype. The deliverables, and hence the phases, are part of a generally sequential logic designed to ensure proper definition of the product of the project."

2.1 and 2.2 talk about waterfall and spiral before heading into stakeholder management. It follows through the entire system - consider what integrated change control looked like in those days. It's all things we take for granted now - when was the last time you had to write a design doc that was longer than 2 pages? When was the last time you got in a room to hash out and negotiate scope for a year of work? It's truly brutal!

Now, this isn't to say I love Scrum. I agree that if you are letting your metrics get in the way of your work then you've done the exact wrong thing. If you know you need to start building your model early then you start building your model early. If you aren't paying attention to your dependencies, then the process has failed and maybe you jump to something else.

I'm not even saying there are no flaws in agile, Scrum or otherwise! However, so many of the things that came out of agile were net positive.

bitwize · 2024-04-01T22:04:26 1712009066

Sprints are essential to software development because they make underperformers much easier to fire. With long deadlines, a straggler can take their time and evade detection by management for months if not years, but with sprint commitments to which devs are held accountable on a consistent, fast cadence, stragglers can be easily found and eliminated.

Agile in the workplace is not for you, it's for management.

ebiester · 2024-04-01T23:43:01 1712014981

Any software process is for management. That said, you don't need sprints for that. It just looks like a monthly meeting with your manager, where they say they're disappointed in what you have so far. Then weekly meetings, and then daily micromanagement.

You can micromanage in any process.

zwischenzug · 2024-04-01T21:19:58 1712006398

ebiester · 2024-04-01T23:35:50 1712014550

I meant RUP - Rational Unified Process, but RAD also would work - https://en.wikipedia.org/wiki/Rapid_application_development

maximinus_thrax · 2024-04-01T21:15:44 1712006144

> He's built a career on cargo cult practices that destroy software developers.

That's a very strong statement, and I don't believe you have the data to back it up. He (or his practices) has not 'destroyed' software developers. Not sure if you were working in the pre-agile era. I do agree some shops take it to the extreme, but what we had before was a complete mess.

fnl · 2024-04-01T20:27:35 1712003255

That’s an interesting viewpoint, even surprising to me. Can you be more concrete about what you think is wrong with the agile software development movement Kent Beck co-founded, and how it destroys developers?

simonbarker87 · 2024-04-01T20:45:46 1712004346

It has turned into a system that keeps developers in a constant state of crunch time. It simultaneously pushes all responsibility for delivery on to devs under the guise of “the team self organises” whilst stripping them of any real autonomy to actually feel a sense of satisfaction in completing tasks and delivering projects.

When something goes well the anonymity that “the team” creates to the wider business means that praise goes to the project managers and product owners who, realistically, probably did very little beyond the kick off to deliver.

Agile is a system that creates misery for devs.

bitwize · 2024-04-01T22:06:58 1712009218

Indeed. In agile, successes belong to the team; failures belong to the individual developer. The higher-ups will know that it was your commit that broke the build, and they know exactly how many story points you delivered -- or did not deliver -- per sprint by your JIRA metrics.

VHRanger · 2024-04-01T20:32:28 1712003548

Did you read the original agile manifesto? It's very reasonable.

In the 20 years since, agile morphed into the agile industrial complex, where practices at best cargo cult the original idea.

PaulHoule · 2024-04-01T21:28:38 1712006918

It's semi-reasonable. The individual parts are reasonable but putting it together to create a branded methodology™ which is "my way or the highway" was the beginning of the agile-industrial complex as we know it.

You might make the case that Kent Beck has actually put together two lines of code and run a compiler (even started JUnit) and the thousands of imitators who talk just like Kent Beck haven't, but if you're going to be critical of the agile-industrial complex you have to be critical of the founder too, who created a toxic style of discourse and a piratical business model of agile consulting.

jdlshore · 2024-04-02T13:59:09 1712066349

The agile-industrial complex came from Scrum and its certified trainers, and its imitators such as SAFe, not Kent Beck and XP. Kent disappeared for a while to raise goats in Oregon, then worked for Facebook and other companies (as an employee). He’s the last person you should be blaming for what Agile’s become.