The Clean Coder: The Craftsman 62, The Dark Path. (thecleancoder.blogspot.com)
85 points by puredanger on Oct 4, 2010 | 51 comments



This article sort of makes me wonder about the value of test-driven development.

Just glancing at the problem, I came up in my head with what I thought was a simple algorithm to solve it— which turned out to be pretty much the same as the final implementation in the article. It made it a bit torturous to have to keep reading the author beating his head against his tests instead of just asking himself, what's the simplest way to solve this?

Maybe I'm missing the purpose of the exercise? I do see the value of behavioral testing in general. I just have to wonder if we're taking it too far in saying that even if you can see the answer to the problem, you're not allowed to write it down yet.


I'm a raving TDD/BDD fanatic, and this kata still makes me cringe a little. At least they finally converge on a plausible-looking solution. Sadly, some TDD katas on the web turn into train wrecks. See Ron Jeffries' Sudoku solver, for example:

http://ravimohan.blogspot.com/2007/04/learning-from-sudoku-s...

If I were implementing this algorithm using TDD, I'd first write down a bunch of test cases, and then try the obvious algorithm. If that worked, I'd test a few more corner cases. There's really no reason why you have to blindly stagger through the design space when the algorithm is obvious.


I think this is a problem with a lot of methodologies, in that they often obscure that the point is to write good software. If you can write good software without following the methodology, do that. All these techniques - agile, scrum, stand-up meetings, CRC cards, TDD, pair programming, code review, unit tests - are just tools at your disposal to make writing the program go smoother.

Google's methodology for writing quality software is basically:

1. Hire good people.

2. Let them do what they need to to get the product out the door.

This seems to have been adopted by FaceBook as well. I wish more companies would consider it instead of adopting the snake oil that a bunch of consultancies peddle.


Google does a lot of test automation. I don't know if they do test-first or not, but I heard that new hires are given "Working Effectively with Legacy Code", written by someone who works with Uncle Bob. At Google, there is a special role called Software Engineer in Test: they write tools to help developers debug performance issues, and they also sit down with developers to help them make their code more testable (this happens when you don't do test-first; once in a while, developers will write code that's not easily testable via automated tests). Google is a huge fan of code review (that's probably why Andy Hertzfeld dislikes Google).

In contrast, Facebook probably doesn't follow these practices (or follows them minimally). Facebook subscribes to a cowboy culture. A few days ago there were 2 posts on HN complaining about the Facebook API quality and documentation. I've used Facebook since 2006 and I do notice that their software is buggy whenever they push a newer build (you can tell: as soon as things degenerate, they have just pushed the latest build). Aditya, who was the Director of Engineering back in 2008 (I don't know if he still is), gave a talk recorded by InfoQ. He was asked about the state of unit tests (or automated tests) and he said that the number of tests was minimal and not as many as he would have liked.

I know that some companies enforce code review, TDD and CI: kaChing, IMVU. There are others as well (I think Disqus and Quora do that too, but I'm not sure to what extent), but they might not say it out loud around here.

The point is this: it doesn't matter what it's called (unit tests, integration tests, acceptance tests). As long as you have extensive automated tests, you can iterate faster and more safely in the long run. You can also deploy faster because you're quite sure nothing breaks.

The old QA process, where people would write code, throw it over the wall to QA, and hope everything goes well, no longer applies (check Uncle Bob's article about how the way we do QA is wrong).


Actually this post from that same blog (to me) correctly summarizes the Kata exercise: http://pindancing.blogspot.com/2009/09/sudoku-in-coders-at-w...

Specifically Norvig's quote in the blog: "I see tests more as a way of correcting errors rather than as a way of design. This extreme approach of saying, Well, the first thing you do is write a test that says I get the right answer at the end, and then you run it and see that it fails, and then you say, What do I need next? that doesn’t seem like the right way to design something to me."


Yeah, I didn't mean to say that I was questioning TDD in general— rather that it was obvious that this dev's insistence on solving the smallest part of the problem first was blinding him to the fact that the problem as a whole is quite small.

And wow— that Sudoku epic is terrifying.


> (...) and then try the obvious algorithm. If that worked (...)

Personally, I would not bother to implement an algorithm if I was not sure it works correctly -- if it turned out to be wrong, I would have wasted my time.


Interesting! How do you know your algorithm handles the tricky corner cases? I'd imagine you need to write down all the corner cases, write a formal proof for each, and hope that (1) your proof is correct, and (2) your code actually matches your proof. Could you elaborate on how you do this?

Here are a few corner cases I can think of for word-wrapping:

1) The line is empty.

2) The line is 1 character shorter/1 character longer/exactly the same length as the wrap limit.

3) The first break point for the line falls before/after/on the wrap limit.

4) The first character(s) of the newly-wrapped line are spaces.

I don't think that I could just eyeball a word-wrap algorithm and know that it handles all these cases correctly. I would, at a minimum, need to check each case. But at that point, I'm just as well off writing test cases, and allowing the computer to verify that my algorithm does the right thing for each. Of course, that's not as good as a proof by induction for each desired property, but it's a _lot_ less work, and it's almost always adequate. And I don't need to worry about mistakes in my proof.

Am I misunderstanding something about your approach? How would you handle this?
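The corner cases listed above can be written down as executable checks. What follows is only a sketch: the class name `WrapCornerCases` and this particular `wrap` are hypothetical, not the article's code, and the break rule (break at the last space at or before the limit, otherwise hard-break the word) is an assumption chosen so the expected values are well defined.

```java
class WrapCornerCases {
    // Hypothetical wrap, written iteratively: break at the last space at or
    // before the limit; if there is none, hard-break the word at the limit.
    static String wrap(String s, int width) {
        StringBuilder out = new StringBuilder();
        int start = 0;
        while (s.length() - start > width) {
            // Search backwards from the first position past a full line.
            int space = s.lastIndexOf(' ', start + width);
            if (space > start) {
                out.append(s, start, space).append('\n');
                start = space + 1;           // drop the space we broke on
            } else {
                out.append(s, start, start + width).append('\n');
                start += width;              // no usable space: hard break
            }
        }
        return out.append(s.substring(start)).toString();
    }

    static void check(String input, int width, String expected) {
        String actual = wrap(input, width);
        if (!expected.equals(actual)) {
            throw new AssertionError("wrap(\"" + input + "\", " + width + ") gave \"" + actual + "\"");
        }
    }

    public static void main(String[] args) {
        check("", 4, "");                    // 1) empty line
        check("abc", 4, "abc");              // 2) one shorter than the limit
        check("abcd", 4, "abcd");            //    exactly the limit
        check("abcde", 4, "abcd\ne");        //    one longer than the limit
        check("ab cdef", 4, "ab\ncdef");     // 3) break point before the limit
        check("abcd ef", 4, "abcd\nef");     //    break point on the limit
        check("abcde f", 4, "abcd\ne f");    //    break point after the limit
        check("abcd  ef", 4, "abcd\n ef");   // 4) new line starts with a space
        System.out.println("all corner cases pass");
    }
}
```

Case 4 shows why the list is worth writing down: this sketch keeps the leading space on the new line, which may or may not be what you want, and the check makes that decision visible instead of leaving it implicit.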


I do not say I do not test my code. I just see no purpose in implementing code that I hope will work. I need to believe it will work, and if it does not, I am sure that it is an implementation, not an algorithm issue. The first attempt in the article was just silly, and the second was just wrong. I would not waste time writing these, instead I would use this time to figure out the real solution and then I would implement it.


Hrmm? How else are you going to solve a problem that you've never faced before?

And yes, wasted time is basically a necessary feature of working on problems that don't have a textbook solution.


I am going to think about a problem, and if I come up with a solution I am sure is correct, I shall implement it. If I do not, I will search the web for a solution. If I find one, I will make sure it is correct, and then implement it. If I do not, I am going to ask some friends of mine about it. If they do not know the solution, I will temporarily loosen the constraints and implement a solution that is "good enough", then I will make the problem my next research problem.

Implementing an algorithm that we are not sure works correctly is not only a waste of time -- it is actually harmful, for it is possible to ship an algorithm that only seems to work correctly. Test cases will not prove that a solution is correct.


What did I get downvoted for? I am relatively new here and I do not know all the things that are frowned upon.


I suspect that in many people's experience, "a solution that I am sure is correct" is anything but, until you've actually tested it. And testing it often takes far less time than puzzling over it and pondering. And so claiming that you never implement anything until you're absolutely certain it's correct smacks of youthful naivete, of the type common to college students who've never implemented anything.

In many of the more interesting fields, it's often not well-defined what it means for an algorithm to be "correct" anyway. What would a "correct" web search algorithm look like? How about a "correct" recommendation algorithm? A "correct" flight fare prediction algorithm? A "correct" stock trading algorithm? A "correct" Starcraft AI? There're "better" and "worse" algorithms for these, but there's no such thing as a "correct" one.

For many problems that do have an obvious "correct" solution (e.g. OCR, face recognition, fare optimization), getting there is a nearly intractable problem, and the interesting part is in figuring out how you can approximate it as well as possible with the resources you have available.


You are right in what you say, however I think I have been misunderstood. It is well defined what it means for an algorithm to be correct -- it means that it returns expected output for a given input. Ambiguous situations you are talking about are when the problem is not well defined. That is why I used the word "algorithm" in my first comment. Trial and error is perfectly fine when we are not exactly sure what result we expect, but we will "feel" when it's good enough. It is unacceptable if we know exactly what the result should be, though.

When the problem is well-defined, like word-wrapping, or sorting, or whatever, if you blindly stagger and write code hoping it will work, instead of stopping to think why it should work, you are doing it wrong.


In fact, it is pretty obvious that the developer's decision to write tests first is blinding him to the problem and keeping him from seeing a solution. By whiplashing around tests rather than taking time to just think about what he wants, he takes an hour to write a simple solution to what would be a warm-up question in a good technical interview.

On the other hand, as a way to get around sticky conceptual barriers, perhaps this approach is great. One just wishes the barriers were set a bit higher. But perhaps the trick doesn't work as well there.


Funny how that which is right in front of your face can be so elusive. Are we so busy amongst the trees that we can't see the forest?

One of my mentors once gave me a list of obvious things to check when stuff doesn't work. Funny, years later I still need this list:

1. It worked. No one touched it but you. It doesn't work. It's probably something you did.

2. It worked. You made one change. It doesn't work. It's probably the change you made.

3. It worked. You promoted it. It doesn't work. Your testing environment probably isn't the same as your production environment.

4. It worked for these 10 cases. It didn't work for the 11th case. It was probably never right in the first place.

5. It worked perfectly for 10 years. Today it didn't work. Something probably changed.


"Hi Alphonse. What's up?" The gruff voice of Bob, Jerry's manager (or "master", as he called him), came from behind them. "Oh, I see Jerry is showing you how to generate unnecessary String garbage and stack overflow exceptions. Wow, and in a simple word wrap function, too. Nice job, Jerry."

Alphonse was confused. "Bob, how do you know that? We didn't write tests for those things. I don't even know if you can write tests for them."

Bob looked at Alphonse with a sad expression. "Jerry, could I see you in my office--sorry, my 'dojo'--for a minute?"


Also, too bad Alphonse didn't include a test such as:

    @Test
    public void stringContainsNewline() throws Exception {
      assertThat(wrap("ab\ncd e", 4), equalTo("ab\ncd e"));
    }


I'd have a great deal of difficulty in taking anyone seriously who called their office their "dojo". Or liked their direct reports to call them "master".


Jerry is the one making up all this martial arts nonsense...Bob is just humoring him.


I have no problem calling a particularly great coder/architect Sensei. I even used to do it for the fun of it. But when it comes to asymmetric management relationships, I do my best to avoid biasing them further ("master").

Could be a great way to practice office theater, though.


It's not healthy to consider oneself subordinate to one's manager. You're peers with different roles, that's all. He's more about the what, you're more about the how.


I've never used TDD, and this definitely hasn't convinced me I should do so. I knew the solution as soon as I sat down and thought for 1 minute to myself - it's a pretty obvious algorithm. I just don't see how coding test-first helps in this case. It certainly didn't help in the story told, but I don't know how it can help in the real world.


What struck me is the guy was coding to pass his tests only, not thinking about the right way to code the algorithm.

It's like coding by taking stabs in the dark until every degenerate case you thought of is handled correctly. There's still no understanding of the problem or even a hope of a "proof" that what you did was correct.

Is this the future of software development? No wonder software is crap.


Yup. That's why I prefer only writing a test for the most nominal case at first, and then code that solution. I don't write any more tests until I have a bug in the system, and I feel the need to convince myself that the subsystem is behaving the way I think it does. Basically test-writing becomes a debugging tool like putting in trace, breakpoints or watchpoints. You have doubts about a part of your system, so write a test to confirm that it does what you think it does. I find that by the time a module is reasonably mature, I have enough tests to make a fairly solid test rig. I also don't waste time writing tests for trivial degenerate cases that I can handle correctly in my code without even having to think about it.


I'm not a TDD person at all, nor even a "unit test everything" person, but the idea is that by testing even the trivial degenerate cases, you can refactor everything with confidence that you didn't accidentally slip and typo one of them or otherwise fuck up.


> the idea is that by testing even the trivial degenerate cases, you can refactor everything with confidence

The trouble is that on the one hand, you get diminishing returns for each more extreme edge case you include as a test, while on the other hand, unless you have coded up every possible scenario, you can't really refactor with impunity. Something has to give, and in most cases it can only be the confidence in refactoring, because most interesting problem spaces are infinite and it's tough to write a comprehensive test suite for an infinite number of scenarios!

I've found that for many types of project, some degree of automated testing is well worth the trouble. However, the "test everything" mentality seems to breed a dangerous false sense of security. Perhaps worse, in some contexts the "test everything" approach also forces developers to warp otherwise clean and natural designs into a shape where automated test tools can work with them more easily, at the expense of making it harder for people to work with them. I am far from convinced that that particular trade-off is ever worthwhile, and I've always found it a rather odd contradiction that many Agile methodologies supposedly advocate people over process and the like, yet stick to their guns on this one.


The people who believe in test-everything are probably the same people who believe in 100% code coverage.

I think most people, by now, have learned that 100% code-coverage and test-everything are superfluous so there's no point of discussing these two or making a big deal of these two as a problem of subscribing to TDD.

The idea of TDD is to test the very minimum such that the code is proven to work as per the requirements. When a bug is found, write the test first, before you fix the bug. This way, at some point in the life of the software, you'll eventually have enough tests to cover it. I think most people put too much focus on the development story rather than the maintenance story, so most people only explain how to do TDD on new code, not how to do TDD on existing code (or rather, the next phase).

I've been on projects where, because the people behind them did not put much effort into testing, the automation effort started from behind. Eventually you hit a chicken-and-egg situation: we'd like to refactor this buggy part, but the architecture makes it hard to write automated tests.

All professional projects will have automated tests at some point in their life. People by now should already know that software grows, and that hiring more QAs, re-testing everything (regression, smoke, full-blown, etc.), or even telling devs to manually test the code they just wrote doesn't scale.

Keep in mind that sometimes quality is defined by the client (or by the requirements). The client might not ask for superb quality (as long as there is no data corruption), so one probably does not have to write extensive automated tests.


I know. But I find that you get that benefit without having to have tests for each and every degenerate case. I don't even think you need to have tests for every nominal case. Besides, I don't do the whole mocking thing, removing all interaction between different classes. This means that I end up testing common edge cases as a side effect of testing another module....


Uncle Bob is a well-known snake-oil, ermm I mean methodology salesman. That's all you need to know about this article.


He's trying to make our industry a little bit better one programmer at a time. Why throw stones at him? He believes in TDD. He's passionate about it and he shares his knowledge. If he makes money from this (which was probably not his end goal when he decided to learn and share his knowledge), then more power to him.

Not everybody should make money from being a "hardcore (but cowboy) coder".


OK, someone just give me the answer: is this designed as an example of what one should do, or is it a parable of what not to do?

I started reading it earlier today, and gave up once I hit the sexist tone. Then seeing the comments here, I realized that it was just parodying Heinlein. Cuz it's not really sexist if you're quoting, right? Anyway, given all the upvotes I thought there must be something to the article.

In the same way, at the start I thought that ridiculously verbose ad-hoc testing functions were also a parody. Surely his point was going to be that your time is better spent writing a testing framework that reads from a text file rather than writing a 95% duplicate function for each test?
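One way to picture the "framework that reads from a text file" idea is a table-driven runner, where each test is one line of data instead of one near-duplicate function. This is a sketch only: the class name `TableDrivenWrapTests`, the `input|width|expected` line format, and the literal `\n` escape are all assumptions made up for illustration, and the function under test is passed in rather than taken from the article.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

class TableDrivenWrapTests {
    // Each case is one line: input|width|expected.
    // A newline inside a field is written as the two characters backslash-n.
    // Blank lines and lines starting with '#' are skipped.
    // Returns one message per failing case; an empty list means all passed.
    static List<String> run(List<String> cases,
                            BiFunction<String, Integer, String> wrap) {
        List<String> failures = new ArrayList<>();
        for (String line : cases) {
            if (line.trim().isEmpty() || line.startsWith("#")) {
                continue;
            }
            String[] fields = line.split("\\|", -1);
            String input = fields[0].replace("\\n", "\n");
            int width = Integer.parseInt(fields[1].trim());
            String expected = fields[2].replace("\\n", "\n");

            String actual = wrap.apply(input, width);
            if (!expected.equals(actual)) {
                failures.add("FAILED: " + line);
            }
        }
        return failures;
    }
}
```

The case list could come from `java.nio.file.Files.readAllLines` on a text file, so adding a test means adding a line such as `word word|4|word\nword` rather than writing another function.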

But then I got to the end, and I lost my faith. Is he mocking Heinlein or just channeling some unrepentant 60's SciFi sexism? And is he really suggesting that this is really a good way to write software?


I don't think it's a parody, because I can see what he's getting at. If the programmer had written the "simpler" test first, he might have stumbled upon the algorithm as illustrated without the false starts at the beginning, and then it would seem like more of a successful approach.

Then again, I'm not sure how you're supposed to know which tests are the magic ones which will guide you to a solution, and the author conveniently trails off without telling us.

(I'm also not sure what happened to first sitting down with a pen and paper and figuring out the algorithm, instead of praying to the TDD gods that you might magically stub your toe on it.)


> Then again, I'm not sure how you're supposed to know which tests are the magic ones which will guide you to a solution, and the author conveniently trails off without telling us.

Oh ye of little faith: you just do it the same way you know which magic refactorings to apply once your code is working, to turn your test-passing ad-hoc spaghetti into a clean, maintainable design. :-)

> (I'm also not sure what happened to first sitting down with a pen and paper and figuring out the algorithm, instead of praying to the TDD gods that you might magically stub your toe on it.)

Indeed. My current project for one client is building a user interface. I just spent nearly a week writing a design spec -- though some might call it a math paper -- to prove that all those transformations I'm planning to do to get from a user-friendly description of the situation to a hardware-friendly one are actually going to work. The TDD gods would no doubt consider me a heretic, but I suspect that I'll somehow avoid eternal damnation (as long as I write some tests later, of course).


Lots of ways to interpret it, but stripping back the story, it's showing a technique to sharpen your skills and looking at a problem in different ways.

As you say there are definitely better ways to test the actual wrapper class in a real life situation, but showing the test suites one by one gives the reader a chance to see how the narrator approaches the problem and the various situations a word-wrap class would face.

If you were to set your own 'katas' and reproduce similar code from scratch or a framework, I can see it helping some people look at various problems in different ways and look objectively at how they solve them. Every person has their own methods all the same.


I was going to reply to this, but suddenly I flashed back to a lot of Usenet arguments from about fifteen years ago and thought better of it. I quote Tim Skirvin's Godwin's Law faq:

8. Are there any topics that lead directly to Godwin Invocations? Well, yeah. Of course. Case's Corollary to the Law states "if the subject is Heinlein or homosexuality, the probability of a Hitler/Nazi comparison being made becomes equal to one"


A far more entertaining (it's funny because it's true!) variant is

http://ravimohan.blogspot.com/2007/04/learning-from-sudoku-s...


Is this idea of a "Kata" commonplace? I rather like it...



It comes from martial arts. I don't know which martial art specifically, but the word seems Japanese.


A kata is a series of movements in a number of martial arts (also called 'forms' and a million other names depending on the style and origin). You practice the same movements over and over focusing on technique, balance and your state of mind.

As referenced in the story, there are often a number of different types to choose from depending on what you are trying to achieve (relaxation, a specific move/technique, different fight situations). They also generally get harder and more refined as you go up belts/levels.

Never heard of it before in the coding world though, might take it up, definitely looks useful!


"A kata is just a simple program that you write over and over again as a way to practice."

Sounds like a waste of time. I can see how it would be good for a physical activity where you need to train your muscle memory, but I can just as easily train my programming reflexes by doing my job or working on open source. There are so many things you could program, why make up things to program that don't need to be programmed?

Does anyone actually do this? If so, can you describe the ways in which it actually helps you get better?


Great article. I especially love the Heinlein quotation from The Moon is a Harsh Mistress, for hopefully obvious reasons. There is another similar quote from Heinlein's "Have Space Suit Will Travel" (written several years earlier) that is also good programming advice:

"Daddy says that, in a dilemma, it is helpful to change any variable, then reexamine the problem."


Somehow that way of coding makes my head hurt.


While the final recursive solution is nice, it is trivial to split the string on spaces and recombine, inserting newlines when the current line overflows. This is not as elegant a solution, but it took far less time to build that than it did to read the article.
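A minimal sketch of that split-and-recombine approach might look like the following. The class name `GreedyWrap` is hypothetical, and two behaviors are assumptions of this sketch rather than anything stated in the thread: runs of spaces are collapsed, and a word longer than the width is left unbroken on its own line.

```java
class GreedyWrap {
    // Split on spaces, then rebuild the text word by word, starting a new
    // line whenever the next word would push the current line past width.
    static String wrap(String text, int width) {
        StringBuilder out = new StringBuilder();
        int lineLength = 0;
        for (String word : text.split(" ")) {
            if (word.isEmpty()) {
                continue;                    // collapse runs of spaces
            }
            if (lineLength > 0 && lineLength + 1 + word.length() > width) {
                out.append('\n');            // word does not fit: new line
                lineLength = 0;
            } else if (lineLength > 0) {
                out.append(' ');             // word fits: separate with a space
                lineLength++;
            }
            out.append(word);
            lineLength += word.length();
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(wrap("the quick brown fox", 10));
    }
}
```

With a width of 10, `the quick brown fox` comes back as `the quick` and `brown fox` on separate lines. Because it loops instead of recursing, the stack depth does not grow with the length of the input.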


I'm a bit surprised to find that Uncle Bob is still writing installments of "Software Craftsman"-- I can't even remember how many different defunct software development magazines I used to follow the series in...


My martial-arts self died a little at the Kata word used for such things.


Let's see. A short, pointless, repetitive exercise, done for its own sake and with very little practical application or relevance. Sounds like a kata to me.



I feel pain when I see the word "recurse."


Sure uses a lot of stack.



