Is TDD Dead? (2014) (martinfowler.com)
445 points by cik on Aug 26, 2020 | 476 comments



The best talk on this topic, IMO is Ian Cooper: "TDD, Where did it all go wrong?" https://www.youtube.com/watch?v=EZ05e7EMOLM

Couple of notes:

- TDD, much like scrum, got corrupted by the "Agile Consulting Industry". Sticking to the original principles as laid out by Kent Beck results in fairly sane practices.

- When people talk about "unit tests", a unit doesn't refer to the common pattern of "a single class". A unit is a piece of the software with a clear boundary. It might be a whole microservice, or a chunk of a monolith that is internally consistent.

- What triggers writing a test is what matters. Overzealous testers test for each new public method in a class. This leads to testing on implementation details of the actual unit, because most classes are only consumed within a single unit.

- Behavior driven testing makes most sense to decide what needs tests. If it's required behavior to the people across the boundary, it needs a test. Otherwise, tests may be extraneous or even harmful.

- As such, a good trigger rule is "one test per desired external behavior of the unit, plus one test per bug fixed". The test for each bugfix comes from experience -- they delineate tricky parts of your unit and enforce working code around them.
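A minimal sketch of that trigger rule (the cart/Total names here are invented for illustration, not from the talk):

```go
package cart

import "testing"

// Total is a hypothetical stand-in for the unit under test: it sums item
// prices and applies a flat 10% tax.
func Total(prices []int) int {
	sum := 0
	for _, p := range prices {
		sum += p
	}
	return sum + sum/10
}

// One test per desired external behavior of the unit.
func TestTotalAppliesTax(t *testing.T) {
	if got := Total([]int{100}); got != 110 {
		t.Fatalf("Total = %d, want 110", got)
	}
}

// Plus one test per bug fixed, pinning down the tricky case that once broke.
func TestTotalOfEmptyCartIsZero(t *testing.T) {
	if got := Total(nil); got != 0 {
		t.Fatalf("Total = %d, want 0", got)
	}
}
```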


If I recall correctly, another very important point he makes in that talk is that it's fine to delete tests. TDD tends to result in a lot of tests being created as a sort of scaffolding to support initial development. The thing about scaffolding is, when you're done using it, you tear it down.

I don't think he mentions it during the talk, but the next step after deleting all those tests is a little bit more refactoring for maintainability. Now that you've deleted all the redundant tests, you can then guard against future developers unwittingly becoming tightly coupled to your implementation details, by taking all the members that used to only be exposed for testing purposes, and either deleting them or making them private.


Yes, you should delete tests for everything that isn't a required external behavior, or a bugfix IMO.

Otherwise you're implicitly testing the implementation, which makes refactoring impossible.

A big smell here is if the large majority of your tests are mocked. This might mean you're testing at too fine-grained a level.


> you should delete tests for everything that isn't a required external behavior

Wait, I'm terribly confused here.

Aren't a huge part of tests to prevent regression?

In attempting to fix a bug, that could cause another "internal" test to fail and expose a flaw in your bugfix that you wouldn't have caught otherwise. And it's not uncommon for your flawed bugfix to not cause an "external" test to fail, because it's related to the codepath there was never a good enough external test for in the first place -- hence why the bug existed.

I can't imagine why you would ever delete tests prematurely. I mean, running internal tests is cheap. I see zero benefit for real cost.

And not only that, when devs don't document the internal operation of a module sufficiently, keeping the tests around serves as at least a kind of minimal reference of how things work internally, to help with a future dev trying to figure it out.

If you're refactoring an implementation, then obviously at that point you'll delete the tests that no longer apply, and replace them with new ones to test your refactored code. But why would you delete tests prematurely? What's the benefit?


> In attempting to fix a bug, that could cause another "internal" test to fail and expose a flaw in your bugfix that you wouldn't have caught otherwise.

If an external test passes and an internal test fails, the external test isn't really adding any value, is it? And if the root of your issue is "What if test A doesn't test the right things", doesn't the whole conversation fall apart (because then you have to assume that about every test)?

IME this is a common path most shops take. "We have to write tests in case our other tests don't work." Which is a pretty bloated and wildly inefficient answer to "Our tests sometimes don't catch bugs." Write good tests, manage and update them often. Don't write more tests to accommodate other tests being written poorly.

> I mean, running internal tests is cheap.

Depends on your definition of cheap, I guess.

My last job was a gigantic rails app. Over a decade old. There were so many tests that running the entire suite took ~3 hours. That long of a gap between "Pushed code" and "See if it builds" creates a tremendous amount of problems. Context switching is cost. Starting and unstarting work is cost.

I'm much more of the "Just Enough Testing" mindset. Test things that are mission critical and complex enough to warrant tests. Go big on system tests, go small on unit tests. If you can, have a different eng write tests than the eng that wrote the functionality. Throw away tests frequently.


I understand what you're saying, but in my experience that's not very robust.

I've often found that an internal function might have a parameter that goes unused in any of the external tests, simply because it's too difficult to devise external tests that will cover every possible internal state or code path or race condition.

So the internal tests are used to ensure complete code coverage, while external tests are used to ensure all "main use cases" or "representative usage" work, and known frequent edge cases.

That doesn't mean the external tests aren't adding value -- they are. But sometimes it's just too difficult to set up an external test to guarantee that a deep-down race condition gets triggered in a certain way, but you can test that explicitly internally.

It's not that anyone is writing tests poorly, it's just that it simply isn't practically feasible to design external tests that cover every possible edge case of internal functionality, while internal tests can capture much of that.

And if your test suite takes 3 hours to run, there are many types of organizational solutions for that... but this is the first I've ever heard of "write fewer tests" being one of them.


> I've often found that an internal function might have a parameter that goes unused in any of the external tests,

It seems that you're still thinking about "code". What if you thought about "functionality"? If an external test doesn't test internal functionality, what is it testing?

> But sometimes it's just too difficult to set up an external test to guarantee that a deep-down race condition gets triggered in a certain way, but you can test that explicitly internally.

I would argue that if you're choosing an orders of magnitude worse testing strategy because it's easier, your intent is not to actually test the validity of your system.

> while internal tests can capture much of that.

We can agree to disagree.

> And if your test suite takes 3 hours to run, there are many types of organizational solutions for that... but this is the first I've ever heard of "write fewer tests" being one of them.

I was speaking about a real scenario that features a lot of the topics that you're describing. My point was not that it was good, my point was that testing dogmatism is very real and has very real costs. To describe writing/running lots of (usually unnecessary) tests as "cheap" is a big red flag.


Not the poster you replied to, but I've been thinking of it lately in a different way. Functional tests show that a system works, but if a functional test fails, the unit test might show where/why.

Yes, you'll usually get a stack trace when a test fails, but you might still spend a lot of time tracing exactly where the logical problem actually was. If you have unit tests as well, you can see that unit X failed, which is part of function A. Therefore you can fix the problem quicker, at least for some set of cases.


It is a combinatorial explosion problem.

Internal code A has 5 states, piece B has 8 states.

Testing them individually requires 13 tests.

Testing them from the outside requires 5x8=40 tests.

Now, if you think of it that way, maybe you _do_ want to test the combinations, because that might be a source of bugs. And if you do it well, you don't actually need to write 40 tests; you can have some mechanism to loop through them.
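A rough sketch of that looping mechanism in Go (Combined and its two state ranges are invented for illustration; the expected value would normally come from a spec or a table, not from re-stating the code):

```go
package combo

import (
	"fmt"
	"testing"
)

// Combined is a hypothetical stand-in for the externally visible behavior
// built from piece A (5 states) and piece B (8 states).
func Combined(a, b int) bool { return a < 3 || b%2 == 0 }

// One test function loops through all 5x8 = 40 combinations instead of
// writing 40 separate tests.
func TestCombinedAllStates(t *testing.T) {
	for a := 0; a < 5; a++ {
		for b := 0; b < 8; b++ {
			t.Run(fmt.Sprintf("a=%d/b=%d", a, b), func(t *testing.T) {
				want := a < 3 || b%2 == 0 // in real code: taken from the spec
				if got := Combined(a, b); got != want {
					t.Errorf("Combined(%d, %d) = %v, want %v", a, b, got, want)
				}
			})
		}
	}
}
```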

But the basic argument is that the complexity of the 40 test-cases is actually _more_ than the 13 needed testing the internal parts as units.

FWIW, my own philosophy is to write as much pure-functional, side-effect free code that doesn't care about your business logic as possible, and have good coverage for those units. Then compose them into systems that do deal with the messy internal state and business-logic if statements that tend to clutter real systems, and ensure you have enough testing to cover all branching statements, but do so from an external-to-the-system perspective.


I've got the impression that you are both talking slightly past each other.

At least my impression is that these "internal tests" you talk about are valid unit tests -- but not for the same unit. We build much of our logic out of building blocks, which we also want to be properly tested, but that doesn't mean we have to re-test them on the higher level of abstraction of a piece of code that composes them.

From that thought, it's maybe even a useful "design smell" you could watch out for if you encounter this scenario (in that you could maybe separate your building blocks more cleanly if you find yourself writing a lot of "internal" tests)?


Isn't the idea behind unit testing being forgotten here? The point is to validate the blocks you build and use to build the program. To make sure you've done each block right, you test them, manually or automated... automated testing is just generally soooo much easier. If you work like that, and don't only add tests after you've written large chunks of code, you should have constructed your program so that there's no overhead in the tests. An advanced test that does lots of setup and advanced calculations generally isn't the test's fault, but the fault of the code that requires that complexity to be tested.

Wanna underline here that system tests are slow, unit tests are fast.

This said, I agree that you should throw away tests in a similar fashion as you do code. When a test no longer makes sense, don't be afraid to throw it away, but keep enough left to define the function of the code, in a documenting way. Let the code/tests speak! :D


> Wanna underline here that system tests are slow, unit tests are fast.

System tests are slow but, in my experience, are far far far more valuable medium and long term. Unit tests are fast and relatively unhelpful.


Imo the value of unit tests is partially a record for others to see "hey look, this thing has a lot of its bases covered".

Especially if you're building a component that is intended to be reused all over the place, would anyone have confidence in reusing it if it wasn't at least tested in isolation?


If the test suite took hours, couldn't part of the problem be that a lot of those tests should have been more focused unit tests? With small unit tests and mocking, you could run millions of tests in 3 hours.


There were all kinds of problems with the test suite that could've been optimized. The problem was that there were too many to manage, and that deleting them was culturally unacceptable.

Lots of them made real DB requests. It's hard to get a product owner to justify having devs spend several months fixing tests that haven't been modified in 9 years.


If it can cause a regression, it's not internal. My rule of thumb is "test for regression directly", meaning a good test is one that only breaks if there's a real regression. I should only ever be changing my unit tests if the expected behavior of the unit changes, and in proportion to those changes.


This is wrong.

A well-known case is the Timsort bug, discovered by a program verification tool. Also well known is the JDK binary search bug that had been present for many years. (This paper discusses the Timsort bug, and references the binary search bug: http://envisage-project.eu/proving-android-java-and-python-s...)

In both cases, you have an extremely simple API, and a test that depends on detailed knowledge of the implementation, revealing an underlying bug. Obviously, these test cases, when coded, reveal a regression. Equally obviously, the test cases do test internals. You would have no reason to come up with these test cases without an incredibly deep understanding of the implementations. And these tests would not be useful in testing other implementations of the same interfaces, (well, the binary search bug test case might be).

In general, I do not believe that you can do a good job of testing an interface without a good understanding of the implementation being tested. You don't know what corner cases to probe.


Using implementation to guide your test generation ("I think my code might fail on long strings") is fine, even expected. Testing private implementation details ("if I give it this string, does the internal state machine go through seventeen steps?") is completely different.


Sure.


That's not what he's saying. He's saying the test should measure an externally visible detail. In this case that would be "is the list sorted". This way the test will still pass without maintenance if the sorting algorithm is switched again in the future. You can still consider the implementation to create antagonistic test cases.
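For instance, a hedged sketch of that approach for sorting: the assertions are only about externally visible properties, while the run-heavy input shape is chosen with a merge-sort-style implementation in mind.

```go
package sorttest

import (
	"math/rand"
	"sort"
	"testing"
)

func TestSortOnRunHeavyInput(t *testing.T) {
	// Adversarial input informed by the implementation: many pre-sorted runs,
	// the kind of shape a Timsort-like algorithm treats specially.
	in := make([]int, 0, 10000)
	for run := 0; run < 100; run++ {
		start := rand.Intn(1000)
		for i := 0; i < 100; i++ {
			in = append(in, start+i)
		}
	}

	got := append([]int(nil), in...)
	sort.Ints(got) // the unit under test; swap the algorithm and this test still applies

	// Assertions are on the external contract only: sorted output of the same
	// length (a fuller check would also verify it is a permutation of the input).
	if !sort.IntsAreSorted(got) {
		t.Fatal("output is not sorted")
	}
	if len(got) != len(in) {
		t.Fatalf("output has %d elements, want %d", len(got), len(in))
	}
}
```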


One of my colleagues helped find the Timsort bug and recently another such bug (might be the Java binary search, don't remember).

The edge case to show a straightforward version of that recent bug basically required a supercomputer. The artifact evaluation committee complained even.

So you can try to test for that only based on output. But it's gigantically more efficient to test with knowledge of internals.


this sounds like a case where no amount of unit testing ever would've found the bug. someone found the bug either through reasoning about the implementation or using formal methods and then wrote a test to demonstrate it. you could spend your entire life writing unit tests for this function and chances are you would never find out there was an issue. i'd say this is more of an argument for formal methods than it is for any approach to testing.


But once you've found the bug, you'd like to add a test case that prevents regression - a test case that doesn't require a supercomputer.

That might not always be possible - but if it is, the test would be based on implementation details.


One doesn't need detailed knowledge of the implementation: if a given initial state creates invalid output, then we can write a test for that. Though yes, having knowledge of the implementation allows you to define the state that produces the invalid result.


> If it can cause a regression, it's not internal

Fair enough. And how do you know, before causing a regression, whether your test could detect one? In other words, how can you tell beforehand whether your test checks something internal or external?


"External" functionality will be behavior visible to other code units or to users. If you have a sorting function, the sorted list is external. The sorting algorithm is internal. Regression tests are often used in the context of enhancements and refactorings. You want to test that the rest of the program still behaves correctly. Knowing what behavior to test is specific to the domain and to the technologies used. You can ask yourself, "how do I know that this thing actually works?"


Isn’t the point that internal functions often have a much smaller state space than external functions, so it’s often easier to be sure that the edge cases of the internal functions are covered than that the edge cases of the external function are covered?

So, having detailed tests of internal functions will generally improve the chances that your test will catch a regression.


> Isn’t the point that internal functions often have a much smaller state space than external functions

That's the general theory, and why people recommend unit tests instead of only the broader possible integration tests. But things are not that simple.

Interfaces do not only add data, they add constraints too. And constraints reduce your state space. You will want to cut your software at the smallest possible interface complexity you can find and test those pieces; those pieces are what people originally called "units". You don't want to test any high-complexity interface; those tests will harm development and almost never give you any useful information.

It's not even rare that your units end up being vertical cuts through your software, so you'll end up with only integration tests.

The good news is that this kind of partition is also optimal for understanding and writing code, so people have been practicing it for ages.


I agree that they would help in the regression testing process, especially in diagnosing the cause. However, I think those are usually just called "unit" tests, not "regression" tests. For instance, the internal implementation of a feature might change, requiring a new, internal unit test. The regression test would be used to compare the output of the new implementation of the feature versus the old implementation of the feature.


Having regression tests greatly improves your chances of catching a regression.


Worth noting that performance is an externally visible feature. You shouldn't be testing for little performance variations, but you probably should check for pathological cases (e.g. takes a full minute to sort this particular list of only 1000 elements).
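A hedged sketch of such a guard in Go; the input shape and the threshold are made up, and the bound is deliberately generous so that only a pathological regression (not normal timing variation) trips it:

```go
package sorttest

import (
	"sort"
	"testing"
	"time"
)

func TestSortPathologicalInputStaysFast(t *testing.T) {
	// The particular shape that once triggered the slowdown; reverse-sorted
	// here purely as an illustration.
	in := make([]int, 1000)
	for i := range in {
		in[i] = len(in) - i
	}

	start := time.Now()
	sort.Ints(in)

	// Generous bound: we only care about catching "a full minute for 1000
	// elements", not millisecond-level noise.
	if elapsed := time.Since(start); elapsed > time.Second {
		t.Fatalf("sorting 1000 elements took %v, expected well under 1s", elapsed)
	}
}
```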


> "how do I know that this thing actually works?"

Agreed, but how do you know your test tests this? Or to re-phrase it: why would you even write a test that doesn't test this?


For bugfixes, I just write a failing test.

For features, I need to take the time to think of required behavior. If I just focus on the implementation, the tests add no documentation and I'm not forced through the exercise of thinking about what matters.


> If I just focus on the implementation [...]

Agreed, but why would you even write those tests to begin with?


> Aren't a huge part of tests to prevent regression?

Just a quibble: I would argue that a huge benefit of tests is preventing regression, but that's a very small part of the value of tests.

The main value I get out of tests is informing the design of the software under test.

* Tests are your straight-edge as you try to draw a line.

* They're your checklist to make sure you've implemented all the functionality you want.

* They're your double-entry bookkeeping to surface trivial mistakes.

But I think I mostly agree with your point. I delete tests that should no longer pass (because some business logic or implementation details are intentionally changing). I will also delete tests that I made along the way when they're duplicating part of a better test. If a test was extremely expensive to run, I suppose I might delete it. But in that case I would look for a way to cover the same logic in tests of smaller units.


All legitimate tests are[0] regression tests. TDD, to the extent that it's actually useful, is the notion that sometimes the bug being regression-tested is a feature request.

Edit: 0: I guess "can be viewed as" if you want to be pedantic.


> Aren't a huge part of tests to prevent regression?

Depends on the kind of tests. Old school "purist" unit tests are meant to help you verify the correctness of the code as you're writing it. Preventing regressions is better left to integration tests and E2E tests, or smoke tests. Alternatively to "unit tests" if your definition of "unit" is big enough (in which case it only works within the unit).

It's totally fine and common to write unit tests that are not meant to catch bugs of significant refactors. If you do it right, they should be so easy to author that throwing them away shouldn't matter.


Integration, E2E, and smoke tests are generally slow, flakey, hard to write. They should not cover/duplicate all the cases your unit tests cover.

They are good at letting you know all your units are wired up and functioning together. In all the codebases I've ever worked in, I would feel way more comfortable deleting them vs deleting the unit tests.


> Integration, E2E, and smoke tests are generally slow, flakey, hard to write.

This is not really true anymore in a modern system.

I can spin up an entire cluster to mirror prod - including databases and all - and run approx 10k integration tests all in under 5 minutes.


Why would you want to, when the same unit test coverage will run in under 1 minute, with smaller, easier to understand/change tests, and can all be done on your laptop?

It all depends on your definition of unit/integration; what I am talking about as unit tests you may very well be talking about as integration tests...

one of the main points I was making is you shouldn't have significant duplication in test coverage and if you do, I'd much rather stick with the unit tests and delete the others.


> Why would you want to?

Because they catch more bugs than unit tests, are easier for our product team to understand, and rarely break when refactoring.

Even a simple business flow like registering a new user will touch half a dozen systems.

5 or 6 integration tests can cover this flow far better than 100 unit tests.

> and be smaller easier to understand/change tests

That’s not my experience at all.

Unit tests are generally much harder to understand and need to be changed much more frequently.

Where unit tests help in my experience is:

A) in pinpointing where in a complex bit of logic the bugs are.

B) for generic libraries and building blocks where you don’t know exactly how your users will actually use them.


> Unit tests are generally much harder to understand and need to be changed much more frequently.

Changed more frequently, yes.

Harder to understand is usually because they're not-quite-unit-tests-claiming-to-be.

Eg: a test for a function that mocks some of its dependencies but also does shenanigans to deal with some global state without isolating it. So you get a test that only tests the unit (if that), but has a ton of exotic techniques to deal with the globals. Worst of all worlds.

Proper unit tests are usually just a few lines long, with little to no abstraction, and test code you can see in the associated file without dealing with code you can't see without digging deeper.


Yes. I believe you shouldn't delete tests.

After all, one of the best reasons for tests is to be able to refactor code confidently.


If you can refactor (make a commit changing only implementation code, not touching any test code) and the tests still pass then you’re probably fine.

If you’re changing tests as you change the code you’re not refactoring. You have zero confidence that your changed behaviour and changed test didn’t introduce an unintended behaviour or regression.

So many developers miss this in my experience.


if you can refactor without touching your tests and your tests still compile afterwards either the refactor was extremely trivial and didn't change any interfaces or you only had end to end tests.


I think the point is that if you have to change a test to make it pass or run after refactoring, it is not useful as a regression test. By changing it you might have broken the test itself so you have less confidence.

There is also the question of what a unit is. If you test (for example) the public interface of a class as a black box unit, you can refactor your class internals as much as you want and your tests don't need to change. You have high confidence you've done it correctly. At this point adding more fine-grained tests inside the class seems like more of a compliance activity than one that actually increases confidence, since you probably would've had to change a bunch of them to make them work again anyway.


Personally the way I'd phrase it is you need to refactor your tests just like you'd refactor the app code, but even looking at doing that independent of any app code refactoring.


Agreed. I would take an even stronger position, and say that a high degree of mocking actually implies two things: First, yes, you're testing at too fine-grained a level. Second, it's a code smell that suggests you may be working with a fundamentally untestable design that relies overmuch on opaque, stateful behavior.


Mocks are worthwhile though. Otherwise you end up not being able to unit test anything which accesses an external API such as databases, REST services etc.


IMO, the database is often an integral part of the program and should be part of the test (a real database in a docker image).

For instance, if you are not relying on a unique constraint in the DB to implement idempotency you are probably doing something wrong, and if you are not testing idempotent behaviour you are probably doing something wrong.
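A hedged sketch of what testing that idempotent behaviour against a real database might look like; the table, DSN environment variable, and column names are all made up, and the test assumes a disposable Postgres instance (e.g. from a docker container) is reachable:

```go
package payments

import (
	"database/sql"
	"os"
	"testing"

	_ "github.com/lib/pq" // Postgres driver; any real database would do
)

func TestCreatePaymentIsIdempotent(t *testing.T) {
	db, err := sql.Open("postgres", os.Getenv("TEST_DATABASE_URL"))
	if err != nil {
		t.Fatal(err)
	}
	defer db.Close()

	// The idempotency guarantee comes from a UNIQUE constraint on
	// idempotency_key, so the same request arriving twice inserts one row.
	const insert = `INSERT INTO payments (idempotency_key, amount)
	                VALUES ($1, $2) ON CONFLICT (idempotency_key) DO NOTHING`
	for i := 0; i < 2; i++ {
		if _, err := db.Exec(insert, "key-123", 4200); err != nil {
			t.Fatal(err)
		}
	}

	var n int
	err = db.QueryRow(`SELECT count(*) FROM payments WHERE idempotency_key = $1`, "key-123").Scan(&n)
	if err != nil {
		t.Fatal(err)
	}
	if n != 1 {
		t.Fatalf("got %d rows for one idempotency key, want 1", n)
	}
}
```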


I have plenty of tests actually calling the database. Maybe it is not a proper unit test, but it is not a problem for me.

I want tests to protect me against regressions, and I do not really care how they are classified.


It really depends on your definition of unit. In the London school of TDD, no, a unit cannot extend across an I/O boundary. The classicist school takes a more flexible, pragmatic approach.


I'd love to know the story behind "the London school".



Intriguing. Thanks!


You mean fakes/stubs, right? Unless you're testing whether you're correctly implementing the protocol exchange with an external party, you don't need to record the API calls.


How do you test your mocks?


How do you test the tests that are testing your mocks? That said, verifying mocks are a great help - they won't let you mock methods that don't exist on the real object.

Some mocking libraries, like the VCR library in Ruby, can be turned off every now and then so your tests hit real endpoints. It is worth doing from time to time.


> A big smell here is if the large majority of your tests are mocked.

We fell into this hard for a few years at my work.

Going back to any of that code is a nightmare because the test suites are so fragile.

Testing behavior and setting up as much of the system as possible leads to much better results in my experience.


Bertrand Meyer had the right of it, but I had to figure this out myself before I ever saw him quoted on the subject.

Me:

Code that makes decisions has branches. Branches require combinatoric tests.

Code with external actions requires mocks.

Therefore:

Code that makes decisions and calls external systems requires combinatorics for mocks.

Bertrand, more (too?) concisely:

Separate code that makes decisions from code that acts on them.

Follow this pattern to its logical conclusions, and most of your mocks become fixtures instead. You are passing in a blob of text as an argument instead of mocking the code that reads it from the file system. You are looking at a request body instead of mocking the PUT function in the HTTP library.

The tests of external systems are much fewer, and tend to be testing the plumbing and transportation of data. If I give you a response body do you actually propagate it to the http library? And even here, spies and stubs are simpler than full mocks.
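A small sketch of what that separation tends to look like (the report/ParseTotal names are invented): the decision code takes a plain string, so the test feeds it a fixture instead of mocking the file system or the HTTP library.

```go
package report

import (
	"fmt"
	"strconv"
	"strings"
	"testing"
)

// ParseTotal is the "decision" code: pure, no I/O. The caller that reads the
// file or makes the HTTP request is a thin "action" layer tested separately.
func ParseTotal(data string) (int, error) {
	total := 0
	for _, line := range strings.Split(strings.TrimSpace(data), "\n") {
		fields := strings.Split(line, ",")
		if len(fields) != 2 {
			return 0, fmt.Errorf("malformed line %q", line)
		}
		n, err := strconv.Atoi(strings.TrimSpace(fields[1]))
		if err != nil {
			return 0, fmt.Errorf("bad amount in %q: %w", line, err)
		}
		total += n
	}
	return total, nil
}

// The test passes a blob of text as an argument; no mocks needed.
func TestParseTotal(t *testing.T) {
	fixture := "books,1\npens,2\npaper,3\n"
	got, err := ParseTotal(fixture)
	if err != nil {
		t.Fatal(err)
	}
	if got != 6 {
		t.Fatalf("ParseTotal = %d, want 6", got)
	}
}
```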


I used this strategy when developing a client library for a web socket API. It was hugely helpful. I could just include the string of the response in my tests, instead of needing a live server or even a mock server for testing. Tests were much simpler to write and faster to execute.


This is great until the API response changes and you have to painstakingly update all your string fixtures to match.


One would argue that you should change your string fixtures to match and verify that the new API response doesn't break anything with your existing API client. Then you change the API client and verify that all the old tests still work as expected.

Better yet is if you keep the old fixtures and the new fixtures and ensure that your API client doesn't suddenly throw errors if the API server downgrades to before the new field was added.


The mocks have the same fixtures, plus a bunch of plumbing you have to sort out with every major and some minor version number upgrades.

You pay a bigger tax on the mocks down the road.


> Yes, you should delete tests for everything that isn't a required external behavior, or a bugfix IMO.

For the edification of junior programmers who may end up reading this thread, I’m just going to come right out and say it: this is awful advice in general.

For situations where this appears to be good advice, it’s almost certainly indicative of poor testing infrastructure or poorly written tests. For instance, consider the following context from the parent comment:

> Otherwise you're implicitly testing the implementation, which makes refactoring impossible.

> A big smell here is if the large majority of your tests are mocked. This might mean you're testing at too fine-grained a level.

These two points are in conflict and help clarify why someone might just give up and delete their tests.

The argument for deleting tests appears to be that changing a unit’s implementation will cause you to have to rewrite a bunch of old unrelated tests anyway, making refactoring “impossible.” But indeed that’s (almost) the whole point of mocking! Mocking is one tool used for writing tests that do not vary with unrelated implementations and thus pose no problem when it comes time to refactor.

Now there is a kernel of truth about an inordinate amount of mocking being a code smell, but it’s not about unit tests that are too fine-grained but rather unit tests that aren’t fine-grained enough (trying to test across units) or just a badly designed API. I usually find that if testing my code is annoying, I should revisit how I’ve designed it.

Testing is a surprisingly subtle topic and it takes some time to develop good taste and intuition about how much mocking/stubbing is natural and how much is actually a code smell.

In conclusion, as je42 said below:

> Make sure your tests run (very) fast and are stable. Then there is little cost to pay to keep them around.

The key, of course, is learning how to do that. :)


Did you ever actually refactor code with a significant test suite written under heavy mocking?

The mocking assumptions generally end up re-creating the behavior that causes the ossification. Lots of tests simply mock 3 systems to test that the method calls the 3 mocked systems with the proper API -- in effect testing nothing, while baking lower-level assumptions into tests for people refactoring what actually matters.

You might personally be a wizard at designing code to be beautifully mocked, but I've come across a lot of it and most has a higher cost (in hampering refactoring, reducing readability) than benefit.


> Did you ever actually refactor code with a significant test suite written under heavy mocking?

I have. The assumptions you make in your code are there whether you test them or not. Better to make them explicit. This is why TDD can be useful as a design tool. Bad designs are incredibly annoying to test. :)

For example if you have to mock 3 other things every time you test a unit, it may be a good sign that you should reconsider your design not delete all your tests.


It sounds like your argument is “software that was designed to be testable is easy to test and refactor”.

I think a lot of the gripes in the thread are coming from folks who are in the situation where it’s too late to (practically) add that feature to the codebase.


Mocks allow you to test that a certain method was called, with certain parameters, and in a certain order.

That's extreme test implementation coupling.

Most don't use those features, but in my experience mocks indicate implementation coupling.


You seem to think the rationale is testing performance; but from the GP it seems that the rationale is avoiding tests that ossify implementation details and hinder refactoring, rather than protecting external behavior to support refactoring.


I think you wrote this before I finished elaborating on my comment. :)


> Mocking is one tool used for writing tests that do not vary with unrelated implementations

What if I chose the wrong abstractions (coupling things that shouldn't be coupled and splitting things in the wrong places) and have to refactor the implementation to use different interfaces and different parts?

All the tests will be testing the old parts using the old interfaces and will all break.


So today I have been writing a lexer and parser. The public interface is the parser, the lexer isn't exposed.

The problem is if I delete all the tests for the lexer then any bugs in the lexer will only get exposed through the parser's tests.

This makes no sense to me.


The lexer is a unit then.

The lexer has a clear boundary from the parser.

The issue that takes experience here is how to determine what's a unit. "The whole program" is obviously too big. "every public method or function" is obviously too small.

Just be pragmatic.


> "The whole program" is obviously too big.

Of course.

> "every public method or function" is obviously too small.

Why "obviously"? If it's public, someone outside the class can call it. That's an external behavior.


If the class is only consumed in the context of one code unit (module, service, whatever) then the class itself is an implementation detail.


"every" being the operative word.

The "feel" for good code that comes with experience is not reducible in practice to a set of black-and-white rules.


Ideally, the lexer should be a system in of itself, exposing a public interface that is consumed by its client, the parser.

Public doesn't necessarily mean "not you".


Even if your code never graduates to being used by multiple teams in your project or on others, “You” can turn into “you and your mentee” anyway, if you’re playing your cards right.


Or more trivially, "you and you half a year from now".


Every feature of the lexer should be testable through test cases written in the syntax of the language. That includes handling of bad lexical syntax also. For instance, a malformed floating-point constant or a string literal that is not closed are testable without having to treat the lexer as a unit. It should be easy to come up with valid syntax that exercises every possible token kind, in all of its varieties.

For any token kind, it should be easy to come up with a minimal piece of syntax which includes that token.

If there is a lexical analysis case (whether a successful token extraction or an error) that is somehow not testable through the parser, then that is dead code.
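A hedged sketch of that style of test: Parse is a hypothetical entry point for the combined lexer+parser, and each lexical error case is reached through the public parsing interface with a minimal piece of source text.

```go
// Assumes a package-level func Parse(src string) exposed by the parser that
// returns an error for syntactically invalid input; the lexer stays private.
func TestLexicalErrorsSurfaceThroughParser(t *testing.T) {
	cases := []struct {
		name string
		src  string
	}{
		{"unterminated string literal", `x = "abc`},
		{"malformed float constant", `x = 1.2.3`},
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			if _, err := Parse(tc.src); err == nil {
				t.Errorf("Parse(%q) succeeded, want a syntax error", tc.src)
			}
		})
	}
}
```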

The division of the processing of a language into "parser" and "lexer" is arbitrary; it's an implementation detail which has to do with the fact that lexing requires lookahead and backtracking over multiple characters (and that is easily done with buffering techniques), whereas the simplest and fastest parsing algorithms like LALR(1) have only one symbol of lookahead.

Parsers and lexers sometimes end up integrated, in that the lexer may not know what to do without information from the parser. For instance a lex-generated lexer can have states in the form of start conditions. The parser may trigger these. That means that to get into certain states of the lexer, either the parser is required, or you need a mock up of that situation: some test-only method that gets into that state.

Basically, treating the lexer part of a lexer/parser combo as public interface is rarely going to be a good idea.


> For any token kind, it should be easy to come up with a minimal piece of syntax which includes that token.

There is the problem: any test that fails because of the lexer now has to reach down through the parser to the lexer. The test is too far away from the point of failure. I'll now spend my time trying to understand a problem that would have been obvious when the lexer was being tested directly.

>Basically, treating the lexer part of a lexer/parser combo as public interface is rarely going to be a good idea.

This is part of the original point: the parser is the public interface, which is why the OP was suggesting it should be the only contact point for the tests.


When a test fails, your understanding is informed by the nature of the code change that is responsible.

If you keep code changes small, and keep tests working, you're good.


Lexer/Parsers are one of the few software engineering tasks I do routinely where it's self evident that TDD is useful and the tests will remain useful afterwards.


Indeed! I recall a lexer and parser built via TDD with a test suite that specified every detail of a DSL. A few years later, both were rewritten completely from scratch while all the tests stayed the same. When we got to passing all tests, it was working exactly as before, only much more efficiently.

From that experience, I would say that in some contexts, tests shouldn't be removed unless what it's testing is no longer being used.


So what?

If you have a good answer to that, then the lexer is separate (as others said). If you don't, then write parser tests for the lexer so that you can more easily refactor the interface between them.

There is no one right answer, only trade-offs. You need to make the right decision for you. (Though I will note that there is probably a good reason parse and lex are generally separated, and that probably means the best tradeoff for you is that they are separate. But if you decide differently you are not necessarily wrong.)


> So what?

Well, I was responding to "you should delete tests for everything that isn't a required external behavior".


If bugs in the lexer never cause the parser to fail for any possible input, does it really have bugs? ;-)

Or, as @VHRanger pointed out, the lexer can be considered a unit and be tested independently.


Sounds like your lexer has a public interface to you.


these are rules of thumb, not laws


The trouble is they get presented as laws and then some jobsworth will make damn sure you are following the rules.


Rules of thumb are just low-fidelity windows allowing glimpses of poorly researched, not yet understood laws.


I’ve watched this play out a few times with different teams and different code bases (eg, one team two projects).

Part of the reason existing tests lock in behavior and prevent rework/new features is that the tests are too complicated. Complicated tests were expensive to write. Expense leads to sunk cost fallacy.

I’ve watched a bunch of people pair up for a day and a half trying to rescue a bunch of big ugly tests that they could have rewritten solo and in hours if they understood them, learn nothing, and do the same thing a month later. The same people had no problem deleting simple tests and replacing them with new ones when the requirements changed.

Conclusions:

- the long term consequences of ignoring the advice of writing tests with one action and one assertion are outsized and underreported.

- change your code so your tests don’t need elaborate mocks

- choose a test framework that supports setup methods

- choose a framework that supports custom/third party assertions, sometimes called matchers. You won’t use this often, but when you do, you really do.


> Otherwise you're implicitly testing the implementation, which makes refactoring impossible.

Red-green refactoring isn't, and shouldn't be, a goal of unit testing. Integration and E2E tests provide that. Unit tests are mostly about making sure the individual pieces work as you author them, as well as implicitly documenting the intent of those individual pieces.

If done properly, they're always quick/easy/cheap to author, and thus are throwaway. When you refactor significantly (more than the unit), you just throw them away and write new ones (at which point their only goal is for you to understand the intent of the code you were shuffling around, and making sure you're breaking what you expected to break). Delete, rewrite.

People are resistant to getting rid of unit tests when they did complex integration tests that took forever to write instead. So the tests feel like they were wasted effort. Those tests are totally valuable, in this case for things such as red green refactoring, but then yes, you have to carefully pick and choose what you're testing to avoid churn.


I would also test implementation details that are legitimately complicated and might fail in subtle ways, or where the intended behavior isn't obvious.

If I've implemented my own B+ tree, for example, you better bet your butt I'll be keeping some property tests to document and verify that it conforms to all the necessary invariants.


+1

There are two audiences for tests.

1. Your current self, who is trying to ensure you're going in the right direction.

2. A future dev, who wants to understand the relevant info about what to expect from a module.


Tests took work to produce and provide some sort of information.

It seems foolhardy to start off a process by throwing away information which could inform it.

Not having tests which cover the implementation makes refactoring impossible if the goal of refactoring is to preserve certain salient aspects of the implementation, rather than uproot it entirely.

Why not just start refactoring first? Then see what breaks, and decide on a case-by-case basis who wins: do you keep the refactoring which broke the test, and delete the test? Or do you back out that aspect of the refactoring?


This is my preferred approach as well.

When one does this, one hopefully also gets a feel for which tests will be useful and which ones will be thrown out early, and starts writing more of the first kind.

Couple this with a well designed language and a good IDE that can do the trivial refactorings (method rename - including catching overload collisions etc) and it becomes easy to maintain tests.


You do not want to delete tests that provide insight into how your unit works internally.

If you were to delete these and you happen to have a regression, you need more time to analyse the faulty external behaviour and draw conclusions about how the inner parts work to produce that behaviour.

If you didn't write the code yourself, you might be in a situation where you will never be able to fix the issue fully within a reasonable time.

Side-note: I have seen these problems multiple times in production, where missing tests resulted in a large and expensive engineering effort to figure out the inner-mechanics of a particular piece of code.

Make sure your tests run (very) fast and are stable. Then there is little cost to pay to keep them around.


I agree with deleting tests. But when raising this with any team I've ever worked on, I might as well have said I was going to go drop the prod database. Deleting tests, in my experience, comes with a massive stigma that I am not sure how to surmount.


+1 especially to the final point.

If there was a bug, this bug should be replicated in the test. You then solve the bug, make sure the test (and the others) pass, and you'll be (relatively) sure the bug will not be reintroduced with a later change.

Every bug you find is an "edge-case" you didn't anticipate. Leave it in the test for the future. I find that the "table test" approach of Go works surprisingly well with this. You just add a case to the table, and often that's all you have to do.
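For example, a minimal Go table test along those lines (Discount is an invented stand-in for the unit under test): once a bug is fixed, it just becomes one more row.

```go
package price

import "testing"

// Discount is a hypothetical stand-in for the unit under test.
func Discount(total int, coupon string) int {
	if coupon == "HALF" {
		return total / 2
	}
	return total
}

func TestDiscount(t *testing.T) {
	cases := []struct {
		name   string
		total  int
		coupon string
		want   int
	}{
		{"no coupon", 100, "", 100},
		{"half-price coupon", 100, "HALF", 50},
		// A fixed bug is just one more case added to the table:
		{"half-price coupon on odd total (regression)", 101, "HALF", 50},
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			if got := Discount(tc.total, tc.coupon); got != tc.want {
				t.Errorf("Discount(%d, %q) = %d, want %d", tc.total, tc.coupon, got, tc.want)
			}
		})
	}
}
```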


> TDD, much like scrum, got corrupted by the "Agile Consulting Industry"

> Overzealous testers test for each new public method in a class.

The Agile Consulting Industry can be summed up as: know only one rigid way to do things, and if it doesn't work, blame the context.

In the case of unit testing, if my blind mechanical rules for unit testing don't apply sanely to your code, then your code is wrong.


This.

This is why I stopped following agile and other XP gurus who just live from consulting.

It's easy to have strict principles when you never need to apply them yourself.


Uncle Bob comes to mind


He's to blame for popularizing a lot of the unit testing insanity IMO, yes.


Agile is a religion.

Well-meaning, smart people wrote some good, general guidelines. Somehow it got turned into an industry of people who want money to tell me how I'm living my life wrong.


Yeah.

Agile can be summed up by making little releases which you show to the client and so the developers can make decisions on what to develop next quickly.

Everything else naturally derives from this. All the other stuff is just ways, which you can adopt or not, to accomplish that. You don't need to hire a consultant to tell you that you're not doing your story-point poker meeting wrong because of some metaphysical explanation.


It can also be summed up as: "we see there are managers with budget, and we have a solution to this problem..."

Any place you have money, you will find someone selling snake oil.


The problem with formulas is that they have a tendency to become formulaic, if successful.


Next introduce some metrics that measure how "well" you are doing. Then make those your primary goal.


This is an excellent point. I once had to deal with an external contract firm on a project that I was hired to fix. We had issues of production code breaking so badly that it brought down the entire server (N+1 query issue that triggered 50k queries).

The tests passed. When I emergency patched the issue and deployed it to production, the contract firm got mad at me for breaking their tests...to fix a production emergency.

It’s put me on guard against militant test ideologues with no concept of real priorities ever since.


We generally wouldn't allow that either. We've run into cases where emergency fixes cause even more damage (e.g., the system is up, but now it's processing payments wrong), so you have to prove beyond a shadow of a doubt that the test failures are irrelevant or less bad than the current incident.

Often times it's less effort/more expedient to make the change pass tests (or update the tests) than convince all of the stakeholders that what you're about to do is safe, but the break-glass is there if needed.

Maybe you'd call this a militant test ideology, but I think it's perfectly reasonable. Systems are complex, and people can get tunnel vision during a bad outage.


The way the tests were written in that case, they were hard-coded to how the work was being done and not to the result produced. Both the code and the tests were bad.

Normally, I’d agree with you though.


Be careful not to confuse militant ideology and local incentive structures.


Fair point. In this particular case the primary developer for them wanted to “publicly shame” me in Slack. Seemed much more ideology driven at the time.


Public shaming of a developer is rarely the right thing to do–and I would expect most design paradigms do not include a section on it ;)


That may have just been the rationalization for an ulterior motive.


> The tests passed. When I emergency patched the issue and deployed it to production, the contract firm got mad at me for breaking their tests...to fix a production emergency.

I get the point, but it depends on which tests failed.

Tests for unreleased features and trivial UX stuff are not the same as breaking a test that makes sure not every customer gets a 50% discount.


I think most disenchantment with TDD comes from the second point that you notice. If one attempts to test every method of every class one ends up testing implementation details that are very much subject to change. Also, one can easily end up testing trivial points like 'is the processor still capable of adding two integers'. As you note in your third point it seems much more productive to test properties of code that the customer could potentially recognize as something they value.

TDD really isn't dead for me. I do it pretty much every day. Both in work and in personal projects.

I am not sure talks/conversations like these are very valuable. In the end it turns out that every practical question has the answer 'it depends'. Maybe the important thing to realize is that most questions do have the answer 'it depends' and that one can never stop using one's brain.


> When people talk about "unit tests", a unit doesn't refer to the common pattern of "a single class". A unit is a piece of the software with a clear boundary. It might be a whole microservice, or a chunk of a monolith that is internally consistent.

This is a really important aspect, and I think is one of the key things that separates "journeyman" level from "master" on the subject of testing.

The most concise piece I've found on this is Bob Martin's "Testing Contra-Variance" (https://blog.cleancoder.com/uncle-bob/2017/10/03/TestContrav...), if you can look past the slightly forced "Socratic dialog" style of the article.

> The structure of the tests must not reflect the structure of the production code, because that much coupling makes the system fragile and obstructs refactoring. Rather, the structure of the tests must be independently designed so as to minimize the coupling to the production code.

The "one test class per class" or "one test file per file" approach is an extremely common anti-pattern, and it's insidious because a lot of engineers think it's the obviously correct way of writing tests.


> one test per bug fixed

this, and you should try to write the test before fixing the bug - you cannot trust a test that you haven't seen failing


I often find that before I've diagnosed and fixed a bug, I don't know what the best test is.

So I tend to: fix the bug; write the test; see the test fail on the old version of the code; see the test pass on the new version of the code.

Something else I'll throw in on tests. Many times I've caught a bug I would otherwise be introducing because an existing test — which wasn't written to catch my new mistake — fails.


Sometimes you’re in a hurry to fix a bug, so you write the functionality and ship it out fast after some manual testing, without an automated test reproducing the bug. That’s okay!

Write the test afterwards in some branch, and after committing the tests, make an extra commit that undoes the bug fix. Let your CI run it and confirm it fails there, then bring back the bug fix on the next commit.

That way, you can have flexibility to fix things fast, but still keep the regression test that’s proven to check for it around for the future.


> to fix things fast

Note that if you see _another_ developer taking the time to write a test when you wouldn't, it doesn't necessarily mean they are wasting time.

As I am debugging something, I need to write myself a very clear description of my hypothesis of the steps to reproduce the bug -- otherwise it is hard to see incremental progress. I work faster if I can have the machine execute those steps.


Fully agreed. I didn't intend to imply a developer writing a test is wasting time, either!

Our team has been in situations where we're highly confident of the root cause, but we know creating a test to duplicate this might take hours, if not days. It might even be a fairly finicky scenario to try and setup.

Rather than letting customers have to handle the negative consequence of our bug for hours, we'll make the change, run it through our existing test suite (to make sure we're not making yet more troubles for ourselves!), and then release it after another teammate reviews the change.

But a test certainly will help with the confidence that the right thing was changed. I would definitely encourage writing a test if it's easy for your group to handle whatever negative consequences the bug existing in the wild is producing for as long as it takes to write a test.


"That's okay!"

Why is that Ok? How does the dev know that they've not undone anything else? Also, how does the dev know that the fix is complete? Or that it caters to the defect?

This anxiety of pushing out a fix - at the risk of undoing other working functionality - ought to be addressed first. It is better (and safer) to get into the habit of writing a test to reproduce the defect and then write the fix. After all, if the fix appears trivial, then the test for the defect ought not to take too much time either. I have been able to get to this mindset with practice.


Because being dogmatic is exactly what causes people to start ignoring this sort of methodology. Pragmatism really does have to win out sometimes. Maybe you're in a hurry because the bug is causing active downtime. Some bugs really are "obvious" once they've failed and you're looking at the code. Maybe development of a proper test involves some test infrastructure work that's a larger undertaking than you have the opportunity for at the moment. Maybe you have a solid manual/QA testing system behind you, allowing you at least temporary assurance that your fix is valid.

No team does everything perfectly all the time, and that's fine. The real question is what gets done about it afterward: is the technical debt that you've incurred paid down in a reasonable time frame?


This nicely encapsulates what I was hoping to say, but didn't take the time to write out. Thank you!

> Maybe development of a proper test involves some test infrastructure work that's a larger undertaking than you have the opportunity for at the moment.

This is something I've encountered many times.

> Maybe you have a solid manual/QA testing system behind you, allowing you at least temporary assurance that your fix is valid.

I would hope most people are doing this anyway, especially when a big production bug has been found.


I've fixed plenty of bugs which I could not reproduce and thus could not test, simply based on the source code and the customer's description of the problem. Based on the symptoms it was clear what the code must be doing, studying the code revealed it was clearly wrong, and so the fix was obvious.

So make a patch, make a new build, send to customer and get a call "yeah that fixed it, thanks!".

Sometimes the issue is I simply don't have time to reproduce it, like that time a blocking bug had to be fixed within 30 minutes, or the customer would have had to charter their own helicopter to get a few packages to an offshore oil platform, rather than piggy-back on the worker transport heli.

Other times it's some combination of the customers system it's running on and configuration of our software that I can't reproduce.

Not saying it's ideal, but it's quite possible to successfully fix issues without being able to reproduce and test.


I once fixed a bug like that, then went wait a minute. Sure enough source control revealed I'd made exactly the same fix 2 years ago, and 2 years before that someone else had done exactly the same. In the odd years someone else undid that fix to fix a different bug that seemed unrelated. Once I figured out what the other problem was I was able to find the more complex fix for both situations.

I wished I had heard of automated tests then


Yeah I always check source control (blame/annotate), even if I wrote the code myself, just to be sure I'm not missing some context.

Automated tests is pretty great, but a lot of the stuff we do is difficult to test, mostly due to a lot of legacy code that's not well confined. As we work on a piece of code we try to clean that part up, but it takes time.


Maybe add a dated note as a code comment briefly describing the reason for the change.


I'd imagine stuff like race conditions would fit into this category nicely. Obvious upon inspection but annoying to test for.


>Why is that Ok? How does the dev know that they've not undone anything else?

If the bug is "I thought we were supposed to have a 10 minute timeout and we accidentally set the timeout to 10 seconds" it's pretty screamingly obvious that if you change the time from "10" to "600" the problem is now solved and you haven't broken anything else.

Religiously applying this rule of thumb as you describe causes ALL sorts of problems including the problem of people writing tests for the above kind of behavior. That test will fail when it's changed from 600 to 1800 deliberately and that will create a pointless waste of time for everybody.


Yeah, and then you find out that that constant was also used in another piece of code as number of minutes, so you've just changed another timeout in the system from 10 minutes to 10 hours.

Yes, of course that constant should never have been a naked integer in the first place, but we live in an imperfect world. One thing I like about Go's standard library is that almost everything takes time.Duration instead of a plain integer that's then interpreted to be milliseconds (or micro/nanoseconds in the most unexpected place, gotcha, developer, you should've read the docs!)
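A tiny illustration of that point (both function signatures below are made up for illustration): with time.Duration the unit travels with the value, so the 600-vs-10 class of mistake can't happen silently.

```go
package client

import "time"

// Ambiguous: is this seconds, milliseconds, minutes? Callers must read the docs.
func SetTimeoutMillis(timeoutMillis int) { /* ... */ }

// Self-describing: the unit travels with the value.
func SetTimeout(timeout time.Duration) { /* ... */ }

func example() {
	SetTimeoutMillis(600000)     // 10 minutes? 600 seconds? easy to misread
	SetTimeout(10 * time.Minute) // unambiguous
}
```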


> This anxiety of pushing out at fix - at the risk of undoing other working functionality - ought to be addressed first.

It really depends on how urgent the fix is. Naturally you want the fix to be as isolated as possible so it does not regress other functionality. But even non-urgent fixes can sometimes benefit from getting pushed out following quick manual testing. For example we use Sentry.io for capturing errors in our applications and LogDNA for logging.

Occasionally we'll encounter some kind of edge case where we see a spike in logged errors which blows us past our Sentry and LogDNA quotas. Pushing a fix out before the test is written can be beneficial in cases like this, although yes, it's worth avoiding if possible.


Sounds like I could have been more thorough in describing why I think "that's okay."

> How does the dev know that they've not undone anything else?

I didn't write it, but I was assuming a scenario where a system already has a comprehensive automated test suite. If you have most functionality under test, then hopefully you're pretty confident that it won't undo anything else.

> Also, how does the dev know that the fix is complete?

The same way a dev knows if one single automated test that addresses the one known failure scenario is a complete fix.

In other words: you don't know. You keep doing some manual testing, and watching whatever logs or status indicators, to see if things go back to normal after deployment.

> Or that it caters to the defect?

I also didn't write this, but I had in mind some level of manual testing before deploying the code change to ensure it caters to the defect.

> Why is that Ok?

Hopefully my answers above help explain why I said that's okay.

I'm not advocating for flippantly shipping code without having a variety of other guard rails in place; I was primarily talking about when a bug has really bad consequences for users, your team is confident that writing a test is going to take a long time, and you do some level of manual testing to confirm all seems generally well.


>How does the dev know that they've not undone anything else?

No one says that the rest of the test suite is not run.


Sometimes you’re in a hurry to fix a bug

Most of the time I find it faster to write the test first. If you haven't got a test what are you going to do? Manually test?


And since everything usually takes longer than expected, the benefit of automating that manual testing is usually greater than expected.


We tend to have a web focus on this site but not all IT is like that.

If I have a data processing job that takes 3 hours and it fell over at the 1 hour mark (and let's say whoever wrote it didn't have the foresight to make it resume neatly, because that's added complexity that never got budgeted), I'm going to fix the obvious bug and kick off reprocessing immediately. Possibly after some messing around in a REPL to confirm how the code acts.

While it's going, I can then do some manual tests and sanity checks and cancel/restart if necessary - but if not, I've gained a lot of time.


Nope. With experience comes wisdom. I've had the situation happen when I was sure I knew what the bug was. I wrote a test that I expected to fail—and it passed. My diagnosis was incorrect and the bug was somewhere else. Had I not done this, I might have not only not fixed the bug, I would have likely introduced new bugs.


> That’s okay!

The effort put into manual tests could be committed directly to adding a test case. Furthermore, a fix should be accepted only when the full test suite, including the new tests, passes. Relying on manual testing alone, depending on what the issue was, may let a regression slip into other parts of the software. Just a thought...


I agree with the last part, but the ordering isn’t crucial. For example you can stash your changes in the implementation to check that the test fails when the change is not present and passes when it is. For this (among other reasons) it is nice to have a test runner that re-runs when it detects file changes.

The ordering is one part of TDD that always bugged me: you have to write the tests first. But I often prefer to experiment and try a couple of approaches before deciding on one. Having tests first would add a lot more overhead for that way of working.


To be fair though, if your tests are at the right abstraction level, the specific approach you are choosing for the implementation shouldn't matter for the test.

Writing the test first also forces you to think about what API you actually want to expose. Once you've got the API right, there is still room for experimenting.


That’s only true when you have decided what the abstraction will be. That’s my point, a lot of times you don’t know yet!


In this approach you decide on the abstraction (i.e., the API) by writing example code (i.e., some tests) that uses the API. The tests are how you decide which abstraction seems to make the most sense.

It sounds like you actually implement the abstraction to decide whether it seems like the right one, which is a lot more work.


Yes, my position is that tests as client don't really tell you the truth about the abstraction because they don't represent a real usage of it.

It is better to write tests for code when you know what it is and what it should do. Tests also introduce a drag on changing strategies: if the choice you made when you wrote them is no longer the optimal one, you must now change your tests or convince yourself that you were actually right the first time.

If people like to work this way then great, I'm just explaining why for me it feels bad and runs counter to my instincts.


I think I understand what you mean. At the same time though, one crucial takeaway for me from Ian's talk is that my tests might be on a too small scale if they are not useful whilst I am changing the implementation strategy.

For example, I found it useful to ditch concepts like the testing pyramid and focus on writing e2e tests for my HTTP API instead of trying to cover everything with module or function level tests. That makes it much less likely that they need to change during refactorings and hence provide more value.

I generally think that "What is going to break this test?" is a really powerful question to ask to evaluate how good it is. Any answer apart from "a change in requirements" could be a hint that something is odd about the software design. But to ask this question, I need to write the test first or at least think about what kind of test I would write. At some point, writing the actual test might be obsolete as just thinking about it makes you realize a flaw in the design.

Other interesting questions I like to ask myself are: "How much risk is this test eliminating?" and "How costly is it to write and maintain?"


In reality I tend to do both: write example client code to think through the abstraction (some call this “README-driven development”) and then write tests once the implementation is under way. Though you can get the first as a side effect of the second, I find that good tests aren’t really good example code (too fragmented, focus on edge cases, etc.).


“TDD, much like scrum, got corrupted by the "Agile Consulting Industry". Sticking to the original principles as laid out by Kent Beck results in fairly sane practices.”

Totally agree. Somehow every good idea gets converted to a rigid ideology after a while. Same for OOP. It’s a solid idea but then the ideologues pushed it way too far. And instead of dialing back a little we see other ideologues declare “X is dead” and the pendulum swings to the other extreme.

My company is generally behind the curve so now people have been bitten by the REST, JSON, Microservice bug. They don’t know why or what it really is but things have to be done that way. That together with calling themselves “agile” without understanding what it means besides using JIRA and having fixed sprints.


> My company is generally behind the curve so now people have been bitten by the REST, JSON, Microservice bug. They don’t know why or what it really is but things have to be done that way.

This resonates with me. My first job out of college was with a big, very old insurance company. My team lead became obsessed with using microservices for some reason, even though we were only building internal web apps that would have about 1,000 users on a busy day. There would be no performance concerns whatsoever that would warrant "breaking up a monolith" to make it more scalable. But microservices were a great way for the team to feel like we were using trendy tech despite not having any idea how to really go about doing it or any particular reason for doing so.


> When people talk about "unit tests", a unit doesn't refer to the common pattern of "a single class". A unit is a piece of the software with a clear boundary. It might be a whole microservice, or a chunk of a monolith that is internally consistent.

Every example I've read pertaining to unit testing uses a function as the unit to test. The easiest functions to test are ones that don't have side-effects (network, I/O, disk, etc). Could you point me to an example where a unit test applies to something beyond a function?


This bugs me as well. The people who argue about whether unit tests works will often redefine "unit" to mean anything from "the entire application" to a single function.

Colloquially, however, anything above the level of a self contained function or class is called something else - typically an integration test.


Unit tests and integration tests were never defined that well. The definitions change based on who you talk to, especially for integration tests.

A better definition is the small, medium, and large tests from Google's testing blog.


I've used unit tests on attributes and methods of a mock class instance, in order to test the class's construction method.


>> When people talk about "unit tests", a unit doesn't refer to the common pattern of "a single class". A unit is a piece of the software with a clear boundary. It might be a whole microservice, or a chunk of a monolith that is internally consistent.

It's OK to dislike unit testing, but please don't redefine the term to avoid it. That's not helpful. Instead, try to find the papers (by NASA or IBM?) that show unit testing finds only very few actual bugs, making it low value.

That said, there are IMHO some units worth testing more.


They aren't redefining it though. The term has always been fuzzy. A common boundary of unit testing has always been a module's publicly exposed interface.


Wasn't the "clear boundary" definition the original, that was later interpreted as syntactic boundary (function/class) instead of semantic boundary (a chunk of business logic)?


Integration tests vs. unit tests.


It's worth noting that the greater the granularity of your unit, the more likely you will be able to write tests targeted at bugs.

For instance, if you are only testing through the API of the service, you may have a hard-to-impossible time confirming you recover gracefully from certain exceptions. You generally don't have service-level APIs to throw various exceptions intentionally.

Point being, the overzealous testers do have some good points, even if they miss the forest for the trees.


The bug triggered the exception. The test case encapsulates reproducing the bug. The bug is fixed. The bug can no longer be reproduced. As long as the bug remains fixed the test passes.

Another way of thinking about it. Unless your exceptions are a documented part of your API no one cares about them - they only care about the outcome they actually expect. If you construct tests that pass for positive outcomes or fail for any other outcome then your exceptions remain implementation details.


I think GP is referring to nondeterministic exceptions. For instance, if the service under test depends on some other service, then you may need to test the scenario where the other service is unavailable. The exception is not triggered by a bug, it is triggered by an operational externality.


For networking related problems you can deterministically control failures from the test using something like Toxiproxy. This can be especially useful if you’re working out a particular bug (e.g. gracefully handling a split brain situation or something).

A more general approach would be to just run your happy path tests while wrecking the environment (e.g. randomly killing instances, adding latency, dropping packets, whatever).

I’ve found that the latter often uncovers problems that you can use the former to solve.

Testing these sorts of things with unit tests can work, but I’m more confident in tests that run on a ‘real’ networking stack, instead of e.g. mocking a socket timeout exception.
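
Rough sketch of that 'real networking stack' flavour in Go, using net/http/httptest; fetchStatus is a made-up stand-in for the code under test:

    package svc

    import (
        "net/http"
        "net/http/httptest"
        "testing"
        "time"
    )

    // fetchStatus stands in for the code under test; in a real suite it
    // would live in the package being tested.
    func fetchStatus(url string) (string, error) {
        client := &http.Client{Timeout: 500 * time.Millisecond}
        resp, err := client.Get(url)
        if err != nil {
            return "degraded", nil // degrade gracefully instead of failing hard
        }
        defer resp.Body.Close()
        return resp.Status, nil
    }

    func TestFetchStatusSurvivesDeadBackend(t *testing.T) {
        srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            w.WriteHeader(http.StatusOK)
        }))
        url := srv.URL
        srv.Close() // simulate the backend going away: a real refused connection

        got, err := fetchStatus(url)
        if err != nil {
            t.Fatalf("expected graceful degradation, got error: %v", err)
        }
        if got != "degraded" {
            t.Fatalf("expected %q, got %q", "degraded", got)
        }
    }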


This is more integration testing than unit testing. Certainly valuable but it shouldn't replace your unit tests.


Yeah, reading it again I mixed up what you wrote with someone else.


Then test the externalities. They probably have healthcheck endpoints....

Don't overthink everything. KISS


Imagine I am implementing a service that queries 20 separate MySQL database servers to generate a report. (I'm not saying this is a good architecture, it's merely to illustrate the point.) I know that sometimes one of the MySQL instances might be down, e.g. due to a hardware failure. When this happens, my service is supposed to return data from the other 19 databases, along with a warning message indicating that the data is incomplete.

I would like to write a test to verify that my code properly handles the case where one of the MySQL instances has experienced a hardware failure. The point is that I can't do this as a strict black-box test where I merely issue calls to my service's public API.

[edit] And of course "testing the externalities" doesn't help here. I can test the MySQL instances and verify that they are all running, but that doesn't remove the need for my code to handle the possibility that at some point one of them goes down.


First. Don't do this!

Second. You've done this/someone else has done this and now you need to maintain it (we've all been there!). In this case my original post holds. Your test suite mocks the databases for unit tests anyway right? So write some test(s) checking that when the various databases are down appropriate responses are given by your service.
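
Something like this, roughly; all the names here (Source, BuildReport, fakeSource) are invented to show the shape of such a test, not taken from anyone's real code:

    package report

    import (
        "errors"
        "testing"
    )

    type Source interface {
        Rows() ([]string, error)
    }

    type fakeSource struct {
        rows []string
        err  error
    }

    func (f fakeSource) Rows() ([]string, error) { return f.rows, f.err }

    // BuildReport returns whatever data it can get, plus a warning if any
    // source was unavailable.
    func BuildReport(sources []Source) (rows []string, warning string) {
        for _, s := range sources {
            r, err := s.Rows()
            if err != nil {
                warning = "report is incomplete: one or more sources unavailable"
                continue
            }
            rows = append(rows, r...)
        }
        return rows, warning
    }

    func TestReportDegradesWhenOneSourceIsDown(t *testing.T) {
        sources := []Source{
            fakeSource{rows: []string{"a", "b"}},
            fakeSource{err: errors.New("connection refused")}, // the "dead" MySQL box
            fakeSource{rows: []string{"c"}},
        }
        rows, warning := BuildReport(sources)
        if len(rows) != 3 {
            t.Fatalf("expected 3 rows from the healthy sources, got %d", len(rows))
        }
        if warning == "" {
            t.Fatal("expected a warning about incomplete data")
        }
    }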


Yeah, sometimes for practical reasons you don't want to, or can't test directly across the API, as good testing practice would dictate.

Taken to the extreme, the philosophy I laid out leads to something that looks like "only integration and end-to-end tests" depending on your architecture.

So I try to be pragmatic whenever possible, but I think leaning towards BDD works better, after 18 months of doing it.


Corrupted is not the right way to describe it. Both TDD and agile provide something amazing to management: a way to make the black hole called "software engineering" into something tangible and quantifiable. This of course also makes it possible to execute some bad management practices on software engineering as well. People like to complain about agile (and apparently TDD) but I would argue that there are also huge success stories that don't make it to HN.


Let's keep in mind this Fowler post has no science in it. It's just some "lauded practitioners'" view of TDD. Our industry is driven by this kind of discourse. There's very little good scientific research to answer questions like these in software in general. How much of it addresses TDD and is any good? One paper? Two?


Replying to myself (facepalm): just to make sure people know what I think. I've been doing TDD since 2002. I'm of the "You will take this out of my cold, dead hands" variety on the usefulness of TDD. That doesn't mean that there's any science that proves it.


If you have a reference, I’d love to read about some of these success stories.


> - When people talk about "unit tests", a unit doesn't refer to the common pattern of "a single class". A unit is a piece of the software with a clear boundary. It might be a whole microservice, or a chunk of a monolith that is internally consistent.

I don't understand how this matches to the idea of "don't write one line of code unless it's necessary to make a failing test to pass".


What is it that you think is contradictory?


All of the design advantages I've ever heard advertised for TDD come from writing a bit of test, a bit of code, a bit more test, a bit more code.

If instead you are writing a few tests and then an entire module, you're doing test-first development, but it's definitely not test-driven development as I've ever seen it presented by proponents.


You need to write a module. You write tests that test the functionality of the module. But to pass those tests you need a couple of classes. You write tests for the classes (or at least the one you intend to work on next). The class needs some methods, so you write tests for those. You write code for the methods until all the method tests pass. Hopefully the tests for the class then pass too; otherwise you might need to update the methods and their tests. And so on.


> plus one test per bug fixed

I used to be a big proponent of this, until I suggested it to my manager and he replied: "We ran the stats on our bug tracker, and bugs coming back are really rare, so we'd rather focus our effort on testing with a higher ROI".

And in the end I agreed with him on this.


Yeah, I never understood this safeguarding against bugs resurfacing after being fixed once. I only saw bugs coming back at a company that didn't use version control and instead copied source code back and forth with a USB stick.

I can understand something like test driven bug fixing, where you basically create a simple test to reproduce the bug quickly and then fix the bug using that. In many cases that is the most efficient workflow.

The test succeeding can then serve as evidence of the bugfix (though it might not be enough). So if you have already written the test, you might as well leave it in there, because it usually doesn't bother anyone, and the chance that someone breaks this exact same thing again, while tiny, isn't non-existent.

But fixing a bug and then putting extra work just for a test, if there is another easier way to prove that the bug is fixed? No, thanks.


> Yeah, I never understood this safeguarding against bugs resurfacing after being fixed once.

In my experience it's not infrequent for bugs to unknowingly only get half-fixed, not realizing that the true problem actually lies a level deeper, or has a mirror case, or whatever. Maybe a good example is that a parameter to a command is 0, the bugfix sets it to be 1, but a later bugfix changes it back to 0, when the correct bugfix would set it to be 0 in some cases and 1 in others.

And that if you fix the bug without a test, then the second related bug crops up a couple months later, and somebody else tries to fix it similarly naively, and can wind up re-introducing the first bug if there isn't a test for it.

Basically, in practice bugs have this nasty habit of clustering and affecting each other -- if the code was trickier than usual to write in the first place, it's going to be trickier than usual to fix, and more likely than usual to continue to have problems.

So keeping tests for fixed bugs is kind of like applying extra armor where your tank gets hit -- statistically, it's going to pay off.


I remember reading that half of all bug fixes introduce a new bug. I'm pretty sure it comes from these sort of scenarios.


I have a perfect real-world example. About four years ago some of my code broke in certain cases. I came up with a fix that relied on a case-sensitive regex to check for those cases. I think I made it case-sensitive because I wanted to make sure it didn't trigger accidentally on something added in the future. And these case names had never changed, right?

Yep, now that I've spelled it out, what happened is obvious. Three years later, I got ordered to change one letter in these case names from lower case to upper case. Of course I didn't remember that I'd used a case-sensitive test against the names three years before. And bam, the bug was back, and as there was no test for it, I shipped code with the bug.

The good news is the bug was obvious as soon as the customers tried to compile my code, so it didn't cause any harm but embarrassment on my part. Even so, it took me a while to track down what was going on. Imagine my shock when I got into the code and found the fix I thought I needed to make was already there... but itself needed to be fixed!
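
The regression test, had it existed, could have been tiny. A Go sketch with an invented pattern and case names, just to show how little it takes to pin the intent down:

    package names

    import (
        "regexp"
        "testing"
    )

    // The pattern and case names are invented; the point is that the test pins
    // the intent ("match regardless of case") so a later rename can't silently
    // resurrect the bug.
    var specialCase = regexp.MustCompile(`(?i)^legacy_mode$`)

    func TestSpecialCaseMatchesRegardlessOfCase(t *testing.T) {
        for _, name := range []string{"legacy_mode", "Legacy_mode", "LEGACY_MODE"} {
            if !specialCase.MatchString(name) {
                t.Errorf("expected %q to be treated as the special case", name)
            }
        }
    }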


I have seen bugs resurface regularly, but it was always due to a missing merge/cherry-pick.

Tests wouldn't help in that case.


Avoiding too many layers of useless indirection is not hard: avoid refactoring effort that looks like a line by line, linear normalization of syntax.

Target tests and refactor for the confusing, incomplete chunks according to the team if there is one.

Semantic understanding of the system will improve. Instead of fetishizing code patterns, fetishize systemic understanding.

Personally, that habit has made it so I write better code from the start. It’s acted like a forcing function to reconsider if a habit is useful or just a habit.

My code went from deep OOP hierarchies of indirection, to composable, more functional, chunks. I import less, define fewer objects to begin with, compute a larger variety of useful objects, and can pull together features faster.

Have standardized machine? Nursing my own symbol library is where most of the fun is. With respect to Martin Fowler and the rest, who are great engineers in their own right, but this all smells like pandering to efficiency, importing a shared model, which impacts resiliency.

We shouldn’t have people think within the same context box every day. Software philosophy has been taken over by the equivalent of popular bean counters, focused on minimizing the idea space for the perception of productivity gains. It’s cognitive indirection, IMO.


Tedu had an interesting blog post about testing recently:

https://flak.tedunangst.com/post/against-testing

What I got out of it was that tests for regressions are really good, but there are lots of considerations to make when determining which other tests to write, and why you are doing it. A good read nevertheless.


> When people talk about "unit tests", a unit doesn't refer to the common pattern of "a single class". A unit is a piece of the software with a clear boundary. It might be a whole microservice, or a chunk of a monolith that is internally consistent.

Yes, this. However, when we go in the direction of testing a complete service, is it not more convenient to call it an integration test?


My personal view is that it's an integration test only if the test actively involves external components (like a real database, some other api, etc..). I don't use integration test terminology if all those external components are mocked/stubbed out (even if it encompasses a large "unit").

I also call those mocked surface area tests "functional" tests, partially to differentiate them from non-mocked integration tests, and partially because people get too hung up on the "unit" term.


Often it's convenient to test the functions in your database access layer against a real or convincingly simulated database (e.g. sqlite). This is "like a unit test" in that it focuses on the details of individual functions, but "like an integration test" in that it crosses a system boundary rather than mocking it. I've not found it productive to use either term when talking about it.


The parent above you didn't necessarily imply that the whole microservice was being integrated before testing it. It is reasonable that a microservice can exist as a whole, yet not be integrated with its other deliverable components, and remain sufficiently unit-testable.

It's also extremely likely that the microservice needs to be integrated to test it :)


Test coverage is the most conveniently accessible numeric value in the neighborhood of software quality, therefore it is software quality. Merging a change that reduces test coverage is reducing software quality. That's okay, sometimes we all have to cut corners, but defending this choice means you don't value quality, and are therefore not a culture fit.

/s


This is a rampant problem in the enterprise world, and it drives me nuts. I regularly have to work for clients who mandate using SonarQube (and/or other SAST tools) with strict policies, and also require 85%-100% test coverage on all projects regardless of how much sense it makes.

Predictably, teams have to spend way too much energy getting "waivers" approved by some ridiculous group, and inevitably end up creating tests that don't actually test anything, just to get the coverage figures up.


Can you give us an example of where it doesn't make sense?


In Go, it turns testing into a game of “how do I make this stdlib function return an error”?


Error handling taking up 50% of the code is definitely a problem with Go itself. For each meaningful line, there's an accompanying "if err != nil {return err}", so if you want coverage you end up testing this kind of boilerplate.
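
The usual way out is to accept an interface where the code would otherwise call the stdlib directly, so the test can inject the failure. A sketch with made-up names (production function and test shown in one file for brevity):

    package parse

    import (
        "errors"
        "fmt"
        "io"
        "strings"
        "testing"
    )

    // countLines takes an io.Reader instead of opening the file itself, so the
    // error branch can be covered without fighting the filesystem.
    func countLines(r io.Reader) (int, error) {
        data, err := io.ReadAll(r)
        if err != nil {
            return 0, fmt.Errorf("reading input: %w", err)
        }
        return strings.Count(string(data), "\n"), nil
    }

    type failingReader struct{}

    func (failingReader) Read([]byte) (int, error) { return 0, errors.New("disk on fire") }

    func TestCountLinesPropagatesReadErrors(t *testing.T) {
        if _, err := countLines(failingReader{}); err == nil {
            t.Fatal("expected an error from the failing reader")
        }
    }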


A trivial example would be getters and setters for plain old classes.


Surely the getters and setters of POJOs would be exercised in the tests of other classes? If not, why are they there?


E.g. it’s public api in a library and they’re only used by other applications.


I've lost so much time making these arguments with people. Unfortunately the combination of dogma + an industry of sham consultants who have monetized it has created a monster.


I think I won an argument with an interviewer, but I lost any chance at the job ;)


>As such, a good trigger rule is "one test per desired external behavior of the unit, plus one test per bug fixed". The test for each bugfix comes from experience -- they delineate tricky parts of your unit and enforce working code around them.

So much this. The smoothest integration I've ever worked on was one where I owned the API and back-end and another team built the front end. I defined the API behaviors, built the tests to verify those behaviors, and created a mock API for the front-end folks to use while testing.

Over the course of building the API, I slowly got my test success rate from 0% to 100% and always immediately knew when I accidentally changed behaviors. When we finally integrated, there were literally 0 errors related to the client/server interface. Yes, there was some UI wonkiness and we discovered some issues of scale, but there were no issues with parameter values, HTTP response codes, error messages, etc. It was a) amazing and b) the only time I've had the luxury to build something in that manner.
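
Not claiming this is how the poster did it, but the mock-API half of that setup can be as small as a throwaway server with canned responses; the endpoints and payloads below are invented:

    package main

    import (
        "log"
        "net/http"
    )

    // A disposable mock of the agreed API contract, for the front-end team to
    // develop against. Paths and payloads are purely illustrative.
    func main() {
        mux := http.NewServeMux()
        mux.HandleFunc("/api/orders/42", func(w http.ResponseWriter, r *http.Request) {
            w.Header().Set("Content-Type", "application/json")
            w.WriteHeader(http.StatusOK)
            w.Write([]byte(`{"id": 42, "status": "shipped"}`))
        })
        mux.HandleFunc("/api/orders", func(w http.ResponseWriter, r *http.Request) {
            if r.Method != http.MethodPost {
                http.Error(w, `{"error": "method not allowed"}`, http.StatusMethodNotAllowed)
                return
            }
            w.WriteHeader(http.StatusCreated)
        })
        log.Fatal(http.ListenAndServe(":8080", mux))
    }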


Did you build it around OpenAPI? I'm interested to try that approach.


No, this effort pre-dated the development of that spec.


The main issue apart from the "Consulting Industry" is that too many people try to follow a methodology to the letter, but superficially, without trying to understand the point and deeper meaning.

That's why people argue about rules, or whether this or that can or cannot be done, instead of trying to understand the aim first and then the cost/benefit balance.


Which is a problem when you see something that seems like it should help, so you hire a consultant who can push the letter of the rules but doesn't understand them well enough to figure out how they should work in your organization.

I wasted a lot of time trying to get acceptance tests in here. They seem like a good idea, but I couldn't get traction on them, and the consultants preached the rules rather than the how or why, so the rules got robotically followed with no helpful results. I'm still not sure if the concept is flawed or just the execution, but we threw it out.


Yes, this is the best talk so far. Right now I am at a company that has a strict class-method testing strategy. This makes refactoring a pain in the A and worsens the ratio of programmer time spent to quality assured by the tests.

In "unit testing", the unit doesn't mean "the smallest thing we test which we decided is a class even if doesn't make sense"; it simply means "on its own" so external dependencies are not in the way.

My heuristic for determining how much a test is worth writing is something like this:

(number of outbound dependencies * how hard it is for a human to understand) + degree of mutability + how much code coverage it adds.

This leads to a grouping of tests and code that naturally fit together.


>- TDD, much like scrum, got corrupted by the "Agile Consulting Industry". Sticking to the original principles as laid out by Kent Beck results in fairly sane practices.

Trying to explain this to people just gets exhausting after a while.


It would be great if it worked this way. What I see instead is that managers are complaining if coverage is less than 80% and you have to write tooling to compensate for generated code and all that crap. It is the 10th circle of hell.


Generated code is easy, just generate the tests.

I wish I was kidding.


> What triggers writing a test is what matters. Overzealous testers test for each new public method in a class.

I have seen this before. At one company I was forced by a couple of other developers to write tests for accessors (get/set) on model classes. They would reject my PR if I didn't do it. To leadership it looked like I was against automated testing.

To me it's more important to write tests for places where you're most likely to make mistakes: for example, calculations, business logic, or how the view renders your model as it changes. Not every single small method.


But how can I measure that with a red/yellow/green stoplight chart based on code coverage? I’m actually supposed to trust that the developers will do the right thing?!


Sorry. How is this hard? You write a function, you write a test for it, you run the test. Why is this on YouTube? This is nuts - this isn't an existential developer crisis.


Great advice. Most tests are too discrete.


The OOP class really devalued the idea of a module. Focusing on terminology and not the concept caused harm.


I've been reading "Large Scale C++" by Lakos. There are two kinds of code Lakos writes about:

1. Application code -- Fast changing, poorly specified code. You need to have a rapid development cycle, "discovering" what the customer wants. Your #1 job is pleasing the customer, as quickly, and as reliably, as possible.

2. Library code -- Slow changing, highly specified code. You have a long, conservative development cycle. Your #1 job is supporting application programmers.

TDD probably works for #2, less so for #1. Furthermore, we all dream of being library developers (that our code and specifications are stable, and that our code can last decades). Alas, most of us are Application developers, and the lifespan of our code isn't really that long.

Recognizing the lifespan of your code, as well as the innate goals of your team (quick and dirty application style, or slow and careful library style) is important.

------------

Mixing up the styles causes issues. If you write library-style code for an application, the specification will change under your feet and everything needs to be rewritten to the new whims of your new customers.

If you write application-style code for a library, you yourself won't be "stable enough" to support your peers, and no one will want to use your library.


This is something that isn't touched upon much. TDD fits library writing like a glove because the developer is naturally thinking of the public contract of its components.

This is not to say that application code doesn't need the same approach, but it's a lot harder for developers who don't have their design hats on to think of pieces of their functionality as contracts of behaviour that contribute to a feature or requirement, or, to your point, to think in terms of the lifespan of their code.

I think the top comment links to an excellent talk on TDD and how it is often misapplied.


Determining which is which is definitely a skill developed from experience. For example, being able to see that if every time you use library part Y, you have to add algorithm X around it, algorithm X should be a part of the library not the app.


I find this application/library distinction hard to reconcile with the many long-lived, stable applications that exist. AutoCAD, for example, is older than I am.


Any large scale program will have both "application" parts and "library" parts.

Let's take AutoCAD as a long-running example. Its first release was in 1982, and it had an update this year, in 2020. What's a good example of "application" vs "library" code here?

Let's take the user interface, which is almost always "application" style, with short lifespans. Back in 1982, you probably had to write a custom serial communicator to interface with the mouse (or trackball). Eventually, Windows is released and a standard mouse interface becomes common practice.

-------

A few decades later, the 3-button mouse with scrollwheel (with the wheel being the 3rd button) becomes popular. Adding new features to the application, and the overall design of the UI would have to change for modern sensibilities.

Then Ribbon happens. Hate it or love it, Windows programs are now Ribbon-based. Gotta overhaul the UI AGAIN to match the changing times.

This progression from 2-button mouse -> Windows driven mouse -> 3-button mouse with scrollwheel -> Ribbon -> touch-enabled (??) is "Application code", requirements that change with the times. Every few years, the code interfacing with the user was largely rewritten, to match the (then modern) expectations of its userbase.

-------

Of course, there's the "library chunk", which probably solves geometric constraints or something like that. That part may have never changed throughout the life of AutoCAD.

-------

Application code CHANGES. That's the important bit. You cannot expect applications to look the same over decades. I'd be very surprised if AutoCAD still had their DOS GUI laying around (https://www.scan2cad.com/wp-content/uploads/2016/06/autocad_...). That sort of code is thrown away when it is no longer fashionable.


AutoCAD is an interesting example because it is consumed by professionals much in the same way a library is used by programmers. I think you'd want to be a little slower moving in that space than you would for your average application.


But wouldn't AutoCAD be split into library and application parts internally? For instance, ShapeManager is the internal geometric modeling kernel used by AutoCAD.


Hasn't AutoCAD been around since before TDD became mainstream?


I think this distinction is harmful. Your application code should include areas that function as "library code." We do this at work with an explicit UI library that we package separately, but there's no reason it needs to be packaged separately, just create directories that represent some library-esque element of your application and heavily test it and write docs for it, like you would a library.


The distinction is helpful from an organizational perspective.

From a manager's point of view, they need to know where the money is going. If ProjectA is the "support" for LibraryA (and the managers don't know about it), it will look like ProjectA is costing more and behind-schedule.

But if you explicitly split into ProjectA + LibraryA, the managers can see where the money is going, and better allocate funds matching the business's priorities.

Financially: the goal of LibraryA is to outlive ProjectA. There's no financial incentive for ProjectA to think long-term. (And if ProjectA is thinking long-term, then STOP THINKING LONGTERM!! A huge amount of code is short-term glue logic with bad specifications). Aligning your technical goals with your soft-managerial goals is key to survival.


I'm not sure what you're finding harmful when it sounds to me like you're agreeing. Your "library code" (or library-esque elements) is the code that you find needs to be tested more heavily, because it is library code? I don't think whether it's packaged separately was key to that, but it can help.


Anyone can be a library developer! Unfortunately no one pays library developers, so nearly every programmer is paid to be an application developer.


The economics of application vs library programmers is a managerial decision.

Any application project that starts writing library-style code will slow down, and possibly gain the scrutiny of the managers / executives. In "Large Scale C++", Lakos notes the importance of manager-level buy in. You must have a library-team, with a separate stream of money, if you expect to be sustainable.

Application-style programmers trying to make a library on their own will lose the economic game.

--------

"Who pays for the library??". Well, if your organization has 10 teams, all using a specific library (or that could benefit from a hypothetical library), its actually a very difficult question. This is where hierarchy and management must step in to solve the issue. No sane manager wants to spend their own money helping other teams with no recognition!

Trying to write a library without solving the long-term economic and managerial issues is suicide. You'll only slow down your own application and make your own project look weaker compared to your peers (at least in the short term. In the long term, the library code will blossom and make your team more productive. But surviving the short term is a major concern in reality)


> Unfortunately no one pays library developers

Shameless plug: We're hiring at talkjs.com. Our entire product is essentially a library.

Also, back on topic, our tests never go stale because we promise full backward compat to our customers, no matter what. Solves a big part of the downsides of TDD or other approaches that yield lots of tests.


I was going to write something similar, but I think you have said it better than I would have.


Sadly, in this day and age of development and the need to constantly ship things, TDD has been dead for a long time. In my 12 year career, I have heard lots of people talk about test-driven development, but I've never seen it in a workplace (at least none I've worked at).

On my own personal projects I have dabbled with TDD and I've seen the benefits it can provide, but it does make even simple programming tasks take a lot longer. Sadly, companies these days (especially during the pandemic) can no longer afford the luxury of development taking longer, even if it does mean the end result will most likely be cleaner and have fewer bugs. The company I work for sees shipping potentially buggy code, and fixing it as bugs are reported, as an acceptable development practice.

With the advent of automated builds and deployment processes, it is way too easy to quickly ship code and roll back bad releases or push out emergency patches. Things don't have to be perfect the first or second time around. The optics for non-technical executives are a lot better when they see code go out and features released than when they see things take longer to develop.


That should itself be rather telling. TDD is extremely well known, in my experience. I bet if I stuck a microphone under the nose of random passersby at any development convention [1], 90%+ will be able to tell me what 'TDD' is short for, and even if they don't, they'll have heard of the concept at least. Hell, I bet most would tell me they 'aspire to do it'.

And yet, more or less nobody does.

So either it is next to impossible to begin doing it (seems like a bizarre conclusion), or, perhaps more likely, nobody wants to do it, and the few dev teams that do manage to do this have not managed to turn that into a competitive advantage. Which makes the value of TDD rather questionable based on simple evidence.

To explain this observed behaviour that TDD is clearly not a competitive advantage[2], I can name a million pet theories. But without going into any of those, the sheer fact that it's __this__ rare in practice says a lot, no?

[1] I mostly go to java related ones, maybe it's less well known amongst other communities.

[2] What other explanation is there? Clearly not 'ah, but, TDD is brand new and you have to give it some time for teams to get familiar with it, and for the concept to percolate through, maybe wait for tooling support to catch up' - TDD's quite an old concept!


One analogy that has come to mind for me is TDD is like eating a whole food diet with a ton of vegetables and no processed foods.

When you're doing it, you feel great, or at least you convince yourself that you feel great because you know you're supposed to. But ultimately it's hard to keep it up. It gets tedious, it requires a lot of willpower, and the benefits are a little too abstract or far removed.

At the end of the day, most people will be just as healthy with a more moderate approach. Just eat some vegetables, not too much sugar, and keep a reasonable limit on total calories. In the programming analogy - just write some tests for important functionality, and be careful that the tests you write are actually working, but you don't need to be fanatical about writing the tests before the functionality or having 100% coverage.

And then as with diet, it's also possible that if you're aiming for the moderate approach you end up slipping too much to the other extreme and writing almost no tests. For some people a structured diet is necessary to keep them on track, and I suspect for some people TDD can be a more effective tool than it is for others if their default coding style tends toward more sloppy/unstructured.


Exactly. I think most experienced developers know about TDD, maybe they have even tried it on a personal project. But selling it to a company that has commitments to investors, paying customers, and an executive team who might not all have technical backgrounds is hard. How do you explain to people who don't get it that things will take longer?

One of the biggest problems with TDD is that it kind of relies on having clearly defined specifications. I don't know about you or other people here, but I've worked in a lot of places (many even called themselves Agile) where the work was not properly scoped at all. If you start doing TDD and the scope isn't clear, the goal posts keep on moving and things just perpetually take longer.

I think it's all about cost in the end. It's cheaper for companies to ship buggy code and then iteratively patch the bugs. Unless they're massive showstopper bugs (which normal tests should be catching anyway), it probably still comparatively works out cheaper to fix bugs as you find them.


> One of the biggest problems with TDD is that it kind of relies on having clearly defined specifications

That's very true. I think it's an easier sell when you have a relatively stable set of specifications (accounting software core logic) where the rate of change is low but the cost of regressions is high. But I think you can tackle the same problem space with good unit tests instead of enforcing a test-first mindset.

In situations where work is vague, spending more upfront time thinking about architecture is much more useful (design docs?) imo. Having a set of units for a crappy, entangled system is pretty costly.


> I don't know about you or other people here, but I've worked in a lot of places (many even called themselves Agile) where the work was not properly scoped at all.

I can see this argument for large-scoped scenario based integration tests, definitely. But TDD unit testing operates on a much smaller scale, that of a single patch, or a Jira ticket.


"But, selling it to a company that have commitments to investors, paying customers and an executive team who might not all have technical backgrounds makes TDD a hard sell"

It's an incredibly easy sale. The whole basis of TDD is that it's an approach that makes your development efforts faster, with higher quality. A million graphics of the amount of time that development spends fixing errors are what sold TDD to the masses.

The theory of TDD is exactly what sells to the suits and the money counters.

The theory doesn't mesh with reality, though, and it's that engagement with the enemy (reality) where TDD falls down.

As an aside, in my own career I've seldom been able to incorporate TDD because each project has been novel enough that trying to define tests up front was just not possible. Yes, if I was implementing the re-invent-the-wheel "sum two numbers" type example, it's trivial. But most of the time it's a vague API for a vague need on an uncertain technical foundation, and until the clay had taken form we really weren't sure what we were dealing with.


> To explain this observed behaviour that TDD is clearly not a competitive advantage[2], I can name a million pet theories. But without going into any of those, the sheer fact that it's __this__ rare in practice says a lot, no?

Short answer: TDD is hard

Long answer: It takes a very experienced engineer to start and lay out a piece of software so that it's easy to write and maintain tests. IMO, this is one of the advantages of server-side rendered web applications; it's much easier to write a test that verifies HTML than a test that verifies the state of a Windows / Mac application, or even the state of the DOM in a rich in-browser application.

But, continuing on that train of thought: It's much easier, as a developer, to write a test that tests an API, (either programmatic API or web service) as a unit than a giant monolith.


The only circumstances in which I can see writing tests against HTML directly is if you're doing a move or a migration.

I did exactly this for a hosting company back when Windows Server 2003 was getting EOL'd. They setup on 2012, migrated the sites over, and then my script would pull from both versions and see how much variation there was. Even then it was mostly just a diff.

Otherwise that's the equivalent of testing implementation details.


Write a test that:

1: Stick an entity in the database
2: Try to get the page for the entity without logging in. Verify the error page
3: Log in as someone who doesn't have the entity. Try to get the page for the entity. Verify the error page
4: Log in as someone who can get the entity. Verify some basics of the page contents

Write a test that:

1: Posts the contents of the form without logging in. Verify the error.
...
4: Log in as someone who can post the form. Post the contents of the form. Verify the contents are in the database.

Tests like above meet the higher-level post's definition of a "unit test" and are rather easy to write. You don't even have to be in the same language as application. They're simple enough that developers can write them as part of their workflow, and simple enough that developers can run them to ensure no regressions.

Trying to do something like that with a thick UI application is much, much harder. Possible, but not easy.
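
For a rough idea of what those tests can look like in Go, here is a stripped-down sketch; the routes, auth check, and markup are all invented, and a real app would drive its own router and session handling:

    package web

    import (
        "net/http"
        "net/http/httptest"
        "testing"
    )

    // newApp stands in for however the application wires up its routes.
    func newApp() http.Handler {
        mux := http.NewServeMux()
        mux.HandleFunc("/entities/42", func(w http.ResponseWriter, r *http.Request) {
            if r.Header.Get("Authorization") == "" {
                http.Error(w, "forbidden", http.StatusForbidden)
                return
            }
            w.Write([]byte("<h1>Entity 42</h1>"))
        })
        return mux
    }

    func TestEntityPageRequiresLogin(t *testing.T) {
        req := httptest.NewRequest(http.MethodGet, "/entities/42", nil)
        rec := httptest.NewRecorder()
        newApp().ServeHTTP(rec, req)
        if rec.Code != http.StatusForbidden {
            t.Fatalf("expected 403 for anonymous request, got %d", rec.Code)
        }
    }

    func TestEntityPageRendersForAuthorisedUser(t *testing.T) {
        req := httptest.NewRequest(http.MethodGet, "/entities/42", nil)
        req.Header.Set("Authorization", "Bearer test-token")
        rec := httptest.NewRecorder()
        newApp().ServeHTTP(rec, req)
        if rec.Code != http.StatusOK {
            t.Fatalf("expected 200, got %d", rec.Code)
        }
    }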


Here's my concern with that approach.

It implies to me that the absence of an error is a success. In other words, if you don't find the HTML you're looking for that represents an error, then it's considered a success.

The problem is that at some point it's possible for the page to be redesigned (or the entire site) such that the HTML you're looking for that represents an error is no longer there for a legitimate error.

IOW, the tests are not future proof. But this is the general problem with looking at HTML, it's fragile.

That doesn't mean I don't see the need, I understand the goal, I just don't agree with this approach.

I think if it were me I'd do 1 of 2 things.

1. rely on HTTP codes (403 is a reasonable response for what you've described), or

2. rely on HTTP headers. And I would argue it's better to set them on success and consider it an error anytime the header is missing.

Obviously redesigns don't happen that often so it's not a super large issue, probably not worth rewriting anything but I do think if I were about to implement that functionality I would try and use HTTP codes or HTTP headers.


> Verify some basics of the page contents

Implies that you're looking for the contents that are supposed to be returned.

When there's an error, the contents you're looking for won't be there, therefore the test will fail.

Furthermore, HTTP testing libraries typically require you to specify required HTTP status codes.


- "you can, for example, examine HTML in your tests"

- "that sounds super fragile, there's only a few circumstances in which I would that"

- "Here's the circumstances in which I'm doing that"

- "Here are the problems with what you're doing, and here are alternatives that don't involve examining the HTML"

- "I'm not examining the HTML!?!? Whatever gave you that impression!"

---

Not worth my time.


Well I do TDD around 85% of the time and I'm doing pretty well financially as a dev. Sometimes it isn't possible to do TDD, like when I'm iterating on a data science algorithm or exploring data and relationships before settling on a final approach. But you're right, most devs I know don't do it, though I've noticed the ones that do generally make more.


> I’ve noticed the ones that do generally make more.

The market may be rewarding the TDD skill; that makes sense. But consider that the causality makes sense the other way around too: anyone with the patience, eye for detail, and organizational skills demanded by TDD is probably an above-average programmer anyway.

Personally I like a “don’t forget the tests” approach for larger units of business logic, but I’m against exhaustive micro-test suites that just turn into bespoke, buggy, and useless-at-compile-time type systems.


oh please, this is the programmer version of "if you're reading this blog you're already in the top X% of <insert industry here>".


The top 50% are capable of actually doing TDD? I don’t see why that’s impossible, especially if they are the ones implementing it, not just pushing tickets along within it.

And I don’t really follow TDD practices myself, so this would say nothing about me.


Careful with this reasoning. It could be that those who insist on TDD are simply the loudest or most confident or whatever, and therefore are more often in a position to advance themselves. "Their stuff works better" isn't necessarily the link between "TDD" and "make more".


Interesting. You're defending and advocating for TDD, and you use it 85% of the time. I've used TDD as well, and I notice it tends to work very well for certain types of workflows. Otherwise, I write a little code, test a little code, back and forth. Almost everyone does this (you write a method, you tend to write a little test to see if it does what it's supposed to). To get high test coverage, you just save those little tests that you naturally write anyway. It's a good way to get high testing coverage without imposing an odd-feeling methodology into what should be a creative workflow.

When the big "TDD is dead" debate hit the scene, I felt that DHH and the detractors won that debate big time - even though I use TDD and consider it to be a valuable tool.

Why would I think this - after all, if I like TDD and often employ it, why would I agree that the people who claim it is "dead" have won?

Probably because the debate had become relatively extreme:

https://blog.cleancoder.com/uncle-bob/2014/05/02/Professiona...

"If I am right… If TDD is as significant to software as hand-washing was to medicine and is instrumental in pulling us back from the brink of that looming catastrophe, then Kent Beck will be hailed a hero, and TDD will carry the full weight of professionalism. After that, those who refuse to practice TDD will be excused from the ranks of professional programmers. It would not surprise me if, one day, TDD had the force of law behind it."

Whoah! And keep in mind, people were acting on this. I read a blog (sorry no cite) from a highly regarded developer who worked as a contractor to restructure software dev teams, and he came flat out and said that if people don't adopt TDD, they won't be employed for much longer wherever he works. I got the feeling that even questioning it would be considered grounds for "no hire".

That mentality is what I think is "dead" - and this is evidenced in the fact that you and I (who do quite a bit of TDD) acknowledge that it is not something we always do and that it isn't always possible or desirable. I got in wicked arguments about this at work, and I'm almost certain I was dropped from consideration for jobs during interviews where I argued the now more moderate position which, ironically, makes me one of TDD's defenders in 2020!


Absolutely, but I still prefer the interfaces that come out of doing proper TDD. I minimize the use of mocks and try to pull in as much of the infrastructure as possible. I'm pragmatic though. I'll use fakeredis in tests instead of actual redis so the tests can run concurrently without quirks.

It's hard to communicate as a developer at times. We're all learning different things at different times and taking things to an extreme can cause bad outcomes for the rest of the team. One of the reasons I try to be accurate during online discussions. "I do TDD 85% of the time" is a better thing to add to the conversation than "I do TDD" since it gives people the idea of nuance in software development methods.


There's a distinction between TDD and writing tests.

I personally find TDD paradoxical in that it's often used as a way to gather your thoughts, yet the more exploratory the work the more TDD hurts.

Tests are very very useful when wrapped around the core logic and data structures of your application. But that doesn't mean they need to come first.


"But that doesn't mean they need to come first."

Curious - did you hold this opinion around 5-10 years ago, and did you voice it to any TDD advocates at the time? My experience was that a statement like this (which I consider immensely reasonable) could kick off really severe arguments, and could get you no-hired in interviews.

There may have been a bit of a Motte-and-Bailey argument going on, where TDD advocates would retreat to the value of tests when defending TDD, and then go back to insisting on TDD once the coast was clear.


There have been several phone screens in my life in which as soon as we started speaking about TDD and/or agile, they completely lost interest.

But as far as I'm concerned that's a good thing. I write tests where it makes sense, and I can think of 1 project where I eventually regretted the decision not to write tests (before or after), but even then it didn't really hurt; it just would have been a lot less scary to have a good suite of tests around some specific functionality (related to pulling emails, where doing the wrong thing could mean permanently losing emails without processing them).


>So either it is next to impossible to begin doing it (seems like a bizarre conclusion), or, perhaps more likely, nobody wants to do it, and the few dev teams that do manage to do this have not managed to turn that into a competitive advantage. Which makes the value of TDD rather questionable based on simple evidence.

Or the competitive advantages it offers do not align with the competitive advantages our society currently selects for. These days it seems that short term profits are the primary factor selected for and that users tend to go with the first to market until it has a significantly worse experience.

>But without going into any of those, the sheer fact that it's __this__ rare in practice says a lot, no?

It's a bit like how good security is a rare practice, as good enough security is much cheaper and the cost of security incidents is generally not as large as they should be (for example, data breaches have most of their costs pushed onto the victims of the data being leaked and not the collecting company that was breached).


> So either it is next to impossible to begin doing it (seems like a bizarre conclusion), or, perhaps more likely, nobody wants to do it

Or it is an incredibly good procedure for a tiny minority of the projects, but sucks for the vast majority of them.


Tests are like Haskell, it seems.


IMO, TDD really depends on which kind of code you're writing, which language, which tooling...

I just finished a quite large collection of parsers and did it entirely using TDD, got 100% coverage and it was much easier than when I did it without it in the past. I kinda HAD to get 100% coverage because how else would I test my code? I only wrote the part that read from files at the end of the project.

Also, when I was writing code that communicated with banks or telcos in arcane COBOL-era protocols, I didn't really have a way to "test" my code other than in production, so I relied on TDD for my day-to-day coding. It worked fine.

For GUI stuff, or web development? I used TDD in the past and didn't gain anything for it.

This should be obvious for everyone, but TDD only works when it works... it's not a silver bullet.


I think your post gets to the core point. My view is that 90% of the code is fine without tests (assuming a crud app and some way of catching type errors). It's just plumbing to get data from a to b. There's around 10% that it's important to test, the core logic.

I wrote a pdf parsing library and it's a joy to test and the tests give you a lot of certainty the code works because the unit is the library itself, the input is the file and the output is well defined.

But I really am beginning to dislike testing most code in a web app. If you have a pure calculation or some business logic, test away. For everything else some higher level tests give far more assurance, when you're moving fast those might even be primarily manual. A good rule of thumb I think is that a test using mocks is a bit of an anti pattern, they make you feel like you're testing code when generally you're testing your test.

The typical test pyramid is upside-down.


What people usually don't realize is that an automated test is a "contract", and it works best for something that has a specification.

Libraries usually have specifications. Web GUIs? Not so much.


> But I really am beginning to dislike testing most code in a web app. If you have a pure calculation or some business logic, test away. For everything else some higher level tests give far more assurance, when you're moving fast those might even be primarily manual.

Yes, I completely agree!

It took me a while to get used to writing code like this, but having the business logic completely unbraided from the interface/database code is probably the biggest productivity boon I've ever had, because my tests run blazingly fast. People don't think it matters, but instant feedback makes a lot of difference.

This is also how things like DDD, Hexagonal Architecture, Functional-Core-Imperative-Shell are structured, so we're not alone. It doesn't have to be as complex as some of those: it just has to be easily testable...
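For what it's worth, a minimal Python sketch of that separation (the pricing rule and names are invented for illustration):

    # Functional core: pure business logic, no database or framework in sight.
    def discounted_total(prices, loyalty_years):
        # Hypothetical rule: 5% off per loyalty year, capped at 25%.
        discount = min(0.05 * loyalty_years, 0.25)
        return round(sum(prices) * (1 - discount), 2)

    # This test needs no setup and runs in microseconds.
    def test_discount_is_capped_at_25_percent():
        assert discounted_total([100.0], loyalty_years=10) == 75.0

    # Imperative shell: the only part that knows about the database.
    def checkout(db, customer_id):
        customer = db.load_customer(customer_id)  # I/O stays at the edge
        return discounted_total(customer.prices, customer.loyalty_years)

Only the thin shell needs an integration test; the core can be covered exhaustively without ever touching I/O.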

-

> A good rule of thumb I think is that a test using mocks is a bit of an anti pattern, they make you feel like you're testing code when generally you're testing your test.

Oh, I couldn't agree more.

My personal pet peeve is having to write unit tests for thin-controllers or Spring-style services. I take an hour to carefully mock 10 deep dependencies, and in the end all I'm testing is if the language is able to call methods in other classes.

Instead, if I just write the silliest integration test, I'll probably get better coverage and it will be able to uncover more bugs.


I treat it like a tool. I use TDD when I encounter a problem that is complex enough that I cannot consider or remember all of the possible inputs/outputs during development. Getting all of those dumped into a test is the best way to reduce cognitive load.


> The company I work for see shipping potentially buggy code and fixing it as bugs are reported as an acceptable development practice.

That's what I've heard called "trading external quality" for time to market. That is exactly how you're supposed to do it, at least according to the dev coach I've been listening to (@GeePawHill).

The reason you write tests and refactor is to keep your code's internal quality up. Those are the things that customers can't see, the things that make it harder for you to change your code when you come back to do more work later.

The Correlation Principle states that internal software quality (ISQ) and developer productivity are directly correlated, and that you cannot trade internal quality for time to market without the trade-off quickly coming back to bite you in the form of lost productivity.

A feature that is only partially implemented, or an edge case that isn't tested properly and causes something to fail at runtime, are both examples of external quality. You can indeed come back and fix that later, and you can make this trade cautiously to get your work to production faster.

But if you write a long method, for example, that "just works" and ship it without further introspection, refactoring, or complete test coverage, you may never recover from that. It might cost devs an extra hour or more every time that method needs to change in the future. At some point it becomes so long and complicated that nobody can fully understand it; the time to stop digging is as soon as you notice you're inside that hole.

And according to Sandi Metz, those big long methods are almost always the ones that will need to change again: they are long and complicated precisely because they model the business concepts you care about. They should be well factored as early as possible to facilitate future changes (unless you have a crystal ball and can say for sure that won't ever need to happen!).


That seems to be a very simplified model of what actual bugs look like in the wild.

> You can indeed come back and fix that later, and you can make this trade cautiously to get your work to production faster.

I think that is only true in certain companies of a certain size. If you're a small startup trying to launch an MVP, yes, time to market will be prioritised over everything.

If you're in a regulated sector, or your company is any larger than e.g. 30 people, "going back and fixing it later" gets harder and more costly than shipping correct software in the first place; either you'll damage your reputation, you'll have a deadline to implement the fix (unnecessarily adding pressure on your own team), or other teams will start relying on the wrong behaviour and your fix is no longer trivial because it doesn't affect only a local scope.

I want to highlight this because there's a trend of thinking that any company and any project can (and should!) be managed as if everything being done is a prototype, with the perception that this is "agile", which ends up picking the worst set of trade-offs for the task at hand.


I've just been learning about Wardley Mapping, and Simon Wardley says you're right too. The point wasn't only that you CAN trade external quality for time to ship, but that anyone who tells you that you can trade internal quality the same way is probably wrong and should be disbelieved, as that trade isn't likely to work for long, if at all.


Wardley Mapping seems like an interesting idea I never heard about - thanks for sharing.


I took the $25 class that you can find easily, and I thought it was very worthwhile! It helped me understand exactly what you said, but with some formal structure. Some teams are mad scientists, some are pioneers, some are settlers, and some are town planners.

Some tasks are suited better for mad scientists, ... some for town planners. Some regulated environments are only suited to the type of work that town planners can do (6-sigma people, that's how it's also explained in the class.)

But the settlers won't have much to do if pioneers haven't done some heavy lifting first, and so on (thanks a lot, mad scientist,) and the town planners can't do the type of work that you need from them either, unless there have been many layers of groundwork laid before them.

Glad you enjoyed checking it out!


Factoring a long and complex method doesn't magically make the process it's doing less lengthy (unless there is a lot of repetition) or less complex. In fact it's likely to make it longer and more spread out so harder to understand from scratch. When I encounter such code it's far easier to change if you keep it together and document it well so future people can find out what it's doing and see the whole in context.

For future changes, factoring also requires a crystal ball since you have no idea how requirements will change or if your factoring will actually be useful for them. Better to keep things simple and contained until you actually need to make them more complex.


I agree! When the method is too hard to unit test (for an experienced unit tester) it's frequently a sign that it's not single responsibility anymore, or that there may be other problems in the design worth sussing out. But there is no substitute for experience and a sharp eye, when you know what you're doing and have confidence about what is most likely to come next, it's almost like having that crystal ball. Rules are also made to be broken.


This summarizes my experience to a tee. Early on I've found that little compromises on internal quality come back and bite you almost right away. Compromising on that only makes sense if you're 'building one to throw away', in which case rapidity of learning is probably the priority. The problem is that the decision makers almost never want to actually throw it away when the time comes.

I've been doing TDD a long time (18+ years), and I'm definitely pragmatic in the sense that I don't test everything, definitely don't aim for 100% coverage, and don't use mocks often, etc. I aim to extract most of the value from the tests without paying an absurdly high tax on them. Still learning in that regard, but I have had some real successes in testing.

I find this is key in our broader team's ability to keep the internal quality high. It's obviously easy to mess up a code base, but (high quality!) tests have been an important part of that, in addition to refactoring, code review, etc.


I recently tried TDD. It was fun for a while, but it took too much time writing tests for mundane things. I am not convinced about 100% coverage etc.

For me the biggest takeaway was the way I was organizing my code. I work on a 15 year old web application with lots of legacy stuff, so it isn't easy to make changes fast. New code that I write depends on old code, so I am constrained a bit. Even still, I found marked improvement in my code structure after trying TDD. I do not know whether I'll stick with TDD, but I definitely will remember the code organizing lessons it taught me


It's usually counterproductive to strive for 100% test coverage. You wind up with nonsense tests for default getters/setters, and occasionally you face things that simply CAN'T be 100% tested due to how a 3rd-party library works.

The most aggressive shop I've ever seen set their bar at 95% coverage.


In my last job we had 100% coverage, but if something was trivial we'd annotate so the code coverage tool would ignore it. It meant every decision not to cover something was documented, and visible to the code reviewer
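For example, with Python's coverage.py that kind of documented exclusion looks roughly like this (the class and method are made up):

    class Order:
        def __init__(self, order_id):
            self.order_id = order_id

        def __repr__(self):  # pragma: no cover
            # Trivial debug helper; deliberately excluded from coverage,
            # and the exclusion is visible to the reviewer right here.
            return f"<Order id={self.order_id}>"

The annotation keeps the report at 100% while leaving an audit trail of what was judged not worth testing.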


I don’t think it’s “this day and age”. I think the reality is that TDD has never been the norm (even the article we’re lamenting over is 6 years old), for the reasons you gave, but also because it’s just not that good of a pattern. Is it useful? Probably. Is it useful relative to the trade-off in shipping time? Less of an obvious answer.


To me, the value in TDD is the same as testing in general. It's not so much during initial feature development that you see the value, but rather later in a project or during application maintenance. I view it as amortizing the complexity over the life of the project.

Sure, it might take a bit longer to ship a simple feature, but bugs also take away from feature development, and in a complex enough application small changes might result in large bugs. I guess that's true of software testing in general, but I think most developers would say there's value in having tests.

Often, the tradeoff in shipping quickly is technical debt that impacts your ability to ship quickly in the long run. I think TDD can help manage that, especially if you're planning to write tests for a feature anyway.

It does depend on what you're working on, of course, but I think most companies over-estimate their need to rapidly ship features. In the grand scheme of things adding a week to a 6 week project isn't that big a deal and might even save time in the long run.


> In the grand scheme of things adding a week to a 6 week project isn't that big a deal

mmm maybe once. But that's a compounding delay. If every project is delayed by ~15% for TDD then over the course of years you can fall well behind your competitors.

> complex enough application small changes might result in large bugs

Totally agree - but on really tightly TDD-tested code bases, small changes can also result in huge test refactorings...many times to the point of just not doing something because the time to update the tests is prohibitively expensive.

There's a balance to all of this, and (empirically) TDD seems to be on the extreme end of the balance.


TDD is one of the many things this industry does to have the facade of rigor without actually developing the methodology sufficiently to move beyond a facade. It's driven largely by inertia and arguments from popular figures in the industry.


All of this is spot on, especially the second point. As someone who lives with ADHD, TDD is hell incarnate for my productivity. 'Let's write a test for what will be an Observable and account for failing, passing, bad data, empty data sets...oh wait, what was the original task?' It's so unbelievably monotonous that it's REALLY easy to lose track of the original task. Worse, sometimes I'll be doing something and think up an additional or better way to accomplish it, which creates real-time tech debt even faster. My SO has OCD, and seeing how differently we think in action makes it super clear to me that things like TDD are probably heaven for some people.


I have ADHD also. I find TDD to be very helpful, as it encourages me to break the problem into the smallest possible next step.

That said, I've found the most boost from the full XP menu -- pair programming, TDD etc. I can get a lot done with halved daily ritalin intake vs working alone.


The key to TDD is to write tests that evaluate on the order of 10ms. A whole test suite should take less than 1s.

Only this way does TDD not interfere with development velocity.


What a dream that would be. But unfortunately not super realistic on a project of any real size.


It's pretty easy to get those numbers if your business code isn't intertwined with your UI/database/MVC code, or if your core code can be tested in a pure manner (like the "core" of an image editing software, or parsers, or serializers/protocols, etc).

Of course, if all you have is a basic CRUD app, then there's not much testable "core" code that you can separate from your framework, so TDD and unit tests are probably not the best idea. IMO end-to-end and integration tests are the way to go anyway.


There are ways to structure your code so that this can be done. Of course, it would require a complete refactor to achieve full test coverage on an existing project, but if you build like this from the beginning, it's feasible.


Is there any literature on this? A test suite that runs in 1s is...kinda crazy sounding on any non-trivial codebase.


Gary Bernhardt has a pretty good talk: https://www.youtube.com/watch?v=RAxiiRPHS9k


Pretty sure the parent comment was talking about the time to write the tests, not the time to execute them.


The issue with slow execution is that you don't run the test suite often enough.

I like running my tests almost as often as I compile my code. If they run fast enough, it creates the most addictive feedback loop for TDD.

Besides, we should be measuring the time to write tests against the time it takes to manually test. If you don't execute tests every time you make a change, you're not really sure if your code works. And manual testing just means we have no idea whether or not our code works.


Test driven development is most useful when there are multiple contributors. The added work time doesn't pay off when I am the sole architect/implementor.

If I was to be an architect only for a system though, I'd likely appreciate TDD to ensure implementors do what I think I spec'd


TDD for a complex domain generally means (at least in an OO design) you cannot do exploratory coding over what objects you have and what decisions you will make.

After all, you have to write the tests for the classes, before the classes exist.

When you do so, then start to fill in the code, and then realise you need to refactor, you have a heap of tests you need to refactor too. It always appeared to me highly inefficient and predicated on the assumption that the author can express his class hierarchies well ahead of the coding.

In my experience, the class design is highly iterative, even the names of methods, what methods you have, etc - to have to write the tests before you've written the code just creates a huge amount of impediments to the flow of expressing a solution.

This is not a criticism of unit testing - but of TDD.


In the version of TDD I've been exposed to, you only write the minimum number of tests to cover your next feature. It's still iterative.

I do more TDD on exploratory work than something fully formed because I can work in a narrow scope without having to grasp the whole project.


If you only know what the feature is, you're prepared to write functional and/or integration tests, but not unit tests. Unit tests closely wrap the details of the implementation.


An important requirement of TDD is that you decouple the functionality from the implementation, for exactly this reason. You should be able to completely rewrite your implementation from scratch, and only minimally update your tests. And the ability to do that is exactly the benefit that TDD can bring.

I find red-green-refactor MOST helpful during exploratory coding. There's a principle that Uncle Bob calls "Don't go for the gold": "stay away from the center of the algorithm for as long as possible. Deal with the degenerate, trivial, and simple administrative tasks first." I build up to the complex final functionality one small step at a time. At no point do I need to architect the whole kit and caboodle at once, then re-architect, then re-architect. It's all just aggregate improvements and refactoring what's already there.


Your quotes from Uncle Bob to my ears sound like trite nonsense. Moreover, name-checking him and then giving quotations of what he said seems positively religious.

My basic argument with agile is this - can it be measured that these things are good? Or is it just a pile of piffle built on anecdotal evidence - that people get paid thousands of dollars a day as consultants to espouse?

The industry has a long legacy of this - e.g. designing your classes in UML will solve your problems, or the Rational Unified Process, or prior to that, in the early 90s, Rational Rose diagramming experience was the "must have" on your CV.

One of the more reasonable analysis tools was Use Cases in my view, but most of it is just a bunch of noise, busy-making tasks which Andersen Consulting were more than happy to provide 100 people doing these tasks and bilk some client 100m for the pleasure.

There's no substitute for high quality, intelligent people self-organising. People that want to proceduralise developer behaviour are basically people that want to steal our freedom to be creative and effective, in the manner which befits ourselves as individuals. And make themselves ghastly rich doing it.

I don't see Uncle Bob as all that different, having hung out in the Agile evangelist scene in London for a while - they are all on the lam.

In my opinion.


My experience as well. There is usually not enough information available to write truly useful tests at the point TDD wants you to.


TDD is meant to be done on public interfaces. So TDD is meant to let you design interfaces from the POV of a consumer of those public interfaces.


This is high falutin’ theory.

Many user interfaces don’t have easily testable abstract interfaces, and user features are first tackled by expressing changes to a user interface.

You really have to jump through hoops to make the reality of coding meet the theoretical model.


We're not talking about user interfaces. It's what the user interface calls.


Frankly good type systems displaced TDD for me. They do a better job getting at the goals of TDD than TDD does. Types in good systems (OCaml, Rust, Typescript, Scala, Haskell, others) are 100% laser focused on depicting good public interfaces and laying out their behavior.

Sometimes you need a test or two to really nail down this behavior. That'll give you a TDD-like flair. But it's not the same because you've already invested so much productive thinking time into the interface driven purely by the types. At that point, TDD is good hygiene, but not a transformative practice.


For js->ts, this is right on the money.

When I started writing unit tests way back when, half the tests were just checking what happens if you give the code strange arguments. TS now does that job for me.

Another thing that has cut down on the number of tests is switching to pure functions for most things, and trying to isolate side effects as much as possible, so you really don't need to write many tests around them. If the function is only one or two lines, types are typically enough to catch potential issues.
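A tiny sketch of that isolation in Python rather than TS (names invented): the side effect stays at the edge, and the typed, pure function barely needs tests beyond what the type checker already rules out:

    from datetime import date

    # Pure and typed: one or two tests (or none) are usually enough.
    def is_expired(expires_on: date, today: date) -> bool:
        return today > expires_on

    # The only side effect, reading the clock, lives at the call site.
    def check_subscription(expires_on: date) -> bool:
        return is_expired(expires_on, today=date.today())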


It's one of the things that really frustrates me with the ASP.net team. They're obsessed with unit testing, and have made really stupid decisions that mean tons of stuff now gets injected instead of just passed, all in the name of unit testing.

But C# is a typed language and generally doesn't need reams of unit tests, so all it does is make the code unnecessarily complicated.

My big bugbear was when they idiotically made the config injected. Of all the things that should be super simple to use, and definitely not injected, it's config. It only changes between environments.


> Frankly good type systems displaced TDD for me. They do a better job getting at the goals of TDD than TDD does.

I think it's a false dichotomy. Where you have types, use types. Where you don't, use tests. Either way, get a tool to guide you to the answer and then check the invariant during future changes.

In fact TDD made me more aware of and appreciative of types. Writing types-by-hand is very tedious and wasteful.


Writing types by hand is not any less wasteful than writing documentation.


There is a great talk by Gary Bernhardt exploring this exact view. It shows how types and tests actually do not solve the same problems and are not interchangeable.

https://www.destroyallsoftware.com/talks/ideology


I didn't RTFA from Gary Bernhardt, but this has been my experience. I love types, but it's not like good types replace good tests. They just eliminate some states from the search space.


Over the years I have some notes about TDD:

* Writing tests first is simply unnatural. Far more developers need an exploratory session just to discover what they are supposed to do.

* That's because our spec is always vague and no amount of Agile process can help it.

* If you have a bunch of idempotent functions, they are the easiest to write tests on. So I would start there.

* Some languages are easier to do unit tests on, for example: Go. It gives everything you need: test runner, benchmarker, data race checker, unit test struct is part of std library, etc.

* This DSL: `describe(){ it "should work" { result.should eq true } }`, is nuts and counterproductive (a plain-function alternative is sketched after this list). People are already not writing tests and you are adding more friction?

* To mock or not to mock? This is a controversial one, but in my opinion it is simply practical to test directly against the database you are using. Your CI/CD should just prep the database you need instead of mocking. Once again, people are already not writing tests, don't add more friction.
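For contrast, here is the plain-function style mentioned above, sketched with Python's pytest (the function under test is invented and defined inline to keep the sketch runnable): the test name carries the intent and there is no describe/it ceremony.

    # Hypothetical function under test.
    def apply_discount(total, returning):
        return total * 0.9 if returning else total

    def test_returning_customers_get_ten_percent_off():
        assert apply_discount(total=100, returning=True) == 90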


> * Writing tests first is simply unnatural. Far more developers need an exploratory session just to discover what they are supposed to do.

Fully agree with this. It has been my experience.

To quote Mat Ryer from Go Programming Blueprints

> The first time we write a piece of code, all we are really doing is learning about the problem and how it might be tackled as well as getting some of our thinking out of our heads and onto paper (or into a text editor). The second time we write it, we are applying our new knowledge to actually solve the problem.

https://neillyons.io/the-art-of-writing-is-rewriting/


> Your CI/CD should just prep the database you need instead of mocking.

This is a hard one. A lot of tests require setup steps even if you're using a real database, so it's not that different from a mock. And if you can't give each test its own database to work with, then you have to run your tests serially. I'm working on a codebase that spins up 4 separate database engines in Docker, and each test has quite a few setup steps. The result is brittle tests that take hours to run, so they're not very useful as a release gate. I've been bitten by mocks before too, though, so I don't have good answers.


We did that with transactions. Every test gets its own view of the database and rolls its changes back when it's done. These tests take much longer than unit tests, but it's worth it in the end: writing them is easy, there is no need to create mocks, and you also verify that your DB queries actually work.
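A minimal sketch of that pattern with pytest and Python's built-in sqlite3 (the schema is invented; a real setup would point the fixture at the same database engine the app uses):

    import sqlite3
    import pytest

    @pytest.fixture
    def db():
        conn = sqlite3.connect("test.db")  # stand-in for the shared test database
        conn.execute("CREATE TABLE IF NOT EXISTS users (name TEXT)")
        yield conn                         # the test runs against real SQL
        conn.rollback()                    # undo whatever the test wrote
        conn.close()

    def test_inserting_a_user(db):
        db.execute("INSERT INTO users (name) VALUES ('ada')")
        assert db.execute("SELECT COUNT(*) FROM users").fetchone()[0] == 1

Each test sees its own uncommitted changes, and the rollback in the fixture wipes them before the next test runs.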


> Far more developers need an exploratory session just to discover what they are supposed to do.

to flip this: write tests to help discover what the thing is supposed to do


This idea doesn't make sense to me. Mind elaborating?


Common advice when designing an API is to experiment with using the planned API before you implement it.

TDD is one way of doing this exploration, where the exploration is codified into actual code using the API and including assertions about the behavior you intend to implement.


sure! write a few tests pretending your thing is already implemented, to capture what you want it to do. at this point it's a step beyond writing no test and just typing `YourThing.Do()` in a text editor. does it make sense? is it awkward? should it even be `YourThing` or `SomeOtherThing`? what the "unit" is of what you're testing might change, or its API might. you're basically just trying to get a sketch of what it's like for the user.

now, at the end of this, you'll have a clearer idea of the external API boundary, probably a clearer vision of how it should work, _and_ code you can test against. you've potentially just saved yourself the labour of writing the thing, realizing it needs to be redesigned, and rewriting it.


I guess the hesitation I have is: I do all that, and then find out the real implementation should have just been something like adding a new field to an enum and adding/adjusting some if statements.

It seems like a giant waste to build an API when there is one I could just extend. But to confirm I can just extend that API, I'd need to first implement the change to see that it works.


right, i'm talking about implementing something new. if you're trying to refactor or alter an existing codebase, it can be even easier: you add another test to an already existing suite.

i don't think i've ever sat down, written tests for a completely new implementation, only to find that i need to add a field somewhere. before i sit down to write tests, i do some preliminary thinking. i'm not saying, write tests without ever trying to think first. but do use tests to flesh out your change from outside of the black box.


> * This DSL: `describe(){ it "should work" { result.should eq true } }`, is nuts and counterproductive. People are already not writing tests and you are adding more friction?

Completely agree.

I have no idea why test libraries are so intent on making the test names readable as English sentences; it feels very awkward compared to just naming test methods.


While I'm not a fan of DSLs as seen in Ruby's RSpec, I do enjoy being able to write regular sentences as test names, provided you don't have to stick with the pattern "it is expected to ...".


Controversial Opinion

TDD for me has always been something to show off at an interview; performing a beautifully choreographed dance of "Red, Green, Refactor" when showing off your skills.

TDD is only useful for me when I know the structure of my code i.e. I am fixing a bug or adding a feature to an existing application.

However when you are "in the dark" or in the early days of developing a new service, TDD can definitely slow you down if you don't know what your architecture is going to look like (maybe that's my fault for not doing enough whiteboarding?)


Your last sentence is beautiful. You're on the journey.

If TDD is slowing a developer down because they don't know what the architecture is, it's a strong indicator that the plan isn't complete.

By the time TDD is being used, it should be as obvious what to do as unpacking a U-Haul truck full of boxes parked in front of an empty house with the door propped open.


> By the time TDD is being used, it should be as obvious what to do as unpacking a U-Haul truck full of boxes parked in front of an empty house with the door propped open.

<sarcasm>By the time TDD is being used, you should already have known way ahead of time which house you were going to move to in the future, and simply had your Amazon orders shipped to that address in the first place.

If you have to move your boxes at all, your design was clearly incomplete originally and you should not have started buying belongings until you knew the final house you were going to live in first.</sarcasm>

Real life is too messy for TDD 99% of the time I find. Unexpected things happen (e.g. you move house) that means you can't know everything ahead of time.


I hear you. But I feel that saying real life is too messy for TDD 99 out of 100 times is an appeal to fatalism.

I believe, in good faith, that more often than one percent of the time, the existence of a good plan is what enables TDD to be the successful demonstration it claims to be.

The trust that we grow as practitioners of good software architecture is what enables us to have these talking points. I believe that we should feel enabled to discuss how to succeed, and less so how to fail.


Yeah sure, you have your clean architecture laid out without having written a single line of code. Then you implement it and it all works out fine, on time and on budget. That's a nice fairy-tale.


Yep, the GP's statement is completely dissonant from reality.

Inverting the development process may be useful for some well understood domains, but certainly not a majority, let alone the amount TDD zealots proclaim.


I think you have an opportunity to reframe TDD in your last sentence.

TDD, like Agile and Scrum, is not a one-size-fits-all solution. These are tools with process that have to be adapted to the people and the organization. That’s why the creators of these practices are declaring them “dead”: people are blindly treating them as written-in-stone rules, and that misses the intention. We made these rules to help ourselves and our organizations... not to overly constrain ourselves... to add more discipline, but leave room for the flexibility that our jobs demand of us.

Have you ever had to write some calling code, to call some other code, to make sure it worked? Don’t call it a formal test, I mean, a chunk of code that calls your other code, and you verify that the expected output is what you thought it would be?

Coding is often a process of discovery. Architecture can be learned but applying it takes discovery and iteration. “Why do I need a certain architecture?”. Well, if you don’t know, code long enough, and do enough projects, and you might start discovering architecture all on your own. Then you might see how what you discover aligns with what others have discovered. Writing code can help you discover the right architecture, as long as you keep trying and keep iterating.

Many of us had to learn and discover everything on the job, because not everything was known - much of it was being invented and discovered while we coded. Is that as efficient as learning known patterns and applying them? Maybe not, but also I believe the only way you can truly internalize the knowledge of a pattern or architecture is by implementing it, even when you don’t fully understand it. Therefore, you need to “test” it out to explore and discover it, to internalize it, to really learn and understand it.

TDD in that respect can simply be seen as a helpful crutch to help you, as you explore, discover, iterate and learn. A “test”, whether it be a unit or integration test, is that chunk of code that calls some other code, so you can see if the other code works or not, especially in a mocked up context of how you think it should work later. TDD is just accepting at the beginning, “I only have a rough idea of what this should do, I’ll define a starting point to call into my code, even code I haven’t written yet, so I can explore and discover both that code’s external interface and also internal implementation, and see results sooner”.

Using TDD is like sketching the blueprint of how you think the architecture should work, or how you want it to work, and then trying to color in the code in the middle, to see if it really does work.

TDD isn’t an absolute. It’s just a helpful practice made by developers to try to help developers... and you will eventually do something like it, even if you don’t know what it is called... because that’s how we all code in the end.

Think, code, discover, test, integrate... and iterate.

TDD just suggests: start in a test, then start thinking, and then continue as usual... and let the testing framework assist you as you explore and learn.


If so, what replaced it?

What methods do teams use instead to ensure that:

1. software works as advertised

2. software can be refactored

If there's "no time" to write tests before code, it seems likely that there will also be "no time" to write them after.

If there are no tests, or test coverage is spotty, refactoring is going to be well-nigh impossible. If refactoring isn't done, the implementation is likely to be brittle. If implementation is brittle and no tests exist, few developers will want to touch the code for fear of it breaking. If no developers touch the code to keep it well-factored, the code will rot.

Now maybe this is the plan all along: code is written to decay and eventually disappear. But it doesn't sound like a recipe for long-term success.

So whenever these discussions about the use of TDD come up, I'm very curious about the specific ways teams address the two points I raised above.


Believe it or not, good typed languages with a lot of compile-time checks are probably one of the main reasons why TDD is not that big. Sure, these won't catch logic errors, but most developers don't write that much heavy logic code anyway. And senior developers are capable of writing mostly straightforward code that is easy to debug.


Type-checking, while helpful, can't always catch the logical programming errors that result in user-facing bugs. If it could, we would have observed a marked increase in software quality with the advent of typed languages, but we haven't. Quality depends on more than just a typed language, whether that be unit tests, integration tests, etc.


If you have a logic error what's to prevent you from making the same error when writing the test?


I strongly disagree, having years of experience in both Objective-C and Swift codebases/products.


Disagree on general software quality? I've been an iOS user for about a decade now. Seen a couple of 'bad' (extremely buggy) apps here and there, but the quality seems relatively constant over time (possible sample bias).


Great product teams will build great products independently of Objective-C or Swift, but you will need far fewer resources to deliver the same quality using Swift and modern frontend architectures.

I would also doubt the majority of apps in the App Store today are using Swift, at least in the sense that I mean it. It’s not enough to use Swift; you actually need to know how to model your problem well using its type system, and that takes experience. A lot of devs simply focus on solving narrow problems without giving much thought to how much more strict they can be with the language.


Manual testing and design patterns that isolate the effects of a change from the rest of the system.

I am going to manually test things anyways, so in practice, unit testing usually takes longer.

I am more likely to add robust tests to parts of the code that are harder to test manually, as it does save time in that instance.

Ironically, I find that well tested code gives me less incentive to refactor it. Refactoring is good for allowing future changes without breaking things. With good testing, I rely on tests to provide those guardrails instead of a clean code structure.


I guess it depends upon what kind of code you end up writing most. I tend to find lots of code I write has more complexity in the business logic sections than the UI itself - there are lots of corner cases with no impact on the UI.

I find automated testing (not necessarily TDD) adds a lot of value over manual testing in these scenarios. And the automated tests tend to pay off either at development time or after even a single change.


> 1. software works as advertised

Understanding (mutual) and communication seem to be the big & hard problems to tackle. Software reflects these issues clearly, even down to the detail. You can sometimes look at a particular chunk of code and understand how it got there from refining assumptions (etc.) or the lack thereof.

In terms of technical solutions, I think one could just look at modern, more powerful concepts, like writing proofs and advanced type systems in the static world, generative testing, controlling state via FP, gradual typing in the dynamic world, etc. A whole host of bugs can be found without writing over-specific handcrafted tests.


> what replaced it?

In my experience, TDD has been "replaced" with laborious, time-consuming, low-coverage manual testing - that is, what it was originally meant to replace.


From another comment (thread moves fast):

The next big thing is ATDD - Acceptance Test Driven Development. This is implemented using frameworks like Cucumber[1], in which test scenarios are:

1. Described at the feature level

2. Described together with, and signed off by, the business stakeholders

3. Described in a structured format, which can be implemented in code ('given ... when ... then' format; a small example scenario is sketched below)

The advantages this gives are huge. It is essentially the business requirements described as test scenarios, which can be executed in an automated fashion, (co)authored and owned by the business.

[1] https://cucumber.io/
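For readers who haven't seen the format, a scenario in that 'given ... when ... then' style looks roughly like this (the feature and numbers are invented):

    Feature: Order discounts
      Scenario: Returning customer gets a loyalty discount
        Given a customer who has ordered before
        When they check out a basket worth 100 EUR
        Then the total charged is 90 EUR

Each line is then bound to a small step definition in code, which is what makes the business-readable text executable.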


At my place, automated integration and end-to-end tests replaced TDD. There is an expanding number of unit tests, which usually aren't developed with TDD.


Over the years I've worked with many developers who don't write ANY unit tests, relying only on integration tests and this has caused severe bugs that could have easily been caught by unit tests. This has cost the companies they're working for a fair bit of money.

I've called these developers out, and they often seem to be against unit tests because they think writing them slows them down, when in reality the cost of cleaning up afterwards is greater.

When you start writing code with unit tests in mind, you generally follow best practices and start to realise when a "unit" is too big and needs to be split up into smaller units. That, and mocking: I've found that the anti-unit-test developers I've worked with commonly aren't keen on mocking stuff out either (but again, that's all anecdotal).

Generally I find Fowler's guidance on the test pyramid is always worth considering.

https://martinfowler.com/articles/practical-test-pyramid.htm...


As one of those developers, the problem I face is learning how and why to write tests. Most of the tutorials I've read on the matter test things that are too trivial and test too often. It also doesn't help that the people I've worked with don't have a clue how to do it either, and those that do think everybody should just get it and can't be bothered to explain it.


I recommend reading The Art of Unit Testing -> https://www.manning.com/books/the-art-of-unit-testing . At least this is the book I learned TDD from more than 10 years ago and the knowledge I gained from the book has proven to be timeless.


I see this problem a lot. TDD is a hard skill to learn. It took me 2 years of continuous practice to really get it, and by year 3 I was only just starting to get decent at it. People who try and do TDD for less than 6 months haven't even left the parking lot yet.

I try to shortcut that steep learning curve by mentoring other developers, but there aren't enough people around who have that experience. I've met lots of programmers, and fewer than 10% of them have ever even tried TDD, let alone done it enough to gain insight and mentor others.


I completely agree, I feel like I'm in the same boat. I do feel like I'm slowly making progress though - primarily by making small contributions to open source libraries that have tests, which requires updating and/or writing new tests - and then trying to replicate tests like that in my own small libraries.


It could also be due to the way the company incentivizes developers. Say a new system or tool needs to be shipped this quarter. They could incentivize that being delivered by offering bonuses to the team if they ship it on time. But if the bug is a minor issue, or will only happen 6 months after ship date (performance issue, leap year issue, etc), developers may elect to ship it as "broken" to meet their deadline. Then another "maintenance team" will be responsible for fixing the bug 6-12 months later when it surfaces in production.

Plus, if no one ever gets fired for shipping buggy code, why bother working so hard on bug-free code? It's a tradeoff.


> relying only on integration tests

You frame this as integration vs unit test, but from the sounds of:

> this has caused severe bugs

and

> aren't keen on mocking stuff out either

My question would be: were the integration tests any good either?


I think like 80% of people in the comments are trying to find refuge in the idea that "TDD is dead" to justify them not writing tests.

Not going to argue, I just hope you realize this will come back and bite you. This profession requires discipline, like any other.


It's okay. I'll be working somewhere else in two years and someone else can figure out how to fix the bug ridden code I wrote /s.


And that person, if they have any sense, will write tests.

Source: I am that person (in general, obviously not in specific).


I've always had trouble with TDD. I'm a mostly self-taught dev, and I had my first jobs in high-velocity startups that themselves were made up of engineers who'd prefer a quick-and-dirty approach.

Later in my career, I've worked for more traditional software businesses, but they didn't encourage testing either. It was always an afterthought, and time was better spent on building flashy and fancy new features to pacify customers.

When I founded my first bootstrapped businesses, it was the same thing: things had to be built quickly, almost always as easily scrapped experiments. A SaaS that is well-tested but missed its market opportunity window was something I didn't want to risk.

Even with the SaaS FeedbackPanda that finally worked out (and which I sold with my co-founder last year), testing was more or less non-existent. We truly tested in production, and trusted the infrastructure and framework choices we made to bear most of the burden. It worked out for us, and even our acquirer didn't expect much in terms of TDD in the properties they were looking for.

I have the feeling that the concept of rapid prototyping has developed into a cultural phenomenon that expanded into software engineering, and people have adapted to it.


s/expanded into software engineering/replaced software engineering/


In my experience:

libraries and frameworks - TDD is required and highly effective

for web apps, business CRUD apps in loosely typed languages like Javascript/Python - TDD can marginally help, especially with checking types, validations, etc.

for web apps, business CRUD apps in strongly typed languages like Java - TDD / Unit Testing does not add much value. skip it.

I get more confidence when I run tests for the deployed web page (or API) with actual test data in real time.


> loosely typed languages like Javascript/Python

Isn't Python strongly dynamically typed?


Yes, Python is strongly typed, which is much better than JavaScript's weak typing... but the dynamic typing still means that potential bugs are caught not by a compiler but at runtime (which can mean the live production environment!).

Static vs dynamic typing is about WHEN types are checked, i.e., at compile time or at runtime.

Strong vs weak typing is about HOW strict the type check is: weak typing makes many implicit assumptions and is very liberal in allowing different types to be used interchangeably or added together.
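Concretely (a throwaway example, not from any real codebase):

    def label(order_id):
        return "Order #" + order_id  # fine as long as order_id is a str

    label("42")  # works: "Order #42"
    label(42)    # TypeError at runtime: Python refuses to silently mix str and int
                 # (strong typing), but nothing flagged it before the program ran
                 # (dynamic typing). In JavaScript the same expression would quietly
                 # produce "Order #42" (weak typing).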


> apps in loosely typed languages like Javascript/Python

> Python is strongly typed

Another of Schrödinger's animals.


It seems pretty obvious that TDD was a good idea for some folks, but it became a meme long before most of us were ever exposed to it. By that point you couldn't just write tests and do dev work, you had to join the church of TDD if you wanted to play. Then some lazy ruby scripter like me gets told to write tests not code, and we end up with another layer of glorified middle management to make sure "everything was written with a test"

Eventually those of us who have work to do throw away the junk and get things done, or we move on. It's a pretty natural progression for most bureaucratic organizations.


I am surprised this article exists, but then I noticed it is from 2014. TDD is not a magic bullet, but the moment I have ventured into codifying a critical part of business logic, I have written test first. Not always unit tests, many times an integration test. But the confidence I get when adding/modifying things around that business logic is just mind-blowing.


Wouldn't you get that same confidence if you wrote verification tests instead of tests first?


Actually writing tests before means I'm asking myself what I want to achieve, while a verification test is more like saying that whatever the output is, that's the right thing.


The number of people admitting to not doing TDD surprises me. I wouldn’t dare to write any production code without a comprehensive suite of tests.

Most projects I work on have about 50% production code and 50% test code, especially if I have a say in it. I simply won’t take any responsibility for my code if I cannot test it.

Funniest thing is that the projects that are well tested are also the projects that just work, rarely trigger the error notifier and rarely get bug reports. The errors that are present are usually due to 3rd party API outages, database outages or abuse.


TDD != writing tests, but it means that you write tests before you code. Many people write tests for their production code, but they mostly add them after at least some of the code has been written.

The idea is that you'll understand the problem space better by writing the test first. Which is IMO a valid assumption. But, at least for me, the problem space (interfaces etc.) is often a bit too fuzzy initially, such that it's easier to just write the first draft, and THEN test that.


This! So many of the comments on this post are about writing tests, rather than writing tests first. I've tried TDD a couple of times and always hated it.

Maybe it's one of those things where you have to get over the hump. But to me it's much more enjoyable to write what I think is the perfect function and then try to break it with my tests.

I also don't understand how anyone could think that a strong typesystem is a replacement for tests. It definitely helps one write a lot fewer tests for the same level of quality, but you still need tests.


I know what TDD means. I write tests before I write my code.

Usually I write out a set of tests (this should do X, Y and Z but not H, I, J) and then I implement whatever thing I need to add.

Same with bug fixes or changes. Bug fix: reproduce in tests, then fix bug. Change: change or create new tests, then update code.


> The number of people admitting to not doing TDD surprises me. I wouldn’t dare to write any production code without a comprehensive suite of tests.

This suggests that you don't understand that people can achieve the stated goal of "a comprehensive suite of tests" without TDD.

Just to be clear, people can be passionate about testing without being passionate about TDD.


Or they can do testing, not do TDD, and be passionate about neither.


You’re right: I don’t understand how people can have proper tests without TDD. I’ve yet to see a project with meaningful tests without TDD


To me, your comment implied otherwise, sorry for assuming wrongly.


> I wouldn’t dare to write any production code without a comprehensive suite of tests.

If I'm starting from scratch, I always put together at-least decent code coverage. However, most of the time I'm not starting from scratch - I'm hired to work on a framework that's been around for a while and, invariably, is not only written without unit tests but written in a way that repels unit tests (every class depends on every other class, static initializers connect to live databases, etc.)


I think TDD is a specific reference to the approach where you always write the test first, i.e. 100% coverage

Could be wrong, but that's how I've always assumed it is intended.


Yes that’s how I’m doing my daily job. Not sure if my apps have 100% coverage but I’m pretty sure it will be pretty close.

Edit: I ran a coverage report on the app I’m working on and it’s 84% (4129 of 4936 LOC).

Untested parts are related to deleted features, dead code or some weird error conditions I didn’t care to test.

There are also some admin-only pages that are untested, and some stuff has its tests disabled because it communicates directly with production APIs, since those services don’t have test APIs.


My problem was learning how and why to do it. Most of the tutorials I've seen on the matter either test too trivially or too often.

But it also says a lot about the places that I've worked, where shipping code and carving out domain knowledge for job security was seen as more important than actually doing a good job.


I write tests in units of work. For example, if I need to add a contact form to a page, I’ll create a test that checks whether some form exists and matches some structure (names of fields), and I test that submitting the form performs some desired action. Usually I would replace the real mailer with a fake, test-specific mailer which has extra methods for keeping or inspecting sent mails. (There are libraries for this as well; it’s just an example.)

Testing the “real” mailer might never be done since it would involve checking if the mail was actually sent. In these cases I want to isolate that boundary so it is invoked as late as possible so I can use fake mailers for my tests.
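A stripped-down Python version of that fake-mailer idea (the form handler and mailer interface are invented for illustration):

    class FakeMailer:
        # Test double: records mails instead of sending them.
        def __init__(self):
            self.sent = []

        def send(self, to, subject, body):
            self.sent.append({"to": to, "subject": subject, "body": body})

    def submit_contact_form(form, mailer):
        # The unit of work under test: validate, then hand off at the mailer boundary.
        if not form.get("email"):
            raise ValueError("email is required")
        mailer.send(to="support@example.com", subject="Contact form", body=form["message"])

    def test_submitting_the_form_sends_one_mail():
        mailer = FakeMailer()
        submit_contact_form({"email": "a@example.com", "message": "hi"}, mailer)
        assert len(mailer.sent) == 1

The real mailer only has to satisfy the same send() signature, so it can be swapped in at the boundary without the tests caring.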

If the feature is mostly UI (like in this case) I usually test through the UI (using JSDOM and an HTTP client, or a virtual browser test runner like superagent).

If the feature is more complex like an API or something, I like to write a barebones integration test (does this HTTP endpoint actually invoke some injected module) and then I thoroughly test the underlying module which is HTTP agnostic or represents bare minimum of HTTP state without dependencies to a real HTTP server or client. That way I can test the module without performing HTTP requests or mocking them.

I usually start by writing lines like:

- It returns no posts when there are no posts
- It returns no deleted posts
- After creating a post, posts index returns my post
- Editing my posts updates my post
- Editing someone else’s post fails
- Deleting someone else’s post fails

Once I’ve written the outline of my module like this, I convert them to empty tests and implement them one by one, first implementing a test, then changing the code to make the test succeed. Repeat until done. Usually I come up with more happy or unhappy cases while implementing - I instantly write a new blank test for later in these cases, adding it to the list of tests to write.

I like to have separate modules when implementing features: one for the business logic (which is thoroughly tested) and one for connecting it with libraries or frameworks or whatever (with minimal tests, but at least one). This is a guideline and not a requirement: sometimes a dependency on a framework or library or external service is just part of the feature, so there's no point in keeping it separate.


You must not have inherited many ugly codebases.

I would love to write tests for the codebase I currently work in, but it just doesn't make sense. Everything works for the most part (not really, but the business people are happy), so there is no way we can convince the higher-ups that writing tests is beneficial.


Actually migrating legacy (ugly) codebases is where writing tests is helpful.

The tests are the invariants -- they help you preserve functionality while you shift the code underneath. You can rewrite ugly code without worrying about botching anything downstream.


Agreed, but... often these ugly codebases are also very difficult to write tests for, because the code isn't well-factored, there are large functions which combine lots of unrelated behaviors, etc. etc.

I inherited a codebase of this type two years ago. It was only 10k sloc of Python but some of the worst code I've ever worked with—of course it had no tests either. Retroactively writing tests for the existing code made no sense—half the behavior wasn't worth preserving.

What I wound up doing was slowly refactoring as I worked on things, writing a new component (with tests) and replacing bits of the legacy code. I also added linting and typechecking, and now the code is not gorgeous, but it's serviceable and a lot less buggy.

In retrospect it might have been better to just rewrite it from scratch, architected for testing, but it's difficult to be certain. I think it was probably better for the team to see the codebase improved piecemeal bit by bit, even if it took longer.


I hear you. Testing is but one of many tools needed to fix ugly codebases. It's no panacea for sure.


Not doing TDD isn't not testing. TDD is writing failing tests first and then writing the code to make the tests green (red, green, refactor).


With the speed at which management expect deliveries, TDD has been replaced with Tech Debt Development


ooo I like this.


The design damage from TDD is very real; I'm unsure if the added complexity is of any real value. But once you turn your mind to it, it becomes second nature. I wonder if people's woes about it are because it's hard to change?

Regardless, TDD doesn't work for my org; it's too expensive. Our business logic is very unstructured, and requirements are built up over years of projects being layered on top of each other. Decisions about business logic are made on a whim and need to be implemented quickly to support the rest of the organization. Given how quickly and randomly the work changes, I'm not tempted to implement anything further than automated smoke tests. Besides, I have a QA team that is triple the size of my development team; it's their responsibility to test the end-to-end solution.


> Our business logic is very unstructured, requirements are built up over years of projects being layered on top of each other.

That sounds to me like a codebase I'd be terrified to make changes in without extensive test coverage. (Whether the code was written with "TDD" or not is a slightly different question). But I guess it doesn't work out that way for you?

Ah:

> Besides, I have a QA team that is triple the size of my development team

Sure, I guess that is an alternate approach to tests. I have never worked with a formal QA team that extensive, but I'd guess they have to have various scripts and explicitly written out acceptance criteria and such? That's basically a form of 'tests', just in human language and human testers, not code. (I also wonder if the QA team is actually using some forms of automation that look a lot like tests, just they write/maintain them instead of "developers"?)

With a team three times the size of the dev team, it's definitely not a cheap alternative, but I guess I could believe it's cheaper/more effective than trying to have automated test coverage for your code (or a combo of much smaller QA team with some test coverage), like I could believe there is some context where that's true, it seems unlikely to me it will be widely true. But whatever works.

The number of software development projects that lack either sophisticated QA operations like that or good test coverage is probably bigger than those doing either though.


It is a fairly traditional way of doing it. TDD usually means you kinda know what is going on up front. If you work in an org where the sales guy can make up a feature and you need it done yesterday, it can be a tough sell that you need time to write tests ("that's what we've got QA for"). The refactor at this point is a daunting task. It would even be a decently expensive one. What comes out the other end is, from the end user's point of view, effectively the same.

I personally like working with a decent QA team. They challenge you to do better. They take a special glee in breaking your code in ways you did not think of. You can also use their test plans to write automated tests, so that they can go think of more devious ways to break your code. Also, sometimes developers can go off the deep end and overdo things. It is nice to have a semi-neutral third party saying what is important to test or not. One thing to keep in mind is that many of these integration testing frameworks are basically tedious coding exercises, especially if you are external-API heavy.

> it's definitely not a cheap alternative

You may have hit on why many orgs like the idea of TDD. It pushes the idea that if I have someone who can write code well, they can write the tests too. Skipping over the fact that this takes time and energy away from other things.


> They take a special glee in breaking your code in ways you did not think of.

God, this is so true! It's made me a better developer though; "You know this will break, make it better so Carly doesn't yell at you".


That's a good point - we're still doing TDD, just with "human code" instead of computer code. You're correct that they write out explicit test cases for each of the acceptance criteria.

It's not cheaper than developers doing the testing, but it's more wholesome. They will test exactly how a user operates, with no regard for the software's boundaries. If part of the requirement is fulfilled by another team's software, they will test that other team's work. This works great for my team since we are highly integrated with the rest of the enterprise.


The other amazing thing about proper QA testing that automated testing doesn't do so well is that they have the ability to go off script. Working in games the number of times I've built some feature or designed a level and thought I had tested it thoroughly only to have QA break it to pieces is very high! Your assumptions versus the assumptions of QA people and players are very different.

A single, technically proficient QA expert on a team can be an incredible asset. Back that with a larger team to regression, smoke and otherwise test things and you'll not only find a load of bugs but also get early feedback on design.

Automated testing definitely has a place. Libraries are a great example. There isn't really an end-user interface to test, it's not the conglomerate of much code and you basically need a test harness to even run your code. The downside is that you're typically back to only testing your own assumptions.


That's the only appeal of writing tests for non-critical functionality for me. If I know I have to write a test, I am likely to make a very clean method with a clear input / output. So it almost automatically follows SRP and is a pure function when I know it has to be tested.


It's a shame Kent used the words "test" and "development". Test Driven Design would have been better, but people would still misinterpret what is under "test". Yes, there's a side effect of asserting behavior in Kent's vision of TDD but it's a happy accident.

What's under test is the design. Way before TDD was a thing, when I worked at IBM, we used to call this "inverted design": write the calling code first to see what the API might look like and then make it work. In the late 80s it would have been considered a massive waste to assert behavior though; we'd just implement it.

Automated functional tests (from the outside in) are where the bulk of does-it-do-what-it-says-on-the-tin testing should happen.


> Way before TDD was a thing, when I worked at IBM, we used to call this "inverted design": write the calling code first to see what the API might look like and then make it work.

I really like the idea of this, and I very occasionally have the foresight and wherewithal to do this kind of "top-down" programming.

Maybe not surprisingly, this is how I sometimes end up with the much-criticized "Interface with only one implementer" design smell. I write the interfaces that I would like, right next to where I'm writing the calling code. The interface(s) evolve as the calling code is unfolding. Then, later, I make an impl for the Interface.

At that point I could just delete the interface and only use the concrete implementation, but... I don't. shrug.


I don't think so. If I didn't know who Ron Jeffries was, I would have sworn the following series was pure satire about TDD:

    https://ronjeffries.com/xprog/articles/oksudoku/
    https://ronjeffries.com/xprog/articles/sudoku2/
    https://ronjeffries.com/xprog/articles/sudokumusings/
    https://ronjeffries.com/xprog/articles/sudoku4/
    https://ronjeffries.com/xprog/articles/sudoku5/


Mods: (2014) would be a useful tag to have on the submission, especially given the topic and provocative title.


No.

If you have a large, critical system you need tests. Whether you write the tests first or not is largely immaterial, the tests and functionality get merged and deployed together.

Once your system is of sufficient complexity, you need those tests to prevent regressions unless you’re working on something very isolated.


TDD is mostly about writing tests first. I don't think this is stating that tests shouldn't be written, just that the TDD approach of writing tests then basing your code around fulfilling those tests is dying.


I've found that the strictness imposed by TDD (write test first) has been valuable in learning how to write good tests and understanding the concept of "testability".

If you've never tried TDD for a real project, I still highly recommend it for those reasons alone. It may not be your cup of tea and it may end up being extremely unproductive for you, but hey at least it'll give you some experience-backed opinions for this never-ending debate :)


The main benefit of TDD is that it forces you to code with testing in mind. This produces a decoupled architecture of easily testable system components...

However, once you learn the skill of making things testable, you can find that you don't really need to write tests first anymore... which makes TDD less useful

This is not necessarily rational, but it can explain why its popularity has diminished, even among people who know its benefits.


> However, once you learn the skill of making things testable, you can find that you don't really need to write tests first anymore... which makes TDD less useful

Exactly! Plus, I find the idea that TDD produces decoupled architecture to not be true. In fact, it's usually the exact opposite unless someone has a "testability" mindset, in which case they don't need to TDD in the first place.


Developers are too polarized about tests. I wish more projects had just enough tests.

Tests can be a nice entry path for new developers trying to understand how to use the code. Tests can be useful when solving tricky bugs. Also, it can be useful to make sure parsers and financial stuff work as expected under different scenarios. On the other hand, too much testing means you are wasting precious time not adding value.


Right, but the discussion is whether or not creating a test is the essential starting point for all development. It's important to not conflate testing with TDD. I don't think testing is polarizing. TDD, though, certainly has its advocates and detractors.

Plenty of others have noted the consultancy influence on the matter, so I'll skip that and just say that I've encountered this phenomenon several times. When the consultant or new VPE or DirE or whatever comes in and says it's TDD time, and when there is dissent, you see this "So you're against testing?" play. That's not useful.


Developers are too polarized about tests.

Developers are too polarized about everything.


Haven't watched yet but this should be a gem: "Test-induced design damage"

In my own company we saw TDD create tons of unnecessary indirection by introducing dependency injection all over the place. The only reason for that dependency injection was so the component could be sufficiently isolated for unit testing.

Although it could be argued that the components _should_ be highly compositional anyway. ¯\_(ツ)_/¯


TDD has nothing to do with Testing and everything to do with Development. It is a development methodology (test driven), not a testing methodology.

It is a means to focus development and structure code into isolated "units". And it facilitates wholesale, brutal refactoring and deletion of code because it provides confidence: tests pass (the external interface remains consistent), tests fail (you need to fix your refactor or update the external interface), or your tests are poor/incomplete.

By writing tests first-ish, you make it really hard/noticeable/laborious to write YAGNI, sprawling, or over-engineered code. If your tests are hard to write, it means your code is interconnected, doing too much, not layered properly, etc. Too many people don't realize this and struggle with tests / spend way too much time writing equally complex tests, when they should realize all this pain is the same pain that will occur when you try to fix/expand/maintain your code. Fix your code! Not the test.


Quite liked this view point on testing: https://eng.rekki.com/unit-testing-at-rekki/t.txt


In Microsoft, at least in Office division, TDD got some big pushes and serious traction within many of the teams working on Services, funnily enough back around the time of this article.

But... maybe relatively uniquely? We had a firm division between Dev and Test Engineers at the time. The person writing the tests was not the person developing the code.

The best success story I saw for this was a well defined feature, covered by tests written while the Test Engineer had time before a vacation, and then developed against by the Dev who didn't get time until the vacation.

Shortly afterwards Microsoft did away with the split between Test and Dev, and laid off many of the Test Engineers (while keeping QA in the case of Windows).

I haven't seen many Engineers these days, former Test or not, be enthusiastic to do TDD. It might be motivational: with the constant pressure to make progress and a shift to smaller check-ins, developing tests for TDD is maybe not observable progress.


I consider TDD as a tool rather than a methodology. Its utility is in exploration, when the problem at hand does not have a precedent or an already well-fitting solution.

Writing tests first serves more as guidance in fleshing out the approach. At the conclusion, seeing the 'emerged' approach, I always get an itchy feeling that it could be done more smoothly, now that I know it's doing what I need.

Sure, no one is going to rewrite it, so it goes out as done. The net result is perhaps more domain knowledge, and the tests flesh out the expectations of the behavior.

That's why writing meaningful tests (even by the name) makes TDD worthwhile.

But all in all, TDD is a kind of prototyping tool first. Ideally, someone has to analyze the bulk of the tests to better formulate the product and its resulting 'spec' such that this prototype could be further maintained, hopefully morphing into a better one at some future time.


For me personally, it's the story of the TDD luminary trying and giving up on implementing a Sudoku solver with TDD, and the equivalent elegance of Peter Norvig's solver, that led me to conclude test-driven may be good, but cargo-culting it is particularly bad...


I never "bought" TDD. Just like "microservices" it adds a ton of complexity and cost for something you probably don't need.

The TDD cult doesn't consider that "test everything" is the last thing you should try. If you need high reliability you should move to a safer language first. Dynamic typing to static. Unsafe memory to safe, like Rust or Go. Turn on a bunch of linters too. Move to a language that doesn't allow nulls.

Once you've got a bunch of linting and a static language that doesn't allow memory corruption, do you really need to test everything? Probably not. The Linux kernel is a good example as usual. Virtually no tests, and it's the backbone of the internet.


Relevant post I wrote back in 2012 --

XDDs: stay healthily skeptical and don’t drink the kool-aid

https://amontalenti.com/2012/02/12/xdds


Tests are often a substitute for compile-time guarantees.


True that. Without a decent amount of testing with good coverage I don't feel at all confident in the code (in Python) I write. Without a decent amount of tests, any migration from Python 2 to 3 is a cumbersome exercise.


Personally, I make sure I have some form of automated testing (normally unit tests and e2e tests), as I like to know my code works, but I can't write the tests first. It just doesn't work with how I approach problems.

The approach I have to many projects is:

1) Sketch out a rough idea of how I will build it.

2) Try to get the fundamentals working.

3) Start building and fixing any obvious defects in the design. Add tests as I go along to catch obvious defects.

4) Iterate from there to completion.

For most of the projects I am working on this seems to be fine.


Working in a codebase without tests is like taking a job at the top of a 4-story building that lacks an elevator.

For me, the biggest benefit of TDD is that it increases my individual velocity. When I forget what my task is in the middle of executing it, running a test enables me to return to the task within seconds. So personally, I refuse to work professionally in a codebase if I cannot write (even hacky shell-script-based) tests.

Also, tests enable me to better understand how someone else's code is supposed to be used.


surely "development with tests" is different from TDD, a very specific subset of the former.


Correct. But a codebase without tests is one where it is unreasonably hard to practice TDD.


TDD is one of these buzzword methodologies that is good to know exists but is inapplicable by itself.

In fact, TDD-like techniques have produced some of the worst code I've seen.

The trap with test driven development is that you write code to pass the tests, not code that solves a problem. It is easy to write code you don't understand. Ex: "is it +1 or -1? +1 passes, -1 fails, it must be +1". In the end you don't know why you wrote +1, and maybe it makes no sense but since it passed your tests...


I use TDD with integration tests. At first, my test case verifies a whole feature at a high level, then later when I have the main test case passing, I add more detailed integration test cases. Once I'm satisfied that the feature works under all the tricky possible edge cases, I merge the feature. I just run my test case to verify that the feature is working. It's way easier than manually testing using a browser or HTTP client to make requests; also it's nice that once I finished testing, I get to keep the test case and it serves to avoid regressions on that feature later.

Sometimes I do TDD with unit tests, but only if the specific component is complex enough to warrant a unit test. I don't write unit tests for a simple component where the correctness of the component can easily be inferred from the passing of integration test cases which rely on that component.

Trying to cover everything with unit tests from the beginning is a terrible idea and will lead to sub-par architecture. You have to start with tests that cover functionality at a high level first and work your way to the details later; top-down, not bottom-up.

The beauty of the integration test is that it forces you to modularize high level logic to make it testable. It forces you to think about the input and output boundaries of the code being tested under integration. Integration testing doesn't mean you need to have a database running during the tests; often you can mock out the database client with a dummy adapter (or an in-memory adapter which works in the same way as the real database for the purpose of testing).
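
To illustrate, here's a minimal sketch of that dummy-adapter idea in Rust (KvStore, InMemoryStore and save_preference are invented names for the sake of the example, not from any framework):

    use std::collections::HashMap;

    // The storage boundary the high-level logic depends on.
    trait KvStore {
        fn put(&mut self, key: String, value: String);
        fn get(&self, key: &str) -> Option<String>;
    }

    // Dummy adapter used only in tests; behaves like the real store for our purposes.
    struct InMemoryStore(HashMap<String, String>);

    impl KvStore for InMemoryStore {
        fn put(&mut self, key: String, value: String) {
            self.0.insert(key, value);
        }
        fn get(&self, key: &str) -> Option<String> {
            self.0.get(key).cloned()
        }
    }

    // High-level feature logic only sees the boundary, not the real database.
    fn save_preference(store: &mut dyn KvStore, user: &str, pref: &str) {
        store.put(format!("pref:{}", user), pref.to_string());
    }

    #[test]
    fn saving_a_preference_makes_it_readable() {
        let mut store = InMemoryStore(HashMap::new());
        save_preference(&mut store, "alice", "dark-mode");
        assert_eq!(store.get("pref:alice").as_deref(), Some("dark-mode"));
    }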


I quite like the distinction that I first saw mentioned at Google:

Is your test actually testing functionality, or is it merely a change detector?

For example, a unit test has a copy-pasted SQL query and breaks if you reformat it, etc. If it's a change detector, it should be deleted, because it's just a copy / different representation of the same information.

This is a very clear definition that helps in arguing and thinking about tests, especially about mythical numbers like "90% unit test coverage".


These discussions always surface something: The problem with the terminology.

If you look at the arguments (including my own posts in this thread), people are basically talking about different kind of tests, with different costs and value propositions, and dumping them all under the same umbrella. The term "unit test" has been badly overloaded to mean a whole lot of stuff.

Without first categorizing type of tests, and defining what we mean by unit test, we can't even have the discussion about if it's dead, useful, whatever.

Even after defining that, there are a couple of sub-groups that strictly define the value of testing based on specific criteria while ignoring others (e.g.: "the goal of tests is to know if you're breaking something when you refactor").

Tests have a LOT of benefits, and different types of tests get a different subset of these benefits, at varying cost. Without those definitions and associated tradeoffs, the discussion is basically a waste of time. You can see it in this thread: people are going "TDD is XYZ and the purpose is ABC". And you have about 6 variations of XYZ and ABC. Everyone is sure their version is the right one. Myself included.


Agree. Also it seems that some people think TDD is about "do you write tests at all" while others think "do you write tests before the implementation".


martin fowler discusses this in the linked video that no one in this whole thread watched


I just consider TDD as Test During Development. 'During' can be before or after the actual code is written. When I say development is done I mean that the code is written and has been tested.

I have seen mocking and code coverage work really well for a large codebase so I am in favour of those things too. I am not dogmatic about it though. I choose when to apply those things.


The "work outside in" strategy is ideal. Do this as much as possible.

Alas, TDD requires precognition.

Or such prior familiarity with the problem domain as to make the effort redundant.

Or performative. Which is OK if you're managing upwards, trying to impress those bozos who read the foreword of some Agile Methodology books and so now believe they are experts.


I'm going to go on a bit of a tangent.

I don't follow any particular testing religion, and I definitely find myself struggling to figure out what to test, how to test it, and have inflicted design damage in the name of testability...

BUT. I just want to give a shout out to the Rust language for allowing us to test private functions. Sometimes the "tricky" part of a "unit" is not directly in the public API. Maybe you have a fancy regex, string formatter, or sorting algorithm that's part of a larger API. The fact that you can test that part directly without the ceremony of constructing the rest of the unit and being forced to parse out subtle differences in the public API's output is REALLY refreshing when it's needed.

That's definitely not TDD. But it's related to unit testing, so I just felt the need to give a virtual high-five.


This is considered an anti-pattern for a lot of reasons. It bakes private implementation into the specification, making it harder to refactor. You can't reuse tests across different implementations. It breaks abstractions.

I wonder what the reasoning was for the Rust devs.


I'm not sure I follow. What do you mean by "bakes private implementation into the specification"?

In Rust, these tests live inside of the module where the functionality is defined. Nothing "leaks". You write your private function, then right next to it, you write some tests for it. Nobody ever has to know about the private code (unless your tests fail).

> Can't reuse tests across different implementations.

Huh? It's a private function. You aren't going to have multiple implementations...


When you test private methods they essentially become part of your API because you cannot remove that method without changing the tests. You increase the API surface because any observable behavior becomes part of the API, no matter what the docs say.

A test against an interface would be more reusable (and more clearly defined) than tests against a concrete class's private implementation. If you do make a new implementation, you have to pick apart what can be used and what can't.

Maybe YAGNI, but the idea is that it's a code smell that you can't easily test core functionality from the public API. Why is the API so subtle? Is this class doing too much, should you break it up? These are the questions I ask when a junior dev makes a method public just to test it.

But I don't know Rust so perhaps some of these concerns are not relevant to Rust.


Well, the way unit tests work in Rust is that you write a test module inside your module. So, in a case like I'm describing, the unit test is usually (always?) right under the private function in question.
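
Concretely, it looks something like this (a toy sketch; format_date is just an invented private helper):

    // Private helper; never exported from the module.
    fn format_date(year: u32, month: u32, day: u32) -> String {
        format!("{:04}-{:02}-{:02}", year, month, day)
    }

    #[cfg(test)]
    mod tests {
        // Because the test module is nested inside the module under test,
        // private items are visible here without being exposed anywhere else.
        use super::*;

        #[test]
        fn pads_single_digit_months_and_days() {
            assert_eq!(format_date(2020, 8, 6), "2020-08-06");
        }
    }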

So, technically, yes, that's "observable behavior" in the sense that a test will pass or fail. It's not really the same IMO as running e.g., jUnit, where all of your tests are in the same area away from the code they're testing.

The reason I say that is because if you refactor and change the function in question, the test for it is RIGHT next to it. It's just going to be part of the refactor. It's almost like saying that private methods shouldn't depend on each other because taking one away will break the other. I believe your IDE will even show you squiggles when your test gets messed up while you're editing.

Your suggestion to use an interface is exactly the kind of "test-induced design damage" that's referred to in the OP. Which is what made me think to leave my comment. If you have a one-off pure function that is not used anywhere but inside one module, why in the world would we want to create an interface, then make an impl, then inject that interface into our module? Rust code is also not as class-driven as some other languages, so often times a module is just a collection of free functions. You'd have to inject this interface into every function, when the thing is never going to be different.

> Maybe YAGNI, but the idea is that it's a code smell that you can't easily test core functionality from the public API. Why is the API so subtle? Is this class doing too much, should you break it up? These are the questions I ask when a junior dev makes a method public just to test it.

I don't disagree with that. It very well could/should be considered a smell. But sometimes a piece of smelly code checks out. And, while it should be used sparingly, testing private functions can sometimes save us from making our "actual" code more complicated than it needs to be, just for the sake of testability. Sometimes we can get good testability without adding extra ceremony.

EDIT: Also, one of the things that I sometimes struggle with when writing tests is: which public function's tests are responsible for testing the private functionality?

In other words, let's say I have a private function that formats a date a specific way. Two public functions depend on that functionality, plus do other things.

How do I verify that my date formatter is correctly implemented? I can do the whole interface+impl+injection dance so that I can test my formatter (which was supposed to just be a private implementation detail, but is now public so I can test it), or I can add some asserts to my tests for one of the public functions that depends on it. But which one? Do I test it in both places? Do I pick a favorite? Do I leave a note in one test explaining why it seems more involved than the sister function?


It’s more of a natural consequence of how the privacy rules work than it is an ideological commitment.


Was thinking about this the other day. Working with various teams, especially ones made up of developers early in their careers, I've recently seen a mashup of TDD disguised as reliance on CI/CD pipeline tooling.

I appreciate wanting coverage to see if something gets broken; however, when things break anyway and the root cause shows there are too many moving parts (test cases needing updates, dependency trees not getting bumped, an overly complicated build stage, etc.), it makes me question where the sweet spot is. Working code and broken pipeline? You may spend 2 days figuring out why, only to punt the issue as irrelevant.

I can't help but feel tooling and frameworks are becoming crutches and guardrails that developers lean on, contributing to fragmentation and wasted effort in the industry.


The most obvious case I have seen of this is the addiction to over-mocking everything and not writing any integration tests.

We got 90% coverage! Sure, all the mocks use 'any' and are not really relevant to real world returns, we don't test any part of the front end and the tests have never actually caught a failure of the many we have every quarter, but 90%!


To be dead, it would have needed to be alive.

Sure, on HN, you will sometimes find some people who report using it on real-life projects.

But in the last 10 years, I did missions for 50 or so companies, and none of their team members, NONE, used TDD.

Few even had tests at all.

I've seen companies try many things: agile, remote work, one team per microservice, etc.

I've seen many videos of people saying they applied TDD in their companies. I've even seen a few of them IRL at conferences or meetups.

But I've never worked with people using TDD in real life.

Not saying it doesn't happen, or that TDD is bad, just saying that I don't think it ever became a popular approach. Like Haskell or Nix, it's famous only in our bubble.


"Is X Dead?" -> No. Usually this mean the tech has finally left hype zone.


Three points about TDD.

1). For some people TDD helps them gather their thoughts and design. Gives them a place to start their coding. Some people, like me, would rather just start writing the code.

2). I think it is important to have failing tests. TDD is one way to arrive at this. One thing I like to do is, after writing my code, write the test, then comment out the code to create a failing test. I want to make sure that the code is passing because of the new code.

3). For fixing bugs, TDD is a must... IMHO. You want to make sure you are fixing the right thing, so create a failing test first.
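
As a rough sketch of that flow (last_index and its off-by-one bug are invented here purely for illustration): commit the failing test first, watch it go red, then make it green.

    // Step 1: a test that reproduces the reported bug (say, an off-by-one) and fails.
    #[test]
    fn last_index_points_at_the_final_element() {
        assert_eq!(last_index(&[10, 20, 30]), Some(2));
        assert_eq!(last_index::<i32>(&[]), None);
    }

    // Step 2: the fix that turns the test green.
    fn last_index<T>(items: &[T]) -> Option<usize> {
        items.len().checked_sub(1)
    }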


The next big thing is ATDD - Acceptance Test Driven Development.

This is implemented using frameworks like Cucumber[1], in which test scenarios are:

1. Described at the feature level

2. Described together with, and signed off by, the business stakeholders

3. Described in a structured format, which can be implemented in code ('given ... when ... then' format)

The advantages this gives are huge. It is essentially the business requirements described as test scenarios, which can be executed in an automated fashion, (co)authored and owned by the business.

[1] https://cucumber.io/


I think one of the main benefits of writing tests before implementation rather than after, is that it forces you to think about making a testable design. But if you are already in the habit of making testable designs, then IMO TDD isn't worth it, because front-loaded tests can slow down your iteration speed, and become a drag if you are trying out a couple different designs before settling on one. Once you know how to write testable code, you're just as well off writing the tests afterward - works fine for me at least.


Just skimmed the comments and didn't see any mention of the rise of typed languages, especially TS with its excellent toolchain. Sometimes I just code with vim and coc.vim (which brings VSCode's LSP to vim) for hours without ever running tsc. The live type-checking and other checks in the editor are so good. Same with Rust. I guess C# has had this for decades in Visual Studio (same with Java), but yeah, somehow I and many other people (from dynamic langs like Ruby) missed this and needed TDD even more.


TDD/unit tests are great in some cases:

- when testing algorithmic logic

- for rapid feedback

- for setting up good context (e.g. a pool that is full)

- helps get well-tested parts for when using integration tests

- to make sure your design is decoupled

More here: https://henrikwarne.com/2014/09/04/a-response-to-why-most-un...


TDD is most useful to me for algorithmic logic. I tend to pull a lot of that stuff out of my code, because it's not related to my domain.

TDD is a lifesaver for random one-off algorithmic problems. You can either write it in 5 minutes and spend the next 2 weeks fixing random bugs in your `includeRange` implementation, or you can spend 20 minutes to TDD the function and be done forever.
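
A sketch of what those 20 minutes might look like for such a helper (the signature just mirrors the `includeRange` example above; it's not from any library): pin down the edge cases first, then make them pass.

    // Tests written first, capturing the edge cases that usually bite later.
    #[test]
    fn include_range_is_inclusive_on_both_ends() {
        assert_eq!(include_range(2, 4), vec![2, 3, 4]);
        assert_eq!(include_range(5, 5), vec![5]);            // single-element range
        assert_eq!(include_range(6, 5), Vec::<i32>::new());  // empty when start > end
    }

    // The implementation written to satisfy the tests.
    fn include_range(start: i32, end: i32) -> Vec<i32> {
        (start..=end).collect()
    }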


One of the things I got out of learning Haskell was learning about QuickCheck.

Now, my preferred testing method is to write pure functions attach generators for the given input types, assert invariant properties about them, and then let QuickCheck fuzz my functions where it will try to find a minimal example that breaks the given invariants.
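
The same workflow carries over to other languages too; here's a rough sketch of the idea using Rust's quickcheck crate (a port of the Haskell library, assumed here as a dev-dependency):

    // A pure function plus an invariant: reversing twice is the identity.
    fn reverse<T: Clone>(xs: &[T]) -> Vec<T> {
        let mut ys = xs.to_vec();
        ys.reverse();
        ys
    }

    // quickcheck generates many random Vec<u32> inputs and shrinks any failing case.
    quickcheck::quickcheck! {
        fn prop_double_reverse_is_identity(xs: Vec<u32>) -> bool {
            reverse(&reverse(&xs)) == xs
        }
    }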

The remaining stateful portions of code are tested through integration tests.


Some of the literature I've read on Property Based Testing 'speaks to me' but not really in an epiphany sort of way, so I'm still trying to decide if I want to dive into that or not.

So far it's the only thing I've seen that really feels like it could carve out a big chunk of territory from TDD. But that would mean less TDD for me, not removing it.


I remember arranging a talk for my college’s AITP Club in 2012 which was sold to me as a trendy “How to XCode for iOS Blah Blah Blah.” After setting up the projector and giving a brief introduction the guy went headfirst into an evangelical tirade about TDD. No XCode, no iOS, no iPhone app. From that day forward I knew TDD was bullshit.


Was TDD ever really alive?


In my experience it never was a thing. 12 years writing code, and I've never seen people do TDD even once. You write tests, sure, but you don't TDD. All it was was an opportunity for shysters to squeeze money out of unsuspecting companies through training.

There's another gimmick out there these days, but I won't name and shame, people will get mad.


who is actually watching and discussing the linked videos in this thread rather than braindumping all their anecdotes?


It was never a thing to start with.

My fun question to TDD advocates was always how to do desktop applications the TDD way.


There's a book on TDD with an extended case study about developing a Swing application: http://www.growing-object-oriented-software.com/


If people want to learn how to do TDD well, check out the Codecraft playlist for your favorite language by Jason Gorman: https://www.youtube.com/c/Codemanship/playlists


I believe TDD is still a valid tool to develop well-tested software. Of course, don't take it to the extreme, where you mock everything and in the end test nothing, or create tests that have very little value. I find TDD works a lot better when it is applied as guidelines instead of rules.


I used TDD for unit tests, less so when I have to start mocking things, not at all for E2E tests.


Yesterday I made a glue program that imports my model code and http code and makes requests to my running server.

I’ve loaded it with assertions and anytime my server code is wrong—it’s just so easy to find out.

The biggest issue for me is trying to get XCode to launch my server first and my test program second every time I make a change.

Besides that—having the test program is a joy. Took me an hour to setup.

All this is to say—testing is very much alive and well in me.

The biggest mistake made is that these IDEs or programs usually make the testing command something you need to run besides the program. They really should run together. One click. Test everything.

PS — my favorite testing framework is MPWTest because it brings my code to life as a living document.

With HTTP or server code it hiccups of course—hence my need to roll my own.

https://blog.metaobject.com/2020/05/mpwtest-reducing-test-fr...


I definitely see a lot less of "TDD as religion", where all code is written in TDD, without exception, and no code exists or is added to a project without a failing test to demonstrate its need.

Most developers I work with (as well as myself) will use TDD as a technique that can be useful in certain circumstances.

When it's easy to use and reason about code as an isolated unit without any dependencies outside of straightforward libraries, TDD is usually useful and yields good results.

Things get uglier when there's more dependencies involved, which tends to lead to excessive mocking that results in your tests just testing a very specific implementation (expect method A to call method B on object X), or to alter your code in a way that's purely in service of the tests (the "test induced damage" that DHH talks about).

To take a Rails example - I think there's little to no value in doing TDD style testing for controllers. The tests will almost by definition test a specific implementation (since well-written controllers will often just connect various other models and service objects), and any efforts to get around this will just introduce unnecessary abstractions that make your codebase much more difficult to comprehend.


> I definitely see a lot less of "TDD as religion", where all code is written in TDD

In my experience it helps to drive a new technique or paradigm to its extreme for a while (for example by employing it religiously in a pet project). After that you can look back and see where a sensible boundary lies for applying said technique. And often it allows you to grasp the true essence, whereas otherwise (trying it just a little bit in your current job project) it doesn't get to bear its fruits and will be quickly forgotten or turn into an anti-pattern.


IMO fast software iterations, coupled with excellent observability tools and the ability to catch issues close to real time, killed TDD.

TDD became popular at a time when waterfall and slow shipping cycles were common. In this setting, writing well-tested code upfront had a large upside. This process also helped clarify the spec on the go.

However, as teams and companies move to deploying multiple times per day, adopting things like monitoring, alerting, and canarying, the value of having code that satisfies a spec upfront is lower than that of code that works as expected in the prod environment.

I see almost all tech companies use unit and integration tests extensively, often with high coverage - but sometimes retrofitting them, after validating that the code does what is expected, in a complex environment.


Visibility (logging, monitoring, etc) trumps testing IMO in terms of coding productivity, especially when you don’t understand the domain fully yet. Testing utility increases when code matures and becomes more stable.


TDD only works for extremely well-specified requirements.

E.g. you are writing a server for a very well-understood protocol, or writing some code to perform some very well-understood algorithm.

These are easy to test because there are very clear and totally unambiguous answers that the software has to produce. You can very easily write test cases then, because you already know precisely what the software should do.

In reality, for a lot of "enterprise" software I have been involved with, I have found that there is very rarely a specification that is well thought out enough that all of the answers (...or even the questions that need to be asked) are known before any code starts to get written. The specs are usually super high-level (e.g. "the user should be able to update their preferences"), so most of the details are left as an exercise for the developer implementing it.

I'd wager that if you are in the situation where you have such a clear and concise specification before any code is written, then you're not actually agile at all and instead you are in some lethargic ossified waterfall where it has taken 18 months for The Committee to sign off the specs for exactly what the widget will do when the user specifies that their preference is for home delivery but they have not entered a postal address yet but they do have a grandfathered-in address from the pre-acquisition database that can be used when the terms-of-service-acceptance bit has been flipped, but not if they are in the EU and have not flipped the GDPR-acceptance bit yet etc etc etc.

And if you are in this scenario, then some TDD ain't going to help your velocity. Real life is messy.

Don't get me wrong, I am 110% in favor of unit tests. I just don't think that for the vast majority of projects there is enough detail in the specs to write the tests first and then stop when all the tests pass.


I've found what really has helped me with TDD is a continuous test runner. Different platforms have different takes on them, but in my case it's NCrunch.

The fact that it's running the tests as I'm typing is incredibly powerful. It helps me keep to a rhythm, as I don't have to stop and get the test runner to run the unit tests.

When I'm refactoring and a test goes red when I'm not expecting it, that has saved me time. It also helps you question everything: why was this test needed, do we still need it?

That and (with NCrunch at least) the coverage dots. If I'm diving into some code without any test coverage, or code with gappy coverage, it lets me know I need to be cautious.


TDD is a lot like microservice adoption.

If you start your project planning on a microservice architecture, it is relatively simple to design it properly. If you are migrating your monolithic application to microservices, you are probably going to have a hard time.

TDD is similar. If you start the project planning on writing your application TDD style it will be simpler. Converting your large business critical, not written to be tested, application over to TDD is not impossible but it makes business sense to "ship it" and hope for the best. Hiring a QA engineer would probably be less expensive than your developers rewriting your application to be "testable".


It’s not dead, but it requires the developers to know what they’re building before it’s built.

“Walking on water and developing software from a specification are easy if both are frozen”


Depends on your org and its goals.

For orgs that are engineering oriented, with a strong business plan, strong requirements gathering, case studying, and minimal feature selection, it's a boon.

For orgs that aren't engineering oriented and use development as a tool to find holes in the business plan, testing anything is usually a waste of time. You're trying to see what the market responds to, rather than actually build something of quality.

It comes down to each org's goal, and usually the best answer will be somewhere between the two extremes.


It also depends on the individual engineer.

Some engineers (myself included) need to be able to write tests to make hour-by-hour progress towards the minimum needed to learn about the market.


Any developer worth their wage tests.

After writing any code I run the program to test the change I have made. I make sure the code is exercised either by logging, debugger, or clear UI change. If it's a browser app I use multiple browsers, if it's a rest service I make the rest call.

But I have known some developers that write a unit test, but never test the actual change! And without fail serious bugs appear. Like the application fails to start, or crashes when the new feature is invoked for the first time.


"Every great cause begins as a movement, becomes a business, and eventually degenerates into a racket." Eric Hoffer


To do TDD correctly, things have to be defined in great detail. 'Given X, I should get Y'.


That’s actually a benefit of TDD imo. One of the most difficult things about software development is specifying how it should work. Once you have that the rest is pretty easy.

So if you go to write a test case and don’t know what it should do, then that’s an early sign you need more time on the spec.


This is six years old. What is the value in rehashing the same arguments again in 2020?


Perhaps as a reminder of the fundamentalism and bullying that can happen in our field. Some methodologies almost seem to take on the quality of a moral panic. It's fascinating to me to see how they are eventually punctured and deflated.


Should have (2014) in title.


I worked on the proxy and DNS software that almost every ISP in the world uses today to transfer HTTP and TLS traffic. If a bug was put into the code it could take months during which subtle parts of the internet would be failing. This wasn't an option, so we had strict testing. I think for every line of code we had 2 to 3 lines of equivalent test code. From this experience I've found:

1) I found a natural way to write pseudo-TDD that I prefer. When I'm mocking up some code in a project, I write interface code to run what I'm writing. I run the code from time to time to make sure it's working. If I'm not doing that, how do I know my code is working? You get a sort of dopamine high when running the code and seeing everything come together.

What I do is take that interface code I wrote while creating the code and, instead of deleting it, create a test function (or multiple) and copy-paste it there. If I'm in a hurry I'll come back and write the assert_eq or equivalent later.

Writing a test is that quick. You're already doing 95% of it. Just copy paste it into a function. I know it's not always that simple, but often times it is. When approaching testing this way there is no reason not to write tests.
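
For instance, the scratch driver code and its eventual test might look something like this (parse_header is just a made-up function under development):

    // Hypothetical function being worked on.
    fn parse_header(line: &str) -> Option<(String, String)> {
        let (name, value) = line.split_once(':')?;
        Some((name.trim().to_string(), value.trim().to_string()))
    }

    // The throwaway driver code used while iterating...
    #[allow(dead_code)]
    fn scratch_driver() {
        let parsed = parse_header("Content-Length: 42");
        println!("{:?}", parsed); // eyeball the output while developing
    }

    // ...gets copy-pasted into a test function, with the println replaced by an assert.
    #[test]
    fn parses_content_length_header() {
        let parsed = parse_header("Content-Length: 42");
        assert_eq!(parsed, Some(("Content-Length".to_string(), "42".to_string())));
    }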

2) Systems tests catch far more than integration and unit tests, so if you're serious about testing, these should be considered. Systems tests go by many different names; by this I mean a testing ecosystem where the system is spun up and a mock client and mock servers are run. This simulates a real-world client connecting to the software and tests the real-world results they would get.

The reason system tests catch so much more than unit tests is that in a large system not everyone knows every piece of functionality and how it should work. It's easy to create a regression where a new feature is added but it changes the behavior of some old, previously unknown feature. This often results in going back to the drawing board, as it's a business issue more than just a software issue. These kinds of bugs can get pretty nasty in the enterprise space if unchecked, so it's a good idea to catch them.

Furthermore, systems tests catch nearly 100% of what unit tests and integration tests catch as well. This is because if there is a bug in a unit, it will propagate out to the client, affecting behavior. If it is caught in a unit test and not a systems test, then there might be a hole in the systems tests where they're not testing every scenario, or you might have dead/unused code in the code base or something similar.

Systems tests I've found are better at catching race conditions as well.

Systems tests act as a great source of documentation, because they document how every interface in the program is intended to act. You get a quick high level view and can learn the ecosystem quickly from it, where unit tests are down in the weeds.

And finally, you don't need to write anywhere as many systems tests as you do integration and unit tests. You can test a lot more with a lot less work.


(2014)


What's the tl;dr of this article? It's framed as a question so I assume the answer is "no" but that's not particularly useful information on its own.


Sorry, but I'm not a fan of DHH anymore... In my opinion he is too focused on self-marketing and always tries to show how differently he thinks. Dude is not Steve Jobs.


(2014)


Let me have my take on TDD. (I work with legacy corporate systems mostly so that's how I am biased)

TLDR: it never had a chance of working

The basic problem of TDD is that it has a huge cost to it and therefore you have to do it right to not only get any benefits but to just get even.

In a perfect TDD utopia, all developers would start with a set of clear, complete requirements, implement each requirement with a clean and concise test, and then ensure their code passes the test. Then later, with not much additional effort, these tests would be used to verify that whatever shenanigans you are doing to your project still cause functionality X to do Y when Z happens.

Unfortunately, this fails to capture the whole diversity of the development reality.

- It is rare that you get clear and concise requirements. Most corporate code is just something that a developer decided should happen in a situation, not the result of meticulous discussion and planning. As such, preserving these requirements in stone has much less value than one might think, because now, instead of preserving objective truths about how the application should work, you are spending time preserving what an intern named Joe decided this should be doing, probably with little regard for the entire system.

- There isn't a way to tell your tests are complete and of good quality. Whereas an application must implement functionality and must be doing (something) right, because otherwise your users would notice missing or faulty functionality (users are the ultimate verification of your app), tests can be as incomplete as you want and as pointless as you like and will still "pass". Thus, completeness and quality of tests is completely at the whim of the developer. Guess what: if they can't get the app right, will they be able to at least get the tests right? And if they can get the app right, the lack of tests is the least of their problems. By having feedback on how the application functions from their users, developers are "forced" to deliver something at some level of quality. By having no feedback on test quality, developers are not forced to deliver anything (other than to satisfy tests), and if they are not forced to do something no user or manager will ever look at, they will most likely do only what is absolutely necessary (to pass automated gates).

- Most developers want to be done with a task, and there is this bunch of stuff they have to do that is just annoying because it has no immediate bearing on how the application functions. That bunch consists of refactoring, documentation, code reviews and, guess what, tests. Do you know why documentation is typically poor, if it exists at all? Do you know why code reviews mostly focus on little stuff and rarely attack big problems? Because the overwhelming majority of developers don't put their heart into stuff that does not bring them closer to achieving the goal (except reading HN, that is, and other forms of procrastination).

- Management. While they will usually "require" tests, I have never in my life seen a development schedule extended to get more time to "get the tests right". Usually tests are something you have to do as quickly as possible to meet the requirements, rather than something that is seen as core to the project functioning.

- Developers. Most developers are already overwhelmed with stuff they need to learn to be able to function in an increasingly complex technological jungle. Microservices? Devops? Full stack? Given the choice between spending time learning the skill of doing tests right and learning the above, not many developers will choose tests. Making quality tests is a skill that, like any other skill, requires you to apply yourself.

- Refactoring functionality in a legacy system (which is theoretically exactly the reason you might want tests) requires that you know exactly what you are doing. This means understanding what changes are "safe". When I refactor a piece of code I try to form a hypothesis that the change is not going to break functionality and then prove it by following code leads (how the function is used, etc.) until I am satisfied. Doing hundreds or thousands of refactorings one by one requires that I am 100% certain I am not breaking anything. If I were to use tests to help me do refactorings I would also have to have close to 100% trust that they are correct. Unfortunately, there are no tests to test that tests are correct or complete. Therefore, I cannot use tests to replace my process of following code leads, and this makes tests useless for my needs.


This. It's not that tests have no benefit -- they do, especially for complex refactoring (though even for that, as you say, tests are rarely exhaustive enough to engender full trust).

But outside of the narrow confines of production programming, it's rare that the requirements can be specified completely up-front.

In the initial stages of development, coding is very much a discovery process. Code is often evolved rapidly and drastically, such that any test written is a throwaway. And some of these tests/mocks aren't easy to generate. In data-intensive applications, it's especially a chore to have to generate new mock database objects each time the code changes, and with completely different data characteristics to boot (in order to test different facets of the code).

The psychological tedium of doing this will put off most developers. The potential drag on productivity extends to more than just the time required to write test + code: the whole pace will feel sluggish and the dev is liable to feel demotivated.

I see tests being more useful as a retrospective addition, once the code has stabilized and the use-patterns established.


That is a good point. In the initial stages I like to explore the problem, and tests just get in the way of being able to refactor everything very quickly.

Maybe I want to put some dirty implementation first and then once I think I understand the problem better quickly fix it?

Maybe I want to start refactoring it without knowing where it exactly will lead me?

The problem is tests do not help with these initial stages; they only help if you know exactly what you are doing.

And once I am done with the initial stage I want to move to another problem. I don't like sticking with the same module to now do the documentation, tests, etc. Once something works and I am satisfied I move on.


yes


Betteridge's law of headlines applies.


When I hear "Kent Beck" I reach for my gun.

I don't have a problem with "Kent Beck" as an individual but more than anyone else in software "Kent Beck" is a brand like "Anthony Robbins". I care what you think but not what you think "Kent Beck" thinks.

For that matter the same thing is going on with Martin Fowler, who is using his software craftsmanship cred to legitimize a shop that ships work to the third world and turns programming from a white collar to a blue collar profession. There is no topic you could pick better if you wanted to have a clickbait discussion war over software.


> Martin Fowler, who is using his software craftsmanship cred to legitimize a shop that ships work to the third world and turns programming from a white collar to a blue collar profession.

Can you expand on that, or point out resources describing this phenomenon? Not as proof, but to explain to me what you're getting at. I hold the ThoughtWorkers I know in high regard, so I feel personally challenged in a bunch of ways by your point and worry I've missed something.


I laughed at this, and I agree that almost any sort of appeal to authority should be derided as the cheap trick it probably is. That said, Kent Beck says a lot of smart stuff in his books, and I don't think you should stop listening just because his name comes up.


It is the level of indirection about the problem.

Kent Beck said something good about problem A.

Now somebody else is talking about "Kent Beck said something good about problem A" and there is the risk that "Kent Beck said so" outweighs the "something good about problem A".

Baudrillard writes about this phenomenon as the "precession of simulacra", which gets you roughly to the place Girard warns about -- Baudrillard reminds us that there was something else in the past; Girard wouldn't care.

Stephen Hawking said some absurd things about black holes in the 1970s that essentially postulated no quantum gravity (e.g. that the propagator is not unitary, and operators being unitary is almost the only thing you need to do quantum mechanics). I think the "cult of personality" held back work in QG for at least two decades; it wasn't until some brave people tried to calculate things with radically different methods, and realized they were getting the same results and not by accident, that it became clear the "information loss" concept is absurd, as is the classical picture of a black hole interior.


> turns programming from a white collar to a blue collar profession

why would that be a bad thing?


We detached this subthread from https://news.ycombinator.com/item?id=24281805.



