* give you less confidence in the correctness of the system (per time spent writing them)
* when they fail, give you more information about where the failure is
The more integration-y/e2e-y a test is, the more it strays from this: slower to run, more confidence that the system is correct, less info about where the failures are.
I think people have learned to undervalue the properties of integration tests and overvalue the properties of unit tests. Is it nice to know exactly what's broken based on the test failure? Sure. Is it _as_ nice as having confidence that the whole system is working? Probably not, in a lot of cases.
(I don't think there's general rules about which kinds of tests are easier to write. Sometimes setting up a real version of a component for test is harder, other times setting up a mock version for the test is harder.)
The other point about unit tests is that they allow you to test far more permutations. At a micro level, you can test more of the possible types of inputs to your function. When you move to integration tests, the permutations multiply to the point that it would require billions of tests to test the same range of inputs.
This is why both are so necessary. Integration testing tests whether the whole thing is coherent when put together. But it’s terrible for testing edge cases and error scenarios. That’s where unit testing comes in. It gives the developer a chance to codify every edge case they’ve considered to automate the process of catching regressions.
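To make that concrete, here is a minimal sketch (pytest, with a made-up parse_quantity function) of the kind of edge-case sweep that is cheap at the unit level but combinatorially expensive end to end:

    import pytest

    def parse_quantity(text):
        # Toy unit under test, just for illustration.
        value = int(text)
        if value < 0:
            raise ValueError("quantity must be non-negative")
        return value

    @pytest.mark.parametrize("text,expected", [
        ("0", 0),            # boundary
        ("1", 1),            # smallest positive
        ("999999", 999999),  # large value
    ])
    def test_parse_quantity_valid(text, expected):
        assert parse_quantity(text) == expected

    @pytest.mark.parametrize("text", ["-1", "abc", ""])
    def test_parse_quantity_invalid(text):
        with pytest.raises(ValueError):
            parse_quantity(text)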
Don't forget a very important thing: unit tests hinder refactoring. People will resist refactoring because it means they have to rewrite their tests. And if you don't have enough e2e tests, you have nothing to check that your refactoring hasn't broken anything.
Very true. A lot of unit tests, especially the ones with mocks, tend to test a lot of internal behavior. So when you refactor your system, you often have to rewrite unit tests even though the system still performs correctly.
Until you realize that your API is wrong. So long as you got your API right up front, unit tests make it easy to refactor. As soon as the API needs nontrivial changes, you are sunk.
If your API is wrong, and you have unit tests for that wrong API, and you have time to rewrite it the right way, you can write a set of methods with the correct API as a shim over the wrong one (you don't care about perf here), write your refactored version, and then switch your now-tested shim over to the new implementation (and probably keep it as a reference for the new way of doing things).
If you find yourself doing this frequently, though, unit tests might not be ideal for your situation.
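As a rough sketch of that shim idea (names invented; old_fetch stands in for the wrong API, fetch_users for the corrected one):

    # Existing "wrong" API: takes a comma-separated string of ids.
    def old_fetch(ids_csv):
        return [{"id": int(x)} for x in ids_csv.split(",")]

    # Correct API, written first as a thin shim over the wrong one. Tests written
    # against fetch_users() keep passing when its body is later switched to call
    # the refactored implementation directly.
    def fetch_users(ids):
        return old_fetch(",".join(str(i) for i in ids))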
> Don't forget a very important thing: unit tests hinder refactoring. People will resist refactoring because it means they have to rewrite their tests.
Well, unit tests verify a contract. If a developer wants to change code but is incapable of respecting a contract (i.e., preserving old invariants or adding new ones that make sense), then he should not be messing with that code at all, should he?
In that sense alone, it's quite clear that unit tests pay off and add a lot of value, namely in avoiding needless regressions from Rambo developers who want to change code without caring whether it even works.
Test the heck out of the external, documented contracts. Then the test suite helps you refactor: reshuffle the internals while maintaining the contracts.
As others have said, if/when it comes to this, it is because the code in question has changed in a significant way. So the old tests were, in effect, testing a previous version.
If unit tests are hindering your refactoring then you aren't writing SOLID code. The only time I find myself, e.g., splitting a class in two and then having to deal with its shitty tests is when I'm dealing with legacy classes that did five things at once. Oh, and their tests were just "did it crash?" tests, i.e. call a method but only check the happy path.
If you have reached the point where you feel tests hinder refactoring then you have two paths: a/ learn to write smaller, SOLID classes, or b/ meh unit tests are stoopid.
I think it's a failure to ever believe that unit tests provide any correctness proof. The only thing you achieve with unit tests is to prove that you are still bug for bug compatible when changing your code.
I think experience is needed to know when to unit test. Some code might not need any tests at all while other code might need quite a lot of tests.
My personal experience, for the code bases I work on, shows that for those code bases mocking is usually not beneficial, we find more problems with integration testing and fuzzing. That holds for those code bases, it might very well be different for other code bases, and here's where experience comes in again...
> The only thing you achieve with unit tests is to prove that you are still bug for bug compatible when changing your code.
People love to ignore how fucking awful code is when you aren't forced to write unit tests.
Unit tests before you write code help you code, they provide a tight feedback loop, they force or at least encourage modularity to be testable.
Adding tests to my code after writing it forces me to challenge assumptions and frequently requires refactoring to improve the code along with testability. It also documents my intent better than comments can and often gives consumers valuable example usage code.
Yes it's a trade off and sometimes overkill but that's why 100% coverage is a stupid goal. You should always evaluate RoI of the tests you're writing.
Integration and end to end testing is much the same, they're simply different grain sizes, we need all of them to be effective.
The biggest mistake that happens with testing is assuming it's all about correctness. That assumption is why we have 100%-coverage people around: they're equating 100% coverage with 100% correct, which is just a wrong conclusion.
I used to do TDD, I found that it didn't work that well for me.
I still think that the code base you work on dictates how you should be testing. It also dictates which testing strategy I use.
I test for different things when writing C than I do when I write Python for example. My testing strategy is different if I write a networked C application compared to if I write a Ruby on Rails app.
Also the tools I have available to me when writing the code dictates how I will test, which is tied into which language I write in.
And I really disagree that code has to be awful because you are not forced to write unit tests. I work with a team of 4 other experienced and responsible programmers. We actually don't have to have rules that force us to do anything. We are responsible enough and experienced enough to know what to do and when to do it.
40-50% of our code base doesn't have a single unit test, because it doesn't need any. Are we infallible? No. Do mistakes happen? Yes. Do we sometimes go back and add unit tests to code that we thought didn't need any? Yes, it happens.
Would unit tests have saved us sometimes? Yes. And when that's the case, we take responsibility and go back and add that test.
I think the point was more that many people can’t write unit tests because their code is so shoddy that it’s basically impossible to do so.
It sounds like you and I share a similar methodology. I don’t write unit tests for all my code. In fact if I had to guess, I’d say I generally test less than 25%.
But I do write all my code such that it’s capable of having unit tests written for it.
In many codebases that’s just not possible. The developers have spent all their time writing and none of their time thinking. It’s one big yarn ball that’s impossible to write unit tests for. For these people, forcing them to write tests up front would have resulted in a much cleaner code base.
I do enjoy TDD, in that it makes it easier for me to ensure that I implement everything I need.
I completely agree that it's not always necessary, but I definitely find it a useful tool.
That being said, TDD with ML applications is a bunch trickier, as you'll normally have functions/classes which take an unfathomably looooooonnnnngggg time to run.
40-50% is perfectly acceptable for your team by the sounds of it. In other teams with varying degrees of experience and skill things are a bit more wild.
The fact that you can add tests means your code is at least somewhat scaffoldable, and that is the important part for testability.
For the record I do TDD maybe 20% of the time, I mostly just use it in tough problems where I don't know what to do next.
If you have a web API that returns a 200 OK with a JSON object with a missing entry when an item is not found, instead of a 404 error, then you have a bug in your API.
But you cannot allow it to change because your users are relying on the existing behavior.
And this is the case for, what, maybe 0.1% of unit tests?
And you're using that... to justify the claim that the "only" thing unit tests achieve is maintaining bugs? Although in any case, this is no longer a bug; it's part of your spec now.
But regardless, what about the other 99%+ of unit tests that are enforcing correct behavior...?
Your argument is like saying that food is bad because people occasionally get food poisoning.
I think that a well covered codebase should include mostly unit tests, plenty of integration tests, and few E2E tests, i.e. model the test pyramid[1].
Mocking in unit tests is generally acceptable to avoid also testing other systems. E.g. if the function does OS or network calls it's a good idea to mock that and keep the test isolated. But you should also have integration tests for the feature which do make external calls without mocking (or mocking at a higher level), as well as E2E tests that test the system as a whole. This way you can generally quickly pinpoint where something breaks, which may or may not break higher level tests.
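A minimal sketch of such an isolated unit test (the fetch_status function is invented); an integration test for the same feature would hit a real, e.g. containerised, endpoint with no mock:

    from unittest import mock
    import urllib.request

    def fetch_status(url):
        # Unit under test: report whether an upstream service responds.
        try:
            with urllib.request.urlopen(url, timeout=1) as resp:
                return "ok" if resp.status == 200 else "unavailable"
        except OSError:
            return "unavailable"

    def test_fetch_status_handles_network_failure():
        # The network call is mocked, so the unit test stays hermetic and fast.
        with mock.patch("urllib.request.urlopen", side_effect=OSError("connection refused")):
            assert fetch_status("http://example.test") == "unavailable"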
I’ve seen some people make the distinction between “unit” tests and “programmer” tests. Using this terminology, “programmer” tests are what are used in TDD; they help with development and cover the desired functionality, not the implementation. When code is refactored internally, no such tests should break. “Unit” tests, on the other hand, are called that because they test an isolated unit of code, frequently with all of its dependencies, both internal and external, heavily mocked. These tests, when created, exist for code-quality purposes, are almost always tightly coupled to the implementation, and must be rewritten when the implementation changes.
Having noodled on these comments for a few, I share your view, but I'm curious whether the type of testing should vary based on seniority/carefulness. I find people earlier in their careers (or people with less care) write code where the failures at boundary conditions are obvious to me, but not to them. I can see unit tests (or QuickCheck) paying big dividends with that group while being lower ROI for more experienced / more careful teams.
Points 1 and 3 are the dogma, but mostly I find them false. My integration tests are fast because they deal with small data sets. A few dozen function calls add a few ms; who cares.
I find it easy to find where an integration test broke, because I back out my last change.
> I think people have learned to undervalue the properties of integration tests and overvalue the properties of unit tests. Is it nice to know exactly what's broken based on the test failure? Sure. Is it _as_ nice as having confidence that the whole system is working? Probably not, in a lot of cases.
This comment alone demonstrates a colossal misunderstanding of unit tests.
Unit tests are not used to verify that the system is working. They were never used for that. At all. Unit tests only specify invariants, which are necessary but insufficient to verify that the system works as expected. These invariants verify that specific realizations of your input and state lead your implementation to produce specific results, no matter how the code changes. If you expect function foo to return true when you pass it a positive integer, you want to know if a code change suddenly leads it to return false or throw an exception. That's what unit tests are for.
If unit tests worked anything remotely similar to the way you misrepresented them, we would not need integration or end-to-end tests at all, would we?
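Pinned down in code, the invariant described above might look like this (a trivial sketch; foo is just the placeholder name from the comment):

    def foo(n):
        # Placeholder implementation; the test only pins the invariant.
        return n > 0

    def test_foo_returns_true_for_positive_integers():
        # If a later change makes foo(3) return False or raise, this fails --
        # even though passing says nothing about whether the whole system works.
        for n in (1, 2, 3, 42, 10**9):
            assert foo(n) is True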
This blog post tells us that we shouldn't use mocks but should instead "send data" to the code we are testing. I think this is supposed to mean using dependency injection instead. But the case isn't really made. Instead, with a waving of hands and wild assertions such as that I'm lying to myself, I'm left wondering what I just read.
The previous post was on cargo cult programming. It warns that cargo cult programming is bad but can't seem to define it. There is "how to spot" advice but no evidence to show that the advice works (except for a minor appeal to authority).
Yes, English is probably a second language, but the writing has more fundamental problems than that. The author needs to consider what a thesis is and how to support it. And after that, who the audience is. Etc.
Unfortunately, integration tests are too slow, so the practice doesn't scale if one is trying to TDD.
Insinuating integration testing into every user story will lead to friction. Test run times will balloon, cycle times will get extended and resentment will grow for the test suite and the team's testing regime.
If your test suites cannot complete quickly (seconds), then they cost more than they're worth. I've learned this about outside-in TDD at-scale. Our code quality is glorious. But our test run times are untenable.
I'm experimenting with sociable tests to curb my appetite for integration tests or at the very least keep on writing them but make it safe/productive to run the vast majority of them on the CI/CD server only.
> Unfortunately, integration tests are too slow, so the practice doesn't scale if one is trying to TDD.
If integration tests get more useful outcomes than unit tests in some situation but TDD only works well with unit tests, maybe that means TDD isn’t the best process to use in that situation. Isn’t the essence of agile development to recognise what is working well and what is not, so you can make changes where they are needed?
> If your test suites cannot complete quickly (seconds), then they cost more than they're worth.
I disagree with this. The goal of testing in software development is to improve results. Any testing methodology should be evaluated accordingly. Fast feedback is good, other things being equal. However, if going slower means getting more valuable feedback — identifying more defects, providing more actionable information about how to correct them, checking for a broader range of potential problems — then maybe going slower is the right choice.
Integration tests, for my team at least, really are the workhorses for determining whether we have shipping software or have more work to do. So much of our deployed code relies on configuration and packaging to function properly, and unit tests just don't exercise that!
There have been so many errors in our code over the years that only came out in our integration tests. I agree with you that, given my team's current maturity level, going slower and having multi-minute test runs is okay given our incredibly low defect rate in PROD.
Those long running test suites have created too much friction for new feature development. So, while the way we are working is comfy and life is easy, I feel we can do much better.
I believe it is possible to have rapid test cycles with our outside-in style TDD by keeping the added safety of integration tests but run the majority of them on the CI/CD server only, and change the way we unit test to rely less on mocks which should get us the 2-5 second inner loop cycle times we wish to hit.
I find I can do ITDD effectively with integration test times of < 30 seconds. I've even done it with tests of up to 5 minutes.
What makes it work is pairing it with a REPL. That way I can have an "outer" loop that triggers the REPL and then an "inner" loop where I can experiment in the area where I'm writing the code and get feedback quickly.
I might run the outer loop just a few times:
* At the beginning of the ticket
* When I've messed up the state with the REPL and want a fresh slate.
* When I've pasted code I think will hypothetically make the test pass from the REPL and I want final verification.
* One or two more times after that coz I missed something stupid.
Often the waiting times are a good time to get up and go for a walk/make a coffee/check my messages.
My team is oriented towards solitary tests. For unit tests, mocks stand in for collaborators of the system under test. For integration tests, databases and queues are realistically simulated with locally running instances, and downstream services we call are wiremocked, with local configs pointing service clients to those mocks. It's an incredibly effective way to produce reliable, fully tested code. It's also incredibly costly execution-time-wise.
James Shore has a nice take on sociable tests; I interpret what he's produced as a mishmash of unit and integration tests. I have trouble making my brain work that way, but I am intrigued given his impressively low test run times. There is something special there; I feel it just needs a sexy paint job for people to take notice. https://www.youtube.com/watch?v=jwbKSiqG0DI
Generally you shouldn’t really be using integration tests for TDD, but it’s totally possible to write your tests in a way where the dependency supplied is talking to a real system in one scenario and in another scenario is talking to a system with mocked responses - or any sort of level of depth in between.
However, I wouldn’t start out writing a system like this (aka TDD). From my experience, the best tested software looks like this as the end result and the design of the software itself has been forged in the same furnace.
Not sure if that completely makes sense, since some of these concepts are from functional programming and I never know what is totally foreign or totally obvious.
Definitely. Most of our developers do do that. We typically only run a subset corresponding to just the service containing the feature we're focused on.
On our larger services, unit tests take 20-30 seconds to run (WAY too slow) and integration tests take 2-4 minutes to run. In both suites we have hundreds of tests. There's no excuse for how slow the unit tests are. I chalk it up to the heavy reflection in the mocking frameworks we rely on and inefficient use of those mocks. Got to either optimize or decrease the use of mocks in our tests.
>I'm pretty sure he means "just use integration tests".
Really he means e2e testing. You can drive yourself mad writing integrations that break constantly or provide false positives. Better to unit test what is truly unit testable, and then rely on e2e to ensure your integrations are fine. Ultimately all that matters is the system running as expected at the user level.
I think that depends on the scope of what you're testing. If you have a database, a set of backend services and 1 or more clients (websites, mobile apps, etc.), then I think it makes a lot of sense to test the backend services in isolation from the frontends (which would mean not an e2e test), but backed by an actual database (so integration tests, not unit tests).
Why stop there? Why not delete your test methods too and just test in production?
Mocks are just test code, same as your test functions. And they’re necessary for unit tests. If the thing you’re testing talks to another component without a mock, it’s now an integration test instead of a unit test.
Unit tests test the API surface of a component. They’re useful for ensuring a component adheres to its documented API contract. They’re also useful for testing how it reacts to edge cases. Integration tests are useful for ensuring components work together, and for testing that data flows through the system properly. But integration tests make it harder to test various edge cases.
For example, if my component talks to yours and issues several requests to your component with callbacks, it may be that 99% of the time your component calls the callbacks in the same order. So I can’t really validate in an integration test that my component works properly if the callbacks are invoked in a different order. By mocking your component in a unit test, I can test any order I want.
Unit tests also tend to be more deterministic. By mocking, I can eliminate sources of non-determinism from outside my component. As a trivial example, my component might need random numbers, so I might mock the random number generator to return a specific chosen sequence that I know will exercise different code paths.
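For instance, a small sketch of that RNG case (pick_retry_delay is an invented unit under test):

    import random
    from unittest import mock

    def pick_retry_delay():
        # Unit under test: back off 1s or 5s depending on a random draw.
        return 1.0 if random.random() < 0.5 else 5.0

    def test_both_branches_run_deterministically():
        # Mocking the RNG removes the non-determinism and exercises both paths every run.
        with mock.patch("random.random", return_value=0.1):
            assert pick_retry_delay() == 1.0
        with mock.patch("random.random", return_value=0.9):
            assert pick_retry_delay() == 5.0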
Really the lesson that should be taught here instead of “don’t use mocks” is “unit tests aren’t sufficient, write integration tests too”.
> Really the lesson that should be taught here instead of “don’t use mocks” is “unit tests aren’t sufficient, write integration tests too”.
I completely agree with your comment.
I also think that people underestimate the kinds of scenarios unit tests can test; scenarios that are not possible or not practical to test by other means:
Unit tests are excellent for testing edge cases and scenarios that do not generally happen in normal production. It makes it easy to test things like 'how does this function behave if X returns some corrupted data?'. I feel like a lot of products don't test the failure cases at all, thinking 'that will never happen'. Mocking external services is a very straightforward way to test these kind of scenarios.
Just last week I was looking at a typical heisenbug that did not happen in 99%+ of production. The bug was masked by a combination of caching and the program flow. I was able to write a very short and specific unit test that was able to deterministically mimic the (bad) state the program was in and reproduce said bug.
There was however no way to properly test this using integration testing because it was relying on specific program state to manifest itself. Even if we had an integration test with the exact same data that caused the bug to appear, it would not be a deterministic test and future code changes might make it disappear even though the underlying bug is still there (or re-appeared).
> Why not delete your test methods too and just test in production?
This, but unironically.
Depending on your use case, any local tests, whether mocked or using a swarm of local containers attempting to represent production may be a far stretch from production reality. Put everything behind feature flags and test your contracts, then release to production regularly and test against live data and live services.
You don't test in production. By definition, prod is what you care about and don't want to break.
This is what preprod is for: an environment that stores, receives and processes the same data as prod. It replicates prod as closely as possible and errors or unexpected differences are investigated.
Basically, preprod and any not-prod environments cannot ultimately perfectly reproduce prod, and in a sufficiently complex system, the failure states cannot be predicted in a replicated environment. Instead, using strategies like rolling deployments, you can test on the actual environment you want to validate.
If set up properly, it doesn't really risk breaking prod.
Honestly I'd rather have all the units covered in isolation, and leave it up to my manual testing to catch any unexpected errors or bugs, then let our QA team do the same with their own Selenium based black-box testing.
If the contract between two units is flawed, you only have to worry about updating the behavior of one or both units. The unit tests make sure they work as designed.
It seems to abstract the problem for me to the level of only detecting missed integration issues. I don't need to worry about how those units do what they do, just that they do what they're designed, because the unit tests say so.
There's nothing I hate more than making a small change and then having to spend hours tracking down why 20+ tests broke because they were written as integration tests rather than unit tests (with mocking).
I'd rather maintain the mocks when making changes, than have to spend all that time trying to figure out what complicated and huge, far reaching, thing is broken.
The author seems to believe people either mock everything or don't mock anything. Obviously using mocks for all your tests is a very bad idea, but that's not how things are done generally.
Unit tests allow you to validate a unit's behavior very quickly. If your unit test takes more than 1 second to run it is probably a bad unit test (some would argue 1/100 second max so your whole unit test suite can complete in a few seconds). In unit tests you use mocks not only to keep the test hermetic, but also to keep the execution time as low as possible.
Then you should have integration & e2e tests where you want to mock as little as possible, because you want a behavior as close as production as possible. For those you care less about how long they take. That's because you usually don't run those tests at the same stage as unit tests (development vs release qualification).
The author does not make the distinction between different types of testing, the resulting article is of pretty poor quality imho.
I've certainly seen people who mock almost everything to test units at the smallest scale possible, because they think that's what they're supposed to do.
E.g., I once saw someone test a factory method like:
    def make_thing(a, b, c):
        return thing(a, b, c)
with a unit test where they mocked `thing`, and ensured that calling `make_thing(a, b, c)` ended up calling `thing(a, b, c)`.
They wrote just a shit ton of tests like this for every single method and function, and ended up writing ~0 tests that actually check for any meaningful correctness.
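Spelled out, the kind of test being described looks roughly like this (a self-contained sketch of the anti-pattern):

    import sys
    from unittest import mock

    def thing(a, b, c):
        return (a, b, c)

    def make_thing(a, b, c):
        return thing(a, b, c)

    def test_make_thing_calls_thing():
        # Restates the implementation line for line; it would still pass if
        # `thing` itself were completely broken, so it proves almost nothing.
        with mock.patch.object(sys.modules[__name__], "thing") as mock_thing:
            make_thing(1, 2, 3)
            mock_thing.assert_called_once_with(1, 2, 3)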
Harkens back to the early obsession with "100% code coverage", when Java robots were coding tests on bean getters/accessors.
100% code coverage was a bad breadth-first metric, when unit tests should go deep on many variant inputs. Also, "100% code coverage" ignores the principle that 80% of execution is in 20% of the code/loops, so that stuff should get more attention than worrying about every single line being unit tested.
Well, unless you were in some fantastical organization of unicorn programmers that had an infinite testing budget and schedule...
A good exercise is to get 100% coverage for anything that uses ByteArrayInput/OutputStreams. The language enforces handling IOException for a bunch of methods that could throw one for a generic stream but never for a ByteArrayStream.
You should see the opposite of this: every module of code is unit testable with zero mocks, and just a small subset of untestable IO functions is packed away in a neat corner.
I've seen a lot of tests where people just mock everything by default without thinking. Smart programmers at a good company. It's an issue that does deserve more recognition. Abuse of mocks is bad for tests.
I know which company you are talking about :). I agree that abuse of mocks is bad for tests 100%. But when I clicked the link I was hoping to read an article giving a nuanced description of mocks, with some analysis on when to use and when to avoid mocks. Instead the article is just an opinion piece that just says "Stop using mocks" as if that was actually an option.
>The author seems to believe people either mock everything or don't mock anything.
The author is saying that people frequently mock things that it would be more economic to just run because you've got the real thing right there. Building a model for it is an expensive waste that probably won't even match reality anyway and will demand constant maintenance to sync up with reality.
If you're overtly concerned with the speed of your test suite or how fast individual tests run then you're probably the kind of person he's talking about. Overmocking tends to creep in with a speed fetish.
When I am developing a feature, I want to know very fast whether or not my code's logic is correct. It is not rare during the development cycle to run the same test dozens of times because I made a silly mistake (or a few), and obviously if the test takes 30 minutes to complete it completely wastes my day of work.
Having a set of very fast running tests is absolutely necessary in my opinion.
Once I have validated that the piece of code I wrote is doing what I intended, then I want to run other tests that do not use mocks/fakes, e2e tests that can possibly take a whole day to complete and will allow me to see if the whole system still works fine with my new feature plugged in. But this comes AFTER fast unit tests, and definitely cannot REPLACE those.
This sounds exactly right to me. You write mocks for the things that could take too much time to run frequently with the real code. (And I'm assuming you'd also write it for things that you don't want to make actual changes somewhere, such as a third-party API that you don't control.)
But if it could be run locally, quickly, you wouldn't bother mocking it.
If that's all correct, I think you and I would do the same things. All the people screaming "no mocks!" and "mock everything!" are scary, IMO.
Mocks mean your code is too tightly coupled. You should be able to unit test your code by creating only fake data.
Things like dependency injection increase coupling to the point where you have to mock. Avoid dependency injection and other complexity within complexity features.
I agree with the premise of this post. Instead of mocking, design your components in such a way that they are easy to spin up in tests. Or use containerized versions of services.
I started disliking the idea of mocks a few years ago. I was writing a system based on the Play framework. Play framework used to (still does?) come with a dedicated integration testing environment. The problem with it was that the setup of the testing world differed slightly from the real world. I was bitten a few times by the real-world setup process while the tests were executing flawlessly. In essence, there was no simple way to test the real-world construction before deployment to production.
Since then, the only integration tests I accept as real tests are the ones that test the production code path. Database? In containers. Kafka? In containers. etcd? In containers. There are exceptions, though. Proprietary APIs like SQS, or Kinesis, or Firestore are the difficult ones. I usually replace them with hand written memory bound implementations with an abstracted API adhering to some general top level constraints (put / get / consume / publish). This does not prevent errors rooted in the wrong understanding of the design principles of the dependencies, for example, consistency guarantees or delivery ordering, but those are usually possible to cover with additional tests further down the line.
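A minimal sketch of such a hand-written, memory-bound stand-in (the publish/consume interface is invented, and it deliberately does not reproduce ordering or delivery guarantees):

    import queue

    class InMemoryQueue:
        """Stands in for an SQS-like dependency behind a tiny publish/consume API."""

        def __init__(self):
            self._q = queue.Queue()

        def publish(self, message):
            self._q.put(message)

        def consume(self, timeout=0.1):
            # Returns None when nothing arrives in time, like a polling consumer would.
            try:
                return self._q.get(timeout=timeout)
            except queue.Empty:
                return None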
I personally use mocks or stubs only in unit tests. Everything else should be live tests or recorded network responses, but always run through live code. Unless the service really depends on it, I do not recommend setting up a local swarm of services. This breaks down if you shift to distributed microservices and has little benefit for a single-service app (e.g. a CRUD + database app; no reason to test the database in the app tests).
The problem is not spinning up services, but setting them up to fail in the way your test case needs them to fail. Testing the happy path isn't that useful, just trying out the system should show that they work. Edge cases and error conditions are what needs testing.
There is no problem with that. A program execution is the result of a given input. An error condition is therefore the result of a set of conditions leading to a fault. Those faults can be recreated.
The only two things I was not able to test reliably so far are: network latency / failure and disk latency. That is possible with firecracker but I haven’t had a need to go so elaborate yet.
> But dude, I don’t want to use a real database (or AWS endpoint or rocket launcher) in my tests. Debatable, but fair enough.
Is this debatable? Hermetic tests give you a lot of things for free, and I don't really see a reason why you wouldn't default to making all of your tests hermetic.
The real thing that this article is touching on is that your tests should test all of the code you write but not code others have written. This sounds wild at first but: do you need to test that Postgres knows how to parse a query? Probably not. I think the postgres team knows how to test and release their database and I don't really need to spend time doing that. Then, the next layer of abstraction: if you use an ORM or some middle layer that abstracts the database do you need to test that the ORM knows how to talk to postgres? In a unit test, likely not. The ORM people have provided you an API and you should use their API to fake/stub/mock/whatever that system so you can focus on testing your business logic. After you have that system built you should then build integration/E2E tests that actually talk to hermetic copies of the real systems. An easy way to do this is to build a troubleshooting cli tool that you can run against your backend services/dbs/etc that can be used in CI against a copy of your backend or in prod to debug configuration issues.
You don't need to test that Postgres knows how to parse a query. But you do need to test that the query you wrote means what you think it means to Postgres. The only way to know that for sure is to hand the query to Postgres.
Maybe you don't need to test that the ORM is correct. But you do need to test that you are using the ORM in the way that the ORM's designer expected, which is often non-trivial. And so on.
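For example, a test along these lines (a sketch that assumes a disposable local Postgres, e.g. one running in a container, plus the psycopg2 driver; the table and DSN are made up):

    import datetime
    import psycopg2  # assumes a throwaway local Postgres is reachable

    def test_recent_orders_query_means_what_we_think():
        conn = psycopg2.connect("dbname=test user=test password=test host=localhost")
        try:
            with conn, conn.cursor() as cur:
                cur.execute("CREATE TEMP TABLE orders (id int, created date)")
                cur.execute("INSERT INTO orders VALUES (1, '2024-01-01'), (2, '2024-06-01')")
                # Does this WHERE clause mean what we think it means to Postgres?
                cur.execute("SELECT id FROM orders WHERE created >= %s",
                            (datetime.date(2024, 3, 1),))
                assert [row[0] for row in cur.fetchall()] == [2]
        finally:
            conn.close()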
The mocks are the symptom. The problem is that your code doesn't restrict side effects in any way. And so you end up with integration tests for everything and setting up a single test requires recreating the universe from scratch and slightly tweaking it on every run.
But that's what our program does, it talks to databases and file systems and HTTP servers!
Sure, and the effect of doing those things is moving data around. What does your program do with the results of these side-effects? Does it parse it? Transform it in any way? Decide whether to run effect A next with the result or effect B? This is the code that, if extracted, can be tested in isolation of databases and HTTP servers. You want as much of your code to be in this place as possible. It's a thousand times easier to test and the tests are thousands of times more reliable because they don't perform any effects.
Mocks have their places but if you can't test your code without mocking out the universe then the problem is that your code is interleaving too many effects with the core logic of the program. The cure is to refactor effects to the edges of your program and run effects in one place in your code. Make the rest just plain, pure code as much as possible.
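A tiny illustration of that split (names invented): the decision logic becomes a pure function you can test with plain data, and the effectful shell around it stays thin:

    # Pure core: trivially unit-testable, no database or HTTP in sight.
    def decide_payment(account, payment):
        if payment["amount"] > account["balance"]:
            return ("reject", "insufficient funds")
        return ("accept", account["balance"] - payment["amount"])

    # Thin effectful shell: fetch inputs, apply the pure decision, perform the effect.
    def handle_payment(account_id, payment, db):
        account = db.load_account(account_id)
        action, detail = decide_payment(account, payment)
        if action == "accept":
            db.save_balance(account_id, detail)
        return action

    def test_rejects_overdraft():
        # No mocks needed for the interesting logic.
        assert decide_payment({"balance": 10}, {"amount": 25})[0] == "reject"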
>Sure, and the effect of doing those things is moving data around. What does your program do with the results of these side-effects? Does it parse it? Transform it in any way? Decide whether to run effect A next with the result or effect B? This is the code that, if extracted, can be tested in isolation of databases and HTTP servers.
It's very frequent that this "decision making/calculation" code:
* Doesn't do very much.
* Isn't even in the top 5 sources of bugs for your app.
* Integration tests caught those bugs just fine anyway.
* The process of extraction balloons the size of the code base (all those yummy extra layers of indirection), sometimes introducing fun bugs.
It's certainly the right thing to do if you have a self contained chunk of complex decision making/calculation code (e.g. a parser or pricing engine).
However, if you do this as a matter of principle (and far too many do) then this advice isn't just wrong, it's dangerously wrong.
Be careful. Principled application of this advice can lead to separation of concerns. Programs that are broken down into independently verifiable modules can lead to an abundance of spare time and pursuit of new features in the absence of errors. Ensure your job security by making sure that it is difficult to test your code.
In all seriousness though it's smart to take a layered approach to testing. It is plain good engineering to be able to separate your program's concerns into independent, verifiable modules and to isolate side-effects from pure code. Definitely write a few integration tests to verify that the modules work together and that global properties like configuration have the desired behaviours. And by jolly live dangerously and test in production!
But don't do it dangerously. Make sure you have the right team and infrastructure to test in production safely without taking down production or frustrating customers. But test in production for reals. Integration tests are fine but nothing compares to prod. Not mocks, synthetic inputs, simulated environments, and not good intentions.
But what you’re describing creates tight coupling between your integration points and your internal logic that doesn’t look like tight coupling. You’re not feeding your code real data, you’re only feeding it what you think that external service will provide. And to validate that your tests are correct, you need integration tests again, and you need to enforce that your integration-point code always produces consistent, valid output.
So yeah, the advice to avoid side effects is good but still exercise your integration points with real data and services.
I'm confused as to where any mention of integration tests is. Integration tests can't use mocks, as then they wouldn't be testing the integration. Unit tests can use mocks, but as the article, the GP, and Mark Seemann[0] point out, that is always worse than writing your logic in a pure way, if you can.
Because you don’t gain as much as you think when you do this.
Before, your external service touches your integration point, which then runs some logic. You mock the external service to unit test the internal logic. Then you run integration tests to exercise the integration point.
In the new world you have an external service talking to an integration point which then passes the data to your internal logic. You don’t need the mocks anymore because you can just call your logic function with your data that would have come from the mock. Great! You run integration tests again to test the integration point. But now you have another linkage at the call site. Your internal logic function now has a dependency on the output of your integration point and that’s an invariant that has to be tested to catch someone modifying one but not the other.
With enough discipline you might be able to make the type system do these kinds of checks for you, but IRL very few people do enough to, say, catch an external string field changing format slightly.
I think I was confused because I naturally do what you say few people do. My external calls all parse into an object that you can then pass into the unit. It's the integration tests job to ensure that the parse works. Then its the unit tests job to ensure that you run enough that you can be reasonably sure that you cover all cases.
> You mock the external service to unit test the internal logic
You only need the output produced by that external service or the input expected by that service. For example, I can pass in a data structure or object representing the HTTP request to the method that I'm unit testing; I don't need to mock a client to generate those requests to test that logic.
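Concretely, something like this (a sketch; the request shape and handler are invented), with no mocked client anywhere:

    def handle_request(request):
        # Unit under test: pure logic over a request-shaped dict.
        if "user_id" not in request.get("params", {}):
            return {"status": 400, "body": "missing user_id"}
        return {"status": 200, "body": "hello " + str(request["params"]["user_id"])}

    def test_missing_user_id_is_rejected():
        # No HTTP client, mocked or otherwise -- just the input data.
        assert handle_request({"params": {}})["status"] == 400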
This isn't a terribly practical article. I don't disagree with mocks being an "alternate reality". The author is entitled to their opinion on whether this is a good or bad thing. This said...what is the alternative? Integration testing all the way down?
The implication here is to work with stubs over mocks (i.e. I need to work with S3; I would then abstract that to provide a StubObjectStore to replace the S3ObjectStore used by other pieces of my code during tests). Great; I know they work now. But at some point, I need confidence my S3ObjectStore handles everything correctly. Do I give everyone on my team a bucket? Perhaps their own test AWS account? Test it, but only in the pipeline on its way to an intermediate stage? I can't control how AWS writes their SDKs (spoiler alert: they don't stub well), but I need some confidence that I can handle their well-understood behavior that scales. Likewise, I often can't control the libraries and integration points with other systems, and mocking offers a "cheap" (if imperfect) way to emulate behavior.
For AWS specifically, I prefer to have an AWS account specifically dedicated to each service + stage. For example, if I have an image service that handles s3 uploads (say Lambda, S3, Cloudfront and API Gateway), then I'd deploy a "test" environment to a dedicated AWS account and run tests against that. Since it's fully serverless, it only costs a few pennies to test (or free).
That means that every person who runs the tests needs credentials for that AWS account. That obviously won’t work for an open source project. Even for a company project, how do you distribute those secrets? It adds friction for developers getting their local dev environment setup.
Not only that, but you now need network access to run tests. A network blip or a third party service outage now makes your tests fail.
There is also the possibility that an aborted test run might leave state in s3 that you are now paying for. Someone hits Ctrl-c during a test run and now you have a huge AWS bill.
> That obviously won’t work for an open source project
On the contrary - I was an employee at Serverless Inc, working on the Serverless Framework for the last two years, and we used this pattern extensively (and very successfully) in our open source repos.
We used part of our enterprise SaaS product to provision temporary credentials via STS and an assumable role, and it works great. You could do the same thing with something like HC Vault.
For Lambda, S3, DynamoDB, the perpetual free tier means we've never paid to run our own tests. API Gateway isn't free (after 1 year), but it's still pennies per month. We've had several cases where tests stuck around a long time, but a billing alert and occasionally some CloudFormation stack cleanup takes care of that.
We still have offline unit tests which test business logic, but everything else runs against the cloud - even our development environments ship code straight to lambda.
Speaking from personal experience, our team wasted far more money tinkering with local dev environments and trying to replicate the cloud than we ever did simply using it to develop.
The blog post in the parent comment lays out our experience and my thoughts, but because of the pretty generous free tier, I don't think we've ever paid a penny for a build/dev/test AWS account.
That still doesn't address my concern. I cannot test my S3 implementation without either mocks or a very specific emulator of the protocol. AWS happens to be popular enough that some libraries exist to do the latter, but I assert this is the exception rather than the rule for external integrations. You shouldn't be checking in code without at least some unit testing along for the ride, mocks or otherwise. It is indeed no substitute for integration testing, but it can certainly help catch bugs sooner rather than later.
I agree you need some integration tests, but, in my experience, if you define an interface that captures what you need from third parties, you can make 90% of the code you care about unit testable.
For me, this isn’t just theory: I’ve worked at a place that trained its employees to write code this way, and the benefit was obvious.
If your external integration has no local alternative, you are getting locked-in to its provider, so you should either not use it, or have an abstraction layer and implement an alternative backend.
It's simple enough to extract the interface of the S3 calls you make into a definition, and then write a test stub that unit tests can pipe fake data into.
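Something like this (a sketch; the interface is cut down to just the calls the code under test actually needs):

    from typing import Protocol

    class ObjectStore(Protocol):
        def get(self, key: str) -> bytes: ...
        def put(self, key: str, data: bytes) -> None: ...

    class StubObjectStore:
        """In-memory stand-in that unit tests can pipe fake data into."""

        def __init__(self, objects=None):
            self.objects = dict(objects or {})

        def get(self, key):
            return self.objects[key]

        def put(self, key, data):
            self.objects[key] = data

    def latest_report(store: ObjectStore) -> bytes:
        # Code under test depends only on the narrow interface, not on the AWS SDK.
        return store.get("reports/latest.csv")

    def test_latest_report_reads_expected_key():
        store = StubObjectStore({"reports/latest.csv": b"a,b\n1,2\n"})
        assert latest_report(store) == b"a,b\n1,2\n"

The real S3-backed implementation of that interface still needs its own integration tests, as the parent comment points out.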
Mocks make me sad. They are so often misused by people with the best of intentions. I have seen so many tests that literally only assert the mock expectations and say nothing about the code under test. I have watched upgrades and refactors take 5x longer because someone steps on a landmine of overmocking. It is so common to see people mocking dumb data objects, containers, and pure functions - creating Frankenstein widgets that break all contracts the original had except for the 1 code path the original developer used 6 months ago. I've listened to DRY/SRP fanatics defend mocks until they realized that their tests were littered with unreusable, low fidelity copies of their production system in the form of mocks.
Mocks are a power tool, and it's easy to use them for jobs they are not the right tool for.
Contracts cover more than what type systems in most common languages can express.
For instance, popping a stack with no elements throws an exception. Or, similarly, constructing an object with certain properties set causes methods on the object to behave differently.
Mocked objects at best encode what the behavior of a certain subset of interactions should be - at a certain point in time.
Another fanatic blog post about how something is always correct and another thing is always wrong.
Blog posts like this will lure you into thinking that there is a single right approach. Don't fall for it.
Do what makes sense. If it doesn't work, try something else the next time. Becoming good is about growing your ability to make the right calls, not blindly following a methodology.
One of the most obvious problems with mocking is that the team that develops some code usually also develops the mocks that are used for testing it. So precisely the same misunderstandings will be present in the code as in the mocks. In other words you are not really testing anything.
From my experience most errors are at boundaries between code from different teams. Mocking does not help here.
My favorite form of tests are what I usually call subsystem tests. Try to test as much code as is feasible with each single test.
Usually there are parts of your system that can be expensive to setup or use. For example, creating and filling a real database can be slow. In this case you could use an in memory database. Creating it and filling it with some representative data can be very fast. This database could be used by multiple teams, and is vastly superior to mocking.
A similar approach can be used with other expensive parts like remote procedure calls [0], or input from browsers.
This approach works when you design your system so that it can easily switch between using the real (expensive) resources and the ones that are only used for testing. But that is not very difficult.
I've generally had a little bit more success with mocking when I'm hiding that dependency behind my own interface. So for example in Java, instead of trying to mock the AWS provided class, I write my own class (like a facade or repository pattern) which has a very simple interface of a success case and maybe a couple of relevant failure cases. It's calling the AWS library within it. But my mocks are at the level of my facade class which I find easier. The drawback is I'm not sure if there's a good general strategy to test the implementation of that facade. Most of the times the implementation is simple enough that I can do some simple integration tests for the most relevant cases, but there's always a risk that I am missing out some weird edge cases and I don't know how to properly deal with that.
I think the nice thing about this is that the facades usually don’t change: you might add methods, but, once the code is written, it only changes if you’re doing something major like switching databases. Code that is written and then never changed tends to be less buggy than code that changes, so this sort of pattern tends to reduce bugs at integration points.
> QA says: “it doesn’t work”. Dev says: “it must be working, all the tests pass”.
I've never experienced this, and I'm kind of doubtful this is a true reaction. I guess maybe it could happen in a team that is totally dysfunctional, where there's zero trust between QA and developers .. but in that case mocks are not the problem.
The realistic reaction is "huh, guess we are missing some test cases".
Yeah, when I’ve run into this sort of bug, I always try to figure out how to make the unit tests reproduce the bug as a failing test case before attempting to fix the bug.
> QA says: “it doesn’t work”. Dev says: “it must be working, all the tests pass”.
Yup, the very next question here should be: show me the failing/incorrect behavior.
This means one of:
* There's a case the automated tests don't cover.
* There's a misunderstood requirement on the QA or Dev side.
* There's a broken machine or configuration in a deployment.
I did some contracting work for a company with a 1.5 hour test suite that ran on every deploy, the section with mocked tests only took a couple of minutes — the rest were end-to-end (no mocks).
The worst parts of those non-mocked tests:
* They would interfere with each other, and could not be run in parallel.
* They were subject to real-world variability and were not entirely deterministic.
Management wouldn’t budget fixing or replacing the tests. Bugs still regularly found their way into production; arguments to pare down the tests were vigorously rejected.
My takeaways: If possible, use a language with a strong type system to avoid writing as many tests as possible. Move as much application logic as you can into pure code (so it can be tested in isolation). Observe the test pyramid.
In effect a sufficiently strong static type system greatly reduces the kinds of (mostly undesirable) programs you can compile and in doing so eliminates entire classes of bugs. If you can’t produce those bugs, then you don’t have to test for them.
In the dynamic languages I’ve used (even the ones that don’t allow coercion) you can’t ensure the arguments passed to a function at runtime will be the correct type, so you’re usually stuck doing introspection or wrapping things in a try/catch (or try/except) block to stop an undesirable argument type from messing up your application, and you should probably have tests to verify that.
Working in languages with a type system that disallows nulls (nils, nones, etc.), you never ever have to test for null pointer exceptions; that should account for at least a few tests.
Moving to dependent types, you can even start to verify the domain logic of your application.
I think your argument may hold true for languages like Java, but it’s not applicable for the strong static functional stuff: Haskell, PureScript, Idris, etc.
> All of us have used mocks in our testing code, to substitute the real things. I know I’m definitely guilty of this. Why though? Why do we do it? Well, It’s usually because initialising those collaborators is not trivial (i.e. can’t be done with a one-liner).
I practically never mock something because I’m concerned that creating the real instance is more than a single line of code or non-trivial.
I don’t want my code making network requests if the actual instance is some kind of network/service client. As a matter of fact, I don’t want network requests in my unit tests. At that point, they’re not unit tests, and they could fail because the request failed or the system on the other end had issues - unrelated to what I’m actually trying to unit test.
I think the author fundamentally has a different idea of unit tests. I write unit tests and mock some dependency because I want to test some code in isolation where the “mock” is some interface with some possible inputs, outputs, or it fails perhaps by throwing an exception. And I can set up this behavior and validate my code gives me the expected outputs or makes some function call on a dependency.
The blog author seems to be conflating unit tests with integration tests, and I’m even wondering what the point is - if I take it that they’re objectively onto something, perhaps we should throw away our tests and just test in production.
Sorry, this is bad advice. If you test against 3rd-party systems, eventually your system will grow to the point where you are spending a good deal of time dealing with your 3rd-party's random breakage (downtime or flakiness rather than API changes).
I learned this lesson twice the hard way: once with social media networks like Twitter and Facebook, and the other with Google Drive.
Docker has made most mocking nonsensical, considering how easy it is now to use the real thing... but I would disagree with the premise of mocking being a non-starter. Often you want to unit test and don't really care whether you're using the real "thing", but want to hit code that's got zero to do with that dependency. Good example: we use Okta to authenticate. We want to run unit tests that test how a component in our UI works within our application, so we mock Okta to get around our authentication for testing that very thing. When we want to test authentication with Okta, that's what we do.
I feel this can be summarised as "integration tests > unit tests". And yes.
However, for tests you want to run locally and quickly, the question should be, can you get the speed advantage of unit tests with the largest possible amount of integration? Can you only mock things that are "outside the system" (such as DB, networked services, and file system)?
IMO the biggest risk of integration testing is non-cleanup. I feel that is one of the major positive use cases for mocking. A mock DB will not retain garbage from previous tests, or previous runs of the same test.
They can do a good job, but you need to be very careful. "Oh, it's faster if you start the container once and then run all the tests in it." Yeah but now your tests are all peeing in each other's pool.
The upside of mocks is that the values returned are literally hard coded.
Mocks and fakes are used to test your code under isolation - like an experiment, only one variable should be under test at one time.
In addition, most stateful code that can fail does so intermittently, is susceptible to load delay at scale that can affect your customers and your wallet, may not be at the released version you expect, requires clean-up so that your storage isn't clogged with useless test data (which may or may not be mocked), affects downstream analytic services, etc.
Isolate your unit tests with fakes and in-memory, test-local storage; run property-based tests with generated data for both inputs and returns that conform to the constraints of your domain, including possible expected error conditions; and run unit tests in parallel. If you do these things, you can and will have many of the benefits of integration tests, and won't waste time writing carefully crafted test data/scenarios.
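As a sketch of the property-based piece (using Hypothesis; the normalise_amount function and its constraint are invented):

    from hypothesis import given, strategies as st

    def normalise_amount(cents):
        # Invented unit under test: clamp negative amounts to zero.
        return max(0, cents)

    @given(st.integers(min_value=-10_000, max_value=10_000))
    def test_normalised_amount_is_never_negative(cents):
        # The property holds across generated inputs, including the error-ish
        # ones a hand-crafted fixture would likely skip.
        assert normalise_amount(cents) >= 0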
Integration test when you get a bug in production, and run the integration tests in an isolated environment identical to the one you deploy to prior to deployment to a place where your users will be interacting with your code. Cleanup can then simply be destruction of the test environment. There was a talk on blue/green deployment the way AWS SRE does it a few years back that was great. While I have the notes, I can't find the talk, but it's a much more complex process than running tests against your up system.
I used to use mocks an awful lot more than I do nowadays. I learned to do that style of testing from the book Growing Object Oriented Software, Guided by Tests (the "GOOS" book, which is still well worth a read, even if you don't subscribe to that style of test driven development). I'm still of the view that mocks are extremely useful if a) you're working in a highly object oriented style (where systems are composed of objects that communicate with their collaborators by method call) and b) you're unit testing relatively small units of code at a time. Mocks are very useful if it's important to you that object X must call object Y, do a computation and then call object Z with the result.
There are two key insights that helped me to eliminate the need for so many mocks: a) modern languages that support functional programming allow you to separate the concerns of computation and interaction with collaborators, and it's an awful lot easier to test drive a pure function than it is to test drive a graph of collaborating objects; b) modern hardware is sufficiently fast that it's much more feasible to spin up a whole service (or many services) in order to run tests on them, and you don't need to write fine-grained unit tests merely in order to make the tests run quickly enough.
> And guess what… you have to keep the mocks in sync with the real things
It is a valid problem but it's also possible to write mocks that are small and simple enough that this becomes trivial.
In my experience, when testing a module necessitates a complex mesh of mocks and intricate knowledge of inbound dependencies, the problem is neither the tests nor the mocks but tight coupling. The tests are simply telling us that in an unexpected way.
My rule of thumb is that if I can't mock it with "everything at default except what I want to test", then there's something wrong with the code, not the mocks.
And most integration tests can be eliminated by refactoring the code with proper contracts, so that if A depends on B, which depends on C, then testing A, B and C individually completes the chain of trust from A to C and doesn't necessitate an integration test from A to C.
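A sketch of the "everything at default except what I want to test" rule, assuming Jest-style globals; makeDeps, Clock, Mailer and greet are hypothetical names:

interface Clock { now(): Date; }
interface Mailer { send(to: string, body: string): void; }
interface Deps { clock: Clock; mailer: Mailer; }

// Everything at a boring default; a test overrides only the one thing it cares about.
function makeDeps(overrides: Partial<Deps> = {}): Deps {
  return {
    clock: { now: () => new Date("2024-01-01T00:00:00Z") },
    mailer: { send: () => {} },
    ...overrides,
  };
}

// Hypothetical code under test.
function greet(name: string, deps: Deps): string {
  return deps.clock.now().getUTCHours() < 12
    ? `Good morning, ${name}`
    : `Good afternoon, ${name}`;
}

test("greets by afternoon when the clock says 15:00", () => {
  const deps = makeDeps({ clock: { now: () => new Date("2024-01-01T15:00:00Z") } });
  expect(greet("Ada", deps)).toBe("Good afternoon, Ada");
});

If a test can't be written that way (one override, boring defaults for everything else), that is usually the coupling talking.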
> What you almost always want, when mocking, is really just different input data for your module, data normally provided by a long stream of collaborators. Just build your program in such a way that you can send it this data, regardless of the runtime (unit test framework, test env or production env). Yes, it is possible and highly desirable and don't be in denial right now.
Mocks are data in a nice, encapsulated and understandable form. I'd rather have a simple mock than "data provided by a long stream of collaborators". I can't recall how many hours I've lost trying to reverse-engineer a huge pile of data and configuration files because the "tech lead" wanted a "life insurance" to make sure we broke nothing.
Strongly agree, especially when it comes to things like AWS services. Their APIs and services evolve so quickly that things like local mocking (or emulation for local development) are an anti-pattern.
Where possible, I prefer to utilize short-term, pay-per-use infrastructure for development and testing.
If the system doesn't work but all the tests pass, then you're missing one of two things: code coverage, or articulated requirements. Either code that you're not testing is crucially responsible for the system outcome, in which case it should be tested; or the demands of a unit of code with full coverage and passing tests don't include something that a consuming unit needs it to do.
Neither of these conditions is an argument against unit testing or mocking. Both indicate a fairly basic gap in how you're testing that should be fixed directly. If integration tests would catch this gap, you're still missing the basic thing you need while incurring the cost and risks of integration tests.
I agree with the author. Some devs take unit testing too literally. I have come across tests which mock every unit that the component under test depends on. It's a terrible practice. First, mocks are hard to maintain, and second, over time this friction generates disparity between the real code and the mocks. I have seen tests that pass even when the actual implementations have diverged.
The rule of thumb is: if you can test without mocks, don't mock.
Most of the tests we write are integration tests, not unit tests. Even at the function level, a function being tested calls multiple other functions to get things done.
In production, I had about 1 in 1,000,000 failures. The most likely culprit was broken TCP connections, in turn most likely due to the enormous network load I was generating, which was only tolerable during rare windows.
I tried mocking Linux TCP socket behavior, but failed to find a formal description of the possible failure behaviors and had to guess at a lot of things; I ultimately didn't finish before the project was abandoned.
I prefer to mock everything, with one exception: Refactoring old code that wasn't built with unit testing in mind.
For this, I'll go with the "social" unit tests up to the point where the code calls an external service or crosses a network boundary. I'll always mock these things because they slow down your tests massively. A 500ms external service or DB call might not sound like much, but multiply that over 5,000 tests and you're waiting over 40 minutes. I'll still try to mock whatever I can that can be mocked with minimal effort.
"The classical TDD style is to use real objects if possible and a double if it's awkward to use the real thing."
I've been working in a classical TDD style for the past 8 years, after at least that many years of mockist TDD. A code base built in the classical style is much easier to maintain and change, but it does require more test setup, which can easily be pulled into reusable test data scenarios etc. We'll use fakes for services that are external (S3, DynamoDB, third-party APIs, etc.), and we'll use real DB code pointing at H2 instead of, say, Postgres; other than that there are no test doubles. I would not go back to using mocks by choice.
I have seen some very very bad tests written with mocks. The tests were written in such a way that you couldn't change the underlying implementation without completely rewriting them because they were testing underlying implementation behavior down to ensuring that certain methods were called on certain libraries used by the underlying implementation code. Please don't do this. It is a nightmare to improve the implementation.
Recently, I thought about a process I call "Test Coverage Driven Testing". It is similar to TDD, but more adapted to when we write tests after the code (you know you do too, at least occasionally, don't lie).
It goes roughly like this:
- write one integration test for the "happy path".
- using some test coverage report, identify untested cases.
- write unit test for those.
I find it helps me strike a good balance between time invested writing tests and benefits reaped from those tests.
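A sketch of steps 2 and 3, assuming a Jest setup where npx jest --coverage prints line and branch coverage; parsePrice is a hypothetical helper whose error branch the happy-path integration test never reaches:

// Hypothetical helper that the happy-path integration test only exercises with valid input.
function parsePrice(input: string): number {
  const value = Number(input);
  if (Number.isNaN(value) || value < 0) {
    throw new Error(`invalid price: ${input}`); // flagged as uncovered by the report
  }
  return value;
}

// Step 3: a targeted unit test for the branch the coverage report pointed at.
test("parsePrice rejects negative and non-numeric input", () => {
  expect(() => parsePrice("-5")).toThrow("invalid price");
  expect(() => parsePrice("abc")).toThrow("invalid price");
});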
> - using some test coverage report, identify untested cases.
From what I understand, that is not reliable; a line of code can be “covered” – i.e. executed – but still not be tested under all circumstances. If you have pre-existing code you need to write tests for, what you need is probably a tool for mutation testing.
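A tiny illustration of "covered but not tested", with a hypothetical isAdult function; mutation testing tools such as Stryker automate exactly this kind of check:

// Every line of this hypothetical function is "covered" by the test below...
function isAdult(age: number): boolean {
  return age >= 18;
}

test("isAdult", () => {
  expect(isAdult(30)).toBe(true);
  expect(isAdult(5)).toBe(false);
});

// ...yet mutating >= into > would still pass both assertions: the boundary
// case (age === 18) is executed, so it counts as covered, but is never asserted on.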
This is right. This is not a reliable approach since merely calling a function does not mean I tested all of its edge cases. And if my code depends on a 3rd party lib, I might not even have access to this lib's source.
OTOH, aiming for 100% reliability and coverage is way too expensive for most business apps. This is not like embedded software for a plane, where lives are at risk. I usually aim for 80-90% coverage of my own code, plus a regression test for each bug actually reported.
And by the way, if you really want zero error (planes, trains, cars etc), TDD is not enough anyway.
Developers end up hearing my admonition that their unit tests should not be uptime indicators for some third party API. Those external things should definitely be mocked.
Whenever/wherever there is a piece of code that can be tested as part of a unit test, write a simple test or two for it.
However, each time I run into 200 lines of setup code that will make the test pass trivially, I think an opportunity to simplify something was missed.
I'm not at all an expert in testing, but I instinctively take the attitude that every line of test code is code that itself has to be tested, even though it's never going into production. I don't entirely trust that instinct, but I'm suspicious of test rigs and mocks. Then again, I don't know how to test a "module" without them. What to do?
As the blog post says, sometimes people mock because they're incapable of running the real system. That's just another reason why it's important to be able to run your system... http://catern.com/run.html
Regardless of what this article says, mocks are really useful when your project is partially written and you want to test the pieces you have in hand. Mocking then is absolutely the right approach and might entail mocking many things (which will eventually be replaced by working code.)
Nothing should ever be mocked. Period. If you don't agree you likely don't understand many things. Think about why mocks exist in the first place. Mocks exist for things that can't be tested. What you are doing is creating something completely new in place of that thing that can't be tested and testing that new thing instead. Utterly pointless. If it can't be tested, you can't test it period. It's like saying drugs can't be tested on humans, so you create a plastic dummy in the shape of a human to test the drug on instead. Come on man.
There are books, there are experts, there are people with years of experience who think they know what they're doing, but the minute I see a mock in a code base (which is probably 99% of what's out there) I already know that the people who designed the system don't know what they're doing.
Nothing. Ever. Needs. To. Be. Mocked. Many of you are thinking you know better. You don't. Mocking is bad. Allow me to explain.
There are two parts of your code. Code that can be unit tested and code that can't be unit tested.
The distinction is simple. Any code that has to touch IO can't be unit tested. Period. Any code that doesn't touch IO can be unit tested. It's that simple.
Why do people mock? Because people write systems that are too tightly integrated with IO.
Imagine this function:
function addTwo(sock: Socket): number {
  return sock.getOneNumber() + 2;
}

interface Socket { getOneNumber(): number; } // some IO-backed socket type
Now you have a function that adds 2 to a number. But in order for it to be unit tested you have to mock the socket. The socket is a parameter polluted with IO. If that parameter touches any part of your code, then all of that code cannot be unit tested anymore, and you have to mock the socket if you want to regain unit testing.
What's the simple way to fix this? Easy: keep ALL IO segregated from the rest of your code. Keep IO functions and methods super small. Do not inject IO-polluted objects into other parts of your code. It's trivial:
function addTwo(x: number): number {
  return x + 2;
}

declare const socket: Socket; // the one module-level connection; IO lives only here

function getNumber(): number {
  return socket.getOneNumber();
}
There. No mocks. addTwo is a function that can be unit tested and getNumber is an IO function that can NEVER be unit tested. That's it. No need to mock it.
There are two types of IO functions. Input and Output. Inputs have void parameters. Outputs have void return values. These are the functions that can't be unit tested. If you keep these functions super small and tiny, guess what? Most of your code can be tested with unit tests and you're golden.
Instead what you'll see throughout your career is typically this garbage:
class RandomObject { constructor(param1: number, socket: Socket, ioService: IOService, logService: LogService) {} }
or some other overly complicated, over-engineered structure that necessitates dependency injection or some other garbage pattern that forces people to mock things to test.
Think about it. Every single method you put in that class cannot be unit tested. By using this stupid pattern you pollute that entire class file with IO and nothing can be unit tested unless you mock the socket and/or the IO service.
The problem is that this pattern, even though it's so obviously detrimental, is used practically everywhere, because its complexity makes it seem modular and "advanced" when really it's just bad.
Additionally I neglected to mention that it's not only IO. But overly complicated logic sometimes is mocked as well. To that I say it's the same problem as IO. If you find yourself mocking overly complicated logic to test some portion of your code it means that portion of your code is too tightly integrated with the rest of the universe. You need to loosen the coupling.
Mocking has a place, which is not unit testing. If you find yourself mocking a dependency in a unit test you are not unit testing any more.
Those points of contact with 3rd parties should be clearly defined and encapsulated at the perimeter of your system. Mock at that level. Not when testing business logic.
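A sketch of what "encapsulated at the perimeter" can look like, with hypothetical names: business logic depends only on a narrow PaymentGateway port, the real adapter is the only thing that knows about the provider, and tests swap in a trivial fake at that seam (Jest-style globals assumed).

// Port: the only thing business logic knows about payments.
interface PaymentGateway {
  charge(customerId: string, amountCents: number): Promise<"ok" | "declined">;
}

// Business logic: no HTTP, no SDK, easy to exercise with a tiny fake.
async function checkout(
  cartTotalCents: number,
  customerId: string,
  payments: PaymentGateway,
): Promise<string> {
  if (cartTotalCents <= 0) return "nothing to charge";
  const result = await payments.charge(customerId, cartTotalCents);
  return result === "ok" ? "order placed" : "payment declined";
}

// Perimeter adapter: the single place that knows about the real provider.
class ProviderGateway implements PaymentGateway {
  async charge(_customerId: string, _amountCents: number): Promise<"ok" | "declined"> {
    throw new Error("real HTTP call to the provider lives here; not wired up in this sketch");
  }
}

// In tests, a one-line fake replaces the adapter at the seam.
test("a declined payment surfaces to the caller", async () => {
  const declining: PaymentGateway = { charge: async () => "declined" as const };
  expect(await checkout(1000, "c1", declining)).toBe("payment declined");
});

Mocking happens only at that one seam; the checkout logic itself never sees a mock.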
what does unit testing have to do with whether or not you instrument the test with fake responses? those points of contact that you're mocking out at the perimeter, that data will sometimes need to reach a particular function through a collaborator...which you may want to mock?
sometimes the dependency is not a third party, but it may be code that requires a ton of setup (as mentioned in article) that's not worth the cost. it may make sense to just mock at that point to actually test the rest of the business logic in a function. I don't think i'd say "well, that's no longer a unit test!". You can argue that it's a more brittle test, sure.
update: also, i'll be honest that comments like this really rub me the wrong way. This type of dogmatism around what is or isn't unit testing (which is a pretty ill-defined phrase in industry) is something that needs to stop. I think it hurts new practitioners in the field who are misled into black-and-white thinking.
I'm sorry, I did not intend to offend anyone obviously. Needless to say, this is just my opinion condensed in a sentence (therefore lacking a lot of context, which I should have provided).
I was not aiming to define what a unit test is, more like when it stops being a unit test (which I thought would be an easier agreement to reach than a definition of what it is, but I guess I underestimated the task).
My point was that if you have to mock, for example, a DB call inside your business logic, well you are writing an integration test at that point, whether or not you mock the DB out. If you design your code so that you only have those dependencies at the edge of the system then you get, in my opinion, a much cleaner design and much more testable code.
Too much mocking (and/or more like mocking in the wrong places) is a code smell in my opinion.
by claiming when something stops being a unit test, you have to define what a unit test is. Now, I do think there is a good, reasonable definition of a unit test and it's looser than yours. A single function that reaches out for an API call via another function and does N other steps of compute that has the API call mocked is still a unit test.
can it be a smell? maybe, that depends. Does it break your idea of single responsibility? maybe. Is it an integration test? not ... really because the test is primarily designed to verify the behavior of the rules of the function and not its interaction with a third party system.
so while you can argue that mocking is a smell in that context, that doesn't change the fact that it's still a unit test!
all that said, one can still make a case that the fundamental unit of work for a given context is not really a function, and so testing functions are actually integration tests! so I'll also grant that these definitions can be very context sensitive..
this convo has made me realize that our terms for testing do not pair well with actual testing practices, which i find easier to conceptualize in terms of "coarseness" relative to fundamental units of work in a system as opposed to this binary notion of unit versus integration.
More and more I'm seeing "unit testing" becoming the generic term for what we used to just call plain old testing. A lot of companies had never had anything remotely close to unit testing but called all their automated tests unit tests, at least in part out of ignorance of the difference. My current company has gone a step further and calls their extremely manual and poorly defined QA test plans unit testing.
As for my 2 cents, I find the single-assert principle helpful, as it helps narrow down the unit of behavior you're actually testing. I don't care if a test covers a single function or half the code base, as long as it's clear what it is specifically testing for and what has gone wrong if it fails.
Hi detaro. Good question. If all my module does is make an api call, and I mock the API, what am I testing? I would rather leave that out of the unit tests because the added value is, in my opinion, close to none if you are relying on mocked data.
Now, if the module does more than just call the API then I would argue that it's breaking the single responsibility principle and would prefer to split it into a module that does only the call and another that does the rest.
Ok, then go one level in: If a component uses the "only makes an API call module", how do I unit test it? I can't let it use the module to make the API call (because the API might not be available in testing/that's an integration test/...), and I can't mock it, because using a mock would make it a not-unit test? I guess this gets at the line between mocks and stubs, but I never found that all that convincing.
Yes, you just don't unit test it, because if you mock out the only dependency it has you are left with nothing. So you are not unit testing anything anyway. You know what I mean?
That code can (and should) be tested, through an integration test.
The term "unit test" can mean a lot of things (which is very unfortunate).
Does it mean:
- A test of some of the requirements that can be done very fast. (That's my preferred form when it involves a large part of the code)
- A test of a small piece of code that makes a lot of assumptions about other code by mocking it. (Not a good idea, but it might work in your organization)
P.S. I did not read your post while I was writing mine.
Hi, I don't really understand who you are referring to when you say that "anything further can be safely ignored". Do you mean me? What exactly can be ignored?
Sorry, I would like to answer your comment since it seems that it upset you, and nothing was further from my intention. But I honestly don't understand what you mean.
Hey bluepizza. As I wrote in response to another comment, I think that too much mocking (and/or mocking in the wrong places) is a code smell.
I was not aiming to define what a unit test is, more like when it stops being a unit test, and I think that if you have to mock, for example, a DB call inside your business logic, well you are writing an integration test at that point, whether or not you mock the DB out. If you design your code so that you only have those dependencies at the edge of the system then you get, in my opinion, a much cleaner design and much more testable code.
Exactly. It may sound ridiculous but it happens all the time. You dig into the "unit tests" of the app and you find mocks for a db call, or an API call, or the system's date, or environment variables that some other part of the system will break if they are not defined (or all of the above). That's what I mean, these are all code smells that the code is highly coupled.