The biggest gain I have seen from Cucumber is executable specifications. That is, every development task is described somewhere, whether it's pages and pages of a Google Doc or a couple of lines in a GitHub issue. Code then gets implemented, and is almost certainly iterated on over time. The question is, does the specification ever get updated to match the code's behavior?
With Cucumber, you have a relatively clear specification (written in Gherkin's Given-When-Then syntax), and it always remains up-to-date because the developer needs to update it as corner cases or new scenarios arise. (Of course, this assumes you are running continuous integration with something like CircleCI and tracking code coverage with something like Coveralls.io.)
I don't think there's anything magic about Cucumber, vs. Turnip <https://github.com/jnicklas/turnip>, or Spinach <https://github.com/codegram/spinach>, or even Steak <https://github.com/cavalle/steak>. But I do think the Gherkin syntax encourages use of clear descriptions that make it easier for a new developer (or yourself, 6 months later) to understand what a feature is supposed to do, and in particular, to avoid accidentally breaking it as you make other changes.
It was more helpful to write "Given I have a non-existent file" than to write "(not (file-exists-p org-doing-file))". The former can be read by others who don't understand Emacs Lisp and they'll understand the concept, while the latter can only be read by Emacs Lisp developers. With the Gherkin syntax I'm free to move my tests over to another text editor and implement code that's specific to that editor. For example, I could use my tests to implement the package for Vim or Atom (something I might do on a weekend).
The best part is that you have the spec under version control and locked down instead of spread across multiple places.
At work, no one is really updating the functional spec of our project, which means information is falling through the cracks and ends up scattered across separate emails or IMs. Executable version-controlled specs have to be maintained, but at least they're in one spot.
rspec + capybara is much more enjoyable and maintainable. Maintaining a cucumber test suite is basically a nightmare. I hate writing tests for it, I hate fixing tests in it when they break, and I hate the random cucumber exceptions that cause random failures. And should you come to a point where you want to optimize the test suite's performance, good luck!
Cucumber is good if you're not the one writing or maintaining the tests. You just have to have a DSL that the test maintainers can use, and they handle writing and maintaining. But that's it.
If the devs are the ones who have to write the tests, as is far more common, then interfacing with capybara directly is much, much better than bringing cucumber into the mix. I used to be into cucumber, but once I switched over to a leaner approach I found testing was much easier and more enjoyable.
I agree with this. I've recently tried using cucumber for a couple side projects and have decided it's complete overkill for my needs. I see the major benefit would be communication amongst your teams (especially to translate the detailed technical stuff), but if it's a small team who are all devs I'm not finding it very useful.
Couldn't agree more. We built up a huge cucumber test suite and are now struggling to transfer it all over to RSpec with our unit specs. We were convinced we were doing it all wrong, but after speaking with some of the developers at thoughtbot it seems we're not alone. Transferring projects from cucumber to RSpec+capybara has become something of a regular occurrence and they have RARELY seen anyone use cucumber successfully.
> rspec + capybara is much more enjoyable and maintainable
It looks like it's limited to testing web sites, so it's not really useful for me. Gherkin can be adapted for all kinds of tests (even if I agree that it's far from perfect).
> Maintaining a cucumber test suite is basically a nightmare. I hate writing tests for it, I hate fixing tests in it when they break, and I hate the random cucumber exceptions that cause random failures. And should you come to a point where you want to optimize the test suite's performance, good luck!
You're right, in that if you are writing Gherkin specs as a replacement for something you can implement using Capybara, then I don't think you'll get the most out of Cucumber – it becomes an unnecessary abstraction.
If you're using it to communicate features to other team members, it can be super-useful, since it's much easier to comprehend than the equivalent developer-friendly code.
> If you're using it to communicate features to other team members, it can be super-useful, since it's much easier to comprehend than the equivalent developer-friendly code.
In Ruby/RSpec it's actually still understandable and direct. In other languages, maybe not so much.
I have not run into a cucumber implementation I liked. I agree it can lead to a human readable spec, but the effort it takes to get there, and stay there, is a nontrivial cost in terms of maintenance/flow/optimization. I don't think it solves technical problems, but organizational.
It's all about tradeoffs, with everything we do. Unless someone else is writing and maintaining the spec, I don't think the benefit of clarity justifies the downsides.
Cucumber is not "free" to use; it's a commitment with lots of fine print. I suppose that's really what I'm trying to advocate, in the end. At least that's been my experience across 5 or so projects with it. YMMV.
Cucumber is still quite relevant. Your alternative assumes Ruby; Cucumber does not.
I personally enjoy writing Gherkin as I get a large set of feature files that serves as documentation. I don't think it's that great as a DSL or to have someone besides the step definer writing the tests. Maybe that's why you found it hard to maintain. Oftentimes I define single-use steps because my other scenarios simply don't need them. It's definitely a poor solution if you are looking at it from a programming language or code reuse perspective. Instead I use it as a way, in English, to explain features. Granted I use it almost exclusively for integration/systems tests and not unit tests or anything.
This is frightening; is the state of the art in software development set by a process more reminiscent of music fans arguing about esoteric bands than any kind of formal rational engineering?
Pretty much yes. You will rarely see anyone using real evidence and studies to prove their point, you're more likely to see someone say "hey I heard so and so is doing this maybe just maybe it's a good idea" which is why you have half-hearted adoption of Agile for example and why you see people arguing against automating error-prone manual processes.
I don't think this is like that. There is no right or wrong, but pragmatically these tools have their upsides and downsides, and cucumber in my experience has been more trouble than it's worth; I would hope to give others pause before deciding to use cucumber, because it does come with some fine print.
I've been in the situation where I had a coworker who insisted on using cucumber and BDD. I would not wish that situation upon my worst enemy. It's ridiculous to think that a non-developer could write a test, and it's especially ridiculous to try to parse tests as regular expressions. Good luck grepping to figure out where the implementation of the test is. It's pointless meta-work that does not increase the quality of code.
This is digital snake oil. Matt Wynne, you're making the world a worse place. Can you find something else to do?
Ignoring the tooling, I don't feel it's fair to say that Wynne is "making the world a worse place". This quote in particular:
> Really the magic of BDD is in playing the game of, if we have to explain to the computer how to test the behavior that we want, we have to have figured out the behavior we want. By collaboratively doing that, by sitting down together and doing that we have to thrash out between those three groups what is it that we want so the tester and the developer and the business person are all on the same page about what is it actually going to mean for this story to be done. By the time they get to writing the code it’s a much more straight forward process. A lot of those potential bugs have been ironed out.
is incredibly important and often overlooked. It doesn't necessarily mean that a "non-developer" is writing an automated test, but that the engineering team, the QA team, and the business have discussed and reached an agreement over the acceptance tests. This implies that the test has been thought about, written, and understood by all parties, including the business person, a step that is often skipped in favor of "increasing turnaround time". Prototyping is fine if everyone agrees to it, but in a world where software is expected to ship every sprint, you can't skip mutual understanding. That is a conversation and no tool is going to automate it away.
> That is a conversation and no tool is going to automate it away.
That's my point. It's the developer's job to know what the product owner wants and write the appropriate tests for it. Cucumber brings nothing to the table.
Note from the field: the purpose of BDD/ATDD and Cucumber (aside from validating work and making sure the system is still up) is to work with the business to get agreement on common business terms. That means there's a fair bit of factoring involved as the system grows. Things like "When you say 'content', do you mean 'html content', or also user manuals?".
It's like English is the programming language, and you're constantly looking to refactor terms so that the English used is more structured.
What I'm beginning to see is cucumber being used for system or even unit tests. You'll get Cucumber full of magic numbers, table names, API calls, and it looks very scripty.
Testing is great, but this is using the wrong tool for the job. We end up confusing terms and thinking we understand BDD/ATDD when in reality -- not so much. (Insert long discussion here about teams working on middle layers, how APIs and microservices fit into the picture, and so on)
I think a good policy to try to follow is that nobody with a title including the word "software" or "engineer" should ever write a feature file without someone without such a title in the room or at the computer with them. If a test needs to be written and it isn't possible to get buy-in or time to follow this rule, that test should be written in a programming language at the developer's discretion. Not following this policy results in programmer frustration at writing code in a weird English / code hybrid, without any of the benefits of it catalyzing communication and elucidating business language and processes.
Everyone "knows" all this, but it's so tempting to just say "it will be easier to just let the developers do it this time", which eventually becomes every time.
> "Individuals and interactions over processes and tools"
Right. That means processes and tools should serve individuals by supporting their interactions, rather than individuals and their interactions being reformed to fit preconceived processes and tools. That doesn't mean that processes and tools are irrelevant, or that sharing experiences and ideas about how particular processes and tools can serve individuals in their interactions is contrary to Agile.
(As for the promotion of processes and tools, that value really has nothing directly to do with whether or even how those things should be promoted; it does have a lot to do with how teams, and who on those teams, should evaluate the processes and tools being promoted, though.)
That's perhaps why he says in the interview it's not important to write all the tests out in detail right there and then, but the discussion between the developer, business person and tester is the important part. (paraphrased)
Except that when most teams adopt the Scrum flavour of Agile, they're dropping many of the processes and tools that help (pair programming, test-driven development, etc.). So yeah, it's important to value individuals and interactions, but not at the expense of light-weight processes and tools that help deliver a quality product.
Having not yet read this interview (did you?), I'm willing to give the benefit of the doubt that he identifies some problems with Agile teams that COULD be solved with BDD and Cucumber (for teams that want and choose this approach)
One of the nice things about BDD tests in Cucumber/Gherkin is that they're written down in plain text (English), a language that managers, testers, and developers all presumably speak. The Gherkin language is actually very useful for facilitating communication/interactions.
I've found it very useful, where there's ambiguity about how a particular feature works, to write a Gherkin spec that describes it. The spec is easy to understand, and can be agreed upon or even edited by my colleagues who aren't developers, while remaining part of a CI process.
YMMV, but this remains one of the lazier dismissals of BDD.
I won't be quite so pessimistic about the possibilities as other people here, but I share their experiences. It seems possible in theory but nearly impossible in practice to walk the tightrope of creating a language convenient enough for non-technical people to read and write without getting frustrated by it, while retaining its programmability and maintainability without driving the developers crazy.
It's just a special case of the general problem with tools that aim to be easy for non-programmers to use while doing the same things as programming languages. It's always a leaky abstraction.
Most of the time I've spent with Cucumber in actual work projects has been wasted on trying to write steps to wrap idiosyncratic UI patterns which aren't handled out of the box. Of course, it's been years since then and the UI on that project was particularly horrendous.
That being said, I do find a lot of value in being able to translate text from a story - "As a user I should see <X>" directly into a test.
You can easily get very close to this with Capybara: 'expect(page).to have_selector(:css, ".article table th td", text: "hello world")'
Same functionality, but you won't have to support that silly regex abstraction layer. If this is still not human readable enough, I would still recommend writing your own DSL on top of Capybara rather than matching to regexes like Cucumber does.
The biggest problem with Cucumber is that most people trying it out don't understand what it is.
Cucumber is not a tool for testing software. It is a tool for testing people's understanding of how software (yet to be written) should behave.
Most bugs and delays caused by rework arise from misunderstandings, and this is the problem Cucumber aims to solve.
Cucumber is a tool that facilitates collaboration and software design (especially domain-driven design).
Here is how it works: You pop a story off your backlog and run a 20 min. meeting (Discovery Workshop) with business folks (BAs, POs, domain experts) and IT folks (developers, UX, testers if you have them).
You have a conversation about the story and come up with some concrete examples to describe the various acceptance criteria for your stories. Not in Cucumber's Gherkin language - just in plain conversational language.
For example: "The one where I upload a picture that is too big". Or: "The one where there are five taxis in range". These conversations act as catalysts to uncover subtle details where business and IT might have a different understanding.
Two things can happen at the end of this short meeting. You ask people to do a thumbs-up or thumbs-down vote on whether they understand everything that needs to be done, and whether the story is small enough. If enough people give a thumbs down, you send the story back for further analysis, maybe breaking it up into something smaller. If it's mostly thumbs-up, you're good to go.
After the 20 min. meeting you have 2-5 concrete examples that a developer (and perhaps a tester) can flesh out in more detail using Gherkin (Given-When-Then) to make it even more concrete. For example:
Scenario: Close taxis with higher rating win
Given taxi A with rating 0.8 is 1400m from the customer
And taxi B with rating 0.9 is 1500m from the customer
When the customer requests a taxi
Then taxi B should be assigned
The dev shows the example to the business person, who confirms that this is right (or wrong).
Now, the developer follows the regular TDD workflow, using the Scenario to guide the development of the core domain logic. The Cucumber scenario doesn't go through a UI using Selenium WebDriver or similar. The domain logic is implemented in such a way that external services, message queues and databases are stubbed out.
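To make "talking directly to the domain logic" concrete, here is a minimal sketch in Python of how the taxi scenario could be exercised without any UI. The business rule is an assumption: the scenario only says that taxi B (rating 0.9, 1500m) beats taxi A (rating 0.8, 1400m), so the sketch guesses at "among taxis within range, the highest rating wins". `Taxi`, `assign_taxi` and the 2000m `max_range_m` default are hypothetical names and values, not taken from any real project.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Taxi:
    name: str
    rating: float
    distance_m: int  # distance from the customer, in metres


def assign_taxi(taxis: Sequence[Taxi], max_range_m: int = 2000) -> Optional[Taxi]:
    """Pick the best-rated taxi within range (assumed rule, not from the thread)."""
    in_range = [t for t in taxis if t.distance_m <= max_range_m]
    if not in_range:
        return None
    # Highest rating wins; on equal ratings the closer taxi wins.
    return max(in_range, key=lambda t: (t.rating, -t.distance_m))


# Given / When / Then from the scenario, expressed directly against the domain.
taxi_a = Taxi("A", rating=0.8, distance_m=1400)
taxi_b = Taxi("B", rating=0.9, distance_m=1500)
assigned = assign_taxi([taxi_a, taxi_b])
assert assigned == taxi_b
```

Nothing here touches a browser, a database or a queue, which is what keeps a scenario like this fast and stable; a step definition would only need to translate the Gherkin text into these calls.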
Lower level unit tests are still written, and there are far more of those than Cucumber Scenarios.
Cucumber is there to make sure you write the right code.
Unit testing tools are there to make sure you write the code right.
Using UI testing tools together with Cucumber? Please don't - or at least do it very sparingly. UI tests are expensive to maintain (the UI is more volatile than your core domain). They are slow (2-3 orders of magnitude slower than tests talking directly to the domain logic). And finally - when they fail they don't tell you where the bug is.
The purpose of Cucumber is to bridge the communication gap between business and IT by providing a small set of essential scenarios to illustrate core behaviour of unwritten software. These scenarios do become regression tests, but their real value is to prevent defects by uncovering bad assumptions up-front. You end up with executable, living documentation accessible to everyone on the team: documentation of how the software should behave, and how it actually behaves.
Cucumber is a testing tool; depending on how you use it, it may facilitate BDD. It may not be your intention as the author for it to be a testing tool, but it is. Let's look at it some more.
Gherkin (Given-When-Then)
Scenario: Close taxis with higher rating win
Given taxi A with rating 0.8 is 1400m from the customer
And taxi B with rating 0.9 is 1500m from the customer
When the customer requests a taxi
Then taxi B should be assigned
Let's look at the Gherkin syntax. It follows a set format with forced English language. It gives a context (Given), some input data (When) and an expectation (Then). This syntax is an example-based specification and lives in a plain text file, generally with the file extension *.feature. We'll call the above example a feature, based on the file extension, and we'll call this Gherkin format an external DSL (domain specific language).
Now, on its own this feature file is useless; it's a plain text file. Why do I need Cucumber for a plain text file? Why can't I store it in a shared wiki where people can collaboratively edit it and track changes? If it's a plain text file to share knowledge, why do I need to use forced English with the Gherkin syntax instead of a more natural form for the intended audience? Why does it need to be text if it's demonstrating shared knowledge? Maybe a comic strip would be more appropriate for the domain. Why would I need Cucumber for a team to sit down and create this collaborative knowledge?
We need to follow this strict syntax because it's an external DSL which is used by an interpreter. This interpreter parses a feature file and asserts that a given input produces the expected output specified in the feature file's Gherkin DSL. Let's think about this some more: we give some context, we provide some data, we run a computation and we assert the output. This sounds very much like an automated test. If I were to write a definition of an automated test, this would be it.
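As a rough illustration of that interpreter idea (not Cucumber's actual implementation), here is a minimal sketch in Python: steps are registered under regular expressions, each Gherkin line is matched against them, and the captured groups become arguments. The `step` decorator, `run_scenario`, and the "highest rating wins" rule are all hypothetical, chosen only to make the example run.

```python
import re
from typing import Callable, Dict, List, Pattern, Tuple

STEPS: List[Tuple[Pattern[str], Callable[..., None]]] = []


def step(pattern: str):
    """Register a function as the implementation of steps matching `pattern`."""
    def register(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return register


@step(r"^taxi (\w+) with rating ([\d.]+) is (\d+)m from the customer$")
def given_taxi(ctx, name, rating, distance):
    ctx.setdefault("taxis", []).append((name, float(rating), int(distance)))


@step(r"^the customer requests a taxi$")
def when_request(ctx):
    # Assumed rule for illustration only: highest rating wins.
    ctx["assigned"] = max(ctx["taxis"], key=lambda t: t[1])[0]


@step(r"^taxi (\w+) should be assigned$")
def then_assigned(ctx, name):
    assert ctx["assigned"] == name, f"expected {name}, got {ctx['assigned']}"


def run_scenario(text: str) -> Dict:
    ctx: Dict = {}
    for line in text.strip().splitlines():
        # Strip the Gherkin keyword; only the rest of the line is matched.
        keyword, _, body = line.strip().partition(" ")
        if keyword not in {"Given", "When", "Then", "And", "But"}:
            continue  # e.g. the "Scenario:" title line
        for pattern, func in STEPS:
            match = pattern.match(body)
            if match:
                func(ctx, *match.groups())
                break
        else:
            raise LookupError(f"undefined step: {body}")
    return ctx


SCENARIO = """
Scenario: Close taxis with higher rating win
Given taxi A with rating 0.8 is 1400m from the customer
And taxi B with rating 0.9 is 1500m from the customer
When the customer requests a taxi
Then taxi B should be assigned
"""
run_scenario(SCENARIO)
```

Dropping a single word from a step, as in "taxi B with 0.9 is 1500m from the customer", makes the pattern miss and raises `LookupError`, which shows how tightly the plain text and the regexes are coupled.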
This interpreter is fairly fragile. We have a mismatch between the plain text feature files in our Gherkin DSL format and our test (not sure what else to call it?) execution, tightly coupled by fragile regexes. The implementation also promotes heavy mutation in the test (sorry, again it's not a test?) implementation, which ultimately leads to fragile assertions (aka tests).
@Given("^I am on the front page$") will mutate sending the browser to a page. It doesn't give an indication if it's successful or not. The function is marked with throws InterruptedException so I can only guess it blows up if it fails, or maybe not?
Then we have in TemperatureStepdefs @When("^I enter (.+) (celcius|fahrenheit)$") which finds an element by id and sends some key events. Again how does the subsequent step know this is successful, how does this step know the previous step was successful? We don't, we assume, our test may work or we may get silent errors which cascade down.
Then you see people use things like public String currentPage = "" and update it as they progress through the workflow and assert on it. Mutation, race conditions, silent failures: if you've used Cucumber for any amount of time you've been deep in these trenches.
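The failure mode is easy to reproduce in miniature. The sketch below is Python rather than the Java being discussed, and everything in it is hypothetical: `FakeBrowser` stands in for a real driver that does not raise when a page fails to load, `SilentSteps` mimics the shared `currentPage` field, and `CheckedSteps` shows one possible way to fail in the step that actually broke.

```python
class FakeBrowser:
    """Stand-in for a browser driver; navigation can fail without raising."""

    def __init__(self, reachable):
        self.reachable = set(reachable)
        self.url = None

    def visit(self, url):
        # Assumption for the sketch: an unreachable page is silently ignored.
        if url in self.reachable:
            self.url = url


class SilentSteps:
    """Steps that track state in a shared field, like `currentPage = ""`."""

    def __init__(self, browser):
        self.browser = browser
        self.current_page = ""

    def given_i_am_on(self, url):
        self.browser.visit(url)
        self.current_page = url  # records the intent, not the outcome

    def then_i_should_be_on(self, url):
        assert self.current_page == url  # passes even if navigation failed


class CheckedSteps(SilentSteps):
    def given_i_am_on(self, url):
        self.browser.visit(url)
        # Fail in the step that broke, instead of letting the error cascade.
        if self.browser.url != url:
            raise AssertionError(f"could not open {url}")
        self.current_page = url


browser = FakeBrowser(reachable=["/front"])
silent = SilentSteps(browser)
silent.given_i_am_on("/missing")
silent.then_i_should_be_on("/missing")  # "passes", although nothing loaded
```

The scenario built on `SilentSteps` goes green because the assertion checks the steps' own bookkeeping rather than the system under test.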
I digress
So, if Cucumber is not about the test part, why do we need the interpreter, which runs a computation with given input and tests that it matches the expected output? Without this part Cucumber is a set of flat text files. What do we get? Flat files? Why are these better than a shared wiki, Google Doc, or spreadsheet? They achieve the same thing: a canonical source of knowledge.
But Cucumber promotes a conversation. No: in an agile, productive organization features will be written collaboratively, and Cucumber may be a facilitator to reach this goal. Cucumber does not enforce this or ensure that it happens; it's promoted, but in no way is it a requirement for using Cucumber.
If you are already an efficient team delivering software you'll already be having this communication part. Cucumber isn't a tool for test so what does it give us if we are already talking and delivering? We have other better test tools (cucumber is not a test tool right?) and more efficient ways of collaboratively writing, sharing and tracking knowledge.
Cucumber is a facilitator to help organisations start having conversations and collaboratively share knowledge. I accept this; people who are looking at ways to improve things see this as a valid use case. It's a Trojan horse to get a more agile workflow in through the backdoor. My issue is that when Cucumber is used in this way, it's used when you are knee deep in mud; it's a technical solution to a mostly non-technical issue. Your organisation is dysfunctional, likely not delivering, and run as a command-and-control structure. Your team looking at Cucumber wants change, but the issues lie far deeper and Cucumber will not save you. Open up a Google Doc or a wiki page, sit down with your team and stakeholders, and first talk.
So really, what is Cucumber?
As a test tool it sucks. There are far better automated test tools.
As a shared knowledge base, it's easier to collaborate in something accessible to everyone where change is easier and can be audited. A wiki, a Google Doc, a spreadsheet, it really doesn't matter they all achieve the same goal.
As a facilitator, you have a non technical problem deeply rooted in how your organisation works. Sit down, have the talks, create the shared knowledgebase, solve the core issues then and only then look at technical facilitators.
TL;DR: Matt Wynne (the interviewee) talks about a basic problem that agile was supposed to solve: creating software that actually satisfies business requirements.
For him, Cucumber and BDD solves this problem.
But for all other bugs and code defects that happen during software development, mutation analysis works great at increasing the quality of unit tests: https://en.m.wikipedia.org/wiki/Mutation_testing
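For readers who haven't met mutation analysis, here is a minimal sketch of the idea in Python using only the standard `ast` module. It changes `>=` to `>` in a function and checks whether a test suite notices. `is_adult`, `weak_suite` and `strong_suite` are made-up examples; real mutation testing tools apply many more mutation operators than this single one.

```python
import ast

SOURCE = """
def is_adult(age):
    return age >= 18
"""


class FlipGtE(ast.NodeTransformer):
    """Mutation operator: replace every `>=` with `>`."""

    def visit_Compare(self, node):
        self.generic_visit(node)
        node.ops = [ast.Gt() if isinstance(op, ast.GtE) else op for op in node.ops]
        return node


def load(tree):
    """Compile a module AST and return its `is_adult` function."""
    namespace = {}
    exec(compile(ast.fix_missing_locations(tree), "<mutant>", "exec"), namespace)
    return namespace["is_adult"]


def weak_suite(fn):
    return fn(30) is True and fn(5) is False


def strong_suite(fn):
    return weak_suite(fn) and fn(18) is True  # adds the boundary case


def survives(suite):
    """True if the mutant passes the suite, i.e. the suite failed to kill it."""
    mutant = load(FlipGtE().visit(ast.parse(SOURCE)))
    return suite(mutant)


original = load(ast.parse(SOURCE))
assert weak_suite(original) and strong_suite(original)
print("mutant survives weak suite:", survives(weak_suite))      # True
print("mutant survives strong suite:", survives(strong_suite))  # False
```

A surviving mutant points at a gap in the tests, here the missing boundary check at 18. That is a different kind of defect from the misunderstood requirements that the interview is about.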
I can get behind the message that Scrum has been problematic because it focuses too much on selling PMs on practices to the exclusion of developer practices.
XP was much more balanced in this regard. I think Scrum loses a lot both in being so free-form to start and in so often being something that PMs or management bring to the table. XP in my experience always seemed to be more of a developer-led movement, and ultimately a development process without developer buy-in becomes a command-and-control process that kills creativity and stifles feedback.
I find the idea about describing the requirements to the computer leading to greater human understanding of the requirements interesting. Where I work, the product people decide random shit on a whim so this would be deemed "not Agile", but insightful nonetheless.
I have actually used Cucumber before, back when it was a new thing. It was mostly a boondoggle because this was a .NET shop, so our QA folks would figure out how they wanted their Cucumber tests to look (may have been Gherkin, not clear on the distinction), then the .NET devs would have to go off and write Cucumber parsers for that, then the tests could be run. You can imagine this cycle taking roughly forever. ("parser" is probably the wrong word, but, the thing that takes a bunch of English words, turns them into code, then evaluates it)