Hacker News
Must and Must Not: On writing documentation (acm.org)
182 points by amortize on Dec 9, 2019 | 63 comments



The primary rule of writing documentation is: Revise it, many times, on separate days. You need to keep coming back to it, reading it with a fresh mind that hasn't thought of what you wrote for at least half a day, preferably longer. What seemed perfectly clear yesterday will come across as ambiguous today. You keep revising until the changes get smaller and start to feel cosmetic.

Your job is to write concisely, using simple, unambiguous language that's quick to read and easy to digest. It's a skill that must be learned, practiced, and honed.

It's also a good idea to explicitly spell out the meanings of "must not" etc, because people (especially non-native speakers) sometimes get confused. Example: https://github.com/kstenerud/concise-encoding/blob/master/cb...

It doesn't take up much room, but it saves a whole lot of trouble down the line.


Large requirement docs need to be refactored too.

We got up to 600 pages for a spec one time. And we could only do that because our intern figured out a trick to keep the text editor from panicking on that much text (edit chapters, concatenate them only to produce the final doc).

We got a nasty architectural surprise one day because the same paragraph appeared, verbatim, more than 40 times (I think it may have been as many as 60) in different sections of the document.

But one of them was worded slightly differently from the others in the middle of the paragraph, negating some assumptions we had made about the information architecture of the system. I was pretty upset by this. What a hostile way to write documentation! It felt like sabotage.

To stop this happening again, I wanted us to move clauses like this high up in the document and, instead of repeating them, use one sentence to refer to the common constraint elsewhere. Then the difference would have stuck out like a sore thumb, and might not have been accepted. Unfortunately they weren't biting. As a consequence, document review got a lot slower after that incident.
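For what it's worth, a near-copy like that can also be hunted for mechanically. The following is only a rough sketch, not what we did at the time: it compares every pair of paragraphs with Python's standard `difflib` and reports pairs that are almost, but not exactly, the same. The function name, the 0.9 threshold, and the sample sentences are all assumptions made up for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def near_duplicates(paragraphs, threshold=0.9):
    """Yield (i, j, ratio) for paragraph pairs that are similar but not identical."""
    for (i, a), (j, b) in combinations(enumerate(paragraphs), 2):
        if a == b:
            continue  # exact copies are harmless; the near-misses are the risk
        ratio = SequenceMatcher(None, a, b).ratio()
        if ratio >= threshold:
            yield i, j, ratio


# Invented sample text: paragraph 2 differs from 0 and 1 by a single "not".
paragraphs = [
    "The gateway shall forward every request to the billing service.",
    "The gateway shall forward every request to the billing service.",
    "The gateway shall not forward every request to the billing service.",
    "Operators may restart the cache at any time.",
]

for i, j, ratio in near_duplicates(paragraphs):
    print(f"paragraphs {i} and {j} are suspiciously similar ({ratio:.2f})")
```

Comparing every pair is quadratic in the number of paragraphs, so on a 600-page spec you would probably want to group paragraphs by their first few words before comparing.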

One of my bosses at a previous job was a shockingly open and pragmatic person. On several occasions he stated that something we were planning to do was a trick you only get to use once, and did we really want to spend that trick on this particular issue or save it for something more important. This documentation trick felt cheap and poorly spent.


> Revise it, many times, on separate days. You need to keep coming back to it, reading it with a fresh mind that hasn't thought of what you wrote for at least half a day, preferably longer.

Agree completely. Can you recommend any resources that emphasize this point and provide similar helpful documentation guidelines?


Another idea is to try on different personas: Put yourself into different roles, e.g. the person who just looks at the examples and skips walls of text, the person who wants to read up on every nook and cranny, the person who has no technical background, etc.

The cool thing about text is that it can be many things to many people, and this is even more true with hyperlinks. Try to cover all bases at least a little bit and people will ignore the stuff they dislike and go after the stuff they like.


I don't know of any resources offhand, but the principles of technical writing are very similar to code writing. The biggest difference is that computers only need to be told once, whereas humans need a few (controlled) repetitions and some examples in order to digest something.

- Don't Repeat Yourself: Describe it in one place, and then refer to it elsewhere in the document.

- Componentize: Build up smaller concepts and then use those smaller concepts to build up bigger concepts, etc. Then you can continue to use simple language with the bigger concepts that are easily broken down because they in turn are described in full.

- Each part does one thing and does it well. Don't mix many different concepts in one description. Describe one thing well, and reference it elsewhere.

- Keep things short. The longer a section is, the harder it is to absorb.

- Naming is important. Use memorable, descriptive names for your concepts.

- Have examples that demonstrate your concepts. Don't try to pack many concepts into a single example; that just makes it harder to figure out.

- Have a glossary if there are more than 10 new concepts.

- Test your documentation on people who are unfamiliar with the project. Ask them where they're getting confused. Confusion is the human manifestation of a bug in your documentation.

- Always keep your goal in mind. What should the end result be with your reader? Anything taking longer than 10 minutes is a chore, and only the most dedicated would continue that far.

- U/X is important. UI layouts, source code, and documentation all have U/X components to them because a human will be interacting with them / consuming them in some form. Make it easy for them to navigate to what they need.


FWIW, a couple of books that have helped—and continue to help—me write better:

On Writing Well[1], by the inimitable William Zinsser

Clear and Simple as the Truth[2], by Thomas and Turner

I've written some more about the above two books here[3], in a different thread.

[1] https://www.goodreads.com/author/show/7881675.William_Zinsse...

[2] https://press.princeton.edu/titles/9445.html

[3] https://news.ycombinator.com/item?id=20062455


Also helpful is the Checklist for Plain English[1]. The gist is "less is more;" fewer words are easier to understand than many words, short sections are easier to understand than long sections, and familiar words are easier to understand than something you looked up in a thesaurus.

Of course, these guides are only helpful if you already understand what you're trying to convey; that's the real secret of writing well. Figuring out the correct sequence to walk a reader from "I don't know anything about this topic" to "I understand this topic" is more art than science, and to do so well takes years of practice.

[1]: https://plainlanguage.gov/resources/checklists/checklist/


There are some utilities out there that will give your writing an education grade level (such as the Hemingway Editor: http://www.hemingwayapp.com/). In the case of technical documentation, you want to aim for the lowest grade level you can. Your technical words and descriptions of systems will necessarily push the grade level higher. Every non-technical phrase would ideally be written as simply as possible.


I've taken some technical texts I've written in the past and run them through the xkcd simplifier - it identifies words that are not in the 1000 most common English words (? something like that).

https://xkcd.com/simplewriter/

I keep the words that truly add value to the project, and replace all the others with simpler ways of saying whatever I am trying to say. Usually in the process I lose a few lines of text (occasionally I can use a few bullet points instead). It's a pretty good experience to go through (for example, this post fails. A lot.)
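The same kind of check is easy to approximate locally. This is a minimal sketch and not how the xkcd tool itself works: `COMMON_WORDS` is a tiny stand-in list invented for illustration (a real run would load a list of roughly 1000 common words), and `uncommon_words` is a hypothetical helper name.

```python
import re

# Tiny stand-in word list; an assumption for illustration only.
# A real check would load a list of the most common English words.
COMMON_WORDS = frozenset({
    "the", "a", "to", "of", "and", "is", "it", "you", "that", "this",
    "use", "file", "read", "write", "can", "not", "must", "be", "in",
})


def uncommon_words(text, allowed=COMMON_WORDS):
    """Return words in `text` that are not in `allowed`, in first-seen order."""
    seen = []
    for word in re.findall(r"[a-z']+", text.lower()):
        if word not in allowed and word not in seen:
            seen.append(word)
    return seen


print(uncommon_words("You must not utilize the file"))  # prints ['utilize']
```

Project-specific vocabulary can be allowed by passing a larger set, e.g. `uncommon_words(text, COMMON_WORDS | {"server", "request"})`, so that only the avoidable fancy words get flagged.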


It would be nice to have something like this with common words used in programming contexts added to it. Maybe as a VSCode plugin or something.


Couldn’t agree more with that checklist. I’ve used Hemingwayapp.com to help simplify my prose in the past, which seems to enforce similar rules. I’d imagine Grammarly is similar, but I don’t have personal experience with it.


The computer scientist David Parnas had a great technique for helping his students who were having trouble writing their theses. His idea was this: every chapter and paragraph in your thesis should answer a question. Start your outline with questions. For example:

Chapter 1: What is this thesis about?

p1: What is the problem tackled in this thesis?

p2: Has there been previous work done on the problem?

p3: Can the previous work be improved?

Chapter 2: How did I approach the problem?

and so on.

Then the writer started answering the questions. The final step (usually) is to change the questions into headings or remove them.


Simon Peyton-Jones (of Haskell fame) also has a similar talk+slides that I find valuable to periodically re-visit. https://www.microsoft.com/en-us/research/academic-program/wr...


There was an incident related to this that got a couple of German colleagues in big trouble. They were told that a specific change "must not" go into a release, it was OK for it to wait for the next patch release. They heard "must not" as the negative of "must", as it is in German. This is wrong: "you must" and "you have to" are the same, but "you must not" and "you don't have to" are not the same. They thought they were being told "you don't have to", and so believed they had the freedom to get the fix to a customer sooner. It seems this caused breakage for a different customer and they were accused of being insubordinate until it was sorted out.


I think the more illustrative comparison in English here would be "you must not" vs "it is not a must to".


The difference between "must not" vs "need not"; in some languages the equivalent word functions like English "need".


> They were told that a specific change "must not" go into a release, it was OK for it to wait for the next patch release.

To be fair, the two clauses of this sentence are somewhat confusing: If it is "OK to wait" it somehow implies that "you don't have to", while "must not" is indeed imperative.


I think that's the GP paraphrasing, the real issue seems to me to be the fact that "du musst nicht" literally translates to "you don't have to".


Seems like the associativity is different: in English "must not foo" groups as "must (not foo)" while German's is like "(must not) foo" (taking 'not' as postfix there)?


Yes. Although their English was excellent, they thought that "you must not" meant "du musst nicht".


What's the level above excellent that includes the grammar of negatives?


Their English could have been excellent. People can communicate effectively and yet have small gaps in their knowledge that remain unnoticed for years.


Yes, for example, there appears to be a gap in the knowledge of many native English speakers, on this issue. "You must not" is not the negative of "You must" in English. It's an oddball exception, one that native speakers don't notice because they've been hearing "You must not" since about age two. So people accusing my former colleagues of not understanding the grammar of negatives really don't get it: they understood perfectly, they just weren't aware of a very important exception.


> It turns out that it's not necessary to have fancy formatting in order to communicate clearly; in fact, fancy formatting often distracts from the message you are trying to get across.

Man I wish the JavaScript community would grasp this better. I’m looking at you, GitBooks!


At least they still render with JS disabled. Gotta love when a text document won't render without script bundles.


"the problem is that I'm not a writer, I'm a software engineer, "

This is a pretty common attitude, but I would argue that you are not a competent engineer unless you can properly document what you have done, plan to do, expect of other systems, etc.

It may not be a favorite part of the job, but it absolutely is a critical and fundamental part of the job.


‘If you can’t explain it simply, you don’t understand it.’

Also, writing documentation helps me understand and reflect.


KV is missing the point. The problem is not really with generating the documentation, so much as it is with the software engineer lacking understanding of what they are writing about, what the security processes should be. That is why the engineer is at a loss and looking for some kind of a template, to rely on others who have already done the work. With regard to tackling security processes in a startup, one could refer to NIST standards. Maybe not the most suitable per se, but a starting point.


I’ve yet to see a solid execution of how to convey INTENT.

It’s the why vs what issue scaled back a layer. With software and systems you want the user to solve many problems with your solution, but you never let them know your exact intent and that causes shoehorning.


It's good to reference RFC 2119 and use the language there.

But other than that, it seems like a terrible response to the question asked, missing the point of the original question by at least 80%. Or phrased another way, it sounds like somebody's trying to push for RFC 2119 and sacrifices giving good answers to questions.


This is helpful to get all doc authors to use the same vocabulary. One ambiguous wording that I keep seeing is "X may not do Y". Does the author mean that X possibly doesn't do Y or do they mean it's strictly forbidden that X do Y (-> "x MUST NOT do y")?


Clearly what we need is for documentation to embrace the subtlety that can only be expressed by a double or even triple modal auxiliary: "when the server receives a request, it'll might could reply with either a standard response or an error code... unless it entirely fails, in which case it might not would do anything; the client may oughta detect this with a timeout, but may can instead wait for the operator to intervene and explicitly cancel the operation".


As outlined in Dr. Dan Streetmentioner’s Software Developer’s Handbook of 1001 Tense Formations


If the author means that X possibly doesn't do Y, and possibly does do Y, then I'd say just drop the "not". "X may do Y".

I'd use these:

X shall Y: X always does Y.

X shall not Y: X never does Y.

X should Y: it is expected that X will do Y, but it might not often enough that this case needs to be handled (as opposed to an "X shall Y" not doing Y, which would be considered a total failure of the system).

X should not Y: it is expected that X will not do Y, but if it does you should handle it.

X may Y: both X does Y and X does not do Y are OK.


To me, that reads as 'X is not allowed to do Y' without any ambiguity until you pointed it out and I thought hard about it.

Wondering whether that is a common response or not.


it's an ambiguous parse for sure, but the meaning is usually clear in context. it's possible to create a really ambiguous example though, e.g. "when pinged, the server may not respond". is this stating a requirement that the server never responds to pings, or is it a warning that the server doesn't always respond?


The problem is statement vs command.

"X may not do Y." as a statement implies that X doing Y is unusual, but possible.

"X may not do Y." as a command implies that X doing Y is a mistake.


A technical document must not include the term "may not". "Must not" or "might not".
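A rule like this is simple enough to check automatically. Here is a minimal sketch, assuming the documentation is plain text; the `find_may_not` name and the sample sentences are made up for illustration and are not taken from any existing linter.

```python
import re

# Matches "may not" in any capitalisation, tolerating extra whitespace.
MAY_NOT = re.compile(r"\bmay\s+not\b", re.IGNORECASE)


def find_may_not(text):
    """Return (line_number, line) pairs for lines containing "may not"."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if MAY_NOT.search(line):
            hits.append((number, line.strip()))
    return hits


# Invented sample document; only the second line should be flagged.
doc = """The client MUST send a version header.
When pinged, the server may not respond.
The server might not retain logs."""

for number, line in find_may_not(doc):
    print(f"line {number}: ambiguous 'may not': {line}")
```

The script can only point at the phrase; a human still has to decide whether "must not" or "might not" was meant.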


While I think there is great benefit in using concise and clear language, I highly disagree if anyone takes away the idea their documentation should be themed after an RFC (I do enjoy reading RFCs though!)

This [0] was imo an excellent discussion on documentation that gets to the issue much more than word and formatting choices.

The short is there isn’t “documentation” as one thing, but four.

1. Tutorials where you start to finish explain it to a newbie,

2. How to guides where you explain specific tasks,

3. Discussions where you lose all formal pretense and explain your INTENT (huge) and reasoning for decisions, like you would actually explain to a person, not a robot, not some cute marketing talk,

4. Technical reference that goes over the actual mechanics and the weeds that people need only once they’re working with your solution.

It was on HN at least once. [0] https://www.divio.com/blog/documentation/


Am I the only one who was initially confused by "Must Not"? As a non-native speaker, my logic was probably something along the lines of "Must Not" => "Not a must" => "You don't have to, but you can" - which of course is not what it means...


The meaning is unambiguous to native speakers. In "you must not run" the word "not" always modifies "run", never "must".


It's not that unambiguous. Compare "You must not do X" and "You need not do X". The sentences have the same form, and they are both common and idiomatic, but in the second sentence, the "not" modifies "need".


"Need not" isn't exactly common phraseology/is older phraseology, though; and a common way to phrase a similar sentence would be "You do not need to do X", and it's more clear that "not" modifies "need" in that context. Regardless, for a native speaker it should be pretty clear what both mean.


"Need not do ...": Unnecessary.

"Must not do ...": Disallowed / prohibited.


These used to be called Dos and Don'ts.

Do A, B, C. Don't do X, Y, Z.

Am I the only person who thinks it's ironic that an article about documentation appears to suffer from a fairly fundamental ambiguity?


No.

"Need not" (or "May...") indicate actions or behaviours which are allowed or optional. This is not the same as "do X", which is a "MUST" condition.

"Must not" (or "Don't...") indicate actions or behaviours which are expressly disallowed.

RFC 2119 referenced in the article should make all this clear. If you're still confused after reading that ... you SHOULD NOT be writing documentation. And possibly not be reading it ;-)


The meaning is as if "not" modified "run", but syntactically "not" is attached to "must", I think. You can probably convince yourself of that by considering a sentence such as: they can yet must not run. So you are absolutely right to be confused by such sentences. There is something weird going on.


Perhaps an analogy to C's multiple declarations:

    int *i, j;
 
i is a pointer to an int, j is an int. The * binds to the variable identifier, not to the type. Famously troublesome for beginners.


"Mustn't" means exactly the same thing as "must not" (but makes it clear that the negative should be scoped to the verb, by using an inflected negative form of the verb itself), so yes, there's something funny going on with "must".


See wiml's sibling comment for a correct reason why someone might be confused by such sentences. "Not" negates, so it is definitely not modifying "must". Clearly the word being negated is "run".


Believe it or not, as a native speaker of English who has studied linguistics, I do know that "not" negates, but thanks anyway.


For native speakers, "must not" is explicitly forbidding something. English is full of inconsistencies. Cue all kinds of poems and comedy sketches. Some things you just have to know.

One of my current favorites is Ismo's use of "ass" as seen on Conan: https://youtu.be/HLyFlWahuSE


That's why we have an RFC; to avoid confusion.

2. MUST NOT This phrase, or the phrase "SHALL NOT", mean that the definition is an absolute prohibition of the specification.


A lot of commentary here on if "MUST NOT" is too ambiguous, or not, or a problem for non-native speakers...

This makes the author's point about citation. In technical documentation any terms used like this must (MUST) be defined, either in line or by citation.

This is also part of the reason they are capitalized, to key you to the fact they are being used in a structurally important way. If you are reading them, you need to familiarize yourself with the context.


My system engineering experience avoided "must" as insufficiently specific.

We used "shall" and avoided "not". Because testing: how do you prove a "not"?

Source: 35 years of SE


I've tended to use SHALL and MAY for the same reasons, but it is important they are defined anyway, somewhere everyone agrees.

Agree "not" is tougher, and best avoided when you can.


Don't write sentences. Write bullet points. Arrange those bullet points into groups. That becomes your table of contents. Continue to add more bullet points over time, and rearrange them as necessary. Eventually, you'll get the picture of what the documentation should look like. Then you can convert the bullet points into prose.

The reason you do it this way is that it's way easier to write bullets than it is to write sentences. It allows you to focus on substance instead of form. It's analogous to typing into a plain text editor and not worrying about formatting. Formatting, and form, do matter, a lot. But you can add them later.


And once you have written the bullet points, stop and ask if the prose really adds anything.

Prose is good for creating emotion in the reader, but not for conveying technical information. Know your goals.


Sorry but this seems to hit one of the age-old arguments of documentation: can you truly rely on everyone using your code downstream to read your docs?

This article strongly assumes "yes", because it must: it relies on non-executable, non-verifiable documentation to implement a security gate. But for a very small performance overhead, you can likely do those checks in code and never worry about your documentation becoming lost, outdated, or simply ignored. More importantly, now that the requirements are embedded in executable code rather than dumb documentation, you can automate testing of the requirement.
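As a rough sketch of what "do those checks in code" could look like: the requirement below (a minimum token length of 32) and the `open_session` function are invented for illustration and do not come from the article.

```python
MIN_TOKEN_LENGTH = 32  # hypothetical requirement, chosen for illustration


def open_session(token):
    """Open a session, enforcing the documented token requirement in code."""
    # The docs might say: "Callers MUST NOT pass a token shorter than
    # 32 characters." Checking here means the rule still holds even if
    # nobody downstream ever reads that sentence.
    if len(token) < MIN_TOKEN_LENGTH:
        raise ValueError(
            f"token must be at least {MIN_TOKEN_LENGTH} characters, "
            f"got {len(token)}"
        )
    return {"token": token, "active": True}
```

A caller who ignores the docs now gets an immediate `ValueError` instead of a silent security hole, and the rule itself becomes something a test suite can exercise.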


Yeah, I prefer unit tests over docs. Rather than writing a document saying "Make sure X does Y", just write a spec checking that X does Y, and if it ever stops doing that there will be a descriptive error in the CI.

Docs and comments go out of date and people don't know they exist/where to find them. Unit tests are always checked and always maintained.
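For example, a spec of that kind might look like the sketch below, using Python's standard `unittest`. The `normalize_username` function is a hypothetical stand-in for "X", and "trims and lowercases" is a made-up "Y".

```python
import unittest


def normalize_username(name):
    """Hypothetical function under test: trims whitespace and lowercases."""
    return name.strip().lower()


class NormalizeUsernameSpec(unittest.TestCase):
    # Stands in for a doc sentence such as
    # "Make sure usernames are trimmed and lowercased."
    def test_trims_and_lowercases(self):
        self.assertEqual(normalize_username("  Alice "), "alice")

    def test_leaves_clean_names_alone(self):
        self.assertEqual(normalize_username("bob"), "bob")
```

Run it with `python -m unittest` in CI; if someone changes the behaviour, the failing test name says which expectation broke.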


Can you provide proof that you can automate all tests of all code?

Can you provide proof that you can turn all human-understandable language into computer-understandable language?

We haven't even started the discussion yet of how much code you spend on writing tests for other people's code, versus how much code you spend writing your own code.



On a related note, does anyone remember a recent (~30d) article describing a company which disabled their Slack (or alternative) message retention in order to encourage more permanent documentation? I'm sure I saw it on HN but cannot for the life of me find it again.



