Hacker News new | past | comments | ask | show | jobs | submit login
TLA+ is hard to learn (2018) (surfingcomplexity.blog)
94 points by azhenley on Aug 21, 2021 | hide | past | favorite | 44 comments



TLA+ can be a powerful tool in certain situations - but it's got some... unique ways of doing things.

For a start, you almost always have two languages in one file - a mixture of 'TLA+' and 'PlusCal' with the latter written in comments and used to generate more TLA+. And although they're separate languages, they're close enough any blog post or search looking for TLA+ has a good chance of actually being about PlusCal and vice-versa.

And every document can be presented two ways - you can write a TLA+ document in ASCII, or take it through a LaTeX-style conversion to make a PDF. So for example, what appears as ⋃ in some documentation might be typed as \cup

That makes a lot of this stuff almost ungoogleable. Go ahead, try and find what ⊑ does or what \bigcirc does. And if you do manage to find a document explaining how to do what you want to do, 50% of the time it tells you ∘ is the operator you want - but not how to type that in.

And it's very easy to make specs that will take forever to validate - and often tough to tell what change made it so. For example, if you simulate four independent processes that count to three, it'll evaluate all the ways those processes could be interleaved in a fraction of a second. But if you increase that three to ten? The model checker will run and run and run. Do you need to leave it to run over night? Wait the age of the universe? As far as I can tell, the only way to know is to wait, and the only way to figure out the cause of the slowness is to undo the changes to your file until it starts checking in a reasonable time.

So while it's a powerful tool, and capable of doing some things that really can't be done any other way, convincing other people of the tool's merits can be tough.


All of TLA+ (except the proof language), how all the symbols are typed, and all of the standard modules can be found on this cheatsheet: https://lamport.azurewebsites.net/tla/summary-standalone.pdf

While some people prefer writing PlusCal, I don't think it's the majority. Most people I know just write TLA+. PlusCal is more suitable for "code-level" problems, like specifying low-level operations on concurrent data structures.

As to your point about TLC's sometimes surprising behaviour, I agree, but "mysteriousness" is common in software tools. These are things that are learned through experience. One way to estimate how things are going, even with no experience, is to look at the graphs of newly discovered states that the Toolbox draws. If they're trending down, you're in the process of converging; if they're still going up -- you're not.


> And it's very easy to make specs that will take forever to validate - and often tough to tell what change made it so. For example, if you simulate four independent processes that count to three, it'll evaluate all the ways those processes could be interleaved in a fraction of a second. But if you increase that three to ten? The model checker will run and run and run. Do you need to leave it to run over night? Wait the age of the universe? As far as I can tell, the only way to know is to wait, and the only way to figure out the cause of the slowness is to undo the changes to your file until it starts checking in a reasonable time.

Given N deterministic sequential processes with M steps each, the total number of possible behaviors is `(NxM)!/M!^N`. So for 4 processes that count to 10 each, that's `40!/10!^4`, or 4.7 * 10^21 behaviors. If they count to 3 each, that's just 370,000 behaviors.


The funny symbol is called sqsubseteq and you can find that out by drawing it here: https://detexify.kirelabs.org/classify.html

But I agree that there needs to be a way to type latex Symbols quicker. The Lean Community did this quite well: hovering over a symbol in the docs gives you the correct ascii name.


"Spectacle" is a Haskell DSL for writing TLA+ specifications: https://awakesecurity.com/blog/spectacle-a-language-for-writ...


It's inspired by TLA+, but it uses a different semantics, I think.


For me the big issue is the weird syntax. I know this may sound superficial but it is really off-putting.

My guess is that if syntax is improved, a lot of people will give it a shot.


TLA+'s syntax is pretty close to standard mathematical notation, and is very similar to other languages for writing mathematics, so it is actually quite mundane. Anything that's more programming-like would have made it weird compared to most other mathematics languages.

But it is certainly true that languages for writing mathematics often have syntax that's different from languages for writing computer programs, but the question is, would writing mathematics in programming syntax make learning it significantly easier for programmers?

There are certainly many who make the argument that it would -- including some experienced TLA+ users -- but I think there are two points against it:

1. While it might make understanding easier on the very first day, mathematical formulas simply do not mean the same thing, or "behave" in the same way, as programs do -- neither operators nor functions behave the same way as subroutines do, not even the same way "functions" in pure-functional programming languages like Haskell do [1] -- which, on the third week and on actually makes matters more confusing, and,

2. Just as when you want to find some answer about some programming topic, it helps that the answer is in some syntax you know, almost all materials that are relevant when learning and using TLA+ -- say, you want to look up De Morgan's laws on Wikipedia -- are written in the same mathematical notation.

At the end of the day, what you write in TLA+ is mathematical formulas, and the standard mathematical notation is the most familiar and popular syntax for that task.

[1]: https://old.reddit.com/r/tlaplus/comments/enencr/yet_another...


I see the point you are making.

The issue with mathematical notation is more practical one. Most programmers do not deal with maths (and notation) on a daily basis. TLA+ is a tool. If the tool is encouraged to be used in the field by programmers to model their systems, then it should "adjust" to their needs.

This means:

1. Better Debug-ability

2. Easier syntax

3. Better error messages

(I do not know how to do these and do not want to sound it as trivial)

This is where I want to make a subtle point: It does not have to be on par with mainstream language. I am not saying it should be as easy as Python/Go/Java. Solving for some of the low-hanging fruits would have disproportionate improvement in usability.

(A lot of ppl find parsing C++ templates & Rust generics jarring, let alone mathematical notations)

Another example of tool: https://www.wolframalpha.com/

It has both natural language input and mathematical notation as well (you see these options right below the search bar)

Click on any of the examples: https://www.wolframalpha.com/examples/ They are self explanatory for most I assume.

Even a simple "ForAll" instead of the symbol "∀" goes a long way (in my book) https://reference.wolfram.com/language/ref/ForAll.html


As a Software Person who is self-studying Formal Methods and TLA+ i can categorically state that you are wrong!

The Mathematical Notation is THE universal language of Logic, Set Theory, Functions etc. which are the mathematical underpinnings of Computer Science and Programming. This is what all Engineers need to be familiar with. Programming languages are incidental in this case and are merely used for syntactic expressions of Mathematical Concepts. This is as it should be and the reason Leslie Lamport (the inventor of TLA+) settled on this specific notation. Please see his interviews/videos on Youtube for more details.

In my own case i started by grasping/studying the basics from the following books;

* Software Engineering Mathematics by Woodcock and Loomes.

* Understanding Formal Methods by Monin.

* Introductory Logic and Sets for Computer Scientists by Nimal Nissanke.

* Mathematical Notation: A Guide for Engineers and Scientists by Scheinerman.

This allowed me to start studying Specifying Systems by Leslie Lamport which is the main book for TLA+.

PS: User "pron" to whom you replied to has a nice 4-part series on TLA+ which i have linked to in another post in this thread. He really knows this subject :-)


I agree with better debugability (there's now a TLA+ debugger) and better error messages, but I think that programming syntax would only make things easier on the first few days, and actually make things harder later: for one, you'd still need to learn mathematical notation because all the relevant material anywhere on logic uses it, and because the meaning of the syntax would still be very different from programming.

> I am not saying it should be as easy as Python/Go/Java.

I think it's already significantly easier than all of them already for someone who has no knowledge in programming and/or modelling with mathematics. But programmers need to know that they're not learning another programming language, but a completely new skill.

> They are self explanatory for most I assume.

Yes, but only because I already know what the symbols mean. But if I wanted to learn about any of those subjects, the materials would not be using Wolfram syntax but mathematical notation, so I'd have to learn it, anyway.

Why did Wolfram choose that syntax, while TLA+ (or Coq, or Agda, or Lean) chose a more mathematical one? Because they're used for different things. Wolfram is for quick calculations you feed into the computer. TLA+, OTOH, is supposed to take maybe hours to think about each line, and then, what you'll be doing with it most of the time is not feeding it to the computer but reading it. After a while, mathematical syntax is less strenuous to read than Wolfram syntax, especially when you might have hundreds of lines of maths. The think/read/write ratio of TLA+ is very different from any programming language -- and even Wolfram -- so it doesn't make sense to optimise for the same things.

> Solving for some of the low-hanging fruits would have disproportionate improvement in usability.

Yes, but I don't think syntax is one of those things. TLA+'s syntax is not only rather standard among similar languages and close to the syntax used in study materials, but actually helps.

One of the syntax-related things that I think might help is for the editor to replace the input ASCII with real TLA+ syntax in Unicode as you type.


Having studied TLA+ a little bit, I'm curious - is there something flawed in the parent comment's argument?


> is there something flawed in the parent comment's argument?

As someone who really dislikes mathematical notation (pre-TLA+ proficiency), I think it's flawed.

You need to learn to think in math. Literally use math to solve CS problems.

This is fundamentally a different kind of thinking than writing code.

Once you learn to Think In Math™, the TLA+ notation is fine, even good—it maps 1:1 to your thoughts. Programming language notation would only get in the way.


Well put! I was in the same boat i.e. wanting everything to be expressible in C/C++ type of syntax before i realized that it was quite the wrong way of looking at things and in particular; Mathematics. Mathematical Notation is fundamental while Programming Language Notation is incidental when looking at Mathematical subjects. It was merely an unwillingness to put forth effort to learn a new "shorthand conceptual language" that was holding me back. Once i learnt to read the notation and familiarized myself with it, things became much "easier and obvious".


I would say that TLA+'s syntax is quite standard, or even conservative, similar to that of most maths language, and so quite the opposite from weird.

The question of whether mathematics would become significantly more approachable if written not how it normally has been for the past hundred years but rather in programming language syntax, even though the symbols would not have the same meaning they do in programming, is not one I have the answer to. But even if TLA+ itself used programming syntax, because it would still be just maths and logic, once you'd want to learn something about a particular topic, the material would be in standard notation, so you'd still have to learn it.

And even if that weren't the case, and even if the mathematical notation were a serious obstacle, I think that obstacle is only significant in the first couple of days; after that, there are bigger challenges -- like getting used to describing things with maths -- that make the early syntax problem pretty negligible in comparison. There are other ways to specify discrete systems, but TLA+ chooses to do it with (pretty simple) mathematics. You have to know what you're getting into, namely learning how to use maths to specify. It isn't very hard compared to, say, learning how to program, but it is a challenge, albeit a fun and interesting one, that you have to want to tackle. But the payoff is quite big, in my opinion, even beyond being able to write specifications in TLA+ (which, in itself, has a big payoff in software quality compared to the effort required).

Those who've just started climbing a hill might complain about the first ten steps, because those are the ones they've taken. But the best way to get more people to the top isn't to flatten the first ten steps, but to show them why climbing the hill is worthwhile.


To me, a big turn off was the Eclipse based interface. I would have been happy with minimal support (syntax highlighting, automatic formatting) for common editors and some command line tools. Maybe these things exist now, but last time I tried using the tool, this wasn't available out of the box.

And yes, the syntax is weird. I understand it tries to mimic set theory notations (which I'm fine with), but transposing this in ascii didn't feel right.


I use the Visual Studio Code TLA+ plugin, it's significantly nicer to work with.


I think I've seen on HN stuff built on TLA+ but with a typescript-like syntax. A lot of people argue though that by making it more programming-language-like, vs math-like, that you're trading better tools for the job in exchange for what's familiar.

I don't use TLA+ yet, I'd like to learn, but I've followed a lot of discussions on HN regarding this.


Did you try pluscal? My understanding is TLA+ is intended to be low level ( closer to theorem prover like z3) instead of a high level spec language.


The tool is also pretty abysmal at telling you why it can’t parse the thing you wrote. For me that was the reason #1 it’s hard to learn.

The other reason I don’t use it is that it’s very hard to inspect what you’re running. You can’t write something like “a trace of X state transition must be possible within N cycles with the right inputs”. It makes it really hard to convince yourself (and your peers) you haven’t assumed all of the state away.


TLA+ as a formal language for specifying systems is quite good. But the quasi-official toolchain for working with it [1] including TLC and the sanitizer are not well maintained. They were pretty resistant to outside contributions when I tried. I have hope that a better toolchain for working with the TLA+ language can come along. The underpinning language is well designed. Lamport is no slouch.

[1] https://github.com/tlaplus/tlaplus


TLA+ really is quite nice. I write most of my TLA+ specifications longhand and only bother with the toolbox when I think that refinement might be useful. Even then, it's mostly for SANY rather than TLC.

After noodling with with the spec for half an hour or so, I’ll usually have enough insight/confidence to start coding/debugging.


> You can’t write something like “a trace of X state transition must be possible within N cycles with the right inputs”.

You can, although possibility properties aren't for beginners: https://youtu.be/TP3SY0EUV2A?t=818

But there often are easier ways to do sanity checks to "convince yourself you haven’t assumed all of the state away," usually either by asserting something believed to be false or by intentionally introducing an error in the spec, and letting TLC find a counterexample.


I agree it’s workable, but it’s a bit of a hack. Commercial formal verification software for RTL has this built-in, and once you have 10+ people updating a spec, you need an automated way of checking for accidental constraints.


I don't think possibility properties are directly expressible in any linear-time logic, just branching-time logics, and those have their own issues, even though I think they're still sometimes used.

A mistake that could be automatically checked is specifying a system that is equivalent to FALSE (and so implies anything). I hope TLC adds a feature that allows you to check if your system implies FALSE. In the meantime, checking the invariant ¬Init has the same effect.


The blog post is too vague and not really helpful.

This detailed 4-part series by user "pron" is far more comprehensive to understand and start with TLA+: https://pron.github.io/posts/tlaplus_part1


Z3-solver is a popular backend for specifying properties about programs. Dafny, Viper, Boogie are all systems built on top of it.

The issue is that much real world software is not written in these languages (or Haskell)

Typescript and python are a lot more interesting.

https://adsharma.github.io/pysmt/

The proposal is to have a programming language and a specification language both with a pythonic syntax living in one file.

Coupled with a transpiler, it could make a capable system vs the C++/TLA+ based alternative.


Are there good alternatives for solving the same / similar kinds of problems?

I am a big fan of property based testing that can fuzz through very similar types of problems but it's not quite the same.

Or is tla+ just the thing to use and it's worth ploughing through?


It's worth plowing through.

I hated the syntax at first, and frankly, it took two weeks and about 500 pages of reading before it clicked (as to how to solve practical problems with it).

It's been a pleasure to use ever since. What is does you can't get from any other tool or methodology (with the same amount of effort).

> Are there good alternatives for solving the same / similar kinds of problems?

Not that I've found. It's really worth it to just learn TLA+.


Yup, I agree strongly. I watched the Lamport videos when on paternity leave, and really enjoyed it. The main benefit for me was that writing a spec in TLA+ allows me to "get my mind right" before writing programs. I was in love with the notation from the beginning (but I was a math major in school so I'm biased).


https://en.wikipedia.org/wiki/Specification_language

There are quite a few alternatives. Alloy is pretty approachable and aims for the same or a similar space. I liked Event-B (based on B-Method), its tooling was nice. Z Notation (pronounced Zed) was my first examination of formal specifications, it lacked a lot of tooling for automation but I found it helped to clarify my reasoning about systems quite a bit. I wish I'd used it more, it's now a fuzzy memory for me.

That said, I like TLA+ specifically for what it's aimed at, modeling concurrent and distributed systems. It's very effective for this if you can take the time to understand it.


Alloy is great for exploring possible shapes of data given specific operations.

Most people who use TLA+ also seem to fire up Alloy once and awhile…


Coyote (concurrency exploration tool for .NET programs) can be used to do something "similar". My team often writes tests which set up focused concurrency between different APIs, the tests use Coyote to explore different ways that concurrency can unfold and write a strong set of assertions and invariants that must be true at the end. It's not TLA+ but it's still quite effective, very teachable as developers are already familiar with writing C# tests and helps catch safety and liveness bugs in our actual code base (as opposed to a model of it). It's not the same, by design, and does a decent job at finding concurrency and distributed system bugs.


TLA = Three-Letter Acronym?


What's the top result if you Google "TLA+"?


Or click the article.


Did.


Temporal Logic of Actions


Tla also introduces a new type of bug: differences between your tla model and your program. And I found that in order to get tla to reason about large programs, you had to express the model in ways that look very different from the program.

This issue alone is enough to dissuade me from using tla.


But the whole purpose is to check your model before you write code, no? I get that differences are hard but it's better than nothing


I never recommend TLA for reasoning about programs, but about models. It's great for describing a system you're still designing.


Elaborate please; Without reasoning about programs how do you come up with the predicates which constitute a Model? Are you making a distinction between declarative logic used in specifications vs. say predicate transformers used in proving program correctness?


That's like saying that tests introduce a new kind of bug -- behaviours that aren't explored at all -- and that that serious issue (and it is, indeed, extremely serious) should dissuade you from writing and running tests.

All software quality tools are imperfect, and even in the very few special cases where software can be realistically verified end-to-end, the behaviour of the actual system -- which also depends on hardware that can never be fully assured -- is not guaranteed.

The only question that matters when evaluating a software quality tool is: does it improve the software's quality for a cost that is lower than achieving the same improvement by other means? In other words, does it save us money and/or pain? For TLA+, the answer in many cases is absolutely yes. I.e. there is no better/cheaper/easier way to gain the same benefit.


The alternative is to not check the model at all. Doesn’t really seem better.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: