Hacker News
Cargo-semver-checks: Scan your Rust crate for semver violations (github.com/obi1kenobi)
147 points by todsacerdoti on July 9, 2023 | 114 comments



Author here! Excited to see this on the front page, ping me with any questions!

I recently gave a talk about the tool, addressing:

- why semver in Rust is important and harder than it looks

- how the tool works

- why it took until 2022-2023 to solve this problem

There's no recording but here are the slides: https://docs.google.com/presentation/d/1gVYM9-YLrBmRYSuHqW8r...

A lot of semver is just queries over APIs. For example: "find public functions that existed in the last version but don't exist in the new one." Any such function we find is a breaking change.

Cargo-semver-checks is built on Trustfall, an engine designed to query any data source, and runs its queries over rustdoc JSON (which has an unstable format that changes ~once a month).
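For intuition, the core of that function-removal lint is just a set difference over the public API surface. Here's a toy sketch in plain Rust (the function names are hypothetical; the real tool derives the API sets from rustdoc JSON via Trustfall queries):

```rust
use std::collections::BTreeSet;

/// Toy model of one semver lint: any public function present in the old
/// version's API but missing from the new one is a breaking change.
fn removed_public_fns(old_api: &BTreeSet<String>, new_api: &BTreeSet<String>) -> Vec<String> {
    old_api.difference(new_api).cloned().collect()
}

fn main() {
    let old_api: BTreeSet<String> =
        ["parse", "serialize", "to_string"].iter().map(|s| s.to_string()).collect();
    let new_api: BTreeSet<String> =
        ["parse", "serialize"].iter().map(|s| s.to_string()).collect();

    // `to_string` disappeared, so releasing this as a minor/patch
    // version would violate semver.
    let breaking = removed_public_fns(&old_api, &new_api);
    assert_eq!(breaking, vec!["to_string"]);
    println!("breaking removals: {breaking:?}");
}
```

Each lint generalizes this pattern to other parts of the API: structs, enum variants, trait methods, and so on.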

Here's a query playground where you can try running queries over rustdoc:

https://play.predr.ag/rustdoc

More info about all this on my blog:

https://predr.ag/blog


You did a much more wonderful job than just enforcing semantic versioning---you gave us a general database query tool that may drastically change how rustdoc can be used! (Sorry for my general displeasure with semver below; it has nothing to do with tools like this, which do the actual job semver is supposed to do.)


No worries!

Trustfall is actually even more general than semver-checking: one can use it to query any data source -- even HN itself:

https://play.predr.ag/hackernews

Here's a link to that project if you'd like to check it out:

https://github.com/obi1kenobi/trustfall

I gave a 10-minute conference talk on it last year titled "How to query (almost) everything":

https://www.hytradboi.com/2022/how-to-query-almost-everythin...


Elm does this. While developing Elm libraries it caught violations for me a few times even though I was trying hard to stick to semver.

I also found that the average version number of an Elm package was much higher than I expected given the ecosystem's relative infancy when I was working with it (~5 years ago), compared to the JavaScript ecosystem, where popular packages theoretically follow semver.

I suspect that most packages regularly violate semver in small ways, and I’d recommend giving automated checks for violations a go if only to have the experience and be more aware of the issue.


Tool author here. Your hunch is absolutely right. An analysis I and a few contributors did for Rust showed that more than 1 in 6 of the top 1000 most-downloaded Rust crates have broken semver at least once in a way our tool can detect, and where we can produce a program that suffers a compile error due to the semver break.

This is obviously a lower bound, and yet it's quite a high number already.

I plan to write more about this on my blog in the coming weeks:

https://predr.ag/blog

In the meantime I put some more info (including a link to slides from a recent talk I gave on this) in another comment:

https://news.ycombinator.com/item?id=36653954


Nice one, great analysis.


Thank you!


I'm waiting for the day when we will finally admit to ourselves that the Semantic Versioning experiment has failed.

The purpose of Semver is to allow software and people to determine, from the version number alone, whether and what has changed. But even though many projects claim adherence to the Semver standard nowadays, the ecosystem as a whole isn't, and never has been, anywhere close to the point where a reasonable engineer would actually rely on version numbers to decide whether to upgrade.

Imagine you see that one of your library dependencies, which professes adherence to Semver, has released version 1.2.3, up from version 1.2.2 which your project currently uses. Would you feel comfortable doing the update, without checking whether the project still compiles and passes its test suite afterwards?

If you answered "no", then you are basically admitting that Semver is worthless in practice. If you are going to test everything anyway, version numbers might as well be a simple incrementing integer, or a date, or a random string. Upgrading will still consist of trying the new version, and, on failure, rolling back – which is exactly what you are probably already doing anyway.

Semver doesn't work because you can't actually trust that all your dependencies have verified their claimed Semver compliance to the extent that would be necessary for your project to rely on it. You have to do that verification yourself, which makes the Semver version number at best an occasionally useful hint. And that's not even talking about all the other problems, such as different projects having different interpretations of what constitutes a "breaking" change, and almost none of the older, established libraries even claiming to follow Semver in the first place.


> Would you feel comfortable doing the update, without checking whether the project still compiles and passes its test suite afterwards?

I don't trust saving a file that hasn't changed without checking if things compile/pass tests.

> If you answered "no", then you are basically admitting that Semver is worthless in practice

I hugely disagree. Version numbers give me a good idea about what level of changes to expect. Upgrading from 1 to 2 implies a larger review is needed, I'd not just check compile/tests. Upgrading from 1.4.656 to 1.4.657 and I don't expect to have to check the docs for major changes to how things work.

Dates, pure increments etc don't have this. Two releases a day apart can be wildly different - a patch release fixing a logging format and a fundamental API change.

It's a method of communication. Not being perfect doesn't make it useless.


> Upgrading from 1 to 2 implies a larger review is needed, I'd not just check compile/tests. Upgrading from 1.4.656 to 1.4.657 and I don't expect to have to check the docs for major changes to how things work.

What would you do for upgrading from 1.4 to 1.5?

> Dates, pure increments etc don't have this.

Right, but most versioning schemes are neither pure increments nor pure dates, exactly for that reason. You don't plan to release twice a day when your version scheme is year-based; the second number is frequently used for when it's inevitable, making it a hybrid scheme. Still not compatible with semantic versioning.

Semantic versioning also effectively shadowed a previously widespread decimal-based versioning (e.g. 1.00 < 1.09 < 1.5 < 1.99 < 2.0). There are pros and cons for both formats, but decimal-based versioning makes authors pick a suitable increment for communicating changes, and prevents them from perpetually delaying the major version (which can be a pro or a con; I personally think it's a pro).

If semantic versioning simply wanted to codify and mechanize the best practice, it should have instead adopted Debian's scheme [1], which tried hard to unify existing conventions.

[1] https://www.debian.org/doc/debian-policy/ch-controlfields.ht...
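To make the ordering difference concrete, here's a small sketch contrasting decimal-style ordering with component-wise (semver-style) ordering; the version strings are hypothetical examples:

```rust
use std::cmp::Ordering;

// Decimal-style versioning reads the whole version as one number,
// so 1.00 < 1.09 < 1.5 < 1.99 < 2.0.
fn decimal_cmp(a: &str, b: &str) -> Ordering {
    let x: f64 = a.parse().unwrap();
    let y: f64 = b.parse().unwrap();
    x.partial_cmp(&y).unwrap()
}

// Component-wise (semver-style) ordering compares each dotted part
// as an integer instead.
fn component_cmp(a: &str, b: &str) -> Ordering {
    let parts = |v: &str| -> Vec<u64> {
        v.split('.').map(|p| p.parse().unwrap()).collect()
    };
    parts(a).cmp(&parts(b))
}

fn main() {
    // Under decimal ordering, 1.09 comes before 1.5...
    assert_eq!(decimal_cmp("1.09", "1.5"), Ordering::Less);
    // ...but component-wise, minor version 9 is newer than minor version 5.
    assert_eq!(component_cmp("1.09", "1.5"), Ordering::Greater);
}
```

The two schemes agree on most comparisons but disagree exactly where trailing zeros matter, which is why mixing them in one ecosystem causes confusion.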


> What would you do for upgrading from 1.4 to 1.5?

Something in between? More checking and a glance over the changelog at least, to check for added things I should use. Depends on how important the component is and the risk to the business.

> Right, but most versioning schemes are not pure increments and nor pure dates exactly for that reason. You don't plan to release twice a day when your version scheme is year-based; the second number is frequently used for when it's inevitable, making it a hybrid scheme. Still not compatible with the semantic versioning.

Sure, those were the suggested options I was countering. Dates of any form don't tell you much about what's practically changed though.

> and prevents them from perpetually delaying the major version (which can be a pro or a con; I personally think it's a pro).

"Version 2 because it's been a while even though this is a logging change" sounds awful to me, as does "welp we've used our 99 releases so now it's a new big number" or "oh shit we used 1.1 so now we only have 10 releases possible". You don't need to make major revisions.

> There are pros and cons for both formats, but decimal-based versioning makes authors pick a suitable increment for communicating changes,

Isn't that largely what semver does? There are three major buckets.

Still. My point is that it's not worthless, not that it is objectively better in what is a very subjective area (communication).


> "Version 2 because it's been a while even though this is a logging change" sounds awful to me, as does "welp we've used our 99 releases so now it's a new big number" or "oh shit we used 1.1 so now we only have 10 releases possible".

That's not how people used to do decimal-based versioning. They instead tended to be more conscious about the relative amount of changes and how to quantify them. Some changes are worth 0.01. Others are worth 0.1 or 0.05. If many changes are expected before the actual major version, either increments are lowered (say, 0.001 instead of 0.01) or many changes are batched into a single release of a reasonable size.

> There are three major buckets.

That's too many, I believe---see my other comment. There is a good reason to have at least two components, and authors and/or users may want much finer increments (4 or 5 would be the practical maximum though), but three components as defined by semver don't seem justifiable. I'm not against other existing schemes using three numbers.


> Some changes are worth 0.01. Others are worth 0.1 or 0.05.

You can still do that with semver if you want to try and encode more information and hope that people read your docs to understand what specifically you mean by the size of the changes.

> If many changes are expected before the actual major version, either increments are lowered (say, 0.001 instead of 0.01)

Planning an entire major version and how many releases you have coming up sounds rough. Why not give yourself more room? Make it two numbers not a decimal. You could then add .0 to everything.

> or many changes are batched into a single release of a reasonable size.

I'm not a fan of delaying releasing fixes just because changing a number is hard. But you can still do that! Just increment minor or major and set patch to 0. This also buys you the ability to patch an older minor release!

I kind of get the argument against minor and patch, but think they are usefully different. I can't imagine 5 different parts.

Having a reasonable standard means not having to know the preferences of 100 maintainers.

Is it perfect for everyone? No. Is it fine? Sure. My original point is that not being perfect doesn't mean it's useless.


> That's not how people used to do decimal-based versioning.

“Completely arbitrary, utterly inconsistent (across vendors, across products from the same vendor, and even within the same product over time) and often marketing driven” is how people used to do decimal versioning, and any claim to the contrary is romanticizing a very messy past.

The grandparent's description is a perfectly reasonable example of what would actually happen.


You can't expect anything from version numbers used for marketing purposes anyway; in fact, it was also common for such products to have internal version numbers that were much more consistent (e.g. Windows). You can do the exact same thing with semantic versioning---you don't have to increment by exactly one. That practice is irrelevant here.


"What would you do for upgrading from 1.4 to 1.5?"

I'd read the release notes, and spend a little bit of time thinking about if they impact my project.

For a 1.4.1 to 1.4.2 bump, if my (comprehensive) tests pass, I'll land the upgrade.


Yes, the minor version clearly doesn't carry much information. Then why should we keep the three-part version number at all?


It does have information about what's changed. It shouldn't be a patch release and it's not supposed to be a breaking change.

Outside of new things it informs you about the work expected with downgrades. Going down in minor versions and you should check for breaking changes or removed functionality.


> It shouldn't be a patch release and it's not supposed to be a breaking change.

Such a change is not distinguishable from a patch release because, according to semantic versioning, both should be trivial to upgrade. In other words, if semantic versioning worked as intended, upgrading from 1.2.3 to 1.2.99 and to 1.42.0 should feel the same to users. If that's not the case, there must have been some breaking change. And clearly this is not how users feel about minor version increments.

> Outside of new things it informs you about the work expected with downgrades. Going down in minor versions and you should check for breaking changes or removed functionality.

If you need downgrades, the distinction among major, minor, and patch versions does not matter, because either:

1. The library author has broken the premise of semantic versioning, or

2. You have entirely separate requirements that are not part of the public API (whatever it is) and thus the library author didn't honor them.


> Such a change is not distinguishable from a patch release because, according to semantic versioning, both should be trivial to upgrade

They're absolutely distinguishable, they convey a different scale of change. Just because you should be able to blindly upgrade doesn't stop that. If nothing else, it indicates there may be deprecations. You don't want to wait until things break before dealing with those.

> If you need downgrades the distinction among major, minor and patch versions does not matter because either

You've never had to downgrade a package for other reasons, like matching a version used elsewhere? Compared versions of a tool with another developer while debugging? Downgraded due to an introduced bug? None of those break semver.

Again this is about communicating.


> They're absolutely distinguishable, they convey a different scale of change.

Minor vs. patch releases are absolutely distinguishable because the authors said so, but the changes themselves are not---unless you actually try to use new features.

> If nothing else, it indicates there may be deprecations. You don't want to wait until things break before dealing with those.

Deprecations are not breaking changes, so it shouldn't matter anyway. The semver rule of requiring a minor version bump for deprecations is pretty silly. I do see the intention---give enough time before the actual removal---but the rule is too strong and smells of retrofitting.

Semver's unclear distinction between minor vs. patch versions is also visible from the coexistence of "caret" requirements and "tilde" requirements in many semver-enabled systems. The former can upgrade to any minor versions, while the latter won't if the required version is specified in the full three-part form. Which one should I use? Depends on packages? Is that any different from having a different versioning scheme per package?

> You've never had to downgrade a package for other reasons, like matching a version used elsewhere? Compared versions of a tool with another developer while debugging? Downgraded due to an introduced bug? None of those break semver.

I do a lot of downgrading, but I don't trust semver, and at least test things again (and look at the actual changes if I can). Bugs do not break semver, but at that point the very reason to trust semver also disappears.

Also, consider the case where x.y.0 has introduced a new feature I need and also a new shiny bug. Obviously I can no longer downgrade, and semver offers no other solution. The common way to deal with this is to fork and vendor local changes (I encountered this situation as recently as 2 months ago). I should note that semver in principle should have prevented this for other x.y.z versions (z > 0), but bugs introduced by x.y.z releases are much rarer than in x.y.0, because x.y.z releases are supposed to be bugfix-only anyway.
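For reference, here's a hand-rolled sketch of the caret/tilde semantics mentioned above, for fully specified x.y.z requirements with major > 0. This is not Cargo's actual resolver (which also has special rules for 0.y.z versions); it just illustrates the two range shapes:

```rust
// Parse a three-part "x.y.z" version string into a tuple.
fn parse(v: &str) -> (u64, u64, u64) {
    let mut it = v.split('.').map(|p| p.parse().unwrap());
    (it.next().unwrap(), it.next().unwrap(), it.next().unwrap())
}

// `~1.2.3` means >=1.2.3, <1.3.0: only patch upgrades allowed.
fn tilde_matches(req: &str, candidate: &str) -> bool {
    let (rm, rn, rp) = parse(req);
    let (cm, cn, cp) = parse(candidate);
    cm == rm && cn == rn && cp >= rp
}

// `^1.2.3` means >=1.2.3, <2.0.0: minor and patch upgrades allowed.
fn caret_matches(req: &str, candidate: &str) -> bool {
    let (rm, rn, rp) = parse(req);
    let (cm, cn, cp) = parse(candidate);
    cm == rm && (cn, cp) >= (rn, rp)
}

fn main() {
    assert!(tilde_matches("1.2.3", "1.2.9")); // patch bump: both accept
    assert!(!tilde_matches("1.2.3", "1.3.0")); // minor bump: tilde rejects...
    assert!(caret_matches("1.2.3", "1.3.0")); // ...but caret accepts
    assert!(!caret_matches("1.2.3", "2.0.0")); // major bump: both reject
}
```

The difference between the two operators is exactly the minor-vs-patch boundary under discussion: tilde treats a minor bump as untrusted, caret does not.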


Major = Changes an existing API.

Minor = Adds new API calls.

Patch = Changes the internal behavior, but not the API.

SemVer's distinctions are essentially useless for applications (and particularly so for GUI applications), but do make some sense for libraries.

SemVer is one-way. Downgrading usually still needs the same care as a major version change.

SemVer is mostly intended to limit what the version number conveys. It doesn't add extra info (like, is this a bugfix?) but instead takes that away and forces it into the release notes.

Dynamically-typed languages make this harder, because internal behavior and the resulting type signature are less separate than with a statically typed language where an accidental type signature change (e.g. returning the wrong type) will break at build time.
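Under the buckets above, choosing the bump reduces to a simple decision rule. This sketch assumes you already know whether a release breaks or adds API, which is exactly the part that tools like cargo-semver-checks help determine:

```rust
// Map the "what changed" buckets above to the required semver bump.
// The two booleans are assumed inputs; deriving them is the hard part.
fn required_bump(breaks_existing_api: bool, adds_new_api: bool) -> &'static str {
    if breaks_existing_api {
        "major" // existing API changed or removed
    } else if adds_new_api {
        "minor" // new API added, old API untouched
    } else {
        "patch" // internal behavior only
    }
}

fn main() {
    assert_eq!(required_bump(true, true), "major");
    assert_eq!(required_bump(false, true), "minor");
    assert_eq!(required_bump(false, false), "patch");
}
```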


> Minor vs. patch releases are absolutely distinguishable because the authors said so, but the changes themselves are not---unless you actually try to use new features.

I'm not understanding, sorry. There are changes that fit in minor that don't in patch - they're not equivalent. I think we have crossed wires somewhere on this point.

> Semver's unclear distinction between minor vs. patch versions is also visible from the coexistence of "caret" requirements and "tilde" requirements in many semver-enabled systems. The former can upgrade to any minor versions, while the latter won't if the required version is specified in the full three-part form. Which one should I use? Depends on packages? Is that any different from having a different versioning scheme per package?

Tilde is "this patch and above, but don't go to a new minor version". Caret is the same but allows minor upgrades.

Which you use depends on your project and testing, and yeah, you can base it on what the maintainer is like. Some projects break all the time and you don't want even patch releases. I'm not really sure why they added those, tbh; I'd prefer just ranges. They're just shorthand for ranges anyway.

Again, it's about communication. Some people being bad at communicating or maintaining software doesn't make communication pointless.

It brings the problem down to one versioning scheme, which you can rely on to different degrees. The overrides are then significantly more contained than having scores of various different things with formats that require different parsing despite looking identical. You instead have one thing and if you're unlucky you have a misbehaving package in your dependencies.

> Deprecations are not breaking changes, so it shouldn't matter anyway.

Yes, which is why it isn't a major change. This is all about the maintainer communicating with you that this is more than a patch.

> Also, consider the case where x.y.0 has introduced a new feature I need and also a new shiny bug. Obviously I can no longer downgrade, and semver offers no other solution.

Semver doesn't fix bugs, no, but I don't understand the problem you're describing. That would be followed by a patch release with the same minor version number, right?

How does semver make any claims to prevent bugs being introduced?

If it's a local change you make, would you have a distinct package? If you want to keep the same package name, just make a patch number yourself and mark it pre-release. That's literally what it is, right? A patch that's not yet released.


Just because one person doesn't find a piece of information useful doesn't mean that everyone else doesn't. Semver is a useful foundation to build versioning semantics upon, and nothing more. I find all three components to be useful for communicating the scope of changes in any given version, and tools like the OP that enforce these assumptions make semver even more useful. If you think that patch versions are useless, then you're free to omit them in your own libraries.


I never said patch versions are useless. I pointed out that minor and patch versions in semantic versioning are not very different from each other, particularly compared with major versions. I do use three-part versions, as people had done long before semver, but people have also used two-part versions and decimal-based versions, not to mention multiple flavors of four-part versions. And tools like cargo-semver-checks are actually useful even in the absence of any versioning, because they directly tell which versions are compatible with each other. They make semver (or any other versioning scheme) more useful, but semver does not make them useful.

If I seem to be attacking a weird strawman, you are not very wrong. Semver is useless but doesn't hurt much either. Semver can somehow encode one-part and two-part versions as well by appending excess zeros. I'm much more concerned that embracing semver (or really any other alternative versioning scheme) makes us overlook the actual problem---versions may detect but can't fix breaking changes. The vast majority of breaking changes are mechanically fixable, and we all know how beneficial that is. We should embrace a method to reduce perceived breaking changes, not a method to point out possible breaking changes.


> The vast majority of breaking changes are mechanically fixable, and we all know how beneficial that is. We should embrace a method to reduce perceived breaking changes, not a method to point out possible breaking changes.

These seem like entirely distinct goals that should be pursued separately. Yes, from the perspective of a library consumer, any breaking change is a problem. But from the perspective of a library author, there's a crucial distinction between breaking changes that I intended to make and breaking changes that I didn't intend to make. The tool in the OP is for use by library authors, with the goal of eliminating unintentional breaking changes, and thereby reducing the number of breaking changes that my users are forced to endure. And what you're asking for is a tool to be used by library consumers, in order to mitigate intentional breaking changes. Both these tools can exist, and both benefit (at least marginally) from semver. Semver isn't a panacea, it's just the simplest possible foundation to communicate ideas about versioning. It's entirely welcome to invent concepts beyond semver to perform even more advanced communication.


1 to 2 doesn't imply anything strictly from the semver spec, so it gives you no indication in theory. In practice, major versions signal major changes, but only because many people treat these versions as humanver.


> Version numbers give me a good idea about what level of changes to expect

Google Chrome went from version 1.0 to 114.0 in less than half the time in which FreeCAD went from 0.0 to 0.20. Blender once had a huge update that only changed the minor version number...and the entire user interface, among other things. The Linux kernel had major version bumps simply because Linus Torvalds felt like it. No, version numbers give absolutely no indication about what actually happened.

> Upgrading from 1.4.656 to 1.4.657 and I don't expect to have to check the docs for major changes to how things work

I had node project setups crash and burn because a library had version X.0.7 instead of X.0.3 and some other library couldn't handle this supposed non-breaking change. Similarly, the last time I tried ML stuff with Python, I had to downgrade from 3.11 to 3.10 to make it work. I can't remember any case of breakage where the difference was actually visible in the major version number.

> Not being perfect doesn't make it useless.

The only two aspects that prevent it from being useless are that version numbers make it easier to google specific issues and to figure out whether the version you have installed is the newest one. But communicating what extent of changes to expect is completely unreliable.


> No, version numbers give absolutely no indication about what actually happened.

Surely you understand that not all projects use semver. When projects that do not use semver have version numbers that communicate nothing, that's not a failure of semver. Indeed, you have accurately derived the reason why semver exists: because it would be useful for version numbers to have known semantics. The above comment is, seemingly by accident, an argument in favor of semver.


> it would be useful for version numbers to have known semantics

Fair point, but then wouldn't it have made more sense to also define a unique syntax for semver? If I can't tell whether a project follows the standard by looking at the version number, I still can't rely on it without looking it up.

That doesn't mean it's useless, but it's an unnecessary source of confusion. With Rust crates there's also the problem that developers cannot use a different syntax to indicate they don't follow semver - cargo will refuse to compile the project. While it would be nice if all packages followed a standard this will realistically never be the case and enforcing the notation won't change that.


> Surely you understand that not all projects use semver

SemVer only makes sense for projects whose principal function is presenting what is consumed as a single, unified API, because the first two components of SemVer are actually a version number for the presented API, and the last component communicates the sequential version number without API changes.


> Blender once had a huge update that only changed the minor version number

Blender didn't use semver until 3.0, so that is perfectly valid. You can't complain about semver projects having inconsistent versioning and then pull out non-semver projects.


As other people have said, I normally run unit tests after changing comments. It's part of my check-in routine.

That said, I maintain close to a dozen Rust utilities, some of which only get updated every two or three years. And when I update them, I try to update all their dependencies. We're talking 250-400 transitive dependencies in most cases, and typically two years of evolution of all of them. In other words, this is a worst-case scenario for semver. I'm often looking at 600 "library-years" of supposedly compatible changes.

4 times out of 5, the "semver-compatible" updates work flawlessly. The 5th time, I'll occasionally run into a minor problem that takes me 20 minutes to clean up. Once every several years, I'll run into a genuine headache.

To me, this is stunningly successful. Rust semver isn't perfect, but it's a genuine joy from a maintenance perspective. And I expect things to get even better, thanks to tools like the one in the OP.


> I normally run unit tests after changing comments

Thirded. Tests get run before I commit, period. The grandparent's apparent argument that semver has failed because you have to run tests when upgrading patch versions is utterly alien to me. Has any semver proponent ever claimed such a thing?


> I normally run unit tests after changing comments. It's part of my check-in routine.

I run the tests after running even a part of a file through the formatter. And no, I don't work in Python.


You say the experiment has failed, but speak purely of theory. The actual experiment, the one running in practice, is an astounding success; the things you describe happening don't happen, or at least don't happen in Rust.

In practice, a dependency I should not trust to have done diligence in semver management is only a couple steps past a dependency I should not trust to run on my machine. Ignore any dependency with less than 50k downloads and 'what if other people are dumb' concerns like this one just melt away.


The tool author found that at least 1 in 6 Rust crates broke semver in ways the tool detected. That's a big number for an "astounding success".


Both can be true at the same time, and I believe that to be the case.

The issue in Rust isn't that semver isn't doing the right thing for us. It's that every so often (maybe as rarely as 2-3 times per year for the entire Rust community!) there's a spectacularly painful and regrettable accidental semver break that causes tons of effort to be spent by both maintainers and many downstream users.

For every such incident, there are probably thousands of little semver violations here and there that we don't notice and where nobody is affected by them.

cargo-semver-checks is about letting maintainers have actionable information at the right time so they can engage in well-informed decision-making. Without this tool, the alternative is: every time you publish a new version, roll a d100; on a roll of 2 or 3, you broke semver but nobody noticed; on a roll of 1, you wake up to a GitHub issue with 1000 other issues linked to it that says "you broke semver, please fix ASAP." This is the part that sucks! (The numbers are approximately accurate based on our study of the top 1000 most-downloaded crates.) The remaining 97 times, everything is great -- you publish a new version, everyone happily uses it, astounding success all around.


What's the ratio among Rust crates with over 50k downloads? I manage a tool with literally seven hundred dependencies and not one of them has ever broken semver.


Tool and study coauthor here. The "more than 1 in 6" ratio is purely in the top 1000 most-downloaded crates on crates.io, which are the only ones we tested.

I'm glad to hear of your positive experience with semver. Unfortunately, the issue with semver violations is that their impact has extremely high variance, easily ranging between "nobody noticed" and "ecosystem-wide pain":

- Most of the semver violations we found appear to have never been reported or noticed. After talking to a representative subset of the maintainers of those crates, they were generally surprised to find out a semver violation had happened.

- At the same time, a quick search for "violated semver" in Rust repos on GitHub will surface hundreds of issues, many of them heavily linked-to by issues in other projects. Every such issue represents dozens or hundreds of cumulative engineer-hours of triage and remediation in both the project where the semver accident happened and in affected downstream projects. All this work is regrettable, in the sense that both maintainers and downstream consumers would have preferred if it could have been avoided.

Most semver violations aren't intentional -- they are accidents that happen to all of us regardless of skill level, carefulness, experience, etc. Most of the time, those accidents go unnoticed or have minimal impact. Every so often, they trigger nearly ecosystem-wide consequences for a few days or weeks. (Some examples on my blog: https://predr.ag/blog/toward-fearless-cargo-update/ and https://predr.ag/tags/semver ) And sometimes, breaking semver on purpose can be the correct thing to do!

cargo-semver-checks aims to be a tool such that maintainers never have to say "if I had known this API change had happened, I wouldn't have published this version." Its biggest impact should be at minimizing or altogether preventing those ecosystem-wide semver issues that happen from time to time. Our day-to-day experience with semver should remain unchanged otherwise.


How are you so certain if it could be a break in a code path you don't use? Could you check it with this tool?


cargo-semver-checks is not dependent on whether code paths are used, because it doesn't analyze from the "use" side. It analyzes from the "what API is exposed" side.

You can think of cargo-semver-checks as an automated way to answer a series of queries like "are there any public functions that existed in the previous version and don't exist in the new version?" Any results produced by any of those queries are examples of breaking changes.

This is why our "more than 1 in 6" study found so many previously-unreported semver violations — in our study we were able to find issues in all sorts of dark corners of library APIs, then generate programs specifically designed to rely on the affected behavior in a way that works fine in one API version and causes a compilation error in the subsequent (non-major) version.


> cargo-semver-checks is not dependent on whether code paths are used

That's exactly my point: without this great tool he can't know whether semver is broken from his practical use, since his use is not exhaustive.


I don't think semver is meant to condition users into blindly trusting updates. I always thought it was meant to condition developers into being mindful about what the changes they make will mean for their users. You can maybe never be sure that some change won't lead to something breaking, but it's often quite clear when a change almost certainly will lead to something breaking.


> Would you feel comfortable doing the update, without checking whether the project still compiles and passes its test suite afterwards?

> If you answered "no", then you are basically admitting that Semver is worthless in practice.

Because everything that is not perfect is worthless indeed… Some people should really learn about nuance.


No, the purpose of semver is to be able to distinguish compatible updates from incompatible updates.

As a concept it's relevant any time you have a protocol or a communications channel between agents that aren't rigidly kept in sync. Think network protocols, on disk formats...

The alternative to semver in the filesystem world has historically been feature bits, but feature bits suck because there was an order in which you rolled out features, and feature bits do not preserve that ordering.

Meaning users can (and will, because users do crazy things) end up in a state where they have all the modern feature bits enabled but not the one from 5 years ago and you never tested that configuration.

Learn semver. Use semver. Semver good.


I think you're looking for too much out of a versioning scheme.

If you're seeing a 1.2.3 -> 2.2.3 update, i.e. the dev is saying THIS 100% BREAKS SOMETHING IN MY OPINION, and you just do your normal set of analysis and walk away happy, that's a bit scary.

If you do anything other than your normal analysis, you've just gone ahead and used semver.


> the Semantic Versioning experiment has failed

I'm going to take a wild guess that you're going off of experience from something like npm library versions. That ecosystem is a mess, and I don't blame you for mistrusting it.

On the other hand, the Go language has maintained the Go Compatibility Promise remarkably well. https://go.dev/doc/go1compat

In other words, whether or not semver is useful depends on how well it's followed. Versioning compatibility would be a problem regardless of the version numbering system used. Case in point: Windows 95 vs. Windows 10.


> If you are going to test everything anyway, version numbers might as well be a simple incrementing integer, or a date, or a random string.

You are perhaps right when you say that semver has failed, but what you list above as alternatives are less expressive. Semver admits a tree of versions (a partial order), but your alternatives are strictly ordered.


Yeah, versions should just convey the "marketing" information (how much has changed, how important those changes are), and all the API compatibility should be left to automation like this tool


I've not had much pain with semver violations in the Rust ecosystem (compared to say people abusing the yank functionality of crates.io), but semver is easy enough to get wrong where I might have just been lucky.

This is such a cool project that I hope it goes further, since verifying dependency distribution for correctness on top of all of Rust's other guarantees makes the ecosystem that much easier to justify investing in.


> I've not had much pain with semver violations in the Rust ecosystem

The Rust ecosystem is pretty good about crate authors not knowingly violating semver. This tool is mostly to keep crate authors from unknowingly violating semver. It's possible that the reason that you haven't noticed is because even if a crate accidentally violates semver by making an unanticipated breaking change to a given API, either you're not using that particular API (who uses 100% of any given library?) or else that the breakage doesn't affect your specific usage (e.g. maybe some type no longer implements a trait, but you weren't using that trait to begin with).


Exactly right. Most accidental semver violations aren't noticed by anyone, let alone most people. But every so often, an accidental semver violation causes ecosystem-wide breakage and lots of pain for both maintainers and users.

The goal of this tool is to aid in preventing regrettable accidents like that, by informing maintainers about semver issues before they are published.


cargo-semver-checks author here, thanks for the kind words :)

I've written several blog posts about how semver in Rust is particularly tricky and has unexpected edge cases, if you'd like to see specific examples of things non-obviously going wrong:

https://predr.ag/tags/semver/

Also, I've been working with the cargo team and the plan is to merge cargo-semver-checks into cargo itself, so it runs automatically on `cargo publish`. This would be similar to how cargo checks for uncommitted git changes: it will alert if it finds anything but you can override it if you think that's warranted.


What a great linting concept! Wow, just imagine a world where semver violation scanning was used by PyPI on all of its hosted pip packages.


Author here. The same overall approach should work for Python and most other languages as well!

If you or anyone else is interested in building such a tool, ping me and I'd be happy to help you get started.


I've looked into this!

The issue with Python is dynamic typing. It could be a tractable problem for fully type-hinted code, but when your object's type can be meta-programmed from external data (something I've seen SOAP client libraries do: point a client class at a WSDL, call a virtual method name on the client, and get back an arbitrary tree of data objects that have no code representation as specific class types), it can be nigh impossible to fully resolve this.

I'd love to have tools to scan what I could, but its remarkably challenging due to the level of meta-programming that Python provides.

I admit I'm clearly not as well versed on this topic as you are, based on looking at your work, so it's entirely possible I've missed some obvious tricks that make it easier, but I'm not sure I've got the time any more to tackle a fun new project.


About a year ago, I worked on a similar tool, using the same rustdoc-based idea, [cargo-breaking](https://github.com/iomentum/cargo-breaking/tree/zdimension/r...), but since cargo-semver-checks was released around the same time and used a better general approach to change detection, we never bothered actually releasing it and ended up just archiving the repo


cargo-breaking is cool! I only learned of it well after I had published cargo-semver-checks, but your logo is awesome and the tool was definitely a good idea, obviously.

If you ever feel like doing more work in the space, I'd love to work together on cargo-semver-checks!


Is there something like this for Go?


Tool author here. I think the same overall approach I described here should work for Go as well:

https://news.ycombinator.com/item?id=36653954

One would just need a machine readable data source for Go package APIs, and would plug it into the Trustfall query engine. I'm not a Go expert but most languages I've worked with have something like that.

If you might be interested in building this for Go, ping me and I'd be happy to help you hit the ground running!


<3



Thanks will take a look!


I'm not familiar with rust+semver problems.

Can someone please explain what the issue is?


Unlike the way some languages' package systems work, Rust crates don't cleanly separate the public interface from the private implementation.

In practice a crate's public interface is effectively scattered throughout a number of modules. There's no separate textual interface definition that you could diff to see what you changed without also seeing changes to the implementation. Clients importing your library compile against the whole source.

So in practice if you want to check whether you changed your public interface unintentionally, your best bet is to compare the old and new output of Rust's automated documentation system, which by default is describing only the public interface.

Cargo-semver-checks attempts to automate that.


Nice summary! I'd also add that part of the public interface you're on the hook for is never explicitly written down in the source code, and is instead derived automatically by the compiler based on a set of rules. There are excellent reasons why this is the case, but as you could imagine this doesn't make semver any easier.

For Rust folks, I'm talking about auto traits like Send and Sync.

More info and examples here:

https://predr.ag/blog/toward-fearless-cargo-update/


Tool author here! I wrote a blog post that explains the motivation for building cargo-semver-checks here:

https://predr.ag/blog/toward-fearless-cargo-update/

TL;DR: Semver is particularly tricky in Rust and can be broken in many far-from-obvious ways. When a library breaks semver it can cause compile errors in every downstream bit of code that uses it. This sucks for everyone on all sides.


oh interesting, thank you!

So semver is something that Rust always uses and enforces as opposed to stuff I've seen where it's a manual number added by a human. Cool, I can definitely see how that would cause issues, and your examples+blog were helpful. Much appreciated!

Could there be a workflow in Rust compilation or packaging that uses your tool and says "oops, looks like the semver needs to be updated, can I do that for you?"


cargo-semver-checks is in the process of being merged into cargo itself, which would make the default publishing process into essentially exactly what you describe!

Running `cargo publish` would check for semver issues and alert if it finds anything. Users would be able to override it, of course, but most often they'd probably accept the version change or revert the semver break if it wasn't intended.


For 0.x crates, Rust effectively shifts the digits right by one: the minor version (second spot) is treated like major, i.e. potentially breaking. And lots of crates never hit 1.0+ because their authors are scared of committing to stability or misunderstand semver.


By definition, if a library is updated (even at the same semver version), something in its behaviour must have changed. By Hyrum's law, that's a breaking change at least for someone. It makes no sense to try and enforce versioning by only looking at the types of an API. In fact that's the least useful way, since the compiler will already trivially catch such mismatches.


Hyrum's "law" may claim that somebody out there will run into trouble if I change the internals of my library, but it certainly doesn't say that I have any obligation to care about it.

If you're working for a big company with a monorepo where everyone is trying to live at head, maybe you have to care.

But if I'm giving away my software for free to anyone who cares to use it, I'm entirely comfortable in saying that if you relied on undocumented internals you get to keep both pieces when it breaks.


There might just be new functionality added with no change to existing behavior. And sure, someone might rely on undocumented behavior[1], but that's a bug in their program – nobody can reasonably complain that you've improved performance. Also, the compiler won't necessarily catch this – you might, say, change a Foo parameter to impl Into<Foo> and have it work fine for all your tests, but it might break type inference for some of your users.

[1] which may hint at deficiencies in your library
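To make the inference hazard above concrete, here is a hedged sketch with made-up names (`Foo`, `set_label_*` are hypothetical, not from any real library). Widening a `Foo` parameter to `impl Into<Foo>` works for direct callers, but can break a caller who relied on the parameter's concrete type to drive inference:

```rust
// Hypothetical library type, purely for illustration.
struct Foo(String);

impl From<&str> for Foo {
    fn from(s: &str) -> Self {
        Foo(s.to_string())
    }
}

// v1.0: takes Foo directly.
fn set_label_v1(label: Foo) -> usize {
    label.0.len()
}

// v1.1: the "non-breaking" generalization... usually.
fn set_label_v2(label: impl Into<Foo>) -> usize {
    label.into().0.len()
}

fn main() {
    // Both versions work for straightforward callers:
    assert_eq!(set_label_v1(Foo::from("hi")), 2);
    assert_eq!(set_label_v2("hi"), 2);

    // But a caller relying on inference can break. With v1 the compiler
    // knows `x.into()` must produce a `Foo`:
    //     let x = "hi";
    //     set_label_v1(x.into()); // OK: target type is fixed
    // With v2 the parameter is generic, so the same call is ambiguous,
    // and previously-compiling code stops compiling:
    //     set_label_v2(x.into()); // error[E0283]: type annotations needed
}
```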


> There might just be new functionality added with no change to existing behavior.

Even this can be a breaking change for you. Suppose you're writing a bootloader that must fit in 512 bytes and you import a library. If the size of the library plus your code exceeds 512 B, then your program doesn't fit the target and will not compile. Hyrum's law is real.


If your program has to fit in 512 bytes you're probably not pulling in a heap of external libraries?


It doesn't matter. Pick a number. Maybe you have a 650 MiB cap because you're releasing software on CD. Maybe you have a 25 GB cap because you're releasing on BD. Or 100 MB for some app store. Yet, semver only considers API compatibility, not other factors such as binary size. Any minor change could push the size beyond an acceptable limit. The point still stands.


But at a higher, more reasonable limit a tiny change is much less likely to push you over the line.

Also remember it's not the size of the dependency that matters it's the size of the portions of the dependency used less any cross-crate optimizations. So you're always going to have to be careful and test regularly.


Libraries have clear API contracts, especially in Rust, where the interfaces are strict and side effects are rarer. The ecosystem is also relatively young, so it hasn't developed true Hyrum's ossification. In practice the vast majority of library updates go smoothly.

> since the compiler will already trivially catch such mismatches

That's too late — that breaks someone else's build. The goal is to check types before publishing an update.


If looking at types is the least useful way to enforce versioning, then what are _any_ better ways?


Do not strictly enforce versioning (still publish guidelines and make tools though). Allow third parties to publish adaptors between different versions instead. The correctness of adaptor can be verified to varying extent in many ways.

People seem to give too much weight to versioning, but it's just a means of communication. Like any other communication mechanism, it's subject to error, which should somehow be detected and/or corrected. And versioning is only concerned with detection: it doesn't say what you should do when a breaking change is dropped on you! Focusing on error correction is more sustainable and user-friendly.


Consumer Driven Contracts is one way https://martinfowler.com/articles/consumerDrivenContracts.ht... (implemented in pact for example).


But the point is not to catch every possible observable change, as you say, any of those could break someone's build or change runtime behaviour in a non-desirable way.

The point is to catch changes that are part of the library's public API. That's something the author can be reasonably expected to evolve in a controlled fashion.

If you care about the internals, the onus in on you.


A build system based on people never breaking semantic versioning is a bad idea.

Not only will people break it, it's inconvenient when doing active development, tweaking things here and there.

Just autodetect whatever you need to in the first place.


Whether something is breaking or not can be subjective.

You can change the behavior of an API in breaking ways without changing the interface.

There are so many edge cases.


Sure, Hyrum's law applies. For example one of the game or 3D libraries blew up people's stuff a while back because Rust's structure layout behaviour changed. The library cared about type layout, but it neither asked its users to ensure layout (which you can do with a Rust attribute) nor checked the layout (which you could also do with a macro if you put some work in) - it just blindly assumed it knew how things are laid out, even though Rust specifically tells you that's not the default behaviour - and so it broke.

This tool is a linter, with I believe no false positives. So if it says you've broken semver constraints, you can decide "Oh, whoops, I knew there's a reason I didn't change that, I'll put it back" or "Time for a new version" or, I guess, YOLO although in that case I wonder why you bother running the tool.


I'm not a Rust developer, but my understanding is that if you break semver, which is very easy to do by mistake, you're in the same situation as in C++ with ODR issues, which are undefined behaviour.

For a language that touts itself as a safer C++, that's definitely bad.


Semantic versioning of Rust crates is just a convention and doesn't affect the correctness. Compiler (rustc) in particular has no notion of crates at all and will error whenever users try to use whatever API incorrectly.


(Rustc does know what crates are. You pass a crate root to rustc. You pass —extern to point at other crates. Rustc has no notion of packages.)


It affects what gets rebuilt.


That's not even true for C/C++. A patch version can still update header files and C/C++ build systems can't detect whether the update is semantic or not. In Rust the serialized crate metadata can be used to detect if the recompilation is needed, and it is completely orthogonal to the semantic versioning.


C++ ODR is not just "Undefined Behaviour" it's an example of IFNDR, a get-out in which the language standard just washes its hands of difficult problems by saying in that case what you've written is silently not a C++ program, so its behaviour is arbitrary but you don't get any compiler diagnostics.

Rust doesn't have this get-out. To some extent that's because it's not a formal standard, but mostly it's because IFNDR is diametrically opposed to correctness. Many C++ programs (perhaps even the vast majority) technically have no meaning whatsoever due to IFNDR which is bad -- it's not that the C++ compiler judges them to be correct when they aren't, but that it has no opinion either way, ever and programmers mistake "it compiled" for success.

Thus, in Rust you don't get "ODR errors" here, your program won't compile.

Here's a nice easy Rust semver break example: I make a library about US vacation booking, it has a public enumerated type FederalHoliday but initially that enum doesn't mention Juneteenth - whoops. Somebody points that out to me, and I correct the omission, but I forgot it's in a public enum. So that's a semver break.

Why? If somebody writes code using FederalHoliday from my library, they're entitled to exhaustively match against the enumerated type. So they can make code to handle MLK Day and Thanksgiving and so on, and of course they had no reason to handle Juneteenth because it wasn't in the list - then one day they take my update. Their program no longer builds, it complains that their FederalHoliday code isn't handling Juneteenth, I made a semver breaking change.

Now, in this type of case the correct fix (which you need to do before release, as introducing it later is a semver break) is to mark the enumeration as #[non_exhaustive] which tells Rust I know what's in this enumeration, so I can use it as an exhaustive match in my own code, but outside this crate, you can't match it exhaustively, you need to write a default case to handle new additions.
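Here is a hedged sketch of that fix (hypothetical code matching the example above, not from an actual crate). Inside the defining crate the wildcard arm is optional; the point of `#[non_exhaustive]` is that downstream crates are forced to write one, so adding `Juneteenth` later isn't a semver break:

```rust
// Shipping this attribute from v1.0 makes later variant additions
// non-breaking for downstream crates.
#[non_exhaustive]
#[derive(Debug)]
enum FederalHoliday {
    MlkDay,
    Thanksgiving,
    Juneteenth, // added in v1.1 -- breaking if the enum were exhaustive
}

fn greeting(day: &FederalHoliday) -> &'static str {
    match day {
        FederalHoliday::MlkDay => "MLK Day",
        FederalHoliday::Thanksgiving => "Thanksgiving",
        // Outside the defining crate, #[non_exhaustive] forces a default
        // arm like this, which quietly absorbs new variants:
        _ => "some federal holiday",
    }
}

fn main() {
    assert_eq!(greeting(&FederalHoliday::Thanksgiving), "Thanksgiving");
    assert_eq!(greeting(&FederalHoliday::Juneteenth), "some federal holiday");
    println!("ok");
}
```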


> it's an example of IFNDR, a get-out in which the language standard just washes its hands of difficult problems by saying in that case what you've written is silently not a C++ program, so its behaviour is arbitrary but you don't get any compiler diagnostics

How does this differ from undefined behavior?


UB is runtime behaviour, which is why "UB? In my lexer?" (the proposal paper for C++ 26 to remove Undefined Behaviour from the lexer, which clearly isn't happening at runtime) is so hilarious - C++ is so badly defined even parsing a source file has undefined behaviour, LOL.

Because it is runtime behaviour UB is constrained, if the only UB is in your "Delete files" code, and that code only executes when the user hits the "Delete" button, we can say that this UB won't occur when the user doesn't hit the button.

But IFNDR isn't like that, since the entire program has no meaning, nothing about it is required at all. Maybe the compiler output is 40GB of zeroes. Maybe when you run the program it immediately segfaults. Maybe it works exactly as you expected... except it's a bit... off and you can't exactly say why.


So IFNDR is a property of programs, whereas UB is a property of executions of programs. Got it, thanks.

Given that IFNDR is a static, rather than a dynamic property, why can't compilers detect it and error? Are all IFNDR issues similar to the ODR one you mentioned in that they only arise at link time and fixing them would require changing how symbols are mangled?


> Are all IFNDR issues similar to the ODR one you mentioned in that they only arise at link time and fixing them would require changing how symbols are mangled?

No, they're worse. Are you comfortable with basic CS theory? If not this is likely to whizz over your head, sorry.

A typical IFNDR issue depends upon the semantic properties of the alleged C++ program. Rice's Theorem tells us that all non-trivial semantic properties are Undecidable. No algorithm can exist to determine whether a program has these properties except if trivially all the programs in your language have the property (e.g. imagine a toy language with no jumping or iterating features, it must halt) or none do (e.g. imagine a resolutely single threaded language, it doesn't have any data races).

Because it's Undecidable, having the compiler decide if your alleged C++ program has the desired properties and if not report an error is intractable. So the C++ standard just says if it doesn't have that property it's actually not a C++ program, too bad, the compiler shouldn't worry about this case at all, just press on.

Now, the good news about Rice's Theorem is that we do have another option. When we're not sure if the semantic property we wanted is present, we reject the program, with a diagnostic informing the programmer that we couldn't prove it had this desirable property.

I'd argue this approach is much better, but it does mean you can write programs in Rust (which has this behaviour) where even though you are totally correct that the program would actually be OK the compiler rejects your program because it can't see why you're right.


That only works if you're dealing with source, where it's possible to check for coherence.

The problem happens when you link inconsistent binaries.


In which case static or dynamic linker will simply fail because Rust symbols have hashes derived from ABI attached by default (unless `#[no_mangle]` is used and you're on your own) and those hashes won't match unless the identical signature was used.


Mangling isn't magic.

You have a type T, and a struct U which contains a T.

A function takes a U by reference, and therefore will be mangled with U.

However, say that function was built with a different version of T than the caller. This doesn't affect mangling, so the link silently succeeds even though things are broken.

This is a basic pitfall of working with any system that allows linking of binaries.

The setup only works so long as you guarantee that all entities with the same name are indeed the same thing. This is the One Definition Rule.


Those hashes are exactly designed in the way that any such changes will alter hashes and thus stop linkage. Virtually all C++ name mangling schemes don't even consider such cases: `void foo(T&);` will probably result in the same mangled name even though T's layout or any type referenced by T has been changed. In Rust they do affect mangling.


Except it’s not really subjective.

Does it make the code which previously compiled not compile?

Yes - Breaking

No - then, does it make the code which previously worked as expected do something unexpected?

Yes - Breaking

No - Not breaking

I agree that the second part is tricky as you need to cover all usecases that used to be considered valid, but it’s not impossible, especially with a well defined API


API breakage and ABI breakage are two different things.


Rust has no stable ABI (a simplification for the sake of discussing crates); instead, the sources of all dependencies are fetched and compiled along with your program, hence we're talking about a crate's API and not a library's ABI.

And yes, I’d say that if a program did not error at runtime but after upgrading your library it started to - that’s still a breaking change (even worse than one that you can detect at build time)


The build system needs to know what source files it needs to rebuild and for which ones it can just reuse previously-generated build artifacts, typically by detecting what changed in the source.

What changed in the source outside of your project is driven by whether the semver changed. Semver not changing means that the source didn't change and you can link existing binaries as-is.

Then there is the whole business of dynamic linking.


How is this different from equivalent claims about Continuous Integration or about Version Control? Automation allows humans to more easily produce good quality software.


What I'm saying is that you could just integrate some tooling which would detect when changes are or are not made in a compatible way, instead of relying on the human making a correct annotation.


cargo-semver-checks is the tooling you describe. It doesn't currently catch every possible semver violation, so it won't suggest version numbers for you yet, but that's the eventual goal!


It isn't. You apparently didn't even read the README.


Are you telling the author of the tool that he didn't even read his own README?


xD


Doesn't change the fact that the README is sufficient to see the tool doesn't do what I stated.

It's a workflow problem, things are done the wrong way around.


Can you explain better what you want then? Because I understood your description as a tool that detects when a code change doesn't match semver versioning without a human having to annotate the changes as breaking or anything. And then shows what to correct.

Which is exactly what cargo-semver-checks does.

Like, the first line of the README after the name is "Lint your crate API changes for semver violations." - it quite specifically works with the actual crate API, not commit names, comments or any other human annotation.

It also will show you what broke semver and how, so you can fix it easily. And as the author stated, the goal is to suggest the correct version in the future.


I already wrote the correct workflow in plain English.

Verifying that a semver change is correct is not the same as automatically generating the semver.

It's literally P vs NP.


You're right, the tool doesn't automatically generate the correct semver. If that's what you meant, it seems to have been lost in translation. Your first comment was "Just autodetect whatever you need". It says nothing about automatically generating the correct semver value.



