It's nice that they're being explicit about breaking changes, but it's still a pain that they're doing them in the first place. I understand the argument for moving quickly, but projects that use LLVM suffer as a result.
Basically, any code that uses LLVM bitrots very quickly because the API is (potentially) incompatible on each release. The way that large projects (like Rust [0]) generally deal with this is by sticking close to the bleeding edge of development and having an established process for integrating changes as they come. But smaller projects just don't have the resources to do this.
KLEE (a symbolic execution engine) appears to be 5 versions behind the newest release [1]. Terra (a language focusing on metaprogramming for high performance) [2] is only one version behind, but the maintainer has since graduated and I'm worried the community is too small to keep it from bitrotting.
Nominally the C API is supposed to solve this by being more stable, but it achieves this by exposing a smaller surface area, which means you can't do a lot of what you might need to do with this API.
I'd like a stable, full-featured API for LLVM. It sounds like they're going this way with the bitcode format, and I think they could start to move in this direction with the API as well. When the project was young, the argument was that avoiding stability allowed the developers the flexibility to explore design decisions and make different tradeoffs over time necessary for the long-term health of the code base. I understand that. But this instability makes it essentially impossible to maintain client projects without ongoing effort, and means that small projects almost always bitrot.
I think we know enough now to develop a pretty decent stable API for LLVM. And I'd be a lot more confident in trusting smaller projects that use LLVM if such an API existed.
Thank you for the only comment that isn't shallow semver debate.
And for being spot on.
LLVM is an incredible piece of software, but it's the lowest part in the stack.
Sure, clang, rust or swift have enough manpower to always follow, but smaller languages / compilers can't possibly hope to deal with breaking changes all the time.
The constant breaking changes also make it very hard to support multiple LLVM versions simultaneously.
Which makes development and deployment harder, because you can usually only install one LLVM version via distro packages.
So if you want to compile something that depends on an older / newer LLVM version, you have to compile LLVM, which takes a loooong time and is not something you can bring many people to do.
I, for one, cannot keep track of which version we're currently at. It used to be that a major version was introduced with fanfare and broke things. Think of the holy 1.0, or versions such as Python 3 or Gnome 3. I will always remember if I'm running 2.x or 3.x.
Some time after the major version hits 3, the madness seems to begin. Am I running Firefox 73 or 85, or was that last year's version? I honestly don't know anymore. Are the stable versions even, odd, Fibonacci or prime?
Why not be brave about it and use date-based numberings such as 16.12!
I like the idea of Semantic Versioning, but it does cause problems where if I'm going to do several breaking releases one after another I have to keep bumping the major version, and I hate that. I've actually put off doing breaking changes simply because I didn't want to bump the major version number, but I don't like that either.
Thinking on it now, I wish instead of major.minor.patch the format was major.breaking.minor.patch, where the first component is for "significant" releases, and the second component is what you use when you're simply introducing breaking changes (but doesn't qualify as a "significant" release).
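To make the major.breaking.minor.patch idea concrete, here is a minimal Python sketch of how a tool could decide whether an upgrade is a drop-in one under that scheme. The function names are hypothetical and the rule (first two components must match, the rest must not go backwards) is my reading of the proposal, not an established standard.

```python
from typing import Tuple

Version = Tuple[int, int, int, int]


def parse(text: str) -> Version:
    """Parse 'major.breaking.minor.patch' into a tuple of four ints."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four components, got {text!r}")
    major, breaking, minor, patch = (int(p) for p in parts)
    return (major, breaking, minor, patch)


def is_drop_in_upgrade(current: str, candidate: str) -> bool:
    """True if candidate can replace current without a breaking change.

    Assumption: both 'major' and 'breaking' signal incompatibility, so they
    must be identical; minor/patch may only move forward.
    """
    cur, cand = parse(current), parse(candidate)
    return cur[:2] == cand[:2] and cand[2:] >= cur[2:]
```

For example, `is_drop_in_upgrade("1.2.1.1", "1.2.3.0")` holds, while going from 1.2.1.1 to 1.3.0.2 does not, because the second component changed.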
> I've actually put off doing breaking changes simply because I didn't want to bump the major version number
That's kind of the point though, right? As an API consumer/library user, breaking changes are a pain in the ass. A library that rapidly issues breaking changes is one that is just plain difficult to use. Part of the goal with semantic versioning is to push authors to slow down on that and respect the commitments of their users.
Also, another thought here is that if you're hitting that many breaking changes that quickly, you possibly went 1.0 too early.
> where the first component is for "significant" releases, and the second component is what you use when you're simply introducing breaking changes
From a user's point of view, a breaking change is a significant release. It means I can't just hit update and get bug fixes (hopefully) for no effort. I have to investigate changes and test everything to make sure my software will still work.
There's a difference between a major upgrade that breaks nearly everything (like Angular to Angular2) and a smaller but necessary change in a couple of the many functions of an API. So I think there is a point for a major.breaking.minor.patch versioning, depending on the amount of work necessary when bumping major vs breaking.
Major would be for rewrites: (almost) a new product, but with the same name for brand recognition.
This seems like a nice idea on the face of it, but it's a lot less simple (who decides what's major and what isn't) and plus, when you introduce a "small" breaking change you have no idea what impact that's going to have downstream. It could be a really big deal for a client. The fact that it's a breaking change by definition means it's going to be a problem for someone and you really have no way to know how deeply entrenched that small piece of API surface is.
What bothers me about the current trends in release management is that there is way too much emphasis on iterating quickly, even for libraries where that's completely inappropriate.
It's much better to just put all your breaking changes into an unstable branch and only merge to stable (and change version numbers) infrequently, when you're sure the dust has settled. Anyone that really needs the latest updates can pull the unstable branch at their own risk, but everyone else doesn't have their build broken every other day. This is a really serious problem in the JavaScript ecosystem right now (even disregarding the left-pad debacle).
You can only do that once. Once you hit v1, if you introduce new APIs later, there's no "grace period" to make small breaking changes to those APIs.
In my case, the library that I avoided making breaking changes to because I didn't want to bump the major version was already on v2, and I didn't want to make it v3 after just a few weeks at v2.
This can be solved by adding some sort of annotation or documentation to new APIs.
In RxJava [1], parts of the API are marked with @Experimental or @Beta. @Experimental provides no guarantees, while @Beta only guarantees no breakage in the same minor version.
> I like the idea of Semantic Versioning, but it does cause problems where if I'm going to do several breaking releases one after another I have to keep bumping the major version, and I hate that. I've actually put off doing breaking changes simply because I didn't want to bump the major version number, but I don't like that either.
When you're doing backend/API things (where breaking changes matter), I sure hope you think a million times before making breaking changes.
Can you imagine if someone had to go through millions of lines of his code to make sure nothing broke, then, a week later, you broke his code again?
I'm not the parent, but I sort of like the idea at first glance. I mean, it's a fine line: if I have "v1.0.0" and I break one API in one module, I'm compelled to release "v2.0.0" even though v2.0.0 is otherwise entirely compatible with v1 -- it's not as big of an upgrade as the version numbering system would imply.
But if I go and drastically change things in a way likely to impact many users, that gets a brand new version number too. So v1.0.0 -> v2.0.0 only really communicates "something might break".
The scheme proposed by the parent would be able to communicate "expect many things to break because I refactored the heck out of stuff to fix some long-standing design deficiencies" -- though admittedly when to bump that first version number is likely to be a subjective topic. :)
Perhaps this isn't as valuable as it seems at a first glance, but if anybody's tried something like this I for one would be interested to hear about it.
A piece of software I'm familiar with does this Arch.Major.Minor.Patch.
The Arch number represents a "generation" of the software that represents a significant overhaul where the architecture changes and input files that were generated for previous versions are not likely to work.
Major represents breaking API compatibility, so users who write plugins for the software will need to recompile and possibly change their code (but maybe not depending on what changed).
Minor and Patch are what you'd expect from semver.
The point of making backwards compatibility the first number is because the subtle changes need attention drawn to them. There's little risk of people accidentally using things in a broken way when projects are renamed. It's about clarity of technical communication. There are lots of ways to communicate big direction changes, including changing the look and feel of something, adding code words (WD Caviar Green), putting out a big publicity push (v32 is a whole new game!), and so on.
If you're using a proper type-safe compiled language then most "subtle" breaking changes can't possibly be missed because your code won't compile anymore (assuming you used that API to begin with). You don't need a major version number to call attention to the fact that one parameter of one method changed from a boolean flag to a set of options, anyone who's using that method will find that out pretty quickly.
The main reason semver is done the way it is is so you can do things like have package managers automatically pick the latest compatible version, since incompatibilities are denoted by the major version number. That's why I'd prefer major.breaking.minor.patch, because you can still have the package manager automatically detect compatible versions, but you don't end up in the crazy land of releasing a library at v27.
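As a rough illustration of the "automatically pick the latest compatible version" behavior under plain major.minor.patch, here is a simplified Python sketch. Real package managers have much richer range syntax and special rules (for 0.x releases, pre-release tags and so on); the helper names here are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def parse_semver(text: str) -> Tuple[int, int, int]:
    """Parse 'major.minor.patch' into a tuple of ints (no pre-release tags)."""
    major, minor, patch = (int(p) for p in text.split("."))
    return (major, minor, patch)


def latest_compatible(installed: str, available: Iterable[str]) -> Optional[str]:
    """Return the newest available version sharing the installed major version.

    Versions are compared numerically, so 2.10.11 sorts after 2.9.0.
    Returns None when nothing compatible is available.
    """
    current = parse_semver(installed)
    candidates = [
        v for v in available
        if parse_semver(v)[0] == current[0] and parse_semver(v) >= current
    ]
    return max(candidates, key=parse_semver, default=None)
```

With `installed="2.9.0"` and `available=["2.8.5", "2.9.0", "2.10.11", "3.0.0"]`, this picks `"2.10.11"` and never crosses into 3.x.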
> If you're using a proper type-safe compiled language then most "subtle" breaking changes can't possibly be missed because your code won't compile anymore (assuming you used that API to begin with). You don't need a major version number to call attention to the fact that one parameter of one method changed from a boolean flag to a set of options, anyone who's using that method will find that out pretty quickly.
1. What if it's a dll, .so? You upgrade and find out that your program is broken.
2. Sometimes the API stays the same but the code behind the API changes a result (for example, secure_hash goes from MD5 to bcrypt)?
3. What about non-type safe languages (like HTML or JS, so things like Firefox or Chrome)?
The point is that you should avoid breaking other people's code if you can. What happens if the removal of one function in that one module costs me a full year of work?
Sometimes you can't help yourself. PHP had register_globals. Some people were able to use it safely (initialize all variables before use), but PHP rightfully realized the security implications and disabled it. However, it broke code, and a lot of it.
These are things you should think about and heavily before breaking code. It may be one line for you, but for all the millions of people who use your library it could be thousands of man-years of work.
> What if it's a dll, .so? You upgrade and find out that your program is broken.
If it's not backwards-compatible then it needs to bump the appropriate version number (in my proposal, that would be the second dotted component). So I'm not sure what you're trying to say here.
> Sometimes the API stays the same but the code behind the API changes a result (for example, secure_hash goes from MD5 to bcrypt)?
If it's a non-backwards-compatible behavioral change then maybe you need to design your API better such that this kind of change is expressed in the API. After all, if you expect anybody to ever upgrade to your new bcrypt version, you need to provide some path for people to still work with their older MD5 hashes anyway.
> What about non-type safe languages (like HTML or JS, so things like Firefox or Chrome)?
Not something I particularly care about. Though it doesn't really matter anyway; even if you think the breaking change number is too "subtle", anyone who's manually upgrading to a new version instead of letting their package manager do it should already be prepared to deal with breaking changes, because if there aren't breaking changes then their package manager should have been happy to upgrade without any manual intervention.
> The point is that you should avoid breaking other people's code if you can. What happens if the removal of one function in that one module costs me a full year of work?
I have no idea what point you're trying to make here. My suggestion was just about changing the format of the version number, and has no bearing whatsoever on the actual breaking changes you do or don't introduce. I'm certainly not advocating for removing functionality.
I think, and correct me if I'm wrong, but I think he's saying that semver's system that forces major version bumps for breaking changes is good because it discourages the maintainer from making breaking changes more frequently than he is comfortable releasing a new major version. That is, the effect of getting a maintainer to batch up breaking changes in exchange for version consistency is positive, and thus semver should not be changed as you propose, because it would decrease the cost of releasing a breaking change.
There's still a cost associated to releasing a breaking change, which is users have to manually upgrade, their package manager won't silently upgrade for them. So if I release a breaking change, I know any bugfixes included in it or later builds will take a while before they end up in the hands of users, because most people tend to put off dealing with breaking changes until they have time to actually investigate the changes.
But with my proposal, users can see that the breaking change is a "minor" one, and therefore they don't need to be prepared to learn about a bunch of big changes in order to upgrade.
That is even worse, because now you have to look at two numbers to see if something's breaking or not. If you change the API, you bump the major. If you don't want to bump the major, then either figure out a way to do it with the original API, such as a different number of arguments, or put that module in a package that can be installed separately. Given your original example, you may have package Foo 1.0 which includes subpackage Bar, but you have subpackage Bar 2.0 which people can install separately if they need to. Bumping the major tells people straight out that things have changed. Two majors means that people have to keep track of two numbers for that -- can you tell at a glance whether 54.32.593.3 is compatible with 54.33.594.4, for example?
With my proposed version scheme, you won't ever get 54.32.593.3. That's kind of the whole point. So instead you'd be comparing 1.2.1.1 and 1.3.0.2, which is a lot easier to read.
The point of Semantic Versioning is to tell you something.
So let's say you have Compiler 5.3.2
It means that the important thing is compiler #5. Upgrading from 4 to 5 is a _Big Deal_. You may have to rewrite all your code.
Within 5, you have a version 3. 3 has features A,B,C which 2 doesn't have. Most additions go there. So it should be safe to upgrade.
Within that, you have bugfix #2. That _should_ always be upgraded, unless you rely on undocumented features.
So it's easy for me to tell if I should upgrade.
So upgrading from Apache 1 to Apache 2 may break config scripts and .htaccess files. Don't upgrade on a production build.
Upgrading Apache 1.1 to 1.2: see the README. It should be fine; do a small test on your testing machine.
Upgrading Apache 1.1.2 to 1.1.3. Probably a security fix. Do so. Immediately.
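The upgrade rules above boil down to finding the first component that changed. A minimal Python sketch, assuming plain three-component numeric versions (the function name and the returned labels are just for illustration):

```python
def classify_bump(old: str, new: str) -> str:
    """Classify an upgrade as 'major', 'minor' or 'patch'.

    Assumes both versions are plain 'major.minor.patch' strings.
    Returns 'not an upgrade' if new is not strictly newer than old.
    """
    o = tuple(int(p) for p in old.split("."))
    n = tuple(int(p) for p in new.split("."))
    if len(o) != 3 or len(n) != 3:
        raise ValueError("expected versions of the form major.minor.patch")
    if n <= o:
        return "not an upgrade"
    if n[0] != o[0]:
        return "major"  # may break things: schedule real testing
    if n[1] != o[1]:
        return "minor"  # new features: read the README, do a small test
    return "patch"      # bug or security fix: upgrade promptly
```

So `classify_bump("1.1.2", "1.1.3")` gives `"patch"`, and `classify_bump("1.9.9", "2.0.0")` gives `"major"`.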
---------
The OP's numbering system doesn't tell me anything. Should I upgrade 5.4.3.2 to 5.5.0.0? Will it be safe? Probably not. You may have to schedule a full testing load just to be sure.
What about from 5.4.3.2 to 6.4.0.0? Same thing. You have to do a full testing.
And if you _really_ break old code, do everyone a favor and rename your project (So, no, please don't call Go C++ V.13 or something)
You seem to be very confused. Upgrading from 5.4.3.2 to 5.5.0.0 with my scheme is no different than upgrading from 54.3.2 to 55.0.0 with traditional semantic versioning. I'm not suggesting any change to the actual model of semver, I'm literally just saying that I want to tack a new component on to the front for human consumption purposes.
In a library, breaking releases should be far fewer than "regular" feature releases. My point is that if you break code more than a few times in the history of your library, you'll get a revolution. For example, see Python 2->3, which was a relatively "small" fix (which just happened to affect pretty much half of existing string processing code), and PHP, where they seem to introduce and then turn around and remove those features every couple years (mysql, no, mysqli, no, PDO? Are we there yet?)
There's a big difference between massive sweeping breaking changes, and small breaking changes. What I care about is the ability to do smaller breaking changes, and the new "major" version number that I tacked onto the front is to signify the large sweeping changes instead of the smaller breaking changes.
As a user, I see no difference because the result is the same: code is broken. You are trying to introduce a full scale for a binary thing. If the breaking change is small, then delay it until the next major release.
As a consumer, that first number becomes irrelevant if it's not the "is this going to break my shit?" number. Why have it? It's not communicating something useful to me. Cut it.
But it does communicate something useful! It tells you "this isn't a huge change, it's a minor change that just happens to break backwards-compatibility in some fashion". Most products reserve major version number changes for when there's particularly large or important changes to the product. Committing to semantic versioning means losing that. The upside is your package manager knows when it's safe to silently upgrade. The downside is the actual humans looking at your product have no idea which versions they need to actually do some work to support, versus which ones just have minor breaking changes that may not even affect them.
I expect to audit dependencies I use when they break API compatibility in any way. That's a feature, not a bug. Having a "well, the maintainer think this is a bigger break" number does nothing. It's still a break. It's still a major version change.
In semver the first number is not "is this going to break my shit?", it's "is it potentially breaking my shit?". In this proposed alternative the first number is "this is definitely going to break your shit", and the second number is "this may break your shit, check the changelog to see if it affects you".
Having a "may" in there is the same thing as a "will" from the perspective of downstream. It still needs to be audited and checked. There's no value in splitting this out.
That's the same thing as projectName.major.minor.patch in some respects. You could just rebrand the project if it's a completely different direction. There's not a technical reason to keep the name, just a marketing/political/organizational reason.
> I like the idea of Semantic Versioning, but it does cause problems where if I'm going to do several breaking releases one after another I have to keep bumping the major version, and I hate that.
From your point of view, the problem is something inherent to semantic versioning. From my point of view, the problem is inherent to your behaviour – if you really want to do “several breaking releases one after another”, why hide it from users?
I'm not trying to hide it, but the problem is there's no way to communicate "this is a breaking change, but it's not a particularly large or important change", i.e. users don't have to be prepared to learn a bunch of stuff when upgrading.
I thought change logs served this purpose quite nicely. Every time there is a breaking change, the user needs to check the change log anyway. If there are only some small breaking changes, they are only pleasantly surprised.
Not exactly what you're asking, but here's a wrench I've seen in semver at work. Let's say I do an overhaul of a library; either I rewrite the logic or write a new interface. I also write "shim code" that's "supposed" to be backwards compatible. It's a huge change, and the company isn't really big enough to have formal QA or significant testing. Personally, I tend to write a lot of shim code, but try to be aggressive about deprecating it--it's more work, but is more politically viable than getting everyone to update their code /and/ troubleshoot/trust a big change. The consensus we had was a major version bump when that's introduced and another when the shim code gets deprecated--but nobody really wants that in practice. (I never did come up with a good answer.)
Also, we decided that "stealing" major version numbers for speculative code was a bad idea. We'd have a lot of false starts that get abandoned and it got really confusing when the next major version got released (you're either re-using that version number or skipping a bunch).
Personally, I like semver, but think versioning mechanisms can differ (going from major OSes to small libraries I write at work) based on the needs. One thing I've noticed when working on artwork for clients, they get scared and confused by large revision numbers, so we tend to keep them fairly low by keeping internal version control numbers separate from review versions.
If I'm writing a library, and I want to make a relatively minor breaking change to one function (e.g. changing the type of a parameter), that's not a significant change to the whole library, but it is a breaking change and semver demands that I bump the major version number.
My opinion, as someone whose code got broken by a library micro version update just yesterday, is that the responsible thing to do is add a new function with the new args instead of changing the old one. Change the implementation of the old one so it calls through to the new one with suitable parameter values and mark the old version as deprecated, but please don't remove it until the next major release!
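A minimal Python sketch of that pattern, using the standard `warnings` module. The function names and the "boolean flag becomes a set of options" change are made up for illustration; the point is only that the old entry point keeps working and forwards to the new one.

```python
import warnings
from typing import FrozenSet, List


def split_fields(text: str, options: FrozenSet[str] = frozenset()) -> List[str]:
    """New API (hypothetical): behavior is selected by a set of option names."""
    fields = text.split(",")
    if "strip" in options:
        fields = [f.strip() for f in fields]
    return fields


def split_fields_legacy(text: str, strip: bool = False) -> List[str]:
    """Old API (hypothetical): kept as a thin shim until the next major release."""
    warnings.warn(
        "split_fields_legacy() is deprecated; use split_fields() instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not the shim
    )
    # Translate the old boolean flag into the new option set and call through.
    return split_fields(text, frozenset({"strip"}) if strip else frozenset())
```

Existing callers of `split_fields_legacy("a, b", strip=True)` still get `["a", "b"]`, plus a `DeprecationWarning` telling them where to migrate.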
That's what I do in most cases. But sometimes there's just no option.
The particular case that I was referring to when I said I decided to put off changes because I didn't want to bump the version number was actually a stylistic change, renaming a method from `parse(with:)` to `parse(using:)`, in order to better match the Swift 3 naming conventions. Normally I would have just marked the old method as deprecated, except a compiler bug means that if I do so, any code using trailing-closure syntax fails to compile (https://bugs.swift.org/browse/SR-3227). So it's literally impossible for me to rename this method in this manner without deleting the old method entirely (which is a breaking change). But I just bumped the major version number recently when Swift 3 was released, and I didn't want to bump it again shortly afterwards just because I didn't consider this method name during the Swift 2->3 migration.
Fair enough, I understand the desire not to bump the major version in this case. It does still kind of suck for users of the library though.
The case I was talking about was the OpenVR library, which changed the names for some of its enum values in the upgrade from 1.0.4 to 1.0.5. The change was documented in the release notes and it was straightforward to fix our code, but now we have to document that we require at least v1.0.5 and everyone building our code has to update their copy of OpenVR and so on. There are knock-on effects, is what I'm trying to say.
Library users expect breaking changes to be rare and have a damn good reason. What you're describing is deprecation without any warning period ("this stuff will go away in the next major release").
Can't you just deprecate the old function, and remove it upon change? If you mean "keep bad behavior intact" (i.e. maintain bugs), that's why you should have a specification. "The implementation is the specification" is very bad, IMO.
> it does cause problems where if I'm going to do several breaking releases one after another I have to keep bumping the major version, and I hate that
Well how about stopping to think about your API instead of firing off several breaking changes in short succession?
The problem here is lack of commitment to a stable API, not the versioning scheme.
I get why you feel like that, I really do, but it all seems a bit hand wavey to me. 99% of the time with semver you can unambiguously tell whether it should be a major, minor, or patch bump (of course there are edge cases though).
Keeping up with the LLVM internal API is really a pain with the six-month cycle. Is there a plan to stabilize it in the near future? I can see some open source projects starting to really drag: Julia is still on LLVM 3.7, and so is GHC 8.0.1. I don't think it's healthy.
I should also note that the internal APIs are not the problem for Julia. We usually pick those up immediately. Rather, doing validation for a new release on all platforms, reducing, filing, fixing any regressions, etc is what takes the time (in addition to only a few of us knowing LLVM well enough to do so).
Some folks in the commercial world pull all the commits from upstream all the time in order to not fall too far behind. It's a pay-me-now-or-pay-me-later kinda thing. Forces you to take the interface breakage seriously lest it cause other commits to queue up while you let it linger.
Fully agree. Staring at the same leading number in the versions of a given piece of software for my whole life is a very acceptable worst case price to pay for opening up the version space to those really big "spiritual successor" kind of changes.
Just imagine the headache if Angular had bumped the most significant number for every change pre-Angular2. All the difference between 1.x and 2.x would be perfectly obscured.
I totally agree. When you release bi-annually anyway, don't really care about how much breakage a major release introduces, totally ignore minor versions, and treat patch versions as more or less insignificant, why not just go with $year.$month?
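For what it's worth, a $year.$month number is trivial to derive from the release date. A tiny Python sketch, assuming a two-digit year and zero-padded month as in "16.12" (the function name is made up):

```python
from datetime import date


def calendar_version(release_day: date) -> str:
    """Format a release date as 'YY.MM', e.g. December 2016 -> '16.12'."""
    return f"{release_day.year % 100:02d}.{release_day.month:02d}"
```

A release cut in September 2017 would come out as `calendar_version(date(2017, 9, 1))`, i.e. "17.09".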
I'm glad that semver is being widely adopted outside of the npm/rails world. It's incredibly useful to understand how much work will go into upgrading a package.
I think the attachment to "saving" major releases is really just an attachment to marketing messages of yesteryear. It still feels good to announce "Library Version 3!", but from a technical perspective, semver is far more consumable and sensible. I'd rather be confident Lib v27.3.0 -> v27.4.3 won't break my build than have v2.9.0 -> v2.10.11 break my build, but look nicer.
As the article stated, LLVM 5.0 should really be called LLVM 5, and isn't it easier to remember and write than 17.09? And IF you do care about the patch level, then it's LLVM 5.1 vs 17.09.1; again, I'll take 5.1 any day.
They still haven't got it right; you can only decide how the version number should change by comparing the software to previously released versions. Deciding ahead of time how it will change on a periodic basis is a mistake.
So stupid. The old scheme is easy to read if you just remove the dot in your mind: 3.8 -> 38, 3.9 -> 39. Now they make it complex: 38.0 -> 38, 39.0 -> 39, removing a dot and a zero. Not helpful, and so stupid.
[0]: https://www.rust-lang.org/
[1]: https://klee.github.io/getting-started/
[2]: http://terralang.org/