Show HN: BitKeeper – Enterprise-ready version control, now open-source (bitkeeper.org)
384 points by wscott on May 10, 2016 | 306 comments



The grand irony is that Larry was one of the earliest advocates of open sourcing the operating system at Sun[1] -- and believed that by the time Sun finally collectively figured it out and made it happen (in 2005), it was a decade or more too late.[2] So on the one hand, you can view the story of BitKeeper with respect to open source as almost Greek in its tragic scope: every reason that Larry outlined for "sourceware"[3] for Sun applied just as much to BK as it did to SunOS -- with even the same technologist (Torvalds) leading the open source alternative! And you can say to BK and Larry now that it's "too late", just as Larry told Sun in 2005, but I also think this represents a forced dichotomy of "winners" and "losers." To the contrary, I would like to believe that the ongoing innovation in the illumos communities (SmartOS, OmniOS, etc.) proves that it's never too late to open source software -- that open source communities (like cities) can be small yet vibrant, serving a critical role to their constituencies. In an alternate universe, might we be running BK on SunOS instead of git on Linux? Sure -- but being able to run an open source BK on an open source illumos is also pretty great; the future of two innovative systems has been assured, even if it took a little longer than everyone might like.

So congratulations to Larry and crew -- and damn, were you ever right in 1993! ;)

[1] Seriously, read this: http://www.landley.net/history/mirror/unix/srcos.html

[2] The citation here is, in that greatest of all academic euphemisms, "Personal communication."

[3] "Sourceware" because [1] predates the term "open source"


Yeah this irony is not lost on me. But in both cases, the companies acted in self interest. Neither had the guts to walk away from their existing revenue stream. It's hard to say what would have happened.

It's been an interesting ride, and if nothing else, BK was the inspiration for Git and Hg; that's a contribution to the field. And maybe, just maybe, people will look at the SCCS weave and realize that Tichy pulled the wool over our eyes. SCCS is profoundly better.


Thank you for contributing to the development and evangelizing of DVCS, directly (BK) and indirectly (the ideas and inspiration for git, hg).


It's probably fair to say that DVCS accelerated the growth of the entire software industry.

Was BitKeeper the first version control system to "think distributed"?


Sun's TeamWare [1] was probably the first real distributed version control system. It worked on top of SCCS. Larry McVoy, BitKeeper's creator, was involved in its development. I believe BitKeeper also uses parts of SCCS internally.

[1] https://en.wikipedia.org/wiki/Sun_WorkShop_TeamWare


We did a clean room reimplementation of SCCS and added a pile of extensions.


So when Tridge "reverse engineered" BK, he basically reimplemented SCCS?

https://lwn.net/Articles/132938/


Nope, he did a clone and pull. It was essentially rsync or tar; he had no awareness of the file format.


NSE begat NSE-lite begat TeamWare begat BK begat git. Or so says Cantrill.


That is my understanding, yes -- but the NSE and (McVoy-authored) NSElite chapters of the saga pre-date me at Sun. Before my "Fork Yeah!" talk[1][2], from which this is drawn, I confirmed this the best I could, but it was all based only on recollections of the engineers who were there (including Larry). I haven't found anything written down about (for example) NSElite, though I would love to get Larry on the record to formalize that important history...

[1] https://www.usenix.org/legacy/events/lisa11/tech/slides/cant...

[2] https://www.youtube.com/watch?v=-zRN7XLCRhc


The NSE was Sun's attempt at a grand SCM system and it was miserably slow (a single-threaded, FUSE-like COW file system implemented in user space). I did performance work back then, sort of a jack of all trades (filesystem, VM system, networking, you name it), so Sun asked me to look at it. I did, and recoiled in horror; it wasn't well thought out for performance.

My buddies in the kernel group were actually starting to quit because they were forced to use the NSE and it made them dramatically less productive. Nerds hate being slowed down.

Once the whole SCM thing crossed my radar screen I was hooked. Someone had a design for how you could have two SCCS files with a common ancestry and they could be put back together. I wrote something called smoosh that basically zippered them together.

Nobody cared. So I looked harder at the NSE and realized it was SCCS under the covers. I built a pile of perl that gave birth to the clone/pull/push model (though I bundled all of that into one command called resync). It wasn't truly distributed in that the "protocol" was NFS, I just didn't do that part, but the model was the git model you are used to now minus changesets.

I made all that work with the NSE, you could bridge in and out and one by one the kernel guys gave up on NSE and moved to nselite. This was during the Solaris 5.0 bringup.

I still have the readme here: http://mcvoy.com/lm/nselite/README and here are some stats from the 2000th resync inside of Sun: http://mcvoy.com/lm/nselite/2000.txt

I was forced to stop developing nselite by the VP of the tools group, because by this time Sun knew that nselite had won and NSE had lost, so they ramped up an 8-person team to rewrite my perl in C++ (Evan later wrote a paper basically saying that was an awful idea). They took smoosh.c and never modified it, just stripped my history off (yeah, some bad blood).

Their stuff wasn't ready so I kept working, but that made them look bad: one guy with some perl scripts outpacing 8 people with a supposedly better language. So their VP came over and said "Larry, this went all the way up to Scooter, if you do one more release you're fired" and set back SCM development almost a decade; that was ~1991 and I didn't start BitKeeper until 1998. There is no doubt in my mind that if they had left me alone they would have had the first DVCS.

Fun times, I went off and did clusters in the hardware part of the company.


Wow, jackpot -- thank you! That 2000th resync is a who's who of Sun's old guard; many great technologies have been invented and many terrific companies built by the folks on that list! I would love to see the nselite man pages that the README refers to (i.e., resync(1) and resolve(1)); do you happen to still have those?

Also, even if privately... you need to name that VP. ;)


There used to be a paper on smoosh here: http://www.bitmover.com/lm/papers/smoosh.ps but it's gone now. Do you mind putting it back? I'd like to read it again.



You weren't the only one to do SCCS over NFS. The real-time computer division of Harris did it too. That version control system was already considered strange and old by 2004 when I encountered it.


1 man using perl can outpace 8 on C++. Who would've thought? /sarcasm. But seriously, I think this is one of the classic instances of what is now quite common knowledge about dynamic scripting languages: They let you get things done MUCH faster. I think the tools group learned the wrong lesson from this, but OTOH, who would want to start developing all of their new software in perl? And given that python hadn't caught on yet, there really wasn't much else out there in the field.


> there really wasn't much else out there in the field.

What about shell scripts?


Shell scripts are even more miserable to write than perl, and are missing a lot of features you want for most applications


Or LISP.


Firstly, the mention of LISP would have probably sent most of Sun screaming and running for the hills at the time. Secondly, LISP has never been very good at OS integration, one of the most important things for many software projects.


>> Evan later wrote a paper basically saying that was an awful idea

Is this paper available online? Thanks.


https://www.usenix.org/legacy/publications/library/proceedin...

And for the record, Evan was somewhat justified in not saying I had anything to do with Teamware since I made his team look like idiots, ran circles around them. On the other hand, taking smoosh.c and removing my name from the history was dishonest and a douche move. Especially since not one person on that team was capable of rewriting it.

The fact remains that Teamware is just a productized version of NSElite which was written entirely by me.

If I sound grumpy, I am. Politics shouldn't mess with history but they always do.


Good to know I (probably) got it right.


BK and Monotone begat Git


There was a paper published on a DVCS using UUCP (!) in 1980: "A distributed version control system for wide area networks", B O Donovan, http://ieeexplore.ieee.org/xpl/articleDetails.jsp?reload=tru...


The date of publication on Xplore is September 1990 though?


> Thank you for contributing to the development and evangalizing of DVCS, directly (BK) and indirectly (the ideas and inspiration for git, hg).

I concur.


> the companies acted in self interest

Sun could, at least, make a profit building workstations and servers and licensing chips.

It's actually very sad they don't build those SPARC desktops anymore.


I really doubt it. There's not enough demand to make them competitive in price-performance. The server chips have stayed badass but are still a niche market. There's even open-source SPARC HW with reference boards for sale from Gaisler. That didn't take off.

I liked the desktops but there's no money in them. The market always rejects them. So does FOSS, despite SPARC being the only open ISA with mainstream, high-performance implementations.


> There's not enough demand to make them competitive in price-performance.

There does not need to be demand: Steve Jobs (in)famously said that where there was no market, "create one".

I for one would absolutely love to be able to buy an illumos-powered A4-sized tablet which ran a SPARC V9 instruction set, plugged into a docking station, and worked with a wireless keyboard and mouse to be used as a workstation when I'm not walking around with my UNIX server in hand. Very much akin to Apple Computer's iPad Pro (or whatever they call it, I don't remember, nor is that really relevant).

But the most important point was, and still is, and always will be: it has to cost as much as the competition, or less. Sun Microsystems would just not accept that, no matter how much I tried to explain and reason with people there: "talk to the pricing committee". What does that even mean?!? Was the pricing committee composed of mute, deaf and blind people who were not capable of seeing that PC-buckets were eating Sun's lunch, or what?


"There does not need to be demand: Steve Jobs (in)famously said that where there was no market, "create one"."

What people forget is that Steve Jobs was a repeated failure at doing that, got fired, did soul-searching, succeeded with NeXT, got acquired, and then started doing what you describe. Even he failed more than he succeeded at that stuff. A startup trying to one-off create a market just for a non-competitive chip is going to face the dreaded 90+% failure rate.

"But the most important point was, and still is, and always will be: it has to cost as much as the competition, or less."

That's why the high-security stuff never makes it. It takes at least a 30% premium on average per component. I totally believe your words fell on deaf ears at Sun. I'd have bought SunBlades myself if I could afford them. I could afford nice PC's. So, I bought nice PC's. Amazing that the echo chamber was so loud in there that they couldn't make that connection.

"I for one would absolutely love to be able to buy an illumos-powered A4-sized tablet which ran a SPARC V9 instruction set"

That's actually feasible given the one I promote is a 4-core, 1+GHz embedded chip that should be low power on a decent process node.

http://www.gaisler.com/index.php/products/processors/leon4?t...

The main issue is the ecosystem and components like browsers with JIT's that must be ported to SPARC. One company managed to port Android to MIPS but that was a lot of work. Such things could probably be done for SPARC as well. The trick is implementing the ASIC, implementing the product, porting critical software, and then charging enough to recover that but not more than competition whose work is already done for them. Tricky, tricky.

Raptor's Talos Workstation, if people buy it, will provide one model for how this might happen. Could get ASIC's on 45-65nm really quick, use SMP given per-chip cost is $10-30, port Solaris/Linux w/ containers, put in a shitload of RAM, and sell it for $3,000-6,000 for VM-based use and development. It would still take thousands of units to recover cost. Might need government sales.


The problem is that the number of people who are into Illumos and want a portable Unix server is insignificant. All products come with fixed overheads (e.g. cost of tooling to start production), which have to be divided over the likely customer base. Small customer base == each customer pays a bigger share of the fixed overheads.

Basically, you want something to suit you and a small number of other people, but you won't pay for the cost of having something that "tailor made". You will only pay for the high-volume, lower-cost, more general product. So... that's all you get.


The problems were Sun's failure to recognize that cheap IBM PC clones would disrupt them like they disrupted mainframes, and Sun not trying hard enough to overcome Wintel's network effect. Sun needed to die-shrink old designs to get something that they could fabricate at low cost and compete on price. Such a thing would have cannibalized Sun workstation sales, which might be why they never did it.


Could be. It's gone now, though.


You again! (:-) You know that the VHDL code for UltraSPARC T1 and T2 has been open sourced? If I had enough knowledge about synthesizing code inside of an FPGA, I would be building my own SPARC-based servers like there is no tomorrow!

As long as the code for those processors remains free, and a license to implement a SPARC ISA compliant processor only costs $50, the SPARC will never really, truly be gone, especially not for those people capable of synthesizing their own FPGA's, or even building their own hardware.

Some people did exactly that, a while back. Too bad they didn't turn their designs into ready-to-buy servers.


" You know that the VHDL code for UltraSPARC T1 and T2 has been open sourced?"

That was exciting. It could do well even on a 200MHz FPGA given threading performance. Then, use eASIC's Nextreme's to convert it to structured ASIC for better speed, power-usage, and security from dynamic attacks on FPGA. That it's Oracle and they're sue-happy concerns me. I'd read that "GPL" license very carefully just in case they tweaked it. If it's safe, then drop one of those badboys (yes, the T2) on the best node we can afford with key I/O. Can use microcontrollers on the board for the rest as they're dirt cheap. Same model as Raptor's as I explained in another comment.

Alternatively, use Gaisler as Leon3 is GPL and designed for customization. Simple, too. Leon4 is probably inexpensive compared to ARM, etc.

"SPARC ISA compliant processor only costs $50, the SPARC will never really, truly be gone, especially not for those people capable of synthesizing their own FPGA's,"

Yep.

"Too bad they didn't turn their designs into ready-to-buy servers."

Not quite a server but available and illustrates your point:

http://www.gaisler.com/index.php/products/systems/gr-rasta?t...

Btw, I found this accidentally while looking for a production version of OpenSPARC:

http://palms.ee.princeton.edu/node/381


FPGAs are not magic. A SPARC implemented on an FPGA will never be competitive with consumer-level x86s.


No, they are magic: arbitrary hardware designs run without the cost of chip fabrication. Two non-profit FPGA's, one for performance at 28nm & one for embedded at 28SLPnm, would totally address the custom hardware and subversion problem given we could just keep checking on that one. The PPC cores and soon Intel Xeons already show what a good CPU plus FPGA acceleration w/ local memory can do for applications.

Yeah, buddy, they're like magic hardware. Even if they aren't ASIC-competitive for the best ASIC's. Still magic with a market share and diverse applications that shows it. :)


I thought that was clear? Apparently not...

FPGA's are cheap and good enough for prototyping; once one has working VHDL / Verilog code, it's tapeout time.


Security-wise, an FPGA is superior to consumer-level x86 processors. How do you backdoor an FPGA?


In more ways than you'd know. They're already pre-backdoored like almost all other chips for debugging purposes, in what's called Design for Test or scan chains or scan probes or something. Much hardware hacking involves getting to those suckers to see what the chip is doing.

Now, for remote attacks, you can embed RF circuitry in them that listens to any of that. You can embed circuits that receive an incoming command, then dump its SRAM contents. You might modify the I/O circuitry to recognize a trapdoor command that runs incoming data as privileged instructions. You can put a microcontroller in there connected to PCI to do the same for host PC attacks. I know, that would be the first option, but I was having too much fun with RTL and transistor level. :)


> There's not enough demand to make them competitive in price-performance

High-end chips only need to compete with other high-end chips. And low-end SPARC will not take off now that x86 has taken over.


You make that sound easy. The POWER and SPARC T/M chips are amazing. Yet, Intel Xeon still dominates to the point that they can charge less and invest more. That's with one hell of a head start from Sun and IBM. The others... Alpha, MIPS, and PA-RISC... folded in server markets, with Itanium soon to follow.

You can't just compete directly in that market: you have to convince them yours is worth buying for less performance at higher price and watts. Itanium tried with reliability & security advantages. Failed. Fortunately, Dover is about to try with RISC-V combined with SAFE architecture (crash-safe.org) for embedded stuff. We'll see what happens there.


They do, sort of. Intel were using SPARC cores in at least one of their "Management Engine" devices in chipsets recently anyway. ;)



Yeah, that. I think he got a PhD for RCS and what he should have gotten was shown the door. RCS sucks; SCCS is far, far better.

Just as an example, RCS could have been faster than SCCS if they had stored at the top of the file the offset to where the tip revision starts. You read the first block, then seek past all the stuff you don't need, start reading where the tip is.

But RCS doesn't do that; it reads all the data until it gets to the tip. Which means it is reading as much data as SCCS but only has sorta good performance for the tip. SCCS is more compact and is as fast or faster for any rev.

And BK blows them both away; we lz4-compress everything, which means we do less I/O.

RCS sucked but had good marketing. We're here to say that SCCS was a better design.


> in both cases, the companies acted in self interest. Neither had the guts to walk away from their existing revenue stream.

Why does bitmover have the guts now?


He explained the move in another comment: https://news.ycombinator.com/item?id=11668492


I thought SCCS had the same problems as RCS. What did it do differently?


RCS is patch based: the most recent version is kept in clear text, the previous version is stored as a reverse patch, and so on back to the first version. So getting the most recent version could be fast (it isn't), but the farther back you go in history the more time it takes. And branches are even worse: you have to patch backwards to the branch point and then forwards to the tip of the branch.
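
(To make that concrete, here's a toy sketch of the reverse-delta idea in Python. The patch format is invented purely for illustration, it is not RCS's actual on-disk format, but it shows why the tip is cheap and everything else gets slower the further back you go.)

  # Toy model: the tip is stored as plain text; each older revision is only
  # reachable by applying a chain of reverse patches, newest to oldest.
  def apply_reverse_patch(lines, patch):
      """A 'patch' here is just a list of (op, index, text) tuples."""
      lines = list(lines)
      for op, index, text in patch:
          if op == "del":        # the newer rev added this line; remove it
              del lines[index]
          elif op == "ins":      # the newer rev deleted this line; restore it
              lines.insert(index, text)
      return lines

  def checkout(tip_lines, reverse_patches, steps_back):
      """Cost is proportional to how far back from the tip you go."""
      lines = tip_lines
      for patch in reverse_patches[:steps_back]:
          lines = apply_reverse_patch(lines, patch)
      return lines

  tip = ["line one", "line two, new in rev 3"]
  history = [                        # newest-to-oldest reverse patches
      [("del", 1, None)],            # rev 3 -> rev 2
      [("ins", 1, "old line two")],  # rev 2 -> rev 1
  ]
  print(checkout(tip, history, 0))   # the tip: no patching at all
  print(checkout(tip, history, 2))   # rev 1: every intermediate patch applied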

SCCS is a "weave". The time to get the tip is the same as the time to get the first version or any version. The file format looks like

  ^AI 1
  this is the first line in the first version.
  ^AE 1
That's "insert in version 1" data "end of insert for version one".

Now lets say you added another line in version 2:

  ^AI 1
  this is the first line in the first version.
  ^AE 1
  ^AI 2
  this is the line that was added in the second version
  ^AE 2
So how do you get a particular version? You build up the set of versions that are in that version. In version 1, that's just "1"; in version 2, that's "1, 2". So if you wanted to get version 1 you sweep through the file and print anything that's in your set. You print the first line, get to the ^AI 2, look to see if that's in your set; it isn't, so you skip until you get to the ^AE 2.

So any version takes the same time. And that time is fast: the largest file in our source base is slib.c, 18K lines, and it checks out in 20 milliseconds.
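
(If you want to play with the idea, here's a minimal sketch of that sweep in Python, assuming the simplified insert-only format above; the real SCCS file has more record types, but the extraction loop is the same shape.)

  # Minimal weave extraction for the simplified insert-only format above:
  # keep a line if the ^AI block it sits in belongs to the wanted version's set.
  MARK = "\x01"                      # the ^A control character

  def get_version(weave_lines, version_set):
      keep, out = False, []
      for line in weave_lines:
          if line.startswith(MARK + "I "):
              keep = int(line.split()[1]) in version_set
          elif line.startswith(MARK + "E "):
              keep = False
          elif keep:
              out.append(line)
      return out

  weave = [
      "\x01I 1",
      "this is the first line in the first version.",
      "\x01E 1",
      "\x01I 2",
      "this is the line that was added in the second version",
      "\x01E 2",
  ]
  print(get_version(weave, {1}))     # version 1
  print(get_version(weave, {1, 2}))  # version 2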


I had... much too extensive experience both with SCCS weaves and with hacking them way back in the day; I even wrote something which sounds very like your smoosh, only I called it 'fuse'. However, I wrote 'fuse' as a side-effect of something else, 'fission', which split a shorter history out of an SCCS file by wholesale discarding of irrelevant, er, strands and of the history relating to them. I did this because the weave is utterly terrible as soon as you start recording anything which isn't plain text or which has many changes in each version, and we were recording multimegabyte binary files in it by uuencoding them first (yes, I know, the decision was made way above my pay grade by people who had no idea how terrible an idea it was).

Where RCS or indeed git would have handled this reasonably well (indeed the xdelta used for git packfiles would have eaten it for lunch with no trouble), in SCCS, or anything weave-based, it was an utter disaster. Every checkin doubled the number of weaves in the file, an exponential growth without end which soon led to multigigabyte files which xdelta could have represented as megabytes at most. Every one-byte addition or removal doubled up everything from that point on.

And here's where the terribleness of the 'every version takes the same time' decision becomes clear. In a version control system, you want the history of later versions (or of tips of branches) overwhelmingly often: anything that optimizes access time for things elsewhere in the history at the expense of this is the wrong decision.

When I left, years before someone more courageous than me transitioned the whole appalling mess to git, our largest file was 14GiB and took more than half an hour to check out.

The SCCS weave is terrible. (It's exactly as good a format as you'd expect for the time, since it is essentially an ed script with different characters. It was a sensible decision for back then, but we really should put the bloody thing out of its misery, and ours.)


Huh. Now I wonder how BK resolved this.


Yeah. I suspect the answer is 'store all binary data in BAM', which then uses some different encoding for the binary stuff -- but that then makes my gittish soul wonder why not just use that encoding for everything. (It works for git packfiles... though 'git gc' on large repos is a total memory and CPU hog, one presumes that whatever delta encoding BAM uses is not.)


We support the uuencode horror for compat (and for smaller binaries that don't change), but the answer for binaries is BAM; there is no data in the weave for BAM files.

I don't agree that the weave is horrible; it's fantastic for text. Try git blame on a file in a repo with a lot of history, then try the same thing in BK. Orders and orders of magnitude faster.

And go understand smerge.c and the weave lightbulb will come on.


Yeah, that's the problem; it's optimizing for the wrong thing. It speeds up blame at the expense of absolutely every other operation you ever need to carry out; the only thing which avoids reading (or, for checkins, writing) the whole file is a simple log. Blame is a relatively rare operation: its needs should not dominate the representation.

The fact that the largest file you mention is frankly tiny shows why your performance was good: we had ~50,000 line text files (yeah, I know, damn copy-and-paste coders) with a thousand-odd revisions and a resulting SCCS filesize exceeding three million lines, and every one of those lines had to be read on every checkout: dozens to hundreds of megabytes, and of course the cache would hardly ever be hot where that much data was concerned, so it all had to come off the disk and/or across NFS, taking tens of seconds or more in many cases. RCS could have avoided reading all but 50,000 of them in the common case of checkouts of most recent changes. (git would have reduced read volume even more because although it is deltified the chains are of finite length, unlike the weave, and all the data is compressed.)


Give me a file that was slow and let's see how it is in BitKeeper. I bet you'll be impressed.

50K lines is not even 3x bigger than the file I mentioned. Which we check out in 20 milliseconds.

As for optimizing blame, you are missing the point: it's not blame, it's merge; it's copy by reference rather than copy by value.


I'd do that if I was still working there. I can probably still get hold of a horror case but it'll take negotiation :)

(And yes, optimizing merge matters too, indeed it was a huge part of git's raison d'etre -- but, again, one usually merges with the stuff at the tip of tree: merging against something you did five years ago is rare, even if it's at a branch tip, and even rarer otherwise. Having to rewrite all the unmodified ancient stuff in the weave merely because of a merge at the tip seems wrong.)

(Now I'm tempted to go and import the Linux kernel or all of the GCC SVN repo into SCCS just to see how big the largest weave is. I have clearly gone insane from the summer heat. Stop me before I ci again!)


Our busiest file is 400K checked out and about 1MB for the history file lz4 compressed. Uncompressed is 2.2M and the weave is 1.7M of that.

Doesn't seem bad to me. The weave is big for binaries, we imported 20 years of Solaris stuff once and the history was 1.1x the size of the checked out files.


Presumably if you then delete that first line in the third version, you get something like

  ^AI 1
  this is the first line in the first version.
  ^AE 1
  ^AD 3
  ^AI 2
  this is the line that was added in the second version
  ^AE 2

?


Close. By the way there is a bk _scat command (sccs cat, not poop) that dumps the ascii file format so you can try this and see.

The delete needs to be an envelope around the insert so you get

  ^AD 3
  ^AI 1
  this is the first line in the first version.
  ^AE 1
  ^AE 3
  ^AI 2
  this is the line that was added in the second version
  ^AE 2
That whole weave thing is really cool. The only person outside of BK land that got it was Bram Cohen in Codeville; I think he had a weave.
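
(Extending the earlier sketch to handle the delete envelope: a line is printed only if its ^AI serial is in the wanted set and no enclosing ^AD serial is. Again a simplified illustration in Python, not the real file parser; blocks nest, so we keep a small stack.)

  # Same sweep as before, but with ^AD envelopes: track open blocks on a
  # stack; emit a line only if it's inserted by, and not deleted in, the set.
  MARK = "\x01"

  def get_version(weave_lines, version_set):
      stack, out = [], []           # open (kind, serial) blocks, innermost last
      for line in weave_lines:
          if line.startswith(MARK + "I ") or line.startswith(MARK + "D "):
              stack.append((line[1], int(line.split()[1])))
          elif line.startswith(MARK + "E "):
              stack.pop()
          else:
              inserted = any(k == "I" and s in version_set for k, s in stack)
              deleted = any(k == "D" and s in version_set for k, s in stack)
              if inserted and not deleted:
                  out.append(line)
      return out

  weave = [
      "\x01D 3",
      "\x01I 1",
      "this is the first line in the first version.",
      "\x01E 1",
      "\x01E 3",
      "\x01I 2",
      "this is the line that was added in the second version",
      "\x01E 2",
  ]
  for v, ancestors in ((1, {1}), (2, {1, 2}), (3, {1, 2, 3})):
      print(v, get_version(weave, ancestors))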


It's sort of surprising then that a delete doesn't just add an end-version on the insert instead:

  ^AI 1..2
  this is the first line in the first version.
  ^AE 1..2
  ^AI 2
  this is the line that was added in the second version
  ^AE 2
This way the reconstruction process wouldn't need to track blocks-within-blocks.


Interesting. "^AI Spec" where Spec feeds into a predicate f(Spec, Version) to control printing a particular Version? Looks like you could drop the ^AE lines.


Sounds like equivalent representations, no? Limit the scope of the I lines or wrap them in D lines.


Probably missing something. Both are working on one file at a time and have some form of changeset. One is adding from back to forward (kind of), the other from forward to back (RCS). Not sure where the reduction of work is coming from.


Sun must like the scat names :) I used to use the scat tool for debugging core files.


That actually is pretty neat


Aha, so that's where bzr got it from. :-)


bzr got more than that from BK; it got one of my favorite things, per-file checkin comments. I liken those to regression tests: when you start out you don't really value them, but over time the value builds up. The fact that Git doesn't have them bugs me to no end. BZR was smart enough to copy that feature and that's why MySQL chose bzr when they left BK.

The thing bzr didn't care about, sadly, is performance. An engineer at Intel once said to me, firmly, "Performance is a feature".


Git's attitude, AFAIK, is that if you want per-file comments, make each file its own checkin. There are pros and cons to this.

Performance as a feature, OTOH, is one of Linus's three tenets of VCS. To quote him, "If you aren't distributed, you're not worth using. If you're not fast, you're not worth using. And if you can't guarantee that the bits I get out are the exact same bits I put in, you're not worth using."


Big fan of `git commit -vp` here. Enables me to separate the commits according to concerns.


I suppose that in Git if you wanted to group a bunch of these commits together you could do so with a merge commit.


If I remember the history correctly, per-file commit messages were actually a feature that was quickly hacked in to get MySQL on board. It did not have that before those MySQL talks and I don't think it was very popular after.

Performance indeed killed bzr. Git was good enough and much faster, so people just got used to its weirdness.


> Git was good enough and much faster, so people just got used to its weirdness.

And boy is git weird! In Mercurial, I can mess with the file all day long after scheduling it for a commit, but one can forget that in git: marking a file for addition actually snapshots a file at addition time, and I have read that that is actually considered a feature. It's like I already committed the file, except that I didn't. This is the #1 reason why I haven't migrated from Mercurial to git yet, and now with Bitkeeper free and open source, chances are good I never will have to move to git. W00t!!!

I just do not get it... what exactly does snapshotting a file before a commit buy me?


It's probably the same idea as the one behind committing once in Mercurial and then using commit --amend repeatedly as you refine the changes. Git's method sounds like it avoids a pitfall in that method by holding your last changeset in a special area rather than dumping it into a commit so that you can't accidentally push it.

I often amend my latest commit as a way to build a set of changes without losing my latest functional change.


I always do a hg diff before I commit. If in spite of that I still screw up, I do a hg rollback, and if I already pushed, I either roll back on all the nodes, or I open a bug, and simply commit a bug fix with a reference to the bug in the bug tracking system. I've been using Mercurial since 2007 and I've yet to use --amend.


> In Mercurial, I can mess with the file all day long after scheduling it for a commit

OTOH, I find that behavior weird as I regularly add files to the index as I work. If a test breaks and I fix it, I can review the changes via git diff (compares the index to the working copy) and then the changes in total via git diff HEAD (compares the HEAD commit to the working copy).


Did you know you can do 'git add -N'? That will actually just schedule the file to be added, but won't snapshot it.


Cool. I've used bzr but never knew about per-file comments.


10 or even 20% better performance is not a feature. But when tools or features get a few times faster or more, their usage model changes - which means they become different features.


In short, RCS maintains a clean copy of the head revision, and a set of reverse patches to be applied to recreate older revisions. SCCS maintains a sequence of blocks of lines that were added or deleted at the same time, and any revision can be extracted in the same amount of time by scanning the blocks and retaining those that are pertinent.

Really old school revision control systems, like CDC's MODIFY and Cray's clone UPDATE, were kind of like SCCS. Each line (actually card image!) was tagged with the ids of the mods that created and (if no longer active) deleted it.


| CDC's MODIFY and Cray's clone UPDATE, were kind of like SCCS

Do you have references? I've heard of these but haven't come across details after much creative searching since they are common words.



Thank you! A peek into the (as far as I know) root node of source control history.


I've heard that too. It comes from card readers somehow.


I've read that "sourceware" article before, in the distant past when it was still a roughly accurate picture of the market (maybe 1995 or 1996). It's weird to read it again now, in a world that is so remarkably changed. Linux, the scrappy little upstart with a million or so users at the time of the paper, is now the most popular OS (or at least kernel) on the planet, powering billions of phones and servers. NT was viewed as the unfortunate but inevitable future of server operating systems.

I remember looking at IT jobs back then, and seeing a business world covered in Windows NT machines; I even got my MCSE (alongside some UNIX certifications that I was more excited about), because of it. Looking at jobs now, the difference is remarkable, to say the least. Nearly every core technology a system administrator needs to know is Open Source and almost certainly running on Linux.

And, the funny thing is that the general prescription (make a great Open Source UNIX) is exactly what it took to save UNIX. It just didn't involve any of the big UNIX vendors in a significant way (the ones spending a gazillion dollars on UNIX development at the time). Linux got better faster than Sun got smarter, and ate everybody's lunch, including Microsoft. Innovator's Dilemma strikes again.

Apple is an interesting blip on the UNIX history radar, too...though, they're likely to lose to the same market forces in the end, as phones become commodities. I'm a bit concerned that it's going to be Android, however, that wins the mobile world since Android is nowhere near the ideal OS from an Open Source and ethical perspective; but, I guess they got the bits right that Larry was suggesting needed to be right.

Anyway, it was a weird flashback to read that article. Things change, and on a scale that seems slow, until you look back on it, and see it's "only" been a couple of decades. In the grand scheme of things, and compared to the motion of technology prior to the 1900s, that's a blink of an eye.


Eh. Linux does have a LOT of problems. Well, so does everything, but it's not like it's "great." More like "good."

But yeah, it was weird that everybody thought NT was going to be the future. And now, MS has opensourced a good deal of infrastructure, is working with Node, has announced an integrated POSIX environment for Windows. And since it's in corporate, it might even be able to fix the fork(2) performance problems.


Great is a relative term. But, can you name an OS, especially a UNIX, that is better in the general case? By "general case", I mean, good for just about anything, even if there's something better for some niche or role. Also, take into account the world we live in: More computing happens on servers and phones than on desktops and laptops; and judge the OS based on how it's doing in those roles.

I sincerely consider Linux a great UNIX. Probably the best UNIX that's existed, thus far. There are warts, sure. Technically, Solaris had (and still has, in IllumOS and SmartOS) a small handful of superior features and capabilities (at this point one can list them on one hand, and one could also list some "almost-there" similar techs on Linux). But, I assume you've used Solaris (or some other commercial UNIX) enough to have an opinion...can you honestly say you enjoyed working on it more than Linux? The userspace on Solaris was always drastically worse than Linux, unless you installed a ton of GNU utilities, a better desktop, etc. But, Linux brought us a UNIX we could realistically use anywhere, and at a price anyone could afford. That's a miracle for a kid that grew up lusting after an Amiga 3000UX (because it was the closest thing to an SGI Indy I could imagine being able to afford).


Fair 'nuff. And no, I haven't used commercial UNIXes all that much, but I have experienced plenty of Linux's warts. I do agree with a lot of those points, but containers on Linux just aren't there, systemd is a mess that's going to get pushed in no matter what we say, and there are plenty of issues to be had, although the latter is true of any UNIX. If you want to know what the rest of the issues are, just start googling. And while you're at it, listen to some of Bryan Cantrill's talks. They are biased (obviously), but they're entertaining, and they do point out some things that I think are real problems (POSIX conformance (MADV_DONTNEED), and epoll semantics, mainly).

Oh, and don't flame me for speaking in ignorance. I've been a Linux user for half a decade at least now, and I CAN say I see problems with it. I can also say, as a person who is a programmer, that some of the things that Cantrill pointed out are actually evil. Note, however, that I don't claim Solaris, or any other OS, is better. Every UNIX is utterly fucked in some respect. I just know Linux's flaws the best.

By the way, I've been trying to get Amiga emulation working for a while. It basically works at this point, but the *UAEs are a misery on UNIX systems. Without any kind of loader, you have to spend 10 minutes editing the config every time you want to play a game. But if you're in any way interested in the history you lived through, check out youtube.com/watch?v=Tv6aJRGpz_A


Those issues seem so trivial with the benefit of hindsight and a memory of what it was like to deploy an application to multiple UNIX variants. Having one standard (we can call that standard "it ain't quite POSIX, but it runs great on Linux") is so superior to the mine field that was all of the UNIXen in 199x, that I don't even register it as a problem. Shoot, until you've had to use autotools or custom build a makefile for a half dozen different C compilers, kernels, libc, and so on, you don't know from POSIX "standards" pain.

But now my beard is showing and I'm ranting. My point is this: it took something from completely outside of the commercial UNIX ecosystem, so far out in left field that none of the UNIX bosses (or Microsoft) saw it as a threat until it was far too late...and it took something that was good, really good in at least some regards, that it would have passionate fans even very early in. Linux did that. And, compared to everything else (pretty much everything else that's ever existed, IMHO), it's great.

And, I'm on board the retro computing bandwagon. I have a real live Commodore 64 and an Atari 130xe. I'd like to one day find an Amiga 1200 in good shape, but because I live in an RV and travel fulltime, I don't have a lot of room to spare. But I do like to tinker and reminisce.


> until you've had to use autotools or custom build a makefile for a half dozen different C compilers, kernels, libc, and so on, you don't know from POSIX "standards" pain.

Truer words have ne'er been spoken. My first big boy job involved building and maintaining a large open source stack on top of AIX. These days I occasionally experience hiccups related to OpenBSD not being Linux. Problems aren't even in the same league. That said, the thrill of getting stuff to work on AIX was certainly greater (and purchased with more human suffering).


Ah, the agony that I can only imagine. The many Linux distros are bad enough...


You know, I think you're right. You have a good point. Thanks.

And I'd love to have some real retro computers, but I've got no money, and most of the really interesting ones are from the UK. Ah well...


> Great is a relative term. But, can you name an OS, especially a UNIX, that is better in the general case? By "general case", I mean, good for just about anything, even if there's something better for some niche or role. Also, take into account the world we live in: More computing happens on servers and phones than on desktops and laptops; and judge the OS based on how it's doing in those roles.

Okay then, SmartOS. Why is an exercise left for the reader, because it would just take too much and too long to list and explain all the things it does better, faster and cheaper than Linux in server space; that's material rife for an entire book.

> can you honestly say you enjoyed working on it more than Linux?

Enjoyed it?!? Love it, I love working with Solaris 10 and SmartOS! It's such a joy to have an OS that isn't broken, one which actually does what it is supposed to do (run fast, be efficient, protect my data, be designed to be correct). When I am forced to work with Linux (which I am, at work, 100% of the time, and I hate it), it feels like I am on an operating system from the past century: ext3 / ext4 (no XFS for us yet, and even that is ancient compared to ZFS!), memory overcommit, data corruption, no backward compatibility, navigating the minefield of GNU libc and userland misfeatures and "enhancements". It's horrible. I hate it.

> The userspace on Solaris was always drastically worse than Linux,

Are you kidding me? System V is great; it's grep -R and tar -z that I hate, because they only work on GNU! Horrid!!!

> But, Linux brought us a UNIX we could realistically use anywhere, and at a price anyone could afford.

You do realize that if you take an illumos derived OS like SmartOS and Linux, and run the same workload on the same cheap intel hardware, SmartOS is usually going to be faster, and if you are virtualizing, more efficient too, because it uses zones, right? Right?

It's like this: when I run SmartOS, it's like I'm gliding around in an ultramodern, powerful, economical mazda6 diesel (the 175 HP / 6 speed Euro sportwagon version); I slam the gas pedal and I'm doing 220 km/h without even feeling it and look, I'm in Salzburg already! When I'm on Linux, I'm in that idiotic Prius abomination again: not only do I not have any power, but I end up using more fuel too, even though it's a hybrid, and I'm on I-80 somewhere in Iowa. That's how I'd compare SmartOS to Linux.


> It's like this: when I run SmartOS, it's like I'm gliding around in an ultramodern, powerful, economical mazda6 diesel (the 175 HP / 6 speed Euro sportwagon version);

"Tout ce qui est excessif est insignifiant"


"Yeah Tenzin, I... still don't speak that."


That was an awesome rant. You must work with Brian.

Edit: Bryan Cantrill, spelled it wrong.


Nothing would make me happier professionally than to have the opportunity to work with Bryan (sadly, we've never met, although I did work in Silicon Valley for a while). For instance, those times when I wouldn't be writing C, I could finally have an orgy of AWK one-liners and somebody would appreciate it without me having to defend why I used AWK.


I'm struck by how much this sounds like a Linux fan ranting back in 1995, when Windows and "real" UNIX was king. The underdog rants were rampant back then (I'm sure I penned a few of them myself).

I think the assumption you're making is that people choose Linux out of ignorance (and, I think the ignorance goes both ways; folks using Solaris have been so accustomed to Zones, ZFS, and dtrace being the unique characteristic of Solaris for so long that they aren't aware of Linux' progress in all of those areas). But, there are folks who know Solaris (and its children) who still choose Linux based on its merits. We support zones in our products/projects (because Sun paid for the support, and Joyent supported us in making Solaris-related enhancements), and until a few years ago it was, hands-down, the best container technology going.

Linux has a reasonable container story now; the fact that you don't like how some people are using it (I think Docker is a mess, and I assume you agree) doesn't mean Linux doesn't have the technology for doing it well built in. LXC can be used extremely similarly to Zones, and there's a wide variety of tools out there to make it easy to manage (I work on a GUI that treats Zones and LXC very similarly, and you can do roughly the same things in the same ways).

"Are you kidding me? System V is great, it's grep -R and tar -z that I hate, because it only works on GNU! Horrid!!!"

Are you really complaining about being able to gzip and tar something in one command? Is that a thing that's actually happening in this conversation?

I'll just say I've never sat down at a production Sun system that didn't already have the GNU utilities installed by some prior administrator. It's been a while since I've sat down at a Sun system, but it was standard practice in the 90s to install GNU from the get go. Free compiler that worked on every OS and for building everything? Hell yes. Better grep? Sure, bring it. People went out of their way to install GNU because it was better than the system standard, and because it opened doors to a whole world of free, source-available, software.

"You do realize that if you take an illumos derived OS like SmartOS and Linux, and run the same workload on the same cheap intel hardware, SmartOS is usually going to be faster"

Citation needed. Some workloads will be faster on SmartOS. Others will be faster on Linux. Given that almost everything is developed and deployed on Linux first and most frequently, I wouldn't be surprised to see Linux perform better in the majority of cases; but, I doubt it's more than a few percent difference in any common case. The cost of having or training staff to handle two operating systems (because you're going to have to have some Linux boxes, no matter what) probably outweighs buying an extra server or two.

"and if you are virtualizing, more efficient too, because it uses zones, right? Right?"

Citation needed, again. Zones are great. I like Zones a lot. But, Linux has containers; LXC is not virtualization, it is a container, just like Zones. Zones has some smarts for interacting with ZFS filesystems and that's cool and all, but a lot of the same capabilities exist with LVM and LXC.

I feel like you're arguing against a straw man in a lot of cases here.

Why do you believe LXC (or other namespace-based containers on Linux) are inherently inefficient, compared to Zones, which uses a very similar technique to implement?

And, it's not Linux' fault the systems you manage are stuck on ext4. There are other filesystems for Linux; XFS+LVM is great. A little more complex to manage than ZFS, but not by a horrifying amount. So, you have to read two manpages instead of one. Not a big deal. And, there's valid reasons the volume management and filesystem features are kept independent in the kernel (I dunno if you remember the discussions about ZFS inclusion in Linux; separate VM and FS was a decision made many years ago, based on a lot of discussion). Almost any filesystem on Linux has LVM, so, filesystems on Linux get snapshots and tons of other features practically for free. That's pretty neat.

Anyway, I think SmartOS is cool. I tinker with it every now and then, and have even considered doing something serious with it. But, I just don't find it compellingly superior to Linux. Certainly not enough to give up all of the benefits Linux provides that SmartOS does not (better package management, vast ecosystem and community, better userland even now, vastly better hardware support even on servers, etc.).


I predate the Solaris stuff, I'm not a fan. I liked SunOS which was a bugfixed and enhanced BSD. When I wrote the sourceos paper I was talking about SunOS. (I lied, I overlapped with Solaris but I try and block that out)

All that said, Sun had an ethos of caring. In the early days of BitMover, Amy had some quote about the Sun man pages versus the Linux man pages; if someone can find that, it's awesome. We keep a Sun machine in our cluster just so we can go read sane man pages about sed or ed or awk or whatever. Linux man pages suck.

Sun got shoved into having to care about System V and it sucked. I hated it and left, so did a bunch of other people. But Sun carried on and the ethos of caring carried on and Bryan and crew were a big part of that. My _guess_ is that Solaris and its follow ons are actually pleasant. I'll be pissed if I install it and it doesn't have all the GNU goodness. If that's the case then you are right, they don't get it.

What I expect to see is goodness plus careful curating. That's the Sun way.


I agree that GNU man pages suck. At least the atrocity that was "this man page is a stub, use info for the real docs" is gone now (I don't know if GNU stopped trying to force me to use info, or if distros fix it downstream). I have always hated info and the persistent nagging that GNU docs used to try to make people use it.

SmartOS is nice. I've always thought so and I have a lot of respect for the folks working on it. But, it isn't nice enough to overcome the negatives of being a tiny niche system. Linux has orders of magnitude more people working on it (and many of those people are also very smart). That's hard to beat.


On the subject of manual pages, a few days ago: https://news.ycombinator.com/item?id=11643347


So, why isn't any major hosting provider offering a multi-tenant Linux container hosting service directly on bare-metal Linux servers, whereas Joyent is providing Docker-compatible container hosting on SmartOS using LX-branded zones? Does anyone trust the security of Linux namespaces and cgroups in a multi-tenant environment? That's the one thing that SmartOS really seems to have going for it.


It's not just a matter of trust: namespaces and cgroups break like graham crackers. But you knew that already. ;)


> Citation needed, again. Zones are great. I like Zones a lot. But, Linux has containers; LXC is not virtualization, it is a container, just like Zones. Zones has some smarts for interacting with ZFS filesystems and that's cool and all, but a lot of the same capabilities exist with LVS and LXC.

How about simple logic instead? I know zones work, because they have been in use in the enterprises since 2006, and they are easy to work with and reason about; if I have the same body of software available on a system with the original lightweight virtualization as I do on Linux, and my goal is data integrity, self-healing, and operational stability, what is my incentive to running a conceptual knock-off copy of zones, LXC? To me, the choice is obvious: design the solution on top of the tried and tested, original substrate, rather than use a knock-off, especially since the acquisition cost of both is zero, and I already know from experience that investing in zones pays profit and dividends down the road, because I ran them before in production environments. I like profits, and the only thing I like better than engineering profits are engineering profits with dividends. That, and sleeping through my nights without being pulled into emergency conference calls about some idiotic priority 1 incident. Incident which could have easily been avoided altogether, if I had been running on SmartOS with ZFS and zones. Based on multiple true stories, and don't even get me started on the dismal redhat "support", where redhat support often ends up in a shootout with customers[1], rather than fixing customer's problems, or being honest and admitting they do not have a clue what is broken where, nor how to fix it.

> And, it's not Linux' fault the systems you manage are stuck on ext4. There are other filesystems for Linux; XFS+LVM is great.

Did you know that LVM is an incomplete knock-off of HP-UX's LVM, which in turn is a licensed fork of Veritas' VxVM? Again, why would I waste my precious time, and run up financial engineering costs running a knock-off, when I can just run SmartOS and have ZFS built in? The logic does not check out, and financial aspects even less so.

On top of that, did you know that not all versions of the Linux kernel provide LVM write barrier support? And did you know that not all versions of the Linux kernel provide XFS write barrier support (XFS at least will report that, while LVM will do nothing and lie that the I/O made it to stable storage, when it might still be in transit)? And did you know that to have both XFS and LVM support write barriers, one needs a particular kernel version, which is not supported in all versions of RHEL? And did you know that not all versions of LVM correctly support mirroring, and that for versions which do not require a separate logging device, the log is in memory, so if the kernel crashes, one experiences data corruption? And did you know that XFS, as awesome as it is, does not provide data integrity checksums?

And we haven't even touched upon systemd knock-off of SMF, nor have we touched upon lack of fault management architecture, nor have we touched upon how insane bonding of interfaces is in Linux, nor have we touched upon how easy it is to create virtual switches, routers and aggregations (trunks in CISCO parlance) using Crossbow in Solaris/illumos/SmartOS... when I wrote that there is enough material for a book, I was not trying to be funny.

[1] http://bugzilla.redhat.com/


The Linux "knock-off" of SMF is not systemd. It is SystemXVI. Roughly speaking.

* https://news.ycombinator.com/item?id=10212770

* https://github.com/ServiceManager/ServiceManager/blob/master...


> I'm struck by how much this sounds like a Linux fan ranting back in 1995, when Windows and "real" UNIX was king. The underdog rants were rampant back then (I'm sure I penned a few of them myself).

It sounds like a Linux fan ranting circa 1995 because that is precisely what it is: first came the rants. Then a small, underdog company named "redhat" started providing regular builds and support, while Linux was easily accessible, and subversively smuggled into enterprises. Almost 20 years later, Linux is now everywhere.

Where once there was Linux, there is now SmartOS; where once there was redhat, there is now Joyent. Where once one had to download and install Linux to run it, one now has but to plug in a USB stick, or boot SmartOS off of the network, without installing anything. Recognize the patterns?

One thing is different: while Linux has not matured yet, as evidenced, for example, by GNU libc, or by GNU binutils, or the startup subsystem perturbations, SmartOS is based on a 37-year-old code base which matured and reached operational stability about 15 years ago. The engineering required for running the code base in the biggest enterprises and government organizations has been conditioned by large and very large customers having problems running massive, mission critical infrastructure. That is why, for instance, there are extensive, comprehensive post-mortem analysis as well as debugging tools, and the mentality permeates the system design: for example, ctfconvert runs on every single binary and injects the source code and extra debugging information during the build; no performance penalty, but if you are running massive real-time trading, a database or a web cloud, when the going gets tough, one appreciates having the tools and the telemetry. For Linux that level of system introspection is utter science fiction, 20 years later, in enterprise environments, in spite of attempts to the contrary. (Try Systemtap or DTrace on Linux; try doing a post-mortem debug on the kernel, or landing into a crashed kernel, inspecting system state, patching it on the fly, and continuing execution; go ahead. I'll wait.) All that engineering that went into Solaris and then illumos and now SmartOS has passed the worst trials by fire at the biggest enterprises, and I should know, because I was there, at ground zero, and lived through it all.

All that hard, up-front engineering work that was put into it since the early '90's is now paying off, with a big fat dividend on top of the profits: it is trivial to pull down a pre-made image with imgadm(1M), feed a .JSON file to vmadm(1M), and have a fully working yet completely isolated UNIX server running at the speed of bare metal, in 25 seconds or less. Also, let us not forget almost ~14,000 software packages available, most of which are the exact same software available on Linux[1]. If writing shell code and the command line isn't your cup of tea, there is always Joyent's free, open source SmartDC web application for running the entire cloud from a GUI.

Therefore, my hope is that it will take SmartOS less than the 18 years it took Linux to become king, especially since cloud is the new reality, and SmartOS has been designed from the ground up to power massive cloud infrastructure.

> I think the assumption you're making is that people choose Linux out of ignorance

That is not an assumption, but rather my very painful and frustrating experience for the last 20 years. Most of those would-be system administrators came from Windows and lack the mentoring and UNIX insights.

> (and, I think the ignorance goes both ways; folks using Solaris have been so accustomed to Zones, ZFS, and dtrace being the unique characteristic of Solaris for so long that they aren't aware of Linux' progress in all of those areas).

I actually did lots and lots of system engineering on Linux (RHEL and CentOS, to be precise) and I am acutely aware of the limitations when compared to what Solaris-based operating systems like SmartOS can do: not even the latest and greatest CentOS or RHEL can guarantee me basic data integrity, let alone backwards compatibility. Were we in the '80s right now, I would be understanding, but if after 20 years a massive, massive army of would-be developers is incapable of getting basic things like data integrity, scheduling, or the startup/shutdown (init) subsystem working correctly, in the 21st century, I have zero understanding and zero mercy. After all, my time as a programmer and as an engineer is valuable, and there is also a financial cost involved, which is not negligible either.

> Linux has a reasonable container story now; the fact that you don't like how some people are using it (I think Docker is a mess, and I assume you agree)

Yes, I agree. The way I see it, and I've deployed very large datacenters where the focus was operational stability and data correctness, Docker is a web 2.0 developer's attempt to solve those problems, and they are flailing. Dumping files into pre-made images does not compensate for a lack of experience in lifecycle management, or a lack of experience in process design. No technology can compensate for the lack of a good process, and good process requires experience working in very large datacenters where operational stability and data integrity are primary goals. Working in the financial industry where tons of money are at stake by the second can be incredibly instructive and insightful when it comes to designing operationally correct, data-protecting, highly available and secure cloud-based applications, but the other way around does not hold.

> Are you really complaining about being able to gzip and tar something in one command? Is that a thing that's actually happening in this conversation?

Let's talk system engineering:

gzip -dc archive.tar.gz | tar xf -

will work everywhere; I do not have to think whether I am on GNU/Linux, or HP-UX, or Solaris, or SmartOS, and if I have the above non-GNU invocation in my code, I can guarantee you, in writing, that it will work everywhere without modification. If on the other hand I use:

tar xzf archive.tar.gz

I cannot guarantee that it will work on every UNIX-like system, and I know from experience I would have to fix the code to use the first method. Therefore, only one of these methods is correct and portable, and the other one is a really bad idea. If I understand this, then why do I need GNU? I do not need it, nor do I want it. Except for a few very specific cases like GNU Make, GNU tools are actually a liability. This is on GNU/Linux, to wit:

  % gcc -g hello.c -o hello
  % gdb hello hello.c
  GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-45.el5)
  ...
  ...
  ... 
  Dwarf Error: wrong version in compilation unit header (is 4, should be 2) [in module /home/user/hello]
  "/home/user/hello.c" is not a core dump: File format not recognized

  (gdb)
Now, why did that happen? Because the compiler emitted DWARF version 4 debugging information, while the debugger delivered with the OS only understands DWARF version 2. Something like that is unimaginable on illumos, and by extension, SmartOS. illumos engineers would rather drop dead than cause something like this to happen.

On top of that, on HP-UX and Solaris I have POSIX tools, so for example POSIX extended regular expressions are guaranteed to work, and the behavior of POSIX-compliant tools is well documented, well understood, and guaranteed. When one is engineering a system, especially a large distributed system which must provide data integrity and operational stability, such concerns become paramount, not to mention that the non-GNU approach is cheaper because no code must be fixed afterwards.
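
(As a concrete illustration, a tiny sketch that relies on nothing beyond POSIX sh, gzip, tar and grep:)

  # portable extraction: behaves identically on Solaris, HP-UX, the BSDs and GNU/Linux
  extract() {
      gzip -dc "$1" | tar xf -
  }
  extract archive.tar.gz

  # POSIX extended regular expressions via grep -E, no GNU-only extensions
  ls | grep -E '^release-[0-9]+\.[0-9]+$'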

[1] http://www.perkin.org.uk/posts/building-packages-at-scale.ht...


So it seems that my innocent suggestion that Linux isn't perfect may have spawned a tiny... massive holy war. Great. You know what? SmartOS is fantastic. It's great. But Linux isn't terrible. They both have their flaws. Like SmartOS not having a large binary footprint, and not having the excellent package repositories. And the fact that every partisan of one hates /proc on the other. And the fact that KVM is from Linux. And that the Docker image is actually a pretty good idea. Or the fact that SMF and launchd were some of the inspirations for systemd. Okay, now I'm just tossing fuel on the fire, by the bucketload.

Personally, I run Linux on my desktop. Insane, I know, but I can't afford a Mac, and the OS X POSIX environment just keeps getting worse. Jeez. At this rate, Cygwin and the forthcoming POSIX environment from MS will be better. But anyways, I'm not switching my system to BSD or Illumos anytime soon, despite thinking that they are Pretty Cool (TM). Why? The binary footprint. Mostly Steam. Okay. Pretty much just Steam. Insane, I know, but I'm not running (just) a server.

So in summary, all software sucks, and some may suck more than others, but I'm not gonna care until Illumos and BSD get some love from NVIDIA.

And why do you care what a crazy person thinks? Oh, you don't. By all means continue the holy war. Grab some performance statistics, and hook in the BSDs. I'll be over heating up the popcorn...


> Like SmartOS not having a large binary footprint, and not having the excellent package repositories.

http://www.perkin.org.uk/posts/building-packages-at-scale.ht...

> And the fact that KVM is from linux.

Actually it's great that it's from Linux, because one major point of embarrassment for Linux is that KVM runs faster and better on SmartOS than it does on Linux, thanks to Joyent engineers systematically using DTrace during the porting effort:

https://www.youtube.com/watch?v=cwAfJywzk8o

> Insane, I know, but I can't afford mac

Once you're able to afford one, you won't care about the desktop ever again, because your desktop will JustWork(SM).

> but I'm not gonna care until Illumos and BSD get some love from NVIDIA.

NVIDIA provides regular driver updates for both Solaris and BSD. I bought an NVIDIA GTX 980, downloaded the latest SVR4 package for my Solaris 10 desktop, and one pkgrm && pkgadd later, I was running accelerated 3D graphics on the then latest-and-greatest accelerator NVIDIA had for sale.

> By all means continue the holy war. Grab some performance statistics, and hook in the BSDs.

They take our code, and we take theirs; they help us with our bugs, and we help them with theirs. The BSDs are actually great. The BSDs have smart, capable, and competent engineers who care about engineering correct systems and writing high quality code. We love our BSD brethren.


...Annnd the BSDs are involved. Now all we need is some actual fans, and the Linux hackers should start to retaliate...

But good to know that the Solaris and BSD NVIDIA drivers work. If they work with lx-branding, I might actually consider running the thing.

>Once you're able to afford one, you won't care about the desktop ever again, because your desktop will JustWork(SM).

Yeah, no. It seems like OS X is making increasingly radical changes that make it increasingly hard for applications expecting standard POSIX to run. By the time I get the cash, nothing will work right.


"I'm a bit concerned that it's going to be Android, however, that wins the mobile world since Android is nowhere near the ideal OS from an Open Source and ethical perspective; but, I guess they got the bits right that Larry was suggesting needed to be right."

They've already won; Apple isn't coming back (did they "go thermonuclear" in the end or did that nonsense die with Mr Magical Thinking?).

Don't confuse Android with Google; you can grab the source and do what you want with it, like Cyanogenmod have, or like millions of hobbyists are doing themselves. It's all available with bog standard open source licenses - no need to worry about ethics.


Not really. AOSP died when Jean-Baptiste Queru quit in protest over AOSP not really being open source. He works at Yahoo now.

http://www.engadget.com/2013/08/07/aosp-maintenance-head-lea...


Was that before or after Android overtook iOS though? I'm not that bothered about one person's opinion on the licence used.


This is a bit funny because Mercurial is partly named after Larry.

https://groups.google.com/d/msg/mercurial_general/c3_SM3p7S1...


Just for the record this thread is where I learned that. And yup, it sorta fits. For better or worse.


The point about small communities is great. I've never seen it spelled out like that, but it captures what I was thinking perfectly. People get caught up on popularity as the only meaningful metric for success, but it really isn't.


> And you can say to BK and Larry now that it's "too late", just as Larry told Sun in 2005,

In case of software, it is never too late: as you once put it so eloquently, software does not suddenly stop working and does not have an expiration date.

If this software works and works well, then Paul Graham's revolutionary idea applies: when you choose technology, you have to ignore what other people are doing and consider only what will work best. (Common sense really, but apparently not to the rest of our industry.)

If this software works best, and does exactly what I want and need it to do, I have enough experience to know not to care that everyone else runs something like git just because that is trendy right now. (A lesson appreciated by those who run SmartOS because it is the best available technology for virtualization, cloud, and performance, instead of running Linux and Docker.)



Sun is also no more... and it's not at all clear they would've survived had Solaris been open sourced a decade earlier.

Yes, you're absolutely right that there are a ton of startups built on opensolaris (who have proprietary code they haven't and don't intend to ever give back to the community), and there is smartos/omnios/illumos as well. But none of those projects would have in any way contributed to the health of Sun Microsystems, nor provided the funding to get Solaris to where it is today. ZFS may have never seen the light of day if Solaris were open sourced in 1995.


> ZFS may have never seen the light of day if Solaris were open sourced in 1995.

It depends on how that would have affected Jeff Bonwick. If it kept him from deciding that Sun ought to develop a new filesystem, promising Matthew Ahrens a job writing one out of college and working together with Matt on it, ZFS would never have existed.


Some history. I was at Sun and Bob Hagmann was teaching at Stanford and got me to be a TA there. He retired and Stanford asked me if I would teach CS240B, so I did. Jeff Bonwick was a student and I recognized his ability and recruited him to Sun. He said "I have no experience programming in C" and I said "You are smart. I can teach you C, I can't teach you smart".

I also told him that he'd go way farther at Sun than I did and I was right, I think he made DE (Distinguished Engineer), I didn't. He played the game better. Smart guy. Him, Bryan, Bill Moore, those guys were the new Sun in my mind.


In that case, you are one of the guys on whose shoulders much of what I have done stands. Thank you.


I actually posted before I saw your comment. I guess I was right. So are you happy to finally have an open source bring-over-modify-merge VCS whose command set makes sense?


BitKeeper is a great example of what happens when you do not open source your code. I have cited it that way many times.


Except that we've been around for 18 years and made payroll without fail that entire time. Supported a team of 10-15 people every year. That's something many, many companies in the valley, including many that open sourced everything, have not done as well.

You may have done more by open sourcing whatever it is that you have done; if so congrats.


My remark was intended to cite how much farther BitKeeper could have gone had it been open source from the start rather than belittle what BitKeeper accomplished.

At work, many of my newer colleagues have backgrounds in closed source software development. We are developing software that has no exact analog to existing software, and we hope that it will have a big impact. If it becomes as important as we think it could be, then BitKeeper vs. git is a fantastic example of why our work should be open source from the start.


Except you're not Linus and you probably don't have a cult like following for any work you produce.

Linus brought DVCS to the masses, but to pretend there wasn't more at play than simply open sourcing a project and hoping it all works out is complete rubbish. People have families to feed. Closed source is not inherently evil.

It takes a unique situation to produce something like git, whose product is beyond the sum of the project itself.


I wrote enough patches to ZFSOnLinux that I have the distinction of number 2 by commit count. It was a hobby for me at first and quite frankly, I never expected it to make a difference for more than a few hundred people. Now ZoL is on millions of systems through Ubuntu in part because of my work and there are far more places using it than I can count.

Open sourcing those patches rather than keeping them to myself made a difference that was greater than anything I imagined. Similarly, the impact of making ZFS open source far surpassed the expectations of the original team at Sun. I think that making any worthwhile piece of software open source will lead to adoption beyond the scope of what its authors envisioned. All it takes is people looking for something better than what previously existed.

As for closed source being inherently evil (your words, not mine), how do you fix bugs in closed source software that a vendor is not willing to fix? How do you catch things like a hard coded password that gives root privileges? How do you know that the software is really as good as they say? It is far easier with open source software than with closed source software. Closed source software is a bad idea.


I still think you need to look at it from the point of view of an employer. Like me. I'm weird, I really care about my people, our company is more like a cooperative than anything else.

I grew this to a place where I could pay salaries. Doing so was super super hard. I had a lot of scary nights where I thought I couldn't make payroll. Just building up to a place where the next payroll was OK was a big deal for us.

So open source it? When you finally got to the point where you can pay people without worrying all night?

I get that you see that open source is the answer, and it is for some stuff. For me, jumping on that years ago was asking too much.


I have no idea why people are down voting this. In an alternative universe, we would all be using BitKeeper. The reason we are not is mainly because Larry McVoy ceded the market to git and mercurial because he was afraid of disrupting his existing business. git and mercurial would never have existed had he practiced at BitMover what he preached at Sun.


For people who don't know the history -- McVoy offered free bitkeeper licenses to various open source projects, and the Linux kernel switched to it.

After Andrew Tridgell (SAMBA, among other projects) reverse-engineered the bitkeeper protocol [1] in order to create his own client, the license was rescinded for everyone.

As a result, Linus wrote git.

[1] https://lwn.net/Articles/132938/


> As a result, Linus wrote git.

And mpm wrote hg, never forget:

http://lkml.iu.edu/hypermail/linux/kernel/0504.2/0670.html

http://lwn.net/Articles/151624/


It's astonishing to me that Git has won out given how much easier it's been for me to explain Hg to other people than to explain Git. To this day, in our SVN workflow at my company, nontechnical people who have merely seen an Hg diagram on a whiteboard by my desk immediately grasp the idea and the lingo, and ask me questions like "hey, can you branch the code to commit those changes and push them to the testing server? This thing's really cool and we don't mind playing with the alpha version, but we might scrap it all later."


Maybe I've been a git user for too long, but I really fail to see why git as of the last 5 years is any harder to explain than hg. Personally I think the branching in hg is pretty much broken; the fact alone that it's pretty much impossible to get rid of branches is horrible.


Because the diagrams for hg are very simple, there is a really simple way to do branching that obviously works and commits a relatively forgivable sin: just `cp -a` the folder.

Now, I know that that's in essence an admission of defeat! I'm not pretending that it's anything less than that. However, this is also the easiest explanation of, and model for, branching that anyone has ever created. The explanation of branching which the nontechnical user immediately understood was, in fact, just having a couple of these repository-folders sitting side by side with different names, `current_version` and `new_feature`. It is a model of branching that is so innocent and pure and unsullied by the world that a nontechnical person got it with only a couple of questions.
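
(In hg commands, that whiteboard model is literally just the following, using the repository names from the example above:)

  hg clone current_version new_feature   # "branching" = a second folder
  cd new_feature
  # ...hack...
  hg commit -m "try the new feature"
  # keep it: push it back; scrap it: just delete the folder
  hg push ../current_version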

Like I said, I'm actually employed at an SVN shop, where branches are other folders in the root folder and the workflow is less "push this to the testing server" and more "commit to the repository, SSH into the testing server, and then update the testing server." But that Hg model resonated with someone who doesn't know computers. To me, that was a moment of amazement.

I'm not even saying which one is better really; I like Git branches too! It's just that I'm astonished that the more confusing DVCS is winning. Most people's approach to Git is "I am just going to learn a couple of fundamentals and ask an expert to set up something useful for me." I would guess that most Git users don't branch much; they never learned that aspect of it. I'm really surprised that software developers aren't more the sort to really say "why am I doing this?" and to prefer systems which make it easier to answer those questions with pretty pictures.


> I'm really surprised that software developers aren't more the sort to really say "why am I doing this?" and to prefer systems which make it easier to answer those questions with pretty pictures.

The network effect should explain it.

That said, using pictures to answer questions is fairly sadistic when those asking them are blind. I know a blind developer and I never use pictures when talking to him in IRC.


How did Git's network effect get started in the first place?


It was made specifically for managing the Linux kernel, which has a huge number of contributors all doing their thing in different parts.


They do answer the questions with pictures. But the pictures look like http://xkcd.com/1597/ and the answers are perhaps not what you are expecting. (-:


I think that the hate that gets lumped on mercurial's branches, though understandable, is a bit unfair.

Disclaimer: it's been years since I used hg as my primary DVCS, so some of my thoughts here might be out of date or misrecollections.

> branching in hg is pretty much broken

It really isn't. It's absolutely not suited for the task that many people want to use it for, but it's totally fitting with the intended use case and the "history is immutable" philosophy of mercurial.

Using mercurial branches for anything resembling feature branching is a bad idea. But mercurial branches are perfect for things like ongoing lines of development. So, for a project like PostgreSQL, you'd have a "master" (default) branch for the head of development and then once a release goes into maintenance mode you create a new branch for "postgres-9.4" and any fixes that need to be applied to that release will be made to the maintenance branch.

Following hg's "immutable history" policy the fact that the commit was performed on a maintenance branch is tracked forever. And it should be because the purpose of your source control is to track those kinds of things: "This is the branch we used for maintenance releases of version X.Y.Z, it is now closed since we no longer support that version"
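
(For the maintenance-line use case described here, the commands look roughly like this; the branch name is borrowed from the PostgreSQL example, and the release revision is a placeholder:)

  # when 9.4 goes into maintenance, start its named branch
  hg update <release-revision>
  hg branch postgres-9.4         # the next commit, and its descendants, carry this name
  # ...apply a fix...
  hg commit -m "First fix on the 9.4 maintenance branch"

  # later fixes land on the same line
  hg update postgres-9.4
  # ...fix...
  hg commit -m "Backport another fix"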

The issues with mercurial's branches are:

- For a long time they were the only concept in hg that had a simple name and looked like "multiple lines of development". Even though hg supported multiple heads and multiple lightweight clones, neither of those had commands or features with a clear and simple name, so people turned to "branches" expecting them to do what they wanted even when they were a bad fit.

- "branch" is very general name that is often used (quite rightly) to refer to a bunch of slightly different ways of working with multiple concurrent versions. In general use it might refer simply having 2 developers who both produce independent changes from the same parent. Or to intentionally having multiple short lived lines of development based around feature. Or splitting of development right before a release so that the "release branch" is stable. Etc. Yet the feature in hg that is called "branch" is useful for only some of those things. It would have been better to call it a "development line" or something like that.

- It took far too long for hg to get a builtin way to refer to named heads (bookmarks). The model assumed that each repository (clone) would only ever want to have 1 head on each branch (development line) and that producing multiple heads was a problem that ought to be resolved as soon as possible. There's a lot of history behind that approach (almost every CVS and SVN team I ever worked with did that), but DVCS tools made it easier to move away from that, yet official hg support lagged.

So even today, the "branch" concept in hg is only useful for a small number of cases, and the "bookmarks" concept is what most people want, but they're separate things with names that don't align with expectations.


This distinction between branches and bookmarks looks like one of the things missing the most from git. Grab a random branch from some place and try to guess whether the committer intends to rewrite it in the future or not: good luck.

For the rest: https://stevebennett.me/2012/02/24/10-things-i-hate-about-gi...


> So even today, the "branch" concept in hg is only useful for a small number of cases, and the "bookmarks" concept is what most people want, but they're separate things with names that don't align with expectations.

And it doesn't help that the primary hosted repository system is Bitbucket and Bitbucket didn't support pull requests from bookmarks last time I checked.


So what about all this contradicts "branching in hg is pretty much broken"?


Your original comment linked broken branches with the fact that they can't be deleted. That's only true if you intended to talk about Mercurial named branches, which aren't broken; they just aren't what you want them to be.

Bookmarks (today, and for several years) work just fine. So "branching" in the general sense isn't broken even though the combined feature set is a bit haphazard.

That bitbucket doesn't work well with bookmarks is a sign of how little Atlassian cares about hg, rather than an hg issue.

If you're arguing that the hosting options for hg are limited and fall far below the git options, then I'm not going to disagree.


In other words, in Git, a branch is just a named pointer to a particular revision. In Hg, these are called 'bookmarks' and they work exactly how you're imagining; and there is an immutable sort of bookmark that is called a 'tag' (bookmarks can be repointed; tags cannot). By creating a new head (i.e. branching) and naming that new head with a bookmark, you do exactly what `git branch` does.

Mercurial also supports auto-naming of revisions which automatically applies to all child revisions: these are meant to be independent lines of development, each with its own head revision, and are called 'named branches'; that is what `hg branch` does. The problem that you're identifying (and that I agree is counterintuitive!) is that these names become part of the commits themselves and therefore public knowledge. Mercurial warns you when you `hg branch` that this is happening and says "did you want a bookmark?" but does not tell you, e.g., "to undo what you just did, type `hg branch default`."
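
(Side by side, the equivalence being described, with a throwaway name for illustration:)

  # git: a branch is a movable pointer; create it and switch to it
  git checkout -b feature-x

  # hg: the same idea is a bookmark, created active at the working parent
  hg bookmark feature-x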


This applied to me as well. I like the metaphor someone wrote that Git is the assembly language of DVCS.


Maybe so, but it's missing some important instructions having to do with directories and renames.


"hey, can you branch the code to commit those changes and push them to the testing server? This thing's really cool and we don't mind playing with the alpha version, but we might scrap it all later."

Are they talking about hg or git here? Because that flow in git is:

  git branch
  git checkout
  git commit
  git push
The only thing that git adds to that workflow is that creating a new branch doesn't immediately move you onto it (also that most would use checkout -b to do both). And it's not immediately obvious that a non-technical user would need to know about that in order to get the above point across.


Ahahahahahaha. You think that works!

No. That fails with the following semi-helpful error message:

    remote: error: refusing to update checked out branch: refs/heads/master
    remote: error: By default, updating the current branch in a non-bare repository
    remote: error: is denied, because it will make the index and work tree inconsistent
    remote: error: with what you pushed, and will require 'git reset --hard' to match
    remote: error: the work tree to HEAD.
    remote: error: 
    remote: error: You can set 'receive.denyCurrentBranch' configuration variable to
    remote: error: 'ignore' or 'warn' in the remote repository to allow pushing into
    remote: error: its current branch; however, this is not recommended unless you
    remote: error: arranged to update its work tree to match what you pushed in some
    remote: error: other way.
    remote: error: 
    remote: error: To squelch this message and still keep the default behaviour, set
    remote: error: 'receive.denyCurrentBranch' configuration variable to 'refuse'.
Actually all of these diagrams for Git need to look substantially more complicated because you first off need to introduce repositories which have a cylinder with a cloud over them (the cloud of course is the staging area) with a sort of recycle-reduce-reuse pattern of arrows `add`, `commit`, `checkout` between these three entities, with the caveat that `checkout` is only kinda-sorta what you're looking for with this. In fact there is a cylinder-to-cylinder `pull`-type operation called `fetch`, but `git fetch; git checkout` will not actually update any files, revealing the gaping hole in this simple picture, and you'll have to type `git status` to find out that you're directed to do a `git pull` anyway, which has to be diagrammed as an arrow pointing from the remote cylinder, bouncing off the local cylinder, and then pointing at the local folder.
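
(In concrete commands, the fetch/pull distinction being complained about, assuming a remote named origin and a branch named master:)

  git fetch origin            # updates the local repository (.git) only; no working files change
  git merge origin/master     # ...now actually bring the working tree up to date
  # or the two steps rolled into one:
  git pull origin master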

To get to talk about `push` you then need to introduce the SVN-style "bare repository" in the diagram, a folder-box with the cylinder now drawn large inside it, and explain that this folder exists only to contain the .git subfolder and act as an SVN-style repository. You can then draw `pull` arrows down from it and `push` arrows up to it.

Then the workflow is more SVN-style:

    git branch
    git checkout
    git commit
    git push
    ssh testing-server
    cd git-repository
    git pull
Now that almost works, except the `git branch; git checkout` flow is not the proper way to push changes in the working directory to the new branch. (The context of the conversation was stuff that was already being developed, presumably on the master branch.) That fails on the checkout with an error message like:

    error: Your local changes to the following files would be overwritten by checkout:
            foo
            bar
    Please, commit your changes or stash them before you can switch branches.
    Aborting
But, I mean, close enough. It's `git stash branch <newbranch>` and it generates an ugly error message but it does exactly what you want it to do, so you can ignore that error message and hack away.

Now, you're missing the point if you think "God, drostie is really pedantically getting on my case for missing the remote-repository-update and the git stash here! Anyone will learn that workflow eventually!"

The point was not any such thing, the point was clean diagrams when explaining the idea to a fellow developer -- in fact a diagram so clean that a nontechnical user asked about it and accidentally learned enough to get some new vocabulary about how a developer's life works, so that they could more effectively communicate what they want to the developer.

It is my contention that the git diagram, as opposed to the git workflow, is sufficiently messy that a nontechnical eye will lose curiosity and most certainly will not get the idea of "make a branch, push the branch to the shared repository, then update the testing repository, then switch to that branch, then discard that branch if things don't work out." That strikes me as too in-depth for nontechnical casual users to express.


I think people have a way too high tolerance for this kind of crap. We're also kidding ourselves if we think we're smart enough to work with this kind of complexity at no cost.

Our job is often to think up new things. It's really hard to come up with new abstractions when your thinking is muddled by all kinds of incidental complexity.


This is buried but in case anyone reads it, the real reason to open source BK is to show the world that SCM doesn't have to be as error prone or as complicated as Git. You need to understand how Git works to use it properly; BK is more like a car, you just get in and drive.


That metaphor... needs work. Cars need a considerable amount of training to learn to use safely, let alone correctly. I can hack C enough that I dream in it routinely, and due to the resulting brain damage found git intuitive from the start, but there is no way I'll ever learn to drive: it's just too hard.


> Cars need a considerable amount of training to learn to use safely, let alone correctly.

Significantly less training than is required to know the internals of how it operates though.


> I think people have a way too high tolerance for this kind of crap.

I agree.

The core problem with Git is that it was designed to serve the needs of the Linux kernel developers. Very, very, very few projects have SCM problems of similar complexity, so why do so many people try to use a tool that solves problems they don't have? Much of that internal complexity extends up into the Git interface, so you're paying for complexity you don't need.

Others in this thread have praised hg and bzr for their relative simplicity for a DVCS. I'd also like to point out Fossil.

In the normal course of daily use, Fossil is as simple to use as svn.

About the only time where Fossil is more complex is the clone step before checking out a version from a remote repository.

Other than that, the daily use of Fossil is very nearly command-for-command the same as with svn. Sometimes the subcommands are different (e.g. fossil finfo instead of svn status for per-file info in the current checkout) but muscle memory works that out fairly quickly.

Most of that simplicity comes down to Fossil's autosync feature, which means that local checkout changes are automatically propagated back to the server you cloned from, so Fossil doesn't normally have separate checkin and push steps, as with Git. But if you want a 2-step commit, Fossil will let you turn off autosync.

(But you shouldn't. Local-only checkins with rare pushes is a manifestation of "the guy in the room" problem which we were warned against back in 1971 by Gerald Weinberg. Thus, Fossil fosters teamwork with better defaults than Git.)

Branching is a lot saner in Fossil than svn:

1. Fossil branches automatically include all files in a particular revision, whereas svn's branches are built on top of the per-file copy operation, so you could have a branch containing only one file. This is one of those kinds of flexibility that ends up causing problems, because you can end up with branches that don't parallel one another, making patches and merges difficult. Fossil strongly encourages you to keep related branches parallel. Automatic merges tend to succeed more often in Fossil than svn as a result.

2. Fossil has a built-in web UI with a graphical timeline, so you can see the structure of your branches. You have to install a separate GUI tool to get that with most other VCSes. The fact that you can always get a graphical view of things means that if you ever get confused about the state of a Fossil checkout tree, you'll likely spend less time confused, because you're likely also using its fully-integrated web UI.

3. Whereas svn makes you branch before you start work on a change, Fossil lets you put that off until you're ready to commit. It's at that point that you're ready to decide, "Does this change make sense on the current branch, or do I need a new one?"
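
(Point 3 in practice, with a hypothetical branch name:)

  # hack away on trunk, then decide at commit time where the change belongs
  fossil commit -m "risky refactor" --branch risky-refactor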

Fossil's handling of branches is also a lot simpler than Git's, primarily because the local Fossil repository clone is separate from the checkout tree. Thus, it is easy to have multiple Fossil checkouts from a given local repo clone, whereas the standard Git workflow is to switch among branches in a single tree, making branch switches inexpensive.

(And yes, I'm aware that there is a way to have one local Git checkout refer to another so you can have multiple branches checked out locally without two complete repo clones. The point is that Git has yet again added unnecessary complexity to something that should be simple.)


Why is it that some people get Git naturally and some experience a world of frustration trying to use it? I think the kind of problems you describe usually come up if you approach Git with a mindset formed by another SCM. They are typical for people who are proficient with, say, SVN and who try to use Git thinking that Git must work something like SVN. (I'm using SVN as just an example here; it could be any other SCM, but I most frequently see people coming from SVN to really struggle with Git.) Well, Git is nothing like SVN and you'll always be missing something if you try to understand Git through SVN concepts. It's best to forget what you used before and learn Git from a clean slate. Maybe I was just really lucky to never have had to learn SVN (or CVS or ClearCase), so Git concepts and workflows were clear and almost effortless to understand and use. Or maybe it's like the concept of pointers: some people get it right away and others never get it.


> Why is it that some people get Git naturally

The only people I've ever seen "get Git naturally" were developers starting from the implementation details and working their way up (#0)[0].

Everybody else either worked very hard at it(#1)[1] or just rote-learned a list of commands(#2) that pretty much do what they want from which they don't deviate lest the wrath of the Git Gods fall upon them and they have to call upon the resident (#1) or heavens forbid the resident (#0) who'll usually start by berating them for failing to understand the git storage model.

> Well, Git is nothing like SVN and you'll always be missing something if you try to understand Git through SVN concepts.

Mercurial is also nothing like SVN, the problem is not the underlying concepts and storage model, it's that Git's "high-level UI" is a giant abstraction leak so you can't make sense of Git without understanding the underlying concepts and storage model, while you can easily do so for SVN or Mercurial.

[0] because the porcelain sort of makes sense in the context of the plumbing aka the storage model and implementation details

[1] because the porcelain in isolation is an incoherent mess with garbage man pages


> Now that almost works, except the `git branch; git checkout` flow is not the proper way to push changes in the working directory to the new branch. (The context of the conversation was stuff that was already being developed, presumably on the master branch.)

> But, I mean, close enough. It's `git stash branch <newbranch>` and it generates an ugly error message but it does exactly what you want it to do, so you can ignore that error message and hack away.

Isn't

  git checkout -b new_branch
  git commit -a
  git push
what you are looking for?

And as for the push problem:

1. You aren't going to encounter it in git when pushing a newly created branch, but yes, you then have to ssh in and check it out.

2. I wonder how Mercurial handles pushing to a repository with uncommitted changes; does it just nuke them?


Regarding 2 - Do you mean the destination repo has uncommitted changes? There is no need to nuke them, as pushing to this repo will have no influence on the working set: it will just add new changesets in the history!


OK, that makes sense. I imagined that hg implicitly updates the remote working set and that the parent complained about git's behavior because

  hg push ssh://testing-server/
worked as lazy man's single command deployment for him :)


Regarding your first question: Yes, you can also `git checkout -b newbranch`, rather than stashing the changes and then un-stashing them into a new branch; I just tend to stash my changes whenever I see that there are updates on the parent repository. Call it a reflex.

Of course, you can also commit your changes and then `git branch`, which sounds insane (that commit is also now on the master branch!) until you remember that branches in git are just Mercurial's pointers-to-heads. This means that you can, on the master branch, just `git reset --hard HEAD~4` or so (if you want the last 4 local commits to be on the new branch and you haven't pushed any of them to the central repo), and your repository is in the state you want it in, as well. (And you'll need that last step even if you `git checkout -b`, I think.)
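
(In commands, something like the following, assuming the last 4 commits are local-only as described:)

  git branch new-feature        # pointer at the current HEAD, those commits included
  git reset --hard HEAD~4       # move master back past the 4 local commits
  git checkout new-feature      # carry on working on the new branch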

Regarding your second one, Mercurial's simplified model is actually really smart. You have to understand that Git complects two different things into `pull`: updating the repository in .git/ and updating the working copy from the repository. In Mercurial these are two separate operations: you update/commit between the working copy and the repository; you push/pull between two repositories. The working copies are not part of a push/pull at all. So if you push to a repository with uncommitted changes in its working copy, that's fine. The working copy isn't affected by a push/pull no matter what.

With that said, if that foreign repository has committed those changes, Hg will object to your push on the grounds that it 'creates a new head', and it will ask you to pull those commits into your copy and merge before you can push to the foreign repository. (The manpages also say that you can -f force it, but warn you that this creates Confusion in a team environment. Just to clarify: a 'head' is any revision that has no child revisions. In the directed acyclic graph that is the repository history, heads are any of the pokey bits at the end. You can always ask for a list of these with `hg heads`.)

"OK," you say, "but let's throw some updates into the mix, what happens? Does it nuke my changes?" And the answer is "no, but notice who has the agency now." Let's call our repositories' owners Alice and Bob. Alice pushes some change to Bob's repository. Nothing has changed in Bob's working folder.

Now if Alice tells Bob about the new revision, Bob can run an update, if he wants. Bob has the agency here. So when the update says, "hey, those updates conflict, I'm triggering merge resolution" (if they do indeed conflict), he's present to deal with the crisis. Git's problem was precisely "oh, we can't push to that repository because we might have to mess with the working copy without Bob's knowledge," and it's a totally unnecessary problem.

Bob can also keep committing, blithely unaware of Alice's branch, if Alice doesn't tell him about it. The repository will tell him that there are 'multiple heads' when he creates a new one by committing, so in theory he'll find out about her commits -- though if you're in a rush of course you might not notice.

Bob can keep working on his head with no problem, but can no longer push to Alice (if he was ever allowed to in the first place), because his pushes are not allowed to create new heads either. In fact he'll get a warning if he tries to push anywhere with multiple heads, because by default it will try to push all of the heads. However he can certainly push his active head to anyone who has not received Alice's branch, just by asking Hg to only push the latest commit via `hg push -r tip` -- this only sends the commits needed to understand the last commit, and as long as that doesn't create new heads Bob is good to push.
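
(The commands involved, with a placeholder path for the other repository:)

  hg heads                          # list every open head in the repository
  hg push -r tip ../other-repo      # push only the ancestry of the active head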


> Call it a reflex.

PTSD? :) Use local topic branches for everything to avoid unpleasant surprise merges. Once you are ready to merge, pull the shared branch, merge/rebase onto that and push/submit/whatever.

I sometimes keep separate branch for each thing that I intend to become a master commit. This way I can use as many small and ugly commits and swearwords as I please and later squash them for publication after all bugs are ironed out.

This helps with remembering why particular commits look the way they do, especially in high latency code review environments where it can take days or weeks and several revisions to get something accepted.

> Git's problem was precisely "oh, we can't push to that repository because we might have to mess with the working copy without Bob's knowledge," and it's a totally unnecessary problem.

Actually Bob's working copy isn't modified, it's just that if his branch was allowed to suddenly stop matching his working copy, he would probably have some fun committing (not sure what exactly, never tried).


No, the git equiv of svn up is not git pull, but git pull --rebase.


git is the primary example of how bad Linus Torvalds is at writing UI code.... ;o)


From what I read, he connected to a BitKeeper repository via telnet on port 5000, executed the help command and then used that information to write an incomplete client. That does not sound like reverse engineering to me.


It is reverse engineering. It's just easy reverse engineering.


As I remember it, it was a bit of a douche move by Tridgell, driven by a Stallman-like free software ideology.


It wasn't. He gave a conclusive reply which established that it was ethical (just telneting and typing help). Unless you believe Samba and everything else is unethical and you club every reverse engineering effort under one umbrella, your comment is wrong. http://www.theregister.co.uk/2005/04/14/torvalds_attacks_tri...


I don't think it's fair to call people douches because they are committed to their moral principles. Especially so here, where the benefit to humanity over the alternative is so clearly obvious.


It is when they attempt to force their moral code on others.

Is the benefit clearly obvious? If you actually adhere 100% to Stallman's code I'm not so sure.


Tridge made no attempt to force his code on others.

In fact, it was the reverse - he felt like he was being locked out of kernel development because he didn't want to align his moral code with those who used BK.

So, he tried to find a way to hold true to his code without forcing the rest of the kernel team to give up BK.


> a Stallman-like free software ideology

You say that like it's a bad thing.


As I remember it, he did

telnet bk-server 5000

and typed "help".

https://lwn.net/Articles/132938/


That's the "how" not the "why".


So having a genuine need to be able to actually use tools that you wrote, rather than something a company 'licenses' to you, so that you can modify and share these tools, is being a douche? Odd that you would think that companies that treat their users like untrustworthy hackers are not douches, but those users are!



Lots of cross-platform goodies in there as well as some interesting data structures. For example, our list data structure is in lines.c; it's extremely small for a small list and scales nicely to 50K items:

http://bkbits.net/u/bk/bugfix/src/libc/utils/lines.c?PAGE=an...


1 year ago: https://news.ycombinator.com/item?id=9330482

What changed? Is BitKeeper still an ongoing business with some other model, or is that, as they say... it? I hope not.


This is to answer this question and all the "too late" comments.

Too late? Maybe. But we had a viable business that was pulling in millions/year. The path to giving away our stuff seemed like:

     step 1: give it away
     step 2: ???
     step 3: profit!
And still does. So what changed? Git/Github has all the market share. Trying to compete with that just proved to be too hard. So rather than wait until we were about to turn out the lights, we decided to open source it while we still had money in the bank and see what happens. We've got about 2 years of money and we're trying to build up some additional stuff that we can charge for. We're also open to doing work for pay to add whatever it is that some company wants to BK, that's more or less what we've been doing for the last 18 years.

Will it work? No idea. We have a couple of years to find out. If nothing pans out, open sourcing it seemed like a better answer than selling it off.


My $0.02 Canadian: Build something that kicks Gitlab and Github's ass. What an opportunity. Support both BK and Git repos. Provide a distributed workflow that enterprises will love. Enterprises are obviously where the remaining dollars are. There are billions of dollars of inefficiencies in that sector. Many of these enterprises do NOT want to host their code on Github and are buying Gitlab. Be better than Gitlab.


Github sucks for corporate version control, it's just not designed for the kind of strict role-based change control that enterprise needs. Great support though.

I haven't checked out Bitbucket because last time I evaluated (2+ years ago) they didn't have good on-prem options.


^^ this. PLEASE. :-(

Bitbucket has been rotting since Atlassian bought them, and now there's really no "killer app" for Mercurial hosting. There are Mercurial hosting services out there, but nothing anywhere close to Github/Gitlab.


You should check out RhodeCode. It's not a hosted platform, but hosting it yourself is much better. It supports Mercurial, and all the latest things that come with it like phases, largefiles, etc.

Actually, since BK is now open source, we might think of adding a BK backend to RhodeCode and our VCS abstraction layer, which already supports Git, Mercurial and Subversion.


We would love that. Contact dev@bitkeeper.org if you have any questions/issues.


doesn't github support mercurial?


Nope, but Bitbucket does support git.


At GitLab we're unlikely to support BK due to lack of demand. What do you mean with a distributed workflow, something like https://gitlab.com/gitlab-org/gitlab-ce/issues/4084 ?


Great comment. Good points. Also - for enterprise, it's OK if the model ends up being a bit simpler than git - may actually be a positive. Give up some things, but get simplicity that scales to a 1,000 folks using some old VCS.

Looking forward to some hopefully differentiated features.


Just supporting big binary files in a hassle-free way would go a huge way to being better than Git and Gitlab.


What would you consider to be hassle-free?


As someone who's also read about how Git and Mercurial started (and how Bitkeeper is involved in it), I'm interested in seeing how it will play out. I hope it does work out for you and your team. Thanks for getting it out there.

I'm also interested how open-sourcing BK will improve the other systems, too.


> I'm also interested how open-sourcing BK will improve the other systems, too.

#mercurial in Freenode right now is monitoring this thread, very relevant to our interests.

Someone at Facebook in #mercurial right now is trying it on some Facebook repo, to compare performance.


If they are interested we would be happy to help them tune performance.


Ha! I'll be checking that out in Freenode now. (Wonder what mpm would say after all this time...)


Thanks for providing this level of detail; it's interesting to see the considerations that went into your decision.

How / why did you decide to use the Apache license rather than the GPL?

(It seems like a viral license might protect you a little bit, if you want to prevent your competitors from forking and improving your code base and then using it to compete against you.)


We decided to go all in on open source. Given our history, anything but a "here ya go" license wasn't going to go over well. We're aware that someone could fork it and compete against us, good on them if they can. Making money in this space isn't easy and if they can do better than us we'll ask 'em for a job. We know the source base :)

As to why that license, I think it was because LLVM or clang or both had recently picked it and all the lawyers at all the big companies liked that one. We don't particularly care; if everyone yells that it should have been GPL we'll fork it and relicense it under the GPL. Our thought was that Apache is well respected and even more liberal than the GPL, but we can be convinced otherwise.


(Apache2 has a number of explicit clauses that make it preferable for open-sourcing commercial software. For example, it automatically grants a patent license for any patents used by the software, but terminates that license only if a licensee sues over that software [as opposed to React's original patent clause, which could be construed as terminating the license if you sue Facebook at all, and which got them into a lot of trouble], so that contributors can include patches covered under their patents without poisoning it for everyone. It also defines that all contributions are licensed under Apache2 as well, so that if you take patches and then incorporate them into your commercial software, the contributor can't turn around and sue you for them. And it's GPL3-compatible, which many other permissive licenses aren't.)


Great explanation. Allowing use of, but limiting effect on, patents is critical to getting more OSS out of big, patent-loving companies. Lets them know they still have their power and profit while doing something altruistic. Or that helps them in the long run (free labor) while accidentally being altruistic. Works for me either way. ;)


GPLv3 is Apache 2 compatible, which GPLv2 isn't. Most other permissive licenses are compatible with any GPL.


Most GPLv2 licensed software includes the clause "either version 2 of the License, or (at your option) any later version." Through this mechanism a lot of GPL licensed software becomes compatible with Apache 2.

Also, replying to something a little higher in this thread, I wouldn't say that Apache 2 defines all contributions as Apache 2. That section of the license starts with the words "5. Submission of Contributions. Unless You explicitly state otherwise, ..."

And so Apache 2 just becomes the assumed default license on contributions, but it's not at all forced or required that contributions come in under Apache 2.


Please do not GPL this! You've made the sensible choice. Your analysis on making money is spot on.


GPL please!


Here's my dream DVCS: easily self-hostable like Fossil, but with good wiki and ticketing system. (Fossil's wiki and ticketing system are awful, but what really sunk it for me was its unexpected behavior for basic commands like "fossil rm".)

I'm just not super-fond of relying on Bitbucket, reliable though they've been, for hosting my stuff.

But a package I could toss on my own VPS? I'd toss some money at that. Wouldn't even need it to be open-source, but I'm no zealot.


> Fossil's wiki and ticketing system are awful

Care to be more specific?

I'll grant that Fossil's wiki is not a competitor to MediaWiki, but that doesn't make it "awful." It just makes it less featureful. So, what feature do you need in a wiki that Fossil's wiki does not provide?

As for the ticketing system, again, it isn't going to replace the big boys out of the box, but it also doesn't have to match them feature-for-feature to be useful. Also, the Fossil ticketing system's behavior is not fixed: it can be modified to some extent to behave more like you need. Did you even try modifying its behavior, or are you just complaining about its out-of-the-box defaults? Be specific!

> unexpected behavior for basic commands like "fossil rm".

If you mean that you want fossil rm to also delete the checkout copy of the file in addition to removing it from the tip of the current branch, and you want fossil mv to rename the checkout copy in addition to renaming it in the repository, then you can get that by building Fossil with the --with-legacy-mv-rm flag, then setting the mv-rm-files repository option. You can enable it for all local Fossil repositories with "fossil set mv-rm-files 1".

Alternately, you can give the --hard flag to fossil mv and fossil rm. That works even with a stock binary build of Fossil.

> I'm just not super-fond of relying on Bitbucket

For some of us, relying on a cloud service just isn't an option. We're willing to give up many features in order to keep control of our private repositories.

> a package I could toss on my own VPS? I'd toss some money at that.

Fossil runs great on a VPS, even a very small one, due to its small footprint. I wrote a HOWTO for setting it up behind an nginx TLS proxy using Let's Encrypt here:

https://www.mail-archive.com/fossil-users@lists.fossil-scm.o...


Oh, I used Fossil as my only VCS for 3 or 4 years. On my biggest projects, I had a heavily tweaked ticketing system and probably a hundred wiki pages. My experience with Fossil wasn't a "well the defaults suck, next thing" kind of situation for me, I was pretty invested in it.

I also ditched it all 3 or 4 years ago, so my memory's not great, but what got me about the ticketing was that, for whatever reason, I could not sit any non-technical user in front of it and have it make sense to them. No amount of tweaking ever made it make sense for anyone but the devs. I know that's not specific, but this is all in the pretty distant past for me, and that's the takeaway I had from it.

Fossil's lack of any built-in emailing was also lousy. I'm aware that some people rig up some hokey RSS-to-email system to accommodate that, but really, come on, that's awful!

Hey, if fossil actually serves your needs, that's great. I like the value proposition--one file is your repo, wiki, tickets, the whole ball of wax, it's cross-platform, it's just that the execution of the idea didn't work for me.

(As an aside, another thing I didn't like about fossil was its community--tending toward defensiveness and "it's supposed to work that way" instead of "hey, maybe you, the user, are onto something".)


> I could not sit any non-technical user in front of it and have it make sense to them

So name a bug tracker with equivalent or greater flexibility to Fossil's that non-technical users do understand.

I've only used one bug tracker that's simpler than Fossil's, and that's because it had far fewer user-facing features.

Every other bug tracker I've had to use requires some training once you get past the "submit ticket" form. And a few required training even to successfully fill that out!

> Fossil's lack of any built-in emailing was also lousy.

Email is hard. Seriously hard. RFCs 821 and 822 are only the tip of an enormous iceberg. If Fossil only did the basics, it would fail for a whole lot of real-world use-cases, and it'll only get worse as email servers get tightened down more and more, to combat spam, email fraud, domain hijacking, etc.

I, too, would like Fossil to mail out commit tickets and such, but I'm not sure I want the build time for the binary and the binary size to double just because of all the protocol handlers it would need to do this properly. Keep in mind also that Fossil generally doesn't link out to third-party libraries. There are exceptions, but then, I'm not aware of a widely-available[1] full-stack SMTP library, so it would probably have to reimplement all of it internally.

Now, if you want to talk about adding a simple gateway that would allow it to interface with an external MTA, that would be different. I suspect the only thing wanting there is for someone to get around to writing the code. I don't want it bad enough to do it myself.
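
To be concrete about what I mean by "gateway": something hypothetical along these lines, which just hands the message to whatever MTA is already on the box rather than having Fossil speak SMTP itself:

    #!/bin/sh
    # hypothetical notify hook, NOT an existing Fossil feature:
    # read a commit summary on stdin and hand it to the local MTA
    {
        echo "To: dev-team@example.com"
        echo "Subject: [fossil] new commit"
        echo
        cat
    } | /usr/sbin/sendmail -t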

> "it's supposed to work that way" instead of "hey, maybe you, the user, are onto something".)

If you propose something that goes against the philosophy of Fossil, then of course the idea will be rejected. We keep seeing git users ask about various sorts of history rewriting features for example. Not gonna happen. No sense having a philosophy if a user request can change it.

If you're talking about a Fossil behavior that isn't tied to its philosophy, but it just works the way it does for some reason, logic and persuasion are a lot less effective than working code. The Fossil core developers accept patches.

----------

[1] I mean something you can expect to be in all the major package repositories, and in binary form for Windows.


I am happy that Fossil works well for you. I long ago stopped being interested in tweaking my version control system or living with "email is hard, let's move on", and I moved on to other things.

I still think the Fossil value proposition--one file with all your project ephemera--is a good one. It'd be neat if BitMover produced something similar, e.g. maybe not a file, maybe a directory, but the same idea.


I think that BitKeeper could fill the enterprise niche (like Perforce or ClearCase) - very big organisations with a lot of developers don't like it that every developer can check out the whole source tree. They usually like to have access control by department/group. Also, stuff like 'read only access' or 'right to commit' can be added for greater bureaucratic bliss.


There are plenty of people doing the "let's turn open source and blame the community if it doesn't work" thing around.

It's a convenient excuse: blame open source for breaking your business, and for failing to save you from dying...


A South Park reference, eh?

Well clearly in retrospect, step two should have been renaming it "Dawson's Creek Trapper Keeper Ultra Keeper Futura S 2000" [1], adding incredibly advanced computerized features including a television, a music player with voice recognition, OnStar and the ability to automatically hybrid itself to any electronic peripheral device, absorbing the secret military computer at Cheyenne Mountain, and taking over the world.

[1] https://en.wikipedia.org/wiki/Trapper_Keeper_(South_Park)


I have some questions about Why.html: https://www.bitkeeper.org/why.html

> Spending a lot of time dealing with manual and bad auto-merges? BitKeeper merges better than most other tools, and you will quickly develop confidence in the quality of the merges, meaning no more reviewing auto-merged code.

Do you have examples of merge-scenarios that are a Conflict for git but resolve for BK?

> BitKeeper’s raw speed for large projects is simply much faster than competing solutions for most common commercial configurations and operations… especially ones that include remote teams, large binary assets, and NFS file systems.

Is there a rule of thumb for what size of repos benefits from BK? (And I suppose size could either be the size of a current commit or the total size of the repo.)

Are there any companies like github or bitbucket that support BitKeeper repos?


Wayne pointed to some stuff over on the reddit thread.

As for size, it's csets * files; as that gets big, Git slows down faster than linear, while we're pretty linear.


I think you guys undersell BAM. That was such a clutch feature where I used BK. It's sad seeing git's large file handling only just show up; I guarantee it has a long way to go to get parity with BAM.


Amongst all the "too late, I loves me some git" type comments, I figure I'd say thank you and good luck with continued revenue.

I haven't read much about bk so far, so forgive my lazy web question: does/can bk operate over standard ssh as git/hg/svn can, or does it require a dedicated listening server to connect to?

Edit: answering my own question, yes it does support ssh as a transport


How does BitKeeper scale to large projects? (Like, say, gigabytes of binaries.) This is a weak area of Git.

---

From the "Why" page:

BitKeeper’s Binary Asset Manager (BAM) preserves resources and keeps access fast by providing local storage as needed.

BAM is great for any organization that handles:

* Videos

* Photos

* Artwork

* Office files

* CAD files

* Any large binary files


I've been using BK/BAM for my photos; it's got 56GB of data in there and works great. I cheat because I added a way to check things out that uses hardlinks instead of copies, and I can check out the whole tree in 6 seconds. Doing the copy takes a lot longer: 9+ minutes. Hardlinks rock.
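
You can see the same idea with plain GNU cp, for anyone who hasn't played with hardlinks (directory names made up):

    $ cp -al photos/ checkout1/   # hardlink "copy": near-instant, no extra data stored
    $ cp -a  photos/ checkout2/   # real copy: duplicates every byte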

On the commercial site there is a link to some BAM paper, take a look at that and maybe ask in the forum or irc if this gets lost.


Wonder if BitKeeper might be a viable alternative for Git LFS (https://git-lfs.github.com) then.


I half-expected 'very late' comments before I read the comments. I wasn't disappointed.

For those who commented that way, please reconsider this winner-takes-all approach to your outlook on the world. The world is better because of choice, and it's in everybody's best interest to have more distributed version control systems.


Late does not mean it is useless.

The argument that diversity is good is not so simply true; there are tons of benefits to diversity, but there are costs too: fragmentation of talent, support, time to fix bugs, and eyes on each project; developer headaches in supporting competing standards; and so on...


Why would I want to use this over git or mercurial?


You wouldn't, unless you have very specific niche needs. They're pretty upfront about it:

>Why use BitKeeper when there are lots of great alternatives?

>For many projects, the answer is: you shouldn’t.

https://www.bitkeeper.org/why.html


Probably the single biggest reason, aside from being easier to use than git's CLI, is that it has sub-modules that work exactly like files do in a repository. No extra options, just clone/pull/push/commit/etc. Full-on distributed workflow.

BitKeeper itself is a collection of repositories. Download an install image, install, and clone it:

    $ bk clone http://bkbits.net/u/bk/bugfix
    $ cd bugfix
    $ bk here
    PRODUCT
    default
    $ bk comps -m
    ./src/gui/tcltk/bwidget
    ./src/gui/tcltk/tcl
    ./src/gui/tcltk/tk
    ./src/gui/tcltk/tkcon
    ./src/gui/tcltk/tktable
    ./src/gui/tcltk/tktreectrl
    ./src/win32/dll/scc
    ./src/win32/dll/shellx
    ./src/win32/dll/shellx/src
    ./src/win32/msys
    ./src/win32/vss2bk
which shows that what we clone by default doesn't include all that other crud (we cache the build result from that and populate it as needed to do builds).

Play with it, it's very different from Git, the subrepo binding is just like file bindings. Everything works together and obeys the same timeline.


It claims to be able to handle binary files well, which would be a big deal for game development. Game studios have mostly passed on git and Mercurial since those can't handle game assets.


Some game shops have had good luck with Mercurial's largefiles extension, for what it's worth.


It looks like it doesn't have locking, which is the other half of what you need and why most shops go with Perforce.


I'm not sure how locking would work when you're distributed. It really only makes sense for a centralized VCS.


Mercurial has had 'largefiles' since Mercurial 2.0, 5 years ago.


While it is better than git at working with large files, the large file extension doesn't solve the problem 100%.


They actually can. There is just no good way to version many file formats. Git-pack would have to be extended with sensible compression support for so many formats it is not even funny. And leaving those files outside version control is as easy as always.

About the only difference is that git prefers to keep the whole history and you cannot yet set a per-submodule shallow clone policy. It wouldn't even be too hard to add that.


I don't agree about leaving files outside version control - this always ends in tears.

Regarding versioning, you can always version any file by storing every revision and compressing them as best you can. I believe this is what Perforce does. Repo size can of course become an issue, and git doesn't do a great job with that, since it stores everything, and stores it locally. Perforce can at least discard old revisions and lets you select history depth on a per-file or file type basis.

The more serious problem with git in my view is that there's no good automated merging tools for many types of files, nor are any likely to arise. And more importantly, most people working on your average game aren't interested in forming in-depth mental models of how their tools work, and certainly don't want to have to pick up the pieces when they go wrong. So for most files, you need an exclusive lock (check in/check out, lock/unlock, etc.) model, or similar. That works quite well. But for obvious reasons, git just doesn't support this model at all, and I believe Mercurial is the same - and no amount of transparent/magic large file storage backends or whatever are going to fix that.


Feature-set-wise, this has a number of great advantages over git! It's a shame that all of the tools today are so git-centric in some ways.


If this works well then this indeed is a HUGE reason to use BK!


Try it and let us know. You can download the binaries at bitkeeper.org and then clone the repo like so:

    $ bk clone http://bkbits.net/u/bk/bugfix/
type make and you should have a working BK built from source.


Are you aware of any hosting solutions that support BK as-is? E.g. something like Gogs or similar.

Asking because I'm looking for a Gogs/GitLab-like server-side solution for a project under development. However, it needs to handle binary data well, which Git-based solutions don't.


We've got a very primitive hosting service up at bkbits.net. One of the ways we hope to survive is to evolve that into something closer to Github.


I really hope that takes off -- I'm interested in something that's like Git but which actually makes sense, without costing lots of neurons to understand how to use it in practice.


Interesting. Looking for the source of that, is this it?

  http://bkbits.net/u/wscott/bkbits/


Consider using Git LFS for binary data. Our GitLab.com supports it up to 10GB per project; self-hosted installations are limited only by disk volume.


Yeah, it's a possibility. Just hoping for something better. :)


BitMover still holds all the copyright, and have all the developers. They obviously wanted to keep BitKeeper proprietary, and are only doing it now when facing irrelevance in the marketplace. If BitKeeper becomes popular again, who’s to say they won't take development proprietary again? Sure, the community could fork the latest free version, but there isn’t a free development community for BitKeeper – they’re all internal to BitMover.


  $ bk clone bk://bkbits.net/bkdemo/bk_demo
  $ cd bkdemo
  # edit files using your favorite editor
  $ bk -Ux new
  $ bk commit -y"Comments"
  $ bk push
As a user whose first VCS was git, I am quite confused by this "quick demo". I have no idea what "-Ux" means, no idea what "new" means, no idea what "-y" means and why it is immediately followed by quotation marks instead of being separated by a single space. If bk wants to get new users on board, it needs a better quick demo that makes sense to new users.


According to http://www.bitkeeper.com/testdrive:

* The -U option to bk tells it to operate on "user files". That is files that are not part of the BitKeeper metadata

* The modifier x corresponds to "extras", files which Bitkeeper doesn't know about (changed files is c)

* `new` adds files to the repository

* [on commit] the -y option is for changeset comments (~commit messages)

So `bk -Ux new` is `git add <untracked files>`[0] and `bk commit -y"thing"` is `git commit -m "thing"`

[0] aka `git add $(git ls-files -o --exclude-standard)` or `git add -i<RET> <a> <*> <q>`


Too late to dominate, but maybe not too late to carve out a niche. It seems to have some advantages over the competition, and appears to bring a reasonable contribution to the table. Besides, competition is always good.

At the very least, Bryan Cantrill will be happy :-D.



I'm wondering: how does it handle large binary files? Any better than git or hg without extensions?


Yes. Binaries are handled by one or more servers, we call them BAM servers. The servers hold the data and your repo holds the meta data, binaries are fetched on demand.

You can have a cloud of servers so the binaries are "close" (think China, India, US).


Two questions:

It is unclear to me if the BAM server is part of this open sourcing or not. The page talks about a 90-day trial.

Also, it is common in other (usually non-D)VCS workflows to lock binary files while working on them, since concurrent changes can't be merged the way text files allow. Does BK support anything like this?


We open sourced everything. So yes, it's there. The commercial site is out of date.

We have not done the centralized lock manager, we didn't get commercial demand for that (yeah, surprised me too). We could do it though, it's not that hard.


Thank you!


The BAM server is part of the open source version. The 90-day trial is for the enterprise version (which is the same version, only with commercial support).

BitKeeper doesn't support locking binary files.


Great news! Better late than never! I hope they (or a client of theirs) create a BK-backed service soon. I, for one, think we need more than just GitHub and Atlassian in the market, if only to ensure the businesses don't take their users for granted (hint: SourceForge).


There are tons of alternatives to github/atlassian/sourceforge:

https://en.wikipedia.org/wiki/Comparison_of_open_source_soft...


Huh. Thanks for doing this. As a MySQL employee in the early days I used BitKeeper and fell in love with it and kept using it as long as I could. I mainly use Git these days, but frequently miss BitKeeper -- BK felt a lot more natural to me than Git ever has.


Hey Jeremy, long time no talk. We have the MySQL 6.x tree in BK; we can put it up on bkbits if you like.


Something I'm wondering that the man page doesn't make clear: does it track files across renames, or does it only track content like git?


It tracks renames, it's not like git. Every file has an internal identifier, that's the actual file id, the name is a versioned attribute of the file.


What's not clear from your replies is whether it tracks renames like Mercurial does, by having users run a manual command to ensure the VCS knows about the rename. Unless bk has a file system monitor, I'll assume that's what it does. Unfortunately, data on a few Mercurial repositories I looked at (Mozilla's and Mercurial's) shows that people don't mark all file renames.


The only way I know of to instruct Mercurial to do this is with

    hg addremove -s
Is there another way to indicate a rename?


How does it detect renamed files in the filesystem?

Thank you for open sourcing by the way, I can definitely see how some features (binary file handling, submodule handling) could be useful for large-scale projects like games.


Each history file contains its internal name, much like a file system has an inode # that is the internal name for that file. We call these names "keys" and you can dig out the inode key like so (every delta has a key; the first delta's key is called the rootkey):

    $ bk log -nd:ROOTKEY: -r+ slib.c
    lm@bitmover.com|src/slib.c|19970518232928|52808|f3733b2c327712e5
The key is user@host|relative path|UTC|checksum|64 bits of /dev/random and they are guaranteed to be unique if your DNS config is correct (we don't create duplicate keys, look for the uniq db in the code).

The way we version the entire tree is simple, it's logically (implemented differently for performance):

    <rootkey> <delta key>
    <rootkey> <delta key>
    ...
where each rootkey uniquely identifies a file and each delta key identifies the tip as of that commit.

Not sure if that is clear enough or not, ask away if not.


Not tracking namespace accurately is Git's biggest weakness and would most probably be BK's best argument for getting a tryout. With Git, renaming a file and editing it at the same time tends to make a mess of the history and cause much pain. You can say just don't do that, but I say it happens, and it should just work.


That and a UI that is uniform and sub-modules that work correctly and a system that doesn't let you do things that mess up your data. Oh and blame that is fast:

  $ time bk annotate slib.c | wc -l
  18508

  real    0m0.031s


What happens when you duplicate a file? Does it track history back to the original parent? Can it help with future merges?


You mean copy the file? There is a bk cp command that copies the file and gives it a new rootkey. But from that point on the histories of the two files diverge.


Yeah, copy. I see, thanks.


Yes, files are tracked as first class objects as they are renamed.


Can it import from git or SVN or mercurial?

Looking at the bk import man page, it looks like it cannot import from any modern VCS. I see only RCS, SCCS, CVS, and MKS as options. This is unfortunate, as I have a mercurial tree I'd like to import.


We have a git importer but it's not part of the "official" repo as there are still a few corner cases. I am afraid we don't have a mercurial importer though...


You probably should publish the git importer -- it makes it really hard to try bitkeeper if you have to play with a pretend tree.


Yeah, we're trying to figure out how to do that.


+1 would definitely give it a try with this feature


Same here, I want to get back to my BK love


Well, that took a long time... I wonder what changed in the eleven years since Git and Mercurial were deployed to replace BitKeeper.


    "The ability to seamlessly share only a subset of your source tree "
I've spent a good 10 mins trying to find anything specific in the documentation about this but have come up empty. Is this just by virtue of using submodules, ssh, and filesystem permissions, or is there something more that I've yet to find? The lack of fine-grained security on modern VCS systems is one of the reasons our monolithic repository is still using CVS.

On a related note, the getting started documentation should be more prominent on the Web page.


There is an official mirror on GitHub:

https://github.com/bitkeeper-scm/bitkeeper


It's a read-only mirror; the read/write mirror is on bkbits.net. But we'll maintain the mirror (or you can - bk has fast-export, which creates a perfect mirror in git).
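
Roughly, rolling your own mirror looks something like this (paths are made up; check the fast-export man page for the exact usage):

    $ git init --bare /path/to/mirror.git
    $ cd /path/to/bk/repo
    $ bk fast-export | git --git-dir=/path/to/mirror.git fast-import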


Would you describe the process of exporting to git as pretty painless? If so, I'll look at adding BitKeeper support for my git analytics/search tool. I've uploaded some pictures that show what the git repos you have hosted on GitHub look like at:

http://imgur.com/a/nVvov

Since the export process adds "bk: <changeset>" to the commit comment, it'll be easy to tell that it was created via your fast-export tool, which means my tool can easily point you back to the bitkeeper web interface.


Is there a way to do the reverse, and convert a Git repository into a BitKeeper repository? I found the fast-export manual page:

https://www.bitkeeper.org/man/fast-export.html

But it doesn't look like there's any fast-import. Do you have some recommended way if I wanted to convert an existing repository to try BitKeeper out?


By "fast-export", you mean Packard's "fast-import" format, editable with ESR's RepoSurgeon?


The biggest feature for me is the efficient handling of large binary files, because it means I could finally have a completely self-contained repository (clone and everything is in one place, plus free replication), but without the performance penalties which for example Mercurial incurs with binary files:

https://www.bitkeeper.org/why.html

I have to try it out just for that!


This predates git; in fact, if it had been open sourced from the start, git may never have existed. Sigh, how ironic.

If BitKeeper had been open sourced back then, it could be a powerhouse nowadays, both as open source and commercially. Now it is too late and, honestly, irrelevant.


"[...] Linus moved to it and most of the developers followed. They stayed in it for three more years before moving to Git because BitKeeper wasn't open source."

Um, the "because" part is not quite right.


Glosses over things, but is essentially accurate. Lots of people were not willing to use a proprietary tool, which prompted some reverse-engineering, which caused BK to withdraw their offer.

If all free software activists had accepted the compromise of using the free-as-in-beer BK, git would never have been created.


That's not:

They stayed in it for three more years [...] because BitKeeper wasn't open source.

but

They stayed in it for three more years before [moving to Git because BitKeeper wasn't open source].


I think the same points made in Larry's 1993 paper could be made about various Linux distributions:

  Why a gazillion package managers?
  Why not a common filesystem layout?
  Why not a standard desktop?
IMO, Linus should enforce his Linux trademark by forcing every distribution to follow a set of standards. If they don't, they can't call it "Linux". If he got them in a room and said "This is the way it's going to be, or else", they'd do it.


The people that you are looking for are the systemd and FreeDesktop people. The former have a manifesto addressing this:

> The emphasis of systemd to provide a platform instead of just a component allows for closer integration, and cleaner APIs. Sooner or later this will trickle up to the applications. Already, there are accepted XDG specifications [...] that are not supported on the other init systems.

> systemd is also a big opportunity for Linux standardization. Since it standardizes many interfaces of the system that previously have been differing on every distribution, on every implementation, adopting it helps to work against the balkanization of the Linux interfaces.

-- http://0pointer.net/blog/projects/why.html

They have a systemd filesystem layout that they say modernizes the FHS:

* https://freedesktop.org/software/systemd/man/file-hierarchy....

They have a project to rearchitect packaging:

* http://0pointer.net/blog/revisiting-how-we-put-together-linu...

They have the aforementioned specifications:

* https://specifications.freedesktop.org/

* https://www.freedesktop.org/wiki/Specifications/

They have a systemd DNS client:

* https://www.freedesktop.org/wiki/Software/systemd/writing-re...

They even have events where people get in a room to be told "the way it's going to be":

* https://news.ycombinator.com/item?id=10519578

* https://ti.to/systemdconf/systemdconf-2016


Some history from Linus himself https://www.youtube.com/watch?v=4XpnKHJAok8


Interesting — FreeBSD 7 and 8 binaries available for download. Neither of those is a current supported release. It's like offering RHEL 3 or 4 binaries.


We found that by maintaining a build cluster with many old releases we tend to keep compatibility issues out of the code. Also, FreeBSD is very good about backwards compatibility so current releases will run these binaries just fine.

However we will update the build targets as needed by users.


+1 For FreeBSD support +1 For open sourcing BK

I hope it works in your best interest. And I wish you all at BK the absolute best and thank you for all your incredibly hard work over the years.


BTW, sorry to say we don't have a RHEL 3 release for you, but in the 'complete list' area you can find stuff for RHEL 4. ;-)

Really they are Debian 4, 5, 6, 7, & 8 but they match up with Red Hat pretty well.


RHEL 4 is still supported. Extended support is good for another year.


I see this as the "features" list: https://www.bitkeeper.org/why.html

See large repo support, security, and others.

Is that geared towards comparing with Git/GitHub? Is there a more focused comparison with those, i.e. one comparing both to git itself and to GaaS (Git as a Service)?


The nested repository feature sounds amazing. Dealing with both git submodules and git subtrees has been a huge pain for me.

I'm looking forward to trying this out over the weekend. Is there some kind of util/script to import history from git?


Working on getting a crappy one out there; we have "simple and pretty crappy" and "complex/fragile but less crappy".


Does this come with any sort of web interface?


Yes, there is hosting at bkbits.net and if you drill down to

http://bkbits.net/u/bk/bugfix/

that's bk/web which is included in the release.


Great to see this finally happen... However, for 'us' Git remains a keeper.


This is very cool... but also, kind of a bit late. The market already adopted git and the momentum is there. Unless there is a trivial way to switch back and forth from git or there is something that is orders of magnitude better, this is a decade too late.


'A bit late' might be understatement of the day :)


Too late?


Too late :)


You and a lot of other people say that. Sure, if we want to take over from Git, it's late. But Git has left us with an opening, the only way Git works for the masses is Github, Git itself is too complicated and people "lose" their data (they don't but Git makes it appear like they did).

I think people will play with BK and find out that it can work for everyone without something like Github (we still need it but it's a nice to have, not a requirement).

We'll see. When I was proposing BK the intertubes said it would never work. I'm a little skeptical of the naysayers.


The "too late" comments are depressing since they totally lack insight (who hasn't noticed the dominance of git?). Yet here on HN specifically, you would expect people to cheer for pivoting into something new. Thanks for open sourcing BK.


Nah, it's easy. Before every git command, I just tar up the source. When git complains, I can untar to get things working again. To work with others, I fetch a new tree and then use "diff" and "patch" to merge my changes into the new tree.
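
Spelled out, the loop looks roughly like this (directory names made up):

    $ tar czf ~/backups/proj-$(date +%Y%m%d).tgz proj   # snapshot before touching git
    $ diff -ruN proj-pristine proj > my-changes.patch   # capture local work
    $ cd proj-fresh && patch -p1 < ../my-changes.patch  # re-apply onto a freshly fetched tree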

(seriously, as an experienced professional developer, I actually do this much of the time)


Seriously? Your life would be easier if you simply learned to use git properly.


Well, it's that bad. When I try to do things the "right" way I'm constantly exposed to the innards of git. I don't care about that stuff. It's complicated. It's a distraction from, you know, the actual task I was trying to do before git got involved.

I've done significant work with 5 other version control systems, including BitKeeper and ClearCase. Nothing is as difficult as git. At this point, I give up. Screw it.

I can do diff, tar, and patch. I have an editor. That'll do. I won't miss the confusing errors. Most importantly, I trust that these simple tools will not eat my work.


It's too late. There is no reason to use a non-git DVCS in 2016.



