Git: The Lean, Mean, Distributed Machine (slideshare.net)
58 points by r11t on Dec 24, 2009 | 27 comments



Another advantage of DVCS that wasn't mentioned in the slide deck is that, with git (and other DVCSes), you are always working in a revision controlled arena. Git's got your back.

When I used RCS, I was always working in a revision controlled arena - all my changes were revision controlled, assuming I checked them in... and after the first or second disaster, I learned to check things in. Frequently.

When I used CVS/SVN, I lost the ability to always work in a revision controlled arena (theoretically I still had the ability, but doing it was such a hassle that it simply did not happen). I ended up checking out a copy from SVN and then locally revision controlling my changes in RCS. Ugly.

Working without git is like driving without seat belts. You can do it and probably nothing bad will happen... but if something bad does happen, it is the difference between walking away a bit sore and an extended hospital visit.


I can't say I understand your point.

With svn, I commit often, as long as the code builds and my own unit tests pass. If I need to make a lot of breaking changes, I use 'svn cp' to create a branch and then I can still leverage svn's merge tracking in keeping that branch up-to-date.

This seems like the same way I use git/hg, but without worrying about excessively divergent local/remote branches across machines -- I can see your branches, you can see mine, they're all centrally accessible by definition.


If you own the SVN repository, I agree, it works well enough. However, if you don't have write privileges to the repository or if it is a shared repository with a lot of other people, it becomes much more difficult.

If you don't have write privileges into the SVN repository, you have no ability to revision control your changes without creating a tracking repository of your own. Creating and maintaining a tracking repository is a hassle, especially compared to a DVCS where that is the normal operation.

If there are a lot of people with write access into your SVN repository, creating branches doesn't scale well because all the branches are public. Creating a SVN branch for every experiment rapidly gets out of hand. Again, you can create a separate tracking SVN repository (hassle) or you can use branches, but it is nowhere near as easy as with DVCSes.

The big difference with DVCSes is that scaling across lots of developers is not a problem, branches are strongly encouraged, and merging works really well.


> If you don't have write privileges into the SVN repository, you have no ability to revision control your changes without creating a tracking repository of your own.

This is fairly unique to open source software -- it's not something that occurs in an organization, and should only occur when a user without commit access is doing large-scale modifications to a project.

Unfortunately, the functionality is often used to maintain divergent forks, where individuals fail to push small hacks/changes immediately back to their origin project.

> Creating a SVN branch for every experiment rapidly gets out of hand.

Out of hand how? I've worked in organizations where every single feature received a developer specific branch (eg, branches/tolenka-feature-xyz) and never ran into trouble.

> The big difference with DVCSes is that scaling across lots of developers is not a problem, branches are strongly encouraged, and merging works really well.

Given merge tracking, merging works more or less the same as a DVCS.

As far as scaling across developers, I'm not sure what the issue is.


Can anyone compare Git vs Mercurial? This decision is coming up for us, and would appreciate the input.


In terms of what you can actually do if you set your mind to it, they're not terribly different. In terms of the default/conventional things recommended by their communities, they're fairly divergent.

Speaking as someone who's used both and prefers Mercurial:

* Git's branching philosophy (lots and lots of branches, only one active at a time) can be handy. But Mercurial's philosophy (not quite so many branches, and usually as separate clones so you can work on more than one side-by-side) is, I've found, usually better.

* Git's much more encouraging of editing history up to the point where you share code with someone else. Mercurial will let you do that if you really want to, but it's discouraged; the only "easy" way is limited to undoing the last commit. Git seems to encourage committing much, much more often and not particularly caring about the repository history until you're ready to share with someone else (at which point you just selectively rewrite the history to look sensible). Mercurial seems to encourage committing in logical chunks so the repo history is clear and sensible the whole time (but again, if you really really want to do the Git-style stuff, you can).

* Mercurial seems to me, personally, to be a bit more logical in terms of choosing good defaults and exposing abstractions. Git's choices for default behaviors have been criticized even by its fans, and it tends to throw a lot of its inner workings at you on a regular basis, even if you don't really care much.

But, again, this is only my own personal opinion; undoubtedly others will disagree with it.


> usually as separate clones so you can work on more than one side-by-side

the major downside here is that all IDEs I know are excellent at reloading your project from the fs when you git checkout a different branch, but much clunkier about having to reopen another directory.


Which is why I said it's a trade-off.

If you go the git route, then you only look in one place for the code and it's easy to just point at it no matter what's currently active. But the downside is that (so far as I know) no tool other than git can show you what's in not-currently-active branches.

If you go the hg route it's easy to see lots of things side-by-side, but of course you don't have the "it's all in one directory" convenience, and you do have to point at different locations to work on different branches.

Personally, I prefer hg's approach here.


If you want branches side by side in git, just clone your repository. Or am I missing something?

The one catch is that the "upstream" repository will be the local one you cloned from, rather than your original upstream. git-clone could easily have a switch for that, though.


Notice that I've never said "git can't do this". I've been talking in terms of the conventions recommended by each tool's community. In the case of git, that's not separate clones for branches...


Not taking a side on branching philosophies (other than that I think it's good to be branching lots), but git does let you do checkouts in multiple directories. It uses hardlinks (in the repo, not in the checkout, obviously) so it's fairly efficient in terms of space and time to perform the checkout.
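To illustrate why hardlinks make this cheap, here is a minimal Python sketch. It is not what git itself runs; the file names are hypothetical, and it assumes a filesystem that supports hard links. It only shows that a hard link is a second name for the same data on disk rather than a copy.

```python
import os
import tempfile


def hardlink_demo():
    """Create one file plus a hard link to it and report how they relate."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical names standing in for one object shared by two repos.
        original = os.path.join(tmp, "object_in_repo_a")
        linked = os.path.join(tmp, "object_in_repo_b")
        with open(original, "wb") as f:
            f.write(b"x" * 4096)
        os.link(original, linked)  # second directory entry, same data
        same = os.path.samefile(original, linked)
        link_count = os.stat(original).st_nlink
        return same, link_count


SAME_FILE, LINK_COUNT = hardlink_demo()
print(SAME_FILE, LINK_COUNT)
```

Both names refer to one file, so creating the link takes no extra space for the contents and no time proportional to their size.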


Here's my experience with each. I'm biased and stubborn here, but in ways that may or may not be relevant to you. Your circumstances will differ, so take any advice about this with a grain of salt.

I've used distributed VC systems for quite a while, at least in the small. I find VC systems, databases, and the like verrrry interesting (due to a past life working in libraries), and an acquaintance of mine was the primary author of Monotone. I had already been keeping my home directory and /etc in svn, but that sounded far more interesting (it was also a full backup!). I tried it, and found using repository-as-a-whole versions more convenient than per-file versions. Later, I tried out hg, and felt it had a nicer interface / was easier to set up for common operations. (mtn assumed you cared to set up strong crypto auth for everything.) I was quite happy with Mercurial - it's a good system.

After using hg for a while (nine months or so?) I tried out git. It was still rough around the edges (the interface and documentation have since gotten better), but it had one major feature that stuck out - its ability to nonchalantly create and shuffle between several local, topical working branches in the same repository. To my knowledge, Mercurial still doesn't have a comparable feature, since their decisions about the core data structures make it awkward -- git saves the repo in terms of data snapshots, while hg saves the diffs (or perhaps vice versa, I forget). There was a major release where the hg team introduced "bookmarks", which were an attempt to emulate the git branching model, but you still had to be careful about copying them from one repo to another, etc. It had different trade-offs. I stopped paying attention around then. (If hg has since gained these, it would be a major point in its favor.)

That sounds overly negative towards hg, though. A major con of git is that the system is heavily skewed towards running on Unix. If you install it on Windows, it uses vim as the default editor for comments, it assumes you're used to man pages, etc. As a whole, the native behavior of the platform is an afterthought. Lots of little Unix-isms. Now, I can't blame them - I use OpenBSD at home, and strongly prefer old-school Unix in general. Also, developing any kind of system with the niceties that Unix provides and then porting it to Windows later is a big pain. Still, Mercurial's releases have consistently seemed to more evenly support Windows. I develop for Windows at work (in a field involving high-performance graphics and associated hardware), and, for my own circumstances, would choose Mercurial over git. Again, I should stress they're both a landslide better than any other systems I've seen.

Also, mercurial seems to place more emphasis on being simple for basic use, yet having a clean foundation for adding extensions. I haven't written any, or looked at it in a while, but it seemed like a good example of the different design styles - git was more powerful but scruffy, while mercurial was clean and more concerned with usability. (Whether this is an emergent property of git being written in C and shell scripts vs mercurial in Python with hotspots in C would be an interesting debate.)

I still use git for my home directory and personal projects, though most systems' flaws aren't a big deal for smaller projects. Feel free to experiment. If you're doing research prior to committing to one or the other for a massive project, switching really isn't a big deal until you have branches with months of churn. Unless you're working on the Linux kernel or an equally massive project (unlikely), it'd be far easier (in isolation - not counting retraining) to just pick one, use it for a couple months, articulate pros and cons, then switch and compare. They have quite a bit in common, and either would be vastly preferable to svn (ack!) or perforce (ACK!!!). You can't make an informed decision without trying either, and they're both quite good.

I looked into Darcs as another system that had interesting features and a pretty nice interface, but given the problems with bootstrapping an even remotely recent version of GHC on BSD (http://hackage.haskell.org/trac/ghc/ticket/1346), there's no way in hell I'd depend on Haskell for something as fundamental as VC.

Also, weird web2.0 juice notwithstanding, it's a version control system, not a religion. It's a means to an end. You (everyone), you're using it because you're writing something, not because you're trying to be seen scribbling deep thoughts in a damn moleskine notebook.


> nonchalantly create and shuffle between several local, topical working branches

That functionality's actually available in hg, it just doesn't seem to get used much (except by people who are being forced to use hg and want to make it just like git). The more common workflow seems to be using different clones for different branches, which has both advantages and disadvantages, as does git's approach; it seems mostly that people choose a VCS depending on which side of the tradeoff they like best.


You're right. I found keeping a different directory for a different branch a bit annoying, but it's not a big deal. (Besides, if you're keeping really, really large binaries under VC, you're probably using the wrong tool.)


Well, again -- you don't have to do the whole separate-clones thing. Mercurial's bookmarks are pretty much git-style branches under a different name, for example. And even though they're provided by an extension they're really not anything fancy: they're just a thin layer of UI on top of normal hg operations (in hg terms, bookmarks just boil down to a way to name -- and thus jump to, merge from, etc. -- heads in the repository).
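The "bookmarks are just names for heads" idea can be sketched as a toy model in Python. This is not Mercurial's actual data structure or API; the class `ToyRepo` and its methods are invented for illustration. The only behavior it models is that a bookmark is a movable name for a changeset and that the active bookmark follows new commits.

```python
class ToyRepo:
    """Toy changeset graph with movable bookmarks (illustrative only)."""

    def __init__(self):
        self.parents = {}    # changeset id -> parent id (None for the root)
        self.bookmarks = {}  # bookmark name -> changeset id
        self.active = None   # name of the active bookmark, if any
        self.current = None  # changeset the working directory is based on
        self._next_id = 0

    def commit(self):
        cid = self._next_id
        self._next_id += 1
        self.parents[cid] = self.current
        self.current = cid
        if self.active is not None:
            # The active bookmark moves forward with each commit.
            self.bookmarks[self.active] = cid
        return cid

    def bookmark(self, name):
        """Name the current changeset and make that name active."""
        self.bookmarks[name] = self.current
        self.active = name

    def update(self, name):
        """Jump the working directory to whatever the bookmark names."""
        self.current = self.bookmarks[name]
        self.active = name

    def heads(self):
        """Changesets that have no children."""
        with_children = {p for p in self.parents.values() if p is not None}
        return sorted(c for c in self.parents if c not in with_children)


repo = ToyRepo()
repo.commit()             # changeset 0
repo.bookmark("main")
repo.commit()             # changeset 1, "main" follows
repo.bookmark("feature")  # second name for changeset 1, now active
repo.commit()             # changeset 2, "feature" follows
repo.update("main")       # back to changeset 1
repo.commit()             # changeset 3, a second head
print(repo.heads(), repo.bookmarks)
```

After these steps the graph has two heads, each reachable by name, which is all "jump to" or "merge from" needs.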


I use git more than hg, but neither bothers me, and I do use both. There's one major workflow that's significantly easier in hg, which is to generate a local patchfile (amazingly clunky in git sometimes, despite being actually kinda simple too). Hg has some UI fail too, it's just different.

Some of git's front end is (even now) a bit weak, but it's also quite complete (really) and insanely powerful (for better or worse, you can really shoot yourself in the foot with it, if you poke too much).

There are things that are commonly in use that people shoot themselves with all the time (like stash - which is really an antipattern when you have such cheap branching (hint: branch & commit never causes problems, and rebase / cherry-pick help you squash temp stuff)). You can find anti-patterns in any software though, and I recommend you simply use both for a while; I think anyone unbiased will generally agree they're very, very comparable.

If you have a brain cell or two, neither of these systems is going to be a major bottleneck to use, despite what folks might say about their UIs etc.

Git is a little faster at doing some tasks, but this isn't going to really matter to you unless you're managing large repositories. Git's recent popularity growth means there are some neat tools out there too, depending on your platform and preferences.

As far as windows support goes for git, I have to help less technically minded non-developers use git on windows on a fairly frequent basis, and msysgit works just fine there. You'll want to be aware of details of SSH & key management, and debug with ssh first before debugging with git. That's standard debugging though, and not really anything to do with git, or windows, but the lack of commonality of these tools on the platform. Again, a few brain cells correctly engaged very rapidly get past any such niggles if you're being pragmatic.


> msysgit works just fine there

As someone who recently moved from a linux box to a windows one for his primary workstation, I can affirm that msysgit is shockingly, shockingly slow.

I'd definitely recommend hg if you have a lot of people on windows for that reason alone.


Executive summary: Both are very good. Mercurial tends to be playing "catch up" to git in the more elaborate (but very useful) functionality. Git on Windows tends to be rougher but it is catching up, maybe caught up. If you cannot decide, flip a coin and Just Do It[tm], it will be a win either way.

When I last used Mercurial heavily (around a year ago), its branching was weak so the user model tended to be clone to create a new repo as a branch rather than branching within a repo. Logically, this is the same, but practically it is less convenient... it tends to create multiple directory pollution and then you forget what is in which directory (especially when you have CRS disease). I understand Mercurial branching is catching up or caught up with git.

Mercurial is pure Python (sorta), which makes it more portable (sorta) but slower (sorta). Git is written in C, which makes it faster (almost always), but assumes a "reasonable" POSIXish OS environment, making it hard to port to Windows (solved now).

TortoiseHg is great on Windows. I have played a little with TortoiseGit - it appears to be as good as TortoiseHg on Windows, but that is a recent development.

Git has some killer features, starting with its near magical merging. As Linus points out, branching is easy - it is the merging that is difficult. Hg is on par with merging, so that isn't a big discriminator.

A unique feature of git is the "index" staging mechanism. This is very handy to pick out certain changes (with the "interactive" mode, even down to the line-in-a-file level) and check just those changes into your repository without having to either update everything as a lump or jump hoops to separate wanted and unwanted changes.
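As a rough illustration of the staging idea, here is a toy Python model. It is not git's implementation: `ToyIndexRepo` and its methods are invented names, and it stages whole files only, whereas git's interactive mode can go down to individual hunks. It shows the three states involved (last commit, index, working tree) and that a commit records only what was staged.

```python
class ToyIndexRepo:
    """Toy model of a staging area sitting between edits and commits."""

    def __init__(self, committed):
        self.head = dict(committed)      # last commit: path -> content
        self.index = dict(committed)     # what the next commit will contain
        self.worktree = dict(committed)  # files as currently edited

    def edit(self, path, text):
        self.worktree[path] = text

    def add(self, path):
        """Stage the current working-tree content of one path."""
        self.index[path] = self.worktree[path]

    def commit(self):
        """Record exactly what is staged; unstaged edits stay behind."""
        self.head = dict(self.index)
        return self.head

    def unstaged(self):
        """Paths whose working-tree content differs from the index."""
        return sorted(
            p for p in self.worktree if self.worktree[p] != self.index.get(p)
        )


repo = ToyIndexRepo({"parser.py": "v1", "notes.txt": "v1"})
repo.edit("parser.py", "v2 (finished fix)")
repo.edit("notes.txt", "v2 (half-done scribbles)")
repo.add("parser.py")  # stage only the finished change
snapshot = repo.commit()
print(snapshot, repo.unstaged())
```

The commit picks up the parser fix while the half-done notes remain as an uncommitted edit, with no need to stash or copy files aside.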

Another really nice feature is git's "interactive" mode which allows major patch rework/rearrangement - you can split, merge (squash), and reorder patches. This allows you to be anal-retentive on checking in your changes frequently and then rationalizing the forest of changes before pushing it upstream. (From what I've read, while it looks like you are "changing" history, in the bowels of the git metadata, all the state and sequence information is preserved.)
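The reorder and squash part of that workflow can be sketched in Python as a plan applied to a list of commits. This is a toy, not git's rebase machinery: `rewrite_history`, the plan format, and the sample commits are assumptions made for illustration, and splitting is left out.

```python
def rewrite_history(commits, plan):
    """Build a new commit list from a rebase-style plan.

    commits: list of (message, changes) pairs, oldest first.
    plan: list of (action, index) pairs; "pick" copies a commit into the
    new history, "squash" folds it into the previously emitted commit.
    """
    result = []
    for action, index in plan:
        message, changes = commits[index]
        if action == "pick":
            result.append((message, list(changes)))
        elif action == "squash":
            if not result:
                raise ValueError("nothing to squash into")
            prev_message, prev_changes = result[-1]
            result[-1] = (prev_message + " + " + message,
                          prev_changes + list(changes))
        else:
            raise ValueError("unknown action: " + action)
    return result


# Hypothetical messy local history, oldest first.
commits = [
    ("add parser", ["parser: first draft"]),
    ("wip", ["parser: handle empty input"]),
    ("fix typo in docs", ["docs: typo"]),
    ("more wip", ["parser: error messages"]),
]
# Move the docs fix first, then fold the two wip commits into the parser one.
plan = [("pick", 2), ("pick", 0), ("squash", 1), ("squash", 3)]
tidy = rewrite_history(commits, plan)
print([message for message, _ in tidy])
```

Four scrappy commits become two coherent ones, and the original list is left untouched, loosely mirroring the point that the old commits are not destroyed by the rewrite.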

Git's rebase is very nice and very useful if used with discretion. Again, it looks like Hg has a rebase plugin now.

Personally, I prefer git because it is the repository for the open source projects I'm involved in and because my perception is that it is more popular and growing faster. If your development is in the Python world, you probably will prefer Hg for the same reasons. ;-)

The really good news is that you can move from SVN to Hg or git easily and you can move between Hg and git bidirectionally with only the conversion time as a penalty without losing any information. This means that, even if you pick the "wrong" DVCS, you can "recover" easily.


Hg's branching isn't weak anymore. The functionality of git and hg has converged to the point that it's now more a matter of interface and community feel than of features.

This post nicely lays out the options current users of mercurial have: http://stevelosh.com/blog/entry/2009/8/30/a-guide-to-branchi...

I commonly use all of bookmarks, named branches, and anonymous branches to do in-place switching of my filesystem depending on what I'm working on.

I also gave a presentation on Mercurial last month that goes into much more detail on mercurial pros/cons: http://www.slideshare.net/tednaleid/mercurial-dvcs-presentat...


> Hg's branching isn't weak anymore.

And it was not weak at any time in the past, actually.


Mercurial isn't pure Python - a few hotspots (particularly diffing) are done in C. It seems to be easier to port, though. (Git has a thicker "Unix accent".)


I believe they provide pure alternatives to all of the C (or did anyway) so you can have a pure Python implementation if needed.

http://selenic.com/repo/hg/file/37679dbf2ee3/mercurial/pure


This is a good introduction to Git. Some of the pictures chosen to illustrate points are hilarious, like the smoking angel, or the RCS shack.


84MB download for a slideshow of CC-licensed, eye-catching but nevertheless off-topic images. Wow! Sometimes I really wish people would stick to that dull bullet-point style of presentation -- in at least half of their slides.


Actually, I think this style would have been pretty effective if I was watching the guy give his talk.


15:00 Zulu time: URL is down for me: Error: 500.

Will check later.


One thing I dislike about the presentation is it does not give credit where credit is due. BitMover wrote BitKeeper (I use it atm at work) which was the predecessor of mercurial and git. Naturally there were issues with being proprietary till it pissed off the right people causing them to create a new VCS based on BK. BitKeeper is... pretty damn slow compared to Git. But it was the innovation.

Also, if he mentioned ClearCase's performance it would be obvious that ClearCase is a hut made out of good intentions (we didn't even use sticks at that point).



