"We asked subscribers to our developer newsletter (788 respondents) and professional developers via StackOverflow (169 respondents)."
With all due respect to the FogCreek team, I'm not sure these numbers could be considered a representation of the development community in general. FogCreek makes Kiln, which according to Joel's (FogCreek CEO) blog is "a web-based version control and code review system based on Mercurial and tightly integrated with FogBugz." I know Joel's writing has broad appeal (I know I'm a fan), but it would stand to reason that there would be a disproportionately high number of Hg users in these results, would it not?
We compared the numbers we got from our newsletter survey and those we got from Stack Overflow and they were actually much closer than even we expected. Mercurial gets a slight bump in our newsletter responses, but SVN still dominates across the board.
And if you look at questions on Stack Overflow you've got:
532 tagged [Git]
119 tagged [Mercurial]
Just to give you a (unscientific) sense of the bias in the Stack Overflow community.
I think this metric is not really useful: maybe Git is just harder to use. (No Git-vs-Mercurial flamewar please; that's just my first thought on using question-tag counts as an unscientific metric.)
The problem is the bias inherent in the sample. The Joel on Software development community, Kiln mailing list, and StackOverflow all have something in common - they were started by Joel, and their initial userbase comes largely from followers of his blog.
I wonder what the results would've been had GitHub polled their userbase? Or if you asked a random sample of Googlers (hint: 100% would say "Perforce professionally", and then a good chunk would say "Git personally")? Or if Fortune magazine had had CIOs poll the individual employees of their firms (I bet you'd see a whole lot more Perforce, and a fair bit of "What's version control?")?
Yeah, not really useful. The number of questions needs to be normalized by number and type of people using it, though, for it to be a halfway useful indicator of difficulty. Even then, the amount of information available about each elsewhere on the web will change whether someone needs to ask a question about it.
Amen. Whenever I read a joel-on-software or FogCreek blog post, I feel like I'm being exposed to a clever advertisement, where the clever factor decreases linearly over time. This isn't the first time I've seen appeal-to-authority arguments, bad statistics, or other dubious rhetoric there.
And you'll be coming across Mercurial increasingly often, since it was adopted by Mozilla four years ago and by Google in 2009, and is preferred by Windows, Python, and Django developers. (89% of our respondents have heard of GitHub, and 62% have heard of its Mercurial sibling BitBucket.)
"Adopted by Google" here seems a bit disingenuous; they offer Mercurial as an option for developers that want to host their OSS projects with Google, it isn't as if Google's in-house source control has moved to Mercurial.
I've heard that the server it runs on is the most powerful machine Google owns. Perforce can't be distributed and has to keep track of all the files that are opened by every user (p4 edit). Must be quite a beast.
When I last used Perforce about five years ago, it did allow for read-only proxies to help reduce the load on the main server, particularly in environments where the developers were geographically dispersed. It worked quite well.
And when I was there a few years ago, it was also the slowest service with the most downtime. grumblegrumble* I hope that they have addressed that by now.
Nope, it's still basically that way. Well, Mondrian seems to have more downtime, but a lot of Mondrian outages are really Perforce outages that propagate down the stack.
> 70% of programmers today are Windows based, 16% use Ubuntu/CentOS or other Linux, and 14% use Mac OS
Is this a survey skewed towards Microsoft-oriented people, or is this normal for the developer industry? (Hard to say when you are in a bubble/niche of web startups.)
This is probably skewed. We did our best to make sure we were getting a wide swath of respondents by also running the survey on StackOverflow, but even there, our choice of which tags to advertise the survey with probably biased the results. I'd take the OS distribution not as an indication of which OSes developers in general develop for, but rather just as information on which percentage of respondents were on which OS. I.e., it tells you about the survey, not about developers in general.
For some time, when Apple had its goofy NDA on the iPhone SDK when it was first released, Stackoverflow was probably the best site to get CocoaTouch related help. It's still a pretty decent resource for iPhone development.
I'm not as convinced this is terribly skewed away from the normal. Outside the Bay Area, the large organizations that hire thousands of programmers primarily use Windows for development. For example, go to any Defense Contractor (Lockheed, Northrop, etc) in the country and sample the computers they use for dev work. Each developer probably has a Windows box they do everything on. There might be some Linux boxes mixed in, but those will probably be headless. Or the developer runs Linux in a VM on top of Windows.
While in the Bay Area, we might think these are skewed because everywhere we go we see Mac laptops, in reality, the employees at big corporations are going to outnumber us.
I suspect that the survey is skewed in other directions when it comes to large corporations, though. I would expect to see much more Perforce if the big not-primarily-computer-companies were included, because for many years Perforce was pretty much the only game in town for large, multi-million-line codebases. I still can't imagine Lockheed running their sourcebase off Subversion or VSS; even when I interned at a mid-sized (~100 person) government contractor, we used Perforce.
I think the survey is representative of the geographically-distributed, Joel-on-Software reading, micro-ISV crowd, and not much else. Micro-ISVs often develop on Windows (because that's what their founders are used to) and use Subversion (because it's fairly easy to setup and generally adequate for 1-10 devs). I suspect you'd get very different numbers at YCombinator or Google or Lockheed, though.
Another mega-corp data point: 15,000 Engineers plus everyone else uses desktop/laptop windows XP. Linux is used on the supercomputing clusters and on test equipment in the factory. IT controls the desktop domains, while the engineers control the clusters and test equipment.
I can confirm this (working at mega-corp) and can also add that a lot of specialized and embedded tools are primarily hosted on Windows. Even high-end CAD and modelling tools are primarily on Windows where I've worked.
High-end CAD tools used to be run almost entirely on Unix systems (Solaris, SGI, HP/UX, etc.). Back in the 1990s Windows systems couldn't handle these tools the way Unix workstations could, but once that changed there was no reason to spend far more money on a workstation. Perhaps if Linux had become popular a few years sooner the transition in high-end CAD would've been from proprietary Unix to Linux. See for instance http://www.ptc.com/partners/hardware/current/support.htm where support for various *nix are being phased out by Parametric Technology, Linux included.
Yeah, I remember those days well, we had Unix boxes for CAD and 386s with Win3 for office work side-by-side. I think you're right - the transition to Linux would have been an obvious choice if the timing had been better.
If your time is cheap and your IT department is free, Windows on a blowout-priced Dell is a bargain.
If there's one thing that's killing corporations it's this narrow focus on acquisition costs and a near total disregard for support costs. People moan about Apple being "too expensive" all the time, but the reality is most computers are too cheap.
I hope one of the major vendors develops some alternative to Windows. HP is shovelling billions into Microsoft's pockets with nothing to show for it.
In my experience, with modern versions of windows, support is a wash for technical users.
I used a windows 2k3/2k8 machine every day for 3.5 years at the same company developing software, and never had a support issue with it. On the other hand, I solely use Macs at home (and, since leaving the aforementioned job to work on my own company), and have also never had a support issue.
However, my grandmother has recently gotten ahold of a vista laptop and a MacBook laptop. She has really shown a greater affinity to the Mac.
Here at HP we're using our own equipment on a fairly large installed base, so we actually have a reasonably good feel for relative support costs.
Also, we have an internally supported and externally available standard Linux environment (LinuxCOE) which engineers (like myself), who run multiple OSes in their cube, can use.
Speaking for myself only here, I would say that the alternatives are available but we can't really ignore what the market demands.
Windows is also a "bargain" when it comes to IT costs. Windows has an excellent track record of strong enterprise support, whereas I have heard from an IT dept that there are all sorts of headaches (and compatibility-breaking changes) to Mac operating systems.
Oddly enough, HP is the major vendor that most appears to actually be planning an alternative to Windows. They've already said they plan on bringing webOS to PCs.
Never underestimate a company's reluctance to change. Usually, the bigger they are (or the more clueless the decision-makers are), the greater the reluctance.
Actually, if you have thousands or tens of thousands of desktops and hundreds of applications, something as simple as upgrading Windows (or even Office) becomes quite a serious undertaking, so unless there is a really good reason to do so, it's understandable that organizations don't upgrade just for the sake of the latest shiny bit of tech.
This is largely a symptom of Windows itself, not of upgrades in general. It's much simpler to push updates to all users from a self-configured apt repository than it is to push updates to Windows clients. Upgrading systems with good package managers isn't a nightmare at all.
When I was a sysadmin, I could push my updates via Group Policy. It's not that bad. (I'm now in development so I don't know the Microsoft current recommended method).
The cost of an upgrade is the daunting part IMO. We need to buy a bazillion CALs (licenses) if we want to switch to Windows 2008 server-side, for example.
I don't do Windows (not necessarily by ideology, I just don't do it anywhere I'd learn about IT issues), so this is an honest question: Assuming a decent IT policy that closes off the worst of the security issues in other ways, and which won't change even if you upgrade, and assuming you're not a Windows development shop, what is the big business advantage of moving your company from XP to Windows 7?
1) System stability. If your video driver dies on XP, hello crash. On Win7 the screen goes blank for about a quarter of a second and then redraws itself, and you're back in business.
2) If you work remotely, RDP is way better. I've even streamed video over RDP and forgot that I was streaming it from a remotely located computer.
3) If using SSDs, Win7 will have better performance and a longer lifetime.
4) Virtual folders.
But I would say that if your machines are completely locked down, for example if all the XP machines are on a closed internal network with no internet access, then XP might be doable. Once you touch the internet, though, all bets are off.
We've had to wait longer for cheap computers from places like Dell if ordered with XP than with win7. If you're in a hurry and just need a basic desktop for someone, it's something to consider. At least that's how we ended up with a mix of 2k, XP, and win7.
IMO, Laptops. Supporting XP on recent laptops is an exercise in frustration due to drivers, and in my situation is compounded by the lack of standardization on a small number of models.
At my day job we're making the move from XP to Windows 7 because the large number of cumulative patches to XP have, over time, caused all of the machines to run really really slow. Our ICT department has a really cute name for this OS upgrade project: "CPR" (for "Client Platform Refresh").
My guess is that a lot of that 70% represents the OS these people use professionally at work. I'd be more interested in seeing the split between OS used professionally and OS used personally.
In my neck of the woods it is probably higher, more like 85-90% Microsoft. Programmers use Windows in enterprise, banking, insurance, old media, etc. Startups and web dev are on other platforms, but they are in the minority.
I've used Windows at work for my last two gigs. At home I use OSX and Linux (both desktop and server). My wife and kids use Windows (what can I say? She's the boss).
Personally I've all but given up posting on Slashdot and reading the discussions since their latest redesign a few weeks ago. I was sad to do so (I first hit the karma cap back when you could still see the numerical value, for those who want to judge my "Slashdot age") but it's just too tedious to navigate the comments and too buggy to interact at all now. I still find it a useful source of general geek interest articles, though.
I've been posting more on HN recently, but I find HN links and discussions tend to fall sharply into three categories: general technical/business discussions between well-informed people, material about YC start-ups or other businesses that have no particular relevance to me, and flamefests that are full of people who think they know everything and clearly don't know much about anything. The first group is enough to justify my coming here. I'm trying to be better at ignoring the second and not feeding the trolls in the third.
Reddit is a curious middle-ground, because of the subreddit system. I follow a few subreddits that generally have high quality content and one or two that are mostly lighthearted, and I ignore the rest, including most of the main ones.
TechCrunch sometimes carries interesting news, but as with HN, a lot of it isn't particularly relevant to me and a significant amount is just the personal opinion/ego of whoever got to write on TechCrunch this week.
Given this balance, it doesn't surprise me to see the relative popularity of the sites shown as it is in the infographic.
Given that 70% of the respondents are also using Windows, no. But it's a good thing, really. They're weeks behind with the latest news and decades behind with their operating systems.. we can't fail! :-)
Not really; I think it's still the most popular, at least from my personal experience, and mostly due to its age. Reddit is slowly gaining, but I know many people who've never heard of it. Digg, well, I don't know if it still has fans anymore. HN hopefully will never outrank Slashdot.
This is from an unscientific survey of my friends and coworkers.
Did anyone else notice that the chart, "Do People Love or Hate their Version Control System?" was horribly misleading? CVS gets 3.5 hearts, and SVN gets nearly 4. Meanwhile, far below on the page: "Only 11% of Subversion users said that they loved using it; the number was zero for CVS and VSS."
I assume the survey had options like love / like / tolerate / dislike / loathe-with-the-fury-of-a-thousand-exploding-suns and the heart graphic was derived from some combination of those rather than just how many chose the "love" option. "Love or like", perhaps. [EDITED to add: actually, more likely an average score, like the ratings on Amazon.]
They state in the last paragraph that Ubuntu+CentOS+Other Linux OS market share in the survey was 16% which puts it ahead of OSX but they chose to separate Other Linux out in the chart so visually it doesn't seem as significant.
Maybe I'm used to always reading Linux market share at 1% so I want to see that big circle even if it is from a non-representational group.
Personally I use github for projects I want to link to on my resume and bitbucket for websites and stuff I don't want to share with the world (yet). Maybe it's because I'm a novice user who only does the basic commands, but I see no real difference between git and mercurial.
I use TFS in my professional job and am pretty surprised that so many people like it. Everyone I work with either hates it or lives with it. Doing stuff like moving changes from one branch to another is unreasonably hard.
Scott Chacon (author of Pro Git, Git evangelist for Github) has said in his Changelog interview that Git and Mercurial are in fact extremely similar. He wrote some kind of tool for letting the two work together, so he had to learn the internals of both to do so. His take was, use either one. Just make sure you use distributed version control.
I work with TFS (migrating to it and some things on it already) and I hate working with it.
Every single thing it does is a little bit worse than the tools it's replacing, in my case.
The out-of-the-box diff sucks, it's slow and buggy, Get Latest doesn't, commits sometimes just fail to commit, the CI builds fail for random reasons, and the ticketing system only lets you remove time from tasks (not track it).
It's stupid that you have to use Visual Studio, the shell extensions AND the command line just to make it work correctly.
Anyone considering moving to TFS had better seriously consider their needs before doing it. If you are purely an MS shop, are prepared to spend days fighting the tool (or are happy to go with the defaults), and have big beefy servers sitting around doing nothing, then consider TFS.
Otherwise go with SVN, Jira, FishEye, and Crucible. Trust me, it will be better and cheaper.
For each VCS, you had to pick between love, like, meh, bleh, and ptooey (I don't remember the exact word options). The infographic attempts to just merge those into degree-of-like, which I believe was done by assigning weights to passion and then averaging.
I think you missed the point. The graphic doesn't effectively communicate the underlying data. I was also thrown by the graphic and words. 3 hearts/stars/whatever to me means average. Zero people saying they love a system to me indicates less than average. I don't fault you for trying to communicate it the way you did, it makes sense from one perspective, but from a user standpoint it's confusing when combined with the text.
> Only 11% of Subversion users said that they loved using it; the number was zero for CVS and VSS. Compare that to the over 40% of Mercurial and Git users that love using them!
Is it just me, or does this not reconcile with the relevant portion of the graphic?
Same with the "47% of developers use it as their primary version control system at work." Shouldn't that be more like ~20%, given that the two pie charts for Win/Mac showed ~40 and ~15?
HgInit.com is the site that got me into Mercurial. I now use it for all my personal projects. One unintended side-effect, though: the pain of working with SVN at my day job became more obvious.
I find it a bit surprising (if not hard to believe) that hg is more common than git in professional use. That said, I don't find it hard to believe that hg users have more love for their tool than git users.. Before becoming an avid git user I tried both and found hg's commands much more lucid.
Git support on Windows was really poor for some time (I gather it has improved though), whereas Mercurial pretty much always treated Windows as a first-class citizen. That may have something to do with it.
On the other hand, I do find it hard to believe that hg users love their tools more. Generally my impression has been that git attracts a lot of people who are, shall we say, passionate about their particular version control system.
Ahh, the Windows bit makes perfect sense. I wasn't aware (not a Windows user). Excellent point about Git's community also. As obtuse as the commands can be (really? "git push origin :branch_name" DELETES a remote branch?), I still love it. Also, the relatively high number of people who claim to love CVS in the survey suggests the bar for satisfaction might not be very high here :|
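For anyone who hasn't hit it: the empty source side of the refspec is what makes that push a delete. Here's a throwaway demo (repo and branch names made up, built entirely in a temp directory):

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"
G="git -c user.name=demo -c user.email=demo@example.com"

git init -q --bare remote.git
git clone -q remote.git work; cd work
b=$(git symbolic-ref --short HEAD)   # default branch name varies by git version
$G commit -q --allow-empty -m base
git push -q origin "$b"

git branch topic
git push -q origin topic             # remote now has refs/heads/topic
git push -q origin :topic            # empty source side of the refspec: deletes it
```

Read "src:dst" as "push src to dst"; pushing nothing to a ref removes it on the remote.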
Because the momentum behind git is huge. Like I said, I preferred hg at first glance and still ended up a git user because most of the people I wanted to work with were on-board and github is awesome.
The momentum behind git is huge in a small community that thinks it is much more important than it really is.
Git's UI has always been very poor. Moreover, on Windows, merely installing a basic Git client requires jumping through silly hoops: if I wanted to run Linux on those PCs, I would be running Linux, after all.
For most developers, even those who are quite happy using CLIs in general and running on UNIXy platforms, having a UI that sucks is a serious disadvantage. Outside of those people working on major OSS projects like Linux where Git is the standard and those who like to use GitHub, Git has few compelling advantages over the other serious DVCSes to make up for their much better usability.
Also, just as an aside, Git is the only DVCS that has ever screwed up a project I was controlling with it due to a data loss bug. That puts it in a class with... well, only SourceSafe, really... in terms of how much I trust it to keep my code safe.
The momentum behind git is huge in a small community that thinks it is much more important than it really is.
Hmmmm. Momentum is a moving target. I'd be blown away if there was another source control tool in history whose userbase has grown as quickly as Git's has in the last few years. As for important, I don't really think any of my tools are 'important'. They just let me get important shit done.
Also, I don't even know what "Git's UI" is, but I'm sure you're right about it sucking. I do take for granted my level of comfort using CLI for source control.
I'd be curious about the details of your data loss bug. Most Git data loss is user error (though I'll admit the ridiculously obtuse commands and concepts make such errors very easy to commit when learning the tool).
> Also, I don't even know what "Git's UI" is, but I'm sure you're right about it sucking.
FWIW, I meant the CLI. I have rarely seen an interface that manages to make a relatively simple idea seem so complicated.
> I'd be curious about the details of your data loss bug.
It must have been a couple of years ago, so I can't remember the exact details and I imagine it's been fixed by now anyway. Basically, there was a problem (duly documented in their bug tracker; it wasn't user error) where Git would refuse to update one copy of a repository from another properly. If memory serves, it was related to switching between branches in some way.
It's possible that I was being unfair in calling it a data loss bug in Git, because I have a vague memory that they determined the remote repository itself wasn't corrupted if you knew how to rescue it. However, the effect was that the local working copy on my development machine didn't have the data in it that it should have, and ultimately I don't really care why my data isn't there, only that it isn't.
The article mentions Kiln (http://www.fogcreek.com/kiln/) as a code review tool. What other code review/collaboration tools are people using with Git and/or Mercurial?
Yes, Git is way more hassle than it should be. I hate that it depends on Cygwin; msysGit is a great start but still feels a little alien.
I disagree with the blanket statement about Mac users. Developers using Mac are not Mac users by a long shot. For most of us I know it is a case of the Mac being a better *nix dev box than the alternatives.
I agree that many developers on Macs aren't traditional Mac users, but disagree that OS X is a better Unix. I'd say it's an ok-to-good Unix, but one where sound, copy and paste, and wireless networking all work, and it's these that compensate for other shortcomings compared to Linux or FreeBSD.
I switched to Mercurial about a year ago. Before that, I used Subversion, and I used it only through the GUI tools.
But when switching to Mercurial, the GUI tools weren't as good, so I learned how to use it via the command line. And, to my surprise, I found that it's much easier to use the command line than the GUI tools. Everything goes much faster.
So if you're a Windows dev, like me, and have never tried solely using the command-line tools before, I suggest you give it a shot. You may be surprised like I was.
TortoiseGit improved by leaps and bounds in the last 18 or so months. It's quite feature-complete. I suggest checking it out even if you've been bitten by it in the past.
I find it way easier to visualize a diff in a GUI. That's primarily what GitX brings to the party. For almost any feature other than git log and git commit, you still need to hit the CLI.
(I DO think this means there's a great deal of potential profit available to someone who releases a GREAT git gui for OS X)
You probably aren't our target market (people new to Git), but we are working on a Git UI for OS X aimed at making the most common commands easy to perform. http://gitmacapp.com/download
My VCS of choice has Qt-based GUI flavours of almost all commands on windows, OS X and Linux. The only ones I ever use are annotate, log, diff and occasionally commit.
I see that that wasn't clearly written. What I meant was "I don't consider a CLI to be a legitimate interface (in general)". Obviously Git's is "legit" because, as you say, that's how it was designed. But I'm against such designs.
You can't use Git completely without a GUI [1], and not having everything in a GUI means I'm forever typing some command and then typing another command (or several) to make sure what I expected actually happened. To "visualize", if you will. In a GUI I could just see it, and this would save a lot of time.
[1] Diffs are visual (vi is a GUI). I'm sure some clown will show how it's possible to use ed or something so you can really do everything without a GUI, but the effort that will take demonstrates my point nicely.
I keep seeing posts about large organisations (off the top of my head, Python and Mozilla) considering bzr, and then discounting it.
This mostly seems to be because of its performance.
Having used git and bzr extensively (and a bit of Mercurial), I've found its performance to be perfectly fine for general use on reasonably sized projects, and its ease of use to be far superior. Figuring out how to get Bazaar to do something you haven't done before is much easier than trying to track down a Git feature. Although I will admit that Git is improving in this regard, while Bazaar continues to improve its performance and focus on interoperability.
Anyway, in summary: I really don't get why more people aren't using Bazaar.
If the survey was done properly, then 'other' consists of lots of different things and bzr is just a small chunk of it not worth dealing with separately.
If the survey was done badly, with an 'other' option with no way to write in what you use, then potentially all of the 'other' block could be bzr. It would just be left out by the survey writer not having heard of it.
Not to nitpick, but the "Most Used Operating System Professionally" portion misrepresents the data. The area should be proportional to the values, not the radius. Notice that the Windows XP circle is over 4X as big as the Mac OS one, even though it's only twice the value.
I'm becoming less enamored with git over time. I've used it for years (by choice, and I'm a GitHub user), but have never been a power user.
I'm annoyed at how the very most basic workflows in Git seem awkward, like I'm working against the tool instead of with it. The simplest example is: I have a hacked-up tree, but I know changes have been made upstream, and I want to pull those changes:
It's complaining because I've modified README locally, which was also modified remotely. Every other reasonable version control system I've ever used will happily merge the upstream changes with my not-yet-committed local changes. But Git refuses. This is annoying.
What I have to do now is commit my hacked-up, non-compiling, possibly-swear-word-containing in-progress changes. I really dislike this. To me, "commit" means "take a finished bit of work and add it to the global history." I really dislike having to commit something that is extremely unfinished just because I wanted to integrate some upstream changes.
So what I usually do in this situation is "git stash", "git pull", "git stash apply." This works ok for the "pull" case. But what if I have multiple sets of locally-hacked-up changes? Like suppose I was working on one change when I realized that there's something else I should really fix first. "git stash" quickly becomes limiting, since you can't name the individual changes, so you get this list of changes that you don't know what they are or what branch/commit they were based on. In other VCS's like Perforce, you can have multiple sets of independent changes in your working tree. Not possible with Git AFAIK.
Anyway, I'll probably keep using Git, but I'm not as enamored with it as I once was. I used to figure this was all just porcelain issues that would be refined over time, but it doesn't seem to be getting any better.
It's complaining because I've modified README locally, which was also modified remotely. Every other reasonable version control system I've ever used will happily merge the upstream changes with my not-yet-committed local changes. But Git refuses. This is annoying.
Commit-before-merge is not a limitation, it is a feature of DVCS's. Merge-before-commit is broken by design, as you are modifying your unsaved work, and there is no way to get back to your pre-merge state (say you decide the merge conflicts are too much to deal with at the moment).
So you are correct, git will not let you merge if you have uncommitted work that would be affected by the merge, but it's really just trying to keep you from losing work, not trying to annoy you.
That said, this is one of the use cases for 'git stash', which will set aside your uncommitted work. You can then do the 'git pull' and then unstash ('git stash pop') your work. The advantage of this is that you can always undo the merge if you decide it's not what you wanted after all. With merge-before-commit, you'd have no such option unless you manually set aside your work.
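As a minimal sketch of that stash / pull / pop sequence, here it is end to end in a pair of throwaway repositories (all names made up for the demo):

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"
G="git -c user.name=demo -c user.email=demo@example.com"

git init -q upstream; cd upstream
echo v1 > README; git add README; $G commit -q -m v1
cd ..

git clone -q upstream work; cd work
echo hacked >> README                # uncommitted local edit

cd ../upstream                       # meanwhile, upstream moves on
echo news > NEWS; git add NEWS; $G commit -q -m news
cd ../work

$G stash -q                          # set the dirty tree aside
git pull -q --ff-only                # fast-forward onto the new upstream
git stash pop -q                     # reapply the local edit on top
```

Afterwards the working copy has both the upstream NEWS file and the still-uncommitted README edit; if the pop had conflicted, the stash would remain intact to retry.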
But what if I have multiple sets of locally-hacked-up changes? Like suppose I was working on one change when I realized that there's something else I should really fix first. "git stash" quickly becomes limiting, since you can't name the individual changes, so you get this list of changes that you don't know what they are or what branch/commit they were based on. In other VCS's like Perforce, you can have multiple sets of independent changes in your working tree. Not possible with Git AFAIK.
This is what branches are for. Do not be afraid to commit work in progress... you can always polish up that work before you share those changes. For example:
git checkout -b feature origin/master
edit, uh oh, interruption,
git commit -a -m WIP
git checkout -b bugfix-1234 origin/master
fix bug, git commit -m "fixed bug"
git push origin HEAD:master
git checkout feature
git reset HEAD^ # removes the WIP commit, but leaves its changes in your working copy.
I hope that gives you a better idea of how you can use branches. You might also want to spend some time reading up on rebase -i. Basically, start thinking of your local branches as independent patch queues that you can freely edit, reorder, etc. until they are ready to be shared. At which point you can push them out to the world.
> it's really just trying to keep you from losing work, not trying to annoy you.
The same thing could be achieved by having Git automatically create an "undo" commit before performing the merge. Then you could revert to pre-merge state with "git pull --undo", just like you can abort a rebase with "git rebase --abort."
Optimize for the common case. Of probably hundreds of merge-before-commit operations I have performed with other VCS's, I can't think of a single time I have wanted to undo this operation (after all, if upstream has changed you're going to have to merge sooner or later -- it might as well be now). On the other hand, I am annoyed by commit-before-merge every single time I perform a pull.
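For what it's worth, git already records the pre-merge state: after a merge, ORIG_HEAD points at the commit you were on beforehand, so undoing a merge after the fact is a one-liner. A throwaway sketch (branch and file names made up):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo
cd repo
git config user.email you@example.com
git config user.name You
echo one > a.txt && git add a.txt && git commit -qm initial
main=$(git symbolic-ref --short HEAD)   # master or main, depending on git defaults

# A topic branch, plus some divergent mainline work.
git checkout -q -b topic
echo topic > b.txt && git add b.txt && git commit -qm "topic work"
git checkout -q "$main"
echo two >> a.txt && git commit -qam "mainline work"

# Merge the mainline into the topic... then change our mind.
git checkout -q topic
git merge -q -m "merge mainline" "$main"   # the merge we regret
git reset -q --hard ORIG_HEAD              # back to pre-merge "topic work"
git log --oneline
```

After the reset, the branch is exactly as it was before the merge: two commits, and a.txt untouched by the mainline change.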
> This is what branches are for. Do not be afraid to commit work in progress...
I'll have to try the "git reset" approach, I hadn't thought of that before. But even that has issues IMO:
1. "git reset" is a data-losing operation if you call it with certain parameters. For example, "git reset --hard HEAD^" would throw away the WIP! I'm wary of making such a command something I type all the time, because there's always the risk that I'll call it wrong.
2. I have to do this dance of "commit, checkout, pull, checkout, rebase" just to pull upstream changes. And when I want to actually push the change upstream, I have to "merge --squash" and then delete the old branch (otherwise lots of branches will build up and I won't know which ones have been committed and which haven't). It's a lot of annoying overhead. I'd rather just work on master where I can just "pull", and branch only if I really want to work on two big changes in parallel.
> HTH.
I really do appreciate that you were genuinely trying to be helpful (as opposed to other replies). But my frustration remains that Git doesn't let me work the way I want to, and makes me perform contortions to fit its way of working.
The same thing could be achieved by having Git automatically create an "undo" commit before performing the merge. Then you could revert to pre-merge state with "git pull --undo",
I'm a regular on the git mailing list, and I've never heard of anyone desiring this workflow. It is unusual to want to merge into your work-in-progress. I'd go so far as to say "you're not using git as it was intended". Perhaps this helps a little:
But really, that's not how git is intended to be used.
just like you can abort a rebase with "git rebase --abort."
Actually, rebase is much stricter than merge -- it won't let you start unless your working tree is completely clean. At least merge only cares about whether the files it needs to touch are clean.
Optimize for the common case.
That's not the common case, you've just been brain-damaged by non-DVCS's into thinking it is. :-)
"git reset" is a data-losing operation if you call it with certain parameters. For example, "git reset --hard HEAD^" would throw away the WIP
git config --global alias.popcommit "reset HEAD^"
2. I have to do this dance of "commit, checkout, pull, checkout, rebase" just to pull upstream changes. And when I want to actually push the change upstream, I have to "merge --squash" and then delete the old branch (otherwise lots of branches will build up and I won't know which ones have been committed and which haven't). It's a lot of annoying overhead. I'd rather just work on master where I can just "pull", and branch only if I really want to work on two big changes in parallel.
"merge --squash"? It really sounds like you're trying to use git as if it's subversion or cvs, and it just isn't.
You can check for merged branches with "git branch --merged origin/master"
If you just want to examine upstream changes w/o integrating them into your current work, you can use "git fetch" and then "git log master..origin/master".
And if you find you often need to be working on multiple branches at the same time, you can always make an additional clone and/or use the new-workdir script in the git.git contrib directory.
Perhaps git just doesn't fit your notion of how a VCS should work, and that's fine. But your annoyances with git seem to stem from trying to use it not as it was intended. :-(
<tangent>Git is a powerful VCS with a rather-awful CLI, but built on top of simple and elegant concepts. Trying to derive a mental-model of how git works from its CLI is fraught-with-peril and will lead you astray. It is worth learning how git works conceptually, and then mapping CLI commands to those concepts. For small projects, it doesn't really matter, but for large projects, git is extremely flexible and you can do things with it that I cannot imagine doing with any other VCS.</tangent>
It does not let you merge if you have uncommitted work, just like git:
$ hg pull
pulling from ...
requesting all changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 1 changes to 1 files
(run 'hg update' to get a working copy)
$ hg update
abort: crosses branches (use 'hg merge' to merge or use 'hg update -C' to discard changes)
$ hg merge
abort: outstanding uncommitted changes (use 'hg status' to list changes)
Sorry, you're right that it does refuse to merge with the working copy if the changeset you're updating to is not a direct descendant of the changeset you're currently at (this is what the error about "crosses branches" refers to).
However, if it is a direct descendant, it will try to merge for you:
$ echo b >> a
$ hg status
M a
$ hg pull ../a
pulling from ../a
searching for changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 1 changes to 1 files
(run 'hg update' to get a working copy)
$ hg update
merging a
warning: conflicts during merge.
merging a failed!
0 files updated, 0 files merged, 0 files removed, 1 files unresolved
use 'hg resolve' to retry unresolved file merges
$ hg resolve -l
U a
In this example, the upstream repository made a change to "a" that conflicts with my local, uncommitted change to the same file. It uses its normal merge machinery to try to resolve the conflict.
As far as I know, Git doesn't allow merges in this situation.
> It's complaining because I've modified README locally, which was also modified remotely. Every other reasonable version control system I've ever used will happily merge the upstream changes with my not-yet-committed local changes. But Git refuses. This is annoying.
No it isn't. Git is trying to be predictable.
> Like suppose I was working on one change when I realized that there's something else I should really fix first.
Branches are designed to resolve this problem. When you want to do something that takes a lot of time, you branch from the stable version. When you need to do something else, you branch from the stable version again.
You are completely ignoring my objection that I don't want to "commit" a tree that is totally broken. What should my commit message be?
"Tree is totally broken and doesn't compile, but Git made me do this to pull upstream changes."
There is nothing logical/reasonable for me to write in that commit message. I definitely don't want that commit to make it upstream. Sure, I could squash later, but why is Git forcing me to do something that doesn't have any value (write a "commit" message for a tree that is totally broken)?
Also, I say that something is annoying to me, and you say "No it isn't." How can you deny that something is annoying to me? Ignoring what a user actually wants because it's not what you think they should want is the classic hacker UI design failing.
It seems that what you're really hitting is the, erhm, ill-advised decision to name most of git's most common actions the same as actions in subversion, but make them do significantly different things. Git's "commit" should probably have been named "mark" or whatever because "push" is what's actually comparable to a subversion commit. I've mostly conditioned myself by now, but I also recall some fun with checkout.
Whether it's called "commit" or "mark", I have to write a commit message, which I don't want to do unless my tree has reached a state where I have actually accomplished something. The commit becomes part of the history, which will be visible upstream unless I squash later, and I definitely don't want upstream to see a series of commits where my tree is completely broken.
The problem here seems to be that you're working against git, not with it. Committing and branching should be things that you do casually in git, by making a big deal out of them you're limiting yourself and crippling git.
If you try to use a screwdriver like you use a hammer, you're always going to be disappointed.
Personally I write crappy commit messages for all those commits I'm going to squash before publishing. git is guilty, but only of giving you too much rope and not enough direction.
Don't pull. Pull is two operations -- fetching the upstream work, and merging it in to your current branch. It's true that CVS and SVN both let you do this with dirty trees, but it's generally not what you should really be doing. Merges are hard and every update from "upstream" merged into your local changes has a chance of breaking your local changes.
"git fetch" and "git remote update" both let you do the fetch without the merge. Once you have the updates, you can then decide what to do with them, which may have lots more options than just a simple implicit merge that most tools provide: rebase, overwrite, ignore for now and handle later when your work is in a cleaner state, etc.
Git has tools to manage this in the case of long running lines of development. Providing them for uncommitted changes is a much harder task (precisely because you can't name your work and roll back to it), and mostly unnecessary because commits are so lightweight. This is your fundamental issue with git -- commits still feel heavyweight to you.
Git's model really is fundamentally not "get everybody's local changes working with the exact same upstream code" (indeed for the upstream author, they may trust their code far more than the code they are pulling). Instead it is closer to "let everybody pick what changes and history to use in order to create their own coherent source". DVCS means each developer chooses what history to treat as authoritative. As such, temporary commits, temporary branches, and rewriting unshared history are encouraged.
No Darcs? Personally I prefer darcs over the other distributed version control systems. Unfortunately its tooling is somewhat behind, and due to the GitHub craze we're forced to use git on many projects. It's sad, as the ease, the features, and the simplicity of darcs are in my opinion unparalleled by any of the others listed.
The last time I looked at darcs -- which, admittedly, was years ago -- there was a problem where, under certain poorly understood circumstances that tended to apply to most people with large repositories, performance would fall off a cliff, meaning IIRC that from then on, every merge could take hours. Or years -- I think it went exponential.
The Wikipedia page about darcs says "Although the issue was not completely corrected in Darcs 2, exponential merges have been minimized."
I wonder what "minimized" means here. On the face of it, this seems like a pretty compelling reason not to use darcs for anything important.
I haven't experienced that issue in years. I do remember the issues it had before, and thankfully they seem to have been addressed. I don't know how up to date the Wikipedia entry is, but to the best of my knowledge this issue has been resolved, especially if you use darcs2 formatted repos.
I love Darcs conceptually and I used it for a few years but after moving to Git I have to say I prefer it. Darcs is certainly cleaner than how Git is actually implemented (scripts and perl everywhere) but Git is nicer to work with day to day. The reason is that in Darcs to make a branch you have to basically check out the tree again. In Git I never need more than my repo directory. All branch checkouts, etc. happen right in that directory. That may sound like a trivial difference but in practice this can save hours a week for the way I work. Git can do most of the cherry picking, etc. that Darcs can do.
I do miss the ability to check in changes that I only ever want to have local though. I haven't found a satisfactory way to do that in Git yet.
I agree about the convenience of having multiple branches in a single repo, as opposed to the darcs way which is one working directory per branch. For one thing, I never found a convenient way to switch between branches in an IDE when using darcs. For another, having a separate working copy for each branch consumes disk space unnecessarily. (Hopefully with the popularity of SSDs these days, people are less likely to reflexively respond "disk is cheap," but just in case: with git, all branches of my current project fit on my USB stick; with darcs, they wouldn't. It also makes a big difference for copying times.) darcs repos can be deduplicated using hard links, but that only affects the patch history--the working copy is still duplicated.
[edit: removed question that was answered by sibling post]
Yep, this is one of my biggest issues with git. I can't fork builds and still push/pull from each other. The best you can do is a git rebase to pull in partials, but then you'll never really be able to get new pulls. It's so nice being able to fork, and have independent deviations, while still sharing the remaining code.
That's 63% that don't do code reviews. I think the question is put together oddly, because I thought that as well at first, but later in the description they do talk about code reviews.
CVCSes work OK. But they all miss a critical piece. They aren't nearly as good at doing one of the main things that a version control system can let you do: branching and experimenting with your code
I hear this complaint a lot, but I do this all the time with SVN. Just have a directory (tempbranches) where you create your branches. It's probably something I do once a week (and not because I feel limited to only doing it once per week, but most of my work occurs in my local branch -- and I branch that one only when I want to do something that is more experimental, but will take a few days). I'll grant that it's not as clean as a DVCS for this, but it works just as easily. The big difference is that we now have centralized accounting of this action, rather than it being distributed.
Um, are you sure? I find branching and merging in git (after three months using it) much, much, much easier than I did in SVN (after 7 years of using it).
Can you expand. Because I've probably done a few hundred SVN branches. And probably 100 Git branches. The big advantage I had with Git was that since it is decentralized, you don't have branches on the server (so no need to have a tempbranch target on the server). But that doesn't seem like a huge deal to me.
The actual activity of branching and merging seem equally easy to me.
With just 100 git branches, I think you're not really using git often and/or to its potential. On a busy coding day, I easily create a few branches. If you also count temporary branches on non-development testing machines (I do driver development on Linux for fun), that can easily reach a dozen.
Now what for? I create branches to test merges with other developers code, branches before refactoring my in-progress stuff (so I can go back and pick up the original branch with all its history in case it was a dead-end). I do new branches for simple fixes, tracking down bugs (versioning all test-patches so you can check the next day/week what theories you've already tested is sometimes extremely useful). Sometimes I just do a new branch to get the feel of starting with a clean slate.
Once a month or so I look at all the branches lying around and delete those that are no longer useful.
All this is only possible because git branching is _cheap_. Orders of magnitude cheaper than on any centralized VCS. And the best part is that git never ever loses anything: git reflog shows you _all_ the states you've gone through to reach the current one. So all these branches are actually valuable.
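The "never loses anything" claim is easy to demonstrate: even a commit thrown away with reset --hard remains reachable through the reflog. A throwaway sketch (file and message names invented):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo
cd repo
git config user.email you@example.com
git config user.name You
echo one > a.txt && git add a.txt && git commit -qm first
echo two >> a.txt && git commit -qam second

git reset -q --hard HEAD^                    # "lose" the second commit
lost=$(git reflog --format=%H | sed -n 2p)   # the reflog still remembers it...
git reset -q --hard "$lost"                  # ...so we can walk straight back
git log --oneline
```

The reflog is local and eventually expires, so it's a safety net rather than an archive, but for "I just nuked an hour of work" moments it's exactly what you want.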
Relatively unscientific, but git merges feel smarter. It notices when I've moved a file (without my telling it I did), and I experience fewer conflicts than I was used to. Some possible suggestions as to why here:
Also, the fact that commits have an atomic life is useful. If I cherry pick a change into a branch and then later merge it, Git generally knows what's up. With SVN this sort of scenario seemed to confuse things. Another advantage to this model is that it's very easy to compare (what commits differ between these, instead of a big code diff).
The differences may be subtle, but all in all I find myself much happier and more likely to use branches than I used to.
I can cherry-pick easily when I merge back. I just select which revision range I want to merge in. I can say revisions 1, 2-5, and 11.
I think I do lose history though. And while history is important, if it's just history, why don't people say that? People make it sound like you just can't do branches and merging in SVN easily, when you can.
Try to have a team of developers working on various parts, you modify a function to add some code to it, another developer moves the function to the bottom of the file, and yet another developer modifies the code to fix an off by one error.
Git will happily merge the result together; in all of the cases I've seen it will do so without even needing user intervention, and everything ends up where it should be. With Subversion this is a nightmare. If code changes location in a file, suddenly merging in more changes becomes problematic.
Recently my boss and I were working on different parts of the same project. I proceeded to move a whole bunch of stuff around to clean up the entire source tree; in the meantime he had added two more files and made modifications to some of the ones I had moved. I then merged in his changes: git happily merged his changes into the now-moved files and added the new files in the original location I had just moved things out of. I moved them, committed, and everything was happy. We tried the same thing in Subversion not too long ago and it was a complete mess, leaving huge merge issues that took a developer a while to untangle.
Try to have a team of developers working on various parts, you modify a function to add some code to it, another developer moves the function to the bottom of the file, and yet another developer modifies the code to fix an off by one error.
We have a team working on various parts of code every day. Like literally every day. Merge conflicts do occur, but they're not the common case. It occurs infrequently enough that it's not a big deal. And probably a quarter of the time the conflict is something you didn't want merged.
I proceeded to move a whole bunch of stuff around to clean up the entire source tree, in the mean time he had added two more files and made modifications to some of the ones I had moved. I then proceeded to merge in his changes, git happily merged in his changes to the now moved files, and added the new files into the original location I had just moved, I moved them, committed and everything was happy. We tried the same thing in Subversion not too long ago and it was a complete mess leaving huge merge issues that took a developer a while to figure out what was going on.
That I do see happen. SVN doesn't like changes to directory structure... merges or not.
The problems occur when you make substantial changes in a branch and then go to merge it back. SVN has several issues with how it merges things. Probably the biggest is that SVN can't properly merge file renames. Second, it does 2-way merges as opposed to Git's recursive 3-way merges (I assume Hg does something similar); recursive 3-way merges are much, much better at resolving conflicts. Finally, SVN has issues merging a branch multiple times, because it doesn't have a natural way to record merges in history. I've heard that this last issue has improved in newer versions, but it doesn't sound like it has completely gone away.
I don't understand our obsession with version control systems. I still find that SVN meets my needs 95% of the time. I've used git for my projects that are hosted on heroku, but I've never been so impressed with it that I want to completely move to git.
So I started using git with my latest project, just to get a feel for what all the Ruby on Rails guys were talking about when they weren't bragging about the size of their Macbooks. The koolaid is delicious. Even in a one-man, one-repo world, I spend a lot less time fighting my VCS for dominance.
Example from today: I was working on my deploy branch, where I typically only make to-be-deployed-within-the-hour microchanges. Then I saw another bug, so I squashed it. Commit. Then I saw another bug. So I squashed it. Commit. Then I saw a major opportunity for a simplification with a refactoring. Then I realized the refactoring was under-tested and that failures would be catastrophic, so I started adding extra tests. Now I'm fifteen commits past the last deploy, I have code that I'm 85% positive works on my must-not-fail deploy branch, there is one commit in there which addresses a bug which I want dead, and it is quitting time.
I have been in this state in SVN before. Recovery is NOT fun.
git branch all-the-work-i-did-today #Creates a new branch whose history looks exactly like deploy's does.
git reset --hard production_deploy_92 #Moves head of deploy branch to the tag of the last deploy, essentially forgetting commits afterwards.
git cherry-pick carefully_copy_pasted_hash #Nabs the one bug fix that I really wanted to deploy today.
Git makes my development and deployment processes better. Transformatively better, in some cases.
I don't think you would have this problem if you were working against a branch rather than the trunk. I feel like it is very simple to merge one file or change from a branch into the trunk and continue working on the branch. Rather more simple than your git solution, but perhaps I am missing something. Do you gain something special by working against the master?
Is this common? Usually we have a hard freeze on a release branch a couple of weeks before actually releasing it. I'm working on embedded systems so the test cycles are usually slower due to heavy HW involvement.
In web development, outside of large corporations, I'd say that near-realtime deployment is the norm. Fix the bug, test the bugfix, deploy to servers, all in the same day. git does work well for this. In Open Source projects, as well, folks tend to work directly on HEAD and only go back to old branches/tags for security fixes (if at all). Again, git is tight for this kind of work; which shouldn't be a surprise since it was built for Linux kernel development.
Embedded systems have to be damned-near perfect before you ship. Web systems, not so much.
How many people are working on your projects? Remember that Linus created git to scratch his own itch in managing patch submissions to the Linux kernel. If you've got squillions of people you don't know working 24/7 around the globe to come up with patches to your C-based system, then git was designed for you. If you're a one- or two-man team who svn merges twice a day, you might not care.
Spolsky did a decent job outlining the headaches that DVCS is designed to solve: http://hginit.com/
I suspect you are using git the way you use svn. In that case there's little or no benefit. It's essentially the "Blub VCS" problem. Using Git or Mercurial offer a few very powerful things, but they won't seem like much until you have the "Aha!" moment. If you're interested, read the workflow parts of http://progit.org/book/ and see what you think.
The Blub users are the people who think it's quite a stretch to equate a programming language with a complex and sophisticated tool that can significantly improve or impede your work depending on how well it's designed.
Perhaps it would be a good idea for the readers of HN to try and collaboratively design a VCS survey, and then distribute it among the general developer populace (in the hope of gaining a large sample).
A quick google indicates http://www.surveypirate.com/ as a tool which would allow large numbers of responses for free, or perhaps google documents.
First step would be some questions though.
EDIT: The OP has been updated to solicit responses from anyone visiting the page, which is pretty much what I was hoping for with this comment.
Can you explain why the learning curve on git is high? I have been using it for a couple of months after SVN completely dorked my project. I have not found git hard to use. What aspect of git is hard to learn?
I like git just fine, but it does have over 100 subcommands, way more than any other source control system. If you're using it on a solo project, you're probably using 4-5. Don't worry, some day you'll get the dreaded 'octopus merge' error or some other non-intuitive message.
One thing git has done for me is think a bit about how I work and organize my code. I love the fact that I can work on a bunch of stuff then spend time organizing which parts are going to go into what commits. With me being the only one working on it does make it easier. I am proposing we use git at work so I would be interested in creating these 'octopus merge' situations.
For http://www.virtualrockstars.com I am using SVN, simply because that's what I used before and I didn't know of Git being so awesome at the time. But I must say SVN is still pretty good... I agree that you can't love it compared to Git, but it's pretty good.