Mercurial developer responds to "Switch to git?" (groups.google.com)
207 points by St-Clock on Nov 17, 2013 | 198 comments



Articles like this worry me a lot.

Most of this post leaves me thinking, "Why should I be worrying about things like this?" A fair amount of it leaves me at least somewhat boggled; I suspect I'm not alone here. The idea that a user of a DVCS needs to be familiar with issues like these tells me that something is seriously wrong with the design of DVCSs.

Remember, a DVCS is a tool we use to track and store data related to a project. The project is what we are interested in: the Python or C++ or LaTeX or whatever. Git, Mercurial, etc. are just there to help us keep track of the real work. Time spent dealing with the intricacies of Git is time not spent on the primary task -- the thing that Git is supposed to help us with.

Consider filesystems, which are also tools used to manage stored data. If I'm writing some Python thing, I don't proclaim to everyone that I'm storing my work on a ReiserFS volume, and I'm thrilled by the fact that ReiserFS indexes its metadata using a B+ Tree. I just store my data.

I think a well-designed DVCS would be thought of in much the same way. Clearly, current DVCSs aren't. Something is very much amiss, folks.


Taking your logic a bit further: the final product is what we (should be) interested in. Python or C++ or LaTeX or whatever are just to help us make it. Everything else, whether it be Python or git or C++ or Mercurial or vim or Windows or an ergonomic mouse, is just there to help us make the final product in an easy and quality manner.

I don't see a problem with worrying about git any more than I see one with worrying about Python. Change for the sake of change is bad, but change for the sake of an improved workflow is good.


That's an interesting point. Here are my off-the-top-of-my-head thoughts:

I guess the real goal is proper separation of responsibilities. When I cook a meal, I use devices that run on electricity. I want to be able to power them up by plugging them in without thinking much about it. The people who work at my power company need to think about the intricacies of generating and transmitting electricity, but I don't want to. On the other end, the people who eat the food I cook should not have to worry about the use of a stove timer or a rice cooker.

Similarly, when I put together a software package, I am functioning as a programmer/software designer. I am interested in the code, how it is structured, and what it is supposed to do. Time spent worrying about storage & version management is generally time spent away from my primary task. Someone needs to worry about that, but, for the most part, I don't think it should be me.

The users of the package I write just want to use it. They care that it does what they want. The fact that it is written in Python, developed using agile methods, etc. are irrelevant to them. (But not to me, as a programmer.)

And maybe for the users, the software is a tool to help them in their business. The customers of that business probably should not care about the software at all; they just want whatever service the business provides.


Even when powering electronics in the home, you have to worry about which sort of power the device needs. Does it need a great big 240V AC outlet, or just the regular 120V AC (using American values...)? Is the device DC powered instead? Does it have its own wall-wart, or can you use an AC->USB converter? Does it use USB micro or mini?

These are things the consumer has to worry about. Of course in practice "worry about" is an exaggeration; this is all pretty elementary stuff and no reasonably intelligent consumer actually loses sleep over it.

The matter of what version control to use? The consumer never has to worry about that. That concern is something that only developers (and related professionals) will have to worry about. I don't think it is a big deal. Just pick whatever fits your needs and that you prefer.

Should the developer have to know anything about version control? I don't know; should the electrician have to know anything about multimeters, outlet testers, and wire cutters?


> Even when powering electronics in the home, you have to worry about which sort of power the device needs. Does it need a great big 240V AC outlet, or just the regular 120V AC (using American values...)?

This isn't accurate as to the electronics that I use. All the adapters say "input 120 - 240V". Non-portable electronics tend to be compatible with the plugs in the same area where you buy them... since they're not portable.


The parent comment didn't bring up the 120V / 240V distinction because of the voltage itself, but because the two voltages are delivered through two distinct kinds of electrical sockets.

120V = NEMA 5, 240V = NEMA 6 http://en.wikipedia.org/wiki/NEMA_connector

When you install a washing machine (or, for example a circular saw in your workshop) you'll have to make sure that the proper socket is available at the place where you want to put it (or you'll have to install one at that location).

In Europe, the corresponding comparison would be between 1-phase (230V) power and 3-phase (400V) power, e.g. normal household socket and the red, round 5-prong CEE connectors.

http://de.wikipedia.org/wiki/CEE-System


In larger teams you can assign merging and conflict resolution to specialists. For example, Linus Torvalds does pretty much exactly this special role (merging stuff from separate departments) all day long and git is excellent at this.

The interesting question is, how much can you automate? Git is the "very little" department, because Linus prefers a deterministic and simple process to an easy, but less reliable one.


> Taking your logic a bit further: the final product is what we (should be) interested in. Python or C++ or LaTeX or whatever are just to help us make it. Everything else, whether it be Python or git or C++ or Mercurial or vim or Windows or an ergonomic mouse, is just there to help us make the final product in an easy and quality manner.

This is my mindset. I do have my preferences, but it is all about what technologies our customers require.

So I tend to say I am language and OS agnostic.

The hard lesson about this came from the time I was still trying to get into the game industry and was spending more time thinking about FOSS technologies for game development, than the game design itself. After getting a foot into the industry, it became clear to me that what matters is getting a game released, the technology religion has no place.

So nowadays I always advocate the goal is to have a project done and solve the customer problems. The technology used for it is secondary.


Articles like this make me wonder if I'm stupid. 95% of the time all I ever do is commit my work and push/pull, whether I'm using svn, git, or hg. I just hardly ever find myself needing to understand anything more complicated than that. Why does it seem like everyone else has spent thousands of hours understanding esoteric git or hg incantations that I've never encountered the need for?


I know nothing about your situation, but I know from my own experience that as soon as you have multiple developers working on a project, plus staggered QA/Staging and Production deploys, things get really complicated really fast.

Then you add in stuff like experimental features (refactors, porting to new framework versions, etc.) that need to be developed alongside regular work without clobbering it, or features that get shelved while they are half-done, and things get just downright messy fast in something like SVN.

So my guess is that you aren't stupid at all, but that you just haven't had to deal with that sort of garbage.

(It's also possible your teammates simply deal with all this other crap for you, and don't bother you with the details.)


Maybe you could let git influence and hopefully improve the way you work.

For instance, when refactoring something quite complex with good test coverage, you want each commit to be the smallest atomic commit that can pass all tests. But you also want to very quickly and "carelessly" add debug logs, assertions, changes in many files. For this kind of task I have found the following workflow to be very helpful and fast:

1. edit files relentlessly running a few tests cases

2. when you feel you "got" something run full test suite

3. if 2. passes, use `git add -p` to stage the smallest changeset that does the thing.

4. Use `git stash --keep-index` to remove all other changes

5. Run full test suite

6. If 5. fails, `git add -p` to continue editing your commit, or `git reset` if you're too far off

7. If 5. passes, `git commit` and `git reset --hard`.

Edit: When using `git add -p` you can even use 'e' command to manually edit your patch.
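For the curious, here's a non-interactive sketch of steps 3-7 against a throwaway repo. It uses `git add <file>` where you would interactively use `git add -p`, and `git stash pop` to restore the leftover edits after committing; file names and commit messages are invented for illustration:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
printf 'base\n' > code.py
printf 'base\n' > other.py
git add . && git commit -qm initial

# 1-2. Edit relentlessly; suppose only code.py's change belongs in the commit.
printf 'real fix\n' > code.py
printf 'debug logging\n' > other.py

# 3. Stage just the atomic change (interactively: git add -p).
git add code.py

# 4. Stash everything that is NOT staged; the index survives intact.
git stash -q --keep-index

# 5. The working tree now holds exactly what will be committed --
#    run the full test suite here.
grep -q 'real fix' code.py
grep -q 'base' other.py          # the debug edit is safely stashed

# 7. Tests pass: commit, then restore the leftover edits and keep going.
git commit -qm 'atomic refactoring step'
git stash pop -q
grep -q 'debug logging' other.py
```

The point of `--keep-index` is that the test suite in step 5 runs against precisely the content of the commit-to-be, not against whatever debris is still in your working tree.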


I'm sure that git, with its very powerful backend, has ways of improving how someone works, but I'm not convinced by this example. IMO it shows again that git's usability is far from perfect. For example, in bzr to accomplish this task you can:

1. (same)

2. (same)

3. put all changes that you are not interested in on shelve: `bzr shelve` (or qshelve if you want QT gui)

4. Run full test suite

5. If 4. fails keep editing

6. If 4. pass `bzr commit` (or `bzr ci`), and to revert shelving use `bzr unshelve`

You need to remember 6+ git commands to accomplish this task, and only 3 with bzr. Moreover, there is a symmetry in bzr (e.g. shelve - unshelve, commit - uncommit) that is not present in git.

edit: formatting


> 3. put all changes that you are not interested in on shelve

Usually what I want is the opposite: cherry pick a few changes, increasingly adding them to the staging area (or index in gitspeak), because oftentimes I have a change that I am not sure is required for this patch.

I was very unconvinced by the staging area when I started using git and tried to avoid it, but in fact it is very powerful and not that complex. I do not know bzr and only used hg for a few months, so maybe they have similar tools. But my main point is not git versus the others; it is "learn the power uses of your tools to improve your workflow", as against the claim that a DVCS should not be our concern.


I feel like a complete idiot coming from bzr to git, it just seems so much better designed.


> Edit: When using `git add -p` you can even use 'e' command to manually edit your patch.

This is not something specific to git. Also, this example as a whole is nowhere near an improvement over an hg + mq workflow:

1. `hg qinit; hg qnew refactoring` to create a patch

2. Make changes, add them to the patch with `hg qrefresh`

3. Run tests; if tests fail, pop the patch from the queue with `hg qpop`

4. Repeat until everything is fine, then `hg qfinish` to commit the patch to the history.


> all I ever do is commit my work and push/pull, whether I'm using svn, git, or hg.

And yet, even those simple operations are different between different DVCS. 'hg pull' and 'git pull' are not equivalent, same with push. I try to stay away from git but when I can't avoid it, I have to remember that the sane/safe way to pull in git is apparently 'git pull --ff-only'.. #fail


Are you working with a large team? I find that tends to be when your "VCS muscles" really start to get exercised.


I suppose it depends on what you consider large. At the moment my team is 3 who are doing daily work, and a few more that do more intermittent work. I'd consider that to be small.

I have in the past worked on larger teams (~15 developers), that project used svn and we never seemed to really run into any issues even though we had branches for mainline dev work, and support branches for each "released" version. I'd get the occasional conflict to resolve but it generally wasn't too bad, and I get those with Git also.

I've never worked on a project where dozens or hundreds of remote developers were contributing, but I'd say that outside of a small number of high-profile open-source projects, most people don't. I understand that's what git was built for; maybe that's why all those features just seem like overkill for anything I've ever worked on.


I guess 'large' is really a function of number of developers and rate and diversity of commits. I've been in a few situations where two or three of us were putting out several changes to the same piece of code rapidly (typically to meet a deadline). Git has proven invaluable in untangling some of the knots we've put ourselves into in those sorts of situations. If commits are coming in slowly, with plenty of notice, and everyone is working on different parts of the codebase, then you probably won't need to do much fancy stuff very often.


Having to maintain several legacy branches and several feature branches comes to mind. Is this possible to do with SVN? Absolutely. Is it a fucking waste of time and energy to do with SVN? Absolutely.


Consider yourself lucky :). I've just learnt things as I've come across new and difficult situations.

The issues are really in dealing with multiple concurrent streams of development (multiple developers, or different branches of your own development if you have to switch between tasks). It's also useful having good tools to look at diffs and history to try to diagnose any problems.


I didn't need to do anything more complicated than pushing and pulling until I got my current job. Maybe it just depends on the team and codebase you're using.


I can see your point but the fundamental problem is that unless you have a simple linear history a DVCS can't possibly know how to merge and apply patches without some heuristics. At best the DVCS can be a tool to assist you.

There is of course a lot of simple tasks which DVCS can and do apply near perfectly but I suppose managing a complex project you're at some point going to have to have a basic understanding of how the system works.

I've used IBM's RTC, which is an excellent GUI VCS tool that I believe works internally much like git/hg, but it attempts to hide how its internals work. So on a large merge (500+ changesets, then remerging more commits on top) it refused to push despite reporting no incoming changesets. We ended up fixing it, on advice from support, by pulling to a different repository based off the branch we were trying to merge from. But I'd prefer a deterministic CLI interface over a GUI interface any day.


Code is read more than it is written. Version control systems provide tools to make it simpler to read code and understand why it was written.

Version control systems also define how you branch and merge, which are hugely important concerns if you're working on a team, and still pretty important if you're working alone.

Git handles branches and history management better than Mercurial does for my needs, and it makes a huge difference to me.


Just curious, what about Git's branches and history management is better than Mercurial's with the appropriate plugins turned on?

I've been able to successfully replicate all the Git use cases I can think of in Mercurial even if I think they are bad practices. Conversely, nothing I do makes git's cli anywhere near as good as hg. For me, that is the single biggest glaring problem with Git. I spend a very small minority of my time handling complicated history/branch issues as part of my major workflow and I spend an inordinate amount of time working with the interface to my source control. Git seems optimized for the minority case.


I am not mega-familiar with mercurial, but I was under the impression that culturally, rebasing is considered bad. Yes, there is a rebase plug-in, but mercurial users don't generally think it's a good idea to mutate history. Git users tend to branch, make a ton of commits, then clean them up using rebase and then merge and push. hg users tend to... not?

I may be _entirely_ off base here.

(personally I find git's command line to be fantastic, but I often feel like I'm the only one.)


I think the view you present here is outdated. Today, rebasing is used a lot by Mercurial users. Even for the development of Mercurial itself, rebasing is not considered bad, it's considered a requirement for submitting a patch series. That is, the patches are sent by email and applied with 'hg import', and thus implicitly rebased. If that causes too many merge conflicts, you will be asked to rebase it yourself and resubmit it.

Mutating history is front and center in the changeset evolution framework people have been working on for some years now. The first step is that Mercurial tracks when it's safe to rebase or edit a commit: each commit has a "phase" which is "draft" initially, but changes to "public" when you push the commit somewhere. The history editing tools ('hg rebase', 'hg histedit', 'hg commit --amend') will then tell you when it's unsafe to edit a commit.


Exactly. Although rebasing is convenient, it does involve taking destructive actions on a repository which is ostensibly a tool to make sure your data isn't lost. The feeling in the Mercurial community is that this is potentially dangerous and not necessary.

Really, it would be best if there was some way to retain the original commits while rebasing to clean the history. Time for a new DVCS?


> Really, it would be best if there was some way to retain the original commits while rebasing to clean the history. Time for a new DVCS?

When you rebase, commits are not lost. If there is a ref pointing at them then they will continue to stick around. If there isn't then the next time that gc is run they would be removed.

The solution to what you are looking for is to precede each rebase with a command to 'anchor' the current commit under a custom ref. If you wanted to look back in time, you'd use a corresponding command. This is all, say, 20 lines of shell, primarily using the git plumbing.
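A minimal sketch of that 'anchor' idea, in a scratch repo (the `refs/anchors/...` naming is invented for illustration):

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
git commit -q --allow-empty -m base
git checkout -q -b topic
git commit -q --allow-empty -m 'wip 1'

# Anchor the current tip under a custom ref before rewriting history;
# anything reachable from a ref is never garbage collected.
git update-ref refs/anchors/topic-before-rebase HEAD

# Rewrite history however you like (here: amend the tip commit)...
git commit -q --amend --allow-empty -m 'wip 1, cleaned up'

# ...and the pre-rewrite commit is still reachable through the anchor.
git log -1 --format=%s refs/anchors/topic-before-rebase   # prints: wip 1
```

The "look back in time" command is then just `git log` (or `git diff`, `git checkout`) against the anchor ref.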

Of course this is just recreating a permanent reflog-work-alike, reflog of course already having this usecase covered for individual devs working on their machines...

If it is the other usecase of rebasing, rebasing public code, say on a centralized repo or the repo of your build fleet, that has you concerned, then instead of just set a policy of only permitting fast-forward commits in those cases. That is a reasonable thing to do, it is a legitimate workflow that git supports. There is no reason that you cannot keep auditable logs of exactly what has been going on with your git repo.

I think people hear "rewrites history" and let their imaginations run wild with sci-fi tales of wonder, only then to think of the grave horrors such power would enable... but forget to look at what actually is happening when you "rewrite history" in git. There is no reason to fear it.


> I think people hear "rewrites history" and let their imaginations run wild with sci-fi tales of wonder, only then to think of the grave horrors such power would enable... but forget to look at what actually is happening when you "rewrite history" in git. There is no reason to fear it.

Yeah, I very much agree with this. Rewriting history is a tool, a very powerful tool. It's something you can use if you want to.

It reminds me a little of a discussion I had recently with a developer who mostly used statically typed languages. I'm used to dynamically typed languages like Python and JavaScript, so it was puzzling for me to hear him talk about the horrors of dynamic types. He said things like "I pass a Person object to a function and the function might do anything with it -- like adding new methods and fields to it!". Yes, it is true that you can add a new method to an object in most dynamically typed languages. No, adding methods and fields by accident is not really a problem in real life.

Just because you have the option of doing something, doesn't mean that you must do it.


Why bother? Even if the history of some off-branch is messy as heck, it can be summed up in the merge anyway.


To use bookmarks, I have to convince the whole team to use bookmarks. Why not convince them to switch to git instead? I use Mercurial plugins to give me commands that make my life better, like `pull --rebase`, `shelve` and `strip`, but getting coworkers on the bandwagon requires explaining which extensions to install instead of pointing to the man page for the appropriate command.

My git workflow is probably replicable in Mercurial, but git does it by default.


(OP here) You don't need to tell the rest of the team if you're using bookmarks, just like you don't need to tell the rest of your team the names of your Git branches.

Bookmarks stay local by default, but they can be exported to the server if you like. There's nothing to enable any longer since it's a core feature.


Mercurial also has named branches. Which suck. I'd rather use bookmarks, but my team uses named branches, so that's what I have to use as well. This problem wouldn't exist with git.


So, why do named branches suck? We use both, named branches and bookmarks, but for different things and it works great.


> which extensions to install

I don't think you need to install those extensions, you just enable them by putting a line in .hgrc


Mercurial ships with 30+ standard extensions and you simply enable them as needed. More experienced users can download and then install third-party extensions as they like.

Think of them as major modes for Emacs, if you're familiar with that. Emacs comes with syntax highlighting support for a lot of languages and the mode is turned on automatically. Sometimes you need to install a new major mode yourself, but the bundled modes get you a long way.


I find git's CLI pretty efficient: most of the time it is just commit/rebase/push/pull.


The reverse question is, what about Mercurial's UI is better than git's with the appropriate aliases and wrapper scripts?

In both cases the answer is pretty much nothing.

Git is not optimized for the minority case, it is optimized for simple implementation. It basically leaks implementation details left and right. I believe git users come to like this, because the core model is so simple (for a computer scientist). Git internally is just snapshots of the code connected with prev-version edges forming a directed acyclic graph and then some named pointers into this DAG.


> what about Git's branches and history management is better than Mercurials with the appropriate plugins turned on?

I don't know which is better, but they are at least different: git has automatic garbage collection for dangling commits (after 30 days or so by default).

With Mercurial, if you want to remove dangling commits, you have to do it by hand. And if they have been pushed, you have to go to the remote repo and manually remove them there too. And everyone else who has pulled them, will need to manually remove them.


If you're just storing data, your analogy is fine.

It breaks down with more advanced usage like integrations which can be very difficult with large teams.

I manage a VCS with thousands of developers committing to a large code base. The VC system starts to really matter and you have to understand how it handles things like conflict resolution or things can get out of hand really quickly.


DVCS is a relatively new technology. We're still figuring out the best ways of designing and using it.

To continue your filesystem analogy, filesystems were invented sometime around 1960, but it was over 20 years before everyone settled on Unix-style filesystems, where files are just a sequence of bytes. Eventually, they realized implementing version control or record structures at the filesystem level was unnecessary. DVCS implementations may similarly grow more alike and become simpler.


> Eventually, they realized implementing version control or record structures at the filesystem level was unnecessary.

I wouldn't say 'unnecessary'; it's more the case that the quick and dirty Unix and DOS solutions 'won' by dint of being easy for implementers & users.

And ever since then we've been hacking-on the features that were lost because of this.

Two features we lost:

1. Generation data groups, as on MVS. The OS automatically keeping a history of changes to the 'file'

2. Addressing any file as an RDBMS table, as on OS/400. Amazingly flexible.


1. is addressed by svn, git, etc. and 2. is addressed by SQLite, BerkeleyDB, etc. The advantages of the user-space solutions are that they are portable across OS boundaries, version management is independent of the OS, and there is more diversity. How is it an advantage to put this stuff into the OS?

("in the OS" probably always means "in the kernel" in this context, not "in system user-space libraries")


> Consider filesystems, which are also tools used to manage stored data. If I'm writing some Python thing, I don't proclaim to everyone that I'm storing my work on a ReiserFS volume, and I'm thrilled by the fact that ReiserFS indexes its metadata using a B+ Tree. I just store my data.

No, but you do see thousands upon thousands of pages of text written about how to best use a shell to interface with the file system. This is what I think we are dealing with, we are on the cusp of actually having competent tools for the task of this collaborative, everybody works on their own computer but everything needs to be merged together in the end, and still work, thing that we have been doing for years. Give it a few years to settle down and either csh (Hg) or bash (Git) will win out and there will still be odd birds out there using sh (SVN) or zsh (Darcs) (feel free to permute those assignments to your taste).


Only one tool of each category will "win" out? That's a mighty dystopian future.


Well, sorry, but as I see it, that is kind of the way of things. The things that coexist/compete share ideas so strongly, due to chasing each other's features, that the things they were before cease to be and they usually, eventually, converge, at least in a functional sense which is all that matters to users. That or one is forced into a minority status. Not really dystopia in either scenario.


All restaurants are Taco Bell is what you want. :)


DVCS's solve a hard problem where there isn't a single obvious way of thinking about it. There's no easy solution to managing multiple concurrent streams of development. And everyone has their own different ideas about what's the best way to solve it.

I've used tools that are "simpler" than git: subversion for one. My experience has been that this simply means that they don't handle more difficult version control scenarios that do occur in practice. Subversion, in particular, just has a tendency to throw up its hands and leave you to manually merge changes in any non-trivial merge scenario.


Software development with more than one person is inherently a distributed system. People work on outdated versions and in parallel and conflicts arise. Development requires branching and merging.

It is probably impossible to package this complexity into an easy UI. I like git, because it has a simple core model, but the UI could certainly be easier. Mercurial's model is not so simple (e.g. multiple types of branches and commits), but the UI is easier. Subversion's UI is even easier, but the model is so simple that some things are impossible (the distributed stuff).

(can we assume on HN that people have understood "simple != easy"?)


There is only one type of commit in Mercurial. Please don't suggest otherwise.

The commits form a directed acyclic graph (a DAG) and you expand the graph when you make new commits. Subsets of the graph can be said to belong to the same branch (a named branch, the ones you create with "hg branch"). By default all commits belong to the branch named "default".

Any given subset of the commit DAG might have multiple heads (commits with no children), e.g., the subset of the DAG that is the "default" branch might have multiple heads representing multiple features that haven't been merged yet.

To help you pick the right head when you update, diff, etc, you can give it a name — you add a bookmark to the head. The bookmark is just like a post-it you stick on your monitor saying "commit e9ade5b11421 is better known as fix-broken-build". It's just an alias for the commit ID, in the same way that a Git branch name is a convenient name for a commit.

To be useful, bookmarks move when you make new commits. After 'hg update fix-broken-build', a 'hg commit' will move the "fix-broken-build" bookmark so it points to the newly created commit. This is again very similar to how a Git branch pointer will advance when you run 'git commit'.
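The git half of that analogy is easy to see in a scratch repo (branch name borrowed from the example above):

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
git commit -q --allow-empty -m 'broken build'

# A branch name is just a movable alias for a commit ID...
git checkout -q -b fix-broken-build
before=$(git rev-parse fix-broken-build)

# ...and committing moves the pointer forward, like an hg bookmark.
git commit -q --allow-empty -m 'fix the build'
after=$(git rev-parse fix-broken-build)

test "$before" != "$after"                          # the pointer moved
test "$(git rev-parse fix-broken-build^)" = "$before"  # to the new child
```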

I hope that helps a bit!


A filesystem is a tool we use to track and store data related to a project.

A DVCS is a tool we use to track and store data related to a project, collections of that data, changes in that data over time, changes in that data made by others, pointers to certain states, and the relationship of all of those things.

What you're really saying is that the filesystem API/UI is simpler than a DVCS API/UI.

My argument is that a filesystem can be reduced to a simpler UI because it does fewer things.

We may not have found the simplest possible DVCS API yet, but it by nature is going to have to be more complicated.


I'm not sure I get the point. If you are relying on a tool to reach your end goal, you should be as familiar with it as your purpose requires. I think the more you rely on the tool, the more you should know its inner workings.

For your filesystem analogy, if your Python app uses the filesystem as a storage layer, you should be clear about the features you expect. When a DOS user comes telling you his data is not stored, you'll have to explain that you use long filenames with incompatible characters. And perhaps symlinks for your folders, or access control for user management. And suddenly "just store my data" is not good enough; you'll have to care how it is stored and with what limitations.

For any abstraction you heavily rely on, eventually you'll have to understand where it leaks.


Spot on. The whole thread can be summarized as: devs do not really get either Mercurial or git, and they want to use git because it has more workflow articles on reddit to copy-paste solutions from.


A skilled painter cares about the origins and chemistry of the minerals and oils used in his/her paints. A skilled guitarist cares about whether the pickups were made with alnico or ceramic magnets. A skilled knife-maker cares about the type of stone used for grinding steel.

Tools are important to the craftsman. Plus, some people care about the entire process of building and not just the final product.


All software has its quirks.

And it shows when you move out of the ultra comfort zone. If you want to drill a hole in the wall you need to take into consideration the size of the hole, the power of the drill, and what you are drilling into. You cannot abstract them away.

And then one day someone decides to use your python code not on ReiserFS but on Ext2 that has no journaling and all hell could break loose.


yeah, as a rule a user shouldn't care about implementation details. to be fair this is mainly possible with git in my experience... but not always. my experience with hg has been positive in that regard


That is primarily a git thing. And the reason is that your main assertion was not true for the primary developer of git. "The project is what we are interested in: the Python or C++ or LaTeX or whatever" was not the case for Linus, messing with VCS and patchsets is precisely what he was interested in, that's basically all he does.


i am worried about the popularity of git to be honest. i'm convinced it is popular rather than good.

"I've really tried to 'get into' mercurial's mindset several times now, but never could, whereas, IMO, git's model is simple and powerful. "

I find it hard to understand what this means, but this is typical of the arguments i see for using git. really the core concepts of DVCS are the same no matter what tool you use, and actually i consider the things that git does differently to hg to be demonstrably and measurably counter-intuitive and in some cases dangerous.

the biggest problem by far is that git is dangerous out of the box - it can destroy your work very easily or leave you in an unrecoverable state.

why can't i roll back a merge in one step if i didn't configure things to be able to do that? why does my branch disappear when i merge? how do the jenkins guys break their repo so casually and find themselves struggling to recover it?

i believe this is the robustness mentioned here: "The changeset graph is in some sense more "robust" in that it's just there and doesn't change on its own initiative"

mercurial seems loath to alter history - which is pretty sane and common-sensical seeming to me; git does it as part of how it is 'supposed to be used', which frankly sounds as mad as travelling back in time to shoot your grandfather. for these reasons - to me - using git is asking for trouble (it has caused me trouble and i switched to hg for precisely this reason).

i cant reasonably recommend git to anyone... which is a shame. other than that flaw it is really quite rich and powerful and has other advantages over mercurial - including (perhaps foremost) its enormous popularity.

EDIT: watch as this gets downvoted from hipster gut responses instead of thought :D


> (...) the biggest problem by far is that git is dangerous out of the box - it can destroy your work very easily or leave you in an unrecoverable state.

It's almost impossible to leave git in "an unrecoverable state", short of actually deleting your ".git" directory.

If you do wind up in a bad spot, that's what the ref-log is for [0]. If you've made an error that is difficult or complicated to unwind, no sweat: hit the reflog and go back in time to the point where things were the way you wanted.

git is many things -- frequently useful, often complicated, sometimes obtuse. But it's almost always safe.

[0]: https://www.kernel.org/pub/software/scm/git/docs/git-reflog....
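A minimal sketch of what this looks like in practice, in a throwaway repo (all file names and commit messages made up):

```shell
# Demonstrate recovering from `git reset --hard` via the reflog.
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q
git config user.email you@example.com
git config user.name you
echo one > file; git add file; git commit -qm "first"
echo two > file; git commit -qam "second"
git reset -q --hard HEAD~1        # "destructive": the branch no longer points at "second"
git reflog --format='%gs' | head -n 2   # both the reset and the "lost" commit are listed
git reset -q --hard 'HEAD@{1}'    # go back in time to just before the reset
cat file                          # prints: two
```

The key point is that `HEAD@{1}` names "where HEAD was one reflog entry ago", so the reset itself becomes just another step you can step back over.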


I will concede that it might be safe in this sense - unrecoverable is too strong a word.

My real world experience of this is that twice I've seen people frantically trying to work out this same problem of reverting a change and finding it difficult even to get a copy of the repo that matches the desired state. Granted it works easily in the most common case, but I've never seen this problem arise with cvs, svn, hg or p4...

It's sensible to use simpler tools if they are more practical, i.e. I have to learn less to get what I want from those tools.

This seems to be a general problem with git currently - many commands need lots of switches or background knowledge to become useful, and it's really not necessary IMO. It would be nice to see that improve...


I appreciate the effort you have made in trying to educate me here - reflog does look like it might have helped to solve my problem although it is extremely non-obvious even after running it. (i.e. i know i need to google or read something else rather than how to solve the problem).

I have recreated the problem I originally had and documented precisely why it really is a usability problem. Now that I have done this I am pretty sure that my anti-git sentiment is warranted - even if my initial comment was massively too harsh.

http://jheriko-rtw.blogspot.com/2013/11/why-i-hate-git.html


I agree that git can be characterized as having some usability issues; I initially found the documentation obtuse in many places.

My favorite git joke, by far: "git gets easier once you get the basic idea that branches are homeomorphic endofunctors mapping submanifolds of a Hilbert space."

If you're having trouble understanding git but you find yourself comfortable with basic CS terms, you might want to check out Git for Computer Scientists [0]. This really helped put it in perspective for me.

[0] http://eagain.net/articles/git-for-computer-scientists/


> git reflog

This, a thousand times this. Unless you happen to randomly run `git gc` for some odd reason, you have a few months (with default settings) until your repo is in an unrecoverable state, unless you touch the .git directory directly.


Even if you do run git gc, by default git will still keep the reflog and all objects reachable from it around for 30 days.


a few people have suggested this. disappointed that my initial google searches did not reveal this to me...


If your comment gets downvoted, it's much more likely to be because it is simply inaccurate.

Git is remarkably safe out of the box. Git makes it damn near impossible to lose anything other than intentionally. No matter if you rebase, reset --hard, or any other "destructive" operation, you can always get back to your prior state with nothing but a quick look in the reflog. Even if you start doing the really dangerous stuff like filter-branch, there again Git goes to pains to prevent you from losing data, by creating a refs/original folder to preserve your old refs (though you can also just use git fsck). Git goes above and beyond the call of duty when it comes to preventing data loss. Short of intentionally running the garbage collector with --prune=now right after rebasing, you won't lose data.

As for the core concepts being the same, you're talking about usage whereas the person you quoted was talking about the actual implementation-level object model. Git's model is remarkably simple, and that simplicity is the source of many of its benefits. Every other VCS out there is more complex.


To my knowledge, reflog won't help you get back uncommitted work that you nuked with git-reset --hard.

Of course --hard is not on by default so... there is no issue here with git's default behaviour.

> "Git's model is remarkably simple, and that simplicity is the source of many of its benefits. Every other VCS out there is more complex."

Yup, exactly. The simplicity of git's model is hugely important. It allows me to not only know what git will or will not allow me to do (knowledge that in other systems is acquired by studying the docs for each particular command to see what all the flags do) but also lets me figure out what sort of novel operations are possible if I am willing to write a little code of my own.

Example: Update hook that only allows a push if two or more colleagues of the committer have pushed annotations for that commit already? With little more than my knowledge of git's data model, and what update hooks are in the first place, I know that can be done and I know how I could do it.
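One possible way to build that check (entirely hypothetical; the parent comment doesn't specify a mechanism) is to have reviewers record their sign-offs as git notes under per-reviewer refs, and have the hook count them. A sketch in a throwaway repo, with the counting logic pulled out as a function an update hook could call:

```shell
# Sketch: count reviewer sign-offs stored as git notes under
# refs/notes/review/<reviewer>. All names here are invented for illustration.
set -e
dir=$(mktemp -d); cd "$dir"
git init -q
git config user.email you@example.com; git config user.name you
echo hi > f; git add f; git commit -qm "feature"
sha=$(git rev-parse HEAD)

review_count() {   # how many reviewers' notes refs annotate commit $1?
    n=0
    for ref in $(git for-each-ref --format='%(refname)' 'refs/notes/review/*'); do
        git notes --ref "$ref" show "$1" >/dev/null 2>&1 && n=$((n + 1))
    done
    echo "$n"
}

echo "before: $(review_count "$sha") review(s)"   # 0 -> an update hook would exit 1
git notes --ref refs/notes/review/alice add -m "LGTM" "$sha"
git notes --ref refs/notes/review/bob   add -m "LGTM" "$sha"
echo "after:  $(review_count "$sha") review(s)"   # 2 -> the hook would allow the push
```

In a real update hook you would run `review_count` against the pushed `$newrev` and `exit 1` when the count is below two.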


> To my knowledge, reflog won't help you get back uncommitted work that you nuked with git-reset --hard.

Every version control system has a way to overwrite a dirty (modified) work directory, so git is hardly any more dangerous than any other tool.

It also won't help you get back uncommitted files that you destroy with `rm`. No other tool does, either.


To be clear, I think that the behaviour of git-reset --hard is absolutely appropriate. I am just correcting the overly strong statement: "No matter if you rebase, reset --hard, or any other "destructive" operation, you can always get back to your prior state with nothing but a quick look in the reflog."

There are situations that reflog won't get you back from, but those situations are perfectly logical and appropriate exceptions.


> No other tool does, either.

Darcs does: if you "darcs revert" to erase your local unrecorded changes you can actually "darcs unrevert" and get them back later. I guess it's a little like "git stash" but not quite, as you can only have 1 saved state and many darcs operations will nuke the unrevert save (loudly and with prompts, so it won't catch you off guard).


You can actually get back files that you rm. I've seen it done. Apparently it has something to do with unmounting the drive, restarting in single-user mode, then finding/knowing where it is on the drive before the OS overwrites it (hence the unmount).


Doesn't generally work with modern FS's that use more "clever" allocation schemes.


I'm not sure what you mean. The 'clever' filesystems I'm aware of are copy-on-write, which makes it much easier to recover deleted files. Unclever filesystems, like NTFS, are also quite easy to recover files on. I've had to do so multiple times.


Yeah there's not a lot of tooling for it. I recall writing my own to analyse a disk image to find traces of an intrusion, so its possible if you put in the leg work even for modern complex FS that do stuff like store small files inline in inodes.


That's not correct, you can get back to any ref in the reflog, including those that you moved away from with git-reset --hard. Those commits are just dangling but are easy to get back to. I've got a presentation [1] that goes into how to use the reflog and how the various flags in reset work (among other things) that could explain more.

[1]: http://tednaleid.github.io/showoff-git-core-concepts/


My point is that git-reset --hard can remove data from the working tree filesystem that never hit the DAG in the first place:

  $ git init
  Initialized empty Git repository in /home/john/tmp2/.git/
  $ echo "foo" > foo
  $ git add foo
  $ git commit -m 'init'
  [master (root-commit) 84fb5d1] init
  ...
  $ echo "bar" >>foo
  $ cat foo
  foo
  bar
  $ git reset --hard HEAD
  HEAD is now at 84fb5d1 init
  $ git reflog
  84fb5d1 HEAD@{0}: commit (initial): init
  $ git status
  # On branch master
  nothing to commit (working directory clean)


Ok, yes, I agree with that clarification (and I missed that you specified in your original comment "uncommitted work", my mistake).

If you git reset --hard and you have a dirty working directory, you can absolutely blow work away. That's one of the main reasons to use git reset --hard, but I agree that it needs to be used with intention.

The easiest way to protect against it is to never use it if you have a dirty working directory, always commit first and then reset --hard after you've committed.
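Another habit that achieves the same protection is to stash instead of discarding, so the uncommitted work stays recoverable. A throwaway-repo sketch (file names made up):

```shell
# `git stash` as a safer alternative to `reset --hard` on a dirty tree.
set -e
dir=$(mktemp -d); cd "$dir"
git init -q
git config user.email you@example.com; git config user.name you
echo one > f; git add f; git commit -qm "first"
echo two >> f                    # uncommitted work
git stash push -q                # tuck it away instead of discarding it
cat f                            # prints: one  (clean tree, like reset --hard)
git stash pop -q                 # ...but the work comes back
tail -n 1 f                      # prints: two
```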


I guess maybe the counterargument is that perhaps this should happen invisibly and automatically.


You just ran a script that scattered production / sensitive information into your source files. You definitely don't want that to get into history, and you want the last head state back. Yes, you can delete the working tree and check out head again, but then again, git reset --hard pretty much does that, and it's welcome at times.


Well, there is to my knowledge no other vcs that does this. Mercurial will happily blow your changes away if you do a hg update -C. If you want your changes to stick around, you do a git stash first.


This is true, but seems specious. By this definition (development tools that can irrecoverably destroy data in the working directory) you should be flaming against vim too, not to mention rm or the dreaded "clean" target of make.

It just seems silly, sorry. And while I know little about hg I suspect it's not even true: will hg never delete a file it doesn't understand? What about the equivalent of git clean (which I use daily -- its loss would be a minor hardship and definitely not an advantage for mercurial)?


https://news.ycombinator.com/item?id=6752077

Understand that I am very pro-git. I'm just trying to be precise here.


i was too strong with 'unrecoverable'

'unrecoverable in an acceptable timescale with acceptable resources' would be accurate.

specifically i found it difficult to undo a merge and found that i was in a team where people had decided to use tools they didn't know enough about to have been using at all.


git reset --hard at least has the benefit of sounding like it's going to do something scary. "git checkout ." will also completely nuke your working directory without warning, and looks benign enough that you would never expect that problem.

Which really, at the end of the day, is the real problem with git. It's not that it's inherently more dangerous, it's that the CLI is so inconsistent it can be hard to remember what you're doing, what's safe and unsafe, etc. Sometimes you need to pass "--force" for dangerous stuff, sometimes you capitalize the argument (like force deleting a branch). Sometimes "--hard" indicates that you should do the dangerous version of something. Sometimes there's not even a safety switch.


I'd be surprised if mercurial didn't have a similar “feature”.


I'd recommend learning "git reset --keep" instead of --hard because of this reason.


Certainly. --hard should only be used if you actually want what --hard provides.


You're completely right, I should have clarified that I was only talking about committed data. My mistake.


Your definition of safe is not mine. I have no doubt the data is preserved - 'unrecoverable state' is too strong.

It's not safe in the sense that if I make a change on submission day and need to revert it, then it's more complicated than it needs to be. Granted I should not be using something I don't have sufficient mastery of in that kind of scenario, but it's not always so simple.

My data being 'safe' is worthless if there is a steep learning curve to recovering it.


The learning curve is not that steep. Compare it to learning vim or emacs, for example, and it's no contest. The learning curve is basically: 1) read Pro Git, 2) play with git for a little while. That's about it. If you can't afford that learning curve for something that is a core part of your workflow, you're doing something wrong.


I appreciate the effort you have made in trying to educate me here.

I have recreated the problem I originally had and documented precisely why I think it really is a usability problem. Now that I have done this I am pretty sure that my anti-git sentiment is warranted - even if my initial comment was massively too harsh.

http://jheriko-rtw.blogspot.com/2013/11/why-i-hate-git.html


It looks like your troubles came purely from not understanding the concept of fast-forward merges. I completely agree with you that the documentation does not make this as clear as it ought to. However, the standard book on git (Pro Git, available free online) covers this (and most other git concepts) very well. This is simply a case of not understanding your tools (which you did acknowledge in the post). This is a widespread and deeply problematic situation, not just for version control systems but for almost all software, and it seems to only be getting worse. Takeaway lesson isn't that there's anything wrong with git, but rather that there's something wrong with an industry that seems to think we should be able to use tools without first learning how they work.


I'm guessing the reason you're getting downvoted is because it sounds like you don't know how to use git. That may not be true, it just sounds that way.

> the biggest problem by far is that git is dangerous out of the box - it can destroy your work very easily or leave you in an unrecoverable state.

Please explain. How is it dangerous? How can it leave you in an unrecoverable state?

> why can't i roll back a merge in one step if i didn't configure things to be able to do that?

Depending on how many commits the merge had it'd be something like

git reset HEAD^^ --hard

> why does my branch disappear when i merge?

What do you mean? That the branch you merged into doesn't indicate which commits were done on the other branch? Branches don't disappear.


I think this gets to one of the issues people who have previous source control experience have with git. They apply the concepts from, say, subversion to git, and then have trouble.

Take branches for example. In subversion, branches were heavyweight, scary things. In git, they are extremely lightweight, in that they are really nothing more than a pointer to a specific commit. They almost shouldn't even share the same name, they are so different.

Basically, I found that I got much better once I stopped trying to apply my old understanding of subversion to git, and especially once I just started playing around with git commands on a dummy repo, to fully understand what was going on. Now that I've done that, I would never switch back to subversion. Perforce, maybe, but only for certain very specific scenarios.


They apply the concepts from, say, subversion to git, and then have trouble.

I have found this to be true. I helped a lot of people at work switch from svn to git. As I showed people what to do the biggest problem was trying to make git fit in with what they knew from svn. Now I start every conversation with forget svn :)


The specific experience I have is being able to revert a merge painlessly with a single operation out of the box. This was unreasonably challenging with git. AFAIK there is no command for this out of the box and my research led me to believe I could only fix it for future commits by making other changes. That's not acceptable and utterly destroyed my confidence in every other feature. Why in a source control solution would you ever implement a feature which makes changes unrevertable by default? It's also a great example of the implementation impacting the user when it shouldn't - the only reason I need to know anything beyond what command to use here is poor design.

My problem is not from applying svn logic to git... It's from not knowing it well enough and expecting convenient defaults out of the box. Just using Mercurial solves this for me.


Oh, reverting merges! I've been there, and it is completely possible, out of the box, with git. But! It requires a relatively thorough understanding of how it works, or you will definitely mess it up. I wrote up this summary a while back for our company wiki:

Imagine this scenario:

- Team Sandy does a ton of work on branch foo.

- Team Sandy ensures the release manager that branch foo is perfect and bug free.

- Team Sandy merges branch foo into branch develop and pushes.

- Other developers continue committing to develop.

- Massive devastation and destruction start occurring due to massive bugs scattered throughout Sandy's code.

At this point, we want to get all of Sandy's code out of develop, as we can't post to production until develop is Sandy-free. What do we do?

- Revert the merge commit

The important thing to note here is that this basically undoes the effect of the merge, but NOT the history. If you were to try to merge branch foo back in at a later date, git would do nothing and tell you that everything is already merged in. This is technically correct, but confusing.

Let's imagine that Sandy goes back to branch foo and commits bug fixes to fix the devastation. Now, branch foo is ready to be merged back in. What do we do?

- Revert the revert commit (yes, really). This basically undoes the effect of our original undo. At this moment, develop has all of branch foo's code pre-bug fixes.

- Merge foo into develop. This merges the subsequent bug fix commits into develop.

A much more thorough discussion can be enjoyed here (I highly recommend the read): https://www.kernel.org/pub/software/scm/git/docs/howto/rever...

--

Basically, it can be done with one line on the commandline, but I wouldn't recommend anyone do it unless they've read the above (specifically, the discussion I've linked to).
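The whole workflow described above can be walked through in a throwaway repo (branch and file names invented for the demo):

```shell
# Revert a merge, then later revert the revert and re-merge the fixed branch.
set -e
dir=$(mktemp -d); cd "$dir"
git init -q
git config user.email you@example.com; git config user.name you
git checkout -qb develop
echo base > base; git add base; git commit -qm "base"
git checkout -qb foo
echo buggy > feature; git add feature; git commit -qm "feature (buggy)"
git checkout -q develop
git merge -q --no-ff -m "merge foo" foo
git revert -m 1 --no-edit HEAD       # undo the merge's *effect*, keep its history
test ! -e feature && echo "feature gone from develop"
# Later: foo is fixed and needs to come back in.
git revert --no-edit HEAD            # revert the revert (yes, really)
git checkout -q foo
echo fixed > feature; git commit -qam "fix"
git checkout -q develop
git merge -q --no-ff -m "merge foo again" foo
cat feature                          # prints: fixed
```

Without the revert-the-revert step, the final merge would silently skip everything git already considers merged, which is exactly the confusion described above.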


'- Revert the merge commit' this is what i had problems with. i got an error message back from git that you can't revert a branch commit and when i googled it the conclusion i came to was i needed to go back in time and change our configuration to make this possible. its entirely possible that i ended up down the wrong path from google searching.

now, googling around i find some stack overflow answers and such i'm pretty sure i found at the time i had the issue. when i am next there (at work) i'll see if i can find a copy of that repo or recreate the same problem so that i can show what happened and why it is difficult to solve and dangerous in a production environment. actually... i might just try and repro it now, but my connection is so poor here (at home) that even installing git is likely to be painful.

however, ignoring that, the process you describe is horribly counter intuitive. well done for working it out in the first place, although i do see this suggested in a few of the stack overflow answers... :)


I completely agree that reverting a merge in git is painful (painful, that is, in figuring out the parameters to pass to git revert).

Let's say this is your scenario:

  d         Merge branch 'test'
  |\
  | c-test  Test Branch Commit 1
  b |       Master Commit 2
   \|
    a       Master Commit 1

And let's say you want to revert d, the merge commit. You would do this:

  git revert -m 1 HEAD

Now your history looks like this:

  e-master  Revert "Merge branch 'test'"
  |
  d         Merge branch 'test'
  |\
  | c-test  Test Branch Commit 1
  b |       Master Commit 2
   \|
    a       Master Commit 1

-m, or --mainline, is the key here. Git needs to know which parent is the appropriate one to revert to.

As the d-b-a line is the leftmost one, its parent number is 1.

Let's say that, instead, for some reason you actually wanted to rely on the d-c-a line as the basis for the commit, effectively throwing away the changes presented in commit b. You would do:

  git revert -m 2 HEAD

That said, you almost always want -m 1. And that said, this is how I understand merge reverts to work, but I've only had to do this a few times, so take that with a grain of salt.

--

With all that said, I completely agree with you about the frustration in trying to figure it out, even with SO. This is exactly where some nice porcelain sitting on top of git could make things much simpler.


I appreciate the effort you have made in trying to educate me here.

I have recreated the problem I originally had and documented precisely why it really is a usability problem. Now that I have done this I am pretty sure that my anti-git sentiment is warranted - even if my initial comment was massively too harsh.

http://jheriko-rtw.blogspot.com/2013/11/why-i-hate-git.html


Ah, and this is why I shun fast forward merges. You were using fast forward merges (and, honestly, I absolutely cannot blame you for doing so). A very, very simple graphic explaining the difference can be seen here:

http://stackoverflow.com/a/2850413/1397661

The fast forward merge loses any concept of where the branch was branched from and when it was merged back in. Many in the community find the fast forward version cleaner. Sure, it is cleaner, until you need to revert that merge.

After using git professionally for quite a while, my opinion is that one should almost always use --no-ff for merging (unless one really knows better). Sure, the history is a bit more complicated to view. But, if you want to easily revert that one merge, everything becomes much, much simpler. It adds a little complexity for the common case while significantly reducing the complexity for a number of slightly less common cases.
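A throwaway-repo sketch of why that helps (branch and file names made up): a fast-forward "merge" produces no merge commit at all, so `git revert -m` has nothing to operate on, whereas --no-ff leaves a two-parent commit you can revert cleanly.

```shell
# Fast-forward vs --no-ff, and why only the latter gives you a revertable merge.
set -e
dir=$(mktemp -d); cd "$dir"
git init -q
git config user.email you@example.com; git config user.name you
echo base > f; git add f; git commit -qm "base"
git checkout -qb topic
echo change >> f; git commit -qam "change"
git checkout -q -                          # back to the original branch
git merge -q topic                         # fast-forwards: no merge commit created
git cat-file -p HEAD | grep -c '^parent'   # prints 1 -- an ordinary commit
git reset -q --hard HEAD~1                 # rewind and redo the merge properly
git merge -q --no-ff -m "merge topic" topic
git cat-file -p HEAD | grep -c '^parent'   # prints 2 -- a real merge commit
git revert -m 1 --no-edit HEAD             # reverting the merge now works
cat f                                      # prints: base
```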

That said, this ties back into the learning curve of git. How were you to know that you should do non-fast-forward merges so as to make merge reverts easier? Git has probably the steepest learning curve of any current version control system. It takes time to figure out the proper workflow for a company, and I've seen many rather intelligent people fail at git.

By the way, you brought up valid points and questions, and I enjoyed this discussion.


It's possible that when you were doing this things were different, because even in the last year or so it seems like git has become a lot less esoteric in its interface, but the man page for git revert currently explains a fair bit about how to use it to revert a merge commit. Basically you just need to specify which parent (of the merge commit's 2 parents) is the one that should survive the reversion with '--mainline'. This seems intuitive to me, with a maybe deeper than average understanding of git, though the rather more involved process of undoing the reversion is less so.


yeah, i feel like this is almost certainly the case. that man page was absolutely my first stop when i got the error message and looking at it now i can't imagine how distracted and stupid i must have been to have missed all of that information. its a weak argument, but i have never struggled to learn anything when provided sufficient information. usually much less than that is enough...


"Perforce, maybe, but only for certain very specific scenarios."

These days you can get a lot done using git as your perforce client but.... ick.


This is precisely true. I don't know how to use git. That's a big part of the problem I have with it, I get much more bang for my buck in terms of time spent working things out if I don't use git.


I don't know how to operate an airplane, but that isn't a problem I have with airplanes. That is a problem I have with myself.


this is a terrible analogy, its more like 'i can't drive this car if i don't do an engine rebuild myself - maybe i'll drive this car that already works the way i want'

tbh, i started at a place where they used git. i started learning it but nobody already there knew a damned thing about it and they shouldn't have been using it at all. they used it because it was more popular, despite having history with mercurial. we now use mercurial.

no need to be condescending. i did just point out not knowing it was precisely my problem


Engine rebuild of git? You mean "learn how to use it". If you want to go with a car analogy, then it is more like you are going to keep on driving one car because you don't know how to drive the other.

Git isn't broken just because you don't know how to use it.


yes, but i also i mean configuring it. its not just that i don't know how to drive the other car, but to make it drive in a way which allows me to reach my destination i need to make changes to its configuration - rebuild the engine might be harsh, so lets say adjust the timing belt instead... or change my tires or something. analogies suck.

learning curve is important for tools. the biggest problem was that (before i was even there) someone decided to use a tool they didn't understand instead of a tool that they did. learning git as i went along turned out to be a dangerous path compared to learning svn, p4 or hg as i went along - especially in an environment where everyone else was learning as they went along.

i regret my initial harsh wording because it really isn't broken and in many respects its a fantastic tool, its just poorly designed - and this is very obvious if you look at any of the help pages - there are very few commands which perform common tasks without excessive options. i don't want to configure my tools that much when i don't need to (i.e. when other tools don't need that and provide the functionality i care about). implementation details shouldn't pollute my user experience.

so, how do you revert a branch commit? every answer i've seen required considerably more knowledge and effort than any other source control solution i've used.


> Git isn't broken just because you don't know how to use it.

That is a lame cop-out. Git has a terrible, terrible user interface. Its documentation is equally terrible. Git apologists are nearly as silly as Javascript apologists.

A distributed source control system does not intrinsically have to be as error-prone and obscure and arbitrary as git. The fact that git fanboys can't see this is either a lack of introspection or a lack of imagination, or both.

My favorite example is the fact that "checkout", arguably the single most common source control command, will irrevocably and silently destroy local changes. Those with more time and more hate on their hands have written volumes about this[1], but for me, I find that plenty of very talented devs I know really do not care for git, and suffer its numerous deficiencies because it is the flavor du jour (and because of Github).

[1] http://stevelosh.com/blog/2013/04/git-koans/


Now that I have recreated my original scenario I am now firmly of the belief that the interface to the user, documentation and available reference are inadequate.

http://jheriko-rtw.blogspot.com/2013/11/why-i-hate-git.html

Hopefully I can claw back some of the respect that I have clearly lost from you. Or at least discourage you from being so offensively defensive about this. :)


so, git reset was not able to solve my problem, nor was git revert; git gave me an error message explicitly stating that branch merges can not be reverted. when i researched it i came to the conclusion that a configuration change was required to prevent this problem going forwards. i came to the conclusion that git had bad defaults for what i consider a critical function of source control - i consider this dangerous.

i was able to solve our problem at the time, but the amount of time i wasted investigating a 'correct' solution using git exceeded the amount of time it took me to later create an 'anti-merge' commit myself (the fix) and move over to hg (preventing recurrence). i also chastised everyone for using something they didn't understand instead of the familiar well known tool because it was more popular (curing the cause).

i am going to investigate reproing this since everyone is in such a state of disbelief and a good proportion seem to think i am completely inexperienced and stupid.

the branch disappearing is a small thing that's confusing, but it comes down to the implementation details of git - because it's all just a pile of diffs with labels and such, when i merge it seems like the 'branch' disappears, even though it's exactly the same data that was there before, just organised into a different hierarchy of changes. this is quite confusing, but it also means that, e.g., producing the anti-merge change required more effort than it should have.


As far as danger goes, I've never got into a state I couldn't back out of using the reflog - though the user experience surrounding getting there and back could definitely use some polishing.

As far as having mutable histories, from the standpoint of someone who has to review and integrate lots of branches from a large distributed team, its amounts to a necessary evil. It's difficult to review and merge anything more than a concise set of changes that achieve their end goal and it's also difficult to have committed such a changeset on the first try. Not having the team rebase out experimental and exploratory commits would be a huge mess and while it's an option to just wait to commit until you're done, that risks losing things, and leads to giant one-shot patches that can still be tricky to review and ship.

I'm not sure what workflows hg users who are loath to alter history adopt where they don't suffer from trying to integrate messy branches.


"from the standpoint of someone who has to review and integrate lots of branches from a large distributed team"

I think this is a very telling point. I found git infuriating and inconsistent to an irresponsible degree, until I realized it wasn't built for me as a producer of code. It was built for people who have to integrate lots of code from lots of disparate producers. Once that lightbulb clicked a lot of things started to make more sense.

I still find git infuriating as part of my daily development workflow, but at least I get it now. I still have a hard time recommending it for small teams that are doing on the fly integration, but for the big distributed team model, I could see it being the best choice.


Yeah, I get the impression that in most meaningful ways hg is a better choice on a technical level. But git does have a few real advantages. Git is faster, though the difference is small enough that I can't imagine it being a serious issue. Git repos are smaller, particularly in the face of (directory) renames.

But the real advantage of git is its popularity. You present that as something worrying; but at the end of the day, hg and git aren't all that different, and popularity means pervasiveness means collaboration is easier - and that's worth something.


you are correct. its popularity is an advantage, and this is precisely why i would want to use git - so many people are familiar with it and it almost seems like the standard solution today for source control.

but it's still worrying to me precisely because of its popularity. either i am fantastically stupid, or people are happily using a tool without any idea what they are doing or how to resolve what i consider to be a very standard source control problem. both of those are things for me to worry about.

if you are wondering what i had such a hard time with here is a trivialised example:

http://jheriko-rtw.blogspot.com/2013/11/why-i-hate-git.html

even if this problem is solvable and a solution exists, this was a perfectly reasonable way to approach solving it - the real problem is the lack of quality documentation and source material to research a solution from.


Very sensible comment :) Having gained the most mind-share is Git's best feature, along with its flexible branches and speed. Those features are super important and, as a Mercurial developer, I'm very impressed with them.


> git is dangerous out of the box - it can destroy your work very easily

I really wonder what you mean by that?

Yes, it's relatively easy to check out some older commit and end up in a detached HEAD state, where an ordinary `git log` command won't show your more recent commits. And then panic and think that git has destroyed your work. But all your commits are still there. You need some more involved way (like the reflog) to get back to them, but git has not destroyed them.

After 30 days or so, git may run garbage collection and remove dangling commits, thus really destroying your work then. But your words "can destroy your work very easily" are a bit strong if this is what you mean?


I chose words that were too strong. The core of my problem is the amount of learning and configuration required to revert a merge. It's too much. The poor choice of defaults and the requirement that I fiddle with implementation details shake my confidence in the whole of git. I'm sure it's always eventually recoverable, but that's far from practical for me vs. just using Mercurial, which I find easy to use and which has sensible defaults.


I'd just like to respond to all the people who are saying "Use the reflog" here.

Yes, you can recover lost commits using the reflog. But it is not well publicised that you can do this. Most tutorials do not mention it and most Git GUIs do not expose it. It is also not functionality that you expect to exist, so you are not likely to think to Google it. Unless you are an advanced Git user, your commits are indeed, to all intents and purposes, lost.


Doesn't the large number of people saying "use the reflog" indicate that amongst git users, they know that it exists?


Probably. It's one of those things you learn as a Git user sooner or later -- though in most cases, not before a few heart-stopping moments and probably a bit of an injection of FUD into your team.

Rather more worrying is the possibility of deleting history remotely with an injudicious git push --force. Some such remotes (coughGitHubcough) don't give you any access to the reflog on their end to sort things out.

Git really needs to set receive.denyNonFastForwards and receive.denyDeletes to true by default.
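For reference, those are real git options; a sketch of what they would look like in the server-side (bare) repository's config file:

```
[receive]
	denyNonFastForwards = true
	denyDeletes = true
```

With these set, a `git push --force` that would rewrite or delete remote history is rejected by the server rather than silently accepted.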


Good point, the reflog should be mentioned earlier in tutorials.


i don't think that's a fix at all.

i still think it should be possible to revert a merge with a single command and no flags.

the volume of questions about this on stack overflow and co. and the variety of answers, some of which i'm sure i tried and found not suitable, suggests that this is a real user experience problem.

alternatively, there should be a decent ui solution to hide this stuff, and it should be packaged with git in some friendly installer somewhere (not SourceTree, which, although the best of the bunch i've tried, was of no help to me when i had this problem and has a huge stack of usability problems vs. e.g. p4v - which frankly everyone should rip off, because i'm sure it's why perforce is still even a thing aside from support contracts...)


You do not use the reflog to revert a merge; you use it to undo a messed-up rebase, and I am not sure that is something you want fixed automatically. I think using the reflog here might be better, maybe with a shortcut to make it easier to reset to a point in it.


yes, tbh this is the first time i have heard of the reflog.

i extensively googled when i had the original problem, which put me off using git.


Nobody is saying you have to squash; it's just recommended to tidy up the change history. For all of your projects you can simply not squash, and ask contributors to do the same. It's not like git rewrites history automatically...

Just because git will let you shoot yourself in the foot, doesn't mean that it's a tool that should be avoided.


thanks for the suggestion, i did recreate my problem so that i could present a properly reasoned argument rather than a ragey comment:

http://jheriko-rtw.blogspot.com/2013/11/why-i-hate-git.html

i'm pretty sure i need to shoot myself in the foot in this particular way to be able to do my work, unfortunately. i might be lacking knowledge, but it was far too difficult to acquire said knowledge. as someone who taught himself programming in many languages, and classically hard stuff like functional calculus, quantum field theory and general relativity - as much as that is a weak argument - i'm inclined to think there is insufficient information, or at least extremely poor documentation, here.


Actually it does rewrite history by default AFAIK. This is how the merges work from the end user perspective.

I've never had trouble reverting a merge in any other context.


Agreed... rewriting history is a powerful tool that power users will want to use. Nothing more. That of course applies to all tools :)


False.

If you ever thought you lost work in Git, it's POSSIBLY because you overwrote your working copy. That is, code that hasn't been committed yet. But once you commit -- hell, once you just "git add" to the index -- your work is safe and won't be garbage collected off the disk for at least 30-45 days.


My choice of words was quite poor. I know the problem I had was recoverable, but the amount of learning involved and the bad defaults standing in my way put me off git. I get the same functionality from Mercurial with a fraction of the effort.


You'll note that the Jenkins guys were able to recover all their work. They also use a non-standard branching and access model.


Did they? Last I heard, they were having trouble with a few repos, because the hashes they got from github weren't quite right.


yeah, but did you read about what they were doing? granted, the breaker of it all did something he shouldn't have... but it's still terrifying to hear of these things.


Repeating my description: Git internally is just snapshots of the code, connected by prev-version edges forming a directed acyclic graph, plus some named pointers into this DAG.

This is the simplicity of git. I do not use or know Mercurial, but from the article Mercurial seems to have multiple types of branches and commits. This makes it more complex.
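A toy model of that description may make it concrete (hypothetical Python, not git's actual on-disk format): commits are immutable nodes in a DAG, and branches are nothing but named pointers into it.

```python
import hashlib

def make_commit(snapshot, parents=()):
    # A "commit" is content plus parent edges; its id is a hash of both,
    # which is why commits are immutable nodes in the DAG.
    cid = hashlib.sha1(repr((snapshot, parents)).encode()).hexdigest()
    return {"id": cid, "snapshot": snapshot, "parents": parents}

# Two commits on master, one on a topic branch, then a merge:
c1 = make_commit("v1")
c2 = make_commit("v2", (c1["id"],))
t1 = make_commit("topic work", (c1["id"],))
m = make_commit("merge", (c2["id"], t1["id"]))

# Branches are just named pointers into the DAG:
refs = {"master": m["id"], "topic": t1["id"]}

# Deleting the "topic" label removes the name, not the data --
# t1 is still reachable through the merge commit's parent edge:
del refs["topic"]
assert t1["id"] in m["parents"]
```

This is also why a merged branch's label can be deleted without losing anything: the commits stay reachable from the merge.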


everything is a changeset, and yes, there are multiple types, but the end user has no need to know about them - from their perspective everything is a changeset and they are all operated on equally.

from the end user perspective git does have multiple types of change - and this was precisely my problem. i recreated the problem and documented it here:

http://jheriko-rtw.blogspot.com/2013/11/why-i-hate-git.html

to be fair, i completely took that quote that I used about simplicity out of context because I've heard similar said in a different context more than once and I jumped to a conclusion.


> "I've really tried to 'get into' mercurial's mindset several times now, but never could, whereas, IMO, git's model is simple and powerful."
>
> I find it hard to understand what this means

The git model can, to a sufficient degree of completeness, be demonstrated with children's toys: http://www.youtube.com/watch?v=1ffBJ4sVUb4

I can't even imagine what a similar demonstration would look like with Mercurial. It would probably involve having a bunch of such trees on the table, and super-glue instead of sticking things together.


Please see this figure: http://www.aosabook.org/en/mercurial.html#fig.hg.revlog

I hope you see the similarities between the storage layers of Mercurial and of Git: Mercurial has a changeset graph with pointers into a manifest graph. The nodes in the manifest graph point to the files present in that commit. It's (also) not very complicated.

As for branches, they're very simple too, as I tried to explain here: https://news.ycombinator.com/item?id=6753477


Agreed. Honestly, I see the appeal of maintaining a clean history, but shouldn't that be done in some non-destructive fashion? Do we need source-control on our source-control? Or at least simply flag squashed commits instead of destroying them?


> I see the appeal of maintaining a clean history, but shouldn't that be done in some non-destructive fashion?

Don't rebase code that is public (few people recommend that you do) and there is nothing destructive about it. When I rebase my changes before making them public, I am not destroying history, I am deciding history.


> Do we need source-control on our source-control?

That's more or less what Mercurial's changeset evolution [1] is/will be (it's not enabled by default yet).

[1] http://mercurial.selenic.com/wiki/ChangesetEvolution


as a note this seems to have +36 points, i assumed it was negative because my reputation seems to have taken a dive by roughly that same value. just noticed in my history that negatives are marked with a '-' and get greyed out. how strange... or perhaps i am again just being stupid.


I really appreciated that this response was not overbearing or just raw opinion, but rather compared many of the similarities and differences between Mercurial and Git, and how they approach similar issues differently.

Honestly, I've struggled in moving from SVN/TFS to Git, but it's been worthwhile... having local branching and being able to work locally is great. I can't really compare this to hg, as I haven't worked with Mercurial at all.

The company I work for has moved a lot of its development to a GitHub Enterprise deployment, and it's been interesting. I like Git Extensions for Windows, though it's a bit rough around the edges. It's also a bit limited in terms of VS integration, but I can manage. I think that Tortoise + Ankh was a better combination overall in terms of the polish of the tooling, over what git currently offers.

I tend to now work with a few console windows open, and have been using the command line much more.


You should try SourceTree, it's a great Windows and Mac Git GUI client! It's free (as in beer) and also supports Mercurial.


I second SourceTree - if there's something missing that you need, Atlassian are surprisingly responsive to feature and bug reports.


I like SourceTree, but the UI gets clunky and slow within a short period of time. Not sure why, but it usually happens when you have anything more than a trivial number of commits.


Have you tried it in the last few days? After SourceTree got relatively slow for me with the past few updates, the last update actually completely fixed that!


I use SourceTree for git, but I find TortoiseHg is better for mercurial; it's just far more feature complete, with support for almost all mercurial commands and (bundled) extensions.

I'm still somewhat surprised by how poor git GUIs are, considering it's the more popular VCS - though SourceTree is certainly one of the better ones.


I was really happy when I heard about SourceTree (back then it was not even released yet), but when it finally came out it was bulky to begin with.

TortoiseHg on the other hand always looked nice and clean.


Or for the many of the world's desktop Linux developers who use Eclipse, EGit is quite good and also free. Oh, and it works on Windows and Mac - for that matter, just about any platform that supports Eclipse.


Are you kidding!? I found EGit to be extremely buggy, not at all matching git's conceptual model, and difficult to use/navigate. Compared to the hg plugin for Eclipse or git's command line it is light years behind. The only thing I use it for is viewing the history of a single file.


SourceTree is good if you can handle the .NET 4.5 requirement. At work, we can't risk the breakage 4.5 can possibly do to 4.0 apps and can't force customers to upgrade for 2/x applications so I don't bother.

My first foray into DVCS was Mercurial through Kiln and subsequently TortoiseHg. It's always been great but the introduction of the workbench made everything gel. Having all of the UI in one spot made the experience that much easier. It makes discovery pretty simple compared to TortoiseGit/SVN where you have to know what each menu item maps to.

I've since moved to Git for work and personal use. Git flow sold me, as hg flow really didn't feel equivalent. I also like submodules, as they make proper segregation that much easier. Hg has patch queues, which made working on items that weren't commit-ready easy. At the time, I didn't understand feature branches, so I should likely revisit Hg earnestly. Hg has always had a better user experience, where error messages aren't cryptic, but there's a wealth of knowledge online to solve any problem in either DVCS at this point. There are some spots in git where you still need the command line, whereas TortoiseHg seemed to cover every need I had.

TFS finally supporting git as a repository, despite its caveats (convert once, only vs2012+), will likely push git pretty far. Having used Ankh, I was never impressed. I'd rather use Tortoise + the git source control plugin or Git Extensions, as the experience is definitely sufficient. I actually prefer the Tortoise workflow over doing it all in VS, but a bulk of what we do is outside of VS anyway, so being proficient there helps tremendously.


Have you tested 4.5? What breaks?


A significant problem is that once .NET 4.5 is installed, a developer can no longer test how an app behaves under .NET 4, even if the app still has its target framework set to .NET 4.

There are several major WPF bugs fixed in .NET 4.5; developers will no longer encounter those bugs locally, but can only find them on a dedicated .NET 4 testing machine, or (more likely) reported from a .NET 4 customer in the field.

There's a detailed writeup of the problem here: http://social.msdn.microsoft.com/Forums/vstudio/en-US/c05a8c...

See also this suggestion: http://visualstudio.uservoice.com/forums/121579-visual-studi...


My normal workflow with git is through the CLI. The CLI really is the best way to use it, since git is good at prodding you along to do the right thing. Make sure to go find a good set of aliases and turn on the git prompt.

When I need to look at histories or large change sets I'll open up SourceTree or go to the BitBucket repos. If I need to see the file history of a single file gitk -p -- file is the quickest way that I have found.

I'm currently dealing with 2 projects that are similar and share code but are in different repos. Git has made it easy to add remotes to each project and cherry pick commits that need to be shared. As someone who started with VSS years ago, Git continues to amaze me with how easy it makes source wrangling.


What is "git prompt" and how do I turn it on? A quick search didn't reveal anything.


Add this to wherever you setup your bash prompt, then reload it (e.g. source ~/.profile):

    source /usr/local/git/contrib/completion/git-prompt.sh
    source /usr/local/git/contrib/completion/git-completion.bash
    GIT_PS1_SHOWDIRTYSTATE=true
    export PS1='\u@\h \[\033[01;34m\]\w\[\033[00m\]$(__git_ps1 "\[\033[01;33m\](%s)\[\033[00m\]")$ '

What this does is add a git-style prompt any time you are in a directory managed by git. The prompt will show your current branch, what state it is in (+'s/*'s show if you have changes), and whether you are merging/rebasing/cherry-picking/etc...

The source files may live in slightly different places on your machine depending on where you installed git.

The above also adds tab completion for git commands, branches, tags, etc...

Also, see my .gitconfig for aliases I have put together from various places on the net:

https://github.com/matwood/cfg/blob/master/.gitconfig


I think the parent means something like the git-status integration of e.g. oh-my-zsh [0], or the 'official' git-status integration script for bash [1].

(I've used the former, but not the latter.)

[0] https://github.com/robbyrussell/oh-my-zsh [1] https://raw.github.com/git/git/master/contrib/completion/git...


I use TFS at work and Git at home - I do miss Visual Studio's merge features which have been greatly improved in 2012 and 2013. How did you get your head around resolving merge conflicts in Git?


It's as easy as searching for ">>>>" in your text editor; the conflicts will be clearly marked and you take the markers out and make the merged code look the way you want it.
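For anyone who hasn't seen them, a conflicted region looks roughly like this (branch name here is hypothetical):

```
<<<<<<< HEAD
the version of the line from your current branch
=======
the version of the line from the branch being merged
>>>>>>> feature-branch
```

You replace the whole region, markers included, with the merged code you want to keep.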


Or use my favourite conflict-resolution program, p4merge.


it's not too bad in git. you just look at the merge conflict markers in each of the files specified, delete or merge what you want in each of the files, then add the changes to the staging area, and commit again to complete the merge.

this might help: http://stackoverflow.com/a/7589612

and this: http://git-scm.com/book/en/Git-Branching-Basic-Branching-and...


Even better, one can specify a mergetool in their config, and use a standard diff tool to resolve conflicts.
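A sketch of what that looks like in ~/.gitconfig (merge.tool is a real setting; p4merge is one of the tools git knows about out of the box):

```
[merge]
	tool = p4merge
```

After that, running `git mergetool` during a conflicted merge launches the configured tool for each conflicted file.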


At work, my Git repos are exclusively C# and I've been using SemanticMerge [1] for resolving conflicts to good effect.

[1] https://www.semanticmerge.com/


It's a pity it's subscription-based, and that they also want me to log in to view the pricing.

Edit: There is a once-off price, and also somewhere to apply if you're working on open source.


I quit using Mercurial around 1.4 and was happily surprised by all the improvements in 2.8 (shelve, bookmark improvements, etc.). Martin Geisler's reply was both thoughtful and respectful, a rare sight in this debate.


> Martin Geisler's reply was both thoughtful and respectful, a rare sight in this debate.

I have seen this debate plenty of times, but I'd hesitate to say that the overwhelming majority of them are hostile.

I think the Mercurial guys know that their tool is good with some flaws, that git is good with some flaws, and that most other tools are not quite as good. The typical reaction I have seen is: I don't care which one you use, as long as it's git or hg or some other equally capable revision control scheme.

I definitely agree that Mr. Geisler's response was well thought out and polite.


It's refreshing to see someone with a level-headed response who doesn't fly off the rails defending their source control tool of choice, unlike what you usually see with developers of well-known frameworks like PHP and Ruby. I don't personally use Mercurial myself, but it doesn't seem like a bad source control choice, anything is better than SVN, right? Git and Mercurial both seem like great and sensible choices.


This type of response is quite common in many open-source communities.

The trick to finding it is to consider carefully the motivations of why someone wants to participate in open-source. If you have a problem, you wrote a tool to solve it, and you want to share that tool with the rest of the world, you won't be threatened when someone else has a competing tool to share with the rest of the world. Just explain why your tool is useful, and under what circumstances, and let people make up their own minds. It's no skin off your back if they choose not to use it.

If you participate in open-source to gain social approval and hacker cred, however, then it's very threatening when other people choose to use a competing tool. If your users disappear, so will your social approval, and you'll be left feeling alone and unwanted. And so people who are in open-source for this motivation frequently put down technologies that compete with their own favorite tools. Sometimes it's not even the tools' authors that do this; it is people who have adopted the tool and feel like their choice of preferred tool will be threatened if others adopt different tools. Hence, fanboyism.

I have found, when evaluating technologies, that I rarely go wrong trying to select for the first category of community over the second. The problem is that there's a bit of an adverse selection effect. If you ask "What's the best framework for X?", then all the people in the second category will rush to defend their favorite tool, while the people in the first category may present cogent arguments for their tool of choice but won't seem all that impassioned. Human beings are hard-wired to recognize passion, but we typically cannot recognize experience until we have had that experience itself. And so your natural tendency will be to pick frameworks in the second category unless you specifically try to look past the arguments and look at the motivations and background of the people making the arguments.


> anything is better than SVN, right?

Hey, Subversion was the new hotness for quite a while. It fixed most of the hideous flaws of CVS and made version control actually usable.

Any DVCS has a number of inherent advantages over SVN, but it does a good job for what it is.


Sadly no, not everything is better than SVN.

SourceSafe, CVS immediately come to mind as systems worse than SVN.


After reading most of the comments, and having participated in more than a few git vs. <insert random DVCS> discussions, here are what I hope are some new contributions.

All systems have different tradeoffs depending on the target audience they are trying to optimize for. When comparing bzr, hg, and git, one way of thinking about it is that they differ in the size of the developer community they are trying to optimize for. The sweet spot of bzr is probably up to ten or so active developers at one time; for hg, it's probably up to a hundred or so; and for git, it's designed to scale up to several thousand active developers. Different people might want to argue over the exact breakpoints, but in terms of orders of magnitude, I think it's pretty close.

One comment made in a thread below was that git was optimized for the people who integrate code, as opposed to those who actually produce the code --- and I think that's mostly true. Which is to say, when there was a choice between making things easier for the integrator, sub-integrator, or sub-sub-integrators (in Linux, code integration happens hierarchically; it's the only way we can scale) and making things easier for a newbie coder, git has very unapologetically optimized for the former. It's true that there are some warts caused by legacy UI decisions which would probably have been made differently if the designers could go back in time, but in my mind these are in the category of things like TAB characters being significant in Makefiles; it's annoying, but practitioners very quickly learn how to deal with these warts, and they generally don't cause significant problems once people get over the startup hump.

The other observation is that since the choice of DVCS is generally made by project leads, who tend to be the integrators, it's not that surprising that git is very popular with them. It's also true that most project leads tend to be over-optimistic about whether their project will be the one that grows up to have thousands of active committers (just as most startup founders are convinced that their baby will defeat the odds and become the wildly successful IPO whose market cap will exceed $4 billion :-).

Given that most projects don't have thousands and thousands of active developers, it may be that hg is a better choice for most of them. However, if most of your senior developers are more used to git, because that's what they are mostly familiar with, you might want to use git even though the project's scale is one that would be quite well served by hg. For me, the e2fsprogs project falls into that category; while the number of active developers is such that hg would be fine, most of the developers are simply much more used to git, and so we use git.

The third reason why git has probably become popular is that github is really good at hiding many of git's rough edges; if people are used to github, it can act as a set of training wheels before they graduate to using git on its own.

If these three factors don't apply to your community, then maybe hg is a better choice for you. If that's true, then don't hesitate! One thing most people forget is that while transitioning between DVCSs is painful, it can be done. So if the situation changes in three or five years, it is certainly possible to convert your project from hg to git. It will be rough for a month or two, but for some projects that might actually be better than starting with git, finding that it causes increased friction initially, and then discovering that they never needed the scale that git provides.


I don't think Mercurial is the issue here...

Google Code is the issue... Didn't realise people still used that; it's worse than Codeplex. At least use Bitbucket or something that makes it easier for people to contribute while still keeping the project on Mercurial.

I've found bugs in stuff before and ended up using a different project/library for the sole reason that it was on Google Code, and the amount of effort involved in using the site, let alone raising an issue or fixing it, just wasn't worth it.


Agreed, Google Code is awful. The tree UI for browsing code is terrible, browsing commits is a pain, and viewing diffs is awful. The site just looks ugly and is hard to use. I was tired of Google Code before I even knew about GitHub; I had switched to Unfuddle.


I think the big summary of this post comes down to, if you don't have something like:

    [extensions]
    shelve =
    histedit =
    rebase =
    mq =
As a git developer using hg you're going to be frustrated by all the things 'hg doesn't support'.

It doesn't actually not support them, they're just (for some reason?) shipped with hg but not turned on by default.


I'm not sure about shelve, but histedit, rebase, and mq are all history modifying extensions. Mercurial differs from git in that it prefers immutable history.

The extensions are there if you really need them, not as something you should always setup imo.


I don't think this is true today. Git and Mercurial have the same basic model, which means that a commit is immutable because the identity of a commit is determined by its content.

This means that you must re-create a changeset in both systems if you want to "change" it. Mercurial and Git can do this and have been doing it for years. The difference is that Git has a built-in concept of garbage collection whereas Mercurial does not. So commands that modify history in Mercurial must trigger the garbage collection (we call it strip) manually -- and they do, of course.

So you won't see any big difference between 'hg rebase' and 'git rebase'. They both build new commits and remove the old commits (in Git they're removed eventually, in Mercurial they're removed immediately, but with a backup if you want to restore the pre-rebase state).

The latest versions of Mercurial have history modification built-in: you can 'hg commit --amend' without enabling any extensions. The changeset evolution concept will take this even further and allow really cool collaborative editing of shared history.
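A toy illustration of why "changing" a commit must re-create it in both systems (hypothetical Python, using a hash over content and parent just as a stand-in for the real commit id):

```python
import hashlib

def commit_id(change, parent):
    # Identity is derived from content *and* ancestry, so "editing" a
    # commit can only ever produce a new commit with a new id.
    return hashlib.sha1(f"{change}|{parent}".encode()).hexdigest()

base = commit_id("base", None)
original = commit_id("my change", base)

# After upstream moves on, rebasing replays the same change onto a new parent:
new_base = commit_id("upstream work", base)
rebased = commit_id("my change", new_base)

assert rebased != original  # same diff, different parent => a brand-new commit
```

The only question left is what happens to the old commit: git leaves it for later garbage collection, Mercurial strips it immediately (with a backup).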


Modifying local history is not dangerous, so I think mq is good. Never really used histedit and rebase.


After 4 years of Git and 1 year of Mercurial, I like them roughly equally. I love cheap local branches, I also love MQ. They're just tools, I wish people could get over this kind of stuff.

If this is really about getting more contributors, I suggest a GitHub mirror. We do that, it's not a big deal (hg-git is pretty good), and it does get us more contributors.


Having used both extensively, Git at one job, HG at the next, now Git again, I mostly have this to say:

Both tools lack polish.

The biggest piece of misinformation I'd like to clear up: often a user new to either of these makes a few mistakes, the cold hand of death grabs them, and they're certain they've lost work. You almost never really have. In Mercurial, the permanent nature of named branches serves you well; in Git, the reflog does. Seek help in such a case, because if you think you lost work, you almost certainly didn't.

Edit:

Also, if you're using HG and not using patch queues, you're doing it wrong. Just sayin'


I've used patch queues with hg, but there are other tools available. For example bookmarks allow you to use hg more like git.

Hg with patch queues is a bit like using git and always rebasing, almost never merging. You do lose history with patch queues.


Author/thread starter here: Really didn't think this post would have generated as much interest as it did when I created it... :D

It's a fairly enlightening thread though. Mad props to Martin for a fantastically thoughtful reply!


So, you are not switching now, hopefully?


Mercurial's changeset evolution sounds really powerful and seems to address the 'git rebase and force push' problems that can occur.


Here's a video of a FOSDEM 2013 talk about Mercurial's changeset evolution:

https://air.mozilla.org/changesets-evolution-with-mercurial/


Part of me wonders whether this is an instance of "The Magpie Developer"[0]. I know that I personally prefer git and bzr over cvs and svn, but just ten years ago, people were all talking about how Linux had just switched to BitKeeper.

I also suspect that rather than having a bikeshed-esque argument about switching between one DVCS and another, we can come up with technical solutions to enable everyone to work together. For example, when I was interning at Facebook two years ago, git-svn was used extensively to let the rank-and-file employees work with git, while still allowing chuckr to deploy off of SVN.

For mercurial and git, there are services like Kiln Harmony[1] and hg-git, too.

[0] http://www.codinghorror.com/blog/2008/01/the-magpie-develope...

[1] https://secure.fogcreek.com/kiln/


"The local revision numbers play no role here -- btw, they're just an (arbitrary) ordering of the commits in your local repository. No magic there."

I disagree. Mercurial is famous for its "simple" revision numbers that differ between repositories. These revision numbers are magical, as they can suddenly change when you merge some branches and older commits get pulled into your history. Since Mercurial and git are mostly about collaboration, non-stable identifiers can be very confusing.


Revision numbers are stable within a given repository. The revision number for a commit is simply the index of the commit in the changelog — nothing more. Since a changeset must come after its parent in the changelog, the revision numbers give you a topological ordering of the changesets. A topological ordering is often not unique — this is why revision numbers can differ between repositories, even if they contain exactly the same commits.

Because the changelog is append-only, new commits you pull in get higher revision numbers than the existing commits. Your existing revision numbers will thus not change when you do 'hg pull' and 'hg merge'. We don't actually guarantee that revision numbers cannot change when you pull, but it's been the case until now.
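A toy sketch of why the numbers differ between repositories (hypothetical commit names, not real Mercurial code): two repositories holding the same three commits can order them differently in their changelogs, as long as every commit comes after its parents, so the same commit gets a different index in each.

```python
# Commits B and C both have A as their only parent, so both
# ["A", "B", "C"] and ["A", "C", "B"] are valid changelog orders.
parents = {"A": [], "B": ["A"], "C": ["A"]}

def is_topological(order, parents):
    """True if every commit appears after all of its parents."""
    seen = set()
    for commit in order:
        if any(p not in seen for p in parents[commit]):
            return False
        seen.add(commit)
    return True

repo1 = ["A", "B", "C"]   # e.g. this clone pulled B first, then C
repo2 = ["A", "C", "B"]   # e.g. this clone pulled C first, then B

assert is_topological(repo1, parents) and is_topological(repo2, parents)
print(repo1.index("B"), repo2.index("B"))  # same commit, revision 1 vs 2
```

Both orders are legal, so commit B is revision 1 in one repository and revision 2 in the other, while its changeset hash is identical in both.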

In any case, the stability of the revision numbers isn't why we like them — we like them since it's often easier to type 'hg histedit 12345' than 'hg histedit c38c3fdc8b93' when you've looked up a particular changeset in your (local!) repository.

Hosting sites like Bitbucket and Kiln will wisely not show you revision numbers since the concept is meaningless when talking about more than one repository.


They can't change in a given repo unless you use history-modifying extensions. They do differ between repos. Most modern third-party tools, including Kiln, hide them by default for this reason.


Thanks for clarifying this. I haven't touched mercurial for a while.


I've been using Perforce/p4 since '97; the last 8 years have been coupled with P4V and the Eclipse plugins. My workflow is virtually identical to the much-cited Git branching model [0]: turn it -90 degrees, with time running left to right, rename "develop" to "trunk", and think of "master" as the release labels. Release and feature branches are created from the build label of the previous release, either when it needs special changes that can't just go into trunk or when we want to do a minimal patch-only release. A developer makes a local copy of the portions of the trunk they want/need. The server knows who has what opened for edit/add, which makes checkins fast: only the change must go in. The same goes for a regular update of my local workspace. We rarely have people editing the exact same file, and if we do, it's a simple resolve in P4V during your commit or integrate activity.

So far, there are few benefits I see to using, or switching to, Git:

LOCAL REPOSITORY - I have the whole repo for what I am working on, so I can work remotely more easily.

LOCAL VERSIONING - I can create lightweight local branches that duplicate only what is required, without a resync of my local repo.

POPULARITY - all the cool kids are using it, so lots of ops utilities are now built expecting it to be there under the covers.

I can live without the LOCAL benefits. I have for 8 years. I worry about the third - popularity vs. merit.

Where are the concrete comparisons showing all the features and functions of these solutions side by side and how they compare for a student, startup, or enterprise?

EDIT: found a newer Perforce-vs-Git comparison on the Perforce site[1]. The LOCAL REPO/VERSIONING features are available with P4Sandbox.

I suppose modding me down is appropriate, since I picked up on the POPULARITY aspect of the discussion and don't have any Hg experience. However, it seems these discussions just keep happening, and there is no one true solution or approach. Each has merits and faults, and it is the careful consideration of these that leads to your own solution. The popularity of Git appears to be its strongest argument.

EDIT2: located a Git-vs-Perforce comparison on SO.[2]

So many of the diffs, however, have been addressed by sandboxing.

It really appears the strength is in popularity - everyone else uses Git, so you should too…

[0]: http://nvie.com/posts/a-successful-git-branching-model/ [1]: http://www.perforce.com/sites/default/files/pdf/perforce-git... [2]: http://stackoverflow.com/questions/222782/git-vs-perforce-tw...


I'll mention that many of the answers in this StackOverflow post [1] (note: that post is old, so it is entirely possible that Perforce has improved since then) cover the reasons I prefer git over Perforce, but, for me, cheap branching and speed (due to everything being local) are probably the two biggest.

Cheap branching, especially, completely changed my workflow. When a branch is effectively free, and switching between branches is quick, a lot of doors are opened.

And I say this as someone who really liked perforce previously. I just think git is more capable, albeit with a higher learning curve.

[1] http://stackoverflow.com/questions/222782/git-vs-perforce-tw...


You found the same one that I did while editing. If git is great, is it worth changing an enterprise to use it? When the auditors come, I need to list the changes on a server; when it's an app, the changes to that app. Who, what, when, where, why - reviewer, requestor, approver, author, deployer, etc. How granular can I get this with Git, and how far back? How solid is the nonrepudiation factor with Git? For commits - can I see who else might be looking at code? For security - can I protect others from getting the code?


Does Perforce provide cryptographic checksums? If not, git is actually a win as far as the auditors are concerned: git does not allow undetected changes to history.

You can make new history and garbage-collect the old, but for a fixed commit hash, all history and all contents are fixed forever. For extra security, you can sign the commits (i.e. their hashes) with your PGP key.
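A toy sketch of why a fixed hash pins all history (simplified: real git hashes tree objects and author metadata too, and newer versions support SHA-256): each commit id covers the commit's content and its parent's id, so tampering with any ancestor changes every descendant id.

```python
import hashlib

# Each commit id is a hash over the parent's id plus this commit's content,
# forming a chain: changing any ancestor changes all descendant ids.
def commit_id(content: bytes, parent_id: bytes) -> bytes:
    return hashlib.sha1(parent_id + content).hexdigest().encode()

c1 = commit_id(b"initial", b"")      # root commit, no parent
c2 = commit_id(b"feature", c1)       # child commit referencing c1

# Tampering with the root commit changes every later id:
c1_tampered = commit_id(b"initial-modified", b"")
assert commit_id(b"feature", c1_tampered) != c2
```

So anyone holding the hash of the tip commit can detect any modification anywhere in the history leading up to it.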

> how solid is the nonrepudiation factor with GIT?

Rock solid: as far as I can tell (from https://en.wikipedia.org/wiki/Perforce) in Perforce you have to trust your admin not to go behind your back and change history. In git you have cryptographic hashes to protect your history and content.

> for security - can i protect others from getting the code?

git leaves the protection to the file system permissions.

git doesn't track who requested a change nor why. People usually do this via conventions in their commit messages. You can write plug-ins to enforce these conventions.
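One hypothetical convention (the ticket-id format here is invented for illustration, not part of git): require the first line of every commit message to start with a ticket reference, and reject non-conforming commits from a commit-msg hook running a check like this.

```python
import re

# Hypothetical convention: first line must start with "PROJ-123: " style
# ticket reference, so each change can be traced back to who requested it
# and why. A commit-msg hook could run this and reject the commit otherwise.
TICKET_RE = re.compile(r"^[A-Z]+-\d+: ")

def message_ok(message: str) -> bool:
    first_line = message.splitlines()[0] if message else ""
    return bool(TICKET_RE.match(first_line))

assert message_ok("PROJ-42: record approver and reviewer in message body")
assert not message_ok("fixed a bug")
```

Running it client-side in commit-msg is convenient but advisory; to make the convention binding, the same check would have to run server-side (e.g. in a pre-receive hook) where individual users can't skip it.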


I have no experience with perforce so I cannot directly compare it to git. Some of the things you ask for can be done with git. As eru mentioned, you can sign a commit (and therefore all history leading to that commit), which will prevent anyone else from modifying its history undetected. With abundant signing and tagging, you could probably create a thorough audit trail.

Git by itself does not have granular access control, and relies on filesystem permissions for security. The most common tool I know of to finely control write access to git (other than GitHub) is gitolite[0]. But even gitolite only provides read access control at the repository level.

[0] What is gitolite? http://gitolite.com/gitolite/index.html#what


You bring up some valid questions - thanks. I think you are getting at a few scenarios where, in fact, perforce is better. I like git, by no means is it the best solution for all cases.

if git is great, is it worth changing an enterprise to use it?

There are a ton of factors involved in answering that question. A lengthy blog post could be made trying to answer that.

when the auditors come, I need to list the changes on a server. when its an app, the changes to that app. who, what, when, where, why - reviewer, requestor, approver, author, deployer, etc.etc.etc.

This depends on how git is set up. In the simple model of everyone on a team having push access to one origin server that serves, effectively, as the central repo, and that repo allows force pushes, then yes, this may not satisfy an auditor, as anyone on the team can rewrite history. Of course, when the central repo suddenly doesn't match up with all the copies everyone else has, somebody will probably realize that something is amiss.

That said, being a DVCS, it is not hard to set up something where only a few top-level people have push access to what we'll call the authority server, and all commits from everyone else have to flow through these trusted people. Additionally, force pushing can be disabled on the authority server, so no one person can rewrite history (aside from the people who have direct file system access to the authority server).

Or, think github - people fork a repo, then submit a pull request. Tons of people can commit to a repo, but most commits flow through a trusted few.

how granular and how far back can i get this with GIT?

If you are asking about history, the git history goes back to the very first commit. It is also easy to list the entire history of just one file, and more.

how solid is the nonrepudiation factor with GIT?

As with a couple of the previous answers, it all depends on how git is deployed in the organization. The answer will vary from none to rather good, depending on the deployment.

for commits - can i see who else might be looking at code? for security

If you've granted access to someone to be able to pull from your repository, then no, you can't see who is looking at what. Anyone with pull access will have access to the entire repo. Want finer grained control here without resorting to multiple repositories? Then git is not the right vcs. This is where something like Perforce will win.

can i protect others from getting the code?

In what way? People who already have some access to the repo? See the above answer. Or Joe Q Public? That is trivial.

--

Basically, the big area where something like Perforce (or even SVN I believe) will win is when you need fine grained control over security. E.g. User A only has access to this subfolder, User B only has access to this other subfolder, etc.



