I feel like Mercurial still has a chance to become a real force if it can provably solve some of the real issues with git. It won't do to just have a nicer CLI since most people are used to git by now.
Here are some of git's real problems:
* Performance issues with multi-GB git repos
* Handling of large binary files
* Submodules - Mercurial has subrepos, but I don't know how they compare
While I like LargeFiles, it's really not enough. Support for binary content (and indeed, history truncation) needs to be native. As it stands, LargeFiles can only be used on intranet networks, and if you've ever used it for any serious purpose, you'll run into corruption.
It's really disappointing to see 3.0 announced with no solution to this.
Why intranet only? The data still comes from the HG server through standard means.
And what kind of data corruption do you see? It tracks files by their SHA1 value and downloads them as needed. Did you report any corruption issues you hit? Or have links to anything specific?
There is opportunity for a powerful tool to handle the very common bad practice of including giant binary blobs which don't belong in version control.
I'd like to see a second class of files which are only checksummed (or timestamped) on 'status', 'diff', 'add' and then binary-diffed on commit for possible compression (or perhaps deduped with checksums of blocks) with features like 'git/hg binary-add somefile.jar' to differentiate.
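For what it's worth, Mercurial's largefiles extension (enabled with `largefiles =` under `[extensions]`) already gestures at this split at add time, although it stores the blobs out of band rather than binary-diffing them. A rough sketch, with made-up file names:

```
hg add --large assets/textures.pak   # tracked by checksum, contents fetched on demand
hg add src/engine.c                  # versioned normally
hg commit -m "Add engine source and texture pack"
```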
> There is opportunity for a powerful tool to handle the very common bad practice of including giant binary blobs which don't belong in version control
Why do you say it's bad practice? Game assets come to mind; these aren't derived artifacts, and you can't build the game without them.
Usually, you can build the game without them, up to the final packaging stage. Almost any game that is built on an engine can have all of the asset files swapped out without having to recompile anything, and as long as the file format and structure for assets is relatively stable there doesn't have to be a tight coupling between engine versions and asset versions except when doing QA and benchmarking.
This is one of the cases where git's inability to handle rewriting a public history is a pain. The art and programming departments should be able to develop along separate branches, and the programmers should be able to pull a tree from art that has everything between tags squashed (but in a reversible fashion in case there's a need to bisect something later).
There usually are dependencies between the game and its assets, but sure, these dependencies can be managed.
Historically I've dealt with the problem by keeping code and assets in separate repositories (using different version control systems). While it's worthwhile to do so in order to use git for code it really is a pain.
Exactly my initial point. It is really worthwhile to treat code and binary data differently, and the tools don't exist to make it not-so-painful.
To answer a couple of levels up: it's not bad practice because it isn't useful... there are many circumstances where having big binary blobs versioned and associated with code is useful. It is bad practice because the tools are all written _for_ code and can get nasty when filled with binaries.
So there's a problem and a solution, but nobody seems to have put together the right magic yet.
> There is opportunity for a powerful tool to handle the very common bad practice of including giant binary blobs which don't belong in version control.
It's only a bad practice because our tools don't support it properly.
There seems no reasonable argument that there is some predefined size limit that all assets in version control must, by natural law, fall under.
The criteria I have for version-controlled assets are that they must 1) be versionable, 2) have a comparison tool, and 3) be strongly connected to the other assets under control.
git-annex operates under a similar principle (committing hashes rather than file contents) but stores the actual file contents elsewhere rather than using compression/deduplication within the repo itself.
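Roughly, in git-annex terms (a sketch; the file name is made up):

```
git annex init
git annex add assets/big.iso    # replaces the file with a pointer keyed by its hash
git commit -m "Add installer image"
git annex get assets/big.iso    # later, fetch the real content from a remote that has it
```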
It's neat, but it's more about replacing something like Dropbox than it is replacing git (despite being based on git). It also isn't particularly stable, though I have nothing but respect for the developer.
As I understand it, it became something to replace Dropbox after Joey realized that git-annex's task of distributed storage of large files intersected with the Dropbox use-case, but it didn't start that way, and IMHO doesn't exclude the parent's suggestion.
I backed Joey's kickstarter, but to my shame I have yet to give git-annex-assistant a spin.
Those are interesting, and git has some similar large file extensions too. However during some years of experimentation I have become convinced that these features have to be part of core or they will always be second-rate citizens and not work well in practice.
Hg is modular, and lots of the more advanced features are shipped as extensions. Just because they are disabled by default doesn't mean that they are unsupported or second-rate citizens.
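For example, turning on a handful of the bundled extensions is just a few lines of configuration (the names below are common bundled ones; availability depends on your version):

```
# ~/.hgrc
[extensions]
rebase =
histedit =
shelve =
largefiles =
color =
```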
While a lot of tutorials mention how complicated git is in contrast to mercurial, I - being a git native - feel the other way around. Git is intuitive with a small number of concepts necessary to grasp my whole workflow.
Using this workflow with mercurial is really frustrating when I do it - the occasional pull request for a python-based project. A git branch as a concept is really simple; the mercurial ways I just can't wrap my head around (granted, I only use it occasionally).
As a platform, I like how all the porcelain in mercurial is implemented in high-level python. I can only wonder how productive writing custom porcelain commands in mercurial is given that interface.
There are two things people mean when they say Git is complicated.
Some people mean that Git's model is complicated, which is what you're talking about. I actually find its model very simple; some people find it complicated; but at any rate, I certainly think Git and Mercurial have comparable complexity in the model.
What most people mean, though, is that Git's UI is complicated. To be blunt, I think this is simply objectively correct. For example, "git checkout foo" might mean go to the foo branch, or might mean revert a file called "foo". There is no way to know. If you want to be sure to revert a file called foo, you can do "git checkout -- foo", I believe, but I don't know a branch equivalent (and it at any rate won't be symmetrical). Want to create a new branch? That's "git checkout -b newbranch". Want to delete it? That's "git branch -d newbranch". There are tons of things like this in the Git UI, where commands have basically arbitrary parameters in different contexts.
Way back when Git was first created, the plan was for what is now called Git to be the underlying implementation of a higher-level UI. I really, profoundly wish that had actually panned out. Instead, Git's low-level commands gradually grew more user-friendly until it hit the "good enough" zone. That's what people usually mean when they say Git's complicated.
I feel that people that rail on "git checkout" don't actually understand what it does. It does two things: 1) unpacks files from the repository into the working directory, and 2) updates the current working branch.
Want to revert your changes? That's unpacking from the repo to the working dir. Want to switch to a different branch? That's also unpacking files from the repo to the working dir. The only difference is that in one case the current working branch stays the same and in one case it changes (or you could look at it as in one case you set it to the current value and in the other you set it to something different).
This is why people say you need to understand the underlying model of git to grok it. The commands make perfect sense from the perspective of the data model.
Also, want to create a branch? "git branch new-branch". Delete it "git branch -d new-branch". The "checkout -b" is an optimization, just like "rm -r x" is an optimization of "find x | xargs rm" (why do two commands when you can do one?).
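In other words (using a hypothetical branch name):

```
git branch new-branch        # create the branch...
git checkout new-branch      # ...then switch to it
git checkout -b new-branch   # or do both in one step
```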
The only thing I don't like about git UI is that it can't decide what to call the "index". It's called "index", "staging area", and "cache" in various places (including command line flags), which is confusing.
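Just to illustrate the naming drift (file name is made up):

```
git diff --cached       # compare the index ("cache") against HEAD...
git diff --staged       # ...the same command under its "staging area" name
git rm --cached foo.c   # drop foo.c from the index but keep it on disk
```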
> This is why people say you need to understand the underlying model of git to grok it. The commands make perfect sense from the perspective of the data model.
Making people understand the internal data structures in order to understand a UI is... not good.
The "underlying model of git" that David refers to (blobs, trees, commits, tags, branches, HEAD, etc.) aren't seen as internal data structures to be glossed over by a UI. They're the very essence of git.
> The "underlying model of git" that David refers to (blobs, trees, commits, tags, branches, HEAD, etc.) aren't seen as internal data structures to be glossed over by a UI. They're the very essence of git.
Users don't need to know the 'very essence' of a tool in order to use it. Do you think that people who drive cars know in detail how a car works? They just need to know about starting the car, making it go forward and backward, parking, etc., and that's what most car drivers care about.
It's not all that helpful to compare version control tools with cars, I don't think. Users of a version control tool do need to know about branches, tags, commits and such because they are the heart of what version control is. If you've ever seen an environment where version control tooling has been introduced without the support of the developers, you'll have seen what happens when you treat it like a car: empty check-in messages, commits getting clobbered by "I'll just check in my version" merges, and contention over file locks in tools where locking was part of the model.
You don't have to understand how a carburetor works to drive a car, but you do have to know that "Drive" connects the engine to the wheels.
Git is like driving a manual transmission. In a manual car you have to understand that the clutch disengages the motor from the drive shaft and how the different gears work in general. It's not rocket science and most people can pick it up.
> In a manual car you have to understand that the clutch disengages the motor from the drive shaft and how the different gears work in general.
No, you most definitely don't, that's complete lunacy. The vast majority of (manual) drivers[0] have no idea how things work and they don't give a fuck. Different gears are for "go faster" and "go slower", and the clutch is for "change gear". People learn to do it right because the alternative is to stall or get a horrible grinding sound and pay top bucks to get stuff fixed, not because they understand how things work under the hood.
[0] in countries where it's the norm, not in countries like the US where manual is for nerds and the passionate
> Different gears are for "go faster" and "go slower", and the clutch is for "change gear".
Sorry, that's exactly what I meant by "the different gears work in general". You need to understand that underlying model before you can make it work, even if it's only intuitive, and not cerebral.
> the clutch is for "change gear"
That, and it's also "don't stall when I stop".
I think most people understand that it disconnects the engine. I don't think that's as crazy a leap as "understanding the details of how a gearbox works" (which I don't even know, since I've never looked in one).
> You need to understand that underlying model before you can make it work
No, you only need to understand what the final effect is. The vast majority of drivers neither understand nor care that gearbox speeds change the conversion ratio between the engine's rotation speed and the axle's; if you did, and had to, gears would be labelled by their conversion ratio, not 1-6 (and then you'd have to include the axle's conversion ratio in the mix). And I don't doubt that a Git-based gearbox would do exactly that, and that going in reverse would require either using an inverter or a completely separate reverse transmission.
The thing I find unpleasant with git is the inconsistency of commands. Why git branch -d for deleting branches, but git remote rm for deleting remotes? Etc.
As to git's basic model - a DAG of commits, with branches being little more than auto-updating labels for commits - that fell into place in the first 5 minutes of usage. My learning time with git was shorter than with any previous source control system I used, except for SourceSafe, insofar as that is an SCM.
Your complaints seem like a very indirect way to get what you want. Perhaps it's git's fault for creating these possibilities, but I didn't even know they existed.
If you want to revert/reset a file, why don't you just use the reset command? That seems like a more direct way than using checkout.
If you want to create a new branch, why not use "git branch branch-name"? As another said, the "git checkout -b branch-name" is just shorthand for convenience.
I've been learning git over a mere 2 months (I came from svn), and I'm finding its UI to be very pleasant.
As someone else said, that index/staging/cached situation is bad, though.
Interesting you should mention "git reset", because, depending on the options you give it, it either:
* reverts one file
* reverts all files
* removes changes from the index, but doesn't otherwise change anything at all
* moves your branch, without changing any files
* moves your branch and changes all files
This rather makes my point.
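For the record, here is roughly how those behaviours map onto invocations (a sketch, not an exhaustive list; paths and revisions are placeholders):

```
git reset -- foo.c        # unstage foo.c (index only; working tree untouched)
git reset                 # unstage everything
git reset --soft HEAD~1   # move the branch; index and working tree untouched
git reset --mixed HEAD~1  # move the branch and reset the index (the default mode)
git reset --hard HEAD~1   # move the branch and overwrite the working tree
```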
"git checkout -b", meanwhile, is not a shorthand for "git branch something", but rather "git branch --track something origin/something", unless there is no "origin/something", in which case you are again correct.
The branch equivalent is `git checkout foo --`. I wrote a simple shell script[1] that makes that the default, requiring you to do `-- <file>` for files, so there's never a possibility of ambiguity.
> I can only wonder how productive writing custom porcelain commands in mercurial is given that interface.
You don't have to use Python to add more commands on top of Mercurial. Its CLI is also its API, just like git, and you don't typically have to use Python to write tools on top of hg, just like you don't typically need to write C to hack on top of git.
The stdout of hg is guaranteed to remain stable, so you can script it by just parsing stdout. The options are guaranteed to remain stable, so your old tools won't need to be updated in case hg's CLI some day changes, because its CLI never changes.
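For instance, templates give you machine-readable output instead of scraping the human-oriented default (the keywords and filters below are standard template keywords):

```
hg log -l 5 --template '{node|short} {author|user} {desc|firstline}\n'
hg log -r 'draft()' --template '{node}\n'
```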
There are also a bunch of hg CLI commands that start with debug (e.g. hg debugparents or hg debugdag) that you can use to directly manipulate hg's internal revlog data structure and do fun things like create a corrupt repo, if that's what you want to do. :-)
I probably should have phrased this more clearly: I think that a python interface (or any other high level script interface) is the more convenient way for a programmer to extend or write hooks/extend the dcvs (I am now thinking about pre commit or post commit hooks in git, etc.).
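As a sketch of what that looks like on the hg side: an in-process pretxncommit hook is wired up in `.hg/hgrc` with something like `pretxncommit.checkmsg = python:/path/to/checkhooks.py:checkmsg` under `[hooks]` (the hook name, path, and policy below are made up):

```python
# checkhooks.py -- called in-process with ui and repo objects; a truthy return
# value rejects the pending commit. (On current Mercurial the strings would
# need to be bytes.)
def checkmsg(ui, repo, **kwargs):
    ctx = repo[kwargs['node']]
    if len(ctx.description().strip()) < 10:
        ui.warn("commit message is too short\n")
        return True
    return False
```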
> Git is intuitive with a small number of concepts necessary to grasp my whole workflow.
http://jordi.inversethought.com/blog/on-gitology/ explains git's complexity extremely well (without even getting into horribleness of the command line UI). The following is a key quote, though you should read the whole thing.
The following gitological concepts are not particular to git:
* repositories (repos)
* commits or changesets (csets)
* directed acyclic graph (DAG)
* branches
* pushing and pulling changes
* whole-repo tracking, not individual files
* rebasing csets
* pushing and pulling csets
The following are purely gitological and add unnecessary complexity, in addition to eventually being unavoidable:
* Exposing the index/staging area
* Exposing other implementation details: blobs, trees, commits, refs
* refs and refspecs
* Branches are refs
* Detached HEADs (a.k.a “not on a branch”)
* Distinguishing remote and local tracking branches
As with most software, it probably depends largely on what you're used to. I started my distributed SCM life in Mercurial (moving from svn) and found it to be fast, simple and friendly to use. Being an experienced Mercurial user, I hated git when I first started using it - changing something as core to one's development life as the SCM isn't easy.
But after a while of working with both, you start seeing git and hg as relatively equal products, just with different UI quirks.
I eventually moved most of my projects to git, though. Not because I thought it was better, but because it was more widely used. And being the collaborative type, using git was the path of least resistance.
Kind of sad how this post seemed to have resonated mostly for its reference to git; I intended (and failed) to draw attention to my questions about mercurial. As a git user, I see one of the main features of mercurial: as it is written in python, it can be used as a library. This should make tooling and cool developments so much easier for the average developer who is not a shell-script person.
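From what I understand, the hg developers treat the internal API as unstable and point people at the command server (python-hglib) for scripting, but "import it as a library" looks roughly like this (hg 2.x/3.x era; the path is a placeholder):

```python
# A sketch against Mercurial's internal API -- not a supported, stable interface.
from mercurial import ui, hg

repo = hg.repository(ui.ui(), '/path/to/repo')
for rev in repo:                 # iterate local revision numbers
    ctx = repo[rev]
    print("%s %s" % (ctx.hex()[:12], ctx.description().split('\n')[0]))
```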
The key to truly understanding how Git works is to remember that Git is just a tree of commits, with temporary and movable Post-it notes attached. (The Post-it notes represent branches and other references.)
That's a good image. The key difference from Mercurial is that Mercurial's tree of commits doesn't need Post-it notes. The tree can stand on its own without extra branch labels. You can add them (see bookmarks), but they're an optional feature.
I'm excited because it is SO MUCH more powerful than git's commit history rewriting, because "I re-wrote history" becomes part of your (distributed) repository's history.
Exactly! The fact that Git allows you to destructively rewrite public history is EVIL. Not every user of revision control is going to have a PhD in Not Fucking Up The Repo, and I never want a situation like the Jenkins devs had[1] to occur with my projects.
"Git", (really, whatever git server you are using for your publicly accessible repo) only allows you to do that if you allow it to allow you to do that. Rejecting non-fast-forward commits is standard practice in every git shop that I've worked in.
We use git rewriting extensively and do allow non-FF commits. We also have process and tools in place such that master is held sacred (and our live branch is untouchable). Every once in a blue moon, a master branch will get a non-FF, and our tooling is such that we make sure everyone knows it happened.
However, we are a small team consisting of mostly Linux kernel developers, so that may influence the level of trust we put in not screwing things up. We also work pretty much independently; were that not the case, this would get ugly.
This is probably due to an understanding of rebases: rebases should be used to bring your commits to the top of the stack for easy review in OSS projects.
Amazingly useful for making sure patches are easy to apply while following a remote branch.
My biggest problem with hg is the lack of real topic branches and how they become impossible to delete -- and having to use things like quilt on top to try to make local branches more sane -- but it's frustrating because it's not the same thing.
> and having to use things like quilt on top to try to make local branches more sane
I consider Mercurial Queues one of Mercurial's youthful mistakes. It was OK in 2005, but we have much better things now with bookmarks, histedit, and rebase, and even with hg commit --amend.
I think I meant queues. I was looking for the plugin and failing to find it, but it was basically a thing like quilt but an Hg extension. Was not a fan :)
> I'm excited because it is SO MUCH more powerful than git's commit history rewriting, because "I re-wrote history" becomes part of your (distributed) repository's history.
That can be a feature or a bug.
This would be wildly useful for a public branch that needs periodic rebasing, because unlike a git rebased branch, you'd have a history of the rewrites.
On the other hand, most users who locally use git rebase -i to transform a local series of WIP patches into a sensible patch series for submission do not want any record of the intermediate commits (which may not bisect, or even build, and which may have commit messages like "WIP: try fixing it again"). git makes it easy and sensible to commit early and often, and then sort out a sensible patch series from the result.
This is of course a straw man as Mercurial already has several tools for this (MQ, rebase, histedit), which will keep working as-is even with changeset evolution. So changeset evolution allows things in addition to the local rebasing.
Not only that, but outdated changes aren't shared by clone or pull unless you explicitly ask for them. The full history stays on the server but intermediate commits slowly fade away as users hardly ever pull them.
I was impressed and excited when I read about that feature, but as I chewed on it for a while I realized I don't think I would ever use it. Most of my history rewriting is done locally before I've ever pushed—in that case I don't want those kept track of because they are throwaway (the same reason I don't check in a file after every single character change).
For the case when you want to push out to the world… Well, that's mostly for catastrophic things—oh, no I accidentally checked in the private key! In that case I also don't want the history of that kept around.
So I don't know. I like the idea, but I'm not sure when it's applicable.
>Most of my history rewriting is done locally before I've ever pushed
That's just a habit you acquired because right now rewriting public history is a "problem". It shouldn't be a problem. In fact, it's something people do; e.g. how about being able to edit a pull request while it's being discussed, and it being OK if that pull request gets merged mid-discussion?
Perhaps, but I think it's more fundamental than that. Like I said earlier, I don't save after every character or word typed into a file, and I don't even commit changes until I think they might be ready. I don't push until it works (for some definition of "works", depending on how lazy I am and what project it is).
There are all these different levels of "saveyness" and the lower levels are just not as interesting.
In particular, I don't want stupid untested typos in the commit history because they just aren't interesting or helpful.
But I'm willing to say that I'm probably missing something… I just don't see what it is yet. :-)
When you enable mercurial's evolve, history rewriting operations are no longer destructive. Mercurial's evolve creates a sort of repository "meta history" by saving every revision before you modify it (e.g. by using amend). It makes those saved, old versions "obsolete" and it "hides" them (so that they won't show on your DAG when you do "hg log", for example, unless you use the --hidden flag). Evolve also keeps track of the relationship between obsolete revisions and their "successors". That is, it will know whether a certain revision in your DAG is the result of amending one revision, or perhaps of folding several revisions into one, or splitting one revision in two or perhaps just removing a revision from the DAG (this is what I called repository meta history above).
Those hidden, obsolete revisions are not shown on your DAG and they are generally neither pushed nor pulled. In most respects they behave as if they were not even there. It is only when you need them that you can show them or go back to them (by using the --hidden flag of some of Mercurial's commands, such as hg log or hg update). This gives you a nice safety net (since rewriting history is no longer a destructive operation) that you can use _if you want_. It also makes it possible to rewrite revisions that you have already shared with other users (since when you push a successor revision you also push the list of revisions that it is the successor of).
I think evolve is a significant step forward on the DVCS paradigm as it enables safe, distributed, collaborative history rewriting. This is something that, AFAIK, was not possible up until now.
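A minimal local sketch, assuming the evolve extension (distributed separately from core at the time) is installed and enabled:

```
# ~/.hgrc
[extensions]
evolve =

# then:
hg commit -m "WIP: first try"
hg commit --amend -m "Implement feature X"   # the old commit becomes obsolete, not stripped
hg log                                        # the obsolete version is hidden
hg log --hidden                               # ...but still there if you need it
hg evolve                                     # rebase any unstable descendants onto successors
```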
This is not the same. Well, it's sort of the same infrastructure, but it would require a lot of work to actually work like hg evolve.
With Evolve, there is something similar to .git/refs/replace, called obsolescence markers, which may or may not indicate which commit replaces the obsolete commit (some commits are replaced, others are just pruned). These markers are created automatically every time you rewrite history. They don't have to be created manually like with git replace. Moreover, the obsolete commits are hidden from the UI unless you pass the --hidden argument to commands. Lastly, these obsolescence markers are propagated with push and pull operations. It doesn't seem to me like git replace can work over the wire?
> It doesn't seem to me like git replace can work over the wire?
I thought that being in the /refs/ namespace would make them eligible for easy synchronization once set up, but on second thought it doesn't seem like it. Git examines parents of refs to determine when something needs to be updated, but would use the parent of the object replacement in this case.
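You can propagate them, but only with explicit refspecs; something like (object IDs are placeholders):

```
git replace <broken-sha> <fixed-sha>               # creates refs/replace/<broken-sha>
git push origin 'refs/replace/*:refs/replace/*'    # replacements are not pushed by default
git fetch origin 'refs/replace/*:refs/replace/*'   # peers must opt in to fetch them
```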
I think a mechanism similar to "git notes" would be better, where the ref points to a history of commits with each tree containing files for each replaced object. I've hacked git to do this at one point so we could retroactively edit git commit messages, but abandoned the effort after discovering "git notes".
Changeset evolution doesn't actually delete the entries in the history log. Rather it is a commit that changes what history looks like at a certain point in time. There are CLI commands to show the commits that are actually obsolete or have been rewritten.
Since it's really a commit that changes what history looks like, it's safe to push to other users.
In git, when you rewrite your history, the old version is gone (well, you still have the reflog for 30 days, but then it's basically over). This is why some commands like `git push --force` after a rebase can cause so much hassle to a community (hello Jenkins!)
Here, the principle is to keep all the history, and its rewrites, forever, and to ease the distribution of those changesets.
Nitpick: The old version will not be gone unless it is no longer reachable from any of the refs. Making it no longer reachable from merely one of many refs will not cause it to be GC'd.
This isn't really a nitpick though, since this means that similar porcelain could be implemented on top of git fairly easily. The underlying data-model supports it.
Changeset evolution has some similarities with a distributed reflog. Like Git, commits are immutable in Mercurial and we can only "change" a commit by creating a new version and then hide the old version. Mercurial "hides" the old version today by stripping it from the repository — the old version is then stored in a bundle in the .hg/strip-backups folder.
This is far from optimal, so a first step was to add a concept of hidden commits. Hidden changesets have been part of core Mercurial for some time now. The evolve extension enables it and actually changes commands to use it. So "hg commit --amend" will normally strip the old commit, but when evolve is enabled, it will instead hide it. This is both faster and safer.
The next step is the introduction of obsolescence markers. These are small markers that tell you when a commit is succeeded by a better version. When you amend a commit, an obsolescence marker will be created that says "the new version obsoleted the old version". This information is something that Git doesn't store, and it is by distributing these markers that we can make Mercurial more intelligent. As an example, if I amend a commit that you have already based work on, then evolve will know that it should rebase your work onto the successor I created. It will tell you about this when you pull from me and get the new version along with the obsolescence marker. Your commits will be called "unstable" at that point, meaning that they are descendants of a commit marked obsolete (they descend from the commit I amended and thus marked obsolete). You can run "hg evolve" and it will figure out that it should run "hg rebase" behind the scenes.
Seen like this, I would say evolve is similar to what happens in Git when you edit history, but with some extra meta data that will allow you to edit shared history with confidence.
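The shared-history case above, in rough command form (revset name as used by evolve at the time):

```
hg pull                   # brings in the amended commit plus its obsolescence marker
hg log -r 'unstable()'    # your commits that still sit on the now-obsolete parent
hg evolve --all           # rebase them onto the successor automatically
```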
In hg "rebase" just means "change the base" not "rewrite commits". So I assume you mean "rewrite" in general.
With evolve, the obsolete commits stay around foreverish, but they slowly fade from history as new people clone or pull, since obsolete commits don't get pulled or pushed by default.
> If I accidentally commit "the keys to the kingdom" how do I get them out of the history?
Mercurial never actually removes any functionality, since it's got the deepest commitment to backwards compatibility I've ever seen. Thus, you will delete commits the same way you do now: hg strip --no-backup. That deletes commits with extreme prejudice, locally. Now you just have to run this in every copy of your repo, including remote ones, but the genie-out-of-the-bottle problem is one you can't avoid with a DVCS.
> In hg "rebase" just means "change the base" not "rewrite commits". So I assume you mean "rewrite" in general.
rebase doesn't mean "rewrite commits"; it means "create new commits, based on these ones, on top of a new base". Your original commits are still there, and are pushed to the remote, but are GCed after a certain period (default 30 days?) if they are not referenced from anywhere. Since unreferenced commits are pushed to the remote, you can easily restore those commits within the GC period.
When you rewrite a book and create a new edition, you don't typically recall the old editions. A rewrite needn't be destructive.
My point is that git says "rebase" even when the underlying base of the commits affected is not changing. This is an artifact of the UI, since the command to rewrite in git is typically git rebase -i.
You can still permanently delete a changeset using `hg strip`. Of course if you have pushed that changeset you will need to run `hg strip` on all remote repositories that have a copy of the commit.
Mercurial will also (very helpfully) create a backup bundle of the changeset you strip, so you will need to securely erase that as well.
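Something like this (the revision and bundle name are placeholders; backup bundles are named roughly <hash>-backup.hg):

```
hg strip --no-backup <rev>               # delete outright, skipping the safety bundle
ls .hg/strip-backup/                     # or find the bundle a plain "hg strip" left behind
rm .hg/strip-backup/<hash>-backup.hg     # and remove it if the content is sensitive
```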
If you've accidentally published a key, your only safe option is to change that key. Abolishing it from history is closing the gate after the proverbial horse is well into the distance.
This is what I was wondering. Is it possible to do this with mercurial? I am thinking that it is, but the evolution thing is a way to handle other types of rebase situations.
I'd say that with Phases[1] and Publishing vs. Non-Publishing Repositories[2] Mercurial is already ahead of Git in terms of safe history rewriting. Changeset Evolution is significantly more powerful than anything Git has to offer.
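For reference, making a repository non-publishing is one configuration knob on the server side, which keeps incoming changesets in the rewrite-safe "draft" phase:

```
# server-side .hg/hgrc
[phases]
publish = False
```

You can inspect phases with `hg phase -r .` and, if something was published by mistake, force it back with `hg phase --draft --force -r REV`.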
For me, the extreme popularity of GitHub is the No. 1 reason to avoid it as I do not want to contribute to the centralization of the Web. Also I like Fossil's [fossil-scm.org] approach of integrating wiki and bug tracker into DVCS, which allows projects to be less dependent on hosting services.
Fossil's user base has yet to reach a critical mass; mercurial and git are at least widely used, and plugins and services exist for them. I wonder if darcs will become a viable option again.
For git there is ticgit, which is a tracker that lives as a git branch.
It's definitely great to have several tools available and to see a really productive evolution of good version control systems.
I definitely prefer hg but github has pretty much won the war, so I'm stuck with git. At the end of the day I'm not going to get religious about version control, I have vim for that :)
Sadly bitbucket does not have very good support for features from "modern" mercurial like bookmarks and changeset evolution. Since most of their users are git users I don't think they are strongly motivated to shore up mercurial support.
The advantage of changeset evolution is obvious to me. But as a longtime Mercurial user, I don't grok the benefit of bookmarks. Do you have a simple example where a bookmark is more useful than a permanent branch?
The teams I work with don't like the permanence of mercurial named branches. Bookmarks are a nice lightweight alternative way to name a short-lived head, rather than creating a permanent branch name for a two-line commit. Before bookmarks we used anonymous heads.
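For anyone following along, the bookmark workflow is roughly (the bookmark name is made up):

```
hg bookmark issue1234            # a light, movable label on the current head
hg update issue1234              # work on it like a branch
hg push -B issue1234             # share it explicitly if you want to
hg bookmark --delete issue1234   # and it leaves no permanent name behind
```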
Has Facebook switched to Mercurial? One news item says so; in the next one I read they have a ~50 GB Git repo. If they really switched, it would be a good advertisement for Mercurial and its capabilities.
There are a couple of settings in the repository-level configuration file which tell Mercurial where to pull from (`default`) and where to push to (`default-push` or `default`; think "remotes" in Git parlance):
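For example (URLs are placeholders):

```
# .hg/hgrc
[paths]
default = https://hg.example.org/upstream/project
default-push = ssh://hg@example.org/me/project
```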