Ah, there it is. I was wondering when this would happen.
Facebook used to be involved with the Mercurial community, but it was difficult to work with them. They always wanted to do things their way, had their own intentions, and started to demand that the Mercurial project work the way that Facebook wanted. For example, they demanded that we start using Phabricator and started slowly removing sequential revisions from Mercurial in favour of always using node hashes everywhere, arguing that for their gigantic repos, sequential revisions were so big as to be useless.
Eventually the disagreements were too great, and Facebook just stopped publicly talking about Mercurial.
I figured they would emerge a few years later with their fork of it. They love doing this. HipHop VM for PHP, Apache Hive, MyRocks; these are examples of Facebook forking off their development in private and then later emerging with something they built on top of it.
The Mercurial project is surprisingly still chugging along, and there are still those of us who actually use Mercurial. I doubt I'll switch over to Sapling, because I disagreed with the things that made Facebook fork off in the first place. But if others like Sapling and this manages to put the slightest dent into the git monoculture, I'm happy for the change and innovation. I really hope that git is not the final word in version control. I want to see more ideas be spread and that people can see that there can be a world beyond git.
Absorb is amazing! Even if you don't take up Sapling, there's a 'git absorb' plugin which I have found absolutely invaluable: https://github.com/tummychow/git-absorb
git-fixup will add fixup! commits, so it still needs the mentioned 'git rebase -i --autosquash' afterwards. Usually you do not even need to give it a specific commit if your branch is set to track an upstream branch.
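For anyone who hasn't seen it, the fixup + autosquash workflow looks roughly like this. A throwaway-repo sketch (file names and commit messages are invented for illustration):

```shell
# Create a scratch repo and a commit we'll later want to amend.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
echo v1 > file.txt
git add file.txt
git commit -qm "add file"
# Later, fix a mistake in that commit:
echo v2 > file.txt
git add file.txt
git commit -q --fixup HEAD          # records a "fixup! add file" commit
# Fold the fixup back in; GIT_SEQUENCE_EDITOR=: accepts the generated todo list.
GIT_SEQUENCE_EDITOR=: git rebase -i --autosquash --root
git log --oneline                   # a single "add file" commit remains
```

The point of git-absorb (and hg absorb) is automating the `--fixup HEAD` step: it figures out which commit each hunk belongs to.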
Absorb is fantastic and one of Jun Wu's (seen in this thread as quark12) best contributions to Mercurial. I want everyone to know about this tool, it's amazing. I had fun trying to come up with a name for the feature:
It seems this is still based on rewriting history. Does hg have anything to address the downsides of rebasing that are present in git, such as the chilling effect on collaboration across branches (e.g. you don't dare touch or build on remote branches that someone may rebase) and the destruction of the actual history, which you might later need, e.g. to reconstruct the timeline of a bug and a fix that was later found to be important?
Slight correction: HipHop for PHP was cleanroom, including rewriting large families of native extensions to work with its C++ runtime, although it eventually developed workalikes for the PHP dev headers to ease development. Source: I worked on HHVM, its JIT successor that initially shared its source tree and runtime.
Facebook developers seem to have a surprising amount of free time to go around reinventing things that are not obviously social network features. (Or to have had it in the 2010s, at least.)
Note WhatsApp had 35 employees when they were acquired and Instagram had 13. At that size you need to be productive at managing servers but you're probably not thinking how great it'd be to have a "whole new programming language and source control system" team.
WhatsApp and Instagram at the point of acquisition were simpler than Facebook is (and was), or even compared to what it is now. Once you scale you start to need a lot of engineers to help keep things standing up and everyone on the same page.
WhatsApp had like half a billion monthly active users when they were acquired, that could be considered fairly large scale, no? But I agree with your point in general.
Yes, but WhatsApp is a point-to-point communication tool with mostly small groups. Each individual message doesn't need to be distributed to a potentially very large audience like in Facebook, making processing and coordination of nodes smaller and simpler.
Note though that other large projects with similar git-scaling problems tended to just write wrapper tools to work around it; see how Chromium and Android do it.
I switched from git to Mercurial and was absolutely gobsmacked by how much better it is. The only comparison was switching from a Blackberry to an iPhone - everything just works exactly the way I want it to.
Yes, I read the manual for git, but I never needed to for Mercurial.
I guess you didn't start from cvs or svn, because the experience of moving to git was otherworldly. If git was your first versioning system then you had to learn some concepts first, concepts you had already internalized by the time you switched to Mercurial.
Due to my workplace I moved from svn to hg (awesome experience) and then from hg to git (terrible experience). That was years ago, and the git experience remains terrible.
If you're interested, I can compare tfs to git. Maybe it's down to IDE integration quality, but FWIW:
1) tfs: shelves are named and can be worked with independently; git: stashes are numbered in a sort of stack; the top stash is unpacked and then deleted, destroying your data (infuriating); also, local edits are moved into the stash, not copied.
2) tfs: branches are mapped to different folders and can be worked on simultaneously; git: branches are mapped to a single folder, and switching branches deletes local edits.
3) tfs: pulling from server merges new text preserving local changes; git: pull and checkout deletes local changes.
4) tfs: can't commit unresolved merge conflicts; git: commits just fine, it's also not obvious if you have merge conflicts to commit or not.
5) tfs: handles concurrent edits as merge conflicts and handles them as they occur; git: you create feature branches, and in the case of concurrent branches you have merge conflicts when you merge to master, and then you have point 4. Feature branches are advertised as a big fat killer feature of git, but I don't quite see the win here; you still have merge conflicts.
6) tfs: all commits in a branch are visible, it's not obvious how to delete them, you can only create rollback commits; git: if something happens to the branch label, the commits are gone.
> 1) tfs: shelves are named and can be worked with independently; git: stashes (...)
I don't think your comparison makes sense. Git stash is a way to quickly get to a clean local workspace that can be reversed as you see fit. You use local branches to track independent work.
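Also worth noting: stashes can be named, and `apply` (unlike `pop`) keeps the entry around. A throwaway-repo sketch (file names and the stash message are invented):

```shell
# Scratch repo with one committed file.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
echo base > notes.txt
git add notes.txt
git commit -qm "base"
# Make a local edit, stash it with a name, then restore it non-destructively.
echo wip >> notes.txt
git stash push -q -m "my wip"   # stashes can carry a message, not just a number
git stash apply -q              # restores the edit but keeps the stash entry
git stash list                  # "my wip" is still listed
```

So the data-destroying behavior described above is specific to `pop` (and even `pop` only drops the entry when the apply succeeds cleanly).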
> tfs: can't commit unresolved merge conflicts; git: commits just fine
Again, this doesn't make sense. In git, users determine whether a conflict is resolved or not. If you commit a changeset originating from a merge conflict, you have to explicitly state that the conflict was resolved and your changes are good.
> Feature branches are advertised as a big fat killer feature of git, but I don't quite see the win here, you still have merge conflicts.
Honestly, I didn't understand the point you were trying to make. Merge conflicts happen because multiple sources of change touch the same document region in a way that can't be resolved automatically, thus needing human intervention. To the best of my knowledge, there is no VCS in the world that eliminates merge conflicts.
Regarding Git's support for feature branches, the fact that you don't understand the big win Git brought to the world with its branching model is already a testament to how groundbreaking Git was at the time, and how everyone around was quick to roll out Git clones that follow the same approach. To see what I mean, spend a day working with an SVN repository trying to do work involving feature branches.
> git: if something happens to the branch label, the commits are gone.
Aren't you actually saying that if you delete a branch then the branch is deleted?
I assume tfs and svn work with feature branches the same way. In my experience with feature branches in tfs they were a hindrance, not a win.
Local branches store immutable state, but shelves can be unpacked anywhere (like a stash), so local changes carry over, whereas git deletes them at every opportunity.
In a nutshell, git has an unintuitive and unfriendly CLI with bad defaults.
I want my VCS to be quiet, out of sight and do as it's told, because my main focus should be programming, not how to tame a tool that's supposed to save text. The fact that you have to "learn git", and that there are so many StackOverflow git question on how to do (what should be) trivial operations is probably a hint that things aren't great in the usability department.
For hg, I just read an introductory guide (I think it was Joel Spolsky's) and that was enough. I used to be able to do a more in-depth comparison and criticism, but nowadays I use git and try not to think about it too much.
> In a nutshell, git has an unintuitive and unfriendly CLI with bad defaults.
I see this claim often, but it is never accompanied by evidence or any concrete example.
I've been using Git for years and I never noticed any semblance of unintuitiveness or bad defaults. Everything in the happy path is straight-forward, and all obscure things are a quick googling away.
Do you actually have any concrete example to back your claims? What's the absolute best example you can come up with of said unintuitiveness and unfriendliness?
In case I wasn't clear, "git has an unintuitive and unfriendly CLI with bad defaults" when compared to hg.
Just compare the man pages! "git help clone" vs "hg help clone".
A random (trivial) example off the top of my head. When working with branches:
> git pull
> git switch another-branch
Your branch is behind 'origin/another-branch' by 2 commits, and can be fast-forwarded.
(use "git pull" to update your local branch)
... why? I literally just pulled, why are you asking me to pull again? 99% of the people on the planet literally want the last version of that branch (provided there's no local changes leading to a conflict).
Compare with hg:
> hg pull
> hg update another-branch
Done.
I haven't worked with hg in a long time so I can't really provide an "absolute best example". All I can say is that, from memory, hg always got out of the way, and when I wanted to do something out of the ordinary I could either guess how to do it or it was easy to figure out from the manual.
With git, almost nothing's easy. It can become easy if you invest a lot of time in understanding how it works internally (which explains some of its CLI choices). But that to me is a sign of a bad tool. "all obscure things are a quick googling away" - why expose the user to obscure things to begin with?
The entire CLI is badly designed, with highly non-orthogonal commands interacting in unexpected ways. Recent versions of git have started introducing commands with a more top-down design (e.g. git switch), but that's a fairly recent development.
Git also diverged significantly from the SVN command line, but instead of following the tasteful Darcs path of making commands clearer and cleaner it commonly:
- reused the same terms for different operations, usually less intuitive ones (e.g. the absolutely awful "git revert")
- removed clear commands and confusingly tacked their function onto others (e.g. "git add" to mark merge conflicts as resolved)
On that front, mercurial extended the existing command set much more cleanly and clearly.
Mercurial's CLI felt like an improved extension of SVN's, Darcs felt like a drastically different take with plenty for it, Git's felt like one of those ransom letters cut out of newspapers, full of jangly bits which make little sense, and concepts which worked fine altered for no perceivable reason.
The fact, e.g., that checking out a branch is a completely different operation from checking out a single file or folder? (One changes HEAD, one doesn't; one warns about overwriting local changes, one doesn't; etc.)
My experience is the opposite. Most of the staunch Mercurial supporters came from CVS or SVN and found Mercurial to fit better into their preexisting mental model of source control.
Users that started with Git are more likely to internalize Git's concepts as "the natural way to do version control", and more likely to find Mercurial counterintuitive.
Not my experience. As it happens, Mercurial and Git are very similar in terms of the "mental model" required to use them: history is just a DAG, and common operations consist of navigating the history and creating/editing/re-organizing commits.
But where mercurial shines with a clean, consistent and simple UX, git is a mess where storage-layer abstractions leak to the user and where single commands serve multiple unrelated purposes.
To give you a practical example, any command that takes commit(s) as argument (for checking-out, logging, rebasing, …) can be passed a `revset` in mercurial-land. Revsets are a simple DSL for addressing commits (by hash, lineage, topology, distance, …), which makes new commands easy to learn and renders one third of `git help log` redundant. Most commands that output something take templates as argument, which renders another half of `git help log` laughable.
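To make that concrete, revsets compose like small queries and work with any command that takes revisions. A few illustrative examples (the branch and user names are invented; this is a sketch, not a transcript):

```shell
hg log -r "ancestors(.)"                      # history leading to the working parent
hg log -r "branch(default) and user(alice)"   # alice's commits on the default branch
hg log -r "last(public(), 5)"                 # the five most recent public commits
hg rebase -r "draft() and not merge()" -d default   # rebase all non-merge drafts
```

Every predicate here (`ancestors`, `branch`, `user`, `last`, `public`, `draft`, `merge`) is part of the same language, documented once under `hg help revsets`, rather than as per-command flags.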
There is nothing "natural" about the git UX, you've got to accept that it grew organically with no attention to details.
I really love git for the way it has taught me so many things by being so exposed. I love that you can easily use it as p2p, via email as a server or client, as a ci/cd solution. I love that you can easily inspect its model with cat. I love that while “branches” and “tags” seem special, you could just as easily use its “notes” or even just tack on your own ref system willy-nilly.
All this to say, I love git, and as a versioning system, it seems obvious to me that we can do better than git for 90% of workflows. Just yesterday I was surprised by how `git push --tags` worked and had to read the man page for `git-push` to see I wanted `git push --follow-tags`. Just reading this forum today I see that, alternatively, I probably(?) could have figured out what I wanted with `git help push`, which I didn't even really realize was a thing.
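For reference, the distinction can be demonstrated against a throwaway local "remote" (all names here are invented): `--tags` pushes every tag, while `--follow-tags` pushes only annotated tags reachable from the commits being pushed.

```shell
# Bare repo standing in for the remote, plus a working repo.
set -e
remote=$(mktemp -d)
work=$(mktemp -d)
git init -q --bare "$remote"
cd "$work"
git init -q
git config user.email demo@example.com
git config user.name demo
git remote add origin "$remote"
git commit -q --allow-empty -m "one"
git tag -a v1 -m "release one"   # annotated, reachable from HEAD
git tag scratch                  # lightweight; only --tags would send this
git push -q --follow-tags origin HEAD
git ls-remote --tags origin      # v1 arrived; scratch did not
```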
But do I want to use an easier versioning system? I’m not sure. git has a kind of forcing function for a growth mindset. I value it for its creative expressivity.
Edit for sibling comment: Stockholm Syndrome, quite possibly. But as the “victim,” I still love git!
heh, I'm highly sympathetic to what you say :) . I also got to use git after being told how awesome it was, by people who themselves were told how awesome it was, and so on and so forth. And for many years I was a believer too, and probably raised the bar of what I found acceptable as a result.
That's not to downplay git's qualities, but as a social experiment in soft peer pressure and groupthink, I think it has value (though that's beyond my field of expertise).
I came to mercurial when, after shooting myself in the foot with git for the N-th time and going on a rant about it, someone on IRC told me to give mercurial a shot and move on with my life. I confess that TortoiseHG helped me translate my git habits into the equivalent hg commands, and the kind of history exploration that I was doing then set me up to speed with the revset way.
Then, what I found formidable was that all the knowledge about git intricacies that I had accrued and internalized with pride over the years became absolute no-brainers and irrelevant in the hg world: I remember a famous stackoverflow thread in 10 steps for merging two unrelated repos (including arcane git commands, shelling out to sed, and non-transactional storage-level ops that would warrant a backup, as was the norm back then). How do you get history from a repo in hg? `hg pull`. How would you go about getting history from an unrelated repo in hg? `hg pull` as well. And thinking about it, had git been nicely designed, it wouldn't have had to care about the difference, and even less so to expose it to the user.
Mercurial, although not perfect, really opened my eyes on what good UX design should look like.
People don't internalize Git concepts as "natural", they get Stockholm Syndrome.
I'm not saying mercurial is better, but there's a reason I have to remind people that this[1] is satire - the real manuals are so convoluted that they seem like parodies of themselves.
There are two reasons Git is better, and the UI is NOT one of them: speed and repository format. I think Keith Packard wrote about this best: under the covers, Git is technically way superior to Hg; it's not even a competition. For usability, I think with Sapling we finally get some real competition, and that might be the user interface alternative we need, as well as the monorepo approach it was basically made for. But let's just say that I don't trust people who think Mercurial is technically even close to a good idea.
I think the problem arises from when you switched to hg from those older systems and grew accustomed to it and then were FORCED to switch to git because of work or whatever.
I feel like the Mercurial fanbase is just another loud fanbase. I did start with CVS and SVN, and git was absolutely fantastic when I started using it. On the other hand, I could never get the hang of Mercurial. From MY PERSONAL perspective it has a terrible UX (which is the exact opposite of the experience of the loud hg-fan git critics on HN). I absolutely cannot relate to the people who say that Mercurial has the better UX, but I don't remember constantly bashing mercurial either, like the other side does. From my perspective `git add -p` is an important, essential piece of functionality that hg does not provide. I believe there was some sort of plugin, but it was nowhere near as polished.
Yes, there are a handful of nice features in mercurial, but none that are actually needed in git core.
I do faintly remember that my biggest problem with git was understanding that a commit doesn't push automatically. But then again that's just a difference between a DVCS and whatever was there before.
Having used svn, git, and mercurial, then gone back to git (the last two at the same job, the business unit having decided to move from mercurial to a git-based solution), I've preferred mercurial to git: git feels like it has more possible foot-guns, and its features don't bring enough value to justify the increased complexity of using it.
What is it that Mercurial does differently? I've read many times that Mercurial is easier, but I've never seen the reasons. The times I've used mercurial (mostly to download a repo) it doesn't look that different, and other than the staging-area weirdness I don't find git too difficult.
You know how when using git you have to add tons of flags to every command to make it do what a sane person would actually want? Mercurial just does what you want.
I'm going to be blunt and state that what most people want is not what they actually need. If you use Git for how it was designed, the commands remain generally pretty clean. If you come from an SVN mindset, you are probably not really using Git the way it was designed, and Mercurial fits better.
I have used (actually introduced) Mercurial before at a company and considered them basically equivalent enough, only to get stuck in some horrible design choices of early Mercurial (named branches and not having rebase by default). I am happy to see these elements corrected in Sapling, giving me enough confidence that I might actually use Sapling over time...
The way Mercurial is structured is, to me, intuitive, easy, and powerful. I'm still using it for all of my local projects; luckily hg-git lets you pull in dependencies from github, which would otherwise require me to switch to git.
As someone who used to be intimately involved in the development of PHP, HHVM was an interesting project because for a long time it supported standard PHP (alongside Facebook's custom language, Hack), so it brought competition to the implementation space! But eventually Facebook lost interest in that part, probably because they had no use for it.
If Sapling encourages a Git-usability renaissance the same way HHVM encouraged PHP to get good performance and typing, even if it eventually gets abandoned, I will be thankful for it :)
Mercurial has a far better user experience and mental model. The commonly used commands simply do what their names say: hg add, hg commit, hg revert, hg update, and so on.
Git is a twisty maze of operations combined under poor names (e.g. git reset) with dozens of obscure options (e.g. man git-log) and broken abstractions (e.g. what is HEAD and why does Git emit a warning whenever I check out a tag?). I often feel so sad that the entire software industry has fallen to Stockholm syndrome under Git — we think these contortions are normal, when in fact they are arcane.
I was going to say that they already had started their own successor to Mercurial called Eden, but it seems like Sapling is just a renaming of Eden. Maybe anyway. It's a bit unclear.
It's a pet peeve of mine when folks use Python's subprocess as a replacement for bash, because it takes special discipline not to eat stdout and stderr, or to correctly try/finally to show the process details before the raise eats the variable.
> I really hope that git is not the final word in version control.
common problem in open source. any project that gets big enough effectively stops anyone from wanting to work on an alternative, or use an alternative, due to the momentum of the large project.
deviating from it makes it harder to collaborate or be productive because the big project does everything (though often poorly), everyone knows it already, and no one wants to learn something new, and no one wants to work with the people using the weird thing.
same reason why it's hard to make a Facebook alternative.
That's true, but that's not the whole story either. I remember not too long ago when it wasn't a big deal to jump from one VCS to another. Devs were proficient in CVS, SVN, HG, BZR, and Git as standard; they would go from mailing patches to a mailing list on one project, to pushing to a repo on another, to zipping code and uploading it to an FTP server on a third. It was the project's workflow of choice, and people respected that.
Things really became one-sided after github started gamifying open-source contributions, and when a new generation, who perhaps grew up in a more competitive academic setting, took it as an opportunity to make their resumes more impressive.
We peer-pressured ourselves into collectively using a less-than-ideal tech because that was the price to pay to belong.
I'd add on that a tools ecosystem integrated with git/hub over other options has also made git adoption a more natural fit for those who are less about tools and more about a particular outcome, which also spanned git usage (or git as a background vcs) beyond just software developers.
The big baddie here, IMO, isn't even git itself but GitHub. Many younger devs don't even seem to realise that they're not the same thing, that git can be used without GitHub.
> Devs were proficient in CVS, SVN, HG, BZR, GIT as a standard
That's completely made up on your end. I've been doing FOSS for 17 years, and the vast majority of people were barely competent with any of them beyond whatever their chosen ponyshow was (including off-brand ones like Darcs and Monotone), and switching from a person's preferred one to whatever another project used was often met with grunting and complaining, if it happened at all.
The reality is we only think this because we saw people do it, at great cost of their own time -- but that's the literal definition of survivorship bias. For every 1 person doing this, 50 just stuck with whatever they used and wouldn't bother. I've literally seen people refuse to contribute to a project over tabs vs spaces, and people still do this with git vs hg today, just not as much.
More people use Git and contribute to FOSS in a single day in 2022 than every developer who knew all these tools combined back in 2008 or whatever. Whether this is good or bad is up to you, but you don't need to make up claims about developers being epic journeymen in the past, mastering 50 version control tools to do a single day's work. They did not.
> Things really became one-sided after github started gamifying open-source contributions, and when a new generation who perhaps grew-up in a more competitive academic setting took it as an opportunity to make their resume more impressive.
Sorry, but I consider this to be a similarly made up claim that's just sour grapes. I've been using GitHub since 2008, I'm one of the first users. GitHub didn't even have "gamified" social features until the past 3-4 years IME (what, stars are about it?), and before that its product was pretty poor in some key areas like code review and project management, org permissions, on-prem control, etc. I didn't even use GitHub commercially until like, 2018, because most orgs had setups that were better in some key areas. But it was pretty easy to use and get started with, which mattered, and still matters, and soon enough nobody could compete with the same ease of use for free projects.
If I had to "blame" "someone" in this vein, a better place to start: global monetary and fiscal policy for the past 20 years resulting in software development becoming one of the only places with rising wages to meet cost of living demands in places like the US -- resulting in an influx of new blood to increase their wages and quality of life, combined with political choices like low interest rates, and huge explosions in demand for software devs for things like VC adventures, etc. Subsequent developments like bootcamps designed to churn devs out to match rising demand, etc which solidified platforms like GitHub further as it was easier for them to paper over these fundamentals when there was a clear winner (git/github) to focus on and ignore everything else. Forest vs trees and all.
> That's completely made up on your end. I've been doing FOSS for 17 years and the vast majority of people were barely competent with any of them at all beyond whatever their chosen ponyshow was
I doubt either of us has actual figures, so it's "my experience" vs "yours" (and full of anecdotal bias, which I'm willing to admit). At least what's factual is that a decade ago, the versioning and tooling ecosystem was much more diverse: where today github/-actions/-CI/-issues/… is a quasi-monopoly, you would bump into a new hosting solution, bug tracker, review tool, or CI build system every other week, and the major open-source projects were using either CVS, SVN, Git or HG. So, proportionally, more people had to be able to switch from one system to another, just out of practical considerations.
And I'm not even looking back pretending that things were perfect back then, I'm only suggesting that the monoculture which ensued killed a lot of competition, innovation and convenience, from which we could benefit today, on top of increased standardization (did you know for instance that hg can pull and push to git?)
> For every 1 person doing this 50 just stuck with whatever they used and wouldn't bother. I've literally seen people refuse to contribute to a project over tabs vs spaces, and people still do this with git vs hg today, just not as much today.
There's some truth to that, but it's a pretty extreme view. All projects can benefit from a lower entry bar and should be as welcoming as feasible, but optimizing for drive-by contributions at the expense of more meaningful involvement is destructive in the long run: anyone who has worked on larger projects knows that the bulk of the effort is carried by a smaller group of dedicated people over long periods of time, not by hundreds of over-the-fence typo fixes. And I'm totally siding with projects that don't want to adopt the github way of organizing their work where it would make long-haul contributors' lives more difficult.
> More people use Git and contribute to FOSS in a single day in 2022 than every developer who knew all these tools combined back in 2008 or whatever.
[Ref. Needed]
> > Things really became one-sided after github started gamifying open-source contributions, and when a new generation who perhaps grew-up in a more competitive academic setting took it as an opportunity to make their resume more impressive.
> Sorry, but I consider this to be a similarly made up claim that's just sour grapes.
Even if "gamified" is not the right term, you certainly can't be oblivious to how often people equate their github activity with their resume, and how it's used as a token of value by job hunters and recruiters alike. Putting aside the "merit" of most contributions, this really contributed to people demanding that projects move to github, to bring visibility to themselves (before the interests of the project itself), hence the centralization around the single largest platform. I don't know about you, but I find this state of affairs quite discomforting.
Because motivated, high performing people need to have control over their own destiny. Because cookie-cutter solutions which work for 90% of use cases are often worse than something explicitly tuned for you.
People having specific needs, getting frustrated, and then solving their problem is a feature of open-source code. It's not a bug. It is the engine of innovation and improvement. Forking means we can both get what we want, even if our needs are contradictory or we don't want to work together.
This happens with commercial offerings too - but it's a mess. You can't just fork the code without paying (or sometimes at all). And every fork is private, so work is duplicated and collective learning doesn't happen. Expensive consulting-ware might be the best-case outcome.
The ability of motivated people to fork projects and have their own spin on things is one of the biggest strengths of opensource. May the best forks win.
> Because motivated, high performing people need to have control over their own destiny.
This opinion seems to be peculiar to software and I think it has something to do with the fact that software is one of the few verticals where you can (attempt to) intellectualize everything.
Having control over your own destiny in any other skilled labor job seems to be 98% about finding a brand and model of tool that works the way you do, and 2% building your own tools or jigs for that specialty task nobody thought to build a tool for.
In software it's anywhere from 90:10 to 10:90 depending on how much emotional baggage your coworkers are carrying around.
Does it have to be so disrespectful, though? Demanding things, getting frustrated when denied and ghosting the original project to the point even they are surprised when a fork shows up?
There are always two sides to a story. We don't know what pressures the FB engineers were under and what problems they were trying to solve that resulted in them "demanding" changes. In the end I think everyone wins with the forking.
> Because motivated, high performing people need to have control over their own destiny.
I don't quite buy this: Those same "motivated, high performing people" don't seem to have anywhere near the same "need to have control over their own destiny" when it comes to the commercial closed-source tools they use.
I’ve seen people reinvent the wheel all over the place because their tools weren’t quite working for them. This instinct is the reason most good software has APIs - so you don’t have to ditch the tool entirely to customise it to your workflow. And most medium to large companies have all sorts of wacky customisations on top of existing software.
Eg perforce at Google. Well, everything at Google. And a friend at a big broadcaster has a bunch of company specific plugins for Reaper for their audio editing pipeline. And everyone insists on customising Jira.
IME there's at most one of those wheel-reinventors for each zillion devs unthinkingly swallowing whatever Microsoft (VS Code) is spurting down their throats this week.
Because that's how progress slowly happens? If I have an issue with something, and come up with a possible solution, and then ask why it's not actually done like that... Either other people tell me I'm wrong and explain why, or they agree and things get better.
> and there are still those of us who actually use Mercurial
I'm certainly in that camp; and it pains me every time I have to use the hggit extension to convert a mercurial repo to git in order to work with everyone else...
Yeah, we should give a shout-out to the octobus folks for making heptapod, a Gitlab fork bringing support for mercurial repos. That brings at least one mainstream hosting option for hg repos.
Absolutely. I guess most of us have a bunch of repos which are personal anyway, which are just as easily "hosted" as Mercurial repos. I.e. if I have ssh access to a machine, "hosting" a remote Mercurial repo takes nothing more than that.
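For anyone who hasn't tried it, a minimal sketch of that ssh-only hosting (user, host, and paths here are placeholders):

```shell
# On the server: any directory reachable over ssh can be the "remote"
ssh user@host 'hg init repos/myproject'

# Locally: clone, pull and push over plain ssh -- no daemon, no web service
hg clone ssh://user@host/repos/myproject
cd myproject
# ...work, commit...
hg push
```

Note that Mercurial's ssh URLs are relative to the remote user's home directory by default, so no extra server-side configuration is needed.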
I have been waiting ten years (https://www.google.com/url?q=https://stevebennett.me/2012/02...) for someone to develop a better CLI for git, someone with the scale and clout to do it well and gain mindshare. It's not that useful to learn a new workflow if no one you ever work with will be familiar with it.
This looks incredible. A simple command to uncommit or unamend makes you further realise what a disaster the Git CLI is.
My brain immediately jumped to "but you can just git reflog and then copy the state you want to revert to and then git reset --hard <commit>", but not only is that not simple or obvious, it isn't even correct, since a commit or amend operation can be performed with only some of the changes staged, and a hard reset will wipe out anything unstaged. Ah sigh.
Well in that situation you can stash unstaged, reset, then pop. But that just reinforces the OP’s point. Not the most ergonomic or discoverable path for something that should be simple to do.
git reset HEAD@{1} should do the trick to "uncommit" and keep uncommitted changes in the absence of conflicts. (I may be missing some edge cases.) It does however unstage changes.
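For the simple case (nothing unstaged at commit time), the soft-reset flavour of "uncommit" can be sketched in a throwaway repo; everything below runs in a scratch directory:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.email sketch@example.com
git config user.name sketch

echo one > file.txt
git add file.txt && git commit -qm "first"
echo two >> file.txt
git add file.txt && git commit -qm "second"

# "Uncommit": move the branch back one commit, keep the changes staged
git reset -q --soft HEAD~1
```

With `--soft` the changes stay staged; a plain (mixed) `git reset HEAD~1` would instead leave them unstaged, which is the caveat mentioned above.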
I don't blame git; I'm mostly surprised about the social inertia regarding improving it. But like programming languages, pressure accumulates until it's released. I'm sure Sapling and similar tools will make people want to try new things, and git will soon catch up.
The social inertia point is an interesting one. It seems like in some areas, like JavaScript frameworks, there is a ton of inertia for change; in other areas, like git, there is a ton of inertia for stasis. Why is that?
Writing the umpteenth Javascript framework might be good for your career. It isn't hard, in fact, it is so easy a multitude of javascript-only developers attempt it on a regular basis.
Occasionally one takes off. Really just a function of how many friends the author has, their stature in "the community" and/or their aptitude for creating cute marketable landing pages.
There is a multitude of people capable of jumping in with hot takes explaining why the new framework is superior/inferior; tweets, blog posts, courses, books and conferences abound.
In contrast re-inventing git is hard. Few people can wax poetic about the differences between alternatives. Even fewer can come up with a new one. The audience is far smaller not to mention skeptical. Less profit in it.
Rather like the difference between a new age cult/mega-church and the Catholic church.
I honestly don't know. For some reason, with JS frameworks, everyone seems to agree that improving ergonomics is a good thing. New methods are added, old ones are tweaked, constantly improving the developer experience.
But with Git, every conversation about improving the CLI has 50% of the participants claiming that the other 50% are just too stupid to use it, and that the CLI is brilliant.
Forget the CLI. VSCode implements a ton of git features as commands, which can be bound to any keybindings. For me this is way faster than the CLI, as I don't even need to leave my text editor. I have it set up chorded, so AltG followed by P, U, C, Y will push (actually sync), pull, commit, undo, respectively. Two keystrokes beats any CLI interaction I've seen.
Most people's objections to git ergonomics is not that the commands are slow to type. VS Code exposes the exact same operations on the underlying repository as the CLI, and it's those operations that critics say are hard to understand and cumbersome to use.
Because doing a commit with one keystroke is a good thing? I think it isn't - I like to consider things, rather than program like a hyperactive cockroach.
Taking time to consider things is good; but that doesn't mean you're better off with slow tools.
If developing has some loop like "consider -> implement -> evaluate", then the quicker you can implement it, the more budget you have for considering the problem.
I think the original poster was referring to staging the commit as their "thinking" step, not composing the message.
Personally, my workflow is to make many changes, then use `add -p` and `commit` to create a series of small commits. While staging, you might decide that you don't want to commit some bit of code and `restore -p` to toss it away.
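`add -p` itself is interactive, but the shape of that workflow can be sketched non-interactively with file-level staging (a coarser stand-in for hunk-level `-p`; the file names are made up):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.email sketch@example.com
git config user.name sketch
echo base > a.txt
echo base > b.txt
git add . && git commit -qm "base"

# One messy editing session that touched two unrelated concerns
echo feature >> a.txt
echo bugfix >> b.txt

# Stage and commit each concern separately (with -p you'd pick per hunk)
git add a.txt && git commit -qm "feature: extend a"
git add b.txt && git commit -qm "fix: patch b"
```

The result is a series of small, focused commits carved out of one large working-tree change.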
I think your workflow would work well if you see commits as "development checkpoints" rather than semantic patches. It's not an invalid workflow, just a different one.
I'm sure you could configure VSC to be analogously ergonomic for any git workflow. But people who are comfortable with git and their shell of choice tend to develop comfortable workflows in the terminal as well.
For what it's worth, you can do similar to "add -p" in VSC by using "Next Change" to scroll through a file in the diff view and adding hunks individually with "Stage Selected Ranges".
It's a little slower than "add -p" but serviceable. Having editable diff in the diff view is really nice though.
dude, vscode git interface is ... not really good.
I really continue to keep a copy of Eclipse around just to use the EGit client. That is a good UI imho.
Admittedly this might not help since it is not CLI, but spend some time this weekend with Emacs and magit. You don't have to use emacs for anything else, just the magit client. It will transform your git experience.
I did something similar for years with Sublime and Gitsavvy. But I couldn't stop Sublime updating itself from time to time, and it kept breaking Gitsavvy, and it got really tedious to try to fix it.
I got 90% of the way there with a lot of git aliases. The other 10% is constraining your workflow and reasoning about what state each of your commands leaves the git tree in.
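As a sketch of the alias approach: a hypothetical `uncommit` alias (the name is made up; Git has no built-in `uncommit`) wrapping the soft-reset dance:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.email sketch@example.com
git config user.name sketch

# Hypothetical alias approximating Sapling's `uncommit`
git config alias.uncommit 'reset --soft HEAD~1'

echo one > f.txt && git add f.txt && git commit -qm "first"
echo two >> f.txt && git add f.txt && git commit -qm "second"

git uncommit  # the second commit is gone; its changes remain staged
```

The "other 10%" is exactly as described: you still have to reason about what state each alias leaves the repository in.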
git revert doesn't undo a commit though— it creates a new commit that undoes it. That might be what you want under some circumstances, but most of the time that I want to revert it's a commit I just made and haven't pushed yet, so I just want to pretend it never existed.
I never want that though. If I wanted to do something like that, I'd do the git reset soft, followed by git stash, followed by git switch. I think this at least allows me to look back at it locally?
Git switch is also something I learned recently, so sometimes I type checkout out of force of habit, but I am trying to do better (even though I'm not sure what switch does that checkout can't; I don't want to get into arguments, just want to do things the prescribed way).
The thing about git is that it's basically plumbing all the way down. In this case: switch abstracts over checkout, which abstracts over read-tree & checkout-index. The nesting doll just keeps going.
Because switch is built on top of checkout, there's no functional difference between the output of the two subcommands when used for the same purpose (assuming the presence of a skilled operator). It's strictly a matter of ergonomics and abstraction.
At the end of the day, as long as you don't fuck up, just do whatever best keeps you in your flow state.
What's wrong with just continuing to work, and when the changes finally look like they should, doing a `git commit --amend`?
Or if the commit should for some strange reason really never exist, just move HEAD one commit back. You could even get the changes back by merging the "bad" commit back without committing the merge (using the `--no-commit` switch).
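Both moves can be sketched in a scratch repo; `--no-ff` is added here so the "bad" descendant commit is actually merged rather than fast-forwarded to:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.email sketch@example.com
git config user.name sketch

echo one > f.txt && git add f.txt && git commit -qm "initial"
echo two > f.txt && git add f.txt && git commit -qm "work in progress"

# Keep polishing; fold new edits into the same commit
echo three > f.txt && git add f.txt
git commit -q --amend -m "finished change"

# Now pretend the commit should never have existed: note its hash, drop it...
bad=$(git rev-parse HEAD)
git reset -q --hard HEAD~1

# ...and later recover its changes as an uncommitted merge
git merge --no-ff --no-commit "$bad" >/dev/null
```

Afterwards the history contains only the initial commit, while the dropped commit's changes sit staged in the index.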
Making committed history disappear isn't something I would call a "simple action". That's something that is almost never needed.
The common case it to amend mistakes. Git makes it very easy to accomplish that.
The other thing is: Git is conceptually very simple. There are almost no "esoteric" concepts. It's just a Merkle tree and some pointers to nodes on top of a very simple plain-text database.
My experience with people who have problems understanding Git is that most of the time those people never tried to understand how Git actually works. But everything (besides the concrete commands and switches, of course) becomes almost obvious once you know the inner workings.
The main problem with Git is its UX.
I don't know any of this stuff off the top of my head! I have to look up the concrete commands or switches every time. But from a conceptual point of view Git is very easy to use, because the underlying concepts are indeed so simple and straightforward.
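The "Merkle tree plus pointers" view is easy to see with Git's own plumbing; in a scratch repo, a commit is just text pointing at a tree, and the tree points at blobs:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.email sketch@example.com
git config user.name sketch
echo hello > f.txt && git add f.txt && git commit -qm "initial"

# A commit object is plain text: a tree hash, parent hashes, author, message
git cat-file -p HEAD

# The tree lists (mode, type, hash, name) entries: the Merkle structure
git cat-file -p 'HEAD^{tree}'

# A branch is just a pointer: a name resolving to a commit hash
git rev-parse HEAD
```

Because each object is addressed by the hash of its contents, changing any blob changes its tree's hash, and therefore every commit hash above it.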
I have a CS background and I don't even know what a Merkle tree is without looking it up, and I'm sure after looking it up I'd have to do more digging/research before it gave me a clear mental model of how git works. I'm pretty comfortable in git at this point - I know how to navigate the space of normal-ish states - but that came after years of exposure.
For a person who's learning to code, who's expected to jump straight into GitHub as a part of their very first real project, the situation is kafkaesque.
I would not mix up the UX and the underlying concepts.
The concepts are very simple. The UX on the other hand is at least "sub-optimal".
> I have a CS background and I don't even know what a Merkle tree is without looking it up, and I'm sure after looking it up I'd have to do more digging/research before it gave me a clear mental model of how git works. I'm pretty comfortable in git at this point - I know how to navigate the space of normal-ish states - but that came after years of exposure.
That's exactly the point that I've tried to explain: people make their lives substantially more difficult because they never learn the basics. This way Git stays guesswork till the end of days.
The theory behind Git may seem off-putting when words like "Merkle tree" come up. I understand that.
But actually it's something that I could explain to a 12 year old in 10 minutes…
Instead of looking things up, people choose to struggle for years and years, without ever having any understanding of the "magic" that happens behind the scenes. But without the theoretical knowledge Git is not really intuitive, that's true. Coming up with a "plan" for how to accomplish something then becomes a matter of black arts. But it really isn't! Git is very straightforward. Really. Just take the time to look up how it actually works. Everything (besides the weird APIs) will then start making sense.
I don't think this is true. GitHub (the website) became very successful with virtually no feature overlap with the git CLI. I suppose you could say that GitHub pull requests and forks are a replacement for git's built-in email-based workflow, but that's not really what people complain about when they say git is hard to use (and it's not like the other DVCSes had obviously better solutions).
I always chuckle when things like “sensible”, “user-friendly” or “sane” get thrown around as if they are anything more than that person’s opinion.
When are developers going to learn that they actually have to learn and familiarize themselves with preexisting systems, instead of endlessly reinventing them, and that there is no such thing as the perfect system?
Nothing is perfect, but some things are clearly better than others - hence why we’re nearly-all using git rather than "learning and familiarising ourselves" with CVS :)
Fair point, but it is not clear to me that Sapling is better than Git. I haven’t seen anything in its docs or in the comments here where I haven’t thought of a way to do it with Git.
I wonder what the cost benefit analysis would be: how many developers hours have been expended building Sapling so far, vs a list of things it can do that Git cannot.
And I guess it bears repeating that I’m interested in possibility and not aesthetics.
> where I haven’t thought of a way to do it with Git.
With git one can do "everything", true. That doesn't mean I want to teach everybody first what a DAG is and how one can manipulate it using git commands; I want to have something simpler. And yeah, git became a lot better over the years, but the plumbing still shines through in many places with inconsistencies.
Understanding a DAG becomes fundamental very quickly when trying to understand how distributed version control works. I would argue the opposite, that the internals are actually very reasonable and generic but it's the porcelain that has made some poor UX choices.
Given that they’re both open source and can be customised using turing-complete languages, the only limit to possibilities is how much effort you’re willing to put into customising them. Even CVS has the possibility of working as a distributed system if you wrap it up in enough layers of hacky shell scripts ;)
Having done quite a bit with both mercurial (which the sapling CLI is based on) and git, I find that the mercurial approach has a lot more sane defaults, is generally less effort for the same results, and has a lot of quality-of-life improvements — like git has the possibility of doing interactive commits with `git add -i`, but sapling's interactive-commit interface actually makes it usable.
The last thing I want to do is learn a new source control tool. For this to occur, it truly needs to be 10x better and not because someone isn't motivated enough to read the manual.
With how well git has worked for me, I suspect I'll use it for the rest of my career.
Not a big fan of FB as a company, but I think their open source work is pretty impressive. Various other large companies have the problem of giant monorepos that they constantly need to onboard new developers to, but I can't think of anyone other than FB who consistently released their solutions.
Sure, most people are probably fine with Git once they've learned it and if they only work with small to mid-sized code bases (like me). But I'm still happy Sapling is out there; I might use it or learn from it if I ever run into the problems it solves.
Facebook has a lot of interesting open source projects, but they tend to abandon them. As far as OSS goes, I think Google is the best. As long as you don't mind dealing with 3 different custom build systems within the same codebase, their projects usually have dedicated teams maintaining them.
...and yes, I realize it's weird to say this considering Google is known for abandoning things. Maybe it's just coincidence that I've run into more abandonware from FB than Google?
Better than Microsoft? I don't think I've ever been able to talk to a human at Google, whereas with Microsoft, I get feedback very quickly on issues and pull requests. Does Google even interact with people with open source? For example, I am using Skia via SkiaSharp, and the only place I know of to go for Skia help and issues is their Google Groups page, a website out of the 2000s. And very few seem to actually monitor the group. I'm not even really sure what Google does in open source. Even the things that are released to the public, like Skia, are well known to come with a huge amount of internal baggage.
Whereas Microsoft has dozens of active projects on GitHub where you can talk directly with the people working on it at Microsoft.
> I'm not even really sure what Google does in open source.
Kubernetes (and a bunch of the offshoots), golang, a bunch of ML things, etc. It's just that many have independent foundations (CNCF) running them now to keep project management independent from a single company.
As far as I can tell, most of zstd's development is still by Facebook employees, though not all of it. I tend to think zstd has enough traction that development would continue even if FB were to abandon the project.
This is probably a naive take, but I think of compression software as something that can be “done”. Unlikely to be a lot of code churn required for such a project to be relevant for a very long time.
How do you not abandon a project? I would love to know.
This is my naive understanding: a for-profit company open sources a project that they have been using and developing internally. They have built a philosophy and understanding of the project as they use and develop it. Most of their engagement with the project comes from the fact that they must use it and usually don't have any other options.
Because the foundation is already laid, the solution for its shortcomings is simply understanding them. Then you open source the project knowing that you have developed it to its completion.
Now comes the OSS community. Either we request features that go against the project philosophy, or we don't get involved at all, because we have options and don't need to compromise or acknowledge the shortcomings.
A good solution can be open sourcing projects that the org thinks aren't complete and need further development, without compromising security, philosophy and usability. Because if you have a list of things you need, you can ask the OSS community to fix those things rather than be critical of the foundation and philosophy.
Idea for projects not owned by mega-corps (half real, half fantasy):
1. Get the project added as a package to one or more major commercial Linux distros, e.g., RedHat, etc.
2. Grant commit access to one or more devs at the same Linux vendor. Allow them to do whatever they want. You might not like their direction, but it should survive.
3. Retire from the project whenever you like.
4. Also, you could post a note in README about retiring. If people want to add features, ask them to fork, or just grant them commit access and let them go wild.
Good points. Personally, a lot of the appeal of open source is not so much about having free stuff, it's more about being able to learn from others. I learned a ton from the various engines Id Software released back in the day, but have no expectation that they maintain them.
If it comes to stuff I actually want to _use_, I avoid projects backed by a single or a few companies - like Sapling. So from that angle, I'm not particularly impressed either.
React Native, named after their open source project that is incredibly successful, widely used, completely reshaped the front end dev world, etc... Not sure if an offshoot of a world class project not gaining traction is a mark against them.
In a way it has. Just not directly. Everyone is abandoning (for example on iOS) imperative style (Cocoa and UIKit) for React style (RxSwift, ComponentKit, SwiftUI). It's the biggest shift in native UI programming in more than a decade.
There recently was a thread here on the impact of layoffs on React. And apparently even Facebook has abandoned React for anything but internal projects.
Tbf I think this was just the learning curve for big tech companies. Google and Facebook face scalability problems 5 years before anyone else, so if they don't open source the solutions they come up with, the industry standard that establishes itself 5 years later will not be compatible with their solution, and they will have a hard time recruiting and keeping personnel for their own system.
In the argument of monorepo vs not, the usual argument goes like this:
- It's too hard to scale for a large monorepo!
- Google does it just fine!
- But I don't have access to Google's tools!
So kudos to Meta for both solving the problem and making it available to others. It will be interesting to see how usable it is outside of Meta. I know for example that while Netflix open sourced a lot of tools, most of them weren't usable unless you ran all of them together. So far Meta has been good at avoiding that, so hopefully that remains the case.
Microsoft uses monorepos, and you have access to Microsoft's tools! They chose to work with Git and amend it when necessary, just like Meta did with Mercurial.
A few years ago they had a Virtual File System extension for Git. Now it's a public fork of Git that is intended for large repositories (several hundreds of GB). It adds a `git scalar` command, see https://github.com/microsoft/git/blob/HEAD/contrib/scalar/do...
Mostly people who have this argument don't have a code base large enough to run into actual limitations of git. They run CI with wonky Java implementations of git and/or have giant amounts of binaries in their repos. Actually having gigabytes of source code is pretty rare.
> and/or have giant amounts of binaries in their repos.
Firstly, it's not "giant amounts of binaries", it's "a very small amount of binaries". A few GB is enough to cause significant problems.
Secondly, this _is_ an issue with git. If my project requires binary files, git should handle it. How should we handle logos in a mobile app, branding images on a website, audio files for background? That's before you get to the question of "how does a video game store the source version of a 100GB worth of compressed assets?"
Git LFS is a reasonably good example of git being half baked. It is provider dependent, and requires configuring separately on both the server and client. It turns git into a centralised VCS, removing the option of working offline in the process.
A bit like submodules, LFS has its own warts that seem to multiply when you add more people to the mix. Working with git LFS has been the _only_ time the solution to my problem has been "nuke and clone again", in almost 15 years using source control.
Last time I used git LFS, it didn't support ssh cloning at all, and the issue had been open for years at that time.
It depends on the size of the company. The Linux kernel has about 1500 active developers. This is a lot, but plenty of companies reach this size too.
I think another thing that matters is how you store branches/code under review. In Linux, each team/person has their own repo. The main "Linus" repo has mostly the finished code. In a company it is much more common for everyone to store their unfinished code centrally. Perhaps this also accounts for some increase in size.
I use the staging area to allow me to more easily break larger changes into smaller commits. I am usually all over the place while writing/refactoring code and making commits as I go along doesn't work well.
How does sapling let me take a long list of commits and break them into larger but more manageable chunks?
git add -p allows me to add chunks easily and create commits, git commit --fixup allows me to mark a commit as fixing a previous commit, and with git rebase -i --autosquash I get to easily take those fixup commits and meld them into the previous commits.
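For reference, that fixup/autosquash loop can be run completely non-interactively by using `:` (a no-op) as the sequence editor, which accepts the autosquash-reordered todo list as-is:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.email sketch@example.com
git config user.name sketch

echo base > base.txt && git add . && git commit -qm "base"
echo one > f.txt && git add f.txt && git commit -qm "add f"
target=$(git rev-parse HEAD)
echo two > g.txt && git add g.txt && git commit -qm "add g"

# Review feedback on "add f": record the fix as a fixup! commit
echo one-fixed > f.txt && git add f.txt
git commit -q --fixup="$target"

# Meld the fixup into its target; ":" accepts the reordered todo list as-is
GIT_SEQUENCE_EDITOR=: git rebase -q -i --autosquash "$target~1"
```

After the rebase, the fixup commit is gone and its change lives inside the original "add f" commit.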
Also reviewing a stack of patches is annoying in many cases as I care more about the end result vs each individual commit. But that may just be my experience talking in open source where I am working on smaller but better well defined projects vs a large mono-repo where there may be a lot of changes across many disparate parts of the code base that make it difficult to look at the "whole" vs a patch that is more localized.
Instead of `git add -p`, you would use `sl commit -i` (whose interface I much prefer). To amend into a previous commit, I prefer to switch to it and then just use `sl amend` (+ `sl restack` if necessary), but you can also use `sl fold` IIRC. Instead of `git rebase -i`, you can use `sl histedit` (not a direct replacement for autosquashing, but worth mentioning).
To split a single commit, you can use `sl split`, which is quite difficult in Git. (I miss that feature in Git quite a lot.) You can also use the `sl absorb` command to automagically merge local changes into the previous patches where they seem to belong (roughly speaking, commute changes backwards until they would cause a merge conflict, but it's a little smarter about avoiding certain merge conflicts).
If I switch to a previous commit to amend it, then I would temporarily lose all the other changes I made, which means I can't easily run tests on that particular commit to validate nothing else broke.
It sounds like I would need to:
- switch
- amend the commit
- restack?
- switch back to the HEAD?
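Roughly, yes. As a sketch of that sequence (command names from Sapling's docs; the commit identifiers are placeholders, and `restack` is only needed if descendant commits exist):

```shell
sl goto <commit>      # check out the commit that needs fixing
# ...edit files...
sl amend              # fold the working-copy edits into that commit
sl restack            # put any descendant commits back on top
sl goto <stack-tip>   # return to where you were
```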
`sl histedit` is very similar to `git rebase -i`, if you're familiar with that interface, which works just fine.
Other commands useful for amending changes to previous commits:
* `sl goto --merge <hash>` - if there are no conflicts, you can switch commits with pending changes and those pending changes will be applied on top of the other commit. If there are conflicts this command will fail: https://sapling-scm.com/docs/commands/goto
`sl absorb` can turn that workflow into a single command in many cases: it automatically looks at what hunks you've changed and tries to propagate them back to earlier commits in the stack that touched those same hunks. It's not perfect, but in my experience using this at Meta, it does what you want 90% of the time.
https://sapling-scm.com/docs/introduction/differences-git#sa... says:
"If you want to commit/amend just part of your changes you can use commit/amend -i to interactively choose which changes to commit/amend. Alternatively, you can simulate a staging area by making a temporary commit and amending to it as if it was the staging area, then use fold to collapse it into the real commit."
To use a similar featureset but in the same Git repository you normally use, you can try my https://github.com/arxanas/git-branchless. Then, you can use your usual staging workflows if desired, or use regular Git commands directly.
Its design is inspired by Sapling, and, in fact, it uses some of the same code, such as the segmented changelog implementation. Possibly some of its ideas made their way back to Meta, such as interactive undo?
Jujutsu also supports colocated Git repositories: https://github.com/martinvonz/jj. It also has the working-copy-as-a-commit idea and conflicts are stored in commits (so rebases always succeed). I think it's a step forward compared to git/hg/sl.
> To use a similar featureset but in the same Git repository you normally use
What do you mean? Can't you use Sapling in the same Git repository you normally use? The first sentence is "Sapling is a new Git-compatible source control client". Is there something they're not telling us?
EDIT: Looks like it calls out to the git executable occasionally (https://news.ycombinator.com/item?id=33615576) and presumably works on the git object model under the hood, but you can't use `git` on a repo checked out using `sl` nor vice versa. It's a stretch to call it Git-compatible but I guess not completely wrong.
What I mean is that you can't co-locate your Sapling repository with your Git repository (at least, for now). You have to have them in separate directories and push/pull between them.
git-branchless is only an extension to Git, so it naturally operates in the Git repository. Jujutsu has a mode to create the `.jj` directory alongside the `.git` directory and co-locate them, which I find very convenient in practice. (Originally, Jujutsu only supported Git compatibility in the same way as Sapling, via pushes and pulls, but they added co-location later.)
> Looks like it calls out to the git executable occasionally
I believe Jujutsu never calls out to Git, and that all of its `jj git` interop commands are implemented via direct bindings to libgit2. This is less fragile in many ways, but it can also mean that `jj git` interop might be missing some new feature from Git. Fortunately, you can oftentimes just run the Git command directly in the repository when co-locating.
> presumably works on the git object model under the hood
There's no guarantee of this: the Mercurial (and therefore possibly Sapling?) revlog model is a little different from the Git object model, as I understand it. But it doesn't really matter, as long as it interoperates seamlessly. For now, I believe they do literally have a `.git` directory somewhere under the `.sl` directory, but they reserve the right to change that.
> > Looks like it calls out to the git executable occasionally
>
> I believe Jujutsu never calls out to Git
Oh, I was referring to Sapling. I know even less about Jujutsu than I do about Sapling!
> > presumably works on the git object model under the hood
>
> There's no guarantee of this ... but they reserve the right to change that
Interesting. So it would translate between them whenever you push to or pull from a Git repo?
I'm very keen to use Sapling if it's basically a polished interface to Git but less keen if it's an entirely different object model, because then I'm going to have to learn more about what's going on under the hood to understand it properly.
> Oh, I was referring to Sapling. I know even less about Jujutsu than I do about Sapling!
I was just remarking about Jujutsu, in the case that it was important to you for some reason whether or not your VCS called out to Git.
> Interesting. So it would translate between them whenever you push to or pull from a Git repo?
To be honest, I don't know. I suspect that, for now, they store real Git objects, rather than translating on the fly. You'd have to ask a Sapling maintainer.
> I'm very keen to use Sapling if it's basically a polished interface to Git but less keen if it's an entirely different object model, because then I'm going to have to learn more about what's going on under the hood to understand it properly.
I might have muddled some layers of abstraction and brought up something unhelpful. Git's object database and Mercurial's revlog are more comparable in terms of where they lie in the abstraction hierarchy, but these are just the storage layers. In practice, I find the Git and Mercurial object models, as exposed to the user, to be similar enough that I pretty much never have to worry about the differences. (Well, perhaps it's true that Mercurial file contents are not addressed by blob hashes, but do I ever really want to address by "blob hash", or just by "the contents of this file at this commit"?)
What I meant to emphasize is that you can't directly use Git to access Mercurial/Sapling's internal object store, if that's important to you (perhaps for scripting). In comparison, with Jujutsu, if you modify the Git object store on disk, it will try to "import" refs the next time you invoke it in order to update its own internal object store to match.
I think those two things that arxanas mentioned are the biggest differences (i.e. working-copy-as-commit and conflicts in commits). I haven't used Sapling, but I suspect Jujutsu has better support for moving commits (and parts of commits) around without touching the working copy.
Sapling, on the other hand, has much better support very large repositories, since they've spent a lot of time on that over the years. We're going to copy some of Sapling's solutions to Jujutsu soon, since we're working on integrating it with Google's monorepo (slides: https://docs.google.com/presentation/d/1F8j9_UOOSGUN9MvHxPZX..., recording: https://youtu.be/bx_LGilOuE4).
Yep — after trying out Sapling, it aborts many operations due to working copy changes, which I now find to be jarring interruptions to my workflow. Jujutsu (and even git-branchless) have much better support for juggling working copy changes, which is invaluable in a patch-stack workflow.
The Sapling support for remote repositories was a little rough in my opinion. Jujutsu and git-branchless can both co-locate with the Git repository, so you can always drop down to Git commands if there's something you're having trouble doing. (I find the `jj git` commands to also be better at interacting with remotes for now.)
Oh, in the context of Sapling, I should say that Jujutsu runs the equivalent of `sl restack` after every command. Thanks to first-class conflicts, that always works.
No, the CLI prevents that because the remote is unlikely to know how to interpret conflicts. In a future where the remote understands conflicts, then you'll be able to push conflicts to a remote and collaborate on the conflict resolution.
I'm sort of amazed that git and mercurial haven't built something like that yet. Makes me a little sad that Facebook created a new scm instead of expanding mercurial to include features like this.
I’ve been using FB’s mercurial fork for years, wishing for all that time that I could have the joy of the fb-hg CLI while remaining compatible with github because that’s where 99+% of the code lives - from my brief experimentation, sapling appears to be that. I look forward to never using the git CLI again :D
I actually just created my first PR from a sapling repo now - not sure why it’s not documented, but you can push your local development branch to a remote server, and in the case of github, you even get the “it looks like you’ve just pushed a local branch, would you like to turn this branch into a PR?” prompt, and it appears indistinguishable from a branch created with the Git CLI.
The message generated with the hyperlink for creating a PR is actually done server-side, you can customize any git server to do this, more or less. So yes, it should behave approximately the same assuming you can just do the basic push interactions.
Reading the documentation quickly, it looks like you can generate a standard pull request, it's just that the PR may not conform to expectations if there are multiple commits.
Of course, when it comes to github PRs, there are so many different "styles" of pull request, I'm not even sure which one should be considered "standard".
It's interesting how these threads about Git simultaneously have
(a) People arguing git is fine, and shouldn't be simplified
(b) People arguing about the right way to use git, and flame wars about best git workflows
I mean, most people simply see (b) and conclude "this is a huge hassle, I don't want to annoy some git-workflow purist; I'm just going to walk on eggshells with this tool and hope I don't break anything".
It's as much a social problem around conventions, and the lack of opinions in the tool itself, as anything about the underlying technology (which is rock solid IMO).
I would argue that camp (a) is more diehard than your description suggests. Some people will argue to their death how perfect a tool git is and how it would be impossible to operate without a tool that exposes so much low-level power.
I use git begrudgingly because that’s where the world is, but I long for an improvement in this space.
I remember when git was new, and the subversion crowd didn't jump on. Same discussion, different subjects. Git is/was awesome... but there is certainly room for improvement.
As I recall, the Big Deal with CVS-to-SVN migration was that many teams learned to creatively exploit the fact that CVS does versioning on a file-by-file basis. As late as 2008, I had to work in an environment where checking out different versions for different parts of the company's CVS monorepo was required to get anything to build.
I dunno, there's a lot of value in having a tool that's flexible and lets you work the way you want—even if that fundamentally means there is no one blessed way to do things.
Git has a lot of incidental complexity and unforced design problems, but the fact that it's inherently flexible is not one.
(c) Git is a disaster that has destroyed source control for 15+ years and the industry is unable to recover from due to Stockholm syndrome
It’s impossible to discuss source control without people coming out of the woodwork to shit on git. It doesn’t work right. It’s too hard. SVN was better. Mercurial should have won. Blah blah blah.
Git isn’t perfect. And the command line has improved (I don’t care much, I use a GUI).
But the number of people who seem to insist that because it doesn’t work for them or they don’t personally like it it’s horrible and everyone should abandon it is crazy.
And it makes trying to read/participate in discussions like this painful.
I agree. I think git is fine, but also would like improvements. I started with CVS and VSS (well really started with copying files to dir_bak heh), then SVN and git. Yelling across the office to tell someone to checkin a file so I could make an edit seems so quaint. VSS had a fun bug that if you ran out of disk space, it would destroy the entire repository - yup.
Is git sometimes obtuse? Sure, but it's fast and incredibly powerful. My everyday commands are easy to use, and if I need something special I go to the documentation - just like any other SCM.
Looks like you are using clap v2? Feel free to ping me for help on moving to v3 then v4. I expect the rate of breaking changes to slow down and to be smaller in scope (from the user's perspective), so now is a good time. I'd love to hear how we can make clap better fit cases like this and how we can help in improving the UX of applications.
We're somewhat limited by what version of clap is available in our internal monorepo. We try to keep things reasonably up-to-date though, so I'm sure we'll upgrade at some point.
I believe we actually only use clap for some side binaries, not for the main sl executable. We have a custom parser for that (https://github.com/facebook/sapling/tree/main/eden/scm/lib/c...), to match the preexisting hg parse behavior. Unfortunately I'm not familiar enough with clap or why we didn't go with clap in the first place to say what we would need to use clap for the main binary.
Is it possible to use git commands on an sl checked out repository, or vice versa? Or at least get something close enough to a git repository that I could run git commands on it, so I can fake it for internal tooling?
Unfortunately you can't run git commands directly right now, since there is no .git directory at the root of the repo. Under the hood there is a .git directory hidden away somewhere under the .sl directory, but we consider that an implementation detail and are likely to change how we store the actual git data in the future. So we don't support people running git commands in there.
The GitHub repo says that Mononoke and EdenFS are "not yet supported publicly". The code seems to be all in the open source repository though, so what does "not supported" mean here?
EdenFS builds (and probably runs?) from GitHub, but we have done no work to make it usable and hook it up to an existing checkout. It may not be much effort, and we're hoping to demonstrate that workflow in the future.
The code is available to see, but they don't necessarily build in an external environment yet and even if they did we aren't ready to support them being used externally. Hopefully we can support them one day, but for now we're just starting with the client.
Sorry! 'one day' is the best I can do for now. We'd love to do it sooner, just gotta find the time.
We think, and many of our internal users agree, that the UX alone is a worthwhile upgrade. Since the majority of Git repos don't actually need the performance of a virtual filesystem, the UX is the main sell for them anyway. At the very least, maybe it will inspire some UX improvements in Git.
I don't doubt the UX is better, but internal users are a captive audience. I imagine most developers will not think twice about what vcs they are using unless their organization makes the change.
Thank you for answering questions. I was poking around Sapling and got thrown off track pretty quickly. Just wanted to init an empty repo but on an Intel MBP I just get an error:
`abort: please use 'sl init --git .' for a better experience`
What's going on here? I couldn't find info in the `sl init --help --verbose` output or in the Sapling website.
Because we haven't released the server yet, the open source client really only supports git right now, so `sl init --git` is the only way to init a local repo for now. Perhaps we could make that message clearer.
I'll take a look at the help later to see what we're missing here.
Cool, thanks for the reply. It definitely threw me off, and if it only supports git as a backend for the open source user, it might make sense to default that way for now.
Hopefully a dumb question that I missed an answer for in the docs. If I have an existing git repo (not hosted on github), is there a way to try using sapling with it? Or do I need to clone it from scratch?
My impression from the blog post is that I can use sapling and have everything "look" git-like from the remote repo's point of view.
You should be able to `sl clone ...` your git repo into a Sapling clone and use it that way, even if the repository is not on Github. I haven't personally tried whether cloning like `sl clone /path/to/some/repo` works, but it should, since we use the actual git binary under the hood for clone, pull, and push.
You could personally clone a git repo with Sapling and no one would know the difference. When using Sapling with a git repo, clone, push, and pull all use the actual git binary under the hood, so the server just sees git speaking to it.
Just to clarify, I meant that I've already cloned it via git. Can I just start using sapling with it or do I need to delete the local repo and re-clone it with sapling?
You cannot just start using sl in a git repo. You need to make a Sapling clone of it.
But you can make a Sapling clone of your local git repository, so you don't have to clone from the server again and you would get all your local work from your git repo. That might be the easiest way to try Sapling, so you don't have to delete your git checkout at all.
> But you can make a Sapling clone of your local git repository
Whoa I didn't even know you could do that in git. I always considered clone to mean "download stuff from this location" but now it makes more sense. Thanks I'll give that a try.
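For anyone else who hadn't seen it, here's a quick sketch of the git-side trick being described: cloning from a local path instead of a server (all paths and names below are made up):

```shell
set -e
work=$(mktemp -d)

# Create a plain local repository to stand in for "your existing checkout".
src="$work/original"
mkdir "$src" && cd "$src"
git init -q
git config user.email demo@example.com
git config user.name Demo
echo hello > readme.md
git add readme.md
git commit -qm initial

# "Download stuff from this location" — where the location is just a directory.
git clone -q "$src" "$work/copy"
cd "$work/copy"
git log --oneline
```

The same idea is what makes `sl clone /path/to/git/checkout` plausible: the "remote" is just another repository on disk.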
Sapling originated from the Mercurial open source project, which was largely Python (at the time). To make things faster and more maintainable, we started rewriting portions of it in Rust, and going through a binding layer to interact with Python. Critical pieces like the storage layer, parts of the wire protocol, and various others are all in Rust at this point, while a lot of the high level business logic remains in Python. We'll continue to shift more to Rust over time though, especially since pure-Rust Sapling commands feel way more snappy and pleasant to use.
It's difficult to find out information about the status of the oxidation project with Mercurial. I just noticed the 6.3 release earlier today and looked more into the rust support. I didn't get the feeling that most of Mercurial has been rewritten in Rust yet. The `hg help rust` information only lists a handful of features as gaining improvements from Rust, albeit they are likely core/essential components, as well as mentioning some/all of the work is experimental. The documentation here seems very much directed towards developers still.
Good memory! Internally Eden eventually became synonymous with our virtual filesystem, so we decided it was better to choose a new name to avoid that confusion.
Since the sapling client is a fork from Mercurial does that mean it can also be used with Mercurial repositories in addition to Git repositories or is that not supported?
In order to work with Git repositories is this essentially the Mercurial client using hg-git on a converted repo under the hood?
Unfortunately you can't use Sapling with Mercurial repositories. There are too many differences at this point.
This does not use hg-git under the hood. Sapling's internal structure differs from Mercurial in substantial ways, and we've built some cleaner layering that allowed us to shim Git in under our storage layer. This also means that we read and write directly to the git repo, instead of duplicating and importing all the data like hg-git did. This has some nice benefits, like the hashes you see in the output are actually Git hashes.
https://sapling-scm.com/docs/internals/internal-difference-h... briefly explains the Git support. Currently we keep trees and blobs in a git bare repo unchanged, but convert the commit graph to our format so we can run our own graph algorithms. In the future we might store trees and blobs differently too.
Does sapling support hooks like pre and post commit? My workflow leans on pre-commit (the framework) heavily and it would be hard to give that up. I’d still be keen to take this for a drive though, nice work!
Technically the mercurial pre and post hooks are mostly still there, but I'm not certain we want to support them long term. The existing hook design has some problems.
I'd be curious about your use case, since we don't actually use hooks internally all that much.
usually to run linters and validators, speeding up the feedback loop (otherwise it's annoying to push changes to a PR and then get a CI failure minutes later for trivial linting issue)
Yea, that's on our list to fix. The homebrew packaging was the last package to be done, and we were busy tidying up other things in the lead up to launch.
can you speak to commit throughput of the sapling server? While there's tooling to make git scale better (like sparse checkouts) scaling commit throughput for automation is a pain.
Unfortunately we can't really talk too much about that at this point. I can say a lot of effort has gone in on the server side to optimize commit throughput though.
One example we mention in the blog post is that when you push, it doesn't actually need to be a fast-forward push (using Git terminology) to succeed. Our server can rebase the commit on top of the destination bookmark for you (with some limitations, like not merging file contents). This allows many people to push and not have to race to rebase. Then we have substantial optimizations around the critical section of final-rebase-then-move-branch-forward, which yields pretty good throughput.
As a Meta employee for almost 4 years what I will say is I was skeptical at first coming from git, but the sapling system works very well in practice in my experience. I still use git for everything outside of work, but I may consider sapling now.
I'm ex-Meta and now at Google and while they have 'hg' as a wrapper around their fig system, it's lacking so many of the Sapling features I am sad and frustrated occasionally.
I agree it's not necessary, but I like having it because it lets me separate what's going to be added before I actually commit.
I still commit small and frequently. But I like `git add -p` to skip debug lines, hardcoded conditions, etc. I don't want to mistakenly auto-commit a whole pile of lines and then have to remove debugs/hacks/etc from things I've committed.
Stage + Unstaged is my working area, and the two live together quite nicely for me personally. I could live without it, definitely... but I'm not sure I'd want to.
> Stage + Unstaged is my working area, and the two live together quite nicely for me personally. I could live without it, definitely... but I'm not sure I'd want to.
You can just use the tip as your staging. Use interactive amending to move changes from the working copy to the commit, and when you want to "commit", finish up the message.
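A quick sketch of this "tip as staging" pattern using plain Git commands (repo, file names, and messages are made up for illustration):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name Demo

echo start > notes.txt
git add notes.txt
git commit -qm "WIP: notes"             # the tip commit acts as the staging area

echo more >> notes.txt
git add notes.txt
git commit -q --amend --no-edit         # fold new work into the tip

git commit -q --amend -m "notes: done"  # "committing" = finishing up the message
git log --oneline                       # still a single, polished commit
```

Unlike the index, this draft is a real commit, so all the usual tools (diff, reflog, rebase) work on it.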
hg actually has an "unamend" command (part of the "uncommit" standard extension) which... reverts the last amend. Rather than having to remember how to contort reset into the right shape to move changes back out of staging without destroying everything.
`git reset HEAD~` doesn't feel like that much of a contortion to me. It's the destructive change that requires more contortion (`--hard`) which feels fair. Maybe this is stockholm syndrome though.
The way I think of it, there's basically three copies of the file in play: in HEAD, in staging area, and on disk. I cannot trust my memory to remember which variant of "git reset" copies the file in HEAD to the staging area, which variant copies staging area to disk, and which variant copies HEAD to disk (in all cases, the third copy remains uninvolved). Getting it wrong potentially creates unrecoverable data loss. And, unfortunately, this is one of those cases where reading git's documentation is less than helpful.
Combine this with the case where "I want to break one commit into two commits," where now I have to worry about making sure I know if the command is going to change the revision HEAD points to. At least there, the old commit will still exist as backup in the invariable scenario I screw something up.
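For reference, a sketch of how the three `git reset` flavors move those three copies around, in a throwaway repo (file names made up):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name Demo
echo a > f.txt; git add f.txt; git commit -qm one
echo b >> f.txt; git add f.txt; git commit -qm two

git reset -q --soft HEAD~        # moves HEAD only; the change stays staged
git diff --cached | grep -q '^+b'

git reset -q                     # --mixed (the default): copies HEAD into the
git diff | grep -q '^+b'         # staging area; the file on disk is untouched

git checkout -- f.txt            # copies the staging area back to disk —
                                 # the piece --hard would do destructively
```

Only the last step (and `--hard`) can actually lose work; `--soft` and `--mixed` never touch the on-disk copy.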
In those cases, I find it best to either 1) use the interactive commit tool to not commit debug junk, or 2) put the debug junk in its own commit, which I'll later discard (and, plus, that means you can't accidentally include it in a real commit).
I’ve never used one of the source control systems at these big companies but I use the staging area along with your git-branchless just fine for now. I’m not sure if it’s any less efficient this way.
It's not a big deal either way, but the staging area interacts worse with some operations. For example, if you have staged changes and then get a conflict with `git checkout --merge`, AFAIK there's no way to undo to before the conflict in Git. When using commits, all of the standard merge and undo tactics apply.
I guess each to their own. I want to stage my commit with regular commands, and then have the staging area work with (diff, add/remove etc).
I don't care for an interactive tool; I prefer using commands that are repeatable and learnable instead of stepping through some interactive workflow all the time.
You just `sl commit` that half-done work anyway, and then iterate by running `sl amend` many times until your commit is finished. If you want to amend just part of your changes, use `sl amend -i`.
It stems from the original Mercurial implementation. The goal here is that every operation leaves the repository in a good state that can be pushed/pulled. That's why Mercurial and Sapling rely on commit/amend/uncommit, etc and for example usually discourage the use of interactive rebasing in favor of restack and other operations that add another "state". It facilitates the mental model for developers without actually removing workflows (they are just different).
A stack of commits seems to be similar to what one would call a patch queue if one is using Git.
The fact that they have concepts like unamend suggests that they have thought about this in a much more turtles-all-the-way-down way than the Git designers. Versioning for your history changes: why, of course.
> The fact that they have concepts like unamend suggests that they have thought about this in a way more turtles all the way down way than the Git designers.
You can thank the Mercurial developers for these concepts.
It originally started as an extension to Mercurial, but grew into its own SCM with a cross-platform virtual filesystem in C++ and a distributed server in Rust.
The interactive tool looks amazing. I do interactive rebases quite often and a drag-drop setup is wonderful
However, I don't understand why I would want 1 PR per commit. I feel like that's a non-starter for me.
Is the idea that no one should use branches - so there's only 3 points of interest: HEAD, main, and origin/main? And then is the idea that it's only 1 commit per feature to merge?
So I would work on something, make a PR, continue working on something else without making any git checkouts and then make a new PR?
Generally, you "stack" your commits/PRs and review them in small units (which I think is what you mean by working on something else without making any Git checkouts).
But you can certainly create new "branches" of development which aren't stacked on top of each other. They just don't have to have names. You can consider them to be "anonymous" branches.
The main advantage of 1 commit per PR is to review and commit smaller changes (a single commit at a time).
No, stacking is a (partial) workaround for Github "pull requests" being bad, by reimplementing the ordinary Git way to submit changes (one email per commit, reviewable inline; look at any patch thread at https://public-inbox.org/git/).
Doing this doesn't really make them good, but it makes them at least reviewable.
That's actually a deal breaker to me. Effectively using Git's staging area has become so integral to the way I work with repositories that I don't think I can ever go back to the old style.
It has an interactive commit staging command which accomplishes the same thing. In that case, it unifies the staging area with regular commits, which means you can also manipulate them the same way as regular commits.
AFAICT there are only two workflows involving the staging area: staging partial commits and resolving conflicts. The first case is taken care of by partial commit support, and the second case presumably has its own dedicated mechanism.
My favorite example of the staging area being super weird is `git diff` behavior. By default, `git diff` shows unstaged changes but _not_ staged changes; to see staged changes, you need to use `--cached`. If it's not clear why this would be weird, try out the following:
* make some change to a tracked file
* run `git diff` and see the change you made
* stage the change with `git add`
* run `git diff` again and no changes are shown!
* run `git diff --cached` and the change is back
It's super bizarre to me that there would be some sort of intermediate state where changes aren't visible by default. I feel like it would make more sense to have some sort of formatting difference indicating unstaged versus staged versus committed changes, but I imagine changing that now would break all sorts of scripts, so we're stuck with it.
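A minimal demonstration of staged changes disappearing from the default diff, in a throwaway repo (names made up):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name Demo
echo one > file.txt
git add file.txt
git commit -qm initial

echo two >> file.txt
git diff | grep -q '^+two'           # unstaged: the change is visible

git add file.txt
test -z "$(git diff)"                # staged: plain `git diff` shows nothing!

git diff --cached | grep -q '^+two'  # --cached (or --staged) reveals it again
```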
Not to subtract from your point at all, but newer versions of Git allow you to use `--staged` as a synonym for `--cached`. That this is the default behaviour still makes no sense, but at least the name does.
That's good to know, but one has to wonder why that didn't change 15 years ago. It's representative of the general problem with Git: no particular problem with having bad UX for a long time without improving it. It seems like things started to get better in the last few years, but seriously it's not like the problems weren't obvious for a long time.
To be honest, I think this behavior is quite useful.
Staging just means "I'm happy with the changes so far but didn't finish everything I want in my commit; let's ignore these changes for now".
What `git diff` does by default becomes useful when you touch the staged changes again. Then you see only the new changes compared to the staged stuff. This helps build up a commit step by step, with some trial and error in between.
Think, for example, about something like this: you use some tool to do automatic changes. This creates hundreds of changed files, but the result isn't working. You could commit that, sure, but then you would need to rewrite history before pushing, because creating non-working commits is a terrible idea. Or you could just stage the changes for now. Then you can repair the still-missing parts, and `git diff` will helpfully show you only the new changes while ignoring the staged stuff as long as it's untouched. You would, for example, easily see the changes you made on top of the automatic rewrites. Without the staging behavior you could only compare with a committed state, and drown in hundreds of unrelated changes.
The main problem with the staging area is that quite a few GUI tools don't use it correctly. The tools try to "simplify" Git by ignoring how the staging area is supposed to work, or ignore it completely, like the infamous JetBrains IDEs. (IDEA is the tool that needed almost 10 years to implement Git sub-modules…) I think the VCS handling in IDEA is, on the surface, very polished. But when it comes to something like the staging area, the UI-wise very crappy VS Code Git support beats it by far. Sublime Merge also does the "right thing"™ and hides the staged changes at the bottom, so you see only the changes to the changes. Exactly as the staging area is meant to be used!
I can see the behavior being useful in some cases (and it's not a _super_ common cause of mistakes once you learn it), but the issue for me is mainly that it's surprising and easy to get bitten by before you learn it. This describes git's entire UX as well, in my opinion: the underlying model is very powerful and generally pretty sound, but the API for it does a poor job of making it clear, and generally you have to make a lot of mistakes on the way to learning how to use it properly.
Git is conceptually very sane and logical. But the UX is indeed (still) terrible.
But a lot of people seem to complain about the complexity of the underlying concepts. That makes no sense to me, as the concepts are complex because the problem at hand is complex. It also makes no sense to me that people are constantly crying for a new tool even though they obviously have issues understanding the problem space and all its requirements. A new tool would not make all this complexity go away! It could at most try to "hide" some of the complexity by introducing magic. That wouldn't be helpful at all, as magic is way worse than bad UX, imho. Bad UX is bad UX; you can deal with it. But when magic goes wrong, all bets are off and you're usually in deep trouble.
It looks like people would argue to rewrite the back-end (some propose from scratch!) even though only the front-end needs a facelift…
I think the only potential improvement I can think of to the underlying model is that a patch-based system does seem like it would alleviate some of my pain points, but the UX for that hasn't exactly been fully figured out either; last time I tried pijul, it certainly seemed compelling feature-wise, but I wasn't able to figure out everything I wanted to do with it easily. It also would mostly just be a quality of life improvement for me at this point too; git is fully capable of doing everything I need it to, but sometimes I might have to google a bit to figure out how exactly to tell it what I want.
Meh. If it has mercurial's revsets instead of gitrevisions(7) I'm game, I'll happily give up the staging if I don't need to open that manpage ever again.
edit: yep, so long git
Check if a given commit is included in a bookmarked release:
`sl log -r "a21ccf and ancestors(release_1.9)"`
Mercurial revsets and phases are two killer features of mercurial that blow any counterpart git has out of the water.
Phases are a property of revisions that essentially let you know their state. By default, there are three phases: public, draft, and secret. You can't rebase a public revision, nor can you have a public revision with a secret parent. So you get out of this concept things like safe rebasing, or barriers that let you keep internal and external repos separate.
But revsets really shine. This is basically a full-on query language for revisions. So you can define a query alias "wip" that specifies all of the, well, interesting revisions: every revision that is not in the public phase (i.e., not in the upstream repo), the tip of the trunk, the current revision, and sufficient ancestor information of these revisions that you can see where you based all of these WIP branches on. In a single query: "(parents(not public()) or not public() or . or head())".
Sure, composing revsets is definitely a somewhat painful process... but it's possible to describe more or less arbitrary sets with a Mercurial revset, and I've never been able to find a similar workable setup in git.
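For reference, that "wip" query can be saved as a revset alias so you can just run `hg log -r wip` (this uses Mercurial's standard `[revsetalias]` hgrc section; `sl` reads a similar config):

```ini
# In ~/.hgrc — makes `wip` usable anywhere a revset is accepted
[revsetalias]
wip = parents(not public()) or not public() or . or head()
```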
Why do you want to? Not trying to be snarky, but I've been using various source control tools for closer to 20 years than 10 and I can't remember when I've ever needed or would have benefited from revsets. I'm genuinely curious what problem this solves and whether I've just never experienced or have made my own hodge podge solution for it incidentally.
I am used to working in large repos (>100 commits/day), which generally means that something like 'git log --graph' contains a lot of extraneous information.
The most common workflow I have is that I've got a couple of old working branches (like featurea and featureb), and I want to see if I need to update featureb to a newer head or not, or if featureb was based on featurea or featurea-v2. A demonstration of this kind of thing is 'hg wip' here: http://jordi.inversethought.com/blog/customising-mercurial-l....
Another thing I would use revsets for is answering queries like "which of these changes that's on the public repository made its way into the internal repository (which periodically merges from the public repo)?"
You can also do `sl/hg log -Gr a21ccf+release_1.9`. The graph tells you the relation of selected commits. Last time I checked, git does not have the same --graph rendering yet - it only considers direct parents not ancestors.
The CLI and nomenclature for the staging area (what should be called "draft commit") is awful, but the actual concept is very easy to understand.
I seriously doubt anyone who uses a sane interface to Git (e.g. a GUI) has any trouble with clicking + to add changes to the draft commit before committing it.
Most GUI tools let you automatically add all changes before committing anyway so you don't have to know anything about it if you don't want to.
They just needed to name things better (what is a "soft reset" again?).
The main problem I have with the staging area is that it amounts to being something that's like a commit except for, you know, not actually being a commit, and therefore things that normally work on commits don't necessarily work on the staging area.
A better fix would be to make the staging area an actual commit, and then reframe everything as easy ways to edit the latest commit. (This meshes well with adding features like Mercurial's phases or changeset evolution that make commit editing somewhat safer).
A "draft commit" (I really like that name!) is not a commit, and should not be handled as such. This would make this feature more or less useless.
The whole point of the "draft commit" is that you can easily see changes against your (uncommitted!) changes. That helps to build up a commit step by step.
Committing WIP stuff (and maybe even pushing it) makes the history useless. Branches don't help, as you end up with millions of WIP branches that are all incompatible with each other (and with the evolution that happened elsewhere). Just keeping WIP branches up to date would be a full-time job, then.
Git has already a means to edit the latest commit easily: `git commit --amend`.
> The whole point of the "draft commit" is that you can easily see changes against your (uncommited!) changes. That helps to build up a commit step by step.
> Committing WIP stuff (and maybe even pushing that) makes the history useless.
Here's the thing. I'm a very big proponent of keeping history clean, making sure that commits are atomic, and exorcising any "typo fix" commits or the like from history. Not once have I found the concept of a staging area useful. Features like `git commit --amend` or `git add -i` are incredibly useful [1]. But not the staging area itself; it's only a thing that screws me up if I forget to add `-a` to `git commit`.
"Draft commit" also, I think, elucidates the other problem. You see, drafts of regular documents are frequently shared with other people, with multiple versions of them created and shared, etc. Drafts don't become final until they're actually published, and there is utility in being able to track the differences between drafts as they are discussed. If you've got a "draft commit", then it should be able to go through this process; this is basically the process of code review.
Of course, we're already working with a VCS, which is designed to handle different versions of code, so... what if we made the "version history" of commits just... regular commits? Sure, shade them a different color, so you know that a commit is a draft, and you can tell which of the commit's parents [2] is the previous version. And knowing that a commit is a draft, when it actually gets pushed into the trunk, you can commit only that final commit and not include any of the previous history. Since the commits are using the same DAG logic under the hood, questions like "what changed between version 2 and 5?" become just regular diff commands [3].
By the way, this system already exists. It's known as changeset evolution in Mercurial, and it appears that Sapling here has adopted it. My workflow in git tries to emulate this model to a degree, but the approach of having branches-based-on-branches doesn't mesh well with how git wants to do things.
[1] The number of times I have painstakingly sorted out which changes go into the commit with `git add -i`, only to immediately and accidentally undo them with a `git commit -a`, is quite high. And because the staging area isn't an actual commit, it can't be recovered by digging into the reflog like actual commits can.
[2] If you amend or otherwise modify a commit, it has one parent, which is the previous version; if you rebase a commit, it has two parents, one of them the new commit it's based on and the other is the previous version.
[3] Worth noting that this question often turns out to be difficult to answer with most code review systems. Building a model of "commit history" into your VCS makes it come out for free!
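The footgun from footnote [1] is easy to reproduce in a throwaway repo (file names and commit messages here are invented for the demo):

```shell
# Carefully stage only a.txt, then accidentally sweep everything in with -a.
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.email demo@example.com
git config user.name Demo
printf 'one\n' > a.txt; printf 'two\n' > b.txt
git add . && git commit -qm 'initial'
printf 'one changed\n' > a.txt
printf 'two changed\n' > b.txt
git add a.txt                 # painstakingly stage just a.txt...
git commit -aqm 'oops'        # ...but -a also commits the modified b.txt
git show --stat --oneline     # both files show up in the commit
```

Nothing here is recoverable from the reflog as a separate state: the careful a.txt-only selection simply never existed as a commit.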
It's really just a matter of habit; getting used to having no staging area takes little time and has huge benefits.
We develop HighFlux[1], which also gets rid of the staging area. It greatly simplifies your mental model of what's going on.
Because everything you save is automatically committed, switching to a different task/branch is also always instant without needing stash.
Because what you're testing locally is what you're committing, I also never have CI failures anymore (with the staging area I frequently had unexpected interactions with unstaged changes, and sometimes even files I had accidentally forgotten to add).
I lived through the alternative and the staging area is superior. If it wasn't, I might not be using Git, or at least begrudge when I have to; neither is the case.
But a branch is a commit but the staging area is an index, which is something different. Why two different concepts when one would do? That's my problem with the staging area.
I don't care how this is implemented. You may be right that the implementation could be more streamlined. But I care only about the functionality.
The whole point is to have some form of "draft commit".
The staging area lets me "stash" WIP changes in a transparent way.
Having a "draft commit" feature avoids the need to rewrite "bad commits" after the fact.
The staging area is really useful to build up commits gradually.
When thinking about it I've just realized that the staging area should not only be kept as a feature but could be even extended. You could add a "change-set management system"—which would be essentially multiple staging areas (maybe coupled to improved stash functionality to be able to quickly move / copy changes between change-sets).
Yes, such a thing would very likely need to be built on top of the mechanics behind branches / commits. But this should be transparent and not interfere with the said features, imho.
Sounds like saying bye-bye to any meaningful history.
Rebasing, cherry-picking, or reverting of commits becomes impossible when every save of a file is pushed.
You could just publish local IDE history… Would be equally "good" I think. (My IDE is saving files every few key strokes btw; the resulting history would be a bloody mess).
Why not go one step farther: Just make an automatic block image of the whole system of every developer machine every few seconds. You could then just deliver the image. No docker setup needed any more. Just write code. And when the local version works, ship the whole local system just as it is. ;-)
> You could just publish local IDE history… Would be equally "good" I think. (My IDE is saving files every few key strokes btw; the resulting history would be a bloody mess).
Google does this, it has saved my butt on more than one occasion.
> Sounds like saying bye-bye to any meaningful history.
I'm so confused, you just amend the most recent commit, or work with changes unstaged and uncommited. Like my normal workflow is basically "change 2-3 files such that things are passing, hg commit, send for review", and then I continue working on the next thing, either back to HEAD if its unrelated, or on top of the just-pushed changes if it depends on them.
It's vastly simpler than having to git add at random times.
> > You could just publish local IDE history… Would be equally "good" I think. (My IDE is saving files every few key strokes btw; the resulting history would be a bloody mess).
> Google does this, it has saved my butt on more than one occasion.
You mean backups? Yes, backups are a very good idea.
But this has nothing to do with VCS. Those are separate topics.
> Like my normal workflow is basically "change 2-3 files such that things are passing, hg commit, send for review", and then I continue working
As long as the requirements are so trivial even CVS would suffice.
But even fairly simple refactorings (in e.g. static languages) can be much more complex. It's easy to end up with hundreds of files changed. Then you need more powerful tools. Doing such things without the staging area is almost impossible to get right. (The only alternative would be quite a few rebase sessions, and those are way more complicated than using the staging area upfront; also you would need some way to do diffs against "pinned" changes—which is something you get for free with the staging area).
> You mean backups? Yes, backups are a very good idea.
No, I mean that the filesystem I edit code in has a full snapshotted history of every save and I can recover to a particular revision or particular point in time, even ones that weren't committed to vcs[0]. I guess you could call that a "backup", but it's not what people usually mean.
I have used this to recover some things I was working on three months ago but ended up throwing away.
Citation:
> All writes to files are stored as snapshots in CitC, making it possible to recover previous stages of work as needed. Snapshots may be explicitly named, restored, or tagged for review.
> It's easy to end up with hundreds of files changed.
Sure, doing a rename can touch 100 files, but you should isolate that change in a commit and PR that don't do anything else. Running sed and then committing, or using your IDE's refactor feature and then committing and running tests, doesn't require a staging area.
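The "mechanical rename as its own commit" approach can be sketched end to end (repo layout and identifier names invented; `sed -i` as written is GNU sed):

```shell
set -e
cd "$(mktemp -d)"
git init -q renamedemo && cd renamedemo
git config user.email demo@example.com && git config user.name Demo
printf 'call oldName();\n' > x.c
printf 'int oldName(void);\n' > y.h
git add . && git commit -qm 'initial'
# The mechanical rename, isolated in a commit that does nothing else;
# no staging needed because the commit is intentionally all-or-nothing.
git grep -l 'oldName' | xargs sed -i 's/oldName/newName/g'
git commit -aqm 'rename oldName to newName (mechanical)'
```

After running tests on the result, the follow-up behavioral change would go in a separate, small commit.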
[0]: The truly wild thing about this is that the vcs state is also stored in the snapshotted, point in time recoverable fs, so if you do the equivalent of absolutely botching your git history, you can jump back in time a few minutes and start from a known good state.
One of the problems with `git`, and it seems that Sapling is no different, is that there is no one-to-one mapping between the user intent and the underlying SCM.
For a user intent of "implement feature X", there is no UI to "start a new feature". Instead, one has to translate their requirement to the SCM mental model and issue a series of commands to manipulate a DAG of blob hashes that live in 4 different places simultaneously (work tree, index, local repository, and remote repository).
Highflux allows a simple user requirement to action mapping.
I use HEAD as my staging area, and do all the normal staging-area things, but without the weirdness that comes from the staging area being a different concept to a commit (ie, the diff command Just Works, no need for a separate `--staged` flag to enable special behaviour for that case)
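A minimal sketch of that HEAD-as-staging-area workflow (all file and repo names invented): the pending commit is built up by amending, so plain `git diff HEAD` always shows the remaining uncommitted work with no `--staged` flag.

```shell
set -e
cd "$(mktemp -d)"
git init -q headdemo && cd headdemo
git config user.email demo@example.com && git config user.name Demo
printf 'base\n' > f.txt && git add . && git commit -qm 'initial'
# Start the "draft" commit with the first slice of work:
printf 'base\nstep 1\n' > f.txt
git commit -aqm 'feature: first cut'
# Later work folds into the same draft via amend, not via the index:
printf 'base\nstep 1\nstep 2\n' > f.txt
git commit -aq --amend --no-edit
git diff HEAD                 # empty: everything is in the draft commit
```

Because the draft is a real commit, it is also recoverable from the reflog, unlike index contents.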
the number of projects that require the scaling factor of something like this is very small. git with lfs scales very well for most repositories. That said the actual flow of git is pretty raw. There are other pretty solid projects out there that support undo commands into git.
Stacked PRs are a blessing and a curse - still not convinced they are the correct way to build software as a team.
Eh, it’s pretty hard to make a case that you don’t want stacked diff sometimes.
Obviously life is simpler if all your work is sufficiently non-intersecting that you can send separate diffs/PRs and e.g. rebase them separately, but if you have Big Feature X and you still want small, single thesis diffs, where else do you turn?
Public projects, sure. But plenty of companies have very large codebases that would benefit from this. Even if Git can handle it it can get very slow with largish repos.
I would be more concerned about the ease of mistyping `ls` as `sl`, especially if you tried to list a directory whose name is a destructive Sapling command.
Yes. Right now we are adding more documents about how internal components (like the commit graph) work. We hope others in the industry find them useful. Feel free to ask questions in GitHub too.
This serves a very weird niche. Reading through the docs, this seems just as complex to operate as git, but designed with less decentralized operations in mind. Why not just use mercurial if you want to use mercurial? Why invent this... monstrosity? Because GitHub pull requests are terrible?
None of this makes any sense to me.
> Local branch names are optional.
As are they in git, just hang out with a detached HEAD.
> There is no staging area.
Practically the entire world sadly invokes `git commit -a` anyways and you still have to add untracked files.
Neat project but I don't get what this is solving for.
> Why not just use mercurial if you want to use mercurial? Why invent this... monstrosity?
But this is mercurial. Or rather, it's mercurial rebased on top of the git data store, and it's a fork with breaking changes so it has a different name.
I do agree that the requirement to be online gives me pause. But I guess I don't know how much of a problem that would be in practice, since there's a mystery subset of functionality that works disconnected.
> Neat project but I don't get what this is solving for.
For us external people, it seems like it's mostly for using the hg interface with a github-hosted repo. The internal reason appears to be scalability to massive monorepos. Since I much prefer the hg interface to the git interface, I'm good with both of those motivations.
A lot of the important work is on back-end scaling for a centralized repository. Example: the segmented changelog allows answering merge-base queries in ~O(log n) time and very quickly in practice; these queries need to happen all the time as part of handling merges on the server-side. You get the front-end, a streamlined interface derived from Mercurial, for "free" as part of the open-sourcing.
But this centralized repository isn't available yet, so I cannot evaluate it.
And the problems you describe aren't really relevant outside a monorepo, or for low-volume repositories, which is where the vast majority of open source code falls. I much prefer the ability to clone an entire repository and be able to make changes in a distributed manner.
If this is for companies who aspire to have Google or Meta scale problems then this sure is a weird way to advertise it.
Not sure if anyone has tried this yet... but the example tutorial does not work.
$ sl clone https://github.com/facebook/sapling
$ cd sapling
$ sl
@ fafe18a24 23 minutes ago ricglz remote/main
│ migrate packer to new CLI framework
~
From [0] under "Cloning your first repo". I get the following:
~/sapling (main)> sl status
abort: '/full/path/sapling' is not inside a repository, but this command requires a repository!
(use 'cd' to go to a directory inside a repository and try again)
Hopefully this does not assume we are authenticating with GH just to clone and see sl operating?
Neat! I hope this is a step in the right direction toward a not-so-bespoke SCM.
I do wonder:
1) How it handles large (binary) files. This is a major pain point when using git and even the standard solution (git-lfs) leaves *a lot* to be desired.
2) How does server hosting currently work? I didn’t see any mention and am assuming it’s not an option currently? (two dependencies of Sapling are currently closed source)
Could this be a replacement for git-annex?
I've been using git-annex for a while, but it's slow and has some quirks.
I would love to use version control on my whole home dir and sync different computers with it.
It's interesting to see more people embracing the patch-stack workflow! For those interested in using the patch-stack workflow, but not ready to change to sapling, git patch stack would be worth checking out.
Interesting execution. I'm not totally sold that Sapling is somehow forcing smaller/(more understandable) commits. Running Sapling restack with the manual step of an `amend` doesn't sound too different than running `git rebase -i` and moving the commits around. ReviewStack is interesting, but nothing new. It seems like it's removing the need to click through the commits page in GH by exposing it in a dropdown. IMO, the real improvement to our workflows will come from using better diff tools to make reviews more intuitive. I am biased of course :) (full disclosure: I work on DiffLens https://marketplace.visualstudio.com/items?itemName=DiffLens... )
I am a bit lost. Most of the discussions seem to focus on micromanagement of commits in your local HEAD branch.
I agree that the Git commands are pretty primitive (aka low-level). But eventually you can learn the good patterns and deal with that.
For me, the big question is how to manage a monorepo (with zillion of branches) so you can express which set of branches are relevant to your current concern at time T:
- focusing on the dev of a given set of features,
- freezing them into a delivery,
- pushing a stable monorepo+that frozen delivery to your validation platform,
- once validated, integrating that frozen delivery to one of the master branches of the monorepo,
- management of the many master branches corresponding to each subpart of the monorepo into a supermaster branch
Does this tool (or Mercurial in general) help with all that mono-repository branch management ?
I've worked with Mercurial repos for 10+ years and only recently we've been migrating some of the smaller repos to Git primarily due to better supported tooling. Mercurial has done a great job in specific workflows especially with phases (which not everyone adopts) -- by default they disallow modifying public commits but allow modifying draft commits. I don't believe Git really has a similar concept, or at least not integrated at the same level. I recently read about `git push --force-if-includes` which seems like it's trying to address similar situations but mostly guessing based on comparing changesets present between two repos.
To your question about what/how Mercurial helps I'm guessing it's related to this, or other workflows enabled by Mercurial's Evolve/Topic features. It sounds like Sapling has adopted Evolve/Topics and making them functional on Git repositories though I'm not sure what it's doing under the hood.
Does it support commit signing? I spent a while reading the website and couldn't find anything suggesting it does. Lack of that is a showstopper for me (and frankly, should be a showstopper for anyone).
There is a distinct lack of decent identity management/security in all of the version control systems I've used. It's a hard problem to solve, especially in a distributed/decentralized system (like git). Signing git-style commits is problematic in the face of merge conflicts or rebasing. A patch-style system (like Pijul) probably makes this easier: if everything is a patch, every patch can be signed atomically.
I'd really like to see a DCVS with better signing support and with some form of access control (on the remote), so every change can be traced back to the author, and so that some parts of a repo can only be modified by specific authors. Git hooks (on the remote) can sort of achieve the latter, but it's a bit of a pain.
> A patch-style system (like Pijul) probably makes this easier: if everything is a patch, every patch can be signed atomically.
In Pijul, patch authors are public keys and patches are signed by default. The link with an author's identity is done outside of the patches to allow for changes in name or email address.
I don't see why the git way is problematic. It means someone is verifiably taking responsibility for all changes. That applies to conflicts and reading as much as normal commits.
Edit: I'm not saying there's not a better way, just that I don't understand the problem with git.
1. I accept there's a requirement for a second level of verification on the signature, but I can't see how that's avoided in any scenario (that is, the signing is orthogonal to the verification).
2. That's the point though. The person doing the commit takes responsibility. The individual commits are still there before the merge (including signatures), so there's no loss of responsibility or credit before the merge.
If they deliver the virtual filesystem and server, this will be 10x better for large companies and large organized FOSS teams, who very often hit Git's pain points with relative ease, and the compatibility with Git from the client side means it has very easy onboarding for new users to get their feet wet. I've struggled with all these to varying degrees at almost every job and (non-solo) FOSS project I've been in; I've long wished for something like what's described here many, many times.
No joke, this might solve every major pain point I've had in mid-size-to-large teams in both the FOSS and proprietary world if it can deliver on what it says here, not to mention many issues with monorepo migration, work sharing, subproject management, etc. Many of these problems are very real but mostly ignored or we've decided to live with them.
Git is great, I'm one of the earliest GitHub users there is. I was also an early user of Darcs (which formed the theoretical basis for later competitors like Pijul -- so I'm not unfamiliar with radically different approaches), have a lot of experience with all kinds of administrative Git tasks for large repos; there's still plenty of room to fill gaps with new blood.
This won’t go anywhere even if it’s 20% better than git. To replace git’s network effects, you need to be 10x better.
Nobody is trying to replace git; that’s not a stated goal. Plus git is so entrenched, it’s not going anywhere anytime soon.
However, few companies have code bases more gigantic than Facebook's; not that long ago, their entire monorepo [1] was in Mercurial, which had certain advantages over Git at the time.
So if there’s an organization that knows the pain points of version control, I’d put Facebook on that list. They’ve been working on improving version control at scale for more than 8 years.
As long as it’s git-compatible, the git true believers will have nothing to worry about.
Because it is based on Git, jj is not a CRDT; it seems to just be a better merge algorithm (even though the details of their algorithms are scarce).
When I say "not a CRDT" I'm obviously talking about HEAD not being a CRDT, a Git repo is append-only, so the history of a Git repo actually is a CRDT (but that's not what the comment above meant).
Looking back when the last change happened: there was a subversion integration with git, but people actually switched to git for proper decentralised version control rather than use it. Then switched to GitHub to recentralise but that’s off topic.
I remember git-svn being pretty commonly used back in the day (circa 2007). At that point a lot of open-source projects were still using svn and if you wanted to use git locally, git-svn or something similar was how you did it.
My first experience with git was using git-svn to work with my company's internal svn repository, which I did for a couple years before the company stopped using svn. There was no internal desire for decentralized version control (rather the opposite, in fact; they wanted centralized permission management and such).
Not everything that uses GitHub/BitBucket/GitLab etc. is (re)centralized. You can have an internal company hosting service and a public hosting solution, for example. I've also used the decentralized capabilities to synchronize between two computers.
There are many reasons why GitHub (or something like it) are popular, such as:
1) not having to host the infrastructure yourself (incl. hosting it on AWS/Azure/etc.)
2) discoverability -- being able to follow people/organizations creating projects you are interested in; being able to search for projects ~ having these on various websites makes it harder to discover them
3) additional functionality/capabilities like static web page hosting (great for things like personal projects), and CI/CD workflows
Somehow that doesn't matter for FB. Yarn is not 10x better than NPM and it took off. React is.... ugh... React and it took over the whole dang industry.
The UI sounds much better than git. (but what can be worse than git in the first place?!) It's solving (or trying to solve) many common griefs of git. I don't like Meta, but I find myself enjoying the design itself.
I especially like the idea of stack. It's not something git can't do, but someone should've spent quite some time on tons of trial-and-error to nail the workflow. It's certainly a well aged project - a decade old! Kudos to that.
I'm not interested in a simplification of git, sorry.
Git's major value proposition is that they added moving parts until the system worked great. If you don't want named branches, staging, or any other piece of the ideology, then subversion is a fine choice.
Despite the greytexting of this comment, I'm still really interested in the discussion around it -- and I'm really curious to hear from people that have the opposite experience.
Moving from SVN to git felt like liberation because there were suddenly idiomatic ways of expressing states that were sort of smushed together by SVN -- stuff like "I have some changes in my branch I want to line up for commit", which became the handy one-word concept "staging".
The before-times were marked by a lack of these fine distinctions. While they existed in fact, they were obscured in-system.
Like a map, any tool should 'resemble' the sphere of human activity it potentiates, and git resembles our diverse workflows better because it has so many asinine distinctions.
This was always its strength, and indeed, likely the reason the platform is called 'git' in the first place, as 'smarmy git' (English idiom for 'smartass') implies an insufferable drawer-of-distinctions.
And like a smarmy git, it's easier to complain about git than it is to replace it.
In my opinion, Git added too many moving parts in a poorly-designed way. Commits, the staging area, and stashes all implement the same sort of idea, but interact poorly and considerably complicate workflows. Having them is better than not having them, but Git would have been better off if they consolidated into a single idea and concentrated their efforts on that. If you can remove a concept from Git and still support the same workflows equally well, then surely that concept was unnecessary. (Whether the staging area is actually better served by other concepts is a matter of judgment.)
Interesting -- I have the opposite impression, and we're both just simply staring at the same pile of distinctions and disagreeing on whether it 'resembles' the workload.
I imagine this will be settled by git steamrolling sapling in the market, but I wonder if there's a faster (and less network-effected) way to adjudicate? Both your position and mine seem lodged in a taste/touch/feel context, which seems like a data-poor place to make good decisions.
On the other hand, I'd say that absent sufficient data, one should pick the most flexible tool, which I'll bet in this context is the one with the most moving parts, i.e. git.
The findings validate the earlier conceptual design analysis in practically all aspects: https://gitless.com/#research
Git doesn't support certain workflows well. For example, how do you split the contents of the staging area into two separate groups? Sapling handles this the same way it handles splitting commits in general. Essentially, it has a greater set of verbs that act on a smaller set of nouns, where Git has a medium set of verbs that act on a medium set of nouns.
> how do you split the contents of the staging area into two separate groups
What is the workflow behind this ask? I don't understand what the goal is. The basic git workflow:
1. Edit and save a tracked file; the changes appear in the working tree.
2. Select some subset of the changes in the working tree to stage them in the index.
3. Form a commit with the changes in the index.
IIUC you want to add a step in between 2 and 3? But the way I see it, 2 is doing what you want. I can split the set of current changes by selectively adding them to the index in preparation for a commit. I can also selectively un-stage changes if I decide I don't want them to become part of the commit.
Between the changes to a file in whatever editor buffer I'm writing in, saving those to disk, moving changes from unstaged to staged, and forming a commit in any of a variety of ways (plain ol' commit, amending a commit, a fixup commit) I can't imagine what other way I need to slice and dice changes. Maybe it's just a failure of my imagination since I've been using Git for so long now and only more basic things like SVN/TFS/CVS before that.
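The three steps above, as concrete commands (file names invented); step 2's subset-selection is shown with whole-file `git add`, though `git add -p` does the same at hunk granularity:

```shell
set -e
cd "$(mktemp -d)"
git init -q stagedemo && cd stagedemo
git config user.email demo@example.com && git config user.name Demo
printf 'a\n' > f.txt && printf 'b\n' > g.txt
git add . && git commit -qm 'initial'
# 1. Edit and save tracked files; changes appear in the working tree.
printf 'a changed\n' > f.txt
printf 'b changed\n' > g.txt
# 2. Select a subset of the changes to stage in the index.
git add f.txt
# 3. Form a commit from the index; g.txt stays behind as unstaged work.
git commit -qm 'change f only'
git status --short            # g.txt is still modified, not committed
```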
To accomplish your workflow, there doesn't seem to be any need for a staging area at all. You can use an interactive commit selector to stage a subset of changes (or select a set of changes some other way), and then if you want to keep adding/"staging" changes, you can amend the commit you just made.
Occasionally, I do run into the situation where I've staged some changes and then realize that I want to start staging another commit first, but don't want to lose the changes I already staged. Unstaging my current changes means I have to remember and select them again later. I could also just commit what I have staged and start staging a new commit (and perhaps reorder the commits later), but that shows that the staging area was unnecessary in the first place, and I could have used commits to accomplish the same workflow without adding a new set of concepts to my VCS.
On the other hand, I'd say that absent sufficient data, one should pick the most flexible tool, which I'll bet in this context is the one with the most moving parts, i.e. git.
Outside of the Git fanboy bubble, developers don’t want to have to be version control experts, which once you get past a certain level of Git usage, you have to be, whether you wanted to or not.
Especially if your team doesn’t have that person who can get you out of any Git jam you may get yourself into.
It will also be much faster to get a new developer up to speed using Sapling than Git. And because it’s Git-compatible, if there’s something super advanced that can only be done using the Git command line, that’s still an option.
Telling the intern/junior developer to read the man page for git-log is a non-starter; it’s over 19,000 words!
The best thing for the greybeards is they can continue using Git while others use Sapling and commit to the same repo.
To be fair, if I had a junior dev who couldn't skim a man page of 19,000 words, I would have freestanding concerns about their ability to contribute -- being able to inhale knowledge is a core competency of any engineer.
That said, I hear you when you say that most folks (think) they have better uses of their attention.
I daresay they're wrong, but it's not my place to dictate terms to anyone's curiosity, my own included!
coming from SVN, the biggest thing about git is its distributed nature, making everything local and fast. yeah there are a lot of other concepts along with that too, but just being able to work on my laptop,
committing and making branches to my heart's content, without talking to a server until I'm ready for that step, was the killer feature for me.
SVN branches were this doofy copy procedure that made sense once you drank the Kool-Aid, but they were so unwieldy compared to git
You can push and pull directly from Git repositories, without resorting to some kind of tedious intermediate conversion step. Note that they don't support co-locating in the same repository at present, i.e. running both `sl` and `git` commands in the same directory.
I’d like more detail on this too. I’m interpreting this to mean that anything you can do with git, you can do with sl, but I wonder what limitations exist for that. There have got to be some, somewhere along the way. Even just edge cases! Is it as simple as swapping ‘sl’ for ‘git’ in your terminal and workflow and nothing else changes? Surely not…
You `clone` your git repository into Sapling.
You can then only use the sapling tooling (CLI) until you `push` to GitHub. You can't use `sl` and `git` or git GUI/ IDE plugins on the `sl` clone.
I find it interesting that this is open sourced a few days after 11K were laid off.
Was the bulk of the team behind this laid off, making it sensible to open source it so the community can take it forward rather than paid staff?
To be clear: I'm NOT criticizing them open sourcing this.
I doubt that the FB people involved with the layoffs are the same FB people who decided to open source the Sapling project. Many FB engineering managers were not aware that the layoffs were being planned. Moreover, it takes up-front planning and design decisions (more than just a few days' worth) for a private company to open source an internal project.
The open sourcing schedule was completely unrelated to the layoffs.
The team working on Sapling has been planning this release for a very long time and will continue working on Sapling to support our internal engineering efforts.
Trying to use it on an existing git repo doesn't work... At least not out of the box. I wonder why that is? Makes it less fun to work with, as you can't easily switch.
It needs its own client-side data, but it will work with existing git servers - so switching is as hard as running “(git|sl) clone https://github.com/…"
Not really. I'm not sure I want it to behave the way GP suggests, but I definitely want to have staging, and `git commit -a` is basically an equivalent of having no staging.
The reasons for that are:
1. In the vast majority of cases there are multiple files I want to commit together. Usually I change them multiple times during the process.
2. It almost always starts with some debugging in a couple of other files, and I often want to keep that debugging for a couple of next commits, but `git checkout HEAD` these files in the end.
3. For me, the most popular use of git rebase → edit (which I do reasonably often) is splitting a commit into 2 by separating files. This is easy enough by just changing the status of a file from "staged" to "modified" (I even have a `git unstage` alias for that) and committing.
So I kinda get why GP wants what he wants. This isn't crazy.
Now, that being said, I personally have absolutely no problems with how git does that now: I've figured out a workflow that solves these problems for me, and everything is ok now. This workflow is basically making many dozens of tiny commits to a branch without even bothering to name them properly, and then just doing `git rebase -i` many, many times while working on a single branch. So I just commit the code I don't intend to keep with a label "drop that", and drop these commits when I'm done. The other commits usually get heavily reordered and squashed into 3-5 larger commits that make sense on a higher level (like 500 LOC of refactoring first, and then 1 LOC of the actual bug-fix, which usually makes much more sense than just 500 LOC of a bugfix that solves the problem somehow, but where it's absolutely not obvious how). I rarely can figure out that separation before I'm done. In fact, I often fix the problem first, then refactor, then roll back the fix just to add it again in a separate commit at the end (if the refactoring and the fix affect the same file, which is often the case).
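The "many tiny commits, then rebase" workflow described above can be scripted end to end (all names invented; the interactive todo-list editor is replaced with sed so the example runs unattended):

```shell
set -e
cd "$(mktemp -d)"
git init -q rebasedemo && cd rebasedemo
git config user.email demo@example.com && git config user.name Demo
printf 'base\n' > f.txt && git add . && git commit -qm 'initial'
printf 'base\nrefactor\n' > f.txt && git commit -aqm 'refactor part 1'
printf 'debug hack\n' > dbg.txt && git add dbg.txt && git commit -qm 'drop that: debug'
printf 'base\nrefactor\nfix\n' > f.txt && git commit -aqm 'the actual fix'
# Rewrite the branch: normally you'd edit the todo list by hand in
# `git rebase -i`; here sed marks the throwaway commit as "drop".
GIT_SEQUENCE_EDITOR="sed -i '/drop that/s/^pick/drop/'" git rebase -i HEAD~3
git log --oneline             # debug commit is gone, real commits remain
```

Reordering and squashing work the same way: move lines in the todo list, or change `pick` to `squash`/`fixup`.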
> Not really. I'm not sure I want it to behave the way GP suggests, but I definitely want to have staging
So I'm replying to OP's very specific usage pattern, and you object with a completely different usage pattern?
> `git commit -a` is basically an equivalent of having no staging.
GP doesn't use the staging area as a staging area, since they immediately stage all modified files. That means the staging area is useless; they can just commit files straight from modified. Which is what `git commit -a` does.
> 1. In the vast majority of cases there are multiple files I want to commit together. Usually I change them multiple times during the process.
OK? `git commit -a` doesn't preclude that. You just use it instead of `git commit`.
> 2. It almost always starts with some debugging in a couple of other files, and I often want to keep that debugging for a couple of next commits, but `git checkout HEAD` these files in the end.
I'm really happy for you. It doesn't work when the files are already staged, which is the case of GP.
> 3. For me, the most popular way of using git rebase → edit (which I do reasonably often) is splitting a commit into 2 by separating files. This is easy enough by just changing the status of a file from "staged" to "modified" (I even have a `git unstage` alias for that) and committing.
You don't need a staging area to craft commits, you can manipulate the tip commit directly. With good enough support for that (which sapling seems to have inherited from mercurial), the staging is just an unnecessary pseudo-commit.
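A plain-git sketch of working against the tip commit instead of the index (sapling and mercurial make this first-class with commands like `amend` and `uncommit`; the repo and file names here are invented):

```shell
set -eu
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email dev@example.com
git config user.name dev
git commit -q --allow-empty -m "base"

echo one > a.txt
git add a.txt && git commit -qm "feature"

# Fold follow-up work straight into the tip commit - no staging step:
echo two >> a.txt
git commit -qa --amend --no-edit

# "Uncommit": reopen the tip commit so its changes are editable again.
git reset -q --soft HEAD~
echo "commits: $(git rev-list --count HEAD)"
git diff --cached --name-only  # a.txt is back in flight
```

The tip commit plays the role of the staging area here: you keep amending it until it is what you want to publish.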
For GP's use case, it's "added some stuff, want to stage the stuff I modified in those files since then"... which I feel is a perfectly normal workflow.
There's nothing saying that one can't add chunks to the staging area and then immediately commit them without invoking that alias afterwards (since it is a very deliberate "add these things" rather than "add a bunch of things and keep adding").
Looks like the typical FB thing to do. Latch on to some popular tech like git and try to sell it as innovative or "better", gathering the fanboys. After a few years the marketing hype will blow over and we will be left with lots of misinformed newbies who got into dev work via FB products. Just like today we have lots of web devs able to throw together any React widget you want, but unable to grasp when simple server-side templating in any web framework would have been sufficient. Hammer. Nail.
I for one will remain skeptical. If they release it as free software, we can talk. Probably real innovation will happen elsewhere though, without the FB flavour to chew on.
Phabricator[0]: code review/CI solution from Facebook. My company uses it, open development has since been halted by Facebook and we're effectively on abandonware.
Flow[1]: JavaScript typing system from Facebook. My company uses it, open development has since been halted by Facebook so we're effectively on abandonware.
EDIT: React: Javascript framework from Facebook, my company uses it, and while it has its warts it works pretty well all things considered and Facebook has continued to support and evolve it over time!
For all I know Sapling is fantastic and will be developed for years to come. But personally I can't help but feel "once burnt, twice shy" (or in this case, twice burnt once shy). I'd be happy to be wrong here because ergonomics of Git are really frustrating in many places.
Your thought process is completely fair, but just to clarify: Phabricator was never open-sourced by Facebook. The main engineer behind Phabricator (Evan Priestley) left Facebook to create Phacility and open-source Phabricator; that was never a Facebook product.
If memory serves, Evan open sourced Phabricator at Facebook back in 2010 or 2011, then quit to work on it full time.
Shortly after (months? years?), the internal version of Phabricator diverged from the OSS one, which was by then no longer managed or stewarded by FB.
However I think it is fair to say, assuming my memory is correct, that Phabricator was open sourced by Facebook at a very different time, before the company really committed to supporting open source projects. At that time it was more ‘if an individual engineer wanted to then go for it’ rather than there being any formal process or consideration of longer term commitments.
That changed fairly shortly afterwards with the creation of the OSS team.
I remember someone transitioning to the newly formed team and moving from Dublin to London to do so in ~2012, as we became housemates :)
Although if Evan had access to the codebase after he was no longer an employee, and if it was the Facebook codebase that was open sourced, then Facebook was involved. The original post sounded (to me) like the OSS code wasn't the same as the FB code.
I think that just backs up my point that it was the Wild West back then in terms of individual decision making.
React[0]: JavaScript front-end web framework from Facebook. For good or ill, the most widely-used web framework in the world.
Not to say that Facebook will maintain Sapling, but React does stand as proof that they're not incapable of carrying an open source project to the finish line.
I have my doubts as to whether they can be a good citizen of the open-source community and respect the developers relying on their tools. I routinely see bugs marked “won’t fix” and nonsensical new features in the React Native ecosystem.
The nice thing is that this is a client that works with vanilla git servers - if you switch to using it now, best case, you get a lifetime of good ergonomics; worst case, you get a few months of good ergonomics before something breaks that upstream doesn’t want to fix, and you go back to using the vanilla git client.
HHVM is an interesting data point too - kept PHP compatibility for as long as there were significant open-source users (eg wikipedia), but after PHP7 caught up with a lot of the performance gains, meaning there was little reason to use HHVM in PHP-compatibility mode, they then went off in their own direction with Hacklang (which is still actively developed) to get all the benefits of being PHP-like without the drawbacks of being PHP-compatible.
PHP has some benefits to its design that the vast majority of other languages don’t — deployment is as simple as “stick a .php file in your website folder”, hitting the “refresh” button gets you the latest code with no “build” or “restart server” step, it’s all stateless shared-nothing so you won’t have data from one request changing the behaviour of another request, etc.
But the implementation has a lot of drawbacks - the language is painful, typing is bolted-on and still incomplete after years of work (eg there are no typed arrays), the standard library is an inconsistent mess thanks to its origins of “take several other languages’ standard libraries and duct-tape them together”, etc.
“PHP-like but incompatible” isn’t in itself a useful feature — it is the thing which unlocks a bunch of useful features (a sensible standard library, sensible list/dict datatypes, typed collections, XHP, async functions, generics)
I thought one of open-source's main selling points is the fact that you can fork it and maintain it even when the original author abandons it. Any alternative I can think of is either community-owned open source, which often wouldn't work due to limited resources and no initial funding, or corporate-owned closed source, which once abandoned is dead for good.
I'm kinda surprised by the excitement it gets. I'm still looking for a compelling explanation, why I (or anyone else) should even bother?
I am a git hater myself. I mean, git just sucks. It always did, and it always was much worse than Mercurial. Back when they could still be seen as competitors, I was pushing Mercurial as much as I could, but then GitHub became a thing, and after a very short struggle it became just hopeless. There still are folks who use fossil or something, but ultimately git became THE SCM. So, yeah, I hate you, GitHub, I hate you, Linus, but I fully admit that you've won. So… now I can actually admit it isn't such a big deal.
Sure, it would be somewhat better if git never existed at all and we'd all just use a better SCM from the very beginning. But given it's just not the case, what is the problem, really? It isn't hard to learn git. I do know some people who are struggling with anything outside of a simple pull-branch-add-commit-push workflow (usually performed via buttons in their IDE), but, honestly, I think they will be struggling with any other SCM just as much — it's just the difference between caring to build a mental model of the tool you use, and simply memorizing a number of popular commands. The tool isn't at fault here. So, really, git is kinda bad, but not that bad.
Monorepos? I mean, there were tools to work with them before, but does anyone outside of Google/FB actually work with repos that git cannot handle? Is it really a good idea to have such repos? I mean, it's nice that some tool can work with them, but is it actually important?
I mean, there is some new "better" SCM (often somewhat git-compatible) almost every year. But I've never actually seen anything that would make me push for that "better" SCM anywhere. Even for my personal projects. Git isn't "just git" anymore, there are countless tools that integrate with it, we all know it by heart and have sets of "best practices", how-to's, personal workflows, helper-scripts, etc. There is a huge downside to start using anything besides git, so what is the upside that would compensate for it? I never see one.
Facebook was a large contributor to Mercurial for a number of years [0]. They wanted to contribute to making Mercurial scale but it seems like they wanted to move faster and/or in a different direction than the Mercurial team wanted. Funnily I think they originally chose Mercurial for essentially hitting those same roadblocks from the Git community. Instead of continuing to contribute to Mercurial they forked and continued working on it internally, now released as Sapling.
Many companies/organizations won't hit the size/scale where this matters but there are certainly plenty of companies who have large repositories and contributors that would benefit from something like Sapling over Git/Mercurial. These tools start to become slow. Many addons have been created to ease the problem (LFS or Microsoft's VFSforGit, narrow/shallow checkouts, etc.) but they also add complications (LFS especially in my experience). Monorepos have their advantages [1], and even with disadvantages they exist and won't go away. It's more appealing to migrate a monorepo to a new tool that adds more benefits specific to monorepos than to break apart a monorepo into separate repos.
The architecture being moved toward seems to be less decentralized, which is the opposite of what Git/Mercurial were initially pioneering. I believe Google has essentially also built their own server-side SCM [2] and made Git/Mercurial clients (or wrappers for a client). I believe Microsoft did something similar, forking Git and/or making their own server-side SCM, but I don't recall where I came across that.
Bash has weird defaults so you end up googling for everything. In fish, it just works and you barely need to search for anything.
Sane defaults matter. With hg, I don't need to struggle to get it to do what I want, it just gets out of the way. With git, sure it works, but like you said it has a bunch of duct-taped tools that change the defaults or just generally make things easier.
Now hg is half the pattern here. The other half is stacked commits. Each commit should build and get reviewed separately. There isn't any waiting for reviews on each commit; they all get reviewed over time and you rebase in any changes that are requested. With git this is amazingly painful and half my zshrc is about making this simple. With hg, it just works. Take a look at hg absorb or hg split: they're features built on top that, yeah, can be replicated in zsh scripts, but it's kind of nice when you can assume they just work. It means junior engineers don't spend hours trying to fight git with stacked diffs.
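For comparison, the closest thing stock git has to `hg absorb` on a stacked branch is `--fixup` plus `--autosquash` (commit contents here are invented; `hg absorb` goes further and picks the target commits for you automatically):

```shell
set -eu
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email dev@example.com
git config user.name dev

# A stack of two commits under review.
echo a > a.txt && git add a.txt && git commit -qm "commit A"
echo b > b.txt && git add b.txt && git commit -qm "commit B"

# Review feedback lands on commit A; record the fix against it...
echo a-fixed >> a.txt
git commit -qa --fixup HEAD~

# ...then fold every fixup! commit into its target in one pass.
GIT_SEQUENCE_EDITOR=true git rebase -i --autosquash --root
echo "commits: $(git rev-list --count HEAD)"
```

With absorb, the "record the fix against the right commit" step disappears: it diffs your working copy against the stack and amends each hunk into the commit that last touched those lines.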
Sapling is trying to fight the network effect here with the classic move of building a compatible but legitimately better front end. Compatible with GitHub but with sane defaults is a BIG thing.
> In fish, it just works and you barely need to search for anything.
I keep having to google the location of my configuration file. It's ~/.config/fish/config.fish. I think, if it's not in ~/.local.
The whole function thing is also not the easiest to understand, although I love that it hot reloads and is global across all instances and so on, along with all sorts of other things.
Overall fish is one of my favorite shells but it's not 100% intuitive at first.
Because people on the internal network are behind the same ip address(es)... This makes perfect sense to mention while showcasing work. I'm not sure what you are insinuating.
I just cannot understand how a CTO approves this type of project. Imagine running a company and some engineer comes in and says they want to develop a new source control system. I don't understand under what circumstances this is approved. Is it a pet project for a 10x engineer and it's allowed just to keep them on board?
There are projects, like Apache Hadoop, that are open-sourced because they're an open-source answer to an extremely powerful, successful, commercial product. Sapling is nothing like this. The reason it's being open-sourced is because Meta considers it tech debt and I'm not surprised.
The simple reason is that Mercurial and git were way too slow for large repos, and FB wanted a monorepo for productivity reasons. It's cheaper to fund this project than reduce engineering productivity by even 1% at the scale of FB.
Google did exactly the same thing, but hasn't open-sourced their tools.
Projects to drive incremental productivity make not a lot of sense for small companies, but become immensely valuable at very large companies. A 1% improvement if you have an engineering staff of even just 10k is worth 100 engineers, and FB is much larger than 10k.
Just as "inventing your own source control system" is a hobby, "being a very large engineering organization" is a hobby.
FB and Google started with infinite money spigots and used it to hire a lot of people who make a lot of commits; if you laid them off, or if you canceled half the remaining Google products they haven't already canceled, not all that much would change.
Google and FB each killed off earlier competitors who had similar spigots. The idea that you can simply discover an "infinite money spigot" once, sit back, and reap the rewards from it forever, is not borne out by anything in humanity's history.
> Imagine running a company and some engineer comes in and and says they want to develop a new source control system. I don't understand under what circumstances this is approved.
Under the circumstance that no existing source control system can handle a monorepo as large as Google's or FB's. The custom source control system is not a hobby. It is vital to both companies.
I think projects like this happen at two extremes of company size. Small companies, where the team is a few engineers empowered to do whatever without having to justify it to "adult supervision", and megacorps where there is actually a team, with engineers assigned, whose job is something super specific like "manage our source control".
How I wish there was a company that did something similar for the monstrosity that Android development and build tools are. Nonetheless it's excellent news. While I won't call Git a monstrosity, it was so untenable and obtuse that I only used it because everybody else used it.
Did anybody ask for a new git? And if they did, of all the entities in the world, why would Facebook be the guys we trust with that? It's already hard trusting Github.com.
Major pain point of monorepos: Merge-conflicts.
When a merge-conflict happens, YOU need to be an expert and resolve all conflicts the right way, across the many unrelated domains mixed together.
How does sapling solve that? It can't...
IMHO Sapling looks like "Git for dummies".
And Git teaches some pretty useful concepts, which are worth it.
I don't see why merge conflicts would be a pain specific to monorepos. The merge conflict arises when you and some other person modify the same parts of the same file. No matter the repo size the correct solution here is figuring out the intention of the other person by reading their change or talking to them and deciding how to combine it with your change.
The only reason which may make merge conflicts happen slightly more often in monorepo (vs constellation of small repos) is that not having the repo cloned locally is an obstacle to make the change. So some folks won't bother contributing repo that they'd have to clone first and instead they'd file a bug to the owners.
If you get a merge conflict in a piece of code, you must have touched that code. And if you were capable enough to modify the code to begin with, you should be able to solve a merge conflict around it.