Git Reflow (github.com/reenhanced)
109 points by jaybosamiya on May 19, 2016 | 79 comments



The "squash and merge" trend with git bothers me, and perhaps I'm "doing it wrong" but it just doesn't capture what I need a commit history/git-blame for. Usually, I don't care what feature a line code was for. I want to know why a developer thought that was the right change. And to get that visibility I tend to make lots of commits, treating commits almost as out-of-band comments that don't clutter the file/repo.

When I think I'm done with a feature is exactly the time that metadata becomes relevant! Why would I want to lose it?

If anything, I wish IDEs would integrate git-blame more into my visualisation of a file.

But there are so many people into it, that I feel I must be missing something obvious and it bothers me.


> But there are so many people into it, that I feel I must be missing something obvious and it bothers me.

As someone in favor of squashing, I can say that I don't want to see things like "oops, reverting last commit" pop up in my git history, especially if I'm browsing history or bisecting a bug. That's noise - useless data. OTOH, commits should be the Minimum Necessary Change to accomplish a well-defined goal. The code itself should always be clear on what it is doing, otherwise it's badly written. If it's "deep magic", then comment it in the code as such.

That being said, I do think that reasoning for why a change was made, at every level, should be in the commit message. I'm also not a fan of merge, but prefer squash+rebase.

In all cases/workflows, it can be abused, and people writing bad code, bad comments or bad commit messages will do so until you can make them care to do it better. There's no silver bullet.


> As someone in favor of squashing, I can say that I don't want to see things like "oops, reverting last commit" pop up in my git history

There's a middle ground between squashing and leaving a load of disorganised crap in the history. Rebase before merging. It gives you a chance to clean up the rubbish, but it doesn't force you to squash an entire feature's work into a single commit. You can preserve the logical changes without letting the crap into your history.


I agree here. Be a good steward of your commits, and let `rebase -i` be your tool for that. A bunch of "err, try this instead" should be washed away, but it doesn't help to commit "Add Huge Feature" +10,000/-2,000 because "I should squash my feature"


If you're putting up 10k lines of code in a single review, you're doing it wrong anyway. Why wouldn't you have multiple reviews that each add different parts of a feature and link to the same ticket?


If I wanted to review 200 lines of changes, then I would still prefer to see it as a series of patches, each maybe with 1-20 lines changed and each with a clear purpose.


That's what things like git rebase are for (nice workflow with git commit --fixup + git rebase -i --autosquash). Then you clean up your fixes, while still keeping valuable history. With squash merges you are throwing out the baby with the bathwater.
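For anyone who hasn't tried it, a minimal sketch of that workflow (the file name and placeholder SHA are made up):

    # a later fix that logically belongs to an earlier commit
    git add lib/parser.rb
    git commit --fixup <sha-of-the-commit-it-fixes>

    # before opening the pull request, fold the fixups back into
    # the commits they amend
    git rebase -i --autosquash origin/master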


You should definitely squash away fixes to commits that are part of the same commit series you're submitting; you don't want to submit a broken patch followed by a second patch to fix that broken patch.

But as part of your point that commits should be the "minimum necessary change", it makes sense to send a logical series of patches that iteratively change a project to implement a new feature (e.g. adding new infrastructure), rather than one big patch to implement that feature.


> The code itself should always be clear on what it is doing

Yes, but WHY it does that may not be obvious. "Oh, this code checks for a funny condition, but why would we care about that?". That's what commit messages are for.


I agree, but here's their reasoning: https://github.com/reenhanced/gitreflow/issues/52

> When it really comes down to it, the only place we care about enforcing a particular style of commit is in the master branch. We don't care if you make a thousand commits to get there, the only thing we care about is the individual features that come in from each (small) pull request.

> And while the history is nice, the biggest advantage of using the squash merge is that over time, git blame becomes way more useful. You get to see for every line of code in your project, not only the person who changed it, but their commit in the full context of why that change was made, including an easy-to-reference link to the pull request and ideally (through the pull request description), a link to the ticket tracker. So we can tie any line of code all the way back to the ticket that caused its creation.

> And over time, that's all we really care about in the history. Who made this change and why was it made. Squash merging allows us to do that while still giving all of our developers the individual freedom to develop in the way that suits them best. To try and enforce commit styles in branches owned by other devs is to me, micromanagement that will go against the best results.


> but their commit in the full context of why that change was made, including an easy-to-reference link to the pull request and ideally (through the pull request description)

This is somewhat fair, but I feel it needs to be noted that even if you don't squash commit, if you look up a commit in Github, at the top of the page it will link you to the PR even if that commit is not the merge commit. I use this all the time to go from a random commit to a PR in our code, even though we do not squash commit. (We encourage, but do not enforce, an autosquash rebase against master; that is, history is kept, but you're permitted fixup! commits for really silly things like typos that we don't care to remain in the history, and it's left to your judgement what should be kept. The rebase cuts down on the amount of criss-crossing branches.) That said, I also use the individual commit message to, as the grandparent noted, figure out what a dev was — or wasn't — thinking.


I don't think history should ever be changed but there should be a way to view history as if a "squash and merge" or whatever you like had happened.


This!

There's value in both use cases: 1. Scrolling through the log to see an overview of the direction of the project on the level of complete features. 2. Being able to see exactly when, who, and why any single specific line was changed.

Squashing gives you 1, but throws out 2 on the way.


I like this a lot! Maybe a way to add some metadata for a "commit collection" in git that is collapsed into a pseudo-commit by default (for browsing, bisect, blame, etc) but with the option to drill down/expand into sub-commits.


git already does this. The message for the merge commit contains the overview of what's happening in the commits in the merge, and use `git log --first-parent` to only include the merge commit when one is encountered.
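For example (the merge SHA is a placeholder):

    # one line per merged feature
    git log --first-parent --oneline master

    # drill down into the commits a particular merge brought in
    git log --oneline <merge-sha>^1..<merge-sha>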

That said, I still prefer squash/merge myself.


This would give the best of both worlds. People preferring the squash style can view the log cleanly, and people preferring detailed history can get detailed history.


I think there's an inherent tension between your style, and the style that prefers pushing new branches immediately.

I personally like to create a new branch locally - sometimes I'm experimenting and get ahead of my commits, and then I'll make 3-4 commits in bite-sized concepts. Other times I'll commit something that seems like it will probably work, and then I realize it doesn't, so I'm able to revert/reset the commit (so the commit is deleted rather than seeing a commit and then a revert in the log). And once I'm close to done, I can even rebase so my clean commits are all in a row. Then I push my branch.

I lose all that flexibility as soon as a team's process demands I push my branch as soon as I create it. I can no longer rebase (I still don't understand the guides that explain how to use rebase after pushing), reverts add noise to the log, merges from master interrupt the flow, etc. So in that sense, I can see the allure of a squash-and-merge to master.

I just don't think that pushing an empty feature branch takes advantage of the benefits of using git.


Push after rebase is safe and easy if:

You have protection on your remote trunk branches against force push.

You have git configured to push only the branch you are on, to the same name on the remote.

git rebase; git push -f;
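Spelled out, that's roughly the following; the protection against force-pushing trunk lives on the hosting side (e.g. GitHub's protected branches), not in git:

    # push only the branch you're on, to a branch of the same name
    git config --global push.default current

    # on your own feature branch
    git rebase origin/master
    git push -f   # --force-with-lease is a less dangerous variant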

It is HARD, DANGEROUS and SHITTY under other configurations.

<3 Git.


As long as no one else is using your branch as well; if they are, they'll hate you.


Yeah, basically as soon as you push, then there's no guarantee that someone else hasn't checked it out. That's how I see it anyway.

It'd be nice if git let you push (for backup/protection) without letting other people see it or check it out yet.


It's a fairly common practice to prefix branches only intended for your own work with your initials and a slash, e.g. jd/my-work-in-progress. You can't guarantee that nobody else will work off it, but it's their own fault if they do.


We treat branches for tickets (prefixed with the ticket ID) as owned by the owner of the ticket: feel free to check out, but if the owner changes it underneath you, tough.


There's no reason you couldn't rebase your feature branch work atop the commit it shares with the master branch, and push it. You would not be rewriting anything that the remote knows about.

Another option you might consider, if you need more flexibility, is doing your work in a branch off of your `feature-branch`, e.g., `feature-branch-wip`. When you're happy with your progress on `feature-branch-wip`, interactively rebase atop `feature-branch`, merge in, and push. Their process is satisfied, and they'll never be the wiser.
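Roughly (the exact rebase target depends on your setup):

    # messy, frequent commits happen here
    git checkout -b feature-branch-wip feature-branch

    # when you're happy with it, tidy up and fold it in
    git rebase -i feature-branch
    git checkout feature-branch
    git merge feature-branch-wip   # fast-forward
    git push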


But leaving a day's work locally on your laptop is dangerous, no? You have to organise some other backup system, instead of pushing to the central, RAIDed, backed up repo.


I suppose you can always set up a private bitbucket as a second remote and push to there. It might be against your team's policy, though.


The way I see it, I don't want somebody else to do the squash for me. It's my responsibility to make sure my commits are logical. I do as many rebases, fixups, squashes as necessary and provide a logical set of commits. I want the merges to preserve that as it is. But as others have said, there are plenty of really valid reasons to alter history as one deems necessary.


I'm with you. History is mostly useful after the fact to understand the details, not the big picture.

I wish git had a first-class model of a milestone-ish block of commits, so that the detailed commits and the feature milestones are disambiguated. I try to do this with merge points, but it doesn't seem to always work out, and since it's not native, it depends entirely on convention.


I get close to the milestone-ish commit by opening feature branches and then merging with no fast forward. All commits in the main branch are merges, and all of those are features. The feature incremental commits go in the branch.
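In practice that's just (branch name invented):

    git checkout master
    git merge --no-ff feature/reporting   # always create a merge commit

    # or make it the default, so a fast-forward never flattens a feature
    git config merge.ff false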

It's ok, I'd just like to be able to apply this structure to stuff like bisect or blame.


Doesn't `git blame --first-parent` work for that? From my quick tests, it seems like it shows the merge commit if you have such a structure (`--first-parent` also works for `git log` etc)


Totally does (though your maintainer has to care about first-parentage order, which they should, and which should be enforced by the software but isn't - see Junio's blogpost "fun with non-fast-forward", or "fun with --first-parent" for more basic info)


And this: http://bit-booster.blogspot.ca/2016/02/no-foxtrots-allowed.h...

(p.s. your old comment on "git branching models" that starts "Not another one. All good git workflows are different..." was hilarious/awesome - https://news.ycombinator.com/item?id=11193048.)


Thanks. Git superstitions bug me a lot because it shouldn't have to be this way.


Thanks! It does. Learning something every day.


Yeah exactly -- if git knew about a milestone point, then "git bisect" could tie into it. That in and of itself would be fantastic.


Agreed, I'd much rather have the context of the commit message than the context of an entire PR, which could be a combination of 20 discrete changes, each with their own reason for existing which is explained by the commit message. If I need more context, which almost never happens, I'll just go check out the PR on GitHub.


It's a good idea when you're working with small changes. If I'm working on fixing some bug, it should probably be one commit and done. But sometimes you're working on a bunch of "testing123" "save where I'm at" commits - and you don't want those in the end.

And then if you're working on a larger feature, you have that feature branch and make these small one-commit merges into that branch. When you merge that larger feature - it'd be great not to squash those commits.


I agree, I wish some VCS would figure out how to do a "history of history". I hate every time in git I destroy history (delete a branch, do a force push / update), but it is almost impossible to use git without doing these things, particularly when contributing to another project.


We did figure it out, that's Mercurial Evolve:

https://www.mercurial-scm.org/doc/evolution/sharing.html



That's not the same thing as a metahistory. It doesn't tell you which commit(s) replace which commit(s). Mercurial Evolve, for example, will record if a commit was split into several commits or if a commit was folded (squashed) into a single commit. Thus you can trace the history of a commit as it, well, evolves. You can't easily do this with git reflog, because it doesn't store that information.
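For the curious, a rough sketch of what that looks like with the evolve extension enabled (output elided):

    # ~/.hgrc
    [extensions]
    evolve =

    # rewriting a commit marks the old version obsolete instead of deleting it
    hg commit --amend -m "better message"

    # show the obsolescence history of the current commit: which revisions
    # it replaced and how (amended, split, folded, ...)
    hg obslog .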


> because it doesn't store that information

Is there something about the Hg architecture that makes it easier to plug in something like Evolve than with Git? Evolve isn't enabled out of the box on Mercurial either.


All of the actual infrastructure for evolve is actually in hg core. This is the obsolescence markers and the logic for hiding obsolete commits. This infrastructure makes it so that all hg commands see a filtered set of commits if any are obsolete and "unreferenced" (hg doesn't really find commits by referencing, but the logic for when an obsolete commit is hidden is similar to git's referencing logic).

The "only" thing the Evolve extension does is expose a UI for creating and manipulating obsolete commits, but it's not the only extension that does it.

So, I guess you could build Evolve for git, if you absolutely cannot be persuaded to use anything but git. You would need to build obsolescence markers, and the proper logic for exchanging them between clones. Getting this right has taken a lot of work for hg, but maybe now that the ideas are mostly there, it would be easy to replicate them.


> Usually, I don't care what feature a line of code was for. I want to know why a developer thought that was the right change.

I'm the complete opposite. When I do a blame I want to see a direct link to the feature. I couldn't care less about the specific commit that changed the line.


I've wanted both at different times, they answer different questions. The feature answers what it was added for, the commit answers the question of why.


If I want to cherry-pick a feature, I want fewer commits to search through, ideally one big coherent one with maybe a few smaller fixes later on.


With feature branches, I can merge the branch. That works just as well.


As one of the maintainers of git-reflow, I understand the controversy over squash-merges. Personally, on all the projects I've worked on with this workflow, I have yet to find any drawbacks when needing to use git-blame; although that is not to say that there is no value in maintaining a full history, nor that it is our way or the highway. We like to keep all changes in context of a feature.

We have worked in environments that promote rebasing of feature branches, and while that may work well, it can lead to holes in the history of the review process due to the need to force-push.

That said, we are nearing a stabilized core API and have plans to allow for more flexibility in the process. If you are interested in following our ideas behind this, feel free to follow the issue we have open: https://github.com/reenhanced/gitreflow/issues/53


Another "I like to deliberately lose information to no benefit because I'm bad at git" 'workflow' hits Hacker News.

Something is deeply wrong with the ecosystem when people want to do things like this!


"bisect". Bisect becomes useless if not every commit compiles (or local equivalent). Bisect is a critical feature, even if I don't necessarily use it often, because when I need it, I need it.

As long as I can bisect, I don't much care about the details. But this is definitely not compatible with three dozen commits mostly consisting of "oops didn't compile" and "forgot comma" and "fix syntax erorr". You've gotta do something about that.
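One way to enforce it before pushing, assuming your build is wrapped in something like `make test`:

    # replay the branch and run the build after every commit;
    # the rebase stops on the first commit that fails
    git rebase -i --exec "make test" origin/master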

The way some people talk, the only acceptable history is an asciinema [1] recording of the development process with a microphone recording the developer's mutterings as it goes. The question isn't about what information you "lose" but about what you keep, because you must discard the vast bulk of it. We're arguing here about whether we chuck 99.97% or 99.99% of it.

[1]: https://asciinema.org/


Bisect also becomes useless because of large commits. Even if you find the commit, the change it introduces can be so large you don't get much improvement.

Git allows you to clean up your feature branches to prevent these kinds of fix commits.

Look at git.git. They don't require people to squash all their commits into a single patch, but still every patch should be compilable.


"As long as I can bisect, I don't much care about the details."

If I can't bisect, I don't approve. So, by simple logic based on the premise you supply, I also disapprove of large commits.

My point may not be what you expected prior to reading.


Then your comment makes no sense in context, because you replied to my complaint that people lose information by doing these giant squash merges and having no nontrivial graph structure in their main project history.


There are positions other than the extremes. My position makes sense, yes, in context too.


Not saying that you should keep everything, but your argument does not take into account "git bisect skip", which works both manually and automatically.

https://git-scm.com/docs/git-bisect.
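The "automatically" part being `git bisect run`, where the script can exit with 125 to say "can't test this commit, skip it". A sketch, with a made-up good revision and build/test commands:

    git bisect start HEAD v1.0
    git bisect run sh -c 'make || exit 125; ./run_tests'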


I never liked git bisect skip, because the bug you're tracking down somehow always manages to be in a set of 3 or 4 commits, of which you can't build 2. Maybe I'm just lucky.


You can't blame git bisect skip for that.

Note also that many other commits that could not build did not contain the bugs ;-)

(Speaking of luck, I worked with a codebase where the bug was always inside a jumbo commit involving at least a hundred files)


I know, but I prefer codebases where skipping a commit isn't necessary -- because it always fills me with dread. ;)


Some information is worth losing. Personally, I get exactly zero value from "Fixed a typo", "Fixed that errant semicolon", "fixed tests broken 3 commits ago" commit messages that come hand in hand with a hard and fast "never rewrite history" policy.

If that information is valuable to you, great! This is why we have several ways to do things. The trend you're seeing simply seems to indicate (mildly indicate, at best) that a larger percentage of HN readers prefer to squash and merge.

Each to their own.


I think people are talking about different things. No one wants to see any "oops typo" commits. Those are squashed/amended in the feature branch in all sane workflows. The question is only if you represent each feature with ten commits in the feature branch as one commit in master or as 2 or 3.

If a typical feature starts with some cleanup/refactoring, I don't mind seeing that work separated from the new implementation in master. Some would consider a pure refactoring noise that shouldn't be in master.

There seems to be an argument too that unless you have a policy of squashing completely then the outcome will be noise and not enough squashing. That depends on the people of course.


>No one wants to see any "oops typo" commits.

That's what I thought but it's not quite "no one". There are some who'd love to have all git commit history, and if it was possible, the entire editor text buffer histories and keystroke logs of other programmers' work. This was my reply to it:

https://news.ycombinator.com/item?id=11408221

The post I replied to qualified it with "this may be a minority position" but his comment happens to be the topmost comment so lots of HN voters agreed with him.

Based on the repetition of this "git rebase" topic, I can only guess that roughly half of git users want to see all git commits with "oops, stupid typo", and the other half don't want to be inundated with meaningless noise. I really have no idea though.


> The post I replied to qualified it with "this may be a minority position"

Madness.

I did qualify my suggestion that no one wants "oops typo"-commits with "in any sane workflow" :)


Very much this. I keep saying: I. Don't. Care. About. Every. Little. Sneeze. A. Developer. Had. On. The. Way. To. Closing. A. Ticket.

See how annoying that is? That's what it feels like to me to read non-squashed commits.


Why does everyone seem to insist that there's only two options?

option A) SQUASH ALL THE THINGS
option B) HISTORY IS SACRED AND HOLY

We just make sure that the developer rebases and squashes the meandering micro-commits into parent logical units before merging. This gives us both sensible logical commits, and avoids monster commits.
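Concretely, that's just the todo list `git rebase -i master` hands you, with the micro-commits marked for squashing (SHAs and subjects invented):

    pick   a1b2c3d Add report model and migration
    squash e4f5a6b wip
    squash c7d8e9f fix failing spec
    pick   0a1b2c3 Wire the report model into the API
    fixup  3d4e5f6 typo

Two logical commits come out the other side instead of five.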

I spend enough time spelunking through history that I dread seeing something like this when I need to track down the context for a particular change

   client/something.js   | 114 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 critical/something.rb            |  41 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------------------------------------------------------------------------
 gulpfile.js |   7 +++++++
 lib/stats.erl            |  24 ++++++++++++------------
 api/somethingelse.js   | 114 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 critical_api/another_thing.rb            |  41 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------------------------------------------------------------------------
 webpack.js |   7 +++++--
 lib/mapreduce.exs            |  24 ++++++++++++------------
 client/user.js   | 114 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 critical/security.rb            |  41 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------------------------------------------------------------------------


I know the question is rhetorical, but the answer is that many people don't understand how to use git, and the preferred way of coping with this is to dogmatically declare some git feature (either branching/merging, or the various operations that fall under "rebasing") to be "Beast. Defiler. The source of all my pain."

But only people who are bad at git ever post things on the internet about git! (with the exception of Junio's blog)


Where is information lost? Delivered branches aren't deleted, so full history information is available - it just doesn't clutter trunk by merging in every commit.


This workflow explicitly deletes delivered branches.

  If your workflow looks like this:
  - Create a feature branch
  - Write great code
  - Create a pull request against master
  - Get 'lgtm' through a code review
  - Squash merge to master
  - *Delete the feature branch*
As far as I'm concerned, commits should be rebased and squashed into logical units on the feature branch before merge, that's the responsibility of the dev. Squashing them all into one monster commit feels like a terrible idea.


Whoops, yeah, you're right. I was looking at the screencast and didn't see an explicit deletion, and mistook that for there not being one at all.


From the README:

    $ git reflow setup
    Please enter your GitHub username: nhance
    Please enter your GitHub password (we do NOT store this):

    Your GitHub account was successfully setup!
That implies that the username is actually stored somewhere. Is it stored locally or on some reenhanced.com server? The README should be very clear about what exactly gitreflow stores and where.


From the README: "On your first install, you'll need to setup your Github credentials. These are used only to get an oauth token that's stored in your global git config. We use the Github credentials so we can create pull requests from the command line."


It probably uses the password to create a token. The token is then stored on your computer. This allows the app to get access to github but also makes it easy for you to revoke the token. See https://github.com/settings/tokens


Perhaps they only use the password to retrieve an OAuth access token which they store? At least that's how the `hub` command line tool handles it afaik.


This seems super useful- it matches my team's workflow, I'll def suggest we try it. Squash merge ftw


I'm really curious about it. One of the purported reasons, "it makes git blame more useful", is pretty silly, unless you never have anyone fixing typos or reformatting code or whatever.

git blame is already an approximation (because git, and well, everything, does not record what you did for real, only the smallest set of binary delta instructions you must execute to produce file version 2 from file version 1).

So this is essentially "this one git command sucks, so we are going to destroy all of history to make its output slightly better in a few cases", instead of "hey, we are going to produce a version of blame that identifies the kind of info we care about"


>yfw you type `git blame --first-parent`


I'm unhappy with the approval process being a simple search for "LGTM". I wish GitHub pull requests had actual support for a review process, e.g.:

* open issues to address

* review state, such as "changes requested" or "approved" (along with users that are in each state).

We've been using Phabricator's[1] Differential tool for code reviews and it feels superior to this process, but it would certainly be nice to have an all-encompassing solution for this.

[1] http://phabricator.org


An actual flag wouldn't be bad, but I know that I've sometimes given out conditional approvals in a code review (i.e., "if you change this, then I approve; if you don't agree, then we should talk"). The idea being to remove a round-trip that would otherwise waste the reviewee's time. (This of course implies a certain degree of trust that the reviewee makes the change as you desire, but in practice I find this isn't a problem.)


Whenever I'm working on a big patchset, I always make sure I add a checklist. Unfortunately nothing forces me to do this, so some people make huge patchsets and there's no history of what was changed or fixed in that patchset. This makes review and maintenance quite frustrating.


Created something very similar in-house for a job a few years ago; we could accept a ticket, it would create a feature branch, update the ticket and comment on it that it was being worked on, then when we submit, it would update the ticket status - assign it to a reviewer, squash commits, create a pull request and comment with the link.

This project seems to be done much cleaner though and in a more abstract and reusable manner. Well done!


So...it's Gerrit's workflow for GitHub?



