Git email flow vs. GitHub flow

rakoo · on March 30, 2021

I still don't understand why, for projects and similar topics-based discussion, we don't use NNTP instead of Email. Email is good for addressing specific people only, and you're supposed to have been part of the conversation since the beginning so it does apply to personal correspondence but it scales horribly bad for groups of people:

- there is no included history. You need to manually download hand-crafted archives from a bespoke endpoint but then you need to import it all. If you want to quickly read some old message there needs to be a relay to a public inbox type of thing which is a hack

- mailing lists software exist but they are another hack. There is no standard for subscribing or unsubscribing

- projects often have multiple lists anyway so they need to install them all

All of these issues are solved with a network of NNTP servers, and even more: automatic backup of posts because that's how NNTP works, messages are shared accross the network so every message has a specific URI (https://tools.ietf.org/html/rfc5538) and can be addressed directly and retrieved on-the-fly if needed

progval · on March 30, 2021

Not disagreeing with your overall point, but:

> - mailing lists software exist but they are another hack. There is no standard for subscribing or unsubscribing

There is a standard for unsubscribing: https://tools.ietf.org/html/rfc8058

And for subscribing, you can often open a mailto: link.

_pfxa · on March 30, 2021

No "good" NNTP clients, maybe besides Gnus and slrn, which are rather obscure.

Gmane was relatively popular, and it was great but went down for a long time, which probably further obscured NNTP. IDK if everyone who used to use it is aware that it's up again, now at http://gmane.io/

u801e · on March 30, 2021

> No "good" NNTP clients

Thunderbird works on most platforms.

> Gmane was relatively popular, and it was great but went down for a long time

Some mailing lists are now archived using public inbox[1]. There are several ways to access it including via NNTP.

[1] https://public-inbox.org/README.html

_pfxa · on March 30, 2021

Didn't know Thunderbird did NNTP, thanks!

Can you get your own mailing lists on public-inbox, or would you have to self host? I'd love to self host git and a mailing list but setting up mailman/sympa/etc. seems to be too big a task. I was thinking setting up a wiki as a replacement for my small personal projects...

arunc · on March 30, 2021

DFeed is the best NNTP clientI have come across https://github.com/CyberShadow/DFeed

leephillips · on March 30, 2021

Those are both definitely good clients. And the idea to use NNTP is good, too, but has almost no chance of taking off, for social reasons.

u801e · on March 30, 2021

I've brought up the same point in a number of threads that have brought up mailing lists over the years. In the specific case of mailing lists like the ones used for Linux kernel and git development, I believe many participants host their own SMTP instances and use them to send and receive messages and have had those set ups for a long time.

If they were to switch to NNTP, they would either have to:

1. Create a new group on usenet (and have to somehow filter out all the spam posted to it or everyone will have to keep their NNTP client kill file up to date)

or

2. They each would have to host their own NNTP server and each one would have to add enough peering servers (incoming and outgoing) to ensure that articles are propogated to them and articles they post are propagated to others

or

3. The maintainer would have to set up a authoritative NNTP server and allow read access, but then have to add accounts for those who need write access (in order to allow them to submit patches).

Continuing to use SMTP allows them to essentially choose option 2, but not have to set up peering servers since DNS MX records handle propagating the emails they send.

That said, having a NNTP mirror of a mailing list like gmane[1] and public-inbox[2] provides the best of both worlds. You get the benefits of NNTP for reading the mailing list, but get the benefits of propagation via SMTP.

[1] https://gmane.io/

[2] https://public-inbox.org/README.html

WorldMaker · on March 30, 2021

Given gmane's funding crises over the years, it feels hard to recommend relying on such a service long term. (Though I've certainly relied on it over the years for a number of different reasons.)

What we almost need is an NNTP 2.0 that makes it easier for smaller federated groups. (Maybe this is a good use for ActivityPub and/or Matrix?)

(Public-Inbox is new to me and at first skim seems to cover some interesting bases here. I'm not thrilled at the AGPL licensing though.)

rakoo · on March 30, 2021

> What we almost need is an NNTP 2.0 that makes it easier for smaller federated groups.

We've had that with XMPP and Publish-Subscribe for years: if the project has a server with this XEP enabled, any authorized user (registered anywhere) can post any type of content.

WorldMaker · on March 30, 2021

PubSub alone isn't a great replacement for NNTP. (It's maybe a replacement for SMTP.) The reasons for an NNTP 2.0 (as mentioned above) are more about storage/forwarding, if you want access to the back history as opposed to just live updates. You can build such things on top of a PubSub protocol, as ActivityPub is at heart still a PubSub protocol, but has a lot more storage/forwarding concerns added on top, as an example.

rakoo · on March 30, 2021

XMPP Core is a replacement for SMTP. XMPP pubsub is much more than just pubsub, take a look at https://xmpp.org/extensions/xep-0060.html. There is everything about querying and forwarding, in this XEP and in https://xmpp.org/extensions/xep-0313.html. There is precedent for posting content and replying inside XMPP's pubsub as documented in https://xmpp.org/extensions/xep-0277.html; the same can be used for general discussions

WorldMaker · on March 31, 2021

Fair points. I wasn't aware of any of these tools, but then like the majority of the world I haven't actively used XMPP in years (which is obviously among its largest cons today).

rakoo · on March 31, 2021

My intent wasn't to diminish your initial comment, just that a protocol is a technical answer to a problem that is more probably a societal one. It takes time and effort to commit to another communication protocol, especially when it's not the core of what you're doing as a project. Maybe what is needed is not just a protocol but proper tools to make it easy to use (and the bar with the simplicity of email is very high)

rakoo · on March 30, 2021

public-inbox seems to be the best of both worlds indeed. Interestingly it fits your option 1: posting is open and the admin is in charge of handling spam.

But public-inbox goes way further with other means of pulling, and the easy and trusted replication gives it the same status as the code: the repo is the source of truth, not the email endpoint. I could see projects using that and, say, gitolite as the two main bricks for running a project. Now all we need is a simple CI/CD system that is designed to run on its own rather than in a specific ecosystem

maeln · on March 30, 2021

> If your main complaint is that email is horrible to work with, the issue is most likely not email, it's probably your mail client.

This is definitely talking from someone who never had to write software that deal with emails. The reality with emails ? They are an horrible format. There is several RFC describing the "email" format, the usual reference is rfc2822, but technically, you can find other rfc that relate to it (ex, the MIME rfcs: RFC 20452, RFC 20463, RFC 20474, RFC 20485, RFC 2049, RFC 68388, RFC 4289, ...). There is a lot that left to be desired in those RFC in term of specifications. They tend to gloss over some part and also allow several format for retro-compatibility.

But then, even the part that are clear and well defined are not respected. I wrote software that dealt with emails for a company that receive several thousands emails on several mailbox everyday. The number of non-complying email that we received that broke our implementation was staggering. Nobody respect the date format, that is standard any more. Then we had some weird stuff, like fields that were mime-encoded even though the spec explicitly say it must not be, attachment metadata that were written completely wrong, headers that made no sense, ... And this was not from your small email provider, it was emails that were coming from a gmail mailbox, outlook, yahoo, ...

Oh and MIME is the best footgun in existence. It allows you to do anything and so of course, you receive anything, including mail-in-mail-in-attachement-in-mixed-content-in-multipart. And you better hope that the boundary are not bugged.

I wrote HTTP servers, IRC clients, BMP parser, and many more format. Nothing ever came close to the mess that is email. Anything is better than email, that is not the clients fault, it's the format fault. We just keep using it because of network effect. But you can bet that any other communication protocol / format sucks less than email.

toomanybeersies · on March 30, 2021

Email, in some iteration or another (starting with RFC524), not only predates the internet, but predates TCP. The only older standardised communications systems still in widespread use are the telephone and the postal systems.

It's an unfortunate quirk of how the internet has developed that email has stuck around, like postal mail, as long as you put the correct address on the outside of the envelope, you can put almost anything inside.

RFC 822 makes it clear this is a feature:

> In this context, messages are viewed as having an envelope and contents. The envelope contains whatever information is needed to accomplish transmission and delivery. The contents compose the object to be delivered to the recipient. This standard applies only to the format and some of the semantics of message contents. It contains no specification of the information in the envelope.

MartijnBraam · on March 30, 2021

I actually have written email software before (and dealt with the horrors of IMAP and MIME), but that's no tthe point of that sentence, it's about email management and not having your email client do stupid things. not about using the protocols

maeln · on March 30, 2021

> it's about email management and not having your email client do stupid things. not about using the protocols

The format / protocol do impact and limit what the client can do. In a funny way, thanks to MIME, you can technically do whatever you want with emails, but that will be a poorly organized and structured whatever. In many way, emails are like HTML : A way to present information, not to structure and/or organize it. And therefore, there is only so much the client can figure out about the email. Thanks to Thread-id and the recipient list, you can group emails more easily and more relevantly, but that is about it.

megous · on March 30, 2021

For a suggested development workflow, it doesn't really matter that much since everybody is supposed to be using simple plaintext messages. Even patch content is just part of the body.

Otherwise, yes, email message format is a mess. I also had a pleasure of writing a MIME mail message parser in C. But it was quite fun to try to do it in streaming fashion with no dynamic memory allocation allowed.

Marazan · on March 30, 2021

Once, many years ago, I started writing a library for dealing with IMAP mailboxes.

My take way is I have no idea how on earth people have actual working IMAP implementations.

tonyarkles · on March 30, 2021

My first attempt about a decade ago resulted in me sending a patch to Ruby mainline :D

leephillips · on March 30, 2021

Email sucks less than any other communications protocol. The quote from the author is exactly correct. Users hate email because they use broken clients, like Gmail. Using a real client like mutt (https://lwn.net/Articles/837960/) is a pleasure; for one thing, it threads conversations correctly, unlike Gmail and almost everything else.

maeln · on March 30, 2021

I used mutt, thunderbird, apple mail, outlook, gmail, sogo, roundcube and they pretty much all have the same issues. Yes, some make use of Thread-id a bit better than other (I like the way apple mail show thread and thunderbird is "okay", although sometimes it handle edge case in a weird way), but overall, they cannot fix fundamental flaws in the format in itself: Emails are hard to search, the threading model is simple but quickly turn into a graph with many leaf which become a mess to properly show and follow along, again, a lot of emails are outright broken and then it is up to the client to try to show what it can, the dates should not include timezone which is dumb so everybody do it anyway but follow their own standard, so you are not sure your client will correctly parse the date, files are encoded in base64 for hell-sake, ...

ori_b · on March 30, 2021

I'm currently touching the plan 9 email tools a lot, and recently rewrote the acme mail client. There are warts, of course, but the ones you've discussing are either related to the underlying communication and not the format, or are easy to deal with.

I'm not sure why you say emails are hard to search. Extract the text and put it into an engine like lucene. There are the usual small corpus problems, but these aren't dependent on format.

As threads are fundamentally a graph, so complaining that the structure matches the structure of the data is... Odd.. to me.

Including the time zone in the date is a small wart. Add an extra specifier to the format string and move on. Base64 is similar: a bit annoying, but decode and move on. And use RFC1652.

leephillips · on March 30, 2021

Indeed. Also, tools like grepmail make big mailboxes easy and fast to search through.

u801e · on March 31, 2021

> Yes, some make use of Thread-id a bit better than other

Traditional email threading was based on the contents of the In-Reply-To and References headers. The algorithm is described in RFC 5256 for the IMAP SORT and THREAD extensions[1] and used in several mail user agents[2].

[1] https://tools.ietf.org/html/rfc5256#ref-THREADING

[2] https://www.jwz.org/doc/threading.html

andrius4669 · on March 30, 2021

>RFC 20452, RFC 20463, RFC 20474, RFC 20485

>RFC 68388

what are these numbers, I don't recall RFCs going into 5 digits yet

maeln · on March 30, 2021

Yes my bad, it is from a bad copy-paste, you should ignore the last digit.

colejohnson66 · on March 30, 2021

How do you copy paste something so wrong that it adds a random digit to each number?

yebyen · on March 30, 2021

They aren't random at all, they are practically in order. It was probably a footnote reference to the bottom of the source article, with a link to each RFC at the footnote.

u801e · on March 30, 2021

The only way I think is that it was a numbered list and they pasted the RFC number before the list number so the list number was actually appended to the RFC number?

ryukafalz · on March 30, 2021

I like the decentralization aspect, but every time I've had to send patches by email... it definitely has not been as straightforward as the pull-request workflow. I suppose part of that has to do with the project I have experience with using Debbugs (which requires that subsequent patches in a patchset be sent to a newly created address rather than all of them being sent to the same email address). I might like it better if I were contributing to something using lists.sr.ht and could just set the default list address for the project and not worry about it.

I do also like being able to update an existing PR as I'm making rapid changes though, and just pushing up to whatever branch my PR references is also really nice. (It's, what, 5 keystrokes to push my latest changes to my remote for that branch? And that's assuming I don't already have the git status buffer open.)

Applying a PR is also straightforward, which matters especially for newer users. Uh, you press a button.

I'm all for decentralization, but this is the UX you're up against.

arghwhat · on March 30, 2021

> it definitely has not been as straightforward as the pull-request workflow.

I agree - there's a bit of a setup phase, stemming a from email having degraded into "that thing you see on gmail.com". https://git-send-email.io/ is nice for validating the setup when in doubt.

However, once the flow runs, I think it's more straight forward than a pull-request workflow.

> I do also like being able to update an existing PR as I'm making rapid changes though

Rapid changes to a PR is bad etiquette. Submit a PR for review, await comments, when received, prepare fixes and submit a new iteration.

Submitting changes after opening a PR that isn't to just to fix CI results is just noise, and suggests that you erroneously opened a PR for review before you were ready for review. Submitting changes little by little likewise suggests that you're still working on an open PR, making noise and and making it hard to see when I should resume my review.

Don't push anything to a PR you don't want me to review and merge. Hold off until then.

> Uh, you press a button.

Yeah, it's easy, but that button has lead to so many bad git habits. Including, say, not caring about their git history and expecting the reviewer to squash everything into ugly commits.

ryukafalz · on March 30, 2021

> Rapid changes to a PR is bad etiquette. Submit a PR for review, await comments, when received, prepare fixes and submit a new iteration.

I think this depends a lot on context. GitHub now lets you open "draft" PRs, which are explicitly unmergeable until you mark them as ready. If I'm unsure of the approach I'm taking, it's helpful to be able to show my team what I have and ask for feedback before I've done the work to polish it to a mergeable state.

Surely I'm not the only one who finds this useful, or the feature wouldn't exist.

Furthermore, it's also helpful when collaborating on something that's not code via a PR. For example, I've used git to collaborate on documents; as far as I'm concerned it's much better to have a tight feedback loop between the comments I receive and the edits I make to the document. There's not one "right" way to word things, and I'm not necessarily the only one working on that document, I just happened to be the one to start it; allowing other people to suggest small edits to the document piecemeal more closely matches the kind of collaboration I'm actually looking for.

WorldMaker · on March 30, 2021

Even before Draft PRs existed, I would often encourage especially junior devs to open PRs early and just include a note (in the description or the first commit) with a "Work in Progress" or "Not Ready to Merge" type comment and then either delete that (if in the description) or add a new comment with "Ready to Merge" or similar when ready.

It's great to see GitHub add this as a real tool now, but there's always been lots of good reasons to open PRs early, evolve them rapidly with discussion, and merge them only once discussion converges. It can be a useful workflow for some types of teams.

u801e · on March 31, 2021

> evolve them rapidly with discussion

That, in my experience makes it harder to review the change. To me, it's easier to do something like that while pair programming. Code review should be done once a final approach has been taken and only minor changes or factors that the implementation may not have taken into account should be addressed.

WorldMaker · on March 31, 2021

Most PR systems have multiple ways to "slice" the changes coming in to specific updates and giving you "bookmarks" for what you've reviewed prior and what is new in the latest update.

Just because it may not be your preferred workflow doesn't mean it isn't a useful workflow or that the tools don't already exist to make it a manageable workflow.

u801e · on March 31, 2021

My experience with Github Enterprise is that I'll make a comment and someone will commit a change, but the reference to the change is far removed from where my comment text is (especially if there are many comments made between my commit and the commit that addresses it).

And if there are many other such commits addressing different comments, it gets difficult to find the one that specifically addresses the code I commented on.

So, I essentially need to skim through the diff again to find the code that I commented on, open another tab to find my comment and uncollapse it if it's outdated according to Github, and switch back and forth between the tabs to see if my change has been addressed. If it hasn't been addressed, I need to start a new comment thread on the updated diff.

It's certainly not a very efficient workflow.

rectang · on March 30, 2021

Yeah, I've got strong git skillz and could handle the email workflow, but I prefer to work with branches rather than a bunch of loose patch files.

There are a couple things I strongly dislike about Github though. Number one: the default commit history display with commits force-linearized by date, which is just messed up and wrong when actual Git history can only be properly modeled with a topological view revealing branch lines of development. Give me `git log --graph`!

jlokier · on March 30, 2021

The GitHub linear view that shows no graph is a real problem on my team, because at least one team member has no idea when they are creating a complex, unbisectable graph with their merge commits and no-message submodule updates.

To them the timeline is just a linear sequence of what everyone has been working on recently, mixed together. They do understand feature branches, but then they randomly merge their branches in progress and the result is still a mess somehow.

To me, seeing the `git log --graph`, it's a difficult to understand mess with redundant micro-branches and micro-merges. Randomly unrelated file changes and even file deletions also occur - `git commit` might as well mean "copy my working directory into the repo". As a result, the main branch quality varies and cannot be trusted from one day to the next.

The difference between us is I actually look at the repo history, I look at my commits before I commit them. They appear to commit things without even knowing what's going in, and don't look afterwards, as it's a surprise to them when I complain about things like a file deletion.

GitHub's commit history view is part of the problem, because its presented as if it's the main, primary way to look at the history. It's not the problem, but it makes things worse. Better visualisation tools would help, but for now everyone has their own "pet way" of using Git, I lack the social capital to convince anyone to do better, and the gravity of GitHub as "the way" makes it difficult to get anyone to care about any alternative way of viewing the repo. "It's what everyone uses".

GitHub issues similarly suck as a project management tool; guess what we're using.

WorldMaker · on March 30, 2021

The GitHub "network graph" is most known for exploring fork activity, but even in a single repo scenario with no forks you can still use it as a cudgel to show some idea of the complex graph if you need to stick with GitHub provided visualization tools. (It doesn't help that several UX redesigns have done nothing but bury the network graph, but it's still around.)

You might think about investing in a copy of a client that puts the graph up front for that developer, such as GitKraken.

Though I've actually had good success with a tool that is free and already out of the box in most git installs: gitk. gitk is ugly, but it gets the idea across and is easier to teach most developers to use than trying to get them to remember to add the --graph flag to git log. (Though adding a good alias for `git log --graph` or some further relative like `git log --graph --oneline` and encouraging them to use the alias instead is also another option.)

ClumsyPilot · on March 30, 2021

This, a thousand times this. I fail to understand how such an established and well funded project can lack somethong so basic, so core to the entire system.

<rant> Every time I see people doing a dance and paising billion dollar companies like netflix, uber or deliveroo, I check them out, realise that they lack most essential functionality:

Deliveroo didn't have Search(!) For restaraunts for like a year

Netflix will sell you a 4K plan, but if your system has 4K screen but does not satisfy their '4K requirements' it won't tell you that you aren't gonna get 4K or what you are missing.

Uber outright used to lie about time to arrive, in London all taxis were always 4 minutes away.

So I I often wonder, we pulled resources and finance in billions across the world, and this is the best we can do?

rakoo · on March 30, 2021

> I prefer to work with branches rather than a bunch of loose patch files

The two are not mutually exclusive. You can have branches locally with the changes you're interested in and create patches from master to the tip of your branch. Same as a maintainer: you can apply patches in a different branch, check everything is alright and then merge it

Patches only live in the mailing list and are not supposed to be manually handled on your machine

0x0 · on March 30, 2021

Gitlab has a great graph view

pabs3 · on March 30, 2021

There are lots of graphical history viewers, I use gitk mostly and git-big-picture to get a graphical overview graph without individual commits. The gitg GUI from GNOME is fairly good too, and of course there are better tools for proprietary operating systems.

rectang · on March 30, 2021

For local development, I'm actually content with what I get from the Git command line interface: I use a slightly customized variant of `git log --oneline --graph`.

The problem is that I am often browsing the history of repositories hosted on Github via the Github web interface, along with other interactions such as searching through issues and PR history. Cloning a repo to my local machine and firing up gitk is a pretty inconvenient interruption of the web browsing experience.

megous · on March 30, 2021

I once wrote a hook script for a company I work for that would send patch series e-mail when someone pushed a new tag into a shared repository, so I could comment on the changes just by replying to emailed patches. Cover letter was created from a message attached to a tag.

So if someone wanted a review of some changes all that was necessary was to push their current status to a tag, which is a single command in git. No git-send-mail setup needed.

It was quite nice middle ground, and didn't require me to use github/gitlab GUI. Being able to stay in neomutt for most communications was nice.

eeZah7Ux · on March 30, 2021

The email workflow is objectively, measurably simpler and it works really well for kernel development.

Look at the examples on sourcehut.

Yes, it takes a moment to get used to something different, but that's a human problem.

jillesvangurp · on March 30, 2021

It works well for people used to command line interfaces to email, mailing lists, and using cli tools exclusively. Old people in other words. No shame in that; I am one. But I'm well aware that I'm a minority. I mostly work with twenty somethings these days. Keeps me mentally younger than I have any right to be. Also forces me to keep updating my skill sets.

In any case, I tick only one of those boxes as I use a lot of cli tools (including git of course), aggressively unsubscribe anything that looks like a mailing list (never had the patience for the low signal to noise ratio and horrible UX of having to mentally parse deeply nested threads). NNTP was always superior to that and the last time I worked with that is 15 years ago.

I have not used a cli email client since the mid nineties. I considered outlook express to be an upgrade back then. And that wasn't all that great if you think about it. I also used Thunderbird for a while until dropping the notion of having offline email archives on my laptop/computer entirely. These days I use gmail, slack, and a lot of other chat tools. All cloud based stuff. Mostly work related emails are very limited and restricted to interacting with non technical people and alerts/notifications. It is not part of any technical process I've used in the past decade.

I actually used git apply patch back in the day when I was transitioning from subversion to git and was stuck in a team insisting on subversion. Pretty cool way to move commits around and it allowed me to work around some limitations with git-svn (no rebases in svn); but not for everyone.

The nice innovation of the pull request by Github achieves a similar flow but with a pretty UI, easy to use tools for commenting, line by line review, search, issue tracker integration, and a bunch of other things that IMHO stopped being optional on most projects. I kind of like the traceability of PR #4 closes issue #13 and successfully cleared CI & received a positive review so it is safe to merge.

Kernel development has historically catered to the needs of the core committers, most of which have been working on this project for decades and one of which happens to be the inventor of Git. And of course anything kernel related tends to be very cli centric to begin with. And finally they do process a very large number of patches. So their use is very valid of course and I guess any tool changes are very disruptive there given the large amount of stakeholders. The statistics on that project are in any case mindblowing. You can get a good picture of that via Githubs insights on Linus Torvalds fork of linux. I don't think he uses pull requests though. Is there an issue tracker for linux even?

ddevault · on March 30, 2021

This is a really nasty argument. "Old people" are (1) not the only people who understand mailing lists, (2) not the only people who like mailing lists, and (3) not "set in their ways", so to speak, or failing to "updating their skillset". There are plenty of "twenty-somethings" which use the email workflow. What a gross, ageist comment.

>The nice innovation of the pull request by Github achieves a similar flow but with a pretty UI, easy to use tools for commenting, line by line review, search, issue tracker integration, and a bunch of other things that IMHO stopped being optional on most projects. I kind of like the traceability of PR #4 closes issue #13 and successfully cleared CI & received a positive review so it is safe to merge.

There is nothing preventing these features from being done with email, either.

>You can get a good picture of that via Githubs insights on Linus Torvalds fork of linux. I don't think he uses pull requests though. Is there an issue tracker for linux even?

I wrote an article which can offer some insight into the Linux development process:

https://drewdevault.com/2020/09/02/Linux-development-is-prof...

jillesvangurp · on March 31, 2021

If by plenty you mean a vanishingly small minority of twenty somethings, then yes you are completely right. They exist. Just not in very large numbers.

I'm merely making the observation that it's mostly older generations (like yourself?) that are 1) used to/familiar with this way of working 2) actually prefer doing so. There's nothing nasty intended here. I think you are taking this way too personal.

ddevault · on March 31, 2021

I'm 27.

u801e · on March 31, 2021

> It works well for people used to command line interfaces to email, mailing lists, and using cli tools exclusively.

But don't most people use the CLI to run or edit their source code in some fashion?

alberth · on March 30, 2021

Sourcehut.org is as close to using “stock” git as you can get. It’s also email based flow.

See the following write up this 2018 write up on email vs GitHub flow https://drewdevault.com/2018/07/02/Email-driven-git.html

mehdix · on March 30, 2021

It's a while I've switched to sourcehut and learned the email workflow for the first time. It was easier than I thought, and I am very pleased with the outcome. Perhaps leaving megacorp mail clients behind had helped me to embrace this workflow faster.

hashkb · on March 30, 2021

And it's free; and the web interface has no JS, if you're into that sort of thing.

mehdix · on March 30, 2021

I assume you mean free software? Since sourcehut is intentionally a paid service for hosting content, albeit free during beta and for contributors.

hashkb · on April 5, 2021

Sourcehut.org vs the product itself. Nobody is forcing you to use the cloud product. No features are paywalled. The community will support you on their own time on the mailing list. And if you can't afford to pay for the cloud version, you can email Drew and he will help you.

mugsie · on March 30, 2021

For me the happy medium is something like gerrit. It works on the one change, one patch, one commit strategy, which helps with the ability to rebase for fast moving code, makes it easier to view CI results for a patch, and is slightly more user friendly for comments / reviews.

It also allows for a group of people to be co-maintainers (e.g. 2 people need to approve a patch before it get merged), and when tied into a CI tool like zuul-ci can help projects be sure the CI tests are testing the actual merged state of the project.

The downsides are people need to learn how to rebase, use git commit --ammend, and for the easiest submission install something like git-review or repo to submit patches. (it is possible to use gerrit without them, but remembering the syntax of git push refs/for/<branch> can be difficult.

brandmeyer · on March 30, 2021

The other big downside is that enforcing the unit of review to be a single commit instead of a branch tends to encourage over-large commits. Rebasing a series of related commits for gerrit is a major pain.

mugsie · on March 30, 2021

yeah - that is a pain in the beginning.

Personally (this is completely anecdotal, and not backed by any serious data), I have found it drives people to do multiple small commits, as they rebase a lot easier.

using something like git review means that the entire chain is rebased by default when you summit a new patch on top, and I have the `git commit --am -a --no-edit && git fetch && git rebase` in muscle memory at this point.

(the lack of ability to have PRs based off other open PRs is one of the biggest issues I have with the current github / pull request model, so I probably over compensate for that type of situation)

u801e · on March 31, 2021

How does Gerrit handle a group of commits in a branch? That it, if a feature takes several commits to implement and some of them depend on changes made in earlier commits in the branch, how do you ensure that the earlier commit is approved and applied to the main branch before the later commit?

mugsie · on March 31, 2021

it groups the branch into a chain of commits, and has dependancies between them, so you can approve a commit on top of the chain, but it will not merge until the ones it is based off merge.

brandmeyer · on March 31, 2021

My recollection was that you had to give each commit its own local branch name and both rebase and push them one-at-a-time with Gerrit. Are you saying that an entire branch of commits can be pushed and updated at one time?

mugsie · on March 31, 2021

yeah - at least with something like git-review [1] when you do git review on a branch with multiple commits, it creates a chain of commits in gerrit with one command.

1 - https://docs.opendev.org/opendev/git-review/latest/

brandmeyer · on March 31, 2021

> the lack of ability to have PRs based off other open PRs is one of the biggest issues I have with the current ... pull request model

At least in bitbucket, you can just set a branch under review as the target of a pull request and the UI "just works" for that case.

mugsie · on March 31, 2021

yes - the same for github, but when you go to merge it, it messes up.

brandmeyer · on March 31, 2021

> it messes up

Can you be more specific about the failure mode here? If we have a series master -> A_review -> B_review, and B_review is targeted to A_review in the PR tool, then a fast-forwarding merge of A_review to master just advances the master branch name and automatically updates B_review to now target master. No merge commits take place.

twobitshifter · on March 30, 2021

This is the first I’ve heard of sourcehut. I agree that it’s a better method. Sticking closer to text and staying away from vendor lock-in is the way to go from my perspective. I’m not anal about my git history, so I can use the GitHub or Gitlqv workflow, without problems, but this would be an improvement.

alberth · on March 30, 2021

Drew, who created Sourcehut, is very active on HN

https://news.ycombinator.com/user?id=ddevault

javitury · on March 30, 2021

Reading this article has changed the impression I had about git-mail-flow. Naturally, more questions came to my mind.

Is a git-mail-flow compatible with continuous integration? Yes, it seems to be

https://sourcehut.org/blog/2020-07-14-setting-up-ci-for-mail...

Is it possible to construct ergonomic workflows around a git-mail-flow? I suspect you could do it with notmuch and alot, although I wonder which tools do sourcehut users prefer.

ddevault · on March 30, 2021

I use an email client I wrote for this purpose:

https://aerc-mail.org/

Greg-KH has also written about his mutt workflow:

http://www.kroah.com/log/blog/2019/08/14/patch-workflow-with...

Though his volume and needs are probably different from most users.

IgorPartola · on March 30, 2021

The Linux kernel uses email. I can’t imagine that crowd does things the inconvenient way.

sofixa · on March 30, 2021

I can't imagine that crowd doing it anything but their preferred way.

Remember when a Debian maintainer quit due to the obsoleteness of the toolchain that the other maintainers flat out refused improve because "it worked" ( my google-fu is failing me, if someone finds it please post it)? I imagine things to be the same with the kernel ( not saying the toolchain is obsolete, just that if it were, they won't just change and jump to the latest X).

nethead · on March 30, 2021

This one: https://michael.stapelberg.ch/posts/2019-03-10-debian-windin... ?

sofixa · on March 30, 2021

Yep, thanks!

ddevault · on March 30, 2021

This seems to have everything to do with Debian and nothing to do with git-via-email.

detaro · on March 30, 2021

Nobody said it did.

ClumsyPilot · on March 30, 2021

Despite technical brilliance, this is one of the most closed-minded and stubborn crowds arounds. Just look at the reaction to adding case-insensitive functionality in filesystem in comments:

https://www.collabora.com/news-and-blog/blog/2020/08/27/usin...

swiley · on March 30, 2021

Case insensitive filesystems are a really bad idea though.

colejohnson66 · on March 30, 2021

But why? Why are they a bad idea besides “Windows does it and ‘Windows bad’”? Why is DNS case insensitive (example.com and EXAMPLE.com are the same), but not files (.Xauthority and .xauthority are different)?

Case insensitivity lines up with the average user’s expectations. For example, if I’m searching for a file, I want a case insensitive match. Because if I named a file “Resume.txt” and searching “resume” didn’t bring it up, I’d be pretty confused. Now imagine the average Joe trying it. Explaining to them case sensitivity won’t convince them it’s a good idea because “Resume” and “resume” are the same thing.

GrumpySloth · on March 30, 2021

Should that work different for different languages? Imagine the mess. Maybe the file names aren't even written in any human language. Should we use English rules for all languages? Why? What makes English so special that English characters would be normalized, but characters from the native language of the user won't? Wouldn't that trip people up?

Case-insensitive file systems are mostly advocated for by those who only or mainly speak English and aren't aware of how much variety there is in the languages across the globe.

And even in English "us" is a pronoun, whereas "US" is a country.

EDIT: another question: should the normalization rules be changed when the language changes? Break backwards-compatibility?

ClumsyPilot · on March 30, 2021

Thank you gor being exhibit A

If you bothered reading the link, all of these questions have been addressed long ago by folks are are more knowledgeable than either one of us.

The rules are spesific to each language, and are especially neccesary for languages thay have several alphabets. Without this functionality, efficient search is impossible.

This silly, forcefull and uninformed critisism is exactly the kind of behaviour i was talking about.

GrumpySloth · on March 30, 2021

I have bothered to read the link.

The point of those questions wasn't to see if there are answers to them. The point of those questions was to show that all answers to them are flawed.

Yes, the link provides one set of answers. (Except for the backwards-compatibility and changing language question.) But it doesn't solve the problem.

When you want case-insensitive search, it's the search tool's job to provide it. E.g. whenever you want to do a case-insensitive search inside a file, you pass "-i" to grep or click a checkbox in a GUI. You don't change the file system to normalize characters inside files.

From the link, it also seems like the main motivation for the change is compatibility with Windows software. In particular, it mentions that it isn't something that should be enabled globally in the file system. It really isn't a convenience for the user.

The article makes a good case for providing compatibility with Windows software. But not for much else.

Thank you gor being exhibit A

ClumsyPilot · on March 30, 2021

"The point of those questions was to show that all answers to them are flawed."

Whats flawed? Where is it flawed? You provide no arguments at all!

"When you want case-insensitive search, it's the search tool's job to provide it."

I dont know if you didnt read it, or missed it, but the article clearlerly explains that the tool above the FS cannot do such a search performantly, it has to happen at the FS layer

"You don't change the file system to normalize characters inside files."

Where does this certainty come from? Why do you offer no technical argument to support your position? Is this by Pope's decree?

Can you demonstrate a tool that can search efficiently in different alphabets, or when the same character is presentes by different unicode codepoints?

Reading replies here convinces me even more that you just picked 'something something windows' and react like a bull does to a red rag.

Not a single good technical counterargument has beed presented

GrumpySloth · on March 30, 2021

Ok, maybe this you will judge as something constructive: when the underlying medium is case-insensitive, your application cannot behave in a case-sensitive way. But frequently I care about case-sensitivity in my searches. I gave an example in my top-most comment: "us" vs "US". On the other hand, when the underlying medium is case-sensitive, the application can implement case-insensitivity on its own. I do it all the time. Sometimes I want to run "find . -name", sometimes "find . -iname", and the first one not because I forgot about the second.

> Reading replies here convinces me even more that you just picked 'something something windows' and react like a bull does to a red rag.

Completely missed. I appreciate a lot of design choices behind Windows and use it with pleasure. However, I judge this one aspect of it negatively. It's also a source of recurring problems with Git on Windows.

Windows itself has case-insensitivity largely for backwards-compatibility reasons (from the times of MS-DOS). The underlying filesystem (NTFS) itself is case-sensitive, it is the OS API that normalizes filenames, and is itself case-preserving, rather than case-insensitive, when it comes to writing files.

UPDATE: another point: I may want to have a directory with an image for every character of my alphabet with files named accordingly. With a case-insensitive filesystem I can't have an "a.svg" and an "A.svg" in the same directory.

cweagans · on March 30, 2021

Isn’t that an implementation detail of whatever tool you’re using for searching for files? I seem to remember most search utilities doing that already?

IgorPartola · on March 30, 2021

Why would you ever want a case insensitive file system?

detaro · on March 30, 2021

Click the link.

mixedCase · on March 30, 2021

So, for Windows compatibility layers.

ClumsyPilot · on March 30, 2021

So you havent read 90% of the article about languages with mutiple alphabets or efficient search, but you are willing to argue anyway.

Thanks for proving my point.

mixedCase · on March 30, 2021

I read it, even though you could be succintly making your points here instead of linking generic articles, and that's why I surfaced the only thing I considered valid.

The usecase you bring up, including the multiple scripts issue, is search. And it's not and should not be a filesystem concern which optimizes for a different kind of access from which you can build search on top of.

ClumsyPilot · on March 30, 2021

The article is not generic, it is spesific to the issue being discussed, and it explains the issue better than I can.

"And it's not and should not be a filesystem concern which optimizes for a different kind of access"

Why not? Is that by Pope's decree, or is there an actual reason for that?

The filesystem already has like 3 different APIs, with and without caches, sync and async, so clearly they do optimise for different kinds of access.

Can you demonstrate a tool that can search efficiently in different alphabets, or when the same character is presentes by different unicode codepoints?

mixedCase · on March 30, 2021

> Why not? Is that by Pope's decree, or is there an actual reason for that?

By the decree of it vastly increasing complexity and drastically changing the very generic problem a filesystem tries to tackle, instead of just isolating it to its own solution for when it's actually needed and the trade-offs make sense.

> with and without caches, sync and async, so clearly they do optimise for different kinds of access.

That are compatible with the previously existing patterns. Case-insensitive/mapped codepoint full-text search is a very, very different problem and for which you should reach out to the right solution.

And if you want it in your file browser, well nobody's forcing your operating system's human-facing file browser to base its UX entirely on file system primitives.

As for existing tools, I don't know, I don't care for them since I can just organize my files and find them with fd or use a mapping tag-based file-system like tmsu. But I would assume this is the kind of usecase KDE's Baloo or GNOME Tracker intend to solve.

PurpleFoxy · on March 30, 2021

I imagine it’s convenient to them but being difficult for everyone else is a feature. They don’t want people sending in random garbage code so the barrier to entry is kept high.

vxNsr · on March 30, 2021

Inconvenient to who? There a many things in Linux that anyone who doesn’t use Linux full time finds majorly inconvenient, a lot of the UX around Linux is only intuitive to the 45 year old graybeard. To us 30 year old win DevOps guys Linux isn’t actually that “convenient” out of the box.

edgyquant · on March 30, 2021

They were talking about development of the Linux kernel, not using the OS. But I also take issue with your “graybeard” comment. Unix has won out for a reason, I and many other who aren’t old hats prefer the flow of development and deployment on Linux/Unix to that of Windows.

vxNsr · on March 30, 2021

I’m sorry, I use that as a term of endearment. The GP appeared to me to be implying that those who work on Linux prefer convenience I was just remarking that convenience doesn’t always equal ease of use for most ppl.

koonsolo · on March 30, 2021

Especially in the frontend world it all seems to be unix command line.

tasogare · on March 30, 2021

> Unix has won out

Nop, it doesn’t. Linux, which is a Unix clone, has won on some category of computers (notably servers) but most unices derivativing from the original one have a marginal market share.

> for a reason

Price mostly, and access to sources.

Shared404 · on March 30, 2021

Mac is still certified Unix iirc, and FreeBSD powers at least Netflix, so it seems Unix proper is still sticking around.

lelanthran · on March 30, 2021

> To us 30 year old win DevOps

There's your problem right there. If you expect unix models to follow the particular broken-by-design failures of Windows on the server then that's the problem.

Unix models have their own broken-by-design crap, and it is different to the one you know :-)

Shared404 · on March 30, 2021

19 year old DevOps/Sysadmin here, gonna have to disagree.

I can work with Windows, but I hate every second of it. Meanwhile, Linux/BSD is intuitive and easy to get going.

Almost like what tools you know impact what you find easy.

vxNsr · on March 30, 2021

Can I ask what’s intuitive about a UX that hides all functionality behind cryptic commands that require reading the mind of the person who made them to know which three letters correspond to the acronym of the command you’re trying to run?

Shared404 · on March 30, 2021

Perhaps intuitive is the wrong word.

Once you are familiar with it, it is easy to continue using, and much faster than fumbling around in a GUI trying to find the magic button.

For most commands it's also easy to find the necessary subcommand via man or -h or whatever. The other big thing is scriptability, there's a number of things I find myself doing a few times a day, I can throw that in a script (about a minute to do for most of them) and now they take .5 seconds to do, versus waiting for a GUI to load/run -- Plus, now it's my stupid 3 letter acronym I need to remember :P .

chii · on March 30, 2021

I suspect that people who are used to working in small, integrated teams are more used to using the github flow (aka, a web tool to do code reviews which also integrates git commands).

People who are more used to a hub and spoke model - aka, a maintainer receiving tonnes of patches from many different people - would prefer the git email flow (it requires less work from them - patches that don't merge is pushed back to the contributor).

gocartStatue · on March 30, 2021

For small, integrated teams, github flow may be suboptimal. Frequent merging to master, aka trunk-based development, aka Continuous Integration is the way to go for me.

chii · on March 30, 2021

> trunk-based development, aka Continuous Integration

continuous integration isn't trunk-based development (where everyone merges into the same master branch)!

gocartStatue · on March 30, 2021

I love pasting that quote :)

"The idea that developers should work in small batches off master or trunk rather than on long-lived feature branches *is still one of the most controversial ideas in the Agile canon*, despite the fact *it is the norm in high-performing organizations such as Google.* Indeed, many practitioners express surprise that this practice is in fact implied by continuous integration, but it is: The clue is in the word “integration.”

[State of DevOps report 2016]

divbzero · on March 30, 2021

Thank you!

I’ve heard that Linus Torvalds disliked GitHub pull requests in part because they tried to reinvent the wheel instead of using Git’s native system. I wasn’t clear on what that native system was until reading this article.

properdine · on March 30, 2021

Understand where author is coming from - but doesn't squash-n-merge (newish github feature) solve the issue of needing to rebase and the issue of having too many merge commits?

Squash-n-merge has nice property of removing unnecessary local information that probably doesn't matter at a meta level (commits are nice when reviewing PR, doesn't matter much later)

fishywang · on March 30, 2021

(squash-n-merge isn't new on github, unless you are not talking about the same thing I'm thinking about)

Yes squash-n-merge is often needed in github's PR workflow because no one need those un-bisect-able fixup commits in the final merged master/main branch, and also they make the diff between different states of the PR more readable, but it comes with its own problems.

Main problem is commit message. As the contributor (the one sending out the PR for maintainer to review), you have no control on what the final commit message in the merged single commit is. The maintainer doing the merge decides that for you, and by default github generates that message by combining all the commit message titles (the first line of the commit messages) of all the commits in that branch, and that's almost never the good choice for the final commit message.

Another problem with that is the email in the final commit. When the maintainer use squash-n-merge, github uses your default email on file on your github account, regardless whichever email(s) you configured your git to use and associated with those individual commits inside the PR.

As a result, squash-n-merge is more suitable for contributors less familiar with open source contribution, for example people not yet realized the value of a good, concise commit message, and people don't have different email addresses for different projects. For advanced contributors, there's no wonder they would prefer force-push with rebase-merge when they are making contributions on github, because rebase-merge makes sure the exact state of their final commit is preserved, including commit message, email address associated with it, and gpg signature if they use that. But github's rebase-merge strategy has its own issues, as described by the author and more.

Ar-Curunir · on March 30, 2021

You can work around all of this as a contributor by squashing on your own end before the final merge.

fishywang · on March 30, 2021

That comes with all the problems with force push and rebase, bar the history during code review one.

For example this still has a commit message issue, just on the maintainer's side: As the maintainer if you are going to use rebase to merge this PR, that means you need to accept whatever commit message the contributor wrote as-is. Are you happy with that? If not, you can't even leave inline comments on that, and it's usually pretty hard to communicate and give feedback on how you want the commit message to be.

WorldMaker · on March 30, 2021

> un-bisect-able fixup commits in the final merged master/main branch

If you require PRs to create merge commits you get the nice world where git bisect --first-parent bisects at the PR level, you don't have to worry about the individual commits inside the PR/below the PR level when bisecting, but you still have that commit history "as-is" for deep archeological dives when you need it.

(And you can use --first-parent to cleanup git log and git praise too.)

u801e · on March 31, 2021

And those commits rarely provide useful information because they're of the variety where people fix syntax errors, add missing files, remove changes they didn't mean to commit, etc.

WorldMaker · on March 31, 2021

There's plenty of information in all those types of commits even if you personally don't find that information "useful". I've had to do the sort of archeology digs to figure out "what syntax errors did a build tool miss", "why is this type of file often missed to be added, and how often do we miss it", "what was still TODO in this feature effort that got removed at the last minute", etc. All of which needs information from those sorts of "low level" commits.

u801e · on March 31, 2021

In the instance where a file was missed and added in a later commit, then running git blame would show the sha1 referencing a commit that has a title that says something like "Added missing file". That's not going to tell me anything about why that file was added.

Instead, if you had a commit that explained what the file was for or if some of the lines in that file were added by a commit that explained the change and why it was made, then that would be useful history.

Many times, investigations start with running git blame on a file you plan to make changes to. The usefulness of commit messages associated with each line in a change and whether the diff associated with the commit shows a logical change rather than a fix for a syntax error is the difference between an investigation that leads to results versus one that leads to a dead end.

WorldMaker · on March 31, 2021

I already mentioned `git blame --first-parent` just a few comments up! You get the sha1 referencing a commit that has a title like "Merged PR #327". You can dig down deeper than that --first-parent level if need be, but you have the power of the git graph to show/hide details if when you do/do not need them.

u801e · on March 31, 2021

Does the --first-parent flag handle the case where the the line was change as part of a conflict resolution in the merge commit itself?

cryptonector · on March 30, 2021

Sometimes (often!) you want clean history in the upstream but also patches separated by bugs they fix, features they add -- issue/ticket numbers, whatever. And you may want regression tests to come before bug fixes, that way you can see the regression test failing, then the test passing after applying the bug fix. Different upstreams are likely to have different rules.

So squash-and-merge is a bad one-size-fits-all. Rebase is a much much better approach: you keep the history as submitted and you lose the useless merge commit. There's no "unnecessary local information" if the submitter did the work of cleaning up their history before submitting. That means doing interactive rebases locally to squash/fixup/edit(and-possibly-split)/reword/drop/reorder their commits -- this is something every developer should know how to do.

pabs3 · on March 30, 2021

Squashing commits into one mega-commit isn't great for future investigations of the commit history (code review, bisects etc). It is much better to create separate logical commits, rebase them and pull in the result, either as a branch fast-forward merge, or with a merge commit where appropriate.

earthboundkid · on March 30, 2021

When someone invents the git killer, it will have a feature called “subcommits” that will be blindingly obvious in hindsight.

oftenwrong · on March 30, 2021

If I've correctly understood what you mean, I've wanted this for some time now. A way to preserve history while adding a single, linear integration of changes.

renewiltord · on March 30, 2021

You get this by forcing merge commits for every non-single-commit change.

earthboundkid · on March 30, 2021

Sure, you can use git to do this, but the git killer will have it as an expected capability.

I also think that octopus merges are basically always a disaster because they can't be meaningfully reviewed and put your repo into an unknown state. Maybe there's some way to get the advantages of merge commits (preserve all history!) without the disadvantages (jumble all history!).

urxvtcd · on March 30, 2021

Not sure why you are being downvoted. You actually can use merge commits this way, by viewing the diff produced by

    git diff $merge_commit^...$merge_commit^2

when interested in whole change-set introduced by a branch or looking at individual commmits when interested in well, individual commits.

WorldMaker · on March 30, 2021

Yup, and you can use --first-parent to git bisect, git log, git praise to interact at the "macro-level" of those merge commits by default, and dive in to the fuller graph only as necessary.

properdine · on March 30, 2021

a year from now, are you actually going to want to test each individual change in a pull request, or are you going to want to test it as an entire unit?

I agree that code review you want smaller units but my experience has been that 1-2 years later, you no longer care about the individual units and instead you want the entire patch/PR all together.

pgaddict · on March 30, 2021

I'm pretty sure you want reasonable meaningful commits. On tiny projects it may not matter, but on larger projects it's definitely a huge benefit, because chances are you'll have to investigate a bug in that code, re-learn why it was done this way, etc. And maybe bisect the git history to find which exact commit caused the issue.

Which is why larger changes are often split into smaller patches that may be applied and tested incrementally. If you just merge the whole pull request as one huge patch / in merge commit, you just lost most of that.

pabs3 · on March 30, 2021

I definitely will want to do that, especially when bisecting a random bug that was introduced with one of the changes in that PR. The smaller the unit of change the better, as long as they are logically separate changes.

dboreham · on March 30, 2021

I think it's more about: in a year from now will you understand the purpose of a change to some code you're debugging? If the commit says "merged PR 2234", answer is probably not.

rectang · on March 30, 2021

Nirvana is:

• Setting `merge.ff=no` in git config to force merge commits by default.

• Creating a series of logical commits on `my-feature-branch`.

• Merging `my-feature-branch` into `main` with a bona fide merge commit.

• Using `git branch -d my-feature-branch` (NOT capital `-D`) to delete the feature branch safely and without worry, since `-d` only deletes the branch if the commits are present on HEAD.

• Using `git log --oneline --graph` to see a clean representation of the actual history.

masklinn · on March 30, 2021

> • Setting `merge.ff=no` in git config to force merge commits by default.

I'd rather `merge.ff = only` so git never creates a merge commit from under me. It's a big issue because of `git pull`, that thing should not exist.

Most git tools are wholly unable to deal with really merge-heavy graphs, too.

rectang · on March 30, 2021

A pull is just a fetch followed by a merge. So to solve this problem, just fetch instead of pull!

Then do `git merge --ff-only` and if it doesn't work, do the rebase or whatever else to resolve the conflict.

I did this long before I set `merge.ff=no`. I hate it when pull creates crappy graphs — it's something I try to help all my colleagues to avoid. I often wish that `git pull` didn't exist.

masklinn · on March 30, 2021

> A pull is just a fetch followed by a merge. So to solve this problem, just fetch instead of pull!

Of course, that’s what I do. But “git pull” is still a danger, and configuring merge.ff=only protects against that danger.

rectang · on March 30, 2021

Why is `git pull` a "danger" if you always use `git fetch`? The configuration setting for merge.ff only affects the local machine. It doesn't generally impact other developers.

(Unless you're doing something like setting the system gitconfig on a shared dev box, and setting merge.ff to anything other than the default would be really heavy handed in such an envronment.)

pabs3 · on March 30, 2021

I would only use merge commits when it is appropriate, like a commit series porting usage of a dependency from an old version to a new one.

rectang · on March 30, 2021

Well, there are different views of "appropriate". When you do team development and everything is done via feature branches, it's nice to have merge commits so that the integrity of the each feature development effort is preserved via a merged branch in the history. If everything is flattened, it's harder to see where the branches (standing in for development initiatives) begin and end.

You can't always get fast-forward merges anyway. Long-lived branches with merge conflicts are undesirable but unavoidable in the long run. At least some of the time, you're going to have merge commits even when your "appropriateness" test says there shouldn't be one.

pabs3 · on March 30, 2021

Do you at least agree that merge commits for single-commit PRs aren't "appropriate"?

rectang · on March 30, 2021

I don't feel strongly about the issue.

A good clean fast-forward merge of a single-commit PR is fine. But I've also worked at multiple jobs where every merge to the production branch created a merge commit and that's also fine. It adds a bit of complexity to the history graph, but it's not meaningless complexity.

If your commit history is majority single-commit PRs then having additional merge-commits everywhere would be noisy, so in that case it would be too much. I don't tend to work on actively developed projects that match that pattern, though. Most feature development involves multiple-commit branches.

WorldMaker · on March 30, 2021

Merge commits for single-commit PRs helpfully record which PR # was merged if you need to review/audit the PR sometime later, if nothing else.

u801e · on March 31, 2021

The original commit could be amended to include that information.

WorldMaker · on March 31, 2021

The point of using a "proper" merge commit would be to avoid amending/rebasing the original commit and allow the original commit to live as-developed in the final branch.

u801e · on March 31, 2021

The only thing changing in the original commit is to include a reference to the PR number in the commit message. There would be no change to the tree referenced by the commit.

WorldMaker · on March 31, 2021

It's an entirely different commit at that point. If work has already started in another branch based on the original commit (for whatever reason), it can cause merge problems down the road. Again, you are likely going to suggest that you can just rebase this other branch on top of the modified commit, but that's still sweeping possible merge commits under the rug, and again just because that rebase is usually automatic including that the tree references should be the same doesn't mean it is always automatic or doesn't have dangerous repercussions (including training junior devs to rebase often and giving them plenty of ammo for avoidable footguns).

u801e · on March 31, 2021

I'm talking about a single commit PR in Github or Gitlab. If it's based on the latest version of the base branch, then amending it to include the PR number would allow Github to generate a link to the PR page associated with that commit. That would make the merge commit superfluous at that point.

So something like:

  git commit --amend

and editing the commit message. This doesn't introduce any further change to the tree associated with the commit.

rectang · on March 31, 2021

But because the commit has different metadata after amending, it now has a different SHA and is a different commit.

For illustration, a minor inconvenience of amending the commit is that `git branch -d my-feature-branch` no longer succeeds for the original branch, because it looks for the actual commit SHA, not the tree.

You may not care about the effects of changing the commit, but those effects are real and other people care.

u801e · on March 31, 2021

Assuming you were the one who amended the commit before pushing it up to the remote, there's no reason that you would not be able to delete the branch because your local working copy has already updated contents of .git/refs/heads/my-feature-branch.

For those who have cloned the repo for testing, they can simply run git checkout my-feature-branch; git fetch origin; git reset --hard @{u} to get their local repo in sync with the remote.

So there's no reason that amending the commit will affect anyone until they branch off of the repo to do their own work. But that's nothing that a rebase can't fix.

rectang · on April 1, 2021

Yes, of course there are workarounds; no matter what scenario you or I come up with, the other will be able to propose a different way of doing things. I chose a deliberately trivial example because I was illustrating a fundamental aspect of Git's design, not trying to stump you. But we're talking past each other.

u801e · on April 1, 2021

I don't think we're talking past each other because you weren't involved in this subthread until your comment about using git branch -d.

rectang · on April 1, 2021

But one of my comments (†) is the great-grandparent of your first comment on the subthread? (∆) And the concept of preserving commits precisely is fundamental to my comment two generations above that, the one about "nirvana" (‡) ?

    [article]
     properdine
      pabs3
       rectang (‡)
        pabs3
         rectang (†)
          pabs3
           WorldMaker
            u801e (∆)

Perhaps we would benefit from an `hn log` function which displays the linear parentage history for comments? (It would be easier to design that `git log` because every comment has exactly one parent, there are no `hn merge` comments.)

Or in your working copy has my authorship info been lost? That can happen if a committer uses plain old `patch -p1` to apply a diff from the mailing list rather than `hn am`. :D

u801e · on April 2, 2021

> Or in your working copy has my authorship info been lost?

Well, none of the text that you originally wrote in the comment you're referencing wasn't preserved on the working copy. And, unless it's quoted, and one could search for when it was introduced by running git log -S"a line from your comment", no one is going to search for it specifically. IOW, the thread moved on :).

slaymaker1907 · on March 30, 2021

I don't think it is a good idea to allow individuals the ability to force push changes to a branch where said force pushing allows them to impersonate others. If someone with said capabilities ever has their account compromised much less if said individual abuses their powers, you can end up with a scenario where all the provenance guarantees of your repo are gone.

For sensitive projects, ideally no code can be merged in unless it is reviewed by somebody else. Even if no malicious code is added, the extra layer of review adds at least some degree of accountability.

As for the issues the author mentions, Azure DevOps mostly solves these by having squash commits which can be set to autocomplete once fully approved via reviewers and CI. You can do merging manually in ADO, but most of the time I (the author of the PR) just rebase or merge manually before completing the PR or setting auto complete.

There is no VCS that I know of which handles many people modifying the same files simultaneously in a nice way. If you allow for more clever auto merging, you increase the likelihood your merge algorithm produces nonsense. The author should probably be doing this merging since they are the expert on their changes. I think the only real alternative would be if a VCS allowed for a custom merge algorithm (which could do different strategies for different files). A package.json file can't really be merged without some understanding of what a package.json file is or at the very least what a JSON file is.

ketozhang · on March 30, 2021

Article did not mean force push to main branch, they mean force push to working branch when your working branch is behind main and requires a rebase.

This is often avoided by doing a merge with main branch instead of a rebase.

WorldMaker · on March 30, 2021

Yes, most of the issues this article has with PR branches and needing to force push them stems from the self-imposed "requirement" that they don't like merge commits and don't allow them. Obviously that will make working with PRs much harder than the merge-based workflows that PRs were originally built for and still tend to be best optimized for.

u801e · on March 31, 2021

The merge button merges the feature branch into the base branch. The merge commit provides information regarding what commits are in the branch (commits from the first to second parent). Merging the base branch into the feature branch introduces a merge commit that just shows changes to the base branch and conflicts that were addressed.

That information could just as well not be there if you created the feature branch from the main branch after the latter was updated. The merge commit doesn't really provide any useful information, which is why many consider it noise that shouldn't be in the commit history.

WorldMaker · on March 31, 2021

Every merge is a risk of merge conflicts, of bad auto-resolutions (don't get me started on the circle of pain where you need to understand git rerere and its consequences). Just because there are auto-resolutions and they mostly work (most of the time) automatically doesn't mean that all merges are safe. (Fast-forward only merges being an exception, of course.) Rebasing hides the history of those merges, their conflicts, their resolutions. Merge commits do provide all sorts of useful information in the cases where merges go wrong. That information is entirely lost in rebases, depending on the user that did the rebase and how they do the rebase.

I understand many people consider merge commits noise. I think rebases are destructive and dangerous. I'd rather have an extremely "noisy" git graph that I can manage with UI tools and filters like --first-parent than a "clean" git graph with no way to research and/or fix a bad merge after the fact.

I realize those are often very opposed viewpoints, I'm just offering mine in a thread full of people who don't shudder every time they hear a junior developer attempted a rebase or amended a commit in a branch they shouldn't have.

u801e · on March 31, 2021

If you imagine a VCS where merges didn't exist, then what would you do when the code is updated after you started on a change you were making? You probably would get a copy of the most up to date version of the code and try to apply your changes to it and make sure it still works.

That's essentially what rebasing does.

What merges do is basically try to have the VCS apply the code changes you made based off an older version of the code and apply it to the newer version of the code. Conflicts are changes you made to your code or the base code in order to get your code to work.

So, it really comes down to the following:

1. Would a reviewer want to view the changes based on the most up to date version of the code as a set of one or more changes?

or

2. Do they want what's in item 1 and then one or more commits that are autogenerated with a mix of automatically applied changes as determined by the VCS and manual changes that are a mix of changes to the base code and code that's part of the new feature?

I assert that it's easier to deal with option 1 because we can clearly see, in a set of organized commits, what changes were made based on the latest version of the code. With option 2, you mix what's in option 1 with automatically generated commits mixed in with changes that update lines of code that may have been made by multiple commits previously applied to the branch.

> a junior developer attempted a rebase or amended a commit in a branch they shouldn't have.

Developers, in general, should look at the commits in their branch by running:

  git log -p origin/master..

and read through the commit messages and diffs and see if things make sense. They can also test each change by running:

  git rebase --exec "test_command" origin/master..

in order to verify that the changes introduced by each commit didn't break any of the tests.

WorldMaker · on March 31, 2021

> If you imagine a VCS where merges didn't exist

I've certainly used VCSes where merges didn't exist as a first class citizen. They are a huge pain to use, still have to deal with merges and merge conflicts somehow, even if they just push all that work onto the user.

Git presents a DAG as a data structure. Forcing it into a "straight line", in my opinion, is a silly amount of work when you can use the power of a DAG, including filtering/culling your "depth" views into the DAG to manage it.

I understand that a straight line is "cleaner" and often easier to read. I'd rather use the power of the DAG to query straight lines than do a lot of work up front to artificially make straight lines. I understand that this is a difference of opinion you are unlikely to share, that's fine, there's no "one workflow" for everyone.

u801e · on March 31, 2021

Even if you filter out the merge commits, then you're hiding some of the changes made when you look at the history of the branch. The other problem is that if you run git blame on a file and the line is referenced by one of those merge commits, then the commit message is next to useless. Those looking at the history want to know why that line is there (changed from a previous version or added). A commit message that explains the reason and allows you to see that line in a context of a logical change makes it much easier to determine whether the change you're planning to make will introduce a regression or lead to some other issue. A merge commit message with a mixture of changes that may pertain to any number of commits in the branch doesn't help and makes it harder to figure out what the change was.

halayli · on March 30, 2021

you are force pushing to your own branch. It's also frequently done when you rebase your branch.

baddate · on March 30, 2021

Has anyone ever used fossil? https://fossil-scm.org/

600frogs · on March 30, 2021

The author suggests that if I find email horrible to work with, it's my mail client that's at fault, but doesn't expand as to why this is or make any suggestions for what to use - can anyone help me out here? What am I missing from some amazing holy grail mail client that I don't get in gmail webapp or the default MacOS mail app?

progval · on March 30, 2021

https://useplaintext.email/ has some recommendations, and is written by the sourcehut devs

sreevisakh · on March 30, 2021

Conventional mail clients and web clients are a non-starter if you want to work with mailing lists or git email workflows. To begin with, both of them rely heavily on email threads. Patch sets and reviews are both threaded. Modern clients don't emphasize these threading well enough, and can leave you confused about the context of the messages when using these workflows.

Another problem is html mail. Lists and patches are easier to read as plain text. But again, it's hard in modern clients to figure out if you are composing a plain-text or html mail. As a result, you end up with mangled patches and badly formatted text in the mailing list. Another really annoying problem is top-posting when replying to a mail. Ideally, the quotes from original message and the reply should be interleaved, so that the context of the discussion is very clear. But web clients often just quote the entire original message and hide it away at the bottom of the reply. It isn't even easy to expand and split the quotes.

Finally, there is the problem of applying patches to repository from email. I don't think this was even a design consideration for modern clients.

> What am I missing from some amazing holy grail mail client

I assume you are unfamiliar with the workflow. The reality is that the UI is never ideal, even with traditional text mail clients. List support is better than with modern clients, but patches need hacks or a bunch of other tools. But the tool selection makes it way more tolerable than using web-clients or modern desktop clients. There is a lot of room for improvements and there are on-going efforts to improve the situation (like ddevault's aerc). However, you end up with the realization that mailing lists and git email workflow can be as easy and enjoyable as fork-pr workflow, once you overcome the difficulty of initial setup. Perhaps some day, we will have a text client that is trivial to setup and covers all the steps in using mailing lists and patches.

For now though, I believe that mutt is the most commonly used client for the workflow. For some other reason, I use mu4e as my client. Drew Devault's aerc looks promising in the long term. For more detailed explanation, have a look at this post by Greg Kroah-Hartman: http://www.kroah.com/log/blog/2019/08/14/patch-workflow-with... (I just discovered this today from ddevault's reply in this discussion).

yunohn · on March 30, 2021

+1 to this. I’ve been searching high and low for a good email client that supports Gmail and Outlook (via OAuth for security). Ideally, it should also work across MacOS and iOS too.

da39a3ee · on March 30, 2021

If a patch is in an email then how do you know which parent commit it should be applied to? A patch is not a thing in its own right in isolation; it should come with a specified parent thus identifying a unique code context in which it is correct. Just because it applies cleanly on a branch does not mean it is correct there.

sreevisakh · on March 30, 2021

I have wondered the same too for a long time. But I have never seen the branching commit mentioned. Instead, most of the contributor documentation just ask you to rebase your feature branch on the latest master (or whatever it is called now) and send in those patches.

I believe that patchset application is treated the same way as rebasing. It should be possible to apply patches to the master without conflicts as long as the two branches haven't diverged too much. In case the branches did diverge a lot, the contributor is expected to sort it out by the rebase mentioned above. Patch workflow forces you to think about diff introduced by each commit, instead of worrying only about conflicts between two branches when merging.

Ultimately, it requires care and discipline from contributor and maintainer that is not strictly necessary with fork-PR workflow. However, commit quality would improve a lot if the same discipline was followed everywhere.

da39a3ee · on April 2, 2021

Thanks, so yours and the sibling reply by @worldmaker seem more argument against the email workflow. I honestly think it's just smart people being stubborn; men over 50 finding it hard to relinquish what were core tenets of internet-based development in the 90s.

WorldMaker · on March 30, 2021

This is one reason I far more trusted email workflows around darcs than I do with git. In darcs patches were first class objects (including a ton of context information), but in git that sort of information has to travel in manually and in parallel in the email thread.

thcipriani · on March 30, 2021

I believe what this post is describing is two different integration patterns and not two different workflows: continuous integration[0] vs long-lived branches.

Martin Fowler has a good post on the subject: https://martinfowler.com/articles/branching-patterns.html#Co...

tl;dr: if you make big feature branches and take a lot of time to integrate them it's a painful process.

[0]: <https://www.thoughtworks.com/continuous-integration>

maliegrl · on March 30, 2021

Platforms like GitHub and GitLab should support a workflow consisting of series of patches instead of a specific commit on a particular branch. They could probably even show pull requests from email in their interface.

megous · on March 30, 2021

It could be as easy as adding some toggle to enable sending a patch series when you push to a specialy named branch, to relevant people (mentioned in the patch message for example, via Cc: tags).

Something like:

    git push master:email/v1

Email/v1 branch would not be created, it would just be a virtual target branch name for this functionality.

You can easily make a git hook script to do it with regular ssh/http based git hosting.