Ask HN: What made you finally grok Git?
53 points by Ldorigo on Nov 18, 2022 | 92 comments
I've been using it for over 7 years, and while I've memorized a bunch of commands and have some idea of what the commit tree is, I'm perpetually bewildered by some of its messages or the behavior of some commands - and the only way I can fix major trainwrecks is by nuking a branch (when I'm lucky, the entire repository when not) and copy-pasting stuff until it's back to working state.



When I realized it's just a DAG and you're just manipulating a graph and pointers.

Then it just became a matter of mapping git CLI commands to how they manipulate the graph.


This: my introduction to git involved reading an architectural overview. From that it was obvious that it was a DAG. About an hour of playing with the operations (particularly rebase!) was enough.

Just goes to show the power of understanding the fundamentals of CS :-)


Maybe I am biased because I did a CS degree, but I don't think so. Using the word DAG sure requires having heard of it, which a CS degree would give you, fair enough. I don't think that word is needed, nor most (any I would argue) of the things you learn about DAGs while doing a CS degree. I definitely forgot most of that by now because I never needed it again.

If you ask me, there's very little actual git to grok. Version control existed before git and many of the same things applied for "grokking" it. It just so happened that the implementations of other systems were worse and made certain operations expensive and error or conflict prone, while in git they're easy, safe and fast. Lots of developers have always been "bad" at version control and in some places I've been there was an entire team of people that did nothing but resolve merge conflicts (stupid idea if you ask me but hey).

Version control 101: It's just a graph of commits.

If it helps, think of it like a tree in the park or your yard. Just that trees don't have branches growing back into the trunk but if you only ever rebase without merging that's actually what your commit graph looks like.

What personally made me realize how git is so awesome is that it's all just labels. This is where some of the implementation details do come in but you don't need to go deeper than realizing that a branch is nothing but a label, not special or different from the main branch at all. You can freely relabel the graph any way you want. Think of the tree in your yard and buy a label-maker. Attach a label to the top of the trunk with `master` on it. Every few inches on the trunk imagine a commit (with a hash identifying it, on an actual tree, use the number of inches it's away from the ground). Where a branch goes off, do the same thing, tip of the branch gets a label. All operations in git you can imagine as using a saw and wood glue and your label maker now.

More implementation detail: go to the `.git` folder and take a look around. You can see this in action right there. Ignore the binary blobs part. Look at the other files and you'll see that each of the labels in the graph is just a file w/ the exact name of the label, whose contents are nothing but the commit hash that the label is stuck to. It doesn't get easier than that. Now you know how to move labels around in the graph without even using git commands! Any old editor works but I recommend `vi`. Try it!
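
For instance, something like this (the hash shown is a placeholder, and note that some refs may be packed into `.git/packed-refs` instead of being loose files):

  cat .git/refs/heads/master
  # 4f5e9d2c...   <- the commit the "master" label is stuck to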

Remotes are easy as well. They're just another source of labels. You have your own set of labels locally (and don't need a server) and remotes allow you to see other people's sets of labels. By default most people probably only have one other set of labels, which is where they cloned stuff from, but you can set up any number of remotes and see any other person's personal set of labels and have a copy of it on your local machine. Think of it as your neighbour coming over with his own label maker. He's also brought some branches from his tree and started gluing them to your trunk ;)

`git fetch` just retrieves the latest copy of labels from a remote including any commits that are not present on your end yet. `git push` pushes your opinion of what the labels should be to a remote, including any actual commits that are not present on the other side yet. The remote can reject that because it thinks it knows better (i.e. only allowing fast-forwards). This is where the analogy falls apart a bit because it's hard to make copies of commits (parts of your tree you sawed out) but imagine you could walk over to your neighbour and give them a copy of parts of your tree just like they gave you parts of theirs.
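
In command form, that's roughly (remote and branch names are just examples):

  git fetch origin             # copy origin's labels into refs/remotes/origin/*, downloading any new commits
  git push origin my-branch    # ask origin to move its my-branch label; by default it refuses non-fast-forwards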

Fast-forwards are also relatively easy to explain I think. It's nothing other than only allowing labels to be pushed along a path on the graph instead of freely moving labels around. For that to work, both you and the remote have to agree on what the graph looks like for this to be possible. Otherwise what looks like a pushing along of the label for you is pulling a label off a branch "somewhere over there on the tree" and sticking it onto a part of the trunk.

`git pull` is just a combination of fetching and rebasing/merging to deal with the fact that the commits and labeling on the remote side may have incompatibly changed vs. what you have. It's a way to resolve those conflicts. Personally I like the rebasing version of resolving them, because it results in an easy to read and follow straight line commit history.

I have yet to lose anything with git and there's one (and a half) rule that has helped me with that: before you do anything, always commit. The half rule is to always use the terminal/command line to interact with git and keep it open at all times. If you create a commit before trying to manipulate the tree, it's very hard to lose anything. You can always go back to that commit. Heck, if you happen to relabel your graph incorrectly and "lost" your commit you very probably have the commit hash of the lost part of your graph still in your terminal output. Just find it and stick a label on it. There, "recovered". In the live tree analogy, branches you took all labels off of are still there. As long as you can find it you can stick a label on it again. When you were sawing and gluing you just let the parts/copies you didn't need any longer fall to the ground. They're also still there in a messy pile until you clean up. `git gc` cleans up the ground and automatically saws off and burns any branch tips that don't have a label.
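
A minimal sketch of that recovery (the hash is whatever you find in your terminal output or the reflog):

  git reflog                      # every commit HEAD has recently pointed at, including "lost" ones
  git branch recovered 4f5e9d2    # stick a new label on the lost commit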

You made a big boo boo while rebasing something and having lots of conflicts? Just abort the rebase and try again w/ the knowledge you gained during the first conflict resolution round. W/ something like SVN you're SOL unless you made a copy of the repo, which you probably didn't do because making that copy takes 15 minutes because of all the small files. Or you have 17 copies lying around.
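
Concretely, aborting is one command:

  git rebase --abort    # throw away the half-done rebase and put the branch back exactly where it was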

Rebasing is where I think most people have difficulty, but very little deeper understanding is needed to stay out of trouble. I.e. the implementation details of how git does it so well are not important to be able to work with it efficiently and without destroying things in most situations you will encounter at work. Rebasing is nothing but taking a branch off the graph and attaching it somewhere else. In the live tree analogy you saw off a branch somewhere on the bottom and then you re-attach it somewhere else on the trunk. That's basically it. If your branch was long lived, it's a very thick and heavy branch. If you try to attach it at the top of the trunk you'll likely have problems (conflicts). If you had a short lived branch (or it was only making changes to parts of the code that seldom change) you can easily attach it without conflicts. There are some special scenarios here depending on how you work and branch where you may be able to solve conflicts very easily by skipping commits, which feels like losing something - and you can if you skip the wrong commits and don't know a commit hash to recover things like mentioned above. I admit, this can be a bit hard to understand and it helps to see it visually. It's also not needed if you don't get yourself into this situation in the first place. In the analogy of the live tree, while trying to attach your sawed off branch to the trunk you notice that the lower parts of your branch look exactly like parts of the trunk already. So you saw your branch in half and attach only the top part.
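
In commands, the saw-and-glue versions might look like this (branch names and the hash are made up):

  git rebase master feature                  # saw "feature" off where it left master and glue it onto the tip of master
  git rebase --onto master 4f5e9d2 feature   # glue on only the commits after 4f5e9d2, skipping the lower part of the branch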


Nice analogy with the tree, branches, etc.

Reminds me a bit of that very popular / upvoted / quoted StackOverflow answer by Jim Dennis to a vim newbie question:

"Your problem with vim is that you do not grok vi.":

https://stackoverflow.com/questions/1218390/what-is-your-mos...


But note that git's data store (the bit where actual code/content is kept) is fundamentally append only. You can't manipulate anything there. You can only add different versions. The only pointers you can actually change are branches and HEAD. This is really, really important to understanding git and actually trusting it to keep your work safe.


Now I just have to learn what a DAG is


My bad, stands for directed (not acrylic) acyclic graph, which is just a fancy way of saying a graph with no loops.

Edit: yeah, it's acyclic not acrylic :).


God damn it. I just amazon'd some fancy acrylic paper and wonder how that would help me learn git. You owe me $6.99 + tax. /s


acyclic, not acrylic ;)


Bob Ross enters the chat


"and over here, we'll add a few happy little change sets"


I still don't fully understand what the nodes and edges in this graph "are" though. It's not patches. And I can't believe it is "all of the code" in each commit because that sounds very expensive.


On a high level, it is "all of the code" in each commit. Node = commit (= all files + some metadata). Edge = pointer to a parent.

On a lower level, a commit is itself a bunch of pointers to files. Different commits can share files. Think about copy-on-write.
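
You can see both levels with git's plumbing commands:

  git cat-file -p HEAD            # the commit object: a tree pointer, parent pointer(s), author, message
  git cat-file -p 'HEAD^{tree}'   # the tree: hashes pointing at files (blobs) and sub-trees, shared between commits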


I think the confusion is caused by mixing two things: the abstract model of a git repository, where edges are just deltas between images of the file system, and the implementation, where this is all optimized. But you don't have to think about the implementation when working with git; the abstract model should be enough most of the time. Sure, you do have to think about the particular implementation of git when, for example, getting rid of large files in a repository.


Git essentially shares common files & folders. I.e. if you change the contents of one file in the root folder, then a newly created commit would point to a tree that points to this new file and to all the sibling files and folders, which git shares with all other commits.


I believe the latest commit in a branch is "all the code" and prior commits are reverse diffs internally. So more branches and commits don't make a repository grow exponentially or anything like that. I'm talking about code/text here; I have no idea if this works as efficiently for binaries.


It's not 'all the code', it's the directory info, and links to every file, the vast majority of both of which will already be in the repo.


My mental model is each edge is a diff and each node is the aggregate of all edges + the starting state


Nodes are indeed states of "all of the code". Edges are commits. The "all of the code" thing does sound expensive but it actually is not, because git does massive caching. That is what all of the directories in .git/objects are about.


I think this is a really compsci major way of saying every "branch" is just a stack of patches on top of whatever you started from. So "rebase" is just you reapplying your patches from a new start point. Perhaps that's what you mean by "DAG"? But for someone who used to work with patch sets against trunk in the old pre-git days, it's easier to think about it all as just a bunch of patches.

Personally I think the most difficult part of git to understand is merges. I get doing the final merge to master in the pull request model, but people merging branches willy nilly is something I find unnecessarily confusing and would prefer git didn't support at all.


This model doesn't cover the staging area though.


My basic understanding

- Untracked: Files that Git doesn't know about.

- Tracked: Files that Git knows about. Further divided into unstaged files (will not be part of the next commit) and staged files (will be part of the next commit).

What helped me here was to know that it's possible to track a (yet untracked) file without staging it:

  git add --intent-to-add <path>
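
If I remember right, the effect is that the new file then shows up as an ordinary unstaged change (the file name below is made up):

  git add --intent-to-add newfile.c   # newfile.c is now tracked, but its content is not staged
  git diff                            # the file's content appears as an unstaged diff
  git status                          # listed under "Changes not staged for commit"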


Yeah, it also doesn't cover remotes and the decentralised model of git.

I do think understanding that it's a graph is the first step to grokking git though. The other stuff makes more sense after that (at least in my experience).


I think it's healthy to think about the staging area as something outside the main git model. It's a helpful tool, but separate from the git repository.


I have not grokked git — it hasn't been necessary to. Understanding some of the basics, knowing enough of the commands to run, and figuring out which StackOverflow answer applies to my current situation is good enough. If I have to spend too much time knowing its internals and delving into its philosophies, then it is not a useful tool that lets me get on with my tasks.


Kinda.

I wouldn't be able to tell you too much about the internals of VSCode.

The 'trouble' with git is, though, that what happens when you run the wrong command can vary from "easy to fix" all the way to "you just lost your work, you shouldn't have run that command".

Which means you need to either be very careful anytime you run git, or you need to learn more about git to avoid the footgun aspects of it.


It's kinda sad though, because off the top of my head I suspect it's really difficult to actually lose your work. I.e. between the fact that the content store stores... well, all content, until it's GC'd, and the obvious reflog, you can almost always get back your data.

The downside to this though is the type of novice most likely to make a mistake in "losing" their data will also not know how to recover their data.


> it's really difficult to actually lose your work

If it's been checked into a commit, sure.

But if you discard changes which haven't been checked in, then the work is lost.


Agree, sort of. I sort of get git but have never grokked it, for sure.

But I don't think I should need to fully understand its internals for it to be "grokkable", an elegant thing could still be grokkable regardless.

It's possible though that the tasks: diffing, branching, merging, rebasing, tagging, etc., are too elusive for "elegance", ha ha.

I might add though that simple diffing, branching, merging I do more or less grok. Perhaps if git had stopped there, like my understanding did, I would feel more comfortable with it ... imposed limitations be damned. I suppose in fact I only use git in this limited capacity anyway. It's only when someone else on the team does something clever (or I screw something up)....


What's important to understand about git is that it allows time travel, and time travel enables changing history, and changing history is a really bad idea. Do not ever change the history without knowing what you're doing. In git, this means mostly rebasing and other stuff that messes with existing commits.

Messing with existing commits can be fine as long as you're messing with the only existing version of that commit. Commits that exist in other branches, have been shared with other people, have been pushed to remote, etc, should be left alone. Rebasing local, unpushed single commits in your one working branch is fine. Rebasing anything else will cause problems. If rebase is causing you problems, do not rebase.

Also: small commits, short branches, merge often.


I just ran into this problem last night... if you shouldn't rebase after pushing to remote, how do you update a PR days after you originally created it to be on top of the latest origin/master and be able to resolve merge conflicts locally?


You can merge with master instead of rebasing; the downside is you'll end up with a lot of merge commits in a long-lived branch.

I just rebase and force push. I really don't get all the people that say avoid doing that. If you're the only one working on that branch, there's no risk.

If you're working with other people on the branch, and no one has unpushed changes, it's also fine. Everyone will just need to reset to the origin version of the branch to keep working.

If people do have unpushed changes and you force push, then they need to spend some time untangling it. Which is annoying, but it's not the end of the world. I don't get all the fear about it.

Don't rebase and force push master, that'll piss everyone off, sure. But doing it in branches is fine in most cases.
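
For reference, that workflow is roughly (assuming the PR targets origin/master):

  git fetch origin
  git rebase origin/master       # replay your branch's commits on top of the latest master
  git push --force-with-lease    # overwrite the remote branch, but refuse if someone else pushed in the meantime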


> If people do have unpushed changes and you force push, then they need to spend some time untangling it.

This is where `git pull --rebase` could save you all some time. You can even modify your git configs if this happens enough.
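
i.e. something like:

  git pull --rebase                       # rebase your unpushed commits on top of what was force-pushed
  git config --global pull.rebase true    # or make that the default for plain `git pull`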


You don't rebase it on top of main after you've published it, because that will mean anyone else who has checked it out will have to do something like a `checkout -f` to clean it up.

You just merge main in. Don't worry about extra commits that appear. PR tools like Github won't make people review them if you've done it all right, they can tell what a Merge Commit looks like (don't go deleting the auto-generated messages for merge commits unless you know what you're doing).

Once you merge main into your out-of-date branch, it'll be up to date. And when you merge your branch back into main after the PR, it will work out. This is the standard way you would work on a branch with a coworker, instead of working alone.

There's nothing wrong with merging main in.
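
A minimal sketch of that (remote and branch names are just examples):

  git checkout my-feature
  git fetch origin
  git merge origin/main    # one merge commit, no history rewritten, branch is now up to date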

If you absolutely MUST have a perfect history of your commits on top of THE CURRENT state of main, start a new branch, start a new PR, IMO. Just don't change public history.


No, you only shouldn't rebase if you pushed to a shared branch. If it's your PR that only you are working on and you need to update it to the latest in the master branch, rebasing is the correct way to do this. There isn't a magical rule about not pushing to a remote branch. The point is to recognize that rebasing rewrites the history which is a problem if and only if that history has already been relied on by someone else or otherwise integrated in such a way that rewriting that history is destructive. If it's your PR, then as far as anyone else is concerned, you DID just checkout the latest from master and branch off of it. No one needs to know or care that you actually rebased.


The qualifier is 'unless you know what you are doing'.

I rebase PRs all the time, since they are my branches and my colleagues don't depend on them. If they do, I communicate with them so they know what I'm doing.

Things get very messy when you rebase a branch that multiple people branched off from, without them knowing you are going to rebase etc.


I do! As long as we understand it's "my branch" that's being reviewed. I can push/rebase/squash a bunch of times until the reviewer is happy. Once the reviewer's happy, my branch becomes one new commit on top of master.


You do that by merging.

For people who love a clean, linear history, this can be frustrating, because it creates a new commit that doesn't add anything new, it just merges two branches. At my current project, PRs get merged very slowly, and after every PR gets merged, we merge master back into all the branches, which has led to a ridiculous number of merge commits, and the guy who loves clean history won't stop complaining about it.

The problem is: if you rebase instead, you don't actually change the old commit, you make a new commit that contains the same content change as the old one. But it's a new commit, different from the one you already pushed to remote. So now you want to push your new rebased commit (as well as everything from master) to remote, and git says that remote has a commit that you don't have locally. So it wants to rebase or merge that remote commit onto your local rebased commit, despite the fact that they contain the same change! Now if you choose to rebase your local changes onto the remote commit, it rebases all the commits that you just merged from master, creating new commits for them. Then pushes them to remote. Then you still can't merge your PR, because it now still doesn't have the original (unrebased) commits from master, because you just rebased them. But you've duplicated a ton of commits. This is a mess.

There are 3 ways around this:

* Replace the first rebase with a merge, merging master into your local branch.

* Replace the second rebase with a merge, from when you pulled remote when you wanted to push, duplicating the one commit you meant to rebase.

* throw away the old PR and create a new one.

The first and third are the good options. The second is bad because it has a duplicate commit, but it's not nearly as bad as when you tried to rebase twice in a row.

Seriously, if stuff gets complicated, throw away your PR. Or avoid it getting complicated by always merging. If you want clean history, your only option is to throw away your PR every time you merge master back into your branch. It's either that or accept extra merge commits.

A dirty shortcut to creating a new PR is to force push. This will throw away the original commit on remote and overwrite it with your new rebased commits. This will work fine as long as nobody has checked out that branch in the meantime. If someone else has checked it out and pushes to your remote branch again, they may still reintroduce the old unrebased commit, and you'll still have that duplicate commit.

I usually just accept extra merge commits. Sometimes I force push, but I really try not to. You have to be sure nobody else is using that branch. In a complicated setup with lots of developers, automated tests and build servers, I'm reluctant to count on that.


It seems like a lot of headaches could be avoided if you just don't have people share branches. Then you just rebase onto master and force push away with reckless abandon.


That's certainly a possibility. But what if two people have to work on a big feature together?


Break the big feature up into smaller changes. Vertical slices help with that: https://news.ycombinator.com/item?id=33384275 Each change is merged to `master` after a relatively short amount of time. What some people call "trunk based development".

I've yet to build a big feature that wasn't worked on by 4 or 5 people at the same time. Since they're different vertical slices, they're different branches (and tickets). But even then you might have a frontend and backend person collaborating on the same branch for a small slice of functionality.

This is also not a problem even with one branch that two people work on. In git there are constantly more actual branches around than you "realize". E.g. here there are 3 branches going around and you just merge/rebase them. One branch is the one in the shared repo. The other two branches are the local branches of the same name on each of the two developers' machines. They need merging or rebasing just like you merge/rebase with `master` itself.

Depending on what you changed, you'll have weird looking history and conflicts to solve (or commits to skip). Which needs some head wrapping around but git itself so far has never been confused by my "reckless force pushing" on a shared branch. It helps if you don't have two people actually changing the same parts of the code though. Then you're in a world of hurt for conflict resolution but it's not the force pushing part that creates the hurt. That's just always bad.

Before you bring this back into `master` you'll want to squash all this into one commit and thus all the weird history ceases to exist.

Also, communication is key. If someone force pushes, tell the other person and help them resolve any rebasing/merging conflicts w/ your knowledge of the changes you made. If two people sit in opposite corners, force pushing "their world view" without telling each other, you definitely will be losing code.


Better yet, only rebase what you haven't pushed yet.


This YouTube video is what made it click for me; it shows what is actually going on:

Git for ages 4 and up - https://www.youtube.com/watch?v=1ffBJ4sVUb4


This is it. I was scanning the comments to see if anyone had already posted. This talk is so simple but so effective at explaining the foundations of Git that I always tell everyone to watch it.


Nothing, I keep using it as if it was Subversion.

With the help of IDE tooling and TortoiseGit, the command line basic stuff for workflows that might not be exposed on the graphic tools.

If the repo gets corrupted, I delete it and do a fresh clone.

Life is too short to bother going deeper than that.


get in the habit of committing every few lines you write, and before creating a pull-request(tm) use 'git rebase -i' to clean up into meaningful bite-sized history. move related changes together, remove (squash) things that didn't make it to the end, etc.
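
a sketch of what that cleanup looks like (the base branch, hashes and messages are made up):

  git rebase -i origin/main
  # an editor opens listing your commits, oldest first:
  #   pick a1b2c3d add parser skeleton
  #   pick d4e5f6a wip
  #   pick 789abcd fix parser edge case
  # reorder lines to group related changes, change "pick" to "squash" or "fixup"
  # to fold a commit into the one above it, then save and close the editor.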

oh and never join any project that uses github and force squash-commits-on-pullrequest-merge. that is a sign that nobody there knows git or they don't care about code history.

edit: and yes, some rare times, rebase -i and moving things around will cause local conflicts. do not fear them. resolve and continue. it's all code you just wrote, should be easy and is part of the understanding process.


> oh and never join any project that uses github and force squash-commits-on-pullrequest-merge. that is a sign that nobody there knows git or they don't care about code history.

First, color me offended! Second... you're not completely wrong. My git-fu is not that strong. Would you help me improve it by explaining why squashing via the GitHub UI is so bad? To me it feels like an easy way to condense 1 PR into 1 commit. For some workflows, that makes much more sense than trying to get everyone to play by the same commit semantics. But I can see how it's not as "pure" in capturing the meaning behind your commits. Maybe it would be helpful to provide an example of a situation where squashing with the GitHub UI could cause problems?


what if I give you a book with a nice chapter index, and you just rip the index and chapter titles out and add "a book" to your collection?

that's squash on merge.

it's a stop-gap hack to fix shitty teams who write books with chapters like "chapter 1", "chapter two I guess, still broken", "chapter 3 final", "chapter 4 final final".


Git from the bottom up. A very clear and simple guide to the internals.

https://jwiegley.github.io/git-from-the-bottom-up/


This is a great recommendation.

Concise, grounded posts that focus on the "behind the scenes" of git. After reading this I felt a much more grounded understanding of git.

I don't recall if it's explicitly covered in there, but because I'd read this I understood things like what happens to my commits after I "delete" them (e.g. remove the last reference to them / hard reset to an old commit, etc).


Same for me.

I avoided getting into Git for years (long ago when it was new, before GitHub) and found the man pages unhelpful.

Git From The Bottom Up totally turned it around for me.

Ever since then I've been comfortable using Git in all sorts of advanced ways, such as constructing repos from old software backups, editing decades of history to separate out publishable parts from proprietary or inappropriate parts while keeping the history, fixing broken repo issues, as well as regular daily use the way most people use it.


I remember Git from the inside out and Learn Git Branching were very helpful when I started using git. The Git Command Explorer was good as well. Links from my notes below:

Git Command Explorer

- https://gitexplorer.com/ - https://news.ycombinator.com/item?id=28888763

Learn Git Branching

- https://learngitbranching.js.org - https://news.ycombinator.com/item?id=18504948

Git from the inside out

- https://codewords.recurse.com/issues/two/git-from-the-inside... - https://news.ycombinator.com/item?id=9272249


For me it was reading about git internals:

https://git-scm.com/book/en/v2/Git-Internals-Plumbing-and-Po...

It made me understand how simple git really is under the hood; you just need to understand how the commands operate on this simple data structure.


I don't think I've fully grokked git, but

1. There is a similarity with Lisp. A commit is an object with a pointer to the next one (or multiple parents). These objects are garbage collected. Heads point to commits: they are like root pointers. These pointers are mutating. When a new commit is made, it's a lot like a new cons cell pushed onto a list in Lisp: (push new-commit HEAD). (Stupidly, git history isn't terminated by the equivalent of NIL; yet some git operations require a parent. There are ugly workarounds.)

2. Key insight: commits store snapshots, not deltas. Diffs are calculated.

I've slightly delved into the internals.

1. I developed a procedure for reordering commits on a branch with zero conflicts, using tree objects directly, without any rebase workflow. So that is to say, we peel back the branch, and then create commits using the tree objects of the original commits, in whatever order we want. Because git is based on snapshots, there are no conflicts: it's not a rebase operation on cherry picks. (A rough sketch follows after point 2.)

2. I went through an exercise once of manually creating a stash entry, by creating the right objects and editing some text files in .git
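
For point 1, a rough sketch with plumbing commands (all the angle-bracket names are placeholders):

  # peel the branch back to the commit before the ones being reordered
  git reset --hard <base-commit>
  # for each original commit, in the new order, make a fresh commit from its tree object
  new=$(git commit-tree -p HEAD -m "same snapshot, new position" '<original-commit>^{tree}')
  git reset --hard "$new"    # advance the branch label to the new commit; repeat for the next one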


Being the go-to git guy for a whole project when everyone was new to it.

Are you using it from the command line? I feel that helps a lot. Get a good log alias[1] and get hacking. I almost exclusively use the official CLI, with the exception of git-gui for committing and rolling back patches (lines / hunks). (I find it's easier than `add -p` and that CTRL+J is a good time saver.)

1: As an example, https://github.com/aib/dotfiles/blob/master/.gitconfig#L29
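
In case the link ever rots, a similar alias can be set up with something along these lines (not necessarily what's in that gitconfig):

  git config --global alias.lg "log --graph --oneline --decorate --all"
  git lg    # compact graph view of every branch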


>What made you finally grok Git?

When I realized what it was designed for. It was designed for one or a few people to shepherd software contributions from many people, pushing most of the work out to the masses and making the job of the shepherds easier. So it is designed kind of backwards of what you'd want for a personal project or a small team. Software integration is one of the most difficult tasks in software engineering and it requires someone who is aware of how everything operates and fits together and what the overall design is. If you use Git in a team where everyone does their own integration there is a strong likelihood that no one actually does integration, which makes things go haywire later (when it is difficult to back out). If you don't have the resources to put someone in charge of integration (it can take a lot of time) do not use Git. Just use a good source control system and do design and integration in meetings up front (and periodically thereafter).

In some sense I think Git is inefficient since it doesn't so much guide development as reject bad code. All that rejected code is wasted time that could have been used more productively with better guidance. Yet the centralized nature of control in Git makes it look like development is being guided. If the center of control is not also doing design, integration, and guidance though it creates a lot of wasted effort and failed projects. To a large extent this is true of any method of software development, but I think using Git can make it worse because it can make it easier for sub-teams to ignore integration until it is too late. Git is not magic. Getting your code to build and run is not the same as integration.


Sit down with a coffee, read the book https://git-scm.com/book/en/v2 and just spend the time learning how to use it. You'll save yourself a bunch of time and headache and won't need to be nuking repos or branches any more. There's a lot it can do, so a big part is finding out what fits into you and your team's workflow.


It was all magic to me until one day I finally took the time to look at git internals.

You can build a valid git repo with simple unix shell commands, and that really helped me to understand the magic behind the git commands:

https://git-scm.com/book/en/v2/Git-Internals-Git-Objects


Reading this[1] and getting the data structures/mental model correctly internalised in my head.

Although I do use git from the cli (or magit) primarily, this knowledge has helped me pick up front-end tools intuitively and help others even when I've never used their preferred tool. All of this doesn't seem possible unless you've put in the effort learning what's going on behind the scenes. That is to say, git itself is neither user-friendly nor intuitive; it must be learned.

Knowing mercurial (hg) beforehand did slow my learning progress a bit with all the false-friends. You may find it easier to learn if you don't have to re-learn some naming.

[1] https://git-scm.com/book/en/v2


This is my answer too. The main git book is very well written. I stopped reading just before it got into rebasing and had to pick that up from the internet due to work requirements, but the book was so solid in explaining the basics that I understood outside stuff almost immediately.

I think one thing that'll help OP is to learn everything on the command line. The fine-grained control is worth the extra effort. I've hated almost all git GUIs in most IDEs because they won't let me do weird things like grafting branches, which is a very advanced concept for them but, once you understand git, a very simple operation.


It's super easy to understand how git works — I'm doing it several times per month.

Seriously, low-level git, while not familiar to most devs, is logical and consistent. It's the UI level that is total disaster. Don't worry, you will forget everything in a week.


Using GitUp on the Mac as a client. It allows you to select a commit and press Options-S to reset your branch to this commit, it allows you to squash etc.

And best of all it keeps a backup of your entire repository so you can undo whatever git command you just did.


Maybe the one thing that made it click for me was realizing that a branch is merely a label to a git commit hash. There can be local and remote ones and the local is aware of the remote (e.g. git pull). Checking out a branch makes it current, with filesystem matching the branch head's git fileset state. New commits get added and the branch label moved to the new head.

That and a git tree view GUI or "git log --oneline --graph" and "git reflog" to undo stuff. Also, you're better off committing temp stuff on the current local branch with message "WIP" than using "git stash".
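
i.e. something along these lines:

  git add -A && git commit -m "WIP"   # park the half-done work on the current branch
  # ...switch branches, do something else, come back...
  git reset HEAD~1                    # unpack the WIP commit; the changes come back as unstaged edits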


How I really learned git well though was looking in the .git/ directory. "ls .git/refs/heads" shows a bunch of files with names of local branches.

And "cat .git/refs/heads/my-local-branch" shows a git commit hash. You can learn a lot about git this way, and it's first-hand knowledge that's easier to recall than reading about it somewhere.



This took me from "add/commit/push, and pull" to being able to do rebases: https://learngitbranching.js.org/


Same for me. This has great visualisations, I’ve not found any better explanation.


Friends don't let friends do (non local) rebases.


You can take rebase's `--autosquash` and push's `--force-with-lease` from my cold dead hands. :o)

Well, I think it's untidy to leave commits like "fix typo" or "PR feedback" in a change request, even if the whole changeset gets squashed.
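
The combination looks roughly like this (the target hash and base branch are placeholders):

  git commit --fixup=a1b2c3d               # record the "fix typo" change as a fixup! of the commit it belongs to
  git rebase -i --autosquash origin/main   # the fixup is moved next to its target and squashed automatically
  git push --force-with-lease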


When I realised it was all about share-xor-mutate, just like all reasonable systems:

Rebase and force-push on your own branches only (they are mutations), pull-request when putting stuff onto other branches.

I also find the whole git tree WAY easier to read if it's fast-forward-only, and no merge commits.

Atlassian's not as popular as they used to be I guess, but I found Bitbucket presents the best view of the git tree ever. When I wanted to do anything advanced, I'd look at the Bitbucket view, do local git commands in my shell, push, and then watch Bitbucket again.


Using rebase and reflog really taught me a lot about git, because it "rewrites" history and exposes you to some nuance. Like: basically nothing in git is ever gone if it was at one point committed to the repository, that includes branches you delete and never pushed, it also includes commits that are removed with git reset --hard. Once I learned about how to find references, refer to them, and manipulate them, everything else made a lot more sense.


The trick(s) to git is not using `git add -A`, and being diligent (as in checking before you commit) about the changes that you are committing. Only add and commit what is relevant to the task that you are working on. But it's even simpler than that. Before you commit anything, check what you are about to commit. Is it what you expected? Is it relevant? Does it work? Has it been tested? Is it likely to pass a code review?
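
e.g. a quick way to do that check:

  git add -p           # stage hunk by hunk, reviewing each change as you go
  git diff --cached    # review exactly what is about to be committed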


> Before you commit anything, check what you are about to commit.

Better still, check it as you commit, by enabling the config option commit.verbose (e.g. `git config --global commit.verbose 1`), so that the full patch that is to be committed is shown in COMMIT_EDITMSG, not just the file names that you touched. This removes the race condition between review and commit.

(While on this topic, I also recommend commit.cleanup = scissors, and reading through `git config --help`, or at least skimming it for basic familiarity with what options are available to you. A few more that I like: merge.conflictStyle = diff3, diff.algorithm = histogram, mergetool.keepBackup = false.)
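
Spelled out as commands, those suggestions are:

  git config --global commit.verbose true
  git config --global commit.cleanup scissors
  git config --global merge.conflictStyle diff3
  git config --global diff.algorithm histogram
  git config --global mergetool.keepBackup false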


Agree, I tend to go through my work with `git add --patch` and committing that way. I find it also works as a quick refresh on what you actually did and helps you decide how/if to split up your work into multiple commits.


Make a sandbox repository and just break it repeatedly. Within the realm of basic commands, I bet you already know how to do most things (including recovering from small mistakes), but it feels too scary so you revert to non-git methods. In a sandbox, you can get the reps in without any fear.

You could then graduate to forking a real repo. Again experiment fearlessly, with slightly more realistic scenarios.


I started using it around 2008 on the recommendation of a co-worker. I already had a lot of experience with RCS, CVS, and SVN. I understood the problems and frustrations caused by those tools, because I suffered with them for years. After reading the Git documentation I immediately understood how Git solved those problems. Then I just started using it exclusively. The end.


This Atlassian article: https://www.atlassian.com/git/tutorials/merging-vs-rebasing

In particular the golden rule of never rebasing a public branch has served me very well and I never seem to run into the "merge hell" that others complain of.


As a beginner: each of stash, branch, staged and remote is just a swimlane, kinda like the illustration here: https://nvie.com/posts/a-successful-git-branching-model/ but I can't remember where I read it initially.


0. mental model: staging, local, remote. c'est tout

1. `git status` + gitk before and after git commands

2. guess upfront what the result of git commands will be (in terms of gitk / git status)

3. be fluent with commit, push, pull, checkout, reset, merge, and possibly rebase (I call it required but it's opinionated) - you can safely ignore the rest of the git api in 96% of your workflow



Introduction to Git with Scott Chacon of GitHub - This was really helpful for me.

https://www.youtube.com/watch?v=ZDR433b0HJY


this video where he explains Git's DAG with Tinkertoys https://www.youtube.com/watch?v=3m7BgIvC-uQ


TortoiseGit for Windows and Git Cola for Linux. I use the command line very seldom, only for very weird things (like moving commit from a repo to another repo, but in a different internal folder).


Teaching Git to others (e.g. company-internal workshops). You learn so much from teaching, and also from questions others ask, helping you discover your unknown knowns and unknown unknowns.


Playing through this game https://learngitbranching.js.org


Having a window of "gitk --all" always open on my repo and refreshing the view makes it really obvious what my commands are doing.


Treat commits as immutable data. Don't try to amend, squash or rebase.

Having one-commit merges is overrated.


I just use an IDE (Jetbrains) with a git GUI built in and stop worrying about how it works


I don't get it yet, but learning / realizing that it can remote into other machines and execute commands (via SSH/server, so no other server software - like an HTTP server - is necessary) was definitely eye-opening.


git is fine, one just needs to learn how to use it properly.

one has yet to witness the abomination that is svn, sourcesafe, or "software working on my machine" delivered by a zip file in order to repent and mend one's ways


Implementing it.

Once I read somewhere that it is just a key-to-object, append-only DB, it seemed easy to implement the core of it, so I did. It's simpler than it sounds.
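
A tiny taste of that key-to-object store, using git's own plumbing (the second command takes the hash printed by the first):

  echo 'hello' | git hash-object -w --stdin   # write a blob to the object DB, print its key (SHA-1)
  git cat-file -p <printed-hash>              # read the object back by key; objects are only ever added, never modified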




