These days it might be better to teach new users about ‘git switch’ and ‘git restore’ (added in Git 2.23, released 2019-08-16) rather than the two overloaded meanings of the confusing ‘git checkout’ command.
First of all, I'm a long-time git n00b (still mainly a p4 user, and a g4 user for a bit), but I welcome any hints for getting along with git (I have to use it occasionally).
Saw this though for switch/restore:
"THIS COMMAND IS EXPERIMENTAL. THE BEHAVIOR MAY CHANGE."
Hmm, maybe I should leave as-is, for the time being, anyway! Thank you!
Update: based on other comments, I WILL seriously look into replacing the use of checkout. It sounds like these newer features are probably stable enough for the simple things that are being done in the blog post, and that updating the post will make it a bit more useful as a starting point for learning git.
I'm thinking of posting the modern syntax as an alternative rather than a replacement to checkout, but I have yet to investigate how stable things are now... will do when I have time.
But meanwhile, the checkout method is fine, it works, and it has been well established for many years.
Thank you, and that's good advice from the comment:
"So if you're writing a script that needs to work with dozens of past and future Git versions, use git checkout. If you need to teach humans how to talk to Git, use git switch. Some details of some flags may change in the future, but I'd argue that that'll be a smaller mental challenge than trying to teach which parts of git checkout do what."
I'll try to use switch/restore more.
I've actually used "git stash" more than I should (I'm probably applying really bad p4/g4/svn-like ideas in my head to development). As soon as I join a project with a few more people, I'm lost. I was able to make a few PRs on GitHub, but every time I had to re-learn the process.
I sometimes use git stash as a safer alternative to git reset --hard HEAD. git stash -> check that the state you've reverted to is actually what you want -> git stash drop. And since internally stash is just creating a commit, it's even possible to recover the changes if they haven't been garbage collected.
I thought so. It's like I'm not putting much thought behind it, and it sometimes puts me in uncomfortable situations: should I merge, or forward something, or who knows what...
Internally Google started on Perforce and gradually replaced its backend. Their client (not available outside their org) is g4 rather than p4.
Some dedicated/stubborn devs also used to (maybe still do?) manage local history in a git-based tool with pushes on demand to a g4 changelist for review.
Git itself seems to recommend these commands e.g. if I do "git status", and there are changed files, git says:
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
...
Another example where it recommends "switch"
git checkout f9b45dd
Note: switching to 'f9b45dd'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:
git switch -c <new-branch-name>
Or undo this operation with:
git switch -
I, personally, learned to use git a couple of years ago, have never learned to use the `checkout` command, and had no need for it. The goal of `switch`/`restore` is to do everything that `checkout` does while being less of a confusing mess.
See also the reddit thread linked by a nephew comment.
One example, when reverting file changes to previous file state from last commit, the following are equivalent:
`git checkout -- $FILE` and `git restore $FILE`.
Wow! That's a really hopeful thing to hear. I have been bemoaning git's bafflingly inconsistent interface for years, and checkout is one of the worst. Glad to hear we can finally move on.
I only found out about git switch recently - super handy when you want to checkout another dev's branch that isn't tracked locally yet. There's a case to be made for reserving 'checkout' for when you specify a filespec rather than a branch.
I'm going to think about this more. Ironically, I'm in a situation for a few days where I can't really focus on technology.
It sounds like switch and/or restore are probably stable enough that they could be used instead of checkout for the specific tasks I'm trying to do in the blog post, so yet another update will be warranted when I get a chance!
It's completely usable as-is, so I hope no one is scared off by thinking it should incorporate switch and/or restore. It's usable now, and if you do feel like saving a link to it for future reference, I expect that there will be another way in the future to do a couple of the tasks using the more modern syntax.
My feeling is that new users shouldn't be taught about branches in their local repo at all. Typical cloud-based git{hub,lab}/gerrit/etc.. workflows generally store all branches worth talking about on remote systems anyway.
I find I don't ever bother with managing local branches for anything in my own workflows. I just git-reset --hard to bounce between fetched copies of upstream branches as needed. Stuff I need to work on for more than a few commits in an afternoon gets pushed to a remote branch regularly anyway as part of general safe development hygiene.
Stop using any passwords that get checked in. Set expiry dates for access tokens. Set up git commit hooks that scan for password or API key prefixes and block the commit. Set up appropriate .gitignore files; you can place them in subfolders to keep them simple.
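To make the "scan for password or API key prefixes" idea concrete, here is a minimal sketch of the check such a hook might run. The pattern list is only illustrative and far from complete, and `find_secrets` and `should_block_commit` are hypothetical names. A real hook would read the staged changes and exit non-zero to block the commit.

```python
import re

# Illustrative patterns only (an assumption, not a complete list):
# AWS access key IDs start with "AKIA", GitHub personal access tokens with "ghp_".
SECRET_PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github-token": re.compile(r"\bghp_[A-Za-z0-9]{36,}\b"),
    "private-key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_secrets(text):
    """Return (line_number, pattern_name) for every suspicious line."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits


def should_block_commit(staged_text):
    """A hook would exit non-zero (blocking the commit) when this is True."""
    return bool(find_secrets(staged_text))
```

Dedicated scanners such as truffleHog do much more than this, so treat the sketch as a way to understand the mechanism rather than as a replacement.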
If possible, switch from using passwords or tokens without expiry to using ephemeral or time-limited tokens such as machine or pod identities, JWT tokens, IAM service accounts, public/private key pairs (if you can get by with only a public key in the repo) or two-factor authentication. Consider distributing time-limited passwords with Hashicorp Vault.
Some teams might use a cloud-hosted secrets manager or password manager like 1password to distribute passwords and then have code load the password as it runs on a developer machine. GitHub has a secrets scanner, if you pay them enough money for “Advanced Security”, which blocks secrets from being pushed upstream when it recognizes them as such.
Also, converting from any repo format to another repo format requires care and multiple repeated attempts. A lot can go wrong. Reposurgeon (from a sibling comment) is highly recommended but also not easy to use. It takes a lot of attention to detail to really get the details right.
Upvoted. Why in the world was this flagged?
I don't think it's off-topic, since the topic is particularly relevant to git newbies, and the committing of stuff that shouldn't be in the repo is a huge problem I've observed in that group. (Not just credentials, but all sorts of junk like binaries and logs and tempfiles, and individual settings.)
It's such a common problem that there have been many tools and workflows set up to mitigate it. This situation often results from not having those tools and workflows in place to give even the "idiots" a "pit of success" to fall into.
e.g., pre-commit hooks installed on dev machines by team policy, and pre-receive hooks on central repos, both along with CI jobs running truffleHog etc. And even that is downstream of proper code design: minimizing credential use, and focusing/localizing it into files which are already .gitignore'd.
I might be an idiot myself, but would `git rebase -i <commitBeforeSnafu>` do the trick?
edit: This is interactive rebase, where you can rewrite history. Instructions are in the file you are editing once the command is executed. As a rule of thumb, don't rebase on production, however with such a SNAFU you probably need to.
No. The old commits still exist, a rebase just makes it so that they are not part of the history of the current branch. If you know the commit hash of a commit containing the offending data, you can still check it out. You can also find such commits in the reflog, or if they are part of the history of any other branch or tag. The data is also in git's internal data structures if you know enough to interpret those directly.
Still, a rebase is a critical part of fixing this. If you rebase every branch that has the offending data in their history, and recreate any tags that have the data, then the offending data would be present only in unreachable objects (which can still be found through the reflog or knowing a commit hash). Eventually, git's garbage collection will clean up such unreachable data. You can force this behaviour earlier by using the `git gc` command. By default git will not remove recent objects (which is why you can reflog your way out of a botched rebase), but you can override this behaviour as well.
Of course, all of the above assumes you are working on your own repository. Given git's decentralized design, you need to clean up and garbage collect on every clone that did a fetch since the offending data was checked in. Worse, each clone also keeps track of remote branches, and considers those to be reachable as well, so a garbage collection will not work correctly until a given clone fetches from all of its remotes after those remotes removed the offending data [0].
Further, there is no good tooling to check that you actually did this correctly, so when you are done, you need to hope you fully purged the data.
The plus side of all of this is that once something is checked in, it is extremely difficult to accidentally delete it. The downside is that it is also difficult to deliberately delete it. Also, it is fairly easy to make the data hard enough to find that you lose the day-to-day benefits of it still existing, while doing very little to protect against a motivated attacker.
[0] Fortunately, you do not need all of the remotes to have garbage collected, so you could have everyone make the data unreachable, fetch from all remotes, then garbage collect.
Rebasing doesn't rewrite history, it creates an entirely new history, but all of the old commits still exist unless they are reaped by the GC at some point.
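A toy model may make the reachability point easier to see. This is not git's implementation: the commit names are made up, and the reflog is modelled as just one more ref, which is a simplification.

```python
# Toy model: each commit maps to the list of its parents.
commits = {
    "a1": [],
    "b2": ["a1"],
    "c3": ["b2"],   # commit that contains the leaked secret
    "c3x": ["b2"],  # rewritten replacement created by the rebase
}


def reachable(refs, commits):
    """Return every commit reachable from any ref by following parents."""
    seen = set()
    stack = list(refs.values())
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(commits[commit])
    return seen


# Before the rebase, main points at the bad commit.
before = reachable({"main": "c3"}, commits)

# After the rebase, main points at the replacement. The bad commit is
# still stored; it is merely not reachable from the branch any more.
after = reachable({"main": "c3x"}, commits)
unreachable = set(commits) - after

# Anything else that still points at it (reflog entry, tag, other branch)
# keeps it alive, so garbage collection would not remove it.
still_pinned = reachable({"main": "c3x", "reflog": "c3"}, commits)
```

In this model `unreachable` contains only the bad commit, and `still_pinned` shows why it survives as long as anything else references it.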
It's more useful to fix the problem before it happens. Try to set up systems or workflows that make it easy to do the right thing, rather than trying to fix it afterwards.
I really had to converse with the lead in this wise:
If I stapled your HR record, including your bank account and routing number, to our shared refrigerator so I could remember it, would you think that appropriate?
Really, this was the context of our meeting.
Now I am left with the aftermath of his departure, and knives at my sides.
About 8 years ago, I blogged a very brief guide to using git for a solo developer. A lot of people seemed to like it.
A couple times since then, I've noticed there was something I was using that wasn't in it, but that could be added without making it significantly longer or more complicated. So there were a couple of updates.
I did another one today, and had the thought that since this guide now includes the benefit of 8 years of practical experience in actually using it, while still essentially being "Git In Two Minutes", it might be worth posting to HN. So here it is.
It doesn't say anything about github. It really is for a solo developer who wants to start using git in a very painless way. With the additional goal that you can profitably use git for years without going beyond the described features.
Hey, just a heads up (won't be a problem for most people Googling git) but on mobile (Firefox) the code blocks are almost unreadable because they're so small.
In both examples given in the "Undoing a bad commit" section (fixing a commit message and fixing an error in a file) it's easier to make the change and then run `git commit -a --amend` which takes the current changes you have and adds them to the last commit, allowing you to change the commit message as well.
There are other cases where git reset is useful, but generally not for the reasons given.
In solo development I still revert rather than reset and don't force push. I see no harm in having history. You may even be glad you have it later on. Revert on the command line is very easy to use, maybe easier than reset as you don't need to think about whether it is hard or soft you want.
One vice I have for solo development that I won't do at work is really crappy commit messages. Like "ditto" for a commit that is similar to the one before or "typo" if I am just correcting a spelling mistake.
You want meaningful history, not unfiltered history.
The harm is that the history is no longer as useful if it's not curated -- whether that's for tracking down the origin of a bug, or just using it to remember what you did a few weeks ago.
It's easy to see the flaw in "keep everything" by taking the argument to its logical conclusion, which would be recording the history of every keystroke in your editor.
If you have false starts or alternate approaches that you gave up on but think you might still want, by all means save them, just not in "main."
You could use branches / tags for this purpose. Reverts are not too bad because they are pretty meaningful. If your commits are meaningful then the revert is "reverts: {meaningful commit message}"
> Reverts are not too bad because they are pretty meaningful.
I was responding to: "I still revert rather than reset and don't force push".
Say you are working privately on a branch, saving your work, etc. You realize you made a silly error, and need to undo a commit. Creating a revert commit here is a pretty serious error of git organization imo. Typically you do not want to save this work... it is just your private mess as you were working through the problem, and when you are done and have solved it, you want to save the final result, as clean history, perhaps as multiple commits, perhaps as one.
After you have published your work publicly, then yes, you need to do a revert commit if you made a mistake.
You are right. There are times when, if I have a local-only branch and there is no utility in keeping the error, I will reset. It is rare but it happens.
If I went down a direction then tried something else, I prefer revert so I don't lose that. That said, tagging then resetting is probably just as good and might have the best of both worlds. You get to keep what you did and you keep the history clean. I only just thought of this now!
Tagging and resetting is of course a bit like stash, but I see stashes as very temporary. It is easy to lose them due to stash pop.
I think that vice is honestly much better than another common vice: not committing because of commit message paralysis. If you don't commit, git is doing little to help you do anything. Commits, and their messages, can always be redone with an amend, or a rebase.
This makes me think that messages should be a separate entity and not part of what computes the hash. So you can retrospectively change a message later on if it needs some more detail.
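For context on why the message can't currently change without changing the hash: a commit's ID is a SHA-1 over the whole commit object, message included. Here is a rough sketch with a simplified commit body. The author line and the all-zero parent ID are made-up placeholders, and real commits can carry more header fields.

```python
import hashlib


def commit_id(tree, parent, author, message):
    """Hash a simplified commit object the way git does: SHA-1 over a
    'commit <size>\\0' header followed by the commit text."""
    body = (
        f"tree {tree}\n"
        f"parent {parent}\n"
        f"author {author}\n"
        f"committer {author}\n"
        f"\n{message}\n"
    ).encode()
    header = f"commit {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # git's empty tree
PARENT = "0" * 40                                        # placeholder parent ID
AUTHOR = "A U Thor <author@example.com> 1700000000 +0000"  # hypothetical author

first = commit_id(EMPTY_TREE, PARENT, AUTHOR, "typo")
second = commit_id(EMPTY_TREE, PARENT, AUTHOR, "Fix typo in README")
```

`first` and `second` differ even though tree, parent and author are identical. This is why amending a message produces a new commit, and why every descendant gets a new ID too.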
> it it’s enough be useful for beginning solo developers, and provides a start from which you can grow.
I like this guide a lot as a cheatsheet. However, when it comes to beginners I fear it is one of those things that make sense only when you already know what the guide is talking about.
It would take me more than two minutes just to explain to a completely new developer what a commit is and why they would want one. And God help us if I throw the output of "diff" at them without warning...
The sad truth is, you cannot explain git in two minutes. I nonetheless admire the author for giving the problem a fair fight.
As a noob trying to pick up coding just to experiment with some ideas, I always found Git to be surprisingly confusing. I feel that a lot of the command names aren't very intuitive.
I think what makes it more confusing is that the inner workings are almost too simple, so one wonders why there are so many commands etc.
Git is basically a linked list (if you have one branch) and a graph (if you have more than one branch). Then all you are really working with are pointers to certain nodes in that graph.
In git terminology these nodes are called objects and there are different types of objects such as tree objects and commit objects etc. Objects are compressed with a library called zlib and are stored as files within the .git/objects directory, the file name is the hash of the decompressed object.
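As a rough illustration of that storage scheme, here is how the ID and path of a loose blob object can be reproduced. The sketch assumes a default SHA-1 repository and ignores packfiles; `blob_object` is a hypothetical helper name.

```python
import hashlib
import zlib


def blob_object(content: bytes):
    """Return (object id, loose-object path, compressed bytes) for a blob."""
    # The hashed data is a small header plus the raw file content.
    store = b"blob %d\0" % len(content) + content
    object_id = hashlib.sha1(store).hexdigest()
    # The first two hex digits become a directory, the rest the file name.
    path = f".git/objects/{object_id[:2]}/{object_id[2:]}"
    return object_id, path, zlib.compress(store)


object_id, path, compressed = blob_object(b"hello world\n")
# object_id == "3b18e512dba79e4c8300dd08aeb37f8e728b8dad"
# path == ".git/objects/3b/18e512dba79e4c8300dd08aeb37f8e728b8dad"
```

The ID should match what `git hash-object` prints for a file with the same content.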
But, back to the graph and the pointers to certain nodes. A branch itself is nothing more than a pointer either. You can look at the files in .git/refs/heads to see where they point.
Now, what you do with git checkout is changing the node you currently view in the graph. If you make a commit you create a new node. Merging is nothing more than taking two nodes and creating a new node that is connected to both of them.
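The "nodes and pointers" picture fits in a few lines of code. This is only a toy model of the idea, not how git is implemented: the `Repo` class, its method names and the `c1`, `c2`, ... IDs are all made up.

```python
class Repo:
    def __init__(self):
        self.parents = {}    # commit id -> list of parent commit ids
        self.branches = {}   # branch name -> commit id (just a pointer)
        self.head = "main"   # name of the branch currently being viewed
        self._counter = 0

    def _new_node(self, parents):
        self._counter += 1
        commit_id = f"c{self._counter}"
        self.parents[commit_id] = parents
        return commit_id

    def commit(self):
        """Create a node on top of the current branch and move the pointer."""
        tip = self.branches.get(self.head)
        commit_id = self._new_node([tip] if tip else [])
        self.branches[self.head] = commit_id
        return commit_id

    def branch(self, name):
        """A new branch is only a second pointer to the same node."""
        self.branches[name] = self.branches[self.head]

    def switch(self, name):
        """Change which pointer we are looking at."""
        self.head = name

    def merge(self, other):
        """A merge is a new node connected to both branch tips."""
        tips = [self.branches[self.head], self.branches[other]]
        commit_id = self._new_node(tips)
        self.branches[self.head] = commit_id
        return commit_id


repo = Repo()
repo.commit()                     # c1 on main
repo.branch("feature")
repo.switch("feature")
repo.commit()                     # c2 on feature
repo.switch("main")
repo.commit()                     # c3 on main
merge_id = repo.merge("feature")  # c4, with parents c3 and c2
```

After this runs, `repo.branches` is `{"main": "c4", "feature": "c2"}`: two pointers into one graph.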
The main problem, I think, is that concepts like branch, tag, commit etc. are actually just overcomplicated abstractions of the much simpler graph nodes and pointers, and are thus confusing.
Git isn't designed for "noobs trying to pick up coding just to experiment". It also isn't really designed for solo developers as the OP is about.
Git is designed for collaboration. Git's UI is absolutely unintuitive, especially coming from other version control software. But that's irrelevant. Everyone wants to use it because of the network effect.
In terms of _why_ the UI is unintuitive, I believe it's because it intentionally exposes its internal representation. Unlike its predecessors like subversion, Git has infinite flexibility in terms of the workflow that teams of people can adopt. But the flexibility comes at the cost of learning how to use a distributed graph of commits.
Personally, I never learned git by reading recipes of commands, like the OP. I only really grokked it by understanding how it works internally. This article helped me:
> Git has infinite flexibility in terms of the workflow that teams of people can adopt.
IMO flexibility is just a meaningless buzzword used to describe overly complicated and poorly designed products. If a product has crappy UX, they just call it “developer oriented” and “flexible”.
When I read about “flexibility” of a software product, I think that this product is of low quality. And that I will have to spend some time to make it work for me (hours? days? weeks? years?). To me as a user this flexibility does not matter at all.
The product either works or it doesn’t. If it works, I don’t want to spend a career to learn its internals. I just want it to support my use case and give clear step by step instructions in its documentation.
I'd rather spend my time on something more important than exploring how this or another crappy software product is flexible.
Maybe git wasn't designed for experimenting noobs or solo developers, but it can certainly work for them. I'm not a coder, I'm a writer. I taught myself git, for myself. I collaborate with no one, and use only a few more features than the OP talks about (I push to and pull from remotes and branch occasionally). I love git.
It's not just you. I have 30 years of experience coding and... hmm, probably 15 years of experience with git, at this point. I still think git is surprisingly confusing, and a lot of the command names aren't very intuitive. The engine is good, but the UI sucks.
I wish the world had standardized on mercurial instead, but here we are.
IMO the overwhelmingly biggest fuckup of the UI was / is the multiple overloaded meanings of "checkout", and the introduction a couple (three?) years back of "switch" and "restore" largely fixed that. Sure, there's still a lot of crap about it, but it's got a lot better.
I don't know whether you're an idiot or not, but thinking git is confusing is certainly not evidence to that effect.
Git _is_ confusing; it became the de-facto standard because it was the first free DVCS (and DVCSs solve lots of problems that were common to old non-distributed VCSs), not because its UI is particularly well-designed.
It might be presumptuous of me, but maybe you're encountering the same issue I did/do: git as a tool doesn't seem mentally compatible with github/gitlab/etc.
Locally, git almost makes sense. I make a change, I commit it to the tree of changes. I can roll it back, or branch it off, or merge branches, or restore from a previous point, and so on.
It's when some kind of external repository gets involved where the model breaks down for me, especially if it's public/open source.
Someone wants to help me fix a bug? Cool! They got my code from GitHub and modified it. Now they want their changes to be merged into my version. They submit a pull request. A what? They want me to pull their code into mine? Why isn't it called a push request, where they request permission to push their changes into my maintained repository?
> Someone wants to help me fix a bug? Cool! They got my code from GitHub and modified it. Now they want their changes to be merged into my version. They submit a pull request. A what? They want me to pull their code into mine? Why isn't it called a push request, where they request permission to push their changes into my maintained repository?
If you haven't given them push privileges to your repository, they can't push anything to it. That's why they have to request you to do it... And then it isn't a push, now is it? They're asking you to get code from their repo; that's a pull.
The larger issue is of course that many people never use the git CLI with remote repositories: Far, far too many think that "GitHub is git." Dunno if that's what's happened in your case: You write, "They got my code from GitHub..." If they'd got it from a git repo on your server directly, their "pull request" would consist of an e-mail or something asking you to check out their code, and then you would do that in your repository by typing a command literally starting with "git pull ..."
Git's command names, and UX in general, are its worst area. It's definitely not just you.
Many professional software devs I work with still really have no idea how git works, they've just memorized the 3 or so commands they absolutely need and maybe how to recover when something out of the ordinary happens.
> Git is simple, but it takes a lot of experience to appreciate its simplicity. So don't beat yourself up, git is hard and it's okay to be lost.
Yet another example of how, as so often in programming, the "KISS principle" ("Keep It Simple, Stupid") is deceptive: Simple isn't (at least not always) the same as easy.
But yeah, with a more well-thought-out set of commands it would have been a lot easier for a lot of people. I think it's just simply (heh!) that it was released a tad too early, before anyone had thought more deeply about keeping the syntax orthogonal, and the world's been stuck with those choices (largely, choices not made in the first place) ever since, for backwards compatibility.
I find it odd to have `git status` shown only at the end and presented as a bonus. That would have been the first command I would speak of after `git init`.
I use it so often that I have an alias in my shell for it: `gis`.
It shows all the relevant information and generally suggests commands that would perform what you might want to do in the situation you are in (new untracked files, tracked files changes, staged changes that you may want to unstage, merging conflicts, rebase in progress, etc.). It really helps with Git discoverability, which is not negligible when you set yourself to present such a complex tool in two minutes!
What do we call the "--" part in a command like this:
git checkout HEAD -- <filename>
I find it hard to remember things unless I understand their purpose. It looks like it's specifying an argument but without an argument name (e.g. --verbose), unless it's similar to a pipe | symbol and <filename> is being passed to the checkout command as some special kind of argument?
> I wish they made a breaking change in the next version and make the CLI actually usable.
As I understand it, they're attempting to make it "actually usable" by adding more consistent (and maybe even "intuitive") options to the syntax. In order to avoid a breaking change, though, they're also leaving the old confusing stuff in. Sure, it's regrettable that it's there in the first place, but understandable that they don't want to break people's scripts from, by now, decades back.
Sure, that makes it a little harder to learn, but it can be done, by just making a conscious decision to ignore those crufty old bits.
They shouldn't break the existing `git` command as it is widely used in scripts. But there are projects that try to build better UIs on top of git; the most serious such attempt that I know of is magit.
While acknowledging that git's CLI is often unintuitive: `git tag` lists tags, `git branch` lists branches, and `git remote` lists remotes, so I don't think I understand this particular objection.
Compare it with `git branch -a` and `git remote -v`. That's my point. Not only are they all different flags but you'll get half the data you could be getting and not know why. It's impossible to google.
Your point is what, that these are needlessly asymmetric? It's true. But they're in my head because I do them every day, and it's not like I'm suffering under the burden of remembering a handful of flags. That's a pretty far cry from "not actually usable", so maybe your hyperbole is a little misplaced?
Correct. That was a simple example that's easy to understand.
Anyone with half a brain can understand that if they can't keep something as simple as printing a list consistent you better believe nothing else will be straightforward. Which is my point. NOTHING is straightforward and I haven't met a single person who likes the CLI if they do anything more complex than a commit, push and pull. I know people who still refuse to use rebase and don't understand bisect or blame. They use a GUI to restore files.
It forces git to interpret what comes after as paths. Normally this is not needed, but occasionally you get a filename that causes ambiguity.
For instance, "git checkout test.c" will check out the file test.c unless you happen to have a branch called "test.c". (See also people complaining that checkout is overloaded.)
Similarly, "git add --verbose test.c" will add test.c, and log that to stdout. "git add -- --verbose test.c" will add test.c and --verbose, where "--verbose" is the name of a file.
This isn't git-specific; most CLI tools use "--" to stop option parsing for the arguments that follow.
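The same convention shows up in Python's standard `argparse` module, which makes for a quick way to see it in action. To be clear, this is not git's own parser; the `add` program name and its arguments are made up to mirror the example above.

```python
import argparse

# A stand-in for "git add": one boolean flag plus one or more paths.
parser = argparse.ArgumentParser(prog="add")
parser.add_argument("--verbose", action="store_true")
parser.add_argument("paths", nargs="+")

# Without the separator, "--verbose" is parsed as the flag.
with_flag = parser.parse_args(["--verbose", "test.c"])

# After "--", everything is treated as a positional argument,
# so "--verbose" becomes a path (a file name).
after_separator = parser.parse_args(["--", "--verbose", "test.c"])
```

Here `with_flag.paths` is `["test.c"]` with `verbose` set, while `after_separator.paths` is `["--verbose", "test.c"]` with `verbose` unset.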
We need a copilot for the command line. For the bazillion times I have to look up stack-overflow for doing anything unusual with git, like undoing things etc. And not just git. Maybe some day there will be an oh-my-zsh plugin for copilot.
This reminds me — what was that human readable “man” tool again? I think I had come across a homebrew package back in the day, that let you read command docs in a sane language, but lost track of it.
git's the kind of tool where understanding how your commands operate on the underlying data structures will probably make it a lot easier to use efficiently. (And they're beautifully simple despite how powerful and flexible they are)
As a tool you might be using for hours per week for a few more decades, it's worth the investment going beyond the "in x minutes". I tend to provide both to juniors.
https://git-scm.com/docs/git-switch
https://git-scm.com/docs/git-restore