Today I Learned Git Part 3

This is part 3 of my series of posts about learning Git. Most of the sections here come from various TILs I've found; I hand-picked the ones I found useful.

Clean Out All Local Branches

Sometimes a project can get to a point where there are so many local branches that deleting them one by one is too tedious. This one-liner can help:

git branch --merged master | grep -v master | xargs git branch -d

This won't delete branches that are unmerged, which saves you from doing something stupid but can be annoying if you know what you are doing. Run it from master, since git marks the current branch with an asterisk and refuses to delete it. If you are sure you want to wipe everything, drop the --merged filter and use -D like so:

git branch | grep -v master | xargs git branch -D
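If you'd like to see the safe variant in action before pointing it at a real project, here is a minimal sketch that runs it in a throwaway repository. The branch names (merged-feature, unmerged-feature) and the temp directory are made up for the demo, and I've swapped in --format plus grep -vx so that only the exact name master is skipped, which is my own tweak rather than part of the original one-liner.

```shell
set -eu

# Throwaway repository so nothing real gets deleted (hypothetical demo setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.email "demo@example.com"
git config user.name "Demo"
git commit -q --allow-empty -m "initial"

git branch merged-feature                 # same commit as master, so it counts as merged
git checkout -q -b unmerged-feature
git commit -q --allow-empty -m "work in progress"
git checkout -q master

# --format prints bare branch names (no "* " marker); grep -vx skips only an exact "master".
git branch --merged master --format='%(refname:short)' | grep -vx master | xargs git branch -d

remaining=$(git branch --format='%(refname:short)')
echo "$remaining"
```

The unmerged branch survives because --merged never lists it in the first place.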

Clean Up Old Remote Tracking References

After working on a Git-versioned project for a while, you may find that there are a bunch of references to remote branches in your local repository. You know those branches definitely don't exist on the remote server and you've removed the local branches, but you still have references to them lying around. You can reconcile this discrepancy with one command:

git fetch origin --prune

This will prune all those stale remote-tracking references, which is sure to clean up your git log (git log --graph).
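Here is a rough sketch of what pruning does, using a local bare repository as a stand-in for the remote server. Everything in it (the temp directory, the topic branch, deleting the ref directly in the bare repository to simulate a teammate) is an assumption for the sake of the demo.

```shell
set -eu

work=$(mktemp -d)
git init -q --bare "$work/origin.git"      # plays the role of the remote server
git init -q "$work/clone"
cd "$work/clone"
git remote add origin "$work/origin.git"
git config user.email "demo@example.com"
git config user.name "Demo"
git commit -q --allow-empty -m "initial"

# Publish two branches; pushing also records origin/master and origin/topic locally.
git push -q origin HEAD:refs/heads/master HEAD:refs/heads/topic

# Simulate someone else deleting the branch on the server.
git -C "$work/origin.git" update-ref -d refs/heads/topic

before=$(git branch -r --format='%(refname:short)')
git fetch -q origin --prune
after=$(git branch -r --format='%(refname:short)')

echo "before: $before"
echo "after:  $after"
```

If you want this on every fetch, git config fetch.prune true makes pruning the default, and git remote prune origin --dry-run previews what would go.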

Delete All Untracked Files

Git provides a command explicitly intended for cleaning up (read: removing) untracked files from a local copy of a repository.

git-clean - Remove untracked files from the working tree

Git does want you to be explicit though and requires you to use the -f flag to force it (unless otherwise configured).

Git also gives you fine-grained control with this command: by default it only deletes untracked files and does not recurse into untracked directories. If you want directories of untracked files to be removed as well, you'll need the -d flag.

So if you have a local repository full of untracked files you'd like to get rid of, just:

git clean -f -d

or just:

git clean -fd
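To make the difference between -f and -fd concrete, here is a small sketch in a scratch repository. The file and directory names (stray.txt, stuff/) are invented for the demo.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q

touch stray.txt            # untracked file at the top level
mkdir stuff
touch stuff/inner.txt      # untracked file inside an untracked directory

git clean -f -q            # removes stray.txt, leaves the untracked directory alone
after_f=$(ls)

git clean -fd -q           # -d lets git remove untracked directories too
after_fd=$(ls)

echo "after -f:  ${after_f:-<empty>}"
echo "after -fd: ${after_fd:-<empty>}"
```

Ignored files are left alone in both cases; as far as I know you need the -x flag to include them.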

Delete Remote Git Tags

Tagging releases with Git is a good idea. In case your tags get off track, here is how you delete a Git tag locally and on a remote:

git tag -d abc
git push origin :refs/tags/abc
To git@github.com:hashrocket/hr-til
 - [deleted]         abc

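The same approach works if you're using Semantic Versioning, which includes dots in the tag name. For v16.0.0 you can use the fully qualified refs/tags/ form as above, or the shorter form below, which works as long as no branch shares the tag's name: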

git tag -d v16.0.0
git push origin :v16.0.0
To git@github.com:hashrocket/hr-til
 - [deleted]         v16.0.0
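As a quick sanity check, this sketch deletes a dotted tag from a local bare repository standing in for the remote. In my understanding the fully qualified refs/tags/ refspec handles dotted names fine, and spelling it out avoids any ambiguity with a branch of the same name. The temp directory and commit message are made up for the demo.

```shell
set -eu

work=$(mktemp -d)
git init -q --bare "$work/origin.git"      # stand-in for the remote
git init -q "$work/clone"
cd "$work/clone"
git remote add origin "$work/origin.git"
git config user.email "demo@example.com"
git config user.name "Demo"
git commit -q --allow-empty -m "release"

git tag v16.0.0
git push -q origin v16.0.0
tags_before=$(git ls-remote --tags origin)

git tag -d v16.0.0                         # delete locally
git push -q origin :refs/tags/v16.0.0      # delete on the remote
tags_after=$(git ls-remote --tags origin)

echo "before: $tags_before"
echo "after:  ${tags_after:-<none>}"
```

git push origin --delete v16.0.0 is another way to spell the remote deletion.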

Determine the Hash Id for a Blob

Git's hash-object command can be used to determine what hash id will be used by git when creating a blob in its internal file system.

echo 'Hello, world!' > hola
git hash-object hola
af5626b4a114abcb82d63db7c8082c3c4756e51b

When a file is staged for a commit, git generates this digest (hash id) based on the contents of the file. The name and location of the file don't matter, just the contents. This is the magic of git. Anytime git needs to store a blob, it can quickly match against the hash id in order to avoid storing duplicate blobs.

Try it on your own machine. You don't even need to initialize a git repository to use git hash-object.
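If you're curious where the digest comes from, here is a small sketch that rebuilds it by hand. It assumes git's default SHA-1 object format, where the hash covers a short header ("blob", the size in bytes, and a NUL byte) followed by the file contents, and it assumes sha1sum from GNU coreutils is available (on macOS, shasum should do the same job).

```shell
set -eu

content='Hello, world!'

# Size in bytes of the content plus the trailing newline that echo adds.
size=$(printf '%s\n' "$content" | wc -c | tr -d ' ')

# Hash "blob <size>\0<content>" ourselves.
manual=$( { printf 'blob %s\000' "$size"; printf '%s\n' "$content"; } | sha1sum | cut -d' ' -f1 )

# Ask git for the same thing; --stdin means no file is needed.
via_git=$(printf '%s\n' "$content" | git hash-object --stdin)

echo "manual: $manual"
echo "git:    $via_git"
```

Both lines should print the same id as the hola example above, since the file name never enters the calculation.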

Diffing with Patience

The default diff algorithm used by Git is pretty good, but it can get misled by larger, complex changesets. The result is a noisier, misaligned diff output.

If you'd like a diff that is generally a bit cleaner and can afford a little slowdown (you probably can), you can instead use the patience algorithm, which is described as such:

Patience Diff, instead, focuses its energy on the low-frequency high-content lines which serve as markers or signatures of important content in the text. It is still an LCS-based diff at its core, but with an important difference, as it only considers the longest common subsequence of the signature lines:

Find all lines which occur exactly once on both sides, then do longest common subsequence on those lines, matching them up.

You can set this as the default algorithm by adding the following lines to your ~/.gitconfig file:

[diff]
    algorithm = patience

or it can be set from the command line with:

git config --global diff.algorithm patience
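If you'd rather not change your global configuration, a rough sketch of two narrower options follows: setting the algorithm for a single repository (leave off --global) and requesting it for a single diff with --diff-algorithm. The scratch repository and notes.txt are invented for the demo.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q

# Repository-local setting: only affects diffs in this repo.
git config diff.algorithm patience
configured=$(git config --get diff.algorithm)

printf 'one\ntwo\n' > notes.txt
git add notes.txt
printf 'one\n2\n' > notes.txt      # change the file after staging it

# One-off: pick the algorithm for this invocation only.
one_off=$(git diff --diff-algorithm=patience -- notes.txt)

echo "diff.algorithm = $configured"
echo "$one_off"
```

git diff --patience is a shorthand for the same one-off flag, and --diff-algorithm=myers gets you back to the default for a single run.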

Dry Runs in Git

There are a few commands in git that allow you to do a dry run. That is, git will tell you the effects of executing a command without actually executing that command.

For instance, if you are clearing out untracked files, you can double check what files are going to be deleted with the dry run flag, like so:

git clean -fd --dry-run
Would remove tmp.txt
Would remove stuff/

Similarly, if you want to check which files are going to be incorporated into a commit, you can:

git commit --dry-run --short
M  README.md
A  new_file.rb

Try running git commit --dry-run (that is, without the --short flag). Look familiar? That is the same output you are getting from git status.
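Here is a minimal sketch confirming that a dry run really leaves things untouched, using git add --dry-run and git clean -n (the short form of --dry-run) in a scratch repository. The file name tmp.txt is borrowed from the example above; the rest of the setup is assumed for the demo.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
touch tmp.txt

would_add=$(git add --dry-run tmp.txt)   # reports what would be staged
staged=$(git ls-files)                   # the index is still empty afterwards

would_remove=$(git clean -n)             # reports what would be deleted
still_there=$(ls)                        # the file is still on disk afterwards

echo "$would_add"
echo "$would_remove"
echo "staged: ${staged:-<nothing>}"
echo "on disk: $still_there"
```

A few other commands, such as git rm, git mv, git push and git fetch, accept a dry-run flag as well.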