Modern Version Control With Git, Part 3

In this third and final look at the Git source control system, I will introduce some more advanced concepts and show you some tricks employed by experienced Git users.

As with most things, and as anyone who has worked with Git for a while knows, there’s more than one way to skin a cat. A lot of tasks can be performed with a couple of basic commands. However, a few advanced concepts and tricks will sometimes help you achieve your goals more elegantly.

Advanced Concepts

Stash: The Clipboard

The “stash” is one such feature. In some situations, such as when switching the current branch, a “clean” working directory is recommended (or even essential). That means there shouldn’t be any local changes.

Imagine the following scenario. You’ve been working on a new feature for several hours, when suddenly a critical bug report comes in. Of course, you’ve already changed a couple of files. But now you must switch branches to be able to work on your current production code. You could simply commit your changes — but they’re only half done (and committing stuff that is only half done is bad karma!).

The stash helps you solve precisely this dilemma. All current changes are saved on this clipboard, and your working directory is left in a clean condition. As soon as you’re done fixing that bug, you can return to working on your feature — and simply restore all stashed changes.
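As a rough sketch of that workflow, the script below stashes half-done work, fixes a bug on another branch, and then restores the stashed changes. The temporary repository, the file names and the hotfix branch are made up for illustration.

```shell
set -eu
# Throwaway repository (hypothetical names throughout)
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name for this demo
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "stable code" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Half-finished feature work sitting in the working directory
echo "half-done feature" >> app.txt

# Save the changes on the stash; the working directory is clean again
git stash push -q -m "WIP: new feature"
status_after_stash=$(git status --porcelain)

# Fix the urgent bug on another branch, then come back
git checkout -q -b hotfix
echo "bug fix" > fix.txt
git add fix.txt
git commit -q -m "Fix critical bug"
git checkout -q master

# Restore the stashed changes and remove them from the stash
git stash pop -q
last_line=$(tail -n 1 app.txt)
```

If you want to keep the entry on the stash after restoring it, git stash apply does the same as pop without removing it.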

Staging Parts Of Files

A large commit that mixes a lot of different topics is hard for other developers to understand, and it is hard to roll back if problems occur. That’s why creating granular commits that contain only related changes is so important in version control.

Git helps you do this by enabling you to add parts of a changed file to the staging area. If you execute git add with the -p parameter, Git lets you choose for every part of the file whether you want to stage it or not. This way, you can control very precisely which changes should go into your next commit — and which should remain for a later commit.

Tracking Branches

If you’ve already glanced at the configuration file of one of your local Git repositories (.git/config), you might have spotted one of these sections:

(Screenshot: a [branch "master"] section in .git/config)

Git saves some metadata about the relationship between two branches; in this case, our local “master” branch tracks the same-named branch on the remote “origin.” This metadata is used by a couple of commands in Git, such as push, pull and status.

In general, though, you don’t have to worry about managing all of this metadata. If you create a new local branch based on a remote branch, Git will set up the tracking relationship for you.
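To see the tracking data for yourself, you could try something like the following sketch. It uses a local directory named server as a stand-in for a real remote; that name, the client clone and the feature branch are assumptions made for the example.

```shell
set -eu
work=$(mktemp -d) && cd "$work"

# Stand-in for a remote: a local repository with two branches
git init -q server
cd server
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Initial commit"
git branch feature
cd ..

# Cloning sets up tracking for the checked-out branch
git clone -q server client
cd client
master_remote=$(git config --get branch.master.remote)
master_merge=$(git config --get branch.master.merge)

# A new local branch based on a remote branch gets tracking automatically
git checkout -q -b feature origin/feature
feature_merge=$(git config --get branch.feature.merge)

# Print the branch entries that are stored in .git/config
git config --get-regexp '^branch\.'
```

The last command lists the same remote and merge values that you would find in the branch sections of .git/config.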

Undoing Things

Most mistakes that you make in Git can be corrected pretty easily.

Let’s take a simple case. You have mistyped your last commit message and now want to correct this typo. Git offers an --amend parameter for its commit command. This will overwrite the last commit and make it look as if your little mistake never happened. Amending also allows you to change the set of committed files by adding and removing items to and from the commit. But remember one golden rule: don’t amend commits that you’ve already pushed to a remote!
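A minimal sketch of amending, using an invented repository: first the commit message is corrected, then a forgotten file is added to the same commit.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "v1" > file.txt
git add file.txt
git commit -q -m "Add fiel"          # oops, a typo in the message

# Overwrite the last commit with a corrected message
git commit -q --amend -m "Add file"

# Add a forgotten file to the same commit, keeping the message
echo "notes" > notes.txt
git add notes.txt
git commit -q --amend --no-edit

commit_count=$(git rev-list --count HEAD)   # still a single commit
message=$(git log -1 --format=%s)
files=$(git ls-tree --name-only HEAD)
```

Each amend replaces the commit with a new one, which is exactly why it should stay a local-only operation.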

The revert command lets you “take back” a commit (and this time it doesn’t have to be your most recent commit). Reverting, however, will not delete any commits. Quite the opposite: a new commit will be created that reverses the effects of the corresponding commit.
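The following sketch takes back the middle one of three commits. File names and messages are hypothetical; each commit touches its own file so that the revert applies without conflicts.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "a" > a.txt && git add a.txt && git commit -q -m "Add a"
echo "b" > b.txt && git add b.txt && git commit -q -m "Add b"
bad=$(git rev-parse HEAD)            # the commit we want to take back
echo "c" > c.txt && git add c.txt && git commit -q -m "Add c"

# Create a new commit that reverses the effects of "Add b"
git revert --no-edit "$bad" > /dev/null

commit_count=$(git rev-list --count HEAD)   # three originals plus the revert
```

The original commit is still part of the history; only its changes have been undone by the new commit.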

The reset command is useful if you truly regret your most recent commit(s). It takes advantage of the fact that branches are really nothing more than pointers to a certain commit. This command rolls the pointer back to an older commit. In fact, reset will not even delete any commits; but your project’s history will look like it has done exactly that.
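As a sketch, here is a reset that rolls the branch pointer back by two commits. I am using the --hard option here, which also discards the corresponding changes in the working directory, so treat it with care; the repository and file names are invented.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "a" > a.txt && git add a.txt && git commit -q -m "Add a"
echo "b" > b.txt && git add b.txt && git commit -q -m "Add b"
echo "c" > c.txt && git add c.txt && git commit -q -m "Add c"
tip_before=$(git rev-parse HEAD)

# Move the branch pointer back to the first commit
git reset -q --hard HEAD~2

commit_count=$(git rev-list --count HEAD)
# The "removed" commit object still exists in the repository
old_tip_type=$(git cat-file -t "$tip_before")
```

Because the old commits are not deleted right away, they can usually still be found through git reflog if you change your mind.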

Integrating Selected Commits

Usually, you would integrate changes into a branch by merging with another one. In those rare cases where a merge is undesired, Git offers an alternative with the cherry-pick command. Instead of integrating a complete branch (as a merge does), cherry-pick allows you to integrate any desired commit. You can even integrate multiple selected commits in one go, but remember to start with the oldest one to avoid problems.
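A short sketch of picking two out of three commits from another branch, oldest first. The branch and file names are made up, and each commit touches a separate file so that no conflicts arise.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > base.txt && git add base.txt && git commit -q -m "Base"

# Three commits on a separate branch
git checkout -q -b feature
echo "x" > x.txt && git add x.txt && git commit -q -m "Add x"
first=$(git rev-parse HEAD)
echo "y" > y.txt && git add y.txt && git commit -q -m "Add y"
echo "z" > z.txt && git add z.txt && git commit -q -m "Add z"
third=$(git rev-parse HEAD)

# Integrate only the first and third commit into master, oldest first
git checkout -q master
git cherry-pick "$first" "$third" > /dev/null

commit_count=$(git rev-list --count HEAD)
```

Master now contains x.txt and z.txt, while y.txt stays on the feature branch only.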

GUI applications like Tower make tasks like these a lot easier by allowing you to simply drag and drop the desired commits.

Rebase Instead Of Merge

The most common method to integrate one branch into another is to perform a “merge.” For an ordinary three-way merge, Git takes the endpoints of the branches to be merged and the common parent commit as the basis for the integration. This results in a so-called “merge commit” that connects both branches like a knot.

A “rebase” is an alternative to an ordinary merge. A rebase does not result in a separate merge commit and therefore produces no “bumps” in your project’s history. It will look as if the history has run linearly and all commits have happened on the same branch.

Let’s look at a concrete scenario to better understand what a rebase does:

(Figure: branch_A and branch_B have diverged after their last common commit, C1.)

Here, branch_B is our current HEAD branch. If we execute git rebase branch_A, the following things will happen. First, all new commits on branch_B (C2 and C4) that originated after the last common commit (C1) will be temporarily removed. Next, branch_A’s new commits will be applied on branch_B. This means that both branches are now at the same position: branch_A’s position.

Right at the beginning, branch_B’s new commits were removed temporarily. Now is the time to reapply them: one after the other, and in their original order.

In the end, by rebasing the branches, no merge commit was necessary, and the history has remained linear.
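The scenario can be reproduced with a sketch like the one below. I am assuming that branch_A has a single new commit, called C3 here; the file names are invented, and every commit touches its own file so that the rebase runs without conflicts.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# C1: the last common commit of both branches
echo "1" > c1.txt && git add c1.txt && git commit -q -m "C1"
git branch branch_A
git checkout -q -b branch_B

echo "2" > c2.txt && git add c2.txt && git commit -q -m "C2"   # on branch_B
old_c2=$(git rev-parse HEAD)

git checkout -q branch_A
echo "3" > c3.txt && git add c3.txt && git commit -q -m "C3"   # on branch_A

git checkout -q branch_B
echo "4" > c4.txt && git add c4.txt && git commit -q -m "C4"   # on branch_B

# Reapply branch_B's commits on top of branch_A
git rebase -q branch_A

history=$(git log --reverse --format=%s | paste -sd' ' -)
new_c2=$(git rev-parse HEAD~1)
merge_commits=$(git rev-list --merges --count HEAD)
```

The history of branch_B now reads C1, C3, C2, C4 without any merge commit, and the reapplied C2 has a different hash than the original one.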

Rewriting History

A handful of commands in Git will change a project’s history. In addition to amending a commit, rebase also falls into this category.

Note that the reapplied commits in our sample scenario aren’t completely identical to the original commits (which is why they’re named C2′ and C4′). Their SHA-1 hashes have changed because we rewrote history with the rebase.

Always follow this golden rule when using commands that change your history: local commits that haven’t been published can safely be changed by using rebase or amend. However, if they’ve already been pushed to a remote repository, you should not use these tools anymore. Your teammates will thank you for it.

Housekeeping In Git

A Git repository can accumulate quite a number of objects over its lifetime, be they commits, files or file trees. Organizing these objects optimally is crucial to keeping Git fast. The git gc command (where gc stands for “garbage collect”) was made for just this purpose. Although it’s executed automatically in the background when running certain commands, running gc manually from time to time is still a good idea. The --aggressive parameter optimizes the repository more thoroughly, but it takes considerably longer, so reserve it for occasional use.
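To get a feeling for what gc does, you could run a sketch like this one in a throwaway repository (all names are invented). It counts the loose objects before and after collecting.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Every commit adds loose objects: a commit, a tree and a file
for i in 1 2 3 4 5; do
  echo "revision $i" > file.txt
  git add file.txt
  git commit -q -m "Revision $i"
done

loose_before=$(git count-objects | cut -d' ' -f1)

# Pack the loose objects and clean up
git gc -q

loose_after=$(git count-objects | cut -d' ' -f1)
packs=$(git count-objects -v | awk '/^packs:/ {print $2}')
```

In this small example, all loose objects end up in a pack file, which is what keeps larger repositories compact and fast.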

Desktop Tools And External Code Hosting

Tools

If you spend a lot of time in the command line, you can spice things up by using plug-ins. Nice little helpers include Tab auto-completion for branch names, and directly displaying the current branch in your shell’s prompt. Plug-ins are available for Bash and zsh.

As alternatives to the command line, various desktop clients might be worth a look. A lot of tasks can be performed more easily and comfortably using a graphical user interface, if only to avoid having to memorize all of the commands and parameters. Windows users might want to look at Tortoise Git, while Mac OS users can try Tower (disclaimer: this is the author’s product).

Repository Hosting

More and more companies and individual developers are opting not to host their own code anymore. Providing and maintaining expensive server infrastructure is not everyone’s cup of tea: special know-how is necessary, resources are tied up, and a high degree of security and availability must be guaranteed.

Meanwhile, some companies are already offering code hosting as “software as a service.” Some of the most popular ones currently are GitHub, Beanstalk and Codebase. If hosting your code externally is an option, then check out one of these services.

Conclusion

Git is an extremely versatile tool. In its early days, it consisted of more than 140 binaries that could be combined flexibly with one another. While using Git has become much easier with recent versions, it has managed to maintain this flexibility. As a result, Git can be viewed as a toolset for creating your very own version control workflow.

But the advanced tools are not all that make Git interesting. Its extraordinary speed and unique branching concept, for example, are great to have, too.


Tobias Günther is CEO and frontend developer at fournova. With his team he develops the Git desktop client Tower for Mac OS X.

  1. I never thought I would find such an everyday topic so enthralling!
  2. Good reading, thanks! Cleared up some advanced stuff for me :)
     I would like to suggest one more repository hosting service you didn’t mention: bitbucket.org. It was originally for Mercurial, but now supports Git as well. The best thing about Bitbucket is that you can have private repos even on a free account. That is why I use it instead of GitHub for personal projects.
  3. By the way, I just verified that Git does not allow one to delete all branches from a *local* repo. Once you’ve committed the first commit to the initial branch (default is ‘master’), it seems that you cannot delete this branch. I suppose this makes sense because it would be very dangerous to allow a user to delete the entire main history of a repo. It follows from this reasoning that Git should also disallow the deletion of all branches in a remote repo, but I haven’t yet verified that this is true.
  4. I appreciate your three part explanations to help demystify git. Tower looks sweet too!
