Using git

This section documents some best practice when using the git source code management tool.

Good sources of documentation for git include:

Concepts

This section very briefly lists the main concepts in git. There are gory details in the git book. There is also a terminology section later in this document if any of the terms used here are unfamiliar.

Pushing and pulling

Pushing a branch to a remote consists of two stages:

  1. The remote is sent the branch HEAD commit and any parent commits it doesn't already have.
  2. The remote updates the remote branch to point to the new HEAD.

This only happens if the remote branch's original HEAD is an ancestor of the local branch's HEAD, that is, if it can be reached from the local HEAD by following parent links. Otherwise the push fails.

Pulling a branch is the opposite of a push. The steps are identical but the roles of the remote and local repository are reversed.
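The push rule above can be tried out safely in throwaway repositories. The following is a minimal sketch rather than a recommended workflow: the directory names (remote.git, alice, bob), the helper functions and the commit messages are all assumptions made up for illustration, and a local bare repository stands in for the remote.

```shell
set -eu
work="$(mktemp -d)"

# Hypothetical helpers for this sketch.
identity() {
    git -C "$1" config user.name "Example User"
    git -C "$1" config user.email "user@example.com"
}
commit_file() {  # usage: commit_file <repo> <file> <message>
    echo "$3" > "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" commit -q -m "$3"
}

# A bare repository stands in for the remote. Its HEAD is pointed at
# master so the sketch does not depend on local git defaults.
git init -q --bare "$work/remote.git"
git -C "$work/remote.git" symbolic-ref HEAD refs/heads/master

git init -q "$work/alice"
git -C "$work/alice" symbolic-ref HEAD refs/heads/master
identity "$work/alice"
git -C "$work/alice" remote add origin "$work/remote.git"
commit_file "$work/alice" base.txt "first"
git -C "$work/alice" push -q origin master

git clone -q "$work/remote.git" "$work/bob"
identity "$work/bob"

# Alice's new commit has the remote HEAD as its parent, so her push succeeds.
commit_file "$work/alice" alice.txt "alice: second"
git -C "$work/alice" push -q origin master

# Bob commits on top of the old HEAD. The remote HEAD is no longer an
# ancestor of his branch, so his push is rejected.
commit_file "$work/bob" bob.txt "bob: second"
if git -C "$work/bob" push -q origin master 2>/dev/null; then
    push_result="accepted"
else
    push_result="rejected"
fi
echo "Bob's first push was $push_result"

# After pulling (here with --rebase), Bob's branch contains the remote HEAD
# and the push goes through.
git -C "$work/bob" pull -q --rebase origin master
git -C "$work/bob" push -q origin master
```

The sketch uses separate files for each commit so that the pull cannot produce a merge conflict.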

GitHub != git

It is tempting to conflate GitHub with git. GitHub is a service provider which offers git hosting and a set of related "software forge" functionality such as issue tracking, project management, code review, etc. GitHub may be the 800lb gorilla of the git world but there are many other large apes out there.

Git was originally designed to be decentralised. It is quite possible to use git without GitHub and even without a working Internet connection.

Feature branches

Feature branches are branches being used to develop an individual story. If you are stuck for a name, "issue-{number}-{summary}" is a good choice where "{number}" is the issue number of the story you are implementing and "{summary}" is a brief summary of the story formatted-like-this.
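As an illustration of the naming scheme, the sketch below builds a branch name from an issue number and a title. The branch_name helper is hypothetical, as are the issue number and title passed to it.

```shell
set -eu

# Hypothetical helper: build "issue-{number}-{summary}" from a number and a title.
branch_name() {
    number="$1"
    # Lower-case the title, turn each run of non-alphanumeric characters into
    # a single hyphen, then trim any leading or trailing hyphen.
    summary="$(printf '%s' "$2" \
        | tr '[:upper:]' '[:lower:]' \
        | tr -cs 'a-z0-9' '-' \
        | sed -e 's/^-//' -e 's/-$//')"
    printf 'issue-%s-%s\n' "$number" "$summary"
}

name="$(branch_name 1234 "Make channel field non-NULL")"
echo "$name"
```

The resulting name could then be passed to git checkout -b to create the feature branch.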

Structuring the branch

Tell a story with your branch: each commit should implement one step towards implementing the story. It is kind to a reviewer to allow your pull requests to be reviewed commit-wise so try to keep commits small, on topic and self-contained.

Commits

Each commit message should start with a single line summarising the change. If the repository has logical "sections", start your commit message with the section and a colon. For example, a Django webapp is usually composed of several applications. Using the application name as a section is a good idea. The single-line summary of a commit is, by convention, rather short. Keeping it around 50 characters is a good rule of thumb given in the git documentation.

The commit message proper should explain how the commit implements what it implements. It may also explain how the commit fits into the overall progression of the story.

If the commit closes an issue, use a GitHub keyword in the commit. If only the pull request as a whole closes the issue, use the keyword in the pull-request message.

An example message:

mediaplatform: MediaItem: make channel field non-NULL

Make the "channel" field of mediaplatform.models.MediaItem non-NULL. This
enforces that all media items will have an associated channel in future which is
required by #1234.

Add a pre-migration hook which assigns all media items which currently have a
"NULL" channel to an "orphan" channel. If there are no items with a NULL channel
the orphan channel is not created.

The orphan channel has blank edit and view permissions and as such will only be
available to admins.

Closes #1234
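One way to see how git splits a message like the one above into a summary line and a body is to create it in a throwaway repository and read it back. This is only a sketch: the repository, the identity and the shortened body are assumptions for illustration.

```shell
set -eu
repo="$(mktemp -d)"
git init -q "$repo"
git -C "$repo" config user.name "Example User"
git -C "$repo" config user.email "user@example.com"

# Each -m becomes its own paragraph: the first is the summary line,
# the rest form the body.
git -C "$repo" commit -q --allow-empty \
    -m 'mediaplatform: MediaItem: make channel field non-NULL' \
    -m 'Make the "channel" field of mediaplatform.models.MediaItem non-NULL.' \
    -m 'Closes #1234'

# %s is the summary line, %b the body.
summary="$(git -C "$repo" log -1 --format=%s)"
body="$(git -C "$repo" log -1 --format=%b)"
echo "summary (${#summary} characters): $summary"

# A loose check on the conventions described above: a "section:" prefix
# and a summary that stays reasonably short.
case "$summary" in
    *": "*) echo "summary has a section prefix" ;;
    *)      echo "summary has no section prefix" ;;
esac
```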

Some more resources on git commit messages:

Rebasing

Git rebasing is an invaluable tool to help you structure your branch to be easy to review. When developing, the rule is "commit early, commit often". After you have finished implementing the feature, you may re-order, combine and re-word commits using the git rebase tool.

Danger

Once you have shared your branch with others via a pull request, do not rebase as it makes pulling your changes harder.

Terminology

This section briefly describes some of the terminology around git. It's not intended to be exhaustive.

A blob is a set of bytes. It has a name which is the SHA1 hash of its contents.

A tree is a set of blobs in a directory/filename hierarchy. It has a name which is the SHA1 hash of its contents.

A commit is a message describing a tree, the name of a tree and a list of names of "parent" commits. It has a name which is the SHA1 hash of its contents. Recursively following parent links from a commit yields the set of "ancestor" commits.

A branch is an alias for a commit. Its content is the name of the commit it references. Its name is human-readable. Unlike commits, blobs or trees, a branch's name can stay the same even if its content changes. The commit pointed to by a branch is called the HEAD of the branch.

The master branch is the branch which we agree as a team reflects the current state of the product. Beyond this convention and the fact that it is the default checked out branch when a repository is cloned there is nothing special about the master branch*.

A remote is an alias for a remote git repository. It maps a human readable name to a location. For example, "origin"→"git@github.com:uisautomation/guidebook".
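The terms above can be explored with git's own inspection commands. The following sketch assumes a throwaway repository containing a single hypothetical file, greeting.txt, and uses git rev-parse, git cat-file and git hash-object to look at the objects involved.

```shell
set -eu
repo="$(mktemp -d)"
git init -q "$repo"
git -C "$repo" config user.name "Example User"
git -C "$repo" config user.email "user@example.com"

echo "hello" > "$repo/greeting.txt"
git -C "$repo" add greeting.txt
git -C "$repo" commit -q -m "Add greeting"

# Names of the commit, the tree it describes and the blob inside that tree.
commit="$(git -C "$repo" rev-parse HEAD)"
tree="$(git -C "$repo" rev-parse 'HEAD^{tree}')"
blob="$(git -C "$repo" rev-parse 'HEAD:greeting.txt')"

# cat-file -t reports the type of each object.
git -C "$repo" cat-file -t "$commit"
git -C "$repo" cat-file -t "$tree"
git -C "$repo" cat-file -t "$blob"

# A blob's name depends only on its contents, so hashing the same bytes
# again yields the same name.
rehashed="$(echo "hello" | git -C "$repo" hash-object --stdin)"

# A branch is a human-readable alias which resolves to a commit name.
branch="$(git -C "$repo" symbolic-ref --short HEAD)"
echo "$branch -> $(git -C "$repo" rev-parse "$branch")"
```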

Common git actions

The following actions may frequently be required while working with git. Simple cases are provided here for convenience but the git documentation or knowledgeable co-workers should be consulted in complex situations.

If you're worried about losing changes while performing any of the following then you can make a backup of a branch beforehand.

If you are on the branch to be backed up, create a backup branch (here called my-backup) with:

$ git branch my-backup

Or, to copy another branch, for example my-feature, to my-backup:

$ git branch -c my-feature my-backup

In both cases, you'll remain on your current branch.
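A backup is only useful if it can be restored. The sketch below, run in a throwaway repository, makes both kinds of backup and then restores the branch from one of them with git reset --hard. The file names, the my-backup-copy branch name and the simulated mishap are assumptions for illustration.

```shell
set -eu
repo="$(mktemp -d)"
git init -q "$repo"
git -C "$repo" config user.name "Example User"
git -C "$repo" config user.email "user@example.com"

echo "one" > "$repo/one.txt"
git -C "$repo" add one.txt
git -C "$repo" commit -q -m "Initial commit"

git -C "$repo" checkout -q -b my-feature
echo "two" > "$repo/two.txt"
git -C "$repo" add two.txt
git -C "$repo" commit -q -m "Add feature work"

# Back up the current branch, then copy it by name.
git -C "$repo" branch my-backup
git -C "$repo" branch -c my-feature my-backup-copy

# The current branch is unchanged by either command.
current="$(git -C "$repo" symbolic-ref --short HEAD)"
echo "still on $current"

# Simulate a mishap that loses the latest commit...
git -C "$repo" reset -q --hard HEAD~1

# ...and restore the branch from the backup.
git -C "$repo" reset -q --hard my-backup
```

Note that git reset --hard discards uncommitted changes, so it suits restoring from a backup rather than day-to-day use.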

Forgot something for last commit?

You've discovered that you have a file or files that should have been added to the last commit you made. Simply stage the forgotten file(s) then recommit using the --amend argument.

$ git add path/forgotten_file.py
$ git commit --amend

You'll also have the opportunity to change the commit message, if so desired.
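The effect of amending can be checked in a throwaway repository. In this sketch the file names and contents are made up, and --no-edit is used to keep the existing message so that no editor is opened.

```shell
set -eu
repo="$(mktemp -d)"
git init -q "$repo"
git -C "$repo" config user.name "Example User"
git -C "$repo" config user.email "user@example.com"

printf 'print("hello")\n' > "$repo/main.py"
git -C "$repo" add main.py
git -C "$repo" commit -q -m "Add greeting script"

# The forgotten file is staged and folded into the previous commit.
printf 'GREETING = "hello"\n' > "$repo/forgotten_file.py"
git -C "$repo" add forgotten_file.py
git -C "$repo" commit -q --amend --no-edit

# There is still a single commit, and it now contains both files.
git -C "$repo" rev-list --count HEAD
git -C "$repo" ls-tree -r --name-only HEAD
```

Amending replaces the last commit with a new one, so the commit's name changes even though its message does not.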

Undo last commit

You didn't mean to make that last commit and want to undo the commit but keep the changes you made.

$ git reset --soft HEAD~1

Note: HEAD~1 (equivalent to HEAD~) references the commit one before the current HEAD. By specifying another number you can undo more than one commit.

Alternatively, if you really want to undo the last commit forgetting any changes made, as if it never happened, then switch the --soft for --hard. Warning: files and/or changes to files will be lost.

$ git reset --hard HEAD~1
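The difference between the two forms is easiest to see side by side. This sketch uses a throwaway repository with two hypothetical files; the commit_file helper is an assumption for illustration.

```shell
set -eu
repo="$(mktemp -d)"
git init -q "$repo"
git -C "$repo" config user.name "Example User"
git -C "$repo" config user.email "user@example.com"

commit_file() {  # usage: commit_file <file> <message>
    printf '%s\n' "$2" > "$repo/$1"
    git -C "$repo" add "$1"
    git -C "$repo" commit -q -m "$2"
}
commit_file first.txt "First commit"
commit_file second.txt "Second commit"

# --soft: the commit is undone but its changes remain staged.
git -C "$repo" reset -q --soft HEAD~1
staged_after_soft="$(git -C "$repo" diff --cached --name-only)"
echo "still staged after --soft: $staged_after_soft"

# Recreate the commit, then undo it with --hard: the file is removed too.
git -C "$repo" commit -q -m "Second commit"
git -C "$repo" reset -q --hard HEAD~1
if [ -e "$repo/second.txt" ]; then
    echo "second.txt survived"
else
    echo "second.txt is gone after --hard"
fi
```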

Edit previous commit message

There is a typo or another mistake in the message of a previous (not necessarily the latest) commit.

Firstly, you will need to work out using git log how many commits you need to go back and use that number as part of a HEAD~n reference.

For example, three commits back:

$ git rebase -i HEAD~3

This will show a list of the previous commits, each tagged with pick.

pick f4b35d1 Initial Implementation
pick 130564e Created READEM (typo)
pick f4a5a0f Added RESTful API

# Rebase 64fa23f..f4a5a0f onto 64fa23f
#
# Commands:
# p, pick <commit> = use commit
# r, reword <commit> = use commit, but edit the commit message
...

For the commit(s) that you want to change the message for, change the pick to reword (or just r) and save. You will then be prompted to change the commit message(s).
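The same steps can be rehearsed non-interactively in a throwaway repository before trying them on real work. In the sketch below the two editing steps are scripted through the GIT_SEQUENCE_EDITOR and GIT_EDITOR environment variables, with sed standing in for a person at an editor. The commit subjects, file names and sed expressions are assumptions for illustration.

```shell
set -eu
repo="$(mktemp -d)"
git init -q "$repo"
git -C "$repo" config user.name "Example User"
git -C "$repo" config user.email "user@example.com"

# Four commits: a base commit plus the three that will be listed by the rebase.
n=0
for subject in "Project skeleton" "Initial Implementation" "Created READEM" "Added RESTful API"; do
    n=$((n + 1))
    echo "$subject" > "$repo/file$n.txt"
    git -C "$repo" add "file$n.txt"
    git -C "$repo" commit -q -m "$subject"
done

# Stand-ins for the interactive steps:
#  - the sequence editor changes "pick" to "reword" on the second listed commit
#  - the message editor corrects the typo in that commit's message
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2s/^pick/reword/'" \
GIT_EDITOR="sed -i.bak -e 's/READEM/README/'" \
    git -C "$repo" rebase -i HEAD~3

# Subjects, newest first.
git -C "$repo" log --format=%s
```

In normal use both variables would be left unset so that your usual editor opens for each step.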

Combining commits into one

You've got some commits that you want to combine together, typically to fix up later changes into the relevant earlier commit.

As above, use git log to work out how many commits to reference with HEAD~n then issue the rebase command. For example:

$ git rebase -i HEAD~4

Firstly, if the fix up commit does not immediately follow the commit it is to be combined with, reorder the commits.

The following step can be done at the same time as reordering but it is often easier to finish this rebase, check the ordering and then rebase again to do the actual combining. There is a potential for merge conflicts during the rebase; see "Merge conflicts during rebase" below.

Now change the pick before the fix up commit to either fixup (or f) to have it combine with the previous commit and discard the fix up commit's message, or squash (or s) to be prompted for a revised commit message for the resulting combined commit.

Use git log to check that all went well. If you previously pushed to a remote repository then you'll need to git push --force to upload these changes.
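As a rehearsal of the fixup case, the sketch below combines a fix up commit with the commit immediately before it in a throwaway repository. The editing of the pick list is scripted with sed through GIT_SEQUENCE_EDITOR, which is an assumption made so the example can run unattended. The file names and messages are likewise made up.

```shell
set -eu
repo="$(mktemp -d)"
git init -q "$repo"
git -C "$repo" config user.name "Example User"
git -C "$repo" config user.email "user@example.com"

commit_file() {  # usage: commit_file <file> <content> <message>
    printf '%s\n' "$2" > "$repo/$1"
    git -C "$repo" add "$1"
    git -C "$repo" commit -q -m "$3"
}
commit_file base.txt "base" "Initial commit"
commit_file feature.txt "teh feature" "Add feature"
commit_file feature.txt "the feature" "Fix typo in feature"

# Stand-in for editing the list by hand: mark the second listed commit
# (the fix up) as "fixup" so it is folded into the commit before it.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2s/^pick/fixup/'" \
    git -C "$repo" rebase -i HEAD~2

# Two commits remain; the feature commit now carries the corrected content.
git -C "$repo" log --format=%s
git -C "$repo" show HEAD:feature.txt
```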

Merge conflicts during rebase

Sometimes during a rebase (especially if reordering commits) you may get merge conflicts. You will need to resolve these conflicts, stage the conflicting files and then continue the rebase.

$ edit path/conflicting_file.py
$ git add path/conflicting_file.py
$ git rebase --continue

But don't worry! If you get in a mess you can always reset to the state before the rebase by aborting it.

$ git rebase --abort
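Both outcomes can be practised in a throwaway repository. The sketch below deliberately creates a conflict, aborts the rebase, then repeats it and resolves the conflict. The branch name my-feature, the file contents and the save helper are assumptions. Writing the file with printf stands in for editing it by hand, and GIT_EDITOR=true accepts the existing commit message without opening an editor.

```shell
set -eu
repo="$(mktemp -d)"
git init -q "$repo"
git -C "$repo" config user.name "Example User"
git -C "$repo" config user.email "user@example.com"

save() {  # usage: save <content> <message>
    printf '%s\n' "$1" > "$repo/conflicting_file.py"
    git -C "$repo" add conflicting_file.py
    git -C "$repo" commit -q -m "$2"
}
save "value = 0" "Initial commit"
main="$(git -C "$repo" symbolic-ref --short HEAD)"

# Two branches change the same line in different ways.
git -C "$repo" checkout -q -b my-feature
save "value = 1" "Feature change"
git -C "$repo" checkout -q "$main"
save "value = 2" "Competing change"
git -C "$repo" checkout -q my-feature
before="$(git -C "$repo" rev-parse HEAD)"

# First attempt: the rebase stops on the conflict, and we abort it.
if git -C "$repo" rebase "$main" >/dev/null 2>&1; then
    outcome="clean"
else
    outcome="conflict"
fi
git -C "$repo" rebase --abort
after_abort="$(git -C "$repo" rev-parse HEAD)"

# Second attempt: resolve the conflict, stage the file and continue.
git -C "$repo" rebase "$main" >/dev/null 2>&1 || true
printf '%s\n' "value = 3" > "$repo/conflicting_file.py"
git -C "$repo" add conflicting_file.py
GIT_EDITOR=true git -C "$repo" rebase --continue
```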

Merge Request has merge conflicts with master

You've finished your changes on your branch and created a merge request (MR) but the master branch has changed since you originally created your branch, and GitLab is reporting that merge conflicts need resolving before your MR can be merged.

You could git merge origin/master into your branch but this would lead to an extra merge commit being added to the history. An arguably cleaner way is to rebase your branch onto the current master.

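For example, to rebase the feature branch my-new-feature onto the latest master:

$ git fetch
$ git checkout my-new-feature
$ git rebase origin/master
$ git push --force

Note that git fetch updates origin/master rather than your local master branch, which is why the rebase targets origin/master. There is a potential for merge conflicts during the rebase; see "Merge conflicts during rebase" above.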
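The whole sequence can be rehearsed against a local bare repository standing in for the hosted remote. In this sketch the directory names (remote.git, me, colleague), helpers, files and messages are assumptions for illustration. It rebases onto origin/master because git fetch updates the remote-tracking branch and leaves any local master branch where it was.

```shell
set -eu
work="$(mktemp -d)"

identity() {
    git -C "$1" config user.name "Example User"
    git -C "$1" config user.email "user@example.com"
}
commit_file() {  # usage: commit_file <repo> <file> <message>
    echo "$3" > "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" commit -q -m "$3"
}

# A bare repository stands in for the hosted remote.
git init -q --bare "$work/remote.git"
git -C "$work/remote.git" symbolic-ref HEAD refs/heads/master

git init -q "$work/me"
git -C "$work/me" symbolic-ref HEAD refs/heads/master
identity "$work/me"
git -C "$work/me" remote add origin "$work/remote.git"
commit_file "$work/me" base.txt "Initial commit"
git -C "$work/me" push -q origin master

# The feature branch is created and pushed.
git -C "$work/me" checkout -q -b my-new-feature
commit_file "$work/me" feature.txt "Add feature"
git -C "$work/me" push -q origin my-new-feature

# Meanwhile a colleague moves master forward on the remote.
git clone -q "$work/remote.git" "$work/colleague"
identity "$work/colleague"
commit_file "$work/colleague" other.txt "Unrelated change"
git -C "$work/colleague" push -q origin master

# Fetch, rebase onto the updated master and force-push the rewritten branch.
git -C "$work/me" fetch -q
git -C "$work/me" checkout -q my-new-feature
git -C "$work/me" rebase -q origin/master
git -C "$work/me" push -q --force origin my-new-feature

git -C "$work/me" log --format=%s my-new-feature
```

The force push is needed because the rebase replaces the branch's commits with new ones, so the remote branch is no longer an ancestor of the local one.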