GitHub basics


Git is a system that lets you keep track of different versions of your code (and data). You can also set up an internet address where others can get hold of the work. A popular place to do this is the GitHub website, which provides free space for Open Source projects, including website space. GitHub is popular to the point that Git and GitHub are almost synonymous. This tutorial will walk you through using both together, using GitHub's version of the Git software.


Git is a versioning (or revision control) system built around a code repository: it gives you somewhere to store code, and it keeps track of the different versions of that code so you can undo ("roll back") changes if you need to. Because code is just text, Git can also store and track other files, such as data.

Git is also a distributed revision control system, so multiple people can work on the same code. There is usually a central storage place for any given project (often on GitHub), but each user also has their own copy of this repository (or "repo"), which holds their changes. A coder starts by pulling the code from the online version (if there is one). They can then make changes on their local machine. When they want to alter the online copy, they push their copy of the repository back to the online store. If other people have pushed changes since the coder last pulled the code down, the changes are merged, and any conflicts (for example, two people editing the same part of a file) are resolved manually by either the coder or an administrator.
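To make the idea of merging and conflicts concrete, here is a minimal sketch in Python. It is not how Git is implemented: Git compares changes within files, whereas this toy version only compares whole files. The function name three_way_merge and the file names are made up for illustration. The idea is that each person's copy is compared against the version they both started from (the "base").

```python
def three_way_merge(base, mine, theirs):
    """Merge two edited copies of a project against their common starting point.

    Each argument maps a file name to that file's full text.
    Returns (merged, conflicts); conflicts lists the files that both
    people changed in different ways and that need sorting out by hand.
    """
    merged = {}
    conflicts = []
    for name in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(name), mine.get(name), theirs.get(name)
        if m == t:
            result = m               # both copies agree
        elif m == b:
            result = t               # only the other person changed it
        elif t == b:
            result = m               # only I changed it
        else:
            conflicts.append(name)   # both changed it, differently
            result = m               # placeholder until resolved manually
        if result is not None:       # None means the file was deleted
            merged[name] = result
    return merged, conflicts


# Hypothetical project with three files.
base = {"a.py": "x = 1\n", "b.py": "y = 1\n", "c.py": "z = 1\n"}
mine = {"a.py": "x = 2\n", "b.py": "y = 1\n", "c.py": "z = 2\n"}
theirs = {"a.py": "x = 1\n", "b.py": "y = 5\n", "c.py": "z = 3\n"}

merged, conflicts = three_way_merge(base, mine, theirs)
print(conflicts)
```

Here a.py and b.py merge cleanly because only one person touched each, while c.py is reported as a conflict because both people edited it.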

The final thing to understand is that the local repository is only a record of the versions you have told it to note. You work on an independent copy of the code, and have to commit changes (update your copy of the repository) when you want to store those changes locally. Once changes are committed you can carry on working, knowing you have the option to roll back to that version if you want. Whenever you are happy the code is in a good enough state for everyone else, you commit it locally, and then push the local committed version to the online repo for everyone. The usual sequence is therefore:

pull online repo --> edit --> commit to record locally --> push to online repo.
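The sequence above can be sketched as a toy model in Python. This is only an illustration of the idea that a repository is a record of the versions you chose to commit; it is not Git's real storage format or interface, and the class ToyRepo and its methods (commit, checkout, pull_from, push_to) are invented for this example.

```python
class ToyRepo:
    """A toy stand-in for a repository: just an ordered list of commits."""

    def __init__(self):
        self.commits = []  # each entry is (message, snapshot of all files)

    def commit(self, working_copy, message):
        # Record a copy, so later edits to working_copy cannot alter history.
        self.commits.append((message, dict(working_copy)))
        return len(self.commits) - 1  # index of the new version

    def checkout(self, index):
        # Hand back a fresh working copy of a recorded version (a "roll back").
        return dict(self.commits[index][1])

    def pull_from(self, online):
        # Bring the local record up to date and return the newest files.
        self.commits = list(online.commits)
        return self.checkout(-1) if self.commits else {}

    def push_to(self, online):
        # Only allowed if nobody else has pushed since we last pulled.
        if online.commits != self.commits[:len(online.commits)]:
            raise RuntimeError("online repo has changed: pull and merge first")
        online.commits = list(self.commits)


online = ToyRepo()   # stands in for the repo on GitHub
local = ToyRepo()    # stands in for the repo on your machine

work = local.pull_from(online)                   # pull (nothing there yet)
work["hello.py"] = "print('hi')\n"               # edit
first = local.commit(work, "Add greeting")       # commit to record locally

work["hello.py"] = "print('oops')\n"             # an edit we regret...
work = local.checkout(first)                     # ...so roll back

local.push_to(online)                            # push to online repo
```

Note that the regretted edit never reaches the online repo: it was never committed, so the local repository has no record of it, and only committed versions are pushed.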

So, now we've got the basic idea, let's set up a repository, and try some pushing and pulling.


The first thing you'll need to do is to set up a GitHub account. Go to the GitHub site and do that now. Obviously only do this if you are happy with their terms and conditions. Don't pick a name starting or ending with a dash, or containing consecutive dashes, as this causes issues with setting up websites through GitHub, but otherwise most things are ok.

GitHub is a commercial company, but was started by open source enthusiasts. Its model is that people willing to open source their repositories should get free use, and you pay for private repositories (though students/academics can get some free private repos). It is probably the most popular, and certainly the fastest growing, repository system with open source developers.

Notes:

Other popular sites for storing and managing code are SourceForge, CodePlex, and Bitbucket. If you don't like their terms, you can always utilise the same free software (Git) to set up your own repository system.

There is also other software you can use to set up your own system, including Mercurial (hg), CVS, and Subversion.

For a brief explanation of why you'd want to open your code and data to the public, see this PC World article. For a fuller argument, and example case studies, see the classic text The Cathedral and the Bazaar.


Once you've set up an account, go on to look at using Git to keep track of changes locally.