Git Basics

About Git

Why use Git?

We often need to track the versions of our files, especially in large projects where many members collaborate. You can do rudimentary version control by regularly backing up your files, but that is time-consuming, and often you cannot retrieve the exact version you want because it fell between two backups. Git offers a systematic, efficient and fast solution for version control. In contrast to centralized version control tools like SVN, Git employs distributed version control.

“Distributed” Workflow

Distributed version control means that every member of the team has a full copy of the project on their own computer (called a local repository). There is no “central server” as in centralized version control, so to speak. Team members exchange changes among their own devices (for example, if member A and member B are both working on the file apple.txt, they can use Git to send their changes to each other so that both members end up with an updated apple.txt containing both changes), and if one computer fails, the project can simply be copied from another.

In practice, however, the team usually sets up a central repository online (called a remote repository), and members push changes to and pull changes from that central repository (instead of pushing and pulling between their local computers). This central repository exists only to make exchanging changes easier: if it fails, unlike in centralized version control, no data is lost, because every local repository still holds a full copy. An online central repository also lets members who are not on the same LAN collaborate, and member A does not need to wait for member B to turn on her computer before sharing changes with her.

Git Setup

Install Git

Download and install Git from the official website. After installing on Windows, three programs will appear: Git Bash, Git CMD and Git GUI. Git Bash and Git CMD are both terminal-like command windows, and Git GUI is a graphical wrapper around the same Git commands (read about their difference here). Most of the time we only use Git Bash and do everything by typing commands into it.

Set up Environment

Start Git Bash, and execute:

git config --global user.name "Your Name"
git config --global user.email "email@example.com"

The git config command sets configuration and preferences for Git (think of the Settings in other GUI applications). These two lines set an identity for you. In version control, it is important to identify the person who commits each change, so that the team knows whose change (and whose bug) it is. Note that this name and email do not “log in” to anything or any platform. They are just two text fields that travel with your commits, like an ID card.

The git config command actually writes data into a file on your computer that stores your settings. The --global parameter specifies the level of control: it instructs Git to write the configuration into a config file under the current computer user’s personal folder, so that the setting applies to every repository this user works with. The other levels are --local (writes to the configuration file inside a repository, so the configuration applies to that repository only) and --system (writes to the configuration file in the Git installation directory, so the configuration applies to all users of this computer).

Configurations set at a more specific level override those at a broader level. So, configurations set at the repository level override those at the user level, which in turn override those at the system level.
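To see the override rule in action, here is a minimal sketch. It assumes a throwaway repository in a temporary folder (the folder and the name "Repo Only Name" are made up for illustration) and sets a repository-level value that takes precedence over any user-level one.

```shell
# Hypothetical throwaway repository so real settings stay untouched.
demo=$(mktemp -d)
cd "$demo"
git init -q

# Repository-level value: written to .git/config of this repository only.
git config --local user.name "Repo Only Name"

# List every user.name entry together with the file it comes from.
git config --list --show-origin | grep user.name

# Asking for the key without a level returns the effective (most specific) value.
effective=$(git config user.name)
echo "$effective"
```

If you also have a --global user.name, the listing shows both entries, while the last command prints only the repository-level one.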

Two important keys, other than user.name and user.email, that I think one should know, are core.editor and the alias section.

core.editor="D:\Portables\NPP_Portable_7.9.5\notepad++.exe" -multiInst -notabbar -nosession -noPlugin

You can also set the editor with git config --global core.editor editor-path. If you use Notepad++ (NPP), prefer the 32-bit version; if you use the 64-bit version instead, append -multiInst -notabbar -nosession -noPlugin after the editor path, because the 64-bit plugins may cause problems.

Alias

This is really another type of configuration stored in the config file. Aliases let you shorten commands.

git config --global alias.co checkout
git config --global alias.br branch
git config --global alias.ci commit
git config --global alias.st status
git config --global alias.unstage 'reset HEAD --'
git config --global alias.last 'log -1 HEAD'

Aliases are not limited to Git commands. To alias a non-Git (shell) command, start it with !; to run several commands in one alias, chain them with &&. For example, !cd ${GIT_PREFIX:-.} && start update.sh changes back to the directory from which you invoked the alias (shell aliases start in the repository's top-level directory, and GIT_PREFIX holds the subdirectory you were in) and runs update.sh there. !c:/windows/explorer starts Windows Explorer.
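A small sketch of both alias kinds, under a few assumptions: the aliases are set at the repository level (no --global) so that your real configuration is not changed, and the alias name where and the folder sub are made up for this example.

```shell
# Hypothetical throwaway repository.
demo=$(mktemp -d)
cd "$demo"
git init -q

# A plain Git alias: "git st" now means "git status".
git config alias.st status

# A shell alias (note the leading !). GIT_PREFIX is set by Git to the
# subdirectory, relative to the repository root, where the alias was invoked.
git config alias.where '!echo "prefix=${GIT_PREFIX}"'

mkdir sub
cd sub
git st --short
prefix_seen=$(git where)
echo "$prefix_seen"
```

Running git where from inside sub reports that subdirectory, which is the information the cd ${GIT_PREFIX:-.} trick relies on.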

Set up Local Directory

For Git to manage your directory, you must let it know where it is. In Git Bash, change to your project root directory (that is, the folder that contains all project files you want under version control; if you do not have one, create one and put all your files into it), and type the command below. (Not sure how to change directories? See my other blog post, Common CMD Commands.)

git init

This initializes the current folder as an empty local Git repository, which means Git starts managing it. You will find that a .git folder has been created in your project root directory. If you do not see it, it might be hidden; search online for how to reveal hidden files on your system. Any folder that contains a .git subfolder is managed by Git. (Incidentally, the configuration file config for this local repository is also in the .git folder.)

Now you have set up a local Git repository in your project directory; that is, you have set up the three-tree Git structure that manages version control for your project folder. The three trees are: the working directory (working tree), the staging index, and the commit history.

Use Git

Get help

git help <verb>
git <verb> --help
git <verb> -h
man git-<verb>

There are four ways to get help about a certain command. Use any of them when in doubt; for example, git config -h briefly explains how to use the git config command. Of course, you could also check the official documentation.

Add, Commit Locally

Now, type

git add -A
git commit -m "Initialized a repo"

If you started from an empty folder, these two lines will do nothing (Git reports that there is nothing to commit); but if you started from a project folder with pre-existing files, these two lines add those files to the Git repository we just initialized (remember, we just initialized an empty repository). You can think of it as “backing up” those files to Git.

If we then create another file, say test.txt, in our project folder and re-run those two commands, Git will detect the change and make another commit in the local repository to update the version history. If we make no change and run those two lines, nothing will happen.

Let’s look closely. git add detects and “stages” changed files. When you run git commit, all staged changes are “committed”. You always need these two steps, stage then commit, to tell Git to update the version record (or, “back up your files”) after making any change. Why two steps? Staging lets you choose exactly which changes go into the next commit, so unrelated edits can be committed separately.
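The stage-then-commit cycle can be sketched as follows. The temporary folder, the file names and the "Demo User" identity are placeholders invented for this example.

```shell
# Hypothetical throwaway repository with a placeholder identity.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "first" > a.txt
echo "second" > b.txt

git add a.txt                  # stage only a.txt
git status --short             # a.txt is staged, b.txt is still untracked
git commit -q -m "Add a.txt"   # the commit contains only what was staged

committed=$(git ls-files)      # files Git now tracks
echo "$committed"
```

Because b.txt was never staged, it stays out of the commit and can go into a later one.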

Move, Delete, Rename

Strictly speaking, you can work without these commands, but they speed things up.
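As a rough sketch of what this section refers to, git mv and git rm change the file and stage that change in a single step. The repository, file names and identity below are made up for the example.

```shell
# Hypothetical throwaway repository with a placeholder identity.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "draft" > old.txt
echo "junk" > junk.txt
git add -A
git commit -q -m "Add two files"

git mv old.txt new.txt     # rename (or move) the file and stage the rename
git rm -q junk.txt         # delete the file and stage the deletion
git status --short         # both changes are already staged
git commit -q -m "Rename old.txt and remove junk.txt"
```

Without these commands you would rename or delete the files yourself and then run git add -A, which ends in the same state.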

Check status

Believe me, you will need to check the current state of your project every now and then.
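A few commands commonly used for this are git status, git diff and git log. The sketch below runs them in a made-up repository (notes.txt and the identity are placeholders); it is one possible selection, not a complete list.

```shell
# Hypothetical throwaway repository with a placeholder identity.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "line 1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "line 2" >> notes.txt
git status --short      # which files are modified, staged or untracked
git diff                # changes in the working tree that are not staged yet
git add notes.txt
git diff --staged       # changes that are staged for the next commit
git log --oneline       # commit history, one line per commit
```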

Regret

One of the most important reasons version control exists is to let you undo things you regret.

Sometimes you will see Git commands with a double dash -- before their <file-name> parameter, for example git checkout -- <file-name> instead of just git checkout <file-name>. The double dash tells Git that everything after it is a file path rather than an option or a branch name; for more detail, refer to here, here and here.
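Three common ways to "regret" are sketched below, under the assumption that they are the kind of operations this section has in mind: discarding an uncommitted edit, unstaging a file, and reverting a commit. The repository, file.txt and the identity are placeholders.

```shell
# Hypothetical throwaway repository with a placeholder identity.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "good" > file.txt
git add file.txt
git commit -q -m "Add file.txt"

# 1. Discard an uncommitted edit in the working tree.
echo "bad edit" > file.txt
git checkout -- file.txt

# 2. Unstage a file; the edit itself stays in the working tree.
echo "another edit" > file.txt
git add file.txt
git reset -q HEAD -- file.txt
git checkout -- file.txt

# 3. Undo a commit by adding a new commit that reverses it.
echo "oops" > file.txt
git commit -q -am "A commit to regret"
git revert --no-edit HEAD
```

git revert keeps the regretted commit in the history and adds a third commit on top, which is the safer choice once a commit has been shared.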

Remote Repository

Set up Remote Directory

As mentioned earlier, we usually set up a remote repository online for easy exchange of changes. The standard solution is GitHub, a free platform for hosting remote Git repositories, which saves you the hassle of building a Git server on your own. Register a GitHub account and create a repository (name it properly to avoid confusion). Free accounts can create both public repositories (visible to everyone) and private ones; paid plans such as GitHub Pro add extra features, and students may get GitHub Pro for free. Another option is to build a Git server of your own, but for personal users that probably costs more than GitHub Pro. If you need guidance on these steps, check the GitHub Documentation.

Now two things need to be done:

Once these two things are done, your local and remote repositories are connected.
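The two steps are not spelled out here. A typical way to connect the two repositories, and this is an assumption rather than necessarily the original list, is to register an SSH key with GitHub and then add the remote to the local repository. The sketch below shows only the second part, with a made-up URL.

```shell
# Hypothetical throwaway repository.
demo=$(mktemp -d)
cd "$demo"
git init -q

# Assumed first step (not run here): create an SSH key and paste the public
# key into your GitHub account settings, for example with
#   ssh-keygen -t ed25519 -C "email@example.com"

# Second step: tell the local repository where the remote one lives.
# The URL is a placeholder; use the SSH link shown on your repository page.
git remote add origin git@github.com:your-name/your-repo.git
git remote -v      # lists the remote alias (origin) and its URL
```

origin is only the conventional alias; it is the <remote-alias> used by later commands.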

Push

Now, every time you have committed some changes, you can run git push -u <remote-alias> <branch-name> to push the commits to the remote repository at GitHub (the -u option is only needed the first time; afterwards a plain git push is enough). Your teammates can then get the changes from there.
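To try pushing without a GitHub account, the sketch below uses a local bare repository as a stand-in for the remote. That substitution, the folder names and the identity are assumptions for the example; with GitHub only the URL differs.

```shell
# Hypothetical layout: remote.git stands in for GitHub, work is the local repository.
demo=$(mktemp -d)
git init -q --bare "$demo/remote.git"
git init -q "$demo/work"
cd "$demo/work"
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$demo/remote.git"

echo "hello" > readme.txt
git add -A
git commit -q -m "First commit"

branch=$(git symbolic-ref --short HEAD)   # current branch name (master or main)
git push -q -u origin "$branch"           # first push: -u links the branch to origin

echo "more" >> readme.txt
git commit -q -am "Second commit"
git push -q                               # later pushes need no arguments
```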

Conflict, Pull, Fetch, Merge

Rule of thumb: Do not modify the same file at the same time from two branches.

Sometimes, when you and a teammate have modified the same file and both want to push to the remote repository, there will be conflicts. When this happens, the later push is rejected, and that person must pull the other changes and resolve the conflict before pushing again.
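The whole cycle of rejected push, pull, conflict and resolution can be reproduced locally. The sketch below is a simulation under stated assumptions: a bare repository stands in for GitHub, the folders alice and bob play the two teammates, and one placeholder identity is shared through environment variables.

```shell
demo=$(mktemp -d)
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

git init -q --bare "$demo/remote.git"            # stand-in for GitHub
git clone -q "$demo/remote.git" "$demo/alice"    # warns that the repository is empty
cd "$demo/alice"
echo "red" > apple.txt
git add apple.txt
git commit -q -m "Add apple.txt"
git push -q -u origin HEAD

git clone -q "$demo/remote.git" "$demo/bob"

# Alice changes the line and pushes first.
echo "green" > apple.txt
git commit -q -am "Alice: green"
git push -q

# Bob changes the same line; his push is rejected because he lacks Alice's commit.
cd "$demo/bob"
echo "yellow" > apple.txt
git commit -q -am "Bob: yellow"
git push || echo "push rejected"

# pull = fetch + merge; the merge stops because both sides edited the same line.
git pull --no-rebase || echo "conflict in apple.txt"

# Resolve by editing the file, then stage, commit and push.
echo "green and yellow" > apple.txt
git add apple.txt
git commit -q -m "Merge remote changes to apple.txt"
git push -q
```

Running git fetch followed by git merge instead of git pull leads to the same conflict, just in two visible steps.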

Clone

Following the previous steps, we started from a local repository, created a remote repository and established a connection between the two. There is actually a simpler way to achieve the same effect: start from the remote repository. Create a repository on GitHub and get its SSH link, change Git Bash to an empty folder on your computer (you probably want to create a dedicated folder for this), and then type git clone <repo-link>. The files (if any) are downloaded from the GitHub remote repository into a new subfolder named after the repository, a local Git repository is initialized automatically in that subfolder, and a local branch with the same name as the remote branch is created and linked to the remote branch. All done.

You do not need to do git init before git clone. git clone does everything for you.
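Here is a minimal sketch of what git clone sets up. It assumes a local repository (seed) standing in for the GitHub one, with placeholder folder names and identity; with GitHub you would pass the repository link instead of a folder path.

```shell
demo=$(mktemp -d)
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Stand-in for a remote repository that already has one commit.
git init -q "$demo/seed"
cd "$demo/seed"
echo "hello" > readme.txt
git add -A
git commit -q -m "First commit"

# No git init needed: clone creates the folder and the repository inside it.
git clone -q "$demo/seed" "$demo/copy"
cd "$demo/copy"
git remote -v      # origin already points at the source repository
git branch -vv     # the local branch already tracks its remote counterpart
```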

Branch Management

How to Use Branch

Why bother branching?

Branching is inevitable. When we set up a remote repository, we are already using branching: one local branch and one remote branch. When multiple team members clone a remote repository to their respective computers, a local branch is created for each of them. They need to constantly track the remote branch, pull from it and merge their work into the remote branch.

Moreover, we open a local branch every time a new feature request or bug pops up. We maintain a stable branch master locally, and when a bug or feature comes in, the person in charge branches out from the stable branch and does her own work on a separate branch dev, leaving the stable branch intact; after she finishes, she merges her work back into master and deletes dev. This provides extra security for the work we have already done.

There is a good illustration (in Chinese) of this workflow here and here. Remember: HEAD points to the current branch, and the current branch points to a specific commit.

Make, Switch, Delete

Use the commands below to manage branches and work on different branches:
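A sketch of the create, switch, merge and delete cycle, following the master/dev workflow described earlier. The repository, app.txt and the identity are placeholders, and the stable branch name is read from Git because it may be master or main depending on your setup.

```shell
# Hypothetical throwaway repository with a placeholder identity.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "stable" > app.txt
git add -A
git commit -q -m "Stable version"
stable=$(git symbolic-ref --short HEAD)   # name of the stable branch

git checkout -q -b dev        # make branch dev and switch to it
echo "bug fix" >> app.txt
git commit -q -am "Fix bug on dev"

git checkout -q "$stable"     # switch back to the stable branch
git merge -q dev              # merge the work from dev
git branch -d dev             # delete dev now that it is merged
git branch                    # list the remaining branches
```

In newer Git versions, git switch -c dev and git switch <branch> do the same job as the two checkout forms.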

Other Useful Topics

You can stop reading now. You now have all the Git knowledge needed for daily use. The topics below are used less frequently but might turn out useful in certain messy situations. Completing them gives you an edge over other Git beginners.

Submodules

Often a project needs another project as a part of it. Do not just copy-paste the other project over entirely! See the elegant way to manage that using submodules.

Rebasing

A merge from C1 to C2 will consider their common ancestor C0 and the two descendants C1 and C2 and combine all of them. A rebase will have the same end effect, but with a cleaner branch history. Check the Rebasing chapter for a detailed explanation.
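The difference is easiest to see in the history. The sketch below builds the C0, C1, C2 situation in a made-up repository (branch name feature, file names and identity are placeholders) and rebases instead of merging.

```shell
# Hypothetical throwaway repository with a placeholder identity.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > base.txt
git add -A
git commit -q -m "C0"
stable=$(git symbolic-ref --short HEAD)

git checkout -q -b feature
echo "feature work" > feature.txt
git add -A
git commit -q -m "C1 on feature"

git checkout -q "$stable"
echo "stable work" > stable.txt
git add -A
git commit -q -m "C2 on stable"

git checkout -q feature
git rebase -q "$stable"        # replay C1 on top of C2
git log --oneline --graph      # a straight line, no merge commit
```

After the rebase, feature contains both changes just as a merge would, but its history reads C0, C2, C1 in a single line.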

Stashing and Cleaning

You must have experienced the situation where you are working on a feature and your boss asks you to fix an urgent bug. Now what? Commit the unfinished feature halfway? Make yet another branch? Those will work, but stashing provides an easier solution.
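The urgent-bug scenario can be sketched like this, in a made-up repository where app.txt, fix.txt and the identity are placeholders.

```shell
# Hypothetical throwaway repository with a placeholder identity.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "v1" > app.txt
git add -A
git commit -q -m "Version 1"

echo "half-done feature" >> app.txt   # unfinished work, not ready to commit
git stash                             # shelve it; the working tree is clean again

echo "urgent fix" > fix.txt
git add fix.txt
git commit -q -m "Urgent bug fix"

git stash pop                         # bring the unfinished work back
git status --short                    # app.txt shows as modified again
```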

.gitkeep and .gitignore

Add a .gitkeep file to an empty directory so that Git will track it (Git does not track completely empty directories). The .gitignore file lists files that should be ignored by Git. Create a .gitignore file (usually it is created for you) in the project root directory to specify the files that Git should ignore in that directory. For the syntax, see Ignoring Files.
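A small sketch of both files at work. The patterns (*.log and build/), the folder names and the identity are examples chosen for illustration; note that .gitkeep is only a naming convention, and any file would keep the directory tracked.

```shell
# Hypothetical throwaway repository.
demo=$(mktemp -d)
cd "$demo"
git init -q

mkdir logs
touch logs/.gitkeep                           # placeholder so the empty folder is tracked

printf '%s\n' '*.log' 'build/' > .gitignore   # ignore log files and the build folder
touch debug.log
mkdir build
touch build/out.bin

git add -A
git status --short             # only .gitignore and logs/.gitkeep are staged
git check-ignore -v debug.log  # shows which .gitignore rule matches the file
```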

Tagging

Sometimes we attach tags to our commits for the convenience of categorization, search and management. This chapter has an excellent explanation of tagging.
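As a quick sketch, the two kinds of tags look like this. The tag names v1.0 and v1.1, the repository and the identity are made up for the example.

```shell
# Hypothetical throwaway repository with a placeholder identity.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "release" > app.txt
git add -A
git commit -q -m "Release candidate"

git tag v1.0                        # lightweight tag: just a name for the commit
git tag -a v1.1 -m "Release 1.1"    # annotated tag: stores tagger, date and message
git tag --list                      # list all tags
git show -s --oneline v1.0          # show the commit a tag points to
```

Tags are not pushed by default; git push <remote-alias> <tag-name> sends one to the remote.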

Other Git servers

There are hosting websites other than GitHub, such as Gitee. Get to know them and their specific features. Tired of using a third-party server? You can learn how to set up your own Git server here or here (in Chinese).

Using Git GUI

After you get familiar with Git command-line operations you can consider switching to a GUI wrapper for Git, like SourceTree.

-- Yu Long
Published on Aug 05, 2020, PDT
Updated on Aug 05, 2020, PDT