Using git
You may need to use git as a version control system in some of your CSCI or DSCI classes. It is also heavily used in industry, in open-source software development, and in Data Science.
If you have a particular task you need to complete for a class, you might check out the section below titled "I need to..."
Overview: What is git?
Git is a version control system. It maintains copies of files during development and allows people to share various versions of files. Git does this by keeping “snapshots” of files in a repository and allowing these snapshots to be shared, updated, and merged.
When working with git, our local files can be in three states:
- In the working directory, with changes not yet tracked by git.
- In the staging area. This is a temporary holding area where the current files are copied, in preparation for adding them to the repository.
- Recorded as a snapshot in the repository. Files in the staging area are moved into a commit in the repository, which is a permanent snapshot of the state of all files at that point in time. If we make further changes to any files, these will be untracked changes until we stage and commit them.
(Note that there may also be a remote repository, on a service like GitLab or GitHub, which keeps a copy of all of our commits.)
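A minimal sketch of these three states, meant to be run as a script in a scratch directory. The file name notes.txt, the demo identity, and the temporary directory are placeholders chosen for illustration. The script uses git status --porcelain, which prints the same two-character codes as git status --short in a form intended for scripts.

```shell
#!/bin/sh
# Sketch: watch one file move through the three states described above.
set -e

# Placeholder identity so the demo commit works on any machine.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)                     # scratch directory, safe to delete later
cd "$demo"
git init -q

echo "first draft" > notes.txt
untracked=$(git status --porcelain)   # "?? notes.txt": working directory only

git add notes.txt
staged=$(git status --porcelain)      # "A  notes.txt": copied to the staging area

git commit -q -m "Add notes"
committed=$(git status --porcelain)   # empty: the snapshot matches our files

echo "second draft" >> notes.txt
modified=$(git status --porcelain)    # " M notes.txt": changed since the commit

printf '%s\n' "$untracked" "$staged" "[$committed]" "$modified"
```

The last state shows that an edit made after a commit stays outside the repository until it is staged and committed again.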
To use git to track changes to a project, we do the following:
- Initialize a git repository in a directory (or clone a repository from a server)
- Create files or make changes to our files in the directory. These files are now considered untracked changes by git.
- After we have made changes to a file, we copy the file to git's staging area. This is a temporary location before git permanently makes a copy of our changes.
- When we are ready, we commit the changes in our files. This moves all files in the staging area into the git archive. This creates a permanent snapshot of what the project looked like at that moment.
- (Optional) If we have a connection to a remote repository (on a service like GitLab or GitHub), we can push those changes to the remote repository. This step is needed only if we wish to share the project with others.
As a simple example, we will look at the process of setting up a git repository for a simple “Hello, World!” program. In the following, we will be using git on the command line. This is easy if you are logged in to cslab103, for example, where git is already installed. (There are also GUI interfaces to git; we won't cover those.)
- First, let's make a directory to store the program in. We'll create a directory HelloTest and change to that directory in the usual way. (In the following, we will work on cslab100, and we will use $ to indicate the Linux prompt.)
  $ mkdir HelloTest
  $ cd HelloTest
  $ ls
  $
- As we can see, the directory is initially empty. We will set up this directory as a git repository with the command git init. (Instead of '~', you will see the path to your directory, of course.)
  $ git init
  Initialized empty Git repository in ~/HelloTest/.git/
- This creates a hidden subdirectory called .git. You can see it using ls -a, but you don't need to do anything manually with this subdirectory. Ignore it.
  $ ls -a
  . .. .git
- Now let's put a file into the directory. Using an editor, we create a file hello.cpp, with the standard “Hello, World!” source code in C++ below:
  #include <iostream>

  int main()
  {
      std::cout << "Hello, World!" << std::endl;

      return 0;
  }
- We can compile this as usual to create an executable program we'll call hello. (See A Compiler for information on compiling.) There are now two visible files in the directory:
  $ ls
  hello hello.cpp
- We can run the command git status to find out the status of our git repository. It will tell us that there are two untracked files, but that we have not added any files to commit:
  $ git status
  # On branch master
  #
  # Initial commit
  #
  # Untracked files:
  #   (use "git add <file>..." to include in what will be committed)
  #
  #       hello
  #       hello.cpp
  nothing added to commit but untracked files present (use "git add" to track)
- Let's add the source code file hello.cpp to the staging area, and run git status again:
  $ git add hello.cpp
  $ git status
  # On branch master
  #
  # Initial commit
  #
  # Changes to be committed:
  #   (use "git rm --cached <file>..." to unstage)
  #
  #       new file:   hello.cpp
  #
  # Untracked files:
  #   (use "git add <file>..." to include in what will be committed)
  #
  #       hello
- We see in the above that we have one file waiting to be committed (hello.cpp), and one untracked file (hello, the executable program). Usually, we don't store executable files in the git repository, because they can be completely generated by compiling the source files. So we won't add hello to the archive.
  Instead, we will now commit the changes we have made. This will make a permanent snapshot of our project at this point.
  We will include a commit message that describes the changes we have made so far, using the flag -m after the command:
  $ git commit -m "Initial commit: Hello world program."
  [master (root-commit) bbf762b] Initial commit: Hello world program.
   1 files changed, 8 insertions(+), 0 deletions(-)
   create mode 100644 hello.cpp
- We used “Initial commit: Hello world program.” as our commit message. If we had left out the -m flag and the string, git would automatically open an editor so that we could write a longer commit message. (It should use nano by default on cslab103, although it may default to vi on Windows.)
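Once a commit exists, it can be inspected with git log. The sketch below repeats the commit from the walkthrough in a scratch directory and then lists it. The demo identity and the temporary directory are placeholders, and hello.cpp is written with printf only so the script runs without an editor.

```shell
#!/bin/sh
# Sketch: record the walkthrough's commit non-interactively, then inspect it.
set -e

# Placeholder identity so the demo commit works on any machine.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
cd "$demo"
git init -q

# A compact version of the Hello, World! source file.
printf '%s\n' '#include <iostream>' \
  'int main() { std::cout << "Hello, World!" << std::endl; return 0; }' > hello.cpp

git add hello.cpp
git commit -q -m "Initial commit: Hello world program."

git --no-pager log --oneline       # one line per commit: short hash and message
git --no-pager show --stat HEAD    # author, date, message, and files in the commit
subject=$(git log -1 --format=%s)  # just the message of the latest commit
```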
Server: GitLab, GitHub, etc
We can also work with a remote repository, which holds a copy of all our work on a server. Common servers include GitHub and GitLab.
The department maintains its own GitLab server at codestore.cs.edinboro.edu. See your professor if you need to set up an account.
Create and clone
If you want to set up a remote repository on codestore, you can click the “New” button in the upper-left after you log in.
You can select “Create a blank project”, then fill in the name of your project, a description, set the privacy level, and create the project.
Afterwards, you will see instructions for cloning your repository to your local machine, but it is not hard: On your project's main page, click the “Clone” button and select the https address to copy. (There is a button that will copy it.)
Then on your machine (possibly cslab103), type the following in the directory where you want to clone the repository:
git clone <address>
where <address> is the https address you copied from codestore. (The repository will be copied into a directory named for the repo.)
Now you have a local copy of the repository, and by default, it is connected to the remote repository. (The remote repository gets the nickname origin.) You can work on your local copy, committing files as usual.
However, origin doesn't know about the commits on your local machine, and your local repository won't know about any updates to origin until you push or pull them.
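The effect of a clone can be tried without a server. In the sketch below, a bare repository in a temporary directory stands in for the project on codestore; that local path is an assumption made for illustration, and in practice the argument to git clone is the https address copied from the project page.

```shell
#!/bin/sh
# Sketch: clone a repository and look at the "origin" nickname it receives.
set -e

work=$(mktemp -d)
cd "$work"

# Stand-in for the server project: an empty bare repository on local disk.
git -c init.defaultBranch=master init -q --bare server.git

# Clone it. Git warns that the repository is empty, which is expected here.
git clone -q "$work/server.git"    # creates a directory named "server"
cd server

git remote -v                      # lists origin with its fetch and push addresses
origin_url=$(git remote get-url origin)
echo "origin points to: $origin_url"
```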
Push and Pull
To send your commits up to origin (the remote repo), you need to push them:
git push origin master
(The above sends your master branch commits to origin. In our case, origin is the remote repository at codestore.)
To update your local repository with any changes added to the remote repository, you need to pull them (or fetch them). The easiest way to update your local repository is to type
git pull
This will fetch any new commits on your current branch (probably master) from the remote repository and merge them into your local branch and working files.
You can also separate this into two steps. The first retrieves any new snapshots on the remote repository, and the second merges the changes into your current files.
git fetch
git merge
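The push and pull cycle can be rehearsed locally. This sketch assumes a bare repository on local disk as a stand-in for origin and two clones, named alice and bob purely for illustration, playing the role of two machines. It pushes from one clone, then updates the other first with git pull and then with the two-step git fetch and git merge.

```shell
#!/bin/sh
# Sketch: push from one clone, then update a second clone two different ways.
set -e

# Placeholder identity so the demo commits work on any machine.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q --bare server.git   # stand-in for origin

# First clone: commit locally, then push the commit up to origin.
git -c init.defaultBranch=master clone -q server.git alice
cd alice
echo "line 1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git push -q origin master
cd ..

# Second clone, made after the push, so it already contains that commit.
git clone -q server.git bob

# New work is pushed from the first clone...
cd alice
echo "line 2" >> notes.txt
git commit -q -a -m "Extend notes"
git push -q origin master

# ...and pulled into the second clone in one step.
cd ../bob
git pull -q

# The same kind of update done as two separate steps.
cd ../alice
echo "line 3" >> notes.txt
git commit -q -a -m "Extend notes again"
git push -q origin master
cd ../bob
git fetch -q     # download the new commit; local files are not changed yet
git merge -q     # merge the fetched commit into the current branch

lines=$(wc -l < notes.txt | tr -d ' ')
echo "bob's notes.txt now has $lines lines"
```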
Fork Another Repository
You may sometimes want to copy someone else's repository on codestore into your very own version of the repository. This is called a fork. (For example, your professor might provide you a basic repository for an assignment, and ask you to fork it so you can work on your own copy.)
In GitLab (for example, at codestore), you can click on the “Fork” button at the top of a repository to make your own copy.
Then you can clone your fork of the project to your own computer, and proceed as usual.
Setting Options
- Name and email
- Editor for commits
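These options are set with git config. The sketch below stores them for a single scratch repository so that it does not touch any existing settings. Inserting --global after git config applies a setting to all of your repositories instead. The name and email shown are placeholders, and nano is only one possible choice of editor.

```shell
#!/bin/sh
# Sketch: set name, email, and commit editor for one repository.
set -e

demo=$(mktemp -d)
cd "$demo"
git init -q

# Placeholder identity; substitute your own name and email.
git config user.name "Your Name"
git config user.email "you@example.com"

# Editor that opens when "git commit" is run without -m.
git config core.editor nano

git config --list --local      # show the settings stored for this repository
name=$(git config user.name)
editor=$(git config core.editor)
```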
Common Commands
The following are common tasks to complete using git. Where we refer to GitLab, the same task can be completed with other git servers, such as GitHub, usually with very little or no change.
- fork: Make your own copy of someone else's repository on GitLab.
  This is done on a GitLab web page by clicking the “Fork” button.
- clone: Copy a GitLab repository to your personal computer.
  Usage: git clone <address>
- init: Initialize a new git repository on your own computer. (It will not be connected to a server, but it is possible to add one later.)
  Usage: git init
- status: Show what files are untracked, changed, staged, etc., in the current local repository.
  Usage: git status
- add: Add files to your local staging area.
  Usage: git add <filename>
- commit: Move files from the staging area to your local repository.
  Usage: git commit -m "Commit message"
  If the -m "Commit message" is left out, an editor will open for you to write your commit message.
- pull: Update your local archive with changes/additions made to the server. (Assumes that you have an “upstream” repository on a server to pull from.)
  Usage: git pull
- push: Push changes in your local repository upstream to the server repository. (Assumes that you have an “upstream” repository on a server to push to.)
  Usage: git push
  For more control, use git push origin <branchname>
- log: See a list of commits made to your repository.
  Usage: git log
- checkout: Switch to a different branch.
  Usage: git checkout <branchname>
- diff: See differences between versions of your files, or between branches.
  Usage: git diff shows changes in your current files that have not yet been staged; git diff HEAD compares your current files to the last commit.
  git diff <branchname> compares to another branch.
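A short session tying together checkout, diff, and log, as a sketch in a scratch repository. The branch name experiment and the file notes.txt are made up for illustration. The -b flag, which is not listed above, tells git checkout to create the branch before switching to it.

```shell
#!/bin/sh
# Sketch: make a branch, compare it with master, and switch back.
set -e

# Placeholder identity so the demo commits work on any machine.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
cd "$demo"
git -c init.defaultBranch=master init -q
echo "version 1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

git checkout -q -b experiment     # create the branch, then switch to it
echo "version 2" > notes.txt
git --no-pager diff               # edits in the working files, not yet staged
git commit -q -a -m "Try a new version"

git --no-pager diff master        # current files compared with branch master
git --no-pager log --oneline      # two commits are visible on experiment

git checkout -q master            # switch back; notes.txt returns to version 1
on_master=$(cat notes.txt)
```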
The .gitignore File
Often there are files we do not wish to include in the git repository. Usually we don't include object files and executables, for example, since these are generated automatically by compiling your source files. (So we don't actually need copies of these.)
If you want to automatically exclude these files so that git will always ignore them, create a file called .gitignore in the same directory as the project. In the file, enter a list of all the files you want git to ignore, one per line.
Once you have done this, you will see that (for example) a git status command will no longer list those files as untracked, and git add -A will not add those files, and so on.
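A small sketch of a .gitignore file in action. The empty files hello, hello.o, and hello.cpp are stand-ins for an executable, an object file, and a source file. Besides plain file names, .gitignore also accepts wildcard patterns such as *.o, which the sketch uses to cover every object file.

```shell
#!/bin/sh
# Sketch: ignore an executable and all object files.
set -e

demo=$(mktemp -d)
cd "$demo"
git init -q

# Stand-ins for a source file, an executable, and an object file.
touch hello.cpp hello hello.o

# One entry per line: the executable by name, object files by pattern.
printf '%s\n' 'hello' '*.o' > .gitignore

git status --porcelain     # only .gitignore and hello.cpp are listed as untracked

git add -A                 # stages everything that is not ignored
staged=$(git diff --cached --name-only)
echo "$staged"             # the two staged file names
```

Note that the .gitignore file itself is normally committed, so that everyone who clones the project ignores the same files.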
"I need to..."
Some common tasks you might need to know how to complete for a class assignment:
- I need to make my own copy of a repository on GitLab. (In the following, you can basically replace GitLab with GitHub or most other git servers.) This will involve a fork and a clone operation.
  - First, fork the repository on GitLab.
  - Then clone the repository to your home machine.
- I need to submit a copy of my work to a GitLab repository. There are two possible cases:
  - This project was not created from a GitLab repository. In this case, you need to create a GitLab repository and connect it with this project. After completing this step, you will move on to the case below, “I have already cloned this project from an existing GitLab repository”, to commit and push your work.
    - Easiest way:
      - Create a project with whatever name you wish on GitLab.
      - Clone your project to a folder on your computer:
        - On the GitLab page for your project, select the address under either “SSH” or “HTTPS” (drop-down menu near the top of the page), depending on which you are using. (If in doubt, you can use HTTPS.) Copy the link.
        - On your local computer, go to the directory in which you would like to make a folder for your new project. Type git clone <address you copied>. This will clone the (empty) GitLab repository into a folder on your computer.
      - Copy the files from your project into this empty directory. Now you are ready to add your files, commit, and push back to your GitLab repository. (See “I have already cloned this project…” below.)
    - Slightly more complicated: I have already created a git repository on my local computer, and I want to make a GitLab repository connected to it.
      - (cont)
  - I have already cloned this project from an existing GitLab repository, and I want to return my changes to that repository.
    - First, make sure your current project is up-to-date.
      - Run git status in your project directory. This will tell you if there are any current changes which are not saved by git.
      - If anything is not up-to-date in your repository, add any changed files that you want to save to the staging area with the command git add <filename>. If you want to add all changed files, you can use git add -A.
        Note: It is common to leave some files, like executables and editor save files, out of a git repository. If you are trying to leave these out, you may want to set up a .gitignore file.
      - Once everything that you want to save is in the staging area, add it to your local repository with a commit. You can type git commit, and git will open an editor for you to write a commit message, which explains what changes you have made. Or you can type git commit -m "commit message here" if you want to create a short commit message instead of using an editor.
    - Finally, you need to push your local repository to the remote repository. In most cases, you can do this just by typing git push in the repository. Now your changes should appear, together with your commit message, on GitLab (or whatever git server you are using).
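For the “slightly more complicated” case above, one common approach is to register the server as a remote with git remote add and then push. The sketch below is an assumption about how that step could look, not the department's documented procedure. A bare repository on local disk stands in for the empty GitLab project, and in practice the address is the HTTPS or SSH address copied from the project page.

```shell
#!/bin/sh
# Sketch: connect an existing local repository to a new, empty remote.
set -e

# Placeholder identity so the demo commits work on any machine.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
cd "$work"

# Stand-in for an empty project created on the server (hypothetical path).
git -c init.defaultBranch=master init -q --bare server.git

# An existing local repository with one commit and no remote yet.
mkdir project
cd project
git -c init.defaultBranch=master init -q
echo "my work" > report.txt
git add report.txt
git commit -q -m "Add report"

# Register the server under the nickname origin.
git remote add origin "$work/server.git"

# -u records origin/master as the upstream branch, so that a plain
# "git push" or "git pull" works from now on.
git push -q -u origin master

# Later changes follow the usual add, commit, push cycle.
echo "more work" >> report.txt
git commit -q -a -m "Update report"
git push -q

pushed=$(git ls-remote origin refs/heads/master | cut -f1)
local_head=$(git rev-parse HEAD)
```

After the last push, the commit hash stored on the server matches the latest local commit, which is a quick way to confirm that the work was submitted.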