git tutorial
http://vogella.com/articles/Git/
1. Git
A version control system allows you to track the history of a collection of files and includes the functionality to revert the collection of files to another version. Each version captures a snapshot of the files at a certain point in time. The collection of files is usually source code for a programming language but a typical version control system can put any type of file under version control.
The collection of files and their complete history are stored in a repository.
The process of creating different versions (snapshots) in the repository is depicted in the following graphic. Note that this picture applies primarily to Git; other version control systems such as CVS do not create snapshots but store file deltas.
These snapshots can be used to change your collection of files. You may, for example, revert the collection of files to a state from 2 days ago. Or you may switch between versions for experimental features.
A distributed version control system does not necessarily have a central server which stores the data.
The user can copy an existing repository. This copying process is typically called cloning in a distributed version control system and the resulting repository can be referred to as clone.
Typically there is a central server for keeping a repository but each cloned repository is a full copy of this repository. The decision which of the copies is considered to be the central server repository is a pure convention and not tied to the capabilities of the distributed version control system itself.
Every clone contains the full history of the collection of files and a cloned repository has the same functionality as the original repository.
Every repository can exchange versions of the files with other repositories by transporting these changes. This is typically done via a repository running on a server which, unlike the local machine of a developer, is always online.
Git is a distributed version control system.
Git originates from the Linux kernel development and is used by many popular Open Source projects, e.g. the Android or the Eclipse Open Source projects, as well as by many commercial organizations.
The core of Git was originally written in the programming language C but Git has also been re-implemented in other languages, e.g. Java, Ruby and Python.
After cloning or creating a repository the user has a complete copy of the repository. The user performs version control operations against this local repository, e.g. create new versions, revert changes, etc.
You can configure your repository to be a bare or a non-bare repository.
- bare repositories are used on servers to share changes coming from different developers
- non-bare repositories allow you to create new changes through modification of files and to create new versions in the repository
If you want to delete a Git repository, you can simply delete the folder which contains the repository.
Git allows the user to synchronize the local repository with other (remote) repositories.
Users with sufficient authorization can push changes from their local repository to remote repositories. They can also fetch or pull changes from other repositories to their local Git repository.
Git supports branching, which means that you can work on different versions of your collection of files. A branch separates these different versions and allows the user to switch between these versions to work on them.
For example if you want to develop a new feature, you can create a branch and make the changes in this branch without affecting the state of your files in another branch.
Branches in Git are local to the repository. A branch created in a local repository, which was cloned from another repository, does not need to have a counterpart in the remote repository. Local branches can be compared with other local branches and with remote tracking branches. A remote tracking branch proxies the state of a branch in another remote repository.
Git allows changes from different branches to be combined. This allows the developer, for example, to work independently on a branch called production for bugfixes and another branch called feature_123 for implementing a new feature. The developer can use Git commands to combine the changes at a later point in time.
For example the Linux kernel community used to share code corrections (patches) via mailing lists to combine changes coming from different developers. Git is a system which allows developers to automate such a process.
The user works on a collection of files which may originate from a certain point in time of the repository. The user may also create new files or change and delete existing ones. The current collection of files is called the working tree.
A standard Git repository contains the working tree (single checkout of one version of the project) and the full history of the repository. You can work in this working tree by modifying content and committing the changes to the Git repository.
If you modify your working tree, e.g. by creating a new file or by changing an existing file, you need to perform two steps in Git to persist the changes in the Git repository. You first add selected files to the staging area and afterwards you commit the changes of the staging area to the Git repository.
Note
The staging area term is currently preferred by the Git community over the old index term. Both terms mean the same thing.
You need to mark changes in the working tree to be relevant for Git. This process is called staging, or adding changes to the staging area.
You add changes in the working tree to the staging area with the git add command. This command stores a snapshot of the specified files in the staging area.
The git add command allows you to incrementally modify files, stage them, modify and stage them again until you are satisfied with your changes.
After adding the selected files to the staging area, you can commit these files to permanently add them to the Git repository. Committing creates a new persistent snapshot (called commit or commit object) of the complete project content as recorded in the staging area. A commit object, like all objects in Git, is immutable.
The staging area keeps track of the snapshots of the files until the staged changes are committed.
For committing the staged changes you use the git commit command.
This process is depicted in the following graphic.
If you commit changes to your Git repository, you create a new commit object in the Git repository. This commit object is addressable via a SHA-1 checksum. This checksum is 160 bits long, written as 40 hexadecimal characters, and is a secure hash of the content of the files, the content of the directories, the complete history up to the new commit, the committer and several other factors.
This means that Git is safe: you cannot manipulate a file in the Git repository without Git noticing that the SHA-1 checksum no longer fits the content.
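The content addressing described above can be observed with the git hash-object command. The following is a minimal sketch, assuming a Git installation that uses the default SHA-1 object format; the sample strings are arbitrary.

```shell
set -eu
# hash the same content twice and a slightly different content once
h1=$(printf 'hello\n' | git hash-object --stdin)
h2=$(printf 'hello\n' | git hash-object --stdin)
h3=$(printf 'hello!\n' | git hash-object --stdin)
echo "same content:    $h1 / $h2"
echo "changed content: $h3"
# a SHA-1 id is printed as 40 hexadecimal characters
echo "length: ${#h1}"
```

Identical content always yields the identical id, while any change to the content yields a different id.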
The commit object points via a tree object to the individual files in this commit. The files are stored in the Git repository as blob objects and might be packed by Git for better performance and more compact storage. Blobs are addressed via their SHA-1 hash.
Packing involves storing changes as deltas, compression and storage of many objects in a single pack file. Pack files are accompanied by one or multiple index files which speed up access to individual objects stored in these packs.
A commit object is depicted in the following picture.
The above picture is simplified. Tree objects point to other tree objects and file blobs. Objects which didn't change between commits are reused by multiple commits.
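The relationship between commit, tree and blob objects can be inspected with git cat-file. The following sketch works in a throwaway directory; the author identity and the file names are placeholders chosen for this illustration.

```shell
set -eu
# placeholder identity so that the commit works without global configuration
export GIT_AUTHOR_NAME="Example Surname" GIT_AUTHOR_EMAIL="your.email@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
demo=$(mktemp -d)
cd "$demo"
git init -q .
mkdir datafiles
echo "foo" > datafiles/data.txt
echo "bar" > test02
git add .
git commit -q -m "Initial commit"

# ask Git for the type of each object
commit_type=$(git cat-file -t HEAD)
tree_type=$(git cat-file -t 'HEAD^{tree}')
subtree_type=$(git cat-file -t HEAD:datafiles)
blob_type=$(git cat-file -t HEAD:test02)

# print the objects themselves
git cat-file -p HEAD           # commit: tree id, author, committer, message
git cat-file -p 'HEAD^{tree}'  # tree: entries pointing to blobs and sub-trees
git cat-file -p HEAD:test02    # blob: the file content
```

The directory datafiles shows up as a tree entry inside the top-level tree, which matches the statement that tree objects point to other tree objects and file blobs.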
2. Tools
The original tooling for Git was based on the command line. These days there is a huge variety of available Git tools.
You can use graphical tools, for example EGit for the Eclipse IDE.
The following table provides a summary of important Git terminology.
Table 1. Git Terminology
Term | Definition |
---|---|
Branches | A branch is a named pointer to a commit. Selecting a branch in Git terminology is called checking out a branch. If you are working in a certain branch, the creation of a new commit advances this pointer to the newly created commit. Each commit knows its parents (predecessors). Successors are retrieved by traversing the commit graph starting from branches or other refs, symbolic references (e.g. HEAD) or explicit commit objects. This way a branch defines its own line of descendants in the overall version graph formed by all commits in the repository. You can create a new branch from an existing one and change the code independently from other branches. One of the branches is the default (typically named master). The default branch is the one for which a local branch is automatically created when cloning the repository. |
Commit | When you commit your changes into a repository this creates a new commit object in the Git repository. This commit object uniquely identifies a new revision of the content of the repository. This revision can be retrieved later, for example if you want to see the source code of an older version. Each commit object contains the author and the committer, thus making it possible to identify who did the change. The author and committer might be different people. The author did the change and the committer applied the change to the Git repository. |
HEAD | HEAD is a symbolic reference most often pointing to the currently checked out branch. Sometimes HEAD points directly to a commit object; this is called detached HEAD mode. In that state the creation of a commit will not move any branch. Predecessors of HEAD can be addressed via HEAD~1, HEAD~2 and so on. If you switch branches the HEAD pointer moves to the last commit in the branch. If you checkout a specific commit HEAD points to this commit. |
Index | Index is an alternative term for the staging area. |
Repository | A repository contains the history, the different versions over time and all different branches and tags. In Git each copy of the repository is a complete repository. If the repository is not a bare repository, it allows you to checkout revisions into your working tree and to capture changes by creating new commits. Bare repositories are only changed by transporting changes from other repositories. This tutorial uses the term repository to talk about a non-bare repository. If it talks about a bare repository this is explicitly mentioned. |
Revision | Represents a version of the source code. Git implements revisions as commit objects (or short: commits). These are identified by a SHA-1 secure hash. SHA-1 ids are 160 bits long and are represented in hexadecimal. |
Staging area | The staging area is the place to store changes in the working tree before the commit. The staging area contains the set of the snapshots of changes in the working tree (changed or new files) relevant to create the next commit and stores their mode (file type, executable bit). |
Tags | A tag points to a commit which uniquely identifies a version of the Git repository. With a tag, you can have a named point to which you can always revert. You can revert to any point in a Git repository, but tags make it easier. The benefit of tags is to mark the repository for a specific reason, e.g. with a release. Branches and tags are named pointers; the difference is that branches move when a new commit is created while tags always point to the same commit. Technically a tag reference can also point to an annotated tag object. |
URL | A URL in Git determines the location of the repository. Git distinguishes between fetchurl for getting new data from other repositories and pushurl for pushing data to another repository. |
Working tree | The working tree contains the set of working files for the repository. You can modify the content and commit the changes as new commits to the repository. |
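The difference between branches and tags from the table above can be sketched as follows. The tag name v1, the file name and the author identity are placeholders for this illustration.

```shell
set -eu
# placeholder identity so that the commits work without global configuration
export GIT_AUTHOR_NAME="Example Surname" GIT_AUTHOR_EMAIL="your.email@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
demo=$(mktemp -d)
cd "$demo"
git init -q .
echo "one" > file.txt
git add file.txt
git commit -q -m "first"
first=$(git rev-parse HEAD)
# create a lightweight tag on the first commit
git tag v1
echo "two" > file.txt
git commit -q -a -m "second"
# the branch moved to the new commit, the tag did not
branch_tip=$(git rev-parse HEAD)
tag_target=$(git rev-parse v1)
echo "branch: $branch_tip"
echo "tag:    $tag_target"
```

After the second commit the tag still resolves to the first commit, while HEAD (via the current branch) resolves to the second one.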
A file in the working tree of a Git repository can have different states. These states are the following:
- untracked: the file is not tracked by the Git repository, i.e. it was neither staged (added to the staging area) nor committed
- tracked: committed and not staged
- staged: staged to be included in the next commit
- dirty / modified: the file has changed but the change is not staged
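These states can be made visible with git status --short. The sketch below puts one file into each state; the file names and the author identity are placeholders.

```shell
set -eu
# placeholder identity so that the commit works without global configuration
export GIT_AUTHOR_NAME="Example Surname" GIT_AUTHOR_EMAIL="your.email@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
demo=$(mktemp -d)
cd "$demo"
git init -q .
echo "a" > committed.txt
echo "b" > modified.txt
git add .
git commit -q -m "Initial commit"

echo "change" >> modified.txt   # tracked file, modified but not staged
echo "c" > staged.txt
git add staged.txt              # new file, staged for the next commit
echo "d" > untracked.txt        # new file, never added

# short format: two status columns followed by the path
status=$(git status --short)
printf '%s\n' "$status"
```

In the short format, " M" marks a modified but unstaged file, "A " a staged new file and "??" an untracked file. The committed and unchanged file is not listed.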
You can use ^ (caret) and ~ (tilde) to reference predecessor commit objects from other references. Predecessor commits are sometimes also called parent commits. You can combine the ^ and ~ operators.
[reference]~1 describes the first predecessor of the commit object accessed via [reference]. [reference]~2 is the first predecessor of the first predecessor of the [reference] commit. [reference]~3 is the first predecessor of the first predecessor of the first predecessor of the [reference] commit, etc.
[reference]~ is an abbreviation for [reference]~1.
For example you can use the HEAD~1 or HEAD~ reference to access the first predecessor of the commit to which the HEAD pointer currently points.
[reference]^1 also describes the first predecessor of the commit object accessed via [reference].
The difference is that [reference]^2 describes the second predecessor of a commit. A merge commit has two predecessors.
[reference]^ is an abbreviation for [reference]^1.
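The difference between ^ and ~ only shows up on a merge commit. The following sketch builds a small history with one merge and resolves the references with git rev-parse; the branch name feature_123, the file names and the author identity are assumptions for this illustration.

```shell
set -eu
# placeholder identity so that the commits work without global configuration
export GIT_AUTHOR_NAME="Example Surname" GIT_AUTHOR_EMAIL="your.email@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
demo=$(mktemp -d)
cd "$demo"
git init -q .
# make sure the initial branch is called master
git symbolic-ref HEAD refs/heads/master
echo "base" > base.txt
git add base.txt
git commit -q -m "base"
base=$(git rev-parse HEAD)

git checkout -q -b feature_123
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "feature work"
feature_tip=$(git rev-parse HEAD)

git checkout -q master
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "bugfix"
master_tip=$(git rev-parse HEAD)

# both branches have diverged, so this creates a merge commit with two parents
git merge -q -m "Merge feature_123" feature_123

first_parent=$(git rev-parse HEAD^1)     # previous tip of master
second_parent=$(git rev-parse HEAD^2)    # tip of feature_123
tilde_one=$(git rev-parse HEAD~1)        # same commit as HEAD^1
tilde_two=$(git rev-parse HEAD~2)        # first parent of the first parent
combined=$(git rev-parse HEAD^2~1)       # first parent of the second parent
```

HEAD~2 follows the first parent twice, whereas HEAD^2 selects the second parent of the merge commit. In this history both HEAD~2 and HEAD^2~1 end up at the base commit.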
You can also specify ranges of commits. This is useful for certain Git commands for example for seeing the changes between a series of commits.
The double dot operator allows you to select all commits which are reachable from a commit c2 but not from commit c1. The syntax for this is "c1..c2". A commit A is reachable from another commit B, if A is a direct or indirect predecessor of B.
Tip
Think of c1..c2 as all commits as of c1 (not including c1) until commit c2.
For example you can ask Git to show all commits which happened between HEAD and HEAD~4.
git log HEAD~4..HEAD
This also works for branches. To list all commits which are in the "master" branch but not in the "testing" branch use the following command.
git log testing..master
You can also list all commits which are in the "testing" but not in the "master" branch.
git log master..testing
The triple dot operator allows you to select all commits which are reachable either from commit c1 or commit c2 but not from both of them.
This is useful to show all commits in two branches which have not yet been combined.
# show all commits which
# can be reached by master or testing
# but not both
git log master...testing
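To check what the two range operators select, you can count the commits with git rev-list, which accepts the same range syntax as git log. This is a sketch in a throwaway repository; the helper function commit_file and the author identity are hypothetical and only serve this illustration.

```shell
set -eu
# placeholder identity so that the commits work without global configuration
export GIT_AUTHOR_NAME="Example Surname" GIT_AUTHOR_EMAIL="your.email@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# hypothetical helper: create a file named after the argument and commit it
commit_file() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/master
commit_file base
git checkout -q -b testing
commit_file t1            # one commit only in testing
git checkout -q master
commit_file m1            # two commits only in master
commit_file m2

only_master=$(git rev-list --count testing..master)
only_testing=$(git rev-list --count master..testing)
either=$(git rev-list --count master...testing)
last_two=$(git rev-list --count HEAD~2..HEAD)
echo "testing..master:  $only_master"
echo "master..testing:  $only_testing"
echo "master...testing: $either"
echo "HEAD~2..HEAD:     $last_two"
```

The shared base commit is reachable from both branches and is therefore excluded from all three branch ranges.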
On Ubuntu and similar systems you can install the Git command line tool via the following command:
sudo apt-get install git
On Fedora, Red Hat and similar systems you can install the Git command line tool via the following command:
yum install git
To install Git on other Linux distributions please check the documentation of your distribution. The following listing contains the commands for the most popular ones.
# Arch Linux
pacman -S git
# Gentoo
emerge -av git
# SUSE
zypper install git
A Windows version of Git can be found on the msysgit project site. The URL to this webpage is listed below. This website also describes the installation process.
http://code.google.com/p/msysgit/
The easiest way to install Git on a Mac is via a graphical installer. This installer can be found under the following URL.
http://code.google.com/p/git-osx-installer
As this procedure is not an official Apple one, it may change from time to time. The easiest way to find the current procedure is to search Google for "How to install Git on a Mac".
Git is also installed by default with the Apple Developer Tools on OSX.
Git allows you to store global settings in the .gitconfig file located in the user home directory. Git stores the committer and author of a change in each commit. This and additional information can be stored in the global settings.
You set up these values with the git config command.
In each Git repository you can also configure the settings for this repository. Global configuration is done if you include the --global flag, otherwise your configuration is specific to the current Git repository.
You can also set up system-wide configuration. Git stores these values in the /etc/gitconfig file, which contains the configuration for every user and repository on the system. To set this up, ensure you have sufficient rights, i.e. root rights, in your OS and use the --system option.
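The interplay of global and repository-specific settings can be tried out without touching your real configuration. The sketch below points HOME and XDG_CONFIG_HOME to a temporary directory for the duration of the script; this redirection and the names "Global Name" and "Repo Name" are assumptions for the illustration.

```shell
set -eu
sandbox=$(mktemp -d)
# redirect the per-user configuration to a throwaway location
export HOME="$sandbox/home" XDG_CONFIG_HOME="$sandbox/xdg"
mkdir -p "$HOME" "$XDG_CONFIG_HOME"

# global setting: stored in $HOME/.gitconfig
git config --global user.name "Global Name"

mkdir "$sandbox/repo"
cd "$sandbox/repo"
git init -q .
# without a repository-specific value the global one is used
before=$(git config user.name)

# no --global flag: stored in .git/config of this repository
git config user.name "Repo Name"
effective=$(git config user.name)
global_value=$(git config --global user.name)
echo "before:    $before"
echo "effective: $effective"
echo "global:    $global_value"
```

The repository-specific value takes precedence over the global one, while the global value itself stays unchanged.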
The following configures Git so that a certain user and email address is used, enables color coding and tells Git to ignore certain files.
Configure your user and email for Git via the following commands.
# configure the user which will be used by git
# Of course you should use your name
git config --global user.name "Example Surname"
# Same for the email address
git config --global user.email "your.email@gmail.com"
The following command configures Git so that the git push command pushes only the active branch (in case it is connected to a remote branch, i.e. configured as remote tracking branch) to your Git remote repository. As of Git version 2.0 this is the default and therefore it is good practice to configure this behavior.
# set default so that only the current branch is pushed
git config --global push.default simple
# alternatively configure Git to push all matching branches
# git config --global push.default matching
You learn about the push command in Section 13.2, “Push changes to another repository”.
If you pull in changes from a remote repository, Git by default creates merge commits if you pull in divergent changes. This may be undesired and you can avoid this via the following setting.
# set default so that you avoid unnecessary commits
git config --global branch.autosetuprebase always
Note
This setting depends on the individual workflow. Some teams prefer to create merge commits, but the author of this tutorial likes to avoid them.
The following commands enable color highlighting for Git in the console.
git config --global color.ui true
git config --global color.status auto
git config --global color.branch auto
By default Git uses the system default editor which is taken from the VISUAL or EDITOR environment variables if set. You can configure a different one via the following setting.
# setup vim as default editor for Git (Linux)
git config --global core.editor vim
Git does not provide a default merge tool for integrating conflicting changes into your working tree. You have to use third-party visual merge tools like tortoisemerge, p4merge, kdiff3 etc. A Google search for these tools helps you to install them on your platform.
Once you have installed them you can set your selected tool as default merge tool with the following command.
# setup kdiff3 as default merge tool (Linux)
git config --global merge.tool kdiff3
# to install it under Ubuntu use
# sudo apt-get install kdiff3
All possible Git settings are described under the following link: git-config manual page
To query your Git settings of the local repository, execute the following command:
git config --list
If you want to query the global settings you can use the following command.
git config --global --list
7. Setup rules for ignoring files and directories
Git can be configured to ignore certain files and directories. This is configured in a .gitignore file. This file can be in any directory and can contain patterns for files.
You can use certain wildcards in this file. * matches zero or more characters, except a slash. The ? wildcard matches one character. More patterns are possible and described under the following URL: gitignore manpage
For example, the following .gitignore file tells Git to ignore the bin and target directories and all files ending with a ~.
# ignore all bin directories
# matches "bin" in any subfolder
bin/
# ignore all target directories
target/
# ignore all files ending with ~
*~
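You can check which paths such patterns match with the git check-ignore command. The following sketch writes the patterns from the example into a throwaway repository; the directory layout and file names are made up for this illustration.

```shell
set -eu
demo=$(mktemp -d)
cd "$demo"
git init -q .
# the three patterns from the example above
printf '%s\n' 'bin/' 'target/' '*~' > .gitignore
mkdir -p bin src/bin target
touch bin/app.class src/bin/tool.class target/out.jar 'notes.txt~' src/Main.java

# prints only those of the given paths which are ignored
ignored=$(git check-ignore bin/app.class src/bin/tool.class target/out.jar 'notes.txt~' src/Main.java)
printf '%s\n' "$ignored"
```

The file src/Main.java is not printed because no pattern matches it. The bin/ pattern also matches the nested src/bin directory, as it contains no leading slash.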
You can create the .gitignore file in the root directory of the working tree to make it specific to the Git repository.
Note
Files that are committed to the Git repository are not automatically removed if you add them to a .gitignore file. You can use the git rm -r --cached [filename] command to remove existing files from a Git repository.
Tip
The .gitignore file tells Git to ignore the specified files in Git commands. You can still add ignored files to the staging area of the Git repository by using the --force parameter, e.g. with the git add --force [filename] command.
This is useful if you want to add, for example, auto-generated binaries, but you need fine control over the version which is added and want to exclude them from the normal workflow.
You can also set up a global .gitignore file valid for all Git repositories via the core.excludesfile setting. The setup of this setting is demonstrated in the following code snippet.
# Create a ~/.gitignore in your user directory
cd ~/
touch .gitignore
# Exclude bin and .metadata directories
echo "bin" >> .gitignore
echo ".metadata" >> .gitignore
echo "*~" >> .gitignore
echo "target/" >> .gitignore
# Configure Git to use this file
# as global .gitignore
git config --global core.excludesfile ~/.gitignore
The local .gitignore file can be committed into the Git repository and therefore is visible to everyone who clones the repository. The global .gitignore file is only locally visible.
Git ignores empty directories, i.e. it does not put them under version control.
If you want to track such a directory, it is a common practice to put a file called .gitkeep in the directory. The file could be called anything; Git assigns no special significance to this name. As the directory now contains a file, Git includes it into its version control mechanism.
Tip
One problem with this approach is that '.gitkeep' is unlikely to be ignored by version control systems or build agents, resulting in .gitkeep being copied to the output repository. One possible alternative is to create a .gitignore file in there, which has the same effect but will more likely be ignored by tools that do build processing and filtering of SCM specific resources.
In this chapter you create a few files, create a local Git repository and commit your files into this repository. The comments (marked with #) before the commands explain the specific actions.
Open a command shell for the operations.
The following commands create an empty directory which you will use as Git repository.
# switch to home
cd ~/
# create a directory and switch into it
mkdir ~/repo01
cd repo01
# create a new directory
mkdir datafiles
The following explanation is based on a non-bare repository. See Section 3, “Terminology” for the difference between a bare repository and a non-bare repository with a working tree.
Every Git repository is stored in the .git folder of the directory in which the Git repository has been created. This directory contains the complete history of the repository. The .git/config file contains the configuration for the repository.
The following command creates a Git repository in the current directory.
# Initialize the Git repository
# for the current directory
git init
All files inside the repository folder excluding the .git folder are the working tree for a Git repository.
The following commands create some files with some content that will be placed under version control.
# switch to your new repository
cd ~/repo01
# create an empty file in the datafiles directory
touch datafiles/data.txt
# create a few files with content
ls > test01
echo "bar" > test02
echo "foo" > test03
The git status command shows the working tree status, i.e. which files have changed, which are staged and which are not part of the staging area. It also shows which files have merge conflicts and gives an indication what the user can do with these changes, e.g. add them to the staging area or remove them, etc.
Run it via the following command.
git status
Before committing changes to a Git repository you need to mark the changes that should be committed. This is done by adding the new and changed files to the staging area. This creates a snapshot of the affected files.
Note
In case you change one of the files again before committing, you need to add it again to the staging area to commit the new changes.
# add all files to the index of the
# Git repository
git add .
Afterwards run the git status command again to see the current status.
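That the staging area holds a snapshot, and not a live view of the file, can be checked as follows. This is a sketch in a throwaway repository; the file content and the author identity are placeholders.

```shell
set -eu
# placeholder identity so that the commits work without global configuration
export GIT_AUTHOR_NAME="Example Surname" GIT_AUTHOR_EMAIL="your.email@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
demo=$(mktemp -d)
cd "$demo"
git init -q .
echo "v1" > test01
git add test01
git commit -q -m "Initial commit"

echo "v2" > test01
git add test01                          # snapshot of v2 goes to the staging area
echo "v3" > test01                      # changed again, this change is NOT staged

staged=$(git show :test01)              # content stored in the staging area
git commit -q -m "Second commit"        # commits the staged snapshot only
committed=$(git show HEAD:test01)       # content stored in the new commit
still_modified=$(git diff --name-only)  # files which differ from the staging area
echo "staged: $staged, committed: $committed, modified: $still_modified"
```

The commit contains v2, and test01 is still reported as modified because v3 was never added to the staging area.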
After adding the files to the Git staging area, you can commit them to the Git repository. This creates a new commit object with the staged changes in the Git repository and the HEAD reference points to the new commit. The -m parameter allows you to specify the commit message. If you leave this parameter out, your default editor is started and you can enter the message in the editor.
# commit your file to the local repository
git commit -m "Initial commit"
The Git operations you performed have created a local Git repository in the .git folder and added all files to this repository via one commit. Run the git log command to see the history.
# show the Git log for the change
git log
You see an output similar to the following.
commit e744d6b22afe12ce75cbd1b671b58d6703ab83f5
Author: Lars Vogel <Lars.Vogel@gmail.com>
Date:   Mon Feb 25 11:48:50 2013 +0100

    Initial commit
Your directory contains the Git repository as well as the Git working tree for your files. This directory structure is depicted in the following screenshot.
11. Remove files and adjust the last commit
If you delete a file which is under version control, git add . does not record this file deletion in Git versions before 2.0. As of Git 2.0, git add . also stages deletions.
You can use the git rm command to delete the file from your working tree and record the deletion of the file in the staging area.
# Create a file and commit it
touch nonsense2.txt
git add .
git commit -m "more nonsense"
# remove the file and record the deletion in Git
git rm nonsense2.txt
# commit the removal
git commit -m "Removes nonsense2.txt file"
Tip
As an alternative to the git rm command you can use the git commit command with the -a flag or the git add command with the -A flag. With git commit, the -a flag adds changes of files known by the Git repository to the commit. With git add, the -A flag adds all file changes including deletions to the staging area.
For this test, commit a new file and remove it afterwards.
# create a file and put it under version control
touch nonsense.txt
git add .
git commit -m "a new file has been created"
# remove the file
rm nonsense.txt
# show status, output listed below the command
git status
# on branch master
# Changes not staged for commit:
#   (use "git add/rm <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#   deleted: nonsense.txt
#
# no changes added to commit (use "git add" and/or "git commit -a")
# try standard way of committing -> will NOT work (Git versions before 2.0)
# output of the command listed below
git add .
git commit -m "file has NOT been removed"
# On branch master
# Changes not staged for commit:
#   (use "git add/rm <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#   deleted: nonsense.txt
#
# no changes added to commit (use "git add" and/or "git commit -a")
After validating that this command does not remove the file from the Git repository you can use the -a parameter. Be aware that the -a parameter also adds other changes.
# commit the remove with the -a flag
git commit -a -m "File nonsense.txt is now removed"
# alternatively you could add deleted files to the staging area via
# git add -A .
# git commit -m "File nonsense.txt is now removed"
You can use the git reset [filename] command to remove a file from the staging area, which you added with git add [filename]. Removing a file from the staging area prevents it from being included in the next commit.
# create a file and add to index
touch unwantedstaged.txt
git add unwantedstaged.txt
# remove it from the index
git reset unwantedstaged.txt
# to cleanup, delete it
rm unwantedstaged.txt
The git commit --amend command makes it possible to replace the last commit. This allows you to change the last commit including the commit message.
Note
The old commit is still available until a clean-up job removes it. See Section 31.2, “git reflog” for details.
Assume the last commit message was incorrect as it contained a typo. The following command corrects this via the --amend parameter.
# assume you have something to commit
git commit -m "message with a tpyo here"
git commit --amend -m "More changes - now correct"
You should use the git commit --amend command only for commits which have not been pushed to a public branch of another Git repository. The git commit --amend command creates a new commit ID and people may have based their work already on the existing commit. In this case they would need to migrate their work based on the new commit.
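The effect on the commit ID can be verified by comparing the ID before and after the amend. The following is a sketch in a throwaway repository; the file name and the author identity are placeholders.

```shell
set -eu
# placeholder identity so that the commits work without global configuration
export GIT_AUTHOR_NAME="Example Surname" GIT_AUTHOR_EMAIL="your.email@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
demo=$(mktemp -d)
cd "$demo"
git init -q .
echo "content" > file.txt
git add file.txt
git commit -q -m "message with a tpyo here"
before=$(git rev-parse HEAD)

# replace the last commit, only the message changes
git commit -q --amend -m "More changes - now correct"
after=$(git rev-parse HEAD)

count=$(git rev-list --count HEAD)      # number of commits in the history
message=$(git log -1 --format=%s)       # subject line of the last commit
old_type=$(git cat-file -t "$before")   # the replaced commit object still exists
echo "before: $before"
echo "after:  $after"
```

The history still contains one commit, but under a new ID. The replaced commit object remains in the object database until Git cleans it up.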
Sometimes you change your .gitignore file so that it matches files which are already tracked. Git does not stop tracking such files automatically; their last version is still in the Git repository.
If you want to remove the last version of the files from your Git repository you need to do this explicitly via the following commands.
# Remove directory .metadata from git repo
git rm -r --cached .metadata
# Remove file test.txt from repo
git rm --cached test.txt
Note
This does not remove the file from the repository history. If the file should also be removed from the history, have a look at git filter-branch which allows you to rewrite the commit history. See Section 43.1, “Using git filter-branch” for details.
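The following sketch walks through this for a single file in a throwaway repository: the file disappears from the set of tracked files, stays on disk and remains available in the history. The commit messages and the author identity are placeholders.

```shell
set -eu
# placeholder identity so that the commits work without global configuration
export GIT_AUTHOR_NAME="Example Surname" GIT_AUTHOR_EMAIL="your.email@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
demo=$(mktemp -d)
cd "$demo"
git init -q .
echo "data" > test.txt
git add test.txt
git commit -q -m "Adds test.txt"

# ignore the file from now on and remove it from the staging area only
echo "test.txt" > .gitignore
git rm -q --cached test.txt
git add .gitignore
git commit -q -m "Stops tracking test.txt"

tracked=$(git ls-files)                    # files tracked in the latest state
in_history=$(git show HEAD~1:test.txt)     # content in the previous commit
echo "tracked files: $tracked"
echo "content in previous commit: $in_history"
```

Because of the --cached option the file test.txt still exists in the working tree after the command.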
Remotes are named URLs, stored in a Git repository, which point to other repositories that are hosted on the Internet, locally or in the network.
Such remotes can be used to synchronize the changes of several Git repositories. A local Git repository can be connected to multiple remote repositories and you can synchronize your local repository with them via Git operations.
Note
Think of remotes as shorter bookmarks for repositories. You can always connect to a remote repository if you know its URL and if you have access to it. Without remotes the user would have to type the URL for each and every command which communicates with another repository.
It is possible that users connect their individual repositories directly, but a typical Git workflow involves one or more remote repositories which are used to synchronize the individual repositories. Typically the remote repository which is used for synchronization is located on a server which is always available.
Tip
A remote repository can also be hosted in the local file system.
A remote repository on a server typically does not require a working tree. A Git repository without a working tree is called a bare repository. You can create such a repository with the --bare option. The command to create a new empty bare remote repository is displayed below.
# create a bare repository
git init --bare
By convention the name of a bare repository should end with the .git extension.
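The difference between the two repository types is visible in the directory layout and can be queried with git rev-parse --is-bare-repository. The following sketch creates both types in a temporary directory; the directory names follow the ones used in this tutorial.

```shell
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q --bare remote-repository.git
git init -q repo01

# ask each repository whether it is bare
bare_flag=$(cd remote-repository.git && git rev-parse --is-bare-repository)
nonbare_flag=$(cd repo01 && git rev-parse --is-bare-repository)
echo "remote-repository.git is bare: $bare_flag"
echo "repo01 is bare: $nonbare_flag"

# the bare repository keeps HEAD, config, objects, refs etc. directly
# in its root, the non-bare one keeps them in the .git folder
ls remote-repository.git
ls -a repo01
```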
In this section you create a bare Git repository. In order to simplify the following examples, the Git repository is hosted locally in the filesystem and not on a server on the Internet.
Note
To create a bare Git repository on the Internet you would, for example, connect to your server via the ssh protocol or you would use a Git hosting platform, e.g. Github.com.
Execute the following commands to create a bare repository based on your existing Git repository.
# switch to the first repository
cd ~/repo01
# create a new bare repository by cloning the first one
git clone --bare . ../remote-repository.git
# check the content of the git repo, it is similar
# to the .git directory in repo01
# files might be packed in the bare repository
ls ~/remote-repository.git
Tip
You can convert a normal Git repository into a bare repository by moving the content of the .git folder into the root of the repository and removing all other files from the working tree. Afterwards you need to update the Git repository configuration with the git config core.bare true command. The problem with this process is that it does not take into account potential future internal changes of Git, hence cloning a repository with the --bare option should be preferred.
If you clone a repository, Git implicitly creates a remote named origin by default. The origin remote links back to the cloned repository.
If you create a Git repository from scratch with the git init command, the origin remote is not created automatically.
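The difference between the two cases can be checked with the git remote command. The following sketch assumes a temporary directory and two made-up repository names, scratch and cloned.

```shell
# work in a throw-away directory
workdir=$(mktemp -d)
cd "$workdir"

# a repository created from scratch has no remotes
# (scratch and cloned are hypothetical names)
git init -q scratch
remotes_after_init=$(git -C scratch remote)

# a clone gets a remote called origin which points to its source;
# Git warns that the cloned repository is empty, which is fine here
git clone -q scratch cloned 2>/dev/null
remotes_after_clone=$(git -C cloned remote)

echo "after init:  '$remotes_after_init'"
echo "after clone: '$remotes_after_clone'"
```

The first line of output shows an empty list, the second one shows origin.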
You add more remotes to your repository with the git remote add command.
Earlier you created a new Git repository from scratch. Use the following command to add a remote with the name origin which points to your new bare repository.
# Add ../remote-repository.git with the name origin
git remote add origin ../remote-repository.git
You can synchronize your local Git repository with remote repositories. These commands are covered in detail in later sections but the following command demonstrates how you can send changes to your remote repository.
# do some changes
echo "I added a remote repo" > test02
# commit
git commit -a -m "This is a test for the new remote origin"
# to push use the command:
# git push [target]
# default for [target] is origin
git push origin
To see the details of a single remote repository, use the following command.
# show the details of the remote repo called origin
git remote show origin
To list all defined remotes and their details, e.g. the URL, use the following commands.
# show the existing defined remotes
git remote
# show details about the remotes
git remote -v
13. Cloning remote repositories and push and pull
Clone a repository and check out a working tree in a new directory via the following commands.
# Switch to home
cd ~
# Make new directory
mkdir repo02
# Switch to new directory
cd ~/repo02
# Clone
git clone ../remote-repository.git .
The git push command allows you to send data to other repositories. By default it sends data from your current branch to the same branch of the remote repository. See Section 16.6, “Push changes of a branch to a remote repository” for details on pushing branches or the Git push manpage for general information.
Make some changes in your local repository and push them from your first repository to the remote repository via the following commands.
# Make some changes in the first repository
cd ~/repo01
# Make some changes in the file
echo "Hello, hello. Turn your radio on" > test01
echo "Bye, bye. Turn your radio off" > test02
# Commit the changes, -a will commit changes for modified files
# but will not automatically add new files
git commit -a -m "Some changes"
# Push the changes
git push ../remote-repository.git
Note
By default you can only push to bare repositories (repositories without a working tree). In addition, you can only push a change to a remote repository if it results in a fast-forward merge. See Section 34, “Merging” to learn about fast-forward merges.
The git pull command allows you to get the latest changes from another repository for the current branch.
To test this in your example Git repositories, switch to your second repository, pull in the recent changes from the remote repository, make some changes and push them to your remote repository via the following commands.
# switch to second directory
cd ~/repo02
# pull in the latest changes of your remote repository
git pull
# make changes
echo "A change" > test01
# commit the changes
git commit -a -m "A change"
# push changes to remote repository
# origin was created automatically when we cloned from this repository
git push origin
You can pull in the changes in your first example repository with the following commands.
# switch to the first repository and pull in the changes
cd ~/repo01
git pull ../remote-repository.git/
# check the changes
git status
Tip
The git pull command is actually a shortcut for git fetch followed by the git merge or git rebase command, depending on your configuration. In Section 6.4, “Avoid merge commits for pulling” you configured your Git repository so that git pull is a fetch followed by a rebase. See Section 33.1, “Fetch” for more information about the fetch command.
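To see the two steps separately, you can run them by hand. The following sketch uses the merge variant and assumes a temporary directory with three made-up repositories: first, second and remote.git. It does not touch your example repositories.

```shell
# work in a throw-away directory
workdir=$(mktemp -d)
cd "$workdir"

# first repository with one commit on a branch called master
# (first, second and remote.git are hypothetical names)
git init -q first
cd first
# make sure the branch is called master, independent of the local Git configuration
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "version 1" > test01
git add test01
git commit -q -m "Initial commit"

# bare remote repository and a second clone of it
git clone -q --bare . ../remote.git
git remote add origin ../remote.git
git clone -q ../remote.git ../second

# publish a new commit from the first repository
echo "version 2" > test01
git commit -q -a -m "Second version"
git push -q origin master

# in the second repository: fetch only updates origin/master,
# the working tree still contains the old content
cd ../second
git fetch -q origin
after_fetch=$(cat test01)

# the merge brings the fetched change into the current branch
git merge -q origin/master
after_merge=$(cat test01)

echo "after fetch: $after_fetch"
echo "after merge: $after_merge"
```

After the fetch the file still contains "version 1". Only the merge updates it to "version 2".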
Git supports several transport protocols to connect to other Git repositories; the native protocol for Git is also called git.
The following command clones an existing repository using the Git protocol. The Git protocol uses port 9418, which might be blocked by firewalls.
# switch to a new directory
mkdir ~/online
cd ~/online
# clone online repository
git clone git://github.com/vogella/gitbook.git
If you have ssh access to a Git repository you can also use the ssh protocol. The name preceding @ is the user name used for the ssh connection.
# clone online repository
git clone ssh://git@github.com/vogella/gitbook.git
# older syntax
git clone git@github.com:vogella/gitbook.git
Alternatively you could clone the same repository via the http protocol.
# The following will clone via HTTP
git clone http://vogella@github.com/vogella/gitbook.git
As discussed earlier, cloning a repository creates a remote called origin pointing to the remote repository which you cloned from.
You can push changes to this origin repository via git push, as Git uses origin as default. Of course, pushing to a remote repository requires write access to this repository.
You can add more remotes via the git remote add [name] [URL_to_Git_repo] command. For example, if you cloned the repository from above via the Git protocol, you could add a new remote with the name github_http which uses the https protocol via the following command.
# Add a remote which uses the https protocol
git remote add github_http https://vogella@github.com/vogella/gitbook.git
It is possible to use the HTTP protocol to clone Git repositories. This is especially helpful if your firewall blocks everything except http or https.
Git also provides support for http access via a proxy server. The following Git command could, for example, clone a repository via http and a proxy. You can either set the proxy variable in general for all applications or set it only for Git.
This example uses environment variables.
# Linux
export http_proxy=http://proxy:8080
export https_proxy=https://proxy:8443
# Windows
set http_proxy=http://proxy:8080
set https_proxy=http://proxy:8080
# clone via the configured proxy
git clone http://dev.eclipse.org/git/org.eclipse.jface/org.eclipse.jface.snippets.git
Note
For secure, encrypted communication you should use the ssh or https protocol.
This example uses the following Git config settings.
# set proxy for git globally
git config --global http.proxy http://proxy:8080
# to check the proxy settings
git config --get http.proxy
# just in case you need to you can also revoke the proxy settings
git config --global --unset http.proxy
15. What are branches?
Git allows you to create branches, i.e. named pointers to commits. You can work on different branches independently from each other. The default branch is most often called master.
Git allows you to create branches very fast and cheap in terms of resource consumption. Git encourages the usage of branches on a regular basis.
If you decide to work on a branch, you check out this branch. This means that Git populates the working tree with the content of the commit to which the branch points and moves the HEAD pointer to the new branch. As explained in Section 3, “Terminology”, HEAD is a symbolic reference most often pointing to the currently checked out branch.
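You can watch the HEAD pointer move with the git symbolic-ref command. The following is a small sketch in a temporary directory. The repository name demo is made up, and the branch commands it uses are explained in detail in the remainder of this section.

```shell
# work in a throw-away directory
workdir=$(mktemp -d)
cd "$workdir"

# small repository with one commit (demo is a hypothetical name)
git init -q demo
cd demo
# make sure the branch is called master, independent of the local Git configuration
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > test01
git add test01
git commit -q -m "Initial commit"

# HEAD is a symbolic reference to the checked out branch
head_on_master=$(git symbolic-ref HEAD)

# create a branch and check it out, HEAD now refers to the new branch
git branch testing
git checkout -q testing
head_on_testing=$(git symbolic-ref HEAD)

# both branches are just names for the same commit
master_commit=$(git rev-parse master)
testing_commit=$(git rev-parse testing)

# prints "refs/heads/master -> refs/heads/testing"
echo "$head_on_master -> $head_on_testing"
```

Creating the branch did not copy any files. Both branch names point to the same commit until you create a new commit on one of them.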
The git branch command lists all local branches. The currently active branch is marked with *.
# lists available branches
git branch
If you want to see all branches (including remote tracking branches), use the -a option for the git branch command. See Section 32.1, “Remote tracking branches” for information about remote tracking branches.
# lists all branches including the remote branches
git branch -a
The -v option lists more information about the branches.
In order to list branches or tags in a remote repository, use the git ls-remote command as demonstrated in the following example.
# lists branches and tags in the
# remote repository called origin
git ls-remote origin
You can create a new branch via the git branch [newname] command. This command allows you to specify the starting point (commit id, tag, remote or local branch). If not specified, the commit to which the HEAD reference points is used to create the branch.
# Syntax: git branch <name> <hash>
# <hash> in the above is optional
git branch testing
To start working in a branch you have to check out the branch. If you check out a branch, the HEAD pointer moves to the last commit in this branch and the files in the working tree are set to the state of this commit.
The following commands demonstrate how you switch to the branch called testing, perform some changes in this branch and switch back to the branch called master.
# switch to your new branch
git checkout testing
# do some changes
echo "Cool new feature in this branch" > test01
git commit -a -m "new feature"
# switch to the master branch
git checkout master
# check that the content of
# the test01 file is the old one
cat test01
To create a branch and to switch to it at the same time you can use the git checkout command with the -b parameter.
# Create branch and switch to it
git checkout -b bugreport12
# Creates a new branch based on the master branch
# without the last commit
git checkout -b mybranch master~1
Renaming a branch can be done with the following command.
# rename branch
git branch -m [old_name] [new_name]
To delete a branch which is not needed anymore, you can use the following command.
# delete branch testing
git branch -d testing
# check if branch has been deleted
git branch
You can push a local branch to a remote repository by specifying the branch name. This creates the branch in the remote repository if it does not yet exist.
# push the local branch called "testing" to the remote repository
git push origin testing
# switch to the testing branch
git checkout testing
# some changes
echo "News for you" > test01
git commit -a -m "new feature in branch"
# push all including branch
git push
This way you can decide which branches you want to push to other repositories and which should be local branches. You learn more about branches and remote repositories in Section 32.1, “Remote tracking branches”.
To see the difference between two branches you can use the following command.
# shows the differences between
# current head of master and your_branch
git diff master your_branch
You can also use commit ranges as described in Section 4.2, “Commit ranges with the double dot operator” and Section 4.3, “Commit ranges with the triple dot operator”. For example, if you compare a branch called your_branch with the master branch, the following command shows the changes made in your_branch since it diverged from master.
# shows the differences in your
# branch based on the common
# ancestor for both branches
git diff master...your_branch
See Section 22, “Viewing changes with git diff and git show” for more examples of the git diff command.
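The difference between the two forms of git diff becomes visible once both branches have moved on. The following sketch builds such a situation in a temporary directory. The repository name demo and the file feature.txt are made up for this illustration.

```shell
# work in a throw-away directory
workdir=$(mktemp -d)
cd "$workdir"

# repository with one commit on master (demo is a hypothetical name)
git init -q demo
cd demo
# make sure the branch is called master, independent of the local Git configuration
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "base" > test01
git add test01
git commit -q -m "Initial commit"

# your_branch adds a new file
git checkout -q -b your_branch
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# meanwhile master changes test01
git checkout -q master
echo "changed on master" > test01
git commit -q -a -m "Change test01"

# compares the tips of both branches: lists feature.txt and test01
plain_diff=$(git diff --name-only master your_branch)

# compares your_branch with the common ancestor: lists only feature.txt
merge_base_diff=$(git diff --name-only master...your_branch)

echo "master your_branch:"
echo "$plain_diff"
echo "master...your_branch:"
echo "$merge_base_diff"
```

The triple dot form ignores the change which was made on master after the branches diverged, which is usually what you want when you review the work done in a branch.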
Git has the option to tag a commit in the repository history so that you can find it more easily at a later point in time. Most commonly, this is used to tag a certain version which has been released.
Git supports two different types of tags, lightweight and annotated tags.
A lightweight tag is a pointer to a commit, without any additional information about the tag. An annotated tag contains additional information about the tag, e.g. the name and email of the person who created the tag, a tagging message and the date of the tagging. Annotated tags can also be signed and verified with GNU Privacy Guard (GPG).
To create a lightweight tag don't use the -m, -a or -s option. Lightweight tags are often used for build tags which do not need additional information other than the build number or the timestamp.
# create lightweight tag
git tag 1.7.1
# See the tag
git show 1.7.1
You can create a new annotated tag via the git tag -a command. An annotated tag can also be created using the -m parameter, which is used to specify the description of the tag. The following command tags the current active HEAD.
# create tag
git tag 1.6.1 -m 'Release 1.6.1'
# show the tag
git show 1.6.1
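If you are unsure which kind of tag you are looking at, the git cat-file -t command tells you what a name refers to. The following sketch creates both kinds of tags in a temporary repository. The repository name demo is made up.

```shell
# work in a throw-away directory
workdir=$(mktemp -d)
cd "$workdir"

# small repository with one commit (demo is a hypothetical name)
git init -q demo
cd demo
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > test01
git add test01
git commit -q -m "Initial commit"

# a lightweight tag is only another name for the commit
git tag 1.7.1
lightweight_type=$(git cat-file -t 1.7.1)

# an annotated tag is stored as its own object in the repository
git tag -a 1.6.1 -m "Release 1.6.1"
annotated_type=$(git cat-file -t 1.6.1)

# prints "1.7.1 is a commit, 1.6.1 is a tag"
echo "1.7.1 is a $lightweight_type, 1.6.1 is a $annotated_type"
```

The separate tag object is the place where Git stores the tagger, the date and the message of an annotated tag.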
You can also create tags for a certain commit id.
git tag 1.5.1 -m 'version 1.5' [commit id]
You can use the option -s to create a signed tag. These tags are signed with GNU Privacy Guard (GPG) and can also be verified with GPG. For details on this please see the following URL: Git tag manpage.
If you want to use the code associated with the tag, use:
git checkout <tag_name>
Warning
If you check out a tag, you are in detached HEAD mode and commits created in this mode are harder to find after you check out a branch again. See Section 31.1, “Detached HEAD” for details.
By default the git push command does not transfer tags to remote repositories. You explicitly have to push the tag with the following command.
# push a tag or branch called tagname
git push origin [tagname]
# to explicitly push a tag and not a branch
git push origin tag <tagname>
You can delete tags with the -d parameter. This deletes the tag from your local repository. By default Git does not push tag deletions to a remote repository; you have to trigger that explicitly.
The following commands demonstrate how to push a tag deletion.
# delete tag locally
git tag -d 1.7.0
# delete tag in remote repository
# called origin
git push origin :refs/tags/1.7.0
Tags are frequently used to tag the state of a release of the Git repository. In this case they are typically called release tags.
By convention, release tags are labeled based on the [major].[minor].[patch] naming scheme, for example "1.0.0". Several projects also use the "v" prefix.
The idea is that the patch version is incremented if (only) backwards compatible bug fixes are introduced, the minor version is incremented if new, backwards compatible functionality is introduced to the public API and the major version is incremented if any backwards incompatible changes are introduced to the public API.
For the detailed discussion on naming conventions please see the following URL: Semantic versioning.
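The increment rules can be written down as a small shell function. This is only a sketch: bump_version is a made-up helper and not part of Git, and it assumes a plain [major].[minor].[patch] version without the "v" prefix. Resetting the lower parts to zero follows the Semantic Versioning specification and is not spelled out in the text above.

```shell
# bump_version <version> <major|minor|patch>
# hypothetical helper which prints the next version number
bump_version() {
    version=$1
    part=$2
    # split "major.minor.patch" with parameter expansion
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%.*}
    patch=${rest#*.}
    case "$part" in
        major) major=$((major + 1)); minor=0; patch=0 ;;
        minor) minor=$((minor + 1)); patch=0 ;;
        patch) patch=$((patch + 1)) ;;
        *) echo "unknown part: $part" >&2; return 1 ;;
    esac
    echo "$major.$minor.$patch"
}

# backwards compatible bug fix, prints 1.4.3
bump_version 1.4.2 patch
# new backwards compatible functionality, prints 1.5.0
bump_version 1.4.2 minor
# backwards incompatible change, prints 2.0.0
bump_version 1.4.2 major
```

The result could then be used as the name of the next release tag, for example in a git tag -a command.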
Tip
Git is able to store different proxy configurations for different domains, see "core.gitProxy" in Git config manpage .
20. Viewing changes in the working tree with git status
The git status command shows the status of the working tree, i.e. which files have changed, which are staged and which are not part of the index. It also shows which files have merge conflicts and gives an indication what the user can do with these changes, e.g. add them to the staging area or remove them, etc.
The following commands create some changes in your Git repository.
# make some changes, assumes that the test01
# and test02 files exist
# and have been committed in the past
echo "This is a new change to the file" > test01
echo "and this is another new change" > test02
# create a new file
ls > newfileanalyzis.txt
The git status command shows the current status of your repository and suggests possible actions.
# see the current status of your repository
# (which files are changed / new / deleted)
git status
The output of the command looks like the following.
# On branch master
# Your branch is ahead of 'origin/master' by 1 commit.
#   (use "git push" to publish your local commits)
#
# Changes not staged for commit:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#	modified:   test01
#	modified:   test02
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#	newfileanalyzis.txt
no changes added to commit (use "git add" and/or "git commit -a")
The git log command shows the history of your repository in the current branch, i.e. the list of commits.
# show the history of commits in the current branch
git log
The --oneline parameter fits the output of the git log command in one line per commit.
If you use the --abbrev-commit parameter, the git log command uses shorter versions of the SHA-1 identifier for a commit object but keeps the SHA-1 unique.
The --graph parameter draws a text-based graphical representation of the branches and the merge history of the Git repository.
# uses shortened but unique SHA-1 values
# for the commit objects
git log --abbrev-commit
# show the history of commits in one line
# with a shortened version of the commit id
# --oneline is a shorthand for "--pretty=oneline --abbrev-commit"
git log --oneline
# show the history as graph including branches
git log --graph --oneline
For more options on the git log command see the Git log manpage.
To see changes in a file you can use the -p option in the git log command.
# git log filename shows the commits for this file
git log [file path]
# Use -p to see the diffs of each commit
git log -p filename
# --follow shows the entire history
# including renames
git log --follow -p file
To see which commit deleted a file you can use the following command.
# see the changes of a file, works even
# if the file was deleted
git log -- [file path]
# limit the output of Git log to the
# last commit, i.e. the commit which deleted the file
# -1 to see only the last commit
# use -2 to see the last 2 commits etc
git log -1 -- [file path]
# include the --stat parameter to see
# some statistics, e.g. how many files were
# deleted
git log -1 --stat -- [file path]
Note
The double hyphens (--) in Git separate flags from non-flags (usually filenames).
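The separator matters most for a file which no longer exists in the working tree, because Git then cannot decide on its own whether the argument is a revision or a path. The following sketch demonstrates this in a temporary repository. The names demo and obsolete.txt are made up.

```shell
# work in a throw-away directory
workdir=$(mktemp -d)
cd "$workdir"

# repository in which a file is added and deleted again
# (demo and obsolete.txt are hypothetical names)
git init -q demo
cd demo
git config user.name "Example User"
git config user.email "user@example.com"
echo "temporary" > obsolete.txt
git add obsolete.txt
git commit -q -m "Add obsolete.txt"
git rm -q obsolete.txt
git commit -q -m "Delete obsolete.txt"

# without the double hyphens Git cannot tell whether obsolete.txt
# is a revision or a path, because the file does not exist anymore
if git log --oneline obsolete.txt > /dev/null 2>&1; then
    without_separator="ok"
else
    without_separator="failed"
fi

# with the double hyphens everything after them is treated as a path;
# --format=%s prints only the first line of the commit message
last_message=$(git log -1 --format=%s -- obsolete.txt)

echo "without --: $without_separator"
echo "with --:    $last_message"
```

The first command fails, while the second one reports the commit which deleted the file.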
The git diff command allows the user to see the changes made. In order to test this, make some changes to a file and check what the git diff command shows to you. Afterwards commit the changes to the repository.
# make some changes to the file
echo "This is a change" > test01
echo "and this is another change" > test02
# check the changes via the diff command
git diff
# optionally you can also specify a path to filter the displayed changes
# path can be a file or directory
# git diff [path]
To see which changes you have staged, i.e. you are going to commit with the next commit, use the following command.
# show the changes which are staged for the next commit
git diff --cached
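The following sketch shows how the two variants of git diff divide the changes between them. It runs in a temporary repository with the made-up name demo, changes two files and stages only one of them.

```shell
# work in a throw-away directory
workdir=$(mktemp -d)
cd "$workdir"

# repository with two committed files (demo is a hypothetical name)
git init -q demo
cd demo
git config user.name "Example User"
git config user.email "user@example.com"
echo "one" > test01
echo "two" > test02
git add test01 test02
git commit -q -m "Initial commit"

# change both files but stage only test01
echo "changed one" > test01
echo "changed two" > test02
git add test01

# unstaged changes (working tree compared with the staging area)
unstaged=$(git diff --name-only)

# staged changes (staging area compared with the last commit)
staged=$(git diff --cached --name-only)

# prints "unstaged: test02" and "staged:   test01"
echo "unstaged: $unstaged"
echo "staged:   $staged"
```

A plain git commit at this point would only include the change to test01.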
The git blame command allows you to see which commit and author modified a file on a per-line basis.
# git blame shows the author and commit per
# line of a file
git blame [filename]
# the -L option allows you to limit the selection
# for example by line number
# only show line 1 and 2 in git blame
git blame -L 1,2 [filename]
The git shortlog command summarizes the git log output. It groups all commits by author and includes the first line of the commit message.
The -s option suppresses the commit message and provides a commit count. The -n option sorts the output based on the number of commits by author.
# gives a summary of the changes by author
git shortlog
# compressed summary
git shortlog -sn
Git provides the git stash command which allows you to record the current state of the working directory and the staging area and go back to the last committed revision.
This allows you to pull in the latest changes or to develop an urgent fix. Afterwards you can restore the stashed changes, which will reapply the changes to the current version of the source code.
In general using the stash command should be the exception in using Git. Typically you would create new branches for new features and switch between branches. You can also commit frequently in your local Git repository and use interactive rebase to combine these commits later before pushing them to another Git repository.
Tip
You can avoid using the git stash command. In this case you commit the changes you want to put aside and use the git commit --amend command to change the commit later. If you use the approach of creating a commit, you typically put a marker in the commit message to mark it as a draft, e.g. "[DRAFT] implement feature x".
The following commands save a stash and reapply it after some changes.
# Create a stash with uncommitted changes
git stash
# TODO do changes to the source, e.g. by pulling
# new changes from a remote repo
# Afterwards reapply the stashed changes
# and delete the stash from the list of stashes
git stash pop
It is also possible to keep a list of stashes.
# create a stash with uncommitted changes
git stash save
# see the list of available stashes
git stash list
# Result might be something like:
stash@{0}: WIP on master: 273e4a0 Resize issue in Dialog
stash@{1}: WIP on master: 273e4b0 Silly typo in Classname
stash@{2}: WIP on master: 273e4c0 Silly typo in Javadoc
# you can use the ID to apply a stash
git stash apply stash@{0}
# Or apply the latest stash and delete it afterwards
git stash pop
# also you can remove a stashed change
# without applying it
git stash drop stash@{0}
# or delete all stashes
git stash clear