Pro Git - Notes 3
Git Branching
Branches in a Nutshell
Let’s assume that you have a directory containing three files, and you stage them all and commit. Staging the files computes a checksum for each one (the SHA-1 hash), stores that version of the file in the Git repository (Git refers to them as blobs), and adds that checksum to the staging area:
$ git add README test.rb LICENSE
$ git commit -m 'The initial commit of my project'
When you create the commit by running git commit, Git checksums each subdirectory (in this case, just the root project directory) and stores those tree objects in the Git repository. Git then creates a commit object that has the metadata and a pointer to the root project tree so it can re-create that snapshot when needed.
Your Git repository now contains five objects: three blobs (each representing the contents of one of the three files), one tree that lists the contents of the directory and specifies which file names are stored as which blobs, and one commit with the pointer to that root tree and all the commit metadata.
If you make some changes and commit again, the next commit stores a pointer to the commit that came immediately before it.
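The object count and the parent pointer described above can be checked in a throwaway repository. This is only a sketch: the temporary directory, the file contents, and the "Example User" identity are placeholders, and it assumes the three files have different contents (identical files would share one blob).

```shell
set -e
repo=$(mktemp -d)                          # scratch directory, not part of the notes
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master    # pin the branch name regardless of local defaults
git config user.name "Example User"        # placeholder identity so commits succeed
git config user.email "user@example.com"

# Three files with distinct contents, so each one gets its own blob.
echo "readme"  > README
echo "puts 1"  > test.rb
echo "license" > LICENSE
git add README test.rb LICENSE
git commit -q -m 'The initial commit of my project'

# Everything reachable from the commit: 1 commit + 1 tree + 3 blobs.
object_count=$(( $(git rev-list --objects --all | wc -l) ))
echo "objects after first commit: $object_count"

# The commit object itself: a tree pointer plus author/committer metadata.
git cat-file -p HEAD

# A second commit stores a pointer to the commit that came before it.
first=$(git rev-parse HEAD)
echo "puts 2" >> test.rb
git commit -q -a -m 'Second commit'
parent=$(git rev-parse HEAD^)
echo "parent of second commit: $parent"
```

The first commit printed by git cat-file has no parent line; running the same command after the second commit would show one.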
A branch in Git is simply a lightweight movable pointer to one of these commits. The default branch name in Git is master. As you start making commits, you’re given a master branch that points to the last commit you made. Every time you commit, the master branch pointer moves forward automatically.
Creating a New Branch
What happens if you create a new branch? Well, doing so creates a new pointer for you to move around.
$ git branch testing
This creates a new pointer to the same commit you’re currently on.
How does Git know what branch you’re currently on? It keeps a special pointer called HEAD. In Git, this is a pointer to the local branch you’re currently on.
In this case, you’re still on master. The git branch command only created a new branch; it didn’t switch to that branch.
You can easily see this by running a simple git log command that shows you where the branch pointers are pointing. This option is called --decorate.
$ git log --oneline --decorate
f30ab (HEAD -> master, testing) add feature #32 - ability to add new formats to the central interface
34ac2 Fixed bug #1328 - stack overflow under certain conditions
98ca9 The initial commit of my project
You can see the “master” and “testing” branches that are right there next to the f30ab commit.
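The same thing can be confirmed without reading log output. The sketch below, run in a scratch repository with placeholder file contents and identity, asks Git directly which branch HEAD names and which commit each branch points to.

```shell
set -e
repo=$(mktemp -d)                          # scratch directory (placeholder)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master    # pin the branch name regardless of local defaults
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > test.rb
git add test.rb
git commit -q -m 'The initial commit of my project'

git branch testing                         # new pointer to the current commit

head_ref=$(git symbolic-ref HEAD)          # the branch HEAD currently names
master_commit=$(git rev-parse master)
testing_commit=$(git rev-parse testing)
echo "HEAD -> $head_ref"
echo "master  = $master_commit"
echo "testing = $testing_commit"
```

HEAD still names refs/heads/master, and both branches resolve to the same commit ID.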
Switching Branches
To switch to an existing branch, you run the git checkout command. Let’s switch to the new testing branch:
$ git checkout testing
This moves HEAD to point to the testing branch.
What is the significance of that? Well, let’s do another commit:
$ vim test.rb
$ git commit -a -m 'made a change'
This is interesting, because now your testing branch has moved forward, but your master branch still points to the commit you were on when you ran git checkout to switch branches. Let’s switch back to the master branch:
$ git checkout master
That command did two things. It moved the HEAD pointer back to point to the master branch, and it reverted the files in your working directory back to the snapshot that master points to. This also means the changes you make from this point forward will diverge from an older version of the project. It essentially rewinds the work you’ve done in your testing branch so you can go in a different direction.
Let’s make a few changes and commit again:
$ vim test.rb
$ git commit -a -m 'made other changes'
Now your project history has diverged (see Divergent history). You created and switched to a branch, did some work on it, and then switched back to your main branch and did other work. Both of those changes are isolated in separate branches: you can switch back and forth between the branches and merge them together when you’re ready. And you did all that with simple branch, checkout, and commit commands.
You can also see this easily with the git log command. If you run git log --oneline --decorate --graph --all it will print out the history of your commits, showing where your branch pointers are and how your history has diverged.
$ git log --oneline --decorate --graph --all
* c2b9e (HEAD, master) made other changes
| * 87ab2 (testing) made a change
|/
* f30ab add feature #32 - ability to add new formats to the
* 34ac2 fixed bug #1328 - stack overflow under certain conditions
* 98ca9 initial commit of my project
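Divergence can also be measured rather than drawn. The sketch below rebuilds the master/testing scenario in a scratch repository (file contents and identity are placeholders) and uses git merge-base and git rev-list --left-right --count to show the fork point and how many commits each side has that the other lacks.

```shell
set -e
repo=$(mktemp -d)                          # scratch directory (placeholder)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > test.rb
git add test.rb
git commit -q -m 'initial commit of my project'
fork_point=$(git rev-parse HEAD)

git branch testing
git checkout -q testing
echo "testing change" >> test.rb
git commit -q -a -m 'made a change'

git checkout -q master                     # test.rb goes back to the master snapshot
echo "master change" >> test.rb
git commit -q -a -m 'made other changes'

# The commit both branches share, and the commits unique to each side.
merge_base=$(git merge-base master testing)
set -- $(git rev-list --left-right --count master...testing)
only_master=$1
only_testing=$2
echo "fork point: $merge_base"
echo "commits only on master: $only_master, only on testing: $only_testing"
```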
Because a branch in Git is actually a simple file that contains the 40 character SHA-1 checksum of the commit it points to, branches are cheap to create and destroy. Creating a new branch is as quick and simple as writing 41 bytes to a file (40 characters and a newline).
Also, because we’re recording the parents when we commit, finding a proper merge base for merging is automatically done for us and is generally very easy to do. These features help encourage developers to create and use branches often.
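The "41 bytes" claim is easy to inspect. The sketch below assumes a freshly created repository that uses Git's default loose-ref file storage and SHA-1 object IDs; refs that have been packed, or a repository using another ref or hash format, would look different. Writing a ref file by hand is shown only to illustrate the point, not as a recommended way to create branches, and the branch name handmade is made up for the example.

```shell
set -e
repo=$(mktemp -d)                          # scratch directory (placeholder)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > test.rb
git add test.rb
git commit -q -m 'initial commit'

git branch testing
ref_file=.git/refs/heads/testing
ref_bytes=$(( $(wc -c < "$ref_file") ))    # 40 hex characters plus a newline
ref_value=$(cat "$ref_file")
echo "$ref_file holds $ref_value ($ref_bytes bytes)"

# Illustration only: a branch created by writing the commit ID into a new ref file.
git rev-parse HEAD > .git/refs/heads/handmade
handmade_commit=$(git rev-parse handmade)
echo "handmade -> $handmade_commit"
```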
Basic Branching and Merging
Basic Branching
First, let’s say you’re working on your project and have a couple of commits already on the master branch.
You’ve decided that you’re going to work on issue #53 in whatever issue-tracking system your company uses. To create a new branch and switch to it at the same time, you can run the git checkout command with the -b switch:
$ git checkout -b iss53
Switched to a new branch "iss53"
This is shorthand for:
$ git branch iss53
$ git checkout iss53
You work on your website and do some commits. Doing so moves the iss53 branch forward, because you have it checked out (that is, your HEAD is pointing to it):
$ vim index.html
$ git commit -a -m 'added a new footer [issue 53]'
Now you get the call that there is an issue with the website, and you need to fix it immediately.
All you have to do is switch back to your master branch.
However, before you do that, note that if your working directory or staging area has uncommitted changes that conflict with the branch you’re checking out, Git won’t let you switch branches. It’s best to have a clean working state when you switch branches.
For now, let’s assume you’ve committed all your changes, so you can switch back to your master branch:
$ git checkout master
Switched to branch 'master'
This is an important point to remember: when you switch branches, Git resets your working directory to look like it did the last time you committed on that branch. It adds, removes, and modifies files automatically to make sure your working copy is what the branch looked like on your last commit to it.
Next, you have a hotfix to make. Let’s create a hotfix branch on which to work until it’s completed:
$ git checkout -b hotfix
Switched to a new branch 'hotfix'
$ vim index.html
$ git commit -a -m 'fixed the broken email address'
[hotfix 1fb7853] fixed the broken email address
 1 file changed, 2 insertions(+)
You can run your tests, make sure the hotfix is what you want, and finally merge the hotfix branch back into your master branch to deploy to production. You do this with the git merge command:
$ git checkout master
$ git merge hotfix
Updating f42c576..3a0874c
Fast-forward
 index.html | 2 ++
 1 file changed, 2 insertions(+)
You’ll notice the phrase “fast-forward” in that merge. Because the commit C4 pointed to by the branch hotfix you merged in was directly ahead of the commit C2 you’re on, Git simply moves the pointer forward. To phrase that another way, when you try to merge one commit with a commit that can be reached by following the first commit’s history, Git simplifies things by moving the pointer forward because there is no divergent work to merge together; this is called a “fast-forward.”
Your change is now in the snapshot of the commit pointed to by the master branch, and you can deploy the fix.
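Whether a merge will fast-forward comes down to an ancestry check. The sketch below recreates the hotfix scenario in a scratch repository (file contents and identity are placeholders) and uses git merge-base --is-ancestor to test the condition before merging.

```shell
set -e
repo=$(mktemp -d)                          # scratch directory (placeholder)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "<p>v1</p>" > index.html
git add index.html
git commit -q -m 'initial commit'

git checkout -q -b hotfix
echo "<p>fixed email</p>" >> index.html
git commit -q -a -m 'fixed the broken email address'
hotfix_tip=$(git rev-parse hotfix)

git checkout -q master
# master is reachable from hotfix, so there is no divergent work to reconcile.
if git merge-base --is-ancestor master hotfix; then
  echo "master can fast-forward to hotfix"
fi
git merge -q hotfix

master_tip=$(git rev-parse master)
merge_commits=$(git rev-list --merges --count master)
echo "master now at $master_tip, merge commits in history: $merge_commits"
```

After the merge, master and hotfix name the same commit and no merge commit was created.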
After your super-important fix is deployed, you’re ready to switch back to the work you were doing before you were interrupted. However, first you’ll delete the hotfix branch, because you no longer need it; the master branch points at the same place. You can delete it with the -d option to git branch:
$ git branch -d hotfix
Deleted branch hotfix (3a0874c).
Now you can switch back to your work-in-progress branch on issue #53 and continue working on it.
$ git checkout iss53
Switched to branch "iss53"
$ vim index.html
$ git commit -a -m 'finished the new footer [issue 53]'
[iss53 ad82d7a] finished the new footer [issue 53]
 1 file changed, 1 insertion(+)
It’s worth noting here that the work you did in your hotfix branch is not contained in the files in your iss53 branch. If you need to pull it in, you can merge your master branch into your iss53 branch by running git merge master, or you can wait to integrate those changes until you decide to pull the iss53 branch back into master later.
Basic Merging
Suppose you’ve decided that your issue #53 work is complete and ready to be merged into your master branch. In order to do that, you’ll merge your iss53 branch into master, much like you merged your hotfix branch earlier. All you have to do is check out the branch you wish to merge into and then run the git merge command:
$ git checkout master
Switched to branch 'master'
$ git merge iss53
Merge made by the 'recursive' strategy.
 index.html | 1 +
 1 file changed, 1 insertion(+)
This looks a bit different than the hotfix merge you did earlier. In this case, your development history has diverged from some older point. Because the commit on the branch you’re on isn’t a direct ancestor of the branch you’re merging in, Git has to do some work. In this case, Git does a simple three-way merge, using the two snapshots pointed to by the branch tips and the common ancestor of the two.
Instead of just moving the branch pointer forward, Git creates a new snapshot that results from this three-way merge and automatically creates a new commit that points to it. This is referred to as a merge commit, and is special in that it has more than one parent.
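The three inputs of the merge and the two parents of the resulting commit can be inspected directly. The sketch below uses a scratch repository in which the two branches touch different files, so the merge is clean; the file names and identity are placeholders.

```shell
set -e
repo=$(mktemp -d)                          # scratch directory (placeholder)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "header" > index.html
git add index.html
git commit -q -m 'initial commit'
base=$(git rev-parse HEAD)

git checkout -q -b iss53
echo "footer" > footer.html
git add footer.html
git commit -q -m 'added a new footer [issue 53]'

git checkout -q master
echo "contact" > contact.html
git add contact.html
git commit -q -m 'fixed the broken email address'

# The common ancestor that the three-way merge uses alongside the two tips.
common_ancestor=$(git merge-base master iss53)

git merge -q --no-edit iss53               # --no-edit accepts the default merge message

# First field is the merge commit itself; the rest are its parents.
set -- $(git rev-list --parents -n 1 HEAD)
parent_count=$(( $# - 1 ))
echo "common ancestor: $common_ancestor"
echo "merge commit has $parent_count parents"
```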
Now that your work is merged in, you have no further need for the iss53 branch. You can close the ticket in your ticket-tracking system, and delete the branch:
$ git branch -d iss53
Basic Merge Conflicts
Occasionally, this process doesn’t go smoothly. If you changed the same part of the same file differently in the two branches you’re merging together, Git won’t be able to merge them cleanly. If your fix for issue #53 modified the same part of a file as the hotfix branch, you’ll get a merge conflict that looks something like this:
$ git merge iss53
Auto-merging index.html
CONFLICT (content): Merge conflict in index.html
Automatic merge failed; fix conflicts and then commit the result.
Git hasn’t automatically created a new merge commit. It has paused the process while you resolve the conflict. If you want to see which files are unmerged at any point after a merge conflict, you can run git status:
$ git status
On branch master
You have unmerged paths.
  (fix conflicts and run "git commit")
Unmerged paths:
  (use "git add <file>..." to mark resolution)
    both modified:      index.html
no changes added to commit (use "git add" and/or "git commit -a")
Anything that has merge conflicts and hasn’t been resolved is listed as unmerged.
Git adds standard conflict-resolution markers to the files that have conflicts, so you can open them manually and resolve those conflicts. Your file contains a section that looks something like this:
<<<<<<< HEAD:index.html
<div id="footer">contact : email.support@github.com</div>
=======
<div id="footer">
 please contact us at support@github.com
</div>
>>>>>>> iss53:index.html
This means the version in HEAD (your master branch, because that was what you had checked out when you ran your merge command) is the top part of that block (everything above the =======), while the version in your iss53 branch looks like everything in the bottom part. In order to resolve the conflict, you have to either choose one side or the other or merge the contents yourself. For instance, you might resolve this conflict by replacing the entire block with this:
<div id="footer">
please contact us at email.support@github.com
</div>
This resolution has a little of each section, and the <<<<<<<, =======, and >>>>>>> lines have been completely removed. After you’ve resolved each of these sections in each conflicted file, run git add on each file to mark it as resolved. Staging the file marks it as resolved in Git.
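The whole conflict cycle can be rehearsed in a scratch repository. In the sketch below both branches rewrite the same line of index.html; the example.com addresses, file contents, and identity are placeholders. git diff --name-only --diff-filter=U lists the paths that are still unmerged.

```shell
set -e
repo=$(mktemp -d)                          # scratch directory (placeholder)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo '<div id="footer">contact us</div>' > index.html
git add index.html
git commit -q -m 'initial commit'

git checkout -q -b iss53
echo '<div id="footer">please contact us at support@example.com</div>' > index.html
git commit -q -a -m 'new footer [issue 53]'

git checkout -q master
echo '<div id="footer">contact : email.support@example.com</div>' > index.html
git commit -q -a -m 'fixed the broken email address'

# The merge exits non-zero when it stops on the conflict.
if git merge --no-edit iss53 > /dev/null 2>&1; then
  echo "unexpected: merge succeeded without a conflict"
fi
unmerged_before=$(git diff --name-only --diff-filter=U)
echo "unmerged after the failed merge: $unmerged_before"

# Resolve by replacing the whole marked block, then stage and conclude the merge.
echo '<div id="footer">please contact us at email.support@example.com</div>' > index.html
git add index.html
git commit -q --no-edit
unmerged_after=$(git diff --name-only --diff-filter=U)
set -- $(git rev-list --parents -n 1 HEAD)
parent_count=$(( $# - 1 ))
```

Once the file is staged and committed, nothing is listed as unmerged and HEAD is a merge commit with two parents.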
If you want to use a graphical tool to resolve these issues, you can run git mergetool, which fires up an appropriate visual merge tool and walks you through the conflicts:
$ git mergetool
This message is displayed because 'merge.tool' is not configured.
See 'git mergetool --tool-help' or 'git help config' for more details.
'git mergetool' will now attempt to use one of the following tools:
opendiff kdiff3 tkdiff xxdiff meld tortoisemerge gvimdiff diffuse diffmerge ecmerge p4merge araxis bc3 codecompare vimdiff emerge
Merging:
index.html
Normal merge conflict for 'index.html':
  {local}: modified file
  {remote}: modified file
Hit return to start merge resolution tool (opendiff):
If you want to use a merge tool other than the default (Git chose opendiff in this case because the command was run on a Mac), you can see all the supported tools listed at the top after “one of the following tools.” Just type the name of the tool you’d rather use.
After you exit the merge tool, Git asks you if the merge was successful. If you tell the script that it was, it stages the file to mark it as resolved for you. You can run git status again to verify that all conflicts have been resolved:
$ git status
On branch master
All conflicts fixed but you are still merging.
  (use "git commit" to conclude merge)
Changes to be committed:
    modified:   index.html
If you’re happy with that, and you verify that everything that had conflicts has been staged, you can type git commit to finalize the merge commit. The commit message by default looks something like this:
Merge branch 'iss53'
Conflicts:
    index.html
#
# It looks like you may be committing a merge.
# If this is not correct, please remove the file
#   .git/MERGE_HEAD
# and try again.
# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
# On branch master
# All conflicts fixed but you are still merging.
#
# Changes to be committed:
#   modified:   index.html
#
Branch Management
The git branch command does more than just create and delete branches. If you run it with no arguments, you get a simple listing of your current branches:
$ git branch
  iss53
* master
  testing
Notice the * character that prefixes the master branch: it indicates the branch that you currently have checked out (i.e., the branch that HEAD points to). This means that if you commit at this point, the master branch will be moved forward with your new work. To see the last commit on each branch, you can run git branch -v:
$ git branch -v
  iss53   93b412c fix javascript issue
* master  7a98805 Merge branch 'iss53'
  testing 782fd34 add scott to the author list in the readmes
The useful --merged and --no-merged options can filter this list to branches that you have or have not yet merged into the branch you’re currently on. To see which branches are already merged into the branch you’re on, you can run git branch --merged:
$ git branch --merged
iss53
* master
Because you already merged in iss53 earlier, you see it in your list. Branches on this list without the * in front of them are generally fine to delete with git branch -d; you’ve already incorporated their work into another branch, so you’re not going to lose anything.
To see all the branches that contain work you haven’t yet merged in, you can run git branch --no-merged:
$ git branch --no-merged
testing
This shows your other branch. Because it contains work that isn’t merged in yet, trying to delete it with git branch -d will fail:
$ git branch -d testing
error: The branch 'testing' is not fully merged.
If you are sure you want to delete it, run 'git branch -D testing'.
If you really do want to delete the branch and lose that work, you can force it with -D, as the helpful message points out.
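The filtering and the safety check on -d can be tried together. The sketch below builds a scratch repository with one merged branch (iss53) and one unmerged branch (testing); the file names and identity are placeholders, and the iss53 merge here happens to be a fast-forward.

```shell
set -e
repo=$(mktemp -d)                          # scratch directory (placeholder)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > index.html
git add index.html
git commit -q -m 'initial commit'

# iss53: one commit, then merged back into master.
git checkout -q -b iss53
echo "footer" > footer.html
git add footer.html
git commit -q -m 'fix javascript issue'
git checkout -q master
git merge -q iss53

# testing: one commit that master does not have.
git checkout -q -b testing
echo "experiment" > notes.txt
git add notes.txt
git commit -q -m 'add scott to the author list in the readmes'
git checkout -q master

merged=$(git branch --merged)
not_merged=$(git branch --no-merged)
echo "merged:";     echo "$merged"
echo "not merged:"; echo "$not_merged"

# -d refuses to drop unmerged work; -D would force the deletion.
if git branch -d testing 2> /dev/null; then
  delete_refused=no
else
  delete_refused=yes
fi
echo "git branch -d testing refused: $delete_refused"
```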
Branching Workflow
Long-Running Branches
In reality, we’re talking about pointers moving up the line of commits you’re making. The stable branches are farther down the line in your commit history, and the bleeding-edge branches are farther up the history.
It’s generally easier to think about them as work silos, where sets of commits graduate to a more stable silo when they’re fully tested.
Topic Branches
Consider an example of doing some work (on master), branching off for an issue (iss91), working on it for a bit, branching off the second branch to try another way of handling the same thing (iss91v2), going back to your master branch and working there for a while, and then branching off there to do some work that you’re not sure is a good idea (dumbidea branch). Your commit history will look something like this:
Now, let’s say you decide you like the second solution to your issue best (iss91v2); and you showed the dumbidea branch to your coworkers, and it turns out to be genius. You can throw away the original iss91 branch (losing commits C5 and C6) and merge in the other two. Your history then looks like this:
It’s important to remember when you’re doing all this that these branches are completely local. When you’re branching and merging, everything is being done only in your Git repository — no server communication is happening.
Remote Branches
You can get a full list of remote references explicitly with git ls-remote [remote], or git remote show [remote] for remote branches as well as more information.
Remote-tracking branches take the form <remote>/<branch>. For instance, if you wanted to see what the master branch on your origin remote looked like as of the last time you communicated with it, you would check the origin/master branch. If you were working on an issue with a partner and they pushed up an iss53 branch, you might have your own local iss53 branch, but the branch on the server would be represented by the remote-tracking branch origin/iss53.
This may be a bit confusing, so let’s look at an example. Let’s say you have a Git server on your network at git.ourcompany.com. If you clone from this, Git’s clone command automatically names it origin for you, pulls down all its data, creates a pointer to where its master branch is, and names it origin/master locally. Git also gives you your own local master branch starting at the same place as origin’s master branch, so you have something to work from.
If you do some work on your local master branch, and, in the meantime, someone else pushes to git.ourcompany.com and updates its master branch, then your histories move forward differently. Also, as long as you stay out of contact with your origin server, your origin/master pointer doesn’t move.
To synchronize your work, you run a git fetch origin command. This command looks up which server “origin” is (in this case, it’s git.ourcompany.com), fetches any data from it that you don’t yet have, and updates your local database, moving your origin/master pointer to its new, more up-to-date position.
To demonstrate having multiple remote servers and what remote branches for those remote projects look like, let’s assume you have another internal Git server that is used only for development by one of your sprint teams. This server is at git.team1.ourcompany.com. You can add it as a new remote reference to the project you’re currently working on by running the git remote add command. Name this remote teamone, which will be your shortname for that whole URL.
Now, you can run git fetch teamone to fetch everything the remote teamone server has that you don’t have yet. Because that server has a subset of the data your origin server has right now, Git fetches no data but sets a remote-tracking branch called teamone/master to point to the commit that teamone has as its master branch.
Pushing
When you want to share a branch with the world, you need to push it up to a remote that you have write access to. Your local branches aren’t automatically synchronized to the remotes you write to — you have to explicitly push the branches you want to share. That way, you can use private branches for work you don’t want to share, and push up only the topic branches you want to collaborate on.
If you have a branch named serverfix that you want to work on with others, you can push it up the same way you pushed your first branch. Run git push <remote> <branch>:
$ git push origin serverfix
Counting objects: 24, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (15/15), done.
Writing objects: 100% (24/24), 1.91 KiB | 0 bytes/s, done.
Total 24 (delta 2), reused 0 (delta 0)
To https://github.com/schacon/simplegit
 * [new branch]      serverfix -> serverfix
This command means, “Take my serverfix local branch and push it to update the remote’s serverfix branch.”
You can also do git push origin serverfix:serverfix, which does the same thing: it says, “Take my serverfix and make it the remote’s serverfix.”
You can use this format to push a local branch into a remote branch that is named differently. If you didn’t want it to be called serverfix on the remote, you could instead run git push origin serverfix:awesomebranch to push your local serverfix branch to the awesomebranch branch on the remote project.
The next time one of your collaborators fetches from the server, they will get a reference to where the server’s version of serverfix is under the remote branch origin/serverfix:
$ git fetch origin
remote: Counting objects: 7, done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 0), reused 3 (delta 0)
Unpacking objects: 100% (3/3), done.
From https://github.com/schacon/simplegit
 * [new branch]      serverfix -> origin/serverfix
It’s important to note that when you do a fetch that brings down new remote-tracking branches, you don’t automatically have local, editable copies of them. In other words, in this case, you don’t have a new serverfix branch; you only have an origin/serverfix pointer that you can’t modify.
To merge this work into your current working branch, you can run git merge origin/serverfix. If you want your own serverfix branch that you can work on, you can base it off your remote-tracking branch:
$ git checkout -b serverfix origin/serverfix
Branch serverfix set up to track remote branch serverfix from origin.
Switched to a new branch 'serverfix'
This gives you a local branch that you can work on that starts where origin/serverfix is.
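Pushing and fetching can be rehearsed without a network by letting a bare repository on disk stand in for the server. Everything named in this sketch (server.git, the alice and bob working copies, the identity) is a placeholder; only the push, fetch, and checkout commands come from the text above.

```shell
set -e
work=$(mktemp -d)                          # scratch area (placeholder)
cd "$work"
git init -q --bare server.git              # stands in for the remote server
git --git-dir=server.git symbolic-ref HEAD refs/heads/master

git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
git remote add origin "$work/server.git"
echo "v1" > index.html
git add index.html
git commit -q -m 'initial commit'
git push -q origin master

# The collaborator clones before serverfix exists on the server.
git clone -q "$work/server.git" "$work/bob"

git checkout -q -b serverfix
echo "fix" >> index.html
git commit -q -a -m 'server fix'
git push -q origin serverfix                   # same branch name on both sides
git push -q origin serverfix:awesomebranch     # different name on the remote
local_tip=$(git rev-parse serverfix)

# The collaborator fetches and gets remote-tracking branches, not local ones.
cd "$work/bob"
git fetch -q origin
tracking_tip=$(git rev-parse origin/serverfix)
renamed_tip=$(git rev-parse origin/awesomebranch)
if git rev-parse -q --verify refs/heads/serverfix > /dev/null; then
  has_local_serverfix=yes
else
  has_local_serverfix=no
fi
echo "local serverfix right after fetch: $has_local_serverfix"

# An editable local branch has to be created explicitly.
git checkout -q -b serverfix origin/serverfix
```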
Tracking Branches
Checking out a local branch from a remote-tracking branch automatically creates what is called a “tracking branch” (and the branch it tracks is called an “upstream branch”). Tracking branches are local branches that have a direct relationship to a remote branch. If you’re on a tracking branch and type git pull, Git automatically knows which server to fetch from and which branch to merge in.
When you clone a repository, it generally automatically creates a master branch that tracks origin/master. However, you can set up other tracking branches if you wish: ones that track branches on other remotes, or don’t track the master branch. The simple case is the example you just saw, running git checkout -b <branch> <remote>/<branch>. This is a common enough operation that Git provides the --track shorthand:
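$ git checkout --track origin/serverfix
Branch serverfix set up to track remote branch serverfix from origin.
Switched to a new branch 'serverfix'
There is an even shorter form: if the branch name you check out doesn’t exist locally and exactly matches a branch on only one remote, Git creates the tracking branch for you:
$ git checkout serverfix
Branch serverfix set up to track remote branch serverfix from origin.
Switched to a new branch 'serverfix'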
To set up a local branch with a different name than the remote branch, you can easily use the first version with a different local branch name:
$ git checkout -b sf origin/serverfix
Branch sf set up to track remote branch serverfix from origin.
Switched to a new branch 'sf'
Now, your local branch sf will automatically pull from origin/serverfix.
If you already have a local branch and want to set it to a remote branch you just pulled down, or want to change the upstream branch you’re tracking, you can use the -u or --set-upstream-to option to git branch to explicitly set it at any time.
$ git branch -u origin/serverfix
Branch serverfix set up to track remote branch serverfix from origin.
If you want to see what tracking branches you have set up, you can use the -vv option to git branch. This will list out your local branches with more information including what each branch is tracking and if your local branch is ahead, behind or both.
$ git branch -vv
  iss53     7e424c3 [origin/iss53: ahead 2] forgot the brackets
  master    1ae2a45 [origin/master] deploying index fix
* serverfix f8674d9 [teamone/server-fix-good: ahead 3, behind 1] this should do it
  testing   5ea463a trying something new
So here we can see that our iss53 branch is tracking origin/iss53 and is “ahead” by two, meaning that we have two commits locally that are not pushed to the server. We can also see that our master branch is tracking origin/master and is up to date. Next we can see that our serverfix branch is tracking the server-fix-good branch on our teamone server and is ahead by three and behind by one, meaning that there is one commit on the server we haven’t merged in yet and three commits locally that we haven’t pushed. Finally we can see that our testing branch is not tracking any remote branch.
It’s important to note that these numbers are only since the last time you fetched from each server. This command does not reach out to the servers; it only tells you what it has cached from these servers locally. If you want totally up-to-date ahead and behind numbers, you’ll need to fetch from all your remotes right before running this. You could do that like this:
$ git fetch --all; git branch -vv
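The ahead and behind figures are just commit counts between a branch and its upstream, so they can be reproduced by hand. The sketch below uses a bare repository on disk as a stand-in server; the directory names and identity are placeholders. @{u} is Git's shorthand for the upstream of the current branch.

```shell
set -e
work=$(mktemp -d)                          # scratch area (placeholder)
cd "$work"
git init -q --bare server.git              # stands in for the remote server
git --git-dir=server.git symbolic-ref HEAD refs/heads/master

git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
git remote add origin "$work/server.git"
echo "v1" > index.html
git add index.html
git commit -q -m 'deploying index fix'
git push -q -u origin master               # -u records origin/master as the upstream

# Two commits that exist only locally.
echo "v2" >> index.html
git commit -q -a -m 'first unpushed change'
echo "v3" >> index.html
git commit -q -a -m 'second unpushed change'

git fetch -q --all                         # refresh the cached remote state first
git branch -vv
ahead=$(git rev-list --count '@{u}..HEAD')   # commits we have that upstream lacks
behind=$(git rev-list --count 'HEAD..@{u}')  # commits upstream has that we lack
echo "ahead $ahead, behind $behind"
```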
Pulling
While the git fetch command will fetch down all the changes on the server that you don’t have yet, it will not modify your working directory at all. It will simply get the data for you and let you merge it yourself. However, there is a command called git pull which is essentially a git fetch immediately followed by a git merge in most cases.
Generally it’s better to simply use the fetch and merge commands explicitly as the magic of git pull can often be confusing.
Deleting Remote Branches
You can delete a remote branch using the --delete option to git push. If you want to delete your serverfix branch from the server, you run the following:
$ git push origin --delete serverfix
To https://github.com/schacon/simplegit
 - [deleted]         serverfix
Basically all this does is remove the pointer from the server. The Git server will generally keep the data there for a while until a garbage collection runs, so if it was accidentally deleted, it’s often easy to recover.
Rebasing
In Git, there are two main ways to integrate changes from one branch into another: the merge and the rebase.
The Basic Rebase
The easiest way to integrate the branches, as we’ve already covered, is the merge command. It performs a three-way merge between the two latest branch snapshots (C3 and C4) and the most recent common ancestor of the two (C2), creating a new snapshot (and commit).
However, there is another way: you can take the patch of the change that was introduced in C4 and reapply it on top of C3. In Git, this is called rebasing. With the rebase command, you can take all the changes that were committed on one branch and replay them on another one.
$ git checkout experiment
$ git rebase master
First, rewinding head to replay your work on top of it...
Applying: added staged command
It works by going to the common ancestor of the two branches (the one you’re on and the one you’re rebasing onto), getting the diff introduced by each commit of the branch you’re on, saving those diffs to temporary files, resetting the current branch to the same commit as the branch you are rebasing onto, and finally applying each change in turn.
At this point, you can go back to the master branch and do a fast-forward merge.
$ git checkout master
$ git merge experiment
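The rebase-then-fast-forward sequence can be verified in a scratch repository. In the sketch below the commit messages reuse the C2, C3, and C4 labels from the text; the file names and identity are placeholders. The rebased commit gets a new ID because its parent changed.

```shell
set -e
repo=$(mktemp -d)                          # scratch directory (placeholder)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m 'C2: common ancestor'

git checkout -q -b experiment
echo "staged command" > experiment.txt
git add experiment.txt
git commit -q -m 'C4: added staged command'
old_experiment=$(git rev-parse experiment)

git checkout -q master
echo "master work" > master.txt
git add master.txt
git commit -q -m 'C3: work on master'

git checkout -q experiment
git rebase -q master                       # replay C4 on top of C3 as a new commit
new_experiment=$(git rev-parse experiment)

git checkout -q master
git merge -q experiment                    # fast-forward: master is an ancestor of experiment
merges=$(git rev-list --merges --count master)
total=$(git rev-list --count master)
echo "commits on master: $total, merge commits: $merges"
```

The result is a linear history of three commits with no merge commit.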
Often, you’ll do this to make sure your commits apply cleanly on a remote branch, perhaps in a project to which you’re trying to contribute but that you don’t maintain. In this case, you’d do your work in a branch and then rebase your work onto origin/master when you were ready to submit your patches to the main project. That way, the maintainer doesn’t have to do any integration work, just a fast-forward or a clean apply.
More Interesting Rebases
$ git rebase --onto master server client
This basically says, “Take the client branch, figure out the patches since it diverged from the server branch, and replay these patches in the client branch as if it was based directly off the master branch instead.” It’s a bit complex, but the result is pretty cool.
Now you can fast-forward your master branch.
$ git checkout master
$ git merge client
Let’s say you decide to pull in your server branch as well. You can rebase the server branch onto the master branch without having to check it out first by running git rebase <basebranch> <topicbranch>, which checks out the topic branch (in this case, server) for you and replays it onto the base branch (master):
$ git rebase master server
$ git checkout master
$ git merge server
$ git branch -d client
$ git branch -d server
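The full sequence can be replayed in a scratch repository to see what --onto leaves out. In the sketch below, commit_file is a hypothetical helper that adds one file per commit so nothing conflicts; the branch layout (client forked from server, server forked from master) follows the text, and the identity is a placeholder.

```shell
set -e
repo=$(mktemp -d)                          # scratch directory (placeholder)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

commit_file() {                            # hypothetical helper: one new file per commit
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit_file base
git checkout -q -b server
commit_file server-1
git checkout -q -b client                  # client forks from server
commit_file client-1
git checkout -q master
commit_file master-1
server_tip=$(git rev-parse server)

# Replay only the commits on client that are not on server, onto master.
git rebase -q --onto master server client
if git merge-base --is-ancestor "$server_tip" client; then
  client_has_server=yes
else
  client_has_server=no
fi
echo "client still contains server work: $client_has_server"

git checkout -q master
git merge -q client                        # fast-forward
git rebase -q master server                # checks out server and replays it onto master
git checkout -q master
git merge -q server                        # fast-forward again
git branch -q -d client
git branch -q -d server

merges=$(git rev-list --merges --count master)
total=$(git rev-list --count master)
echo "commits on master: $total, merge commits: $merges"
```

After the --onto rebase, client no longer contains the server commit; by the end, master holds all four commits in a single line.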
The Perils of Rebasing
Do not rebase commits that exist outside your repository.
Rebase When You Rebase
Rebase vs. Merge
In general the way to get the best of both worlds is to rebase local changes you’ve made but haven’t shared yet before you push them in order to clean up your story, but never rebase anything you’ve pushed somewhere.