Git Magic
Ben Lynn
Git Magic by Ben Lynn Revision History August 2007 Revised by: BL
Table of Contents
Preface
    1. Thanks!
    2. License
1. Introduction
    1.1. Work is Play
    1.2. Version Control
    1.3. Distributed Control
    1.4. A Silly Superstition
    1.5. Merge Conflicts
2. Basic Tricks
    2.1. Saving State
    2.2. Add, Delete, Rename
    2.3. Advanced Undo/Redo
    2.4. Reverting
    2.5. Changelog Generation
    2.6. Downloading Files
    2.7. The Bleeding Edge
    2.8. Instant Publishing
    2.9. What Have I Done?
    2.10. Exercise
3. Cloning Around
    3.1. Sync Computers
    3.2. Classic Source Control
    3.3. Secret Source
    3.4. Bare repositories
    3.5. Push versus pull
    3.6. Forking a Project
    3.7. Ultimate Backups
    3.8. Light-Speed Multitask
    3.9. Guerilla Version Control
    3.10. Mercurial
    3.11. Bazaar
    3.12. Why I use Git
4. Branch Wizardry
    4.1. The Boss Key
    4.2. Dirty Work
    4.3. Quick Fixes
    4.4. Merging
    4.5. Uninterrupted Workflow
    4.6. Reorganizing a Medley
    4.7. Managing Branches
    4.8. Temporary Branches
    4.9. Work How You Want
5. Lessons of History
    5.1. I Stand Corrected
    5.2. . . . And Then Some
    5.3. Local Changes Last
    5.4. Rewriting History
    5.5. Making History
    5.6. Where Did It All Go Wrong?
    5.7. Who Made It All Go Wrong?
    5.8. Personal Experience
6. Multiplayer Git
    6.1. Who Am I?
    6.2. Git Over SSH, HTTP
    6.3. Git Over Anything
    6.4. Patches: The Global Currency
    6.5. Sorry, We've Moved
    6.6. Remote Branches
    6.7. Multiple Remotes
    6.8. My Preferences
7. Git Grandmastery
    7.1. Source Releases
    7.2. Commit What Changed
    7.3. My Commit Is Too Big!
    7.4. The Index: Git's Staging Area
    7.5. Don't Lose Your HEAD
    7.6. HEAD-hunting
    7.7. Building On Git
    7.8. Daring Stunts
    7.9. Preventing Bad Commits
8. Secrets Revealed
    8.1. Invisibility
    8.2. Integrity
    8.3. Intelligence
    8.4. Indexing
    8.5. Git's Origins
    8.6. The Object Database
    8.7. Blobs
    8.8. Trees
    8.9. Commits
    8.10. Indistinguishable From Magic
A. Git Shortcomings
    A.1. SHA1 Weaknesses
    A.2. Microsoft Windows
    A.3. Unrelated Files
    A.4. Who's Editing What?
    A.5. File History
    A.6. Initial Clone
    A.7. Volatile Projects
    A.8. Global Counter
    A.9. Empty Subdirectories
    A.10. Initial Commit
    A.11. Interface Quirks
B. Translating This Guide
Preface
Git (http://git.or.cz/) is a version control Swiss army knife. It is a reliable, versatile, multipurpose revision control tool whose extraordinary flexibility makes it tricky to learn, let alone master. As Arthur C. Clarke observed, any sufficiently advanced technology is indistinguishable from magic. This is a great way to approach Git: newbies can ignore its inner workings and view Git as a gizmo that can amaze friends and infuriate enemies with its wondrous abilities. Rather than go into details, we provide rough instructions for particular effects. After repeated use, gradually you will understand how each trick works, and how to tailor the recipes for your needs.
Translations
Simplified Chinese (/~blynn/gitmagic/intl/zh_cn/): by JunJie, Meng and JiangWei. Converted to Traditional Chinese (/~blynn/gitmagic/intl/zh_tw/) via cconv -f UTF8-CN -t UTF8-TW.
French (/~blynn/gitmagic/intl/fr/): by Alexandre Garel, Paul Gaborit, and Nicolas Deram. Also hosted at itaapy (http://tutoriels.itaapy.com/).
German (/~blynn/gitmagic/intl/de/): by Benjamin Bellee and Armin Stebich; also hosted on Armin's website (http://gitmagic.lordofbikes.de/).
Portuguese (http://www.slideshare.net/slide_user/magia-git): by Leonardo Siqueira Rodrigues [ODT version (http://www.slideshare.net/slide_user/magia-git-verso-odt)].
Russian (/~blynn/gitmagic/intl/ru/): by Tikhon Tarnavsky, Mikhail Dymskov, and others.
Spanish (/~blynn/gitmagic/intl/es/): by Rodrigo Toledo and Ariset Llerena Tapia.
Vietnamese (/~blynn/gitmagic/intl/vi/): by Trần Ngọc Quân; also hosted on his website (http://vnwildman.users.sourceforge.net/gitmagic.html).
Other Editions
Single webpage (book.html): barebones HTML, with no CSS.
PDF file (book.pdf): printer-friendly.
Debian package (http://packages.debian.org/gitmagic), Ubuntu package (http://packages.ubuntu.com/gitmagic): get a fast and local copy of this site. Handy when this server is offline (http://csdcf.stanford.edu/status/).
Physical book [Amazon.com (http://www.amazon.com/Git-Magic-Ben-Lynn/dp/1451523343/)]: 64 pages, 15.24cm x 22.86cm, black and white. Handy when there is no electricity.
1. Thanks!
I'm humbled that so many people have worked on translations of these pages. I greatly appreciate having a wider audience because of the efforts of those named above.
Dustin Sallings, Alberto Bertogli, James Cameron, Douglas Livingstone, Michael Budde, Richard Albury, Tarmigan, Derek Mahar, Frode Aannevik, Keith Rarick, Andy Somerville, Ralf Recker, Øyvind A. Holm, Miklos Vajna, Sébastien Hinderer, Thomas Miedema, Joe Malin, and Tyler Breisacher contributed corrections and improvements. François Marier maintains the Debian package originally created by Daniel Baumann. My gratitude goes to many others for your support and praise. I'm tempted to quote you here, but it might raise expectations to ridiculous heights. If I've left you out by mistake, please tell me or just send me a patch!
Free Git hosting
http://repo.or.cz/ hosts free projects. The first Git hosting site. Founded and maintained by one of the earliest Git developers.
http://gitorious.org/ is another Git hosting site aimed at open-source projects.
http://github.com/ hosts open-source projects for free, and private projects for a fee.
2. License
This guide is released under the GNU General Public License version 3 (http://www.gnu.org/licenses/gpl-3.0.html). Naturally, the source is kept in a Git repository, and can be obtained by typing:
$ git clone git://repo.or.cz/gitmagic.git # Creates "gitmagic" directory.
Chapter 1. Introduction
I'll use an analogy to introduce version control. See the Wikipedia entry on revision control (http://en.wikipedia.org/wiki/Revision_control) for a saner explanation.
shall later see there are drawbacks to a distributed approach, one is less likely to make erroneous comparisons with this rule of thumb. A small project may only need a fraction of the features offered by such a system, but using systems that scale poorly for tiny projects is like using Roman numerals for calculations involving small numbers. Moreover, your project may grow beyond your original expectations. Using Git from the outset is like carrying a Swiss army knife even though you mostly use it to open bottles. On the day you desperately need a screwdriver you'll be glad you have more than a plain bottle-opener.
Git deletes these files for you if you haven't already. Renaming a file is the same as removing the old name and adding the new name. There's also the shortcut git mv which has the same syntax as the mv command. For example:
$ git mv bug.c feature.c
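To see the rename in action, here is a minimal sketch run in a throwaway repository. The temporary directory, the identity settings, and the file contents are assumptions made up for illustration.

```shell
# Throwaway repository (hypothetical location and identity).
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo 'int main(void) { return 0; }' > bug.c
git add bug.c
git commit -q -m "Add bug.c"

# Roughly the same as: mv bug.c feature.c, then remove the old
# name from Git and add the new one.
git mv bug.c feature.c
git status --short          # shows the staged rename
git commit -q -m "Rename bug.c to feature.c"
```

Afterwards only feature.c is tracked, while the old name survives in earlier commits.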
The first few characters of the hash are enough to specify the commit; alternatively, copy and paste the entire hash. Type:
$ git reset --hard 766f
to restore the state to a given commit and erase all newer commits from the record permanently. Other times you want to hop to an old state briefly. In this case, type:
$ git checkout 82f5
This takes you back in time, while preserving newer commits. However, like time travel in a science-fiction movie, if you now edit and commit, you will be in an alternate reality, because your actions are different to what they were the first time around. This alternate reality is called a branch, and we'll have more to say about this later. For now, just remember that
$ git checkout master
will take you back to the present. Also, to stop Git complaining, always commit or reset your changes before running checkout. To take the computer game analogy again:
git reset --hard: load an old save and delete all saved games newer than the one just loaded.
git checkout: load an old game, but if you play on, the game state will deviate from the newer saves you made the first time around. Any saved games you make now will end up in a separate branch representing the alternate reality you have entered. We deal with this later.
You can choose only to restore particular files and subdirectories by appending them after the command:
$ git checkout 82f5 some.file another.file
Take care, as this form of checkout can silently overwrite files. To prevent accidents, commit before running any checkout command, especially when first learning Git. In general, whenever you feel unsure about any operation, Git command or not, first run git commit -a. Don't like cutting and pasting hashes? Then use:
$ git checkout :/"My first b"
to jump to the commit that starts with a given message. You can also ask for the 5th-last saved state:
$ git checkout master~5
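The two kinds of time travel can be compared side by side. The following is only a sketch: the throwaway directory, the file notes.txt, and the commit messages are invented for the example, and the branch is forced to be called master so the commands match the text.

```shell
# Throwaway repository (hypothetical location and identity).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Example User"
git config user.email "user@example.com"

# Three saved states of one file.
for n in 1 2 3; do
    echo "version $n" > notes.txt
    git add notes.txt
    git commit -q -m "Save state $n"
done

git checkout -q master~2       # hop back two saves; newer commits are kept
cat notes.txt                  # version 1
git checkout -q master         # back to the present
cat notes.txt                  # version 3

git reset -q --hard master~1   # drop the newest save from the branch
cat notes.txt                  # version 2
```

After the reset, the branch holds only two commits, whereas the earlier checkout left all three in place.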
2.4. Reverting
In a court of law, events can be stricken from the record. Likewise, you can pick specific commits to undo.
$ git commit -a
$ git revert 1b6d
will undo just the commit with the given hash. The revert is recorded as a new commit, which you can confirm by running git log.
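A small sketch shows that a revert adds to history rather than erasing it. The repository location, file names, and messages are assumptions; HEAD~1 stands in for the hash you would normally paste, and --no-edit merely accepts the default revert message.

```shell
# Throwaway repository (hypothetical location and identity).
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Three commits, each adding its own file.
for f in a.txt b.txt c.txt; do
    echo "contents of $f" > "$f"
    git add "$f"
    git commit -q -m "Add $f"
done

# Undo only the middle commit (the one that added b.txt).
git revert --no-edit HEAD~1
git log --oneline              # four commits: the revert is a new one
```

The file b.txt disappears from the working directory, yet the commit that introduced it is still in the log.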
For example, to get all the files I used to create this site:
$ git clone git://git.or.cz/gitmagic.git
to download your script. This assumes they have ssh access. If not, run git daemon and tell your users to instead run:
$ git clone git://your.computer/path/to/script
From now on, every time your script is ready for release, execute:
$ git commit -a -m "Next release"
and your users can upgrade their version by changing to the directory containing your script and typing:
$ git pull
Your users will never end up with a version of your script you don't want them to see.
Or since yesterday:
$ git diff "@{yesterday}"
In each case the output is a patch that can be applied with git apply. Try also:
$ git whatchanged --since="2 weeks ago"
Often I'll browse history with qgit (http://sourceforge.net/projects/qgit) instead, due to its slick photogenic interface, or tig (http://jonas.nitro.dk/tig/), a text-mode interface that works well over slow connections. Alternatively, install a web server, run git instaweb and fire up any web browser.
2.10. Exercise
Let A, B, C, D be four successive commits where B is the same as A except some files have been removed. We want to add the files back at D. How can we do this? There are at least three solutions. Assuming we are at D:
1. The difference between A and B is the removed files. We can create a patch representing this difference and apply it:
$ git diff B A | git apply
2. Since we saved the files back at A, we can retrieve them:
$ git checkout A foo.c bar.h
Which choice is best? Whichever you prefer most. It is easy to get what you want with Git, and often there are many ways to get it.
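Both solutions shown above can be tried in a scratch repository. This is a sketch under assumptions: the tags A and B stand in for the real commit hashes, and the file contents and the extra file main.c are made up.

```shell
# Throwaway repository (hypothetical location and identity).
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Commit A: three files. Tags A and B stand in for commit hashes.
echo 'int foo;' > foo.c
echo '#define BAR 1' > bar.h
echo 'int main_value;' > main.c
git add .
git commit -q -m "A" && git tag A

# Commit B: same as A with two files removed.
git rm -q foo.c bar.h
git commit -q -m "B" && git tag B

# Commits C and D: unrelated edits.
echo '/* C */' >> main.c && git commit -q -a -m "C"
echo '/* D */' >> main.c && git commit -q -a -m "D"

# Solution 1: apply the difference between B and A as a patch.
git diff B A | git apply
ls foo.c bar.h

# Start over, then Solution 2: fetch the files straight from A.
rm foo.c bar.h
git checkout A foo.c bar.h
git status --short             # both files are staged for the next commit
```

One difference worth noticing: the patch only touches the working directory, while the checkout also stages the restored files.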
to create a second copy of the files and Git repository. From now on,
$ git commit -a
$ git pull other.computer:/path/to/files HEAD
will pull in the state of the files on the other computer into the one you're working on. If you've recently made conflicting edits in the same file, Git will let you know and you should commit again after resolving them.
For Git hosting services, follow the instructions to set up the initially empty Git repository. Typically one fills in a form on a webpage. Push your project to the central server with:
$ git push central.server/path/to/proj.git HEAD
If the main server has new changes due to activity by other developers, the push fails, and the developer should pull the latest version, resolve any merge conflicts, then try again. Developers must have SSH access for the above pull and push commands. However, anyone can see the source by typing:
$ git clone git://central.server/path/to/proj.git
The native git protocol is like HTTP: there is no authentication, so anyone can retrieve the project. Accordingly, by default, pushing is forbidden via the git protocol.
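The push-then-clone cycle can be rehearsed without any server. In this sketch a bare repository on the local disk stands in for central.server, so neither SSH nor the git protocol is exercised; all paths, names, and file contents are assumptions.

```shell
# Scratch area (hypothetical); proj.git plays the role of the central server.
work=$(mktemp -d)
central="$work/proj.git"
git init -q --bare "$central"
git --git-dir="$central" symbolic-ref HEAD refs/heads/master

# First developer creates the project and pushes it.
git init -q "$work/dev1" && cd "$work/dev1"
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > README
git add README
git commit -q -m "Initial commit"
git push -q "$central" HEAD

# Anyone with read access can now clone it.
git clone -q "$central" "$work/dev2"
cat "$work/dev2/README"
```

With a real server, only the repository address changes; the commands stay the same.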
Next, tell everyone about your fork of the project at your server. At any later time, you can merge in the changes from the original project with:
$ git pull
Thanks to hardlinking (http://en.wikipedia.org/wiki/Hard_link), local clones require less time and space than a plain backup. You can now work on two independent features simultaneously. For example, you can edit one clone while the other is compiling. At any time, you can commit and pull changes from the other clone:
$ git pull /the/other/clone HEAD
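Here is a rough sketch of working in two local clones and pulling from one into the other. The directory names main and other, the files, and the identity settings are all invented for the example.

```shell
# Scratch area (hypothetical paths and identity).
work=$(mktemp -d)
git init -q "$work/main" && cd "$work/main"
git config user.name "Example User"
git config user.email "user@example.com"
echo "base" > file.txt
git add file.txt
git commit -q -m "Base"

# A local clone to work on a second feature.
git clone -q "$work/main" "$work/other"
cd "$work/other"
git config user.name "Example User"
git config user.email "user@example.com"
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Back in the first copy, pull the commits from the clone.
cd "$work/main"
git pull -q "$work/other" HEAD
ls
```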
Now go to the new directory and work here instead, using Git to your heart's content. Once in a while, you'll want to sync with everyone else, in which case go to the original directory, sync using the other version control system, and type:
$ git add .
$ git commit -m "Sync with everyone else"
The procedure for giving your changes to everyone else depends on the other version control system. The new directory contains the files with your changes. Run whatever commands of the other version control system are needed to upload them to the central repository. Subversion, perhaps the best centralized version control system, is used by countless projects. The git svn command automates the above for Subversion repositories, and can also be used to export a Git project to a Subversion repository (http://google-opensource.blogspot.com/2008/05/export-git-project-to-google-code.html).
3.10. Mercurial
Mercurial is a similar version control system that can almost seamlessly work in tandem with Git. With the hg-git plugin, a Mercurial user can losslessly push to and pull from a Git repository. Obtain the hg-git plugin with Git:
$ git clone git://github.com/schacon/hg-git.git
or Mercurial:
$ hg clone http://bitbucket.org/durin42/hg-git/
Sadly, I am unaware of an analogous plugin for Git. For this reason, I advocate Git over Mercurial for the main repository, even if you prefer Mercurial. With a Mercurial project, usually a volunteer maintains a parallel Git repository to accommodate Git users, whereas thanks to the hg-git plugin, a Git project automatically accommodates Mercurial users. Although the plugin can convert a Mercurial repository to a Git repository by pushing to an empty repository, this job is easier with the hg-fast-export.sh script, available from:
$ git clone git://repo.or.cz/fast-export.git
3.11. Bazaar
We briefly mention Bazaar because it is the most popular free distributed version control system after Git and Mercurial. Bazaar has the advantage of hindsight, as it is relatively young; its designers could learn from mistakes of the past, and sidestep minor historical warts. Additionally, its developers are mindful of portability and interoperation with other version control systems. A bzr-git plugin lets Bazaar users work with Git repositories to some extent. The tailor program converts Bazaar repositories to Git repositories, and can do so incrementally, while bzr-fast-export is well-suited for one-shot conversions.
Naturally, your needs and wants likely differ, and you may be better off with another system. Nonetheless, you can't go far wrong with Git.
We have created a Git repository that tracks one text file containing a certain message. Now type:
$ git checkout -b boss # nothing seems to change after this
$ echo "My boss is smarter than me" > myfile.txt
It looks like we've just overwritten our file and committed it. But it's an illusion. Type:
$ git checkout master # switch to original version of the file
and hey presto! The text file is restored. And if the boss decides to snoop around this directory, type:
$ git checkout boss # switch to version suitable for boss' eyes
You can switch between the two versions of the file as much as you like, and commit to each independently.
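The whole boss-key trick fits in a few lines. This is a sketch with assumptions: the repository lives in a temporary directory, the original message in myfile.txt and the commit messages are made up, and the first branch is forced to be called master.

```shell
# Throwaway repository (hypothetical location and identity).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Example User"
git config user.email "user@example.com"

# The original file (message invented for this example).
echo "I'm smarter than my boss" > myfile.txt
git add myfile.txt
git commit -q -m "Initial commit"

# The version meant for the boss lives on its own branch.
git checkout -q -b boss
echo "My boss is smarter than me" > myfile.txt
git commit -q -a -m "Another commit"

git checkout -q master && cat myfile.txt   # original version
git checkout -q boss && cat myfile.txt     # boss-friendly version
```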
Now you can add ugly temporary code all over the place. You can even commit these changes. When you're done,
$ git checkout master
to return to your original work. Observe that any uncommitted changes are carried over. What if you wanted to save the temporary changes after all? Easy:
$ git checkout -b dirty
and commit before switching back to the master branch. Whenever you want to return to the dirty changes, simply type:
$ git checkout dirty
We touched upon this command in an earlier chapter, when discussing loading old states. At last we can tell the whole story: the files change to the requested state, but we must leave the master branch. Any commits made from now on take your files down a different road, which can be named later. In other words, after checking out an old state, Git automatically puts you in a new, unnamed branch, which can be named and saved with git checkout -b.
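To make the unnamed branch concrete, here is a sketch that checks out an old state, commits there, and only then gives the new road a name. The file story.txt, the branch name alternate, and the messages are assumptions for illustration.

```shell
# Throwaway repository (hypothetical location and identity).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Example User"
git config user.email "user@example.com"

for n in 1 2 3; do
    echo "chapter $n" >> story.txt
    git add story.txt
    git commit -q -m "Chapter $n"
done

git checkout -q master~1           # old state: we have left the master branch
echo "alternate ending" >> story.txt
git commit -q -a -m "Alternate reality"

git checkout -q -b alternate       # name and save the unnamed branch
git branch                         # lists alternate and master
```

The master branch is untouched and still holds all three chapters.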
and resume work on your original task. You can even merge in the freshly baked bugfix:
$ git merge fixes
4.4. Merging
With some version control systems, creating branches is easy but merging them back together is tough. With Git, merging is so trivial that you might be unaware of it happening. We actually encountered merging long ago. The pull command in fact fetches commits and then merges them into your current branch. If you have no local changes, then the merge is a fast forward, a degenerate case akin to fetching the latest version in a centralized version control system. But if you do have local changes, Git will automatically merge, and report any conflicts. Ordinarily, a commit has exactly one parent commit, namely, the previous commit. Merging branches together produces a commit with at least two parents. This raises the question: what commit does HEAD~10 really refer to? A commit could have multiple parents, so which one do we follow? It turns out this notation chooses the first parent every time. This is desirable because the current branch becomes the first parent during a merge; frequently you're only concerned with the changes you made in the current branch, as opposed to changes merged in from other branches. You can refer to a specific parent with a caret. For example, to show the logs from the second parent:
$ git log HEAD^2
You may omit the number for the first parent. For example, to show the differences with the first parent:
$ git diff HEAD^
You can combine this notation with other types. For example:
$ git checkout 1b6d^^2~10 -b ancient
starts a new branch ancient representing the state 10 commits back from the second parent of the first parent of the commit starting with 1b6d.
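The parent notation is easier to trust after checking it on a real merge. In this sketch the branch name side, the files, and the messages are invented; the two variables simply remember the branch tips before merging.

```shell
# Throwaway repository (hypothetical location and identity).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Example User"
git config user.email "user@example.com"

echo base > base.txt && git add base.txt && git commit -q -m "Base"

git checkout -q -b side
echo side > side.txt && git add side.txt && git commit -q -m "Side work"

git checkout -q master
echo main > main.txt && git add main.txt && git commit -q -m "Main work"

main_tip=$(git rev-parse HEAD)     # tip of the current branch before merging
side_tip=$(git rev-parse side)     # tip of the branch being merged in

git merge -q --no-edit side        # creates a commit with two parents

git log --oneline HEAD^2           # history as seen from the second parent
git diff --stat HEAD^              # what the merge changed relative to the first parent
```

The first parent is the old tip of the branch you were on, and the second parent is the tip of the branch you merged in.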
Next, work on Part II, committing your changes along the way. To err is human, and often you'll want to go back and fix something in Part I. If you're lucky, or very good, you can skip these lines.
$ git checkout master  # Go back to Part I.
$ fix_problem
$ git commit -a        # Commit the fixes.
$ git checkout part2   # Go back to Part II.
$ git merge master     # Merge in those fixes.
Now you're in the master branch again, with Part II in the working directory. It's easy to extend this trick for any number of parts. It's also easy to branch off retroactively: suppose you belatedly realize you should have created a branch 7 commits ago. Then type:
The master branch now contains just Part I, and the part2 branch contains the rest. We are in the latter branch; we created master without switching to it, because we want to continue work on part2. This is unusual. Until now, we've been switching to branches immediately after creation, as in:
$ git checkout HEAD~7 -b master # Create a branch, and switch to it.
Next, work on anything: fix bugs, add features, add temporary code, and so forth, committing often along the way. Then:
$ git checkout sanitized
$ git cherry-pick medley^^
applies the grandparent of the head commit of the medley branch to the sanitized branch. With appropriate cherry-picks you can construct a branch that contains only permanent code, and has related commits grouped together.
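As a sketch of the medley workflow, the script below builds a messy branch of three commits and picks only the permanent one onto sanitized. File names and commit messages are assumptions chosen so that the grandparent of the medley head is the commit worth keeping.

```shell
# Throwaway repository (hypothetical location and identity).
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo base > base.txt && git add base.txt && git commit -q -m "Base"
git branch sanitized               # clean branch starting from the base

git checkout -q -b medley          # messy branch where everything happens
echo fix > fix.txt && git add fix.txt && git commit -q -m "Permanent fix"
echo debug > debug.txt && git add debug.txt && git commit -q -m "Temporary debugging"
echo wip > feature.txt && git add feature.txt && git commit -q -m "Half-done feature"

git checkout -q sanitized
git cherry-pick medley^^           # grandparent of the medley head: "Permanent fix"
ls                                 # base.txt and fix.txt only
```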
By default, you start in a branch named master. Some advocate leaving the master branch untouched and creating new branches for your own edits. The -d and -m options allow you to delete and move (rename) branches. See git help branch. The master branch is a useful custom. Others may assume that your repository has a branch with this name, and that it contains the official version of your project. Although you can rename or obliterate the master branch, you might as well respect this convention.
This saves the current state in a temporary location (a stash) and restores the previous state. Your working directory appears exactly as it was before you started editing, and you can fix bugs, pull in upstream changes, and so on. When you want to go back to the stashed state, type:
$ git stash apply # You may need to resolve some conflicts.
You can have multiple stashes, and manipulate them in various ways. See git help stash. As you may have guessed, Git maintains branches behind the scenes to perform this magic trick.
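A minimal round trip, assuming a scratch repository with a single tracked file (notes.txt is a made-up name):

```shell
set -e
# Hypothetical identity so commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
echo v1 > notes.txt
git add notes.txt
git commit -q -m "first"
echo "half-finished edit" >> notes.txt
git stash -q                       # shelve the edit
clean=$(cat notes.txt)             # working directory matches the last commit
git stash apply -q                 # bring the edit back
restored=$(tail -n 1 notes.txt)
echo "$clean / $restored"
```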
to change the last message. Realized you forgot to add a file? Run git add to add it, and then run the above command. Want to include a few more edits in that last commit? Then make those edits and run:
$ git commit --amend -a
and the last 10 commits will appear in your favourite $EDITOR. A sample excerpt:
pick 5c6eb73 Added repo.or.cz link
pick a311a64 Reordered analogies in "Work How You Want"
pick 100834f Added push target to Makefile
Then:
Remove commits by deleting lines. Reorder commits by reordering lines. Replace pick with:
  edit to mark a commit for amending.
  reword to change the log message.
  squash to merge a commit with the previous one.
  fixup to merge a commit with the previous one and discard the log message.
Save and quit. If you marked a commit for editing, then run:
$ git commit --amend
Otherwise, run:
$ git rebase --continue
So commit early and commit often: you can tidy up later with rebase.
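The same edit of the todo list can be scripted, which makes for a repeatable demonstration. This sketch assumes the GIT_SEQUENCE_EDITOR environment variable and uses sed as a stand-in for your editor to turn the second pick into a fixup; the repository and its commits are invented.

```shell
set -e
# Hypothetical identity so commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
for name in first second third; do
  echo "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "$name"
done
# Instead of opening an editor, rewrite line 2 of the todo list:
# "pick ... third" becomes "fixup ... third", folding it into "second".
GIT_SEQUENCE_EDITOR="sed -i.bak '2s/^pick/fixup/'" git rebase -q -i HEAD~2
count=$(git rev-list --count HEAD)
subject=$(git log -1 --format=%s)
echo "$count commits, head is '$subject'"
```

Afterwards the history has two commits, and third.txt now belongs to the commit called second.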
Chapter 5. Lessons of History
involves a file that should be kept private for some reason. Perhaps I left my credit card number in a text file and accidentally added it to the project. Deleting the file is insufficient, for the file can be accessed from older commits. We must remove the file from all commits:
$ git filter-branch --tree-filter 'rm top/secret/file' HEAD
See git help filter-branch, which discusses this example and gives a faster method. In general, filter-branch lets you alter large sections of history with a single command. Afterwards, the .git/refs/original directory describes the state of affairs before the operation. Check the filter-branch command did what you wanted, then delete this directory if you wish to run more filter-branch commands. Lastly, replace clones of your project with your revised version if you want to interact with them later.
commit refs/heads/master
committer Bob <[email protected]> Tue, 14 Mar 2000 01:59:26 -0800
The git fast-export command converts any repository to the git fast-import format, whose output you can study for writing exporters, and also to transport repositories in a human-readable format. Indeed, these commands can send repositories of text files over text-only channels.
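As a rough illustration, the sketch below dumps a tiny repository to a text file and rebuilds it elsewhere. The directories, dump.txt and the commit are all hypothetical.

```shell
set -e
# Hypothetical identity so commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
src=$(mktemp -d)
dst=$(mktemp -d)
cd "$src"
git init -q
echo hello > greeting.txt
git add greeting.txt
git commit -q -m "greeting"
branch=$(git symbolic-ref --short HEAD)
git fast-export --all > "$dst/dump.txt"   # plain text you can read or mail
cd "$dst"
git init -q
git fast-import --quiet < dump.txt        # rebuild the history from the dump
src_head=$(git -C "$src" rev-parse HEAD)
dst_head=$(git rev-parse "refs/heads/$branch")
echo "$src_head $dst_head"
```

For a simple history like this one, the rebuilt commit has the same hash as the original.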
Git checks out a state halfway in between. Test the feature, and if it's still broken:
$ git bisect bad
If not, replace "bad" with "good". Git again transports you to a state halfway between the known good and bad versions, narrowing down the possibilities. After a few iterations, this binary search will lead you to the commit that caused the trouble. Once you've finished your investigation, return to your original state by typing:
$ git bisect reset
Git uses the return value of the given command, typically a one-off script, to decide whether a change is good or bad: the command should exit with code 0 when good, 125 when the change should be skipped, and anything else between 1 and 127 if it is bad. A negative return value aborts the bisect. You can do much more: the help page explains how to visualize bisects, examine or replay the bisect log, and eliminate known innocent changes for a speedier search.
which annotates every line in the given file showing who last changed it, and when. Unlike many other version control systems, this operation works offline, reading only from local disk.
I experienced these phenomena first-hand. Git was the first version control system I used. I quickly grew accustomed to it, taking many features for granted. I simply assumed other systems were similar: choosing a version control system ought to be no different from choosing a text editor or web browser.
I was shocked when later forced to use a centralized system. A flaky internet connection matters little with Git, but makes development unbearable when it needs to be as reliable as local disk. Additionally, I found myself conditioned to avoid certain commands because of the latencies involved, which ultimately prevented me from following my desired work flow.
When I had to run a slow command, the interruption to my train of thought dealt a disproportionate amount of damage. While waiting for server communication to complete, I'd do something else to pass the time, such as check email or write documentation. By the time I returned to the original task, the command had finished long ago, and I would waste more time trying to remember what I was doing. Humans are bad at context switching.
There was also an interesting tragedy-of-the-commons effect: anticipating network congestion, individuals would consume more bandwidth than necessary on various operations in an attempt to reduce future delays. The combined efforts intensified congestion, encouraging individuals to consume even more bandwidth next time to avoid even longer delays.
6.1. Who Am I?
Every commit has an author name and email, which is shown by git log. By default, Git uses system settings to populate these fields. To set them explicitly, type:
$ git config --global user.name "John Doe"
$ git config --global user.email [email protected]
Omit the --global flag to set these options only for the current repository.
For older versions of Git, the copy command fails and you should run:
$ chmod a+x hooks/post-update
Now you can publish your latest edits via SSH from any clone:
$ git push web.server:/path/to/proj.git master
then transports the bundle, somefile, to the other party somehow: email, thumb drive, an xxd printout and an OCR scanner, reading bits over the phone, smoke signals, etc. The receiver retrieves commits from the bundle by typing:
$ git pull somefile
The receiver can even do this from an empty repository. Despite its size, somefile contains the entire original git repository. In larger projects, eliminate waste by bundling only changes the other repository lacks. For example, suppose the commit 1b6d... is the most recent commit shared by both parties:
$ git bundle create somefile HEAD ^1b6d
If done frequently, one could easily forget which commit was last sent. The help page suggests using tags to solve this. Namely, after you send a bundle, type:
$ git tag -f lastbundle HEAD
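Putting the pieces together, this sketch sends a full bundle, tags the spot, and later sends only what is new. It names the master branch explicitly rather than relying on HEAD, and the directories and file names (somefile aside) are invented.

```shell
set -e
# Hypothetical identity so commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
sender=$(mktemp -d)
receiver=$(mktemp -d)
cd "$sender"
git init -q
git symbolic-ref HEAD refs/heads/master
echo one > log.txt; git add log.txt; git commit -q -m "one"
git bundle create "$receiver/somefile" master 2> /dev/null  # everything so far
git tag -f lastbundle master > /dev/null                    # remember what was sent
cd "$receiver"
git init -q
git symbolic-ref HEAD refs/heads/master
git pull -q "$receiver/somefile" master
cd "$sender"
echo two >> log.txt; git commit -q -a -m "two"
git bundle create "$receiver/update" lastbundle..master 2> /dev/null  # only the new commit
git tag -f lastbundle master > /dev/null
cd "$receiver"
git pull -q "$receiver/update" master
sent=$(git -C "$sender" rev-parse master)
received=$(git rev-parse HEAD)
echo "$sent $received"
```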
Chapter 6. Multiplayer Git
Similarly, on your side, all you require is an email account: there's no need to set up an online Git repository. Recall from the first chapter:
$ git diff 1b6d > my.patch
outputs a patch which can be pasted into an email for discussion. In a Git repository, type:
$ git apply < my.patch
to apply the patch. In more formal settings, when author names and perhaps signatures should be recorded, generate the corresponding patches past a certain point by typing:
$ git format-patch 1b6d
The resulting files can be given to git-send-email, or sent by hand. You can also specify a range of commits:
$ git format-patch 1b6d..HEAD^^
This applies the incoming patch and also creates a commit, including information such as the author. With a browser email client, you may need to click a button to see the email in its raw original form before saving the patch to a file. There are slight differences for mbox-based email clients, but if you use one of these, you're probably the sort of person who can figure them out easily without reading tutorials!
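The sketch below plays both roles, with a plain directory standing in for the email. It assumes git am as the command that applies a format-patch file on the receiving side; the repositories and the commit are made up.

```shell
set -e
# Hypothetical identity so commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
author=$(mktemp -d)
maintainer=$(mktemp -d)
cd "$author"
git init -q
echo base > file.txt; git add file.txt; git commit -q -m "base"
git clone -q "$author" "$maintainer/proj"      # maintainer's copy, before the fix
echo fix >> file.txt; git commit -q -a -m "Fix the thing"
git format-patch -q -o "$maintainer" HEAD~1    # one patch file per commit after HEAD~1
cd "$maintainer/proj"
git am -q "$maintainer"/0001-*.patch           # apply the patch and create the commit
applied=$(git log -1 --format='%an: %s')
echo "$applied"
```

Unlike git apply, the resulting commit carries the original author and log message.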
The remote.origin.url option controls the source URL; origin is a nickname given to the source repository. As with the master branch convention, we may change or delete this nickname but there is usually no reason for doing so. If the original repository moves, we can update the URL via:
$ git config remote.origin.url git://new.url/proj.git
The branch.master.merge option specifies the default remote branch in a git pull. During the initial clone, it is set to the current branch of the source repository, so even if the HEAD of the source repository subsequently moves to a different branch, a later pull will faithfully follow the original branch. This option only applies to the repository we first cloned from, which is recorded in the option branch.master.remote. If we pull in from other repositories we must explicitly state which branch we want:
$ git pull git://example.com/other.git master
The above explains why some of our earlier push and pull examples had no arguments.
These represent branches and the HEAD of the remote repository, and can be used in regular Git commands. For example, suppose you have made many commits, and wish to compare against the last fetched version. You could search through the logs for the appropriate SHA1 hash, but it's much easier to type:
$ git diff origin/HEAD
Or you can see what the experimental branch has been up to:
$ git log origin/experimental
Now we have merged in a branch from the second repository, and we have easy access to all branches of all repositories:
$ git diff origin/experimental^ other/some_branch~5
But what if we just want to compare their changes without affecting our own work? In other words, we want to examine their branches without having their changes invade our working directory. Then rather than pull, run:
$ git fetch        # Fetch from origin, the default.
$ git fetch other  # Fetch from the second programmer.
This just fetches histories. Although the working directory remains untouched, we can refer to any branch of any repository in a Git command because we now possess a local copy. Recall that behind the scenes, a pull is simply a fetch then merge. Usually we pull because we want to merge the latest commit after a fetch; this situation is a notable exception. See git help remote for how to remove remote repositories, ignore certain branches, and more.
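A quick way to convince yourself that a fetch leaves your work alone is the following sketch, which uses two scratch directories in place of real remote repositories:

```shell
set -e
# Hypothetical identity so commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
upstream=$(mktemp -d)
work=$(mktemp -d)
cd "$upstream"
git init -q
echo v1 > file.txt; git add file.txt; git commit -q -m "v1"
git clone -q "$upstream" "$work/clone"
echo v2 > file.txt; git commit -q -a -m "v2"   # upstream moves ahead
cd "$work/clone"
git fetch -q origin                            # histories only, no merge
local_subject=$(git log -1 --format=%s HEAD)
remote_subject=$(git log -1 --format=%s origin/HEAD)
on_disk=$(cat file.txt)
echo "$local_subject $remote_subject $on_disk"
```

Our branch and working directory still say v1, while origin/HEAD already knows about v2.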
6.8. My Preferences
For my projects, I like contributors to prepare repositories from which I can pull. Some Git hosting services let you host your own fork of a project with the click of a button. After I fetch a tree, I run Git commands to navigate and examine the changes, which ideally are well-organized and well-described. I merge my own changes, and perhaps make further edits. Once satisfied, I push to the main repository.
Though I infrequently receive contributions, I believe this approach scales well. See this blog post by Linus Torvalds (http://torvalds-family.blogspot.com/2009/06/happiness-is-warm-scm.html). Staying in the Git world is slightly more convenient than patch files, as it saves me from converting them to Git commits. Furthermore, Git handles details such as recording the author's name and email address, as well as the time and date, and asks the author to describe their own change.
Git will look at the files in the current directory and work out the details by itself. Instead of the second add command, run git commit -a if you also intend to commit at this time. See git help ignore for how to specify files that should be ignored. You can perform the above in a single pass with:
$ git ls-files -d -m -o -z | xargs -0 git update-index --add --remove
The -z and -0 options prevent ill side-effects from filenames containing strange characters. As this command adds ignored files, you may want to use the -x or -X option.
For each edit you made, Git will show you the hunk of code that was changed, and ask if it should be part of the next commit. Answer with "y" or "n". You have other options, such as postponing the decision; type "?" to learn more. Once you're satisfied, type
$ git commit
to commit precisely the changes you selected (the staged changes). Make sure you omit the -a option, otherwise Git will commit all the edits. What if you've edited many files in many places? Reviewing each change one by one becomes frustratingly mind-numbing. In this case, use git add -i, whose interface is less straightforward, but more flexible. With a few keystrokes, you can stage or unstage several files at a time, or review and select changes in particular files only. Alternatively, run git commit --interactive which automatically commits after you're done.
Chapter 7. Git Grandmastery
will move the HEAD three commits back. Thus all Git commands now act as if you hadn't made those last three commits, while your files remain in the present. See the help page for some applications. But how can you go back to the future? The past commits know nothing of the future. If you have the SHA1 of the original HEAD then:
$ git reset 1b6d
But suppose you never took it down? Don't worry: for commands like these, Git saves the original HEAD as a tag called ORIG_HEAD, and you can return safe and sound with:
$ git reset ORIG_HEAD
7.6. HEAD-hunting
Perhaps ORIG_HEAD isn't enough. Perhaps you've just realized you made a monumental mistake and you need to go back to an ancient commit in a long-forgotten branch. By default, Git keeps a commit for at least two weeks, even if you ordered Git to destroy the branch containing it.
The trouble is finding the appropriate hash. You could look at all the hash values in .git/objects and use trial and error to find the one you want. But there's a much easier way. Git records every hash of a commit it computes in .git/logs. The subdirectory refs contains the history of all activity on all branches, while the file HEAD shows every hash value it has ever taken. The latter can be used to find hashes of commits on branches that have been accidentally lopped off.
The reflog command provides a friendly interface to these log files. Try
$ git reflog
See the Specifying Revisions section of git help rev-parse for more.
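For instance, the sketch below throws away two commits with a hard reset and then recovers them through the reflog. It assumes the HEAD@{1} notation, which names the value HEAD had one move ago; the repository is a scratch one.

```shell
set -e
# Hypothetical identity so commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
for n in 1 2 3; do
  git commit -q --allow-empty -m "commit $n"
done
lost=$(git rev-parse HEAD)
git reset -q --hard HEAD~2        # "lose" the last two commits
git reflog | head -n 3            # the old hash is still on record
git reset -q --hard 'HEAD@{1}'    # go back to where HEAD was before the reset
recovered=$(git rev-parse HEAD)
```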
You may wish to configure a longer grace period for doomed commits. For example:
$ git config gc.pruneexpire "30 days"
means a deleted commit will only be permanently lost once 30 days have passed and git gc is run. You may also wish to disable automatic invocations of git gc:
$ git config gc.auto 0
in which case commits will only be deleted when you run git gc manually.
Another is to print the current branch in the prompt, or window title. Invoking
$ git symbolic-ref HEAD
shows the current branch name. In practice, you most likely want to remove the "refs/heads/" and ignore errors:
$ git symbolic-ref HEAD 2> /dev/null | cut -b 12-
The contrib subdirectory is a treasure trove of tools built on Git. In time, some of them may be promoted to official commands. On Debian and Ubuntu, this directory lives at /usr/share/doc/git-core/contrib. One popular resident is workdir/git-new-workdir. Via clever symlinking, this script creates a new working directory whose history is shared with the original repository:
$ git-new-workdir an/existing/repo new/directory
The new directory and the files within can be thought of as a clone, except since the history is shared, the two trees automatically stay in sync. There's no need to merge, push, or pull.
On the other hand, if you specify particular paths for checkout, then there are no safety checks. The supplied paths are quietly overwritten. Take care if you use checkout in this manner.
Reset: Reset also fails in the presence of uncommitted changes. To force it through, run:
$ git reset --hard 1b6d
Branch: Deleting branches fails if this causes changes to be lost. To force a deletion, type:
$ git branch -D dead_branch # instead of -d
Similarly, attempting to overwrite a branch via a move fails if data loss would ensue. To force a branch move, type:
$ git branch -M source target # instead of -m
Unlike checkout and reset, these two commands defer data destruction. The changes are still stored in the .git subdirectory, and can be retrieved by recovering the appropriate hash from .git/logs (see "HEAD-hunting" above). By default, they will be kept for at least two weeks.
Clean: Some git commands refuse to proceed because they're worried about clobbering untracked files. If you're certain that all untracked files and directories are expendable, then delete them mercilessly with:
$ git clean -f -d
Now Git aborts a commit if useless whitespace or unresolved merge conflicts are detected. For this guide, I eventually added the following to the beginning of the pre-commit hook to guard against absent-mindedness:
if git ls-files -o | grep '\.txt$'; then
  echo FAIL! Untracked .txt files.
  exit 1
fi
Several git operations support hooks; see git help hooks. We activated the sample post-update hook earlier when discussing Git over HTTP. This runs whenever the head moves. The sample post-update script updates files Git needs for communication over Git-agnostic transports such as HTTP.
8.1. Invisibility
How can Git be so unobtrusive? Aside from occasional commits and merges, you can work as if you were unaware that version control exists. That is, until you need it, and that's when you're glad Git was watching over you the whole time.
Other version control systems force you to constantly struggle with red tape and bureaucracy. Permissions of files may be read-only unless you explicitly tell a central server which files you intend to edit. The most basic commands may slow to a crawl as the number of users increases. Work grinds to a halt when the network or the central server goes down.
In contrast, Git simply keeps the history of your project in the .git directory in your working directory. This is your own copy of the history, so you can stay offline until you want to communicate with others. You have total control over the fate of your files because Git can easily recreate a saved state from .git at any time.
8.2. Integrity
Most people associate cryptography with keeping information secret, but another equally important goal is keeping information safe. Proper use of cryptographic hash functions can prevent accidental or malicious data corruption.
A SHA1 hash can be thought of as a unique 160-bit ID number for every string of bytes you'll encounter in your life. Actually more than that: every string of bytes that any human will ever use over many lifetimes.
As a SHA1 hash is itself a string of bytes, we can hash strings of bytes containing other hashes. This simple observation is surprisingly useful: look up hash chains. We'll later see how Git uses it to efficiently guarantee data integrity.
Briefly, Git keeps your data in the .git/objects subdirectory, where instead of normal filenames, you'll find only IDs. By using IDs as filenames, as well as a few lockfiles and timestamping tricks, Git transforms any humble filesystem into an efficient and robust database.
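A toy hash chain takes only a few lines. In this sketch git hash-object --stdin stands in for a plain SHA1 tool (it hashes its input with a small header prepended), and the chain function and the entries are invented for illustration; this is not how Git itself lays out its objects.

```shell
set -e
# Fold a list of entries into one hash: each step hashes the previous
# hash together with the next entry, so the result depends on all of them.
chain() {
  prev=""
  for entry in "$@"; do
    prev=$(printf '%s\n%s\n' "$prev" "$entry" | git hash-object --stdin)
  done
  echo "$prev"
}
honest=$(chain "first edit" "second edit" "third edit")
forged=$(chain "first edit (tampered)" "second edit" "third edit")
echo "$honest"
echo "$forged"
```

Changing the oldest entry changes the final hash, so whoever holds the last 20 bytes can detect tampering anywhere in the chain.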
8.3. Intelligence
How does Git know you renamed a file, even though you never mentioned the fact explicitly? Sure, you may have run git mv, but that is exactly the same as a git rm followed by a git add. Git heuristically ferrets out renames and copies between successive versions. In fact, it can detect chunks of code being moved or copied around between files! Though it cannot cover all cases, it does a decent job, and this feature is always improving. If it fails to work for you, try options enabling more expensive copy detection, and consider upgrading.
8.4. Indexing
For every tracked file, Git records information such as its size, creation time and last modification time in a file known as the index. To determine whether a file has changed, Git compares its current stats with those cached in the index. If they match, then Git can skip reading the file again. Since stat calls are considerably faster than file reads, if you only edit a few files, Git can update its state in almost no time.
We stated earlier that the index is a staging area. Why is a bunch of file stats a staging area? Because the add command puts files into Git's database and updates these stats, while the commit command, without options, creates a commit based only on these stats and the files already in the database.
8.7. Blobs
First, a magic trick. Pick a filename, any filename. In an empty directory:
$ echo sweet > YOUR_FILENAME
$ git init
$ git add .
$ find .git/objects -type f
You'll see .git/objects/aa/823728ea7d592acc69b36875a482cdf3fd5c8d. How do I know this without knowing the filename? It's because the SHA1 hash of:
"blob" SP "6" NUL "sweet" LF
is aa823728ea7d592acc69b36875a482cdf3fd5c8d, where SP is a space, NUL is a zero byte and LF is a linefeed. You can verify this by typing:
$ printf "blob 6\000sweet\n" | sha1sum
Git is content-addressable: files are not stored according to their filename, but rather by the hash of the data they contain, in a file we call a blob object. We can think of the hash as a unique ID for a file's contents, so in a sense we are addressing files by their content. The initial blob 6 is merely a header consisting of the object type and its length in bytes; it simplifies internal bookkeeping. Thus I could easily predict what you would see. The file's name is irrelevant: only the data inside is used to construct the blob object.
You may be wondering what happens to identical files. Try adding copies of your file, with any filenames whatsoever. The contents of .git/objects stay the same no matter how many you add. Git only stores the data once.
By the way, the files within .git/objects are compressed with zlib so you should not stare at them directly. Filter them through zpipe -d (http://www.zlib.net/zpipe.c), or type:
$ git cat-file -p aa823728ea7d592acc69b36875a482cdf3fd5c8d
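Both claims are easy to check. The sketch below assumes git hash-object, which prints the ID Git would give to some data, and sha1sum as used above; one.txt and two.txt are arbitrary names.

```shell
set -e
# The blob ID is the SHA1 of a short header plus the contents.
by_git=$(echo sweet | git hash-object --stdin)
by_hand=$(printf 'blob 6\000sweet\n' | sha1sum | cut -d ' ' -f 1)
echo "$by_git $by_hand"

# Identical contents under different names are stored only once.
repo=$(mktemp -d)
cd "$repo"
git init -q
echo sweet > one.txt
cp one.txt two.txt
git add .
stored=$(find .git/objects -type f | wc -l)
echo "$stored"
```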
8.8. Trees
But where are the filenames? They must be stored somewhere at some stage. Git gets around to the
You should now see 3 objects. This time I cannot tell you what the 2 new files are, as it partly depends on the filename you picked. We'll proceed assuming you chose rose. If you didn't, you can rewrite history to make it look like you did:
$ git filter-branch --tree-filter 'mv YOUR_FILENAME rose'
$ find .git/objects -type f
Now you should see the file .git/objects/05/b217bb859794d08bb9e4f7f04cbda4b207fbe9, because this is the SHA1 hash of its contents:
"tree" SP "32" NUL "100644 rose" NUL 0xaa823728ea7d592acc69b36875a482cdf3fd5c8d
Hash verification is trickier via cat-file because its output contains more than the raw uncompressed object file. This file is a tree object: a list of tuples consisting of a file type, a filename, and a hash. In our example, the file type is 100644, which means rose is a normal file, and the hash is the blob object that contains the contents of rose. Other possible file types are executables, symlinks or directories. In the last case, the hash points to a tree object.
If you ran filter-branch, you'll have old objects you no longer need. Although they will be jettisoned automatically once the grace period expires, we'll delete them now to make our toy example easier to follow:
$ rm -r .git/refs/original
$ git reflog expire --expire=now --all
$ git prune
For real projects you should typically avoid commands like this, as you are destroying backups. If you want a clean repository, it is usually best to make a fresh clone. Also, take care when directly manipulating .git: what if a Git command is running at the same time, or a sudden power outage occurs? In general, refs should be deleted with git update-ref -d, though usually it's safe to remove refs/original by hand.
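The tree object for rose can also be built by hand, without committing anything. This sketch assumes the plumbing commands git hash-object -w (store a blob) and git mktree (build a tree from an ls-tree style listing), and runs in a fresh scratch repository.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
blob=$(echo sweet | git hash-object -w --stdin)             # store the contents
tree=$(printf '100644 blob %s\trose\n' "$blob" | git mktree) # one entry: rose
kind=$(git cat-file -t "$tree")   # object type
size=$(git cat-file -s "$tree")   # payload length in bytes
git cat-file -p "$tree"           # human-readable listing of the tree
echo "$tree $kind $size"
```

The reported size is the 32 that appears in the tree header: 12 bytes for "100644 rose" and its NUL terminator, plus the 20 raw bytes of the blob hash.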
8.9. Commits
We've explained 2 of the 3 objects. The third is a commit object. Its contents depend on the commit message as well as the date and time it was created. To match what we have here, we'll have to tweak it a little:
$ git commit --amend -m Shakespeare  # Change the commit message.
$ git filter-branch --env-filter 'export
    GIT_AUTHOR_DATE="Fri 13 Feb 2009 15:31:30 -0800"
    GIT_AUTHOR_NAME="Alice"
    GIT_AUTHOR_EMAIL="[email protected]"
    GIT_COMMITTER_DATE="Fri, 13 Feb 2009 15:31:30 -0800"
    GIT_COMMITTER_NAME="Bob"
    GIT_COMMITTER_EMAIL="[email protected]"'  # Rig timestamps and authors.
$ find .git/objects -type f
You should now see .git/objects/49/993fe130c4b3bf24857a15d7969c396b7bc187 which is the SHA1 hash of its contents:
"commit 158" NUL
"tree 05b217bb859794d08bb9e4f7f04cbda4b207fbe9" LF
"author Alice <[email protected]> 1234567890 -0800" LF
"committer Bob <[email protected]> 1234567890 -0800" LF
LF
"Shakespeare" LF
As before, you can run zpipe or cat-file to see for yourself. This is the first commit, so there are no parent commits, but later commits will always contain at least one line identifying a parent commit.
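The parent lines are easy to spot with cat-file. The sketch below makes two empty commits in a scratch repository under a made-up identity and counts the parent lines in each:

```shell
set -e
# Hypothetical identity so commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
git commit -q --allow-empty -m "root"
git commit -q --allow-empty -m "child"
git cat-file -p HEAD     # tree, parent, author, committer, blank line, message
# grep -c exits non-zero when it counts 0 lines, hence the "|| true".
root_parents=$(git cat-file -p HEAD~1 | grep -c '^parent ' || true)
child_parents=$(git cat-file -p HEAD | grep -c '^parent ')
echo "$root_parents $child_parents"
```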
Chapter 8. Secrets Revealed
We defeat even the most devious adversaries. Suppose somebody attempts to stealthily modify the contents of a file in an ancient version of a project. To keep the object database looking healthy, they must also change the hash of the corresponding blob object since it's now a different string of bytes. This means they'll have to change the hash of any tree object referencing the file, and in turn change the hash of all commit objects involving such a tree, in addition to the hashes of all the descendants of these commits. This implies the hash of the official head differs to that of the bad repository. By following the trail of mismatching hashes we can pinpoint the mutilated file, as well as the commit where it was first corrupted.
In short, so long as the 20 bytes representing the last commit are safe, it's impossible to tamper with a Git repository.
What about Git's famous features? Branching? Merging? Tags? Mere details. The current head is kept in the file .git/HEAD, which contains a hash of a commit object or, more commonly, a symbolic reference to a branch (for example ref: refs/heads/master). The hash gets updated during a commit as well as many other commands. Branches are almost the same: they are files in .git/refs/heads. Tags too: they live in .git/refs/tags but they are updated by a different set of commands.
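You can peek at these files directly. The sketch assumes a fresh repository using Git's default loose-file ref storage, where refs have not yet been packed; the tag name v1 is arbitrary. On a branch, .git/HEAD typically holds a symbolic reference to the branch file rather than a bare hash.

```shell
set -e
# Hypothetical identity so commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "only commit"
git tag v1
head_file=$(cat .git/HEAD)                 # pointer to the current branch
branch_file=$(cat .git/refs/heads/master)  # hash of the branch's last commit
tag_file=$(cat .git/refs/tags/v1)          # hash the tag points to
commit=$(git rev-parse HEAD)
echo "$head_file"
echo "$branch_file"
```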
Cygwin (http://cygwin.com/), a Linux-like environment for Windows, contains a Windows port of Git (http://cygwin.com/packages/git/). Git on MSys (http://code.google.com/p/msysgit/) is an alternative requiring minimal runtime support, though a few of the commands need some work.
Appendix A. Git Shortcomings
1. Diffs are quick because only the marked files need be examined.
2. One can discover who else is working on the file by asking the central server who has marked it for editing.
With appropriate scripting, you can achieve the same with Git. This requires cooperation from the programmer, who should execute particular scripts when editing a file.
Or perhaps a database or backup/archival solution is what is actually being sought, not a version control system. For example, version control may be ill-suited for managing photos periodically taken from a webcam.
If the files really must be constantly morphing and they really must be versioned, a possibility is to use Git in a centralized fashion. One can create shallow clones, which check out little or no history of the project. Of course, many Git tools will be unavailable, and fixes must be submitted as patches. This is probably fine as it's unclear why anyone would want the history of wildly unstable files.
Another example is a project depending on firmware, which takes the form of a huge binary file. The history of the firmware is uninteresting to users, and updates compress poorly, so firmware revisions would unnecessarily blow up the size of the repository. In this case, the source code should be stored in a Git repository, and the binary file should be kept separately. To make life easier, one could distribute a script that uses Git to clone the code, and rsync or a Git shallow clone for the firmware.
and so on for each text file. Edit the Makefile and add the language code to the TRANSLATIONS variable. You can now review your work incrementally:
$ make tlh
$ firefox book-tlh/index.html
Commit your changes often, then let me know when they're ready. GitHub has an interface that facilitates this: fork the "gitmagic" project, push your changes, then ask me to merge.