Introduction to Distributed Version Control Systems
Learn about and compare Bazaar, Mercurial, and Git
By Adam Shand & Noah Gift
Summary: Interested in distributed version control but intimidated by all the jargon? This article provides an introduction to the three main systems available (Git, Mercurial, and Bazaar), discusses some of the advantages gained from adopting a distributed workflow, and provides a quick start guide for each of the three systems.
Introduction
Over the last few years, there has been much discussion about the benefits distributed version control systems (DVCS) can offer your development process. Distributed tools have now matured to the point where there are few downsides remaining. While some of the advantages may not be initially compelling, the extra flexibility that distributed tools provide makes a lot of sense in the long run. By the end of this article, you should know enough to get started with a distributed version control system and have a basic understanding of the advantages a distributed model can provide.
Much of the discussion around distributed version control has focused on a central server no longer being required. While this is a distinguishing feature, and vital to some groups of developers, the real value is that it allows groups of developers to implement nearly any workflow that they choose. The possibilities range from two developers sharing changes with no infrastructure other than a shared wireless network to the traditional centralised model.
This freedom to work in different ways is what is truly exciting about distributed version control. The end of the article provides a how-to guide for a simple ad hoc workflow that could be used by writers, school teachers, or hard-core Linux kernel developers.
What is Distributed Version Control?
A distributed version control system is a way to manage many versions of files on many different computers without requiring a centralised server. Collections of changes can be exchanged with other users to allow easy merging of changes.
The main advantage is that it provides a lot of flexibility when it comes to designing your version control workflows. In addition, because it requires a full copy of the repository to exist on the client, it is very fast to perform most operations, and it can be effectively used without a network connection.
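As a rough illustration of the "full copy on the client" point, the sketch below uses Git (any of the three tools behaves similarly). It builds a small repository, clones it, deletes the original, and then reads the history from the clone alone. The scratch directory, file name, and commit identity are all made up for the example.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch area; nothing here touches a network.
work=$(mktemp -d)

# Commit identity supplied through the environment (assumed values),
# so the example does not depend on any global Git configuration.
GIT_AUTHOR_NAME="Example Dev"
GIT_AUTHOR_EMAIL="dev@example.com"
GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME"
GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

# Build an original repository with two commits.
git init -q "$work/origin"
cd "$work/origin"
echo "first" > notes.txt
git add notes.txt
git commit -q -m "add notes"
echo "second" >> notes.txt
git commit -q -am "extend notes"

# Clone it: the copy carries the whole history, not just the latest files.
git clone -q "$work/origin" "$work/copy"

# Remove the original entirely; the clone is still a complete repository.
cd "$work"
rm -rf "$work/origin"
cd "$work/copy"
history_count=$(git rev-list --count HEAD)
echo "commits available locally: $history_count"
```

Because the history lives in the clone, commands such as log, diff, and commit keep working with no connection to the place the repository came from.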
Distributed Version Control Workflows
Because distributed version control is so flexible, there are a vast number of potential workflows. One common workflow is an ad hoc workflow. Here a developer starts a project and creates a branch that they want to share. They and another developer then exchange changes directly between their branches, merging those changes in each time.
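The ad hoc exchange can be sketched with Git on a single machine, with local directories standing in for two developers' laptops. The names alice and bob, the file names, and the `git_as` helper are hypothetical and exist only for this illustration; the same pattern applies to Mercurial and Bazaar.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)   # hypothetical scratch directory

# Helper (made up for this sketch): run git as a named developer
# without touching any global configuration.
git_as() {
    name=$1; shift
    git -c user.name="$name" -c user.email="$name@example.com" "$@"
}

# Alice starts the project.
git init -q "$work/alice"
cd "$work/alice"
echo "draft" > chapter.txt
git add chapter.txt
git_as alice commit -q -m "start chapter"

# Bob copies Alice's repository directly; no server is involved.
git clone -q "$work/alice" "$work/bob"
cd "$work/bob"
echo "review notes" > review.txt
git add review.txt
git_as bob commit -q -m "add review notes"

# Meanwhile Alice keeps working on her own copy.
cd "$work/alice"
echo "outline" > outline.txt
git add outline.txt
git_as alice commit -q -m "add outline"

# Alice pulls Bob's work straight from his copy and merges it with hers.
git_as alice pull -q --no-rebase --no-edit "$work/bob" HEAD
ls
```

After the pull, Alice's working directory contains both her own file and Bob's, joined by a merge commit. Bob could pull from Alice in the same way to pick up her outline.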
Another common workflow is to use a centralised server with local commits. Here a developer works much as they would with a Subversion repository, except that they commit locally and then push the finished changes to the centralised server. There are many variations on this, including mixing it with the ad hoc workflow. The main thing to take away is that there are many ways to work, and distributed version control provides the flexibility to choose which way works best for you.
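A minimal sketch of the centralised-with-local-commits pattern, again using Git, might look like the following. A bare repository in a temporary directory plays the part of the central server; in practice it would sit behind an ssh or http URL. The paths, file name, and commit identity are assumptions made for the example.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)   # hypothetical scratch directory

# Commit identity supplied through the environment (assumed values).
GIT_AUTHOR_NAME="Example Dev"
GIT_AUTHOR_EMAIL="dev@example.com"
GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME"
GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

# A bare repository stands in for the central server.
git init -q --bare "$work/central.git"

# The developer clones it and works locally.
git clone -q "$work/central.git" "$work/dev"
cd "$work/dev"

# Several small commits are recorded locally; the server sees none of them yet.
echo "step 1" > feature.txt
git add feature.txt
git commit -q -m "feature: first step"
echo "step 2" >> feature.txt
git commit -q -am "feature: second step"

# When the work is ready, one push publishes all of the local commits.
git push -q origin HEAD
published=$(git --git-dir="$work/central.git" rev-list --count --all)
echo "commits on the central repository: $published"
```

The developer is free to commit as often as they like while offline or mid-task, and the central repository only receives the work when they choose to push it.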
Quick Start Guide
One of the best ways to understand a new technology is to work with it. This section walks through common operations in Mercurial, Bazaar, and Git:
Mercurial
- Install:
sudo easy_install mercurial
- Make project directory:
mkdir hgrepo; cd hgrepo
- Initialise project:
hg init
- Add a file:
touch foo.txt; hg add foo.txt
- Commit:
hg commit -m "added foo.txt"
- Grab shared repository:
hg clone ssh://example.com//projects/hgrepo
- Push changes to server:
hg push
- See pending updates as patches:
hg incoming -p
- Download updates from server:
hg pull
- Apply changes:
hg update
- Merge conflicts:
hg merge
- Merge two unrelated remote repositories:
hg pull -f ssh://example2.com//projects/hgrepo
Bazaar
- Install:
sudo easy_install bzr
- Make project directory:
mkdir bzrrepo; cd bzrrepo
- Initialise project:
bzr init
- Add a file:
touch foo.txt; bzr add foo.txt
- Commit:
bzr commit -m "added foo.txt"
- Grab shared repository:
bzr branch bzr+ssh://example.com/projects/bzrrepo
- Push changes to server:
bzr push
- Download updates from server:
bzr pull
- Apply changes:
bzr update
- Merge conflicts:
bzr merge
Git
- Install: Download the latest tar file
- Make project directory:
mkdir gitrepo; cd gitrepo
- Initialise project:
git init
- Add a file:
touch foo.txt; git add foo.txt
- Commit:
git commit -m "added foo.txt"
- Grab shared repository:
git clone ssh://example.com/projects/gitrepo
- Push changes to server:
git push
- Download updates from server:
git pull
- Merge conflicts:
git merge
Conversion Tools and Integration with Subversion
All three systems can integrate with existing Subversion repositories, and they also provide varying degrees of integration with other distributed systems. There is a standalone tool called tailor which allows you to convert a repository to another format (e.g. Git to Mercurial or Subversion to Bazaar).
These tools make experimenting with distributed systems a straightforward process and greatly simplify migrating between systems should the need arise.
Third Party Hosting Options
If you don't want to host your own server, there are some popular choices for hosting your project with Git, Bazaar, or Mercurial.
For Mercurial, a popular free and paid hosting site is Bitbucket. For Git, there is a similar service called GitHub, and for Bazaar there is Launchpad. In addition to these specific services, SourceForge provides support for all three (plus Subversion and CVS) and Google Code supports Mercurial in addition to Subversion.
Example Ad hoc Workflow Using Mercurial
An example of when a simple ad hoc workflow might be useful is if you are working with a friend in a coffee shop. In this instance, the only infrastructure you have available might be your two laptops plus a local wireless network.
Here's an example of how you and your friend can easily and securely exchange code using Mercurial and the above resources.
You create a repository using Mercurial:
mkdir /tmp/myhgrepo
cd /tmp/myhgrepo
hg init
When you are ready, you use Mercurial's built-in web server to share a read-only version of your new repository:
hg serve
Your friend can then make a copy of your repository by using the clone command:
hg clone http://192.168.1.2:8000
Your friend is now free to make any changes to their local copy of the repository. When they are ready, they also share a read-only version of their repository:
hg serve
You can then pull down changes that your friend has made whenever it suits:
hg pull http://192.168.1.3:8000
With this workflow, each developer is secure in knowing that they are only changing their local copy of the files, and yet it still provides a simple way to send and receive changes between the developers.
Hint: If you are both using Macs, there's a convenient trick. Instead of using each other's IP addresses, you can use each other's Bonjour names. Every Mac is reachable on the local network by appending ".local" to its hostname. So if your partner's Mac is called maus, you could use the command below:
hg clone http://maus.local:8000
Conclusion
This article explains the benefits of distributed version control and some of the differences between the three main options. If you are new to version control, continue to experiment and learn about hooks and plugins and the power that they provide.
If you are an old hand, then you should be up to speed with each of the tools so you can use them to their best advantage. Review the links in the resources section, which contain more detailed information on the specifics of distributed version control and how some people are using it.
Resources
- BetterExplained's illustrated introduction to distributed version control.
- InfoQ's "Not-So-Quick Guide" to distributed version control.
- Wikipedia's overview on distributed version control.
- Google's analysis of why they chose Mercurial.
- Tailor is an "anything" to "anything" version control migration tool.
- "The Definitive Guide" to Mercurial.
- Mercurial provides hgsvn and Hg-Git for working with remote Subversion and Git repositories.
- Git provides git-svn for working with remote Subversion repositories.
- Bazaar provides a variety of options for working with remote Subversion or Git repositories.
- Joel Spolsky finally gets distributed version control and creates an excellent site to help people learn Mercurial.