Book Review: Bazaar Version Control

Packt recently published a book on Version Control using Bazaar written by Janos Gyerik. I was curious what the book was like, and they kindly provided me with a digital copy.

The book is split into roughly five sections: an introduction to version control using Bazaar's main commands, an overview of the available workflows, some chapters on the available extensions and integration, some more advanced topics and finally, a quick introduction to programming using bzrlib.

It is assumed the reader has no pre-existing knowledge about version control systems. The first chapters introduce the reader to the concept of revision history, branching and merging and finally collaboration. All concepts are first discussed in theory, and then demonstrated using the Bazaar command-line UI and the bzr-explorer tool. The book follows roughly the same track as the official documentation, but it is more extensive and has more fancy drawings of revision graphs.

The middle section of the book discusses the modes in which Bazaar can be used - centralized or decentralized - as well as the various ways in which code can be landed in the main branch ("workflows"). The selection of workflows in the book is roughly the same as those in the official Bazaar documentation. The author briefly touches on a number of other software engineering topics such as code reviews, code formatting and automated testing, though not sufficiently to make it useful for people who are unfamiliar with these techniques. Both the official documentation and the book complicate things unnecessarily by listing every possible option.

The next chapter is a basic howto on the use of Bazaar with various hosting solutions, such as Launchpad, Redmine and Trac.

The Advanced Features chapter covers a wide range of obscure and less obscure features in Bazaar: uncommit, shelves, re-using working trees, lightweight checkouts, stacked branches, signing revisions and using e-mail hooks.

The chapter on foreign version control system integration is a more extensive version of the public docs. It has some factual inaccuracies; in particular, it recommends the installation of a 2 year old buggy version of bzr-git.

The last chapter provides quite a good introduction to the Bazaar APIs and plugin writing. It is a fair bit better than what is available publicly.

Overall, it's not a bad book but also not a huge step forward from the official documentation. I might recommend it to people who are interested in learning Bazaar and who do not have any experience with version control yet. Those who are already familiar with Bazaar or another version control system will not find much new.

The book misses an opportunity by following the official documentation so closely. It has the same omissions and the same overemphasis on describing every possible feature. I had hoped to read more about Bazaar's data model, its file format and some of the common problems, such as parallel imports, format hell and slowness.


Bazaar: A retrospective

I have been involved in the Bazaar project for the last 7 years. Since I am slowly stepping down, I recently wrote a retrospective on the project as I experienced it.

Thanks to a few kind people for proofreading earlier drafts; if you spot any errors, please let me know in the comments.



During the last two days I hacked together a Bazaar module for Apache. This module makes it easy to enable the Bazaar smart server for Bazaar branches. It can also display a simple placeholder page for Bazaar branches without a working tree. It's surprisingly easy to write Apache modules.

The main advantage this has over a mod_wsgi / mod_python / mod_fcgi setup is that it doesn't require any additional Python hacking on the user's side or other configuration outside of Apache, and it doesn't require per-branch configuration in the Apache configuration. In the future I'd also like to support the settings "BazaarFrontend Wikkid" and "BazaarFrontend Loggerhead".

The configuration is currently as simple as:

LoadModule bzr_module /usr/lib/apache2/modules/
BazaarSmart on
BazaarFrontend Basic

in your apache2.conf. The BazaarSmart and BazaarFrontend directives can appear in <Directory> or <Location> clauses as well, if you'd like to have different behaviour for different directories.

At the moment this project is a proof of concept, and probably not something you would want to run in production. For example, there is no way yet to limit access to a branch to read-only, and I still need to double-check that there are no threading issues.

Testing and patches are welcome. The project is hosted here:

Currently Playing: Stream of Passion - Calliopeia


bzr-builddeb FTW

% bzr branch deb:line6-usb-source debian
Retrieving Vcs locating from line6-usb-source Debian version 0.7.4-1
Branched 354 revision(s).
% bzr merge-upstream
All changes applied successfully.
Using version string 0.7.4+svn511 for upstream branch.
The new upstream version has been imported. You should now update the changelog (try dch -v 0.7.4+svn511-1 "New upstream snapshot."), resolve any conflicts, and then commit.
% dch -v 0.7.4+svn511-1 "New upstream snapshot."
% bzr builddeb
Building using working tree
Preparing the build area: ../build-area
Purging the build dir: ../build-area/line6-usb-0.7.4+svn511
Placing result in /home/jelmer/bzr/line6-usb/result
% ls ../result
line6-usb_0.7.4+svn511-1_amd64.changes  line6-usb_0.7.4+svn511-1.dsc
line6-usb_0.7.4+svn511-1.diff.gz        line6-usb_0.7.4+svn511.orig.tar.gz

Currently Playing: Phideaux - Microdeath Softstar


bzr-svn push without file properties

Ever since bzr-svn started supporting "true push", people have been complaining about the extra file properties it sets.

The key thing about "true" push is that it preserves the exact revisions that were present in Subversion. This lets bzr behave on Subversion branches transparently using the same UI you also use for "native" Bazaar branches.

In other words, if I push to a Subversion branch from my machine, then that branch in Subversion contains enough information for somebody else to reconstruct the exact bzr branch I had.

Since some Bazaar metadata can not be represented in Subversion, it is stored in Bazaar-specific Subversion properties. Unfortunately, these file properties show up in email commit notifications and trac and so they tend to annoy people.

There are two ways around this:

Revision properties

Bazaar-specific metadata can be stored in custom Subversion revision properties (these don't show up in commit notifications). Unfortunately, this requires Subversion 1.5 or newer to run on the server.

I hope to start setting revision properties instead of file properties when possible as of the next bzr-svn release.

Less strict push

It's also possible to throw away any data that can not be represented in Subversion. Since this means that the remote branch won't end up as an exact copy of the local revisions, this isn't true push. The two branches will have diverged (no matter how slightly) after such a push, so it is necessary to rebase on the remote branch after pushing.

This is similar to the way git-svn pushes data into Subversion - it calls it "dcommit".

Since this uses rebase it has the usual disadvantages of rebases, which I won't get into right now.

As of a couple of days ago, bzr-svn now also supports this mode of pushing using the "dpush" command, by popular demand.
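
The difference between true push and this less strict push can be illustrated with a toy model in Python. This is only a sketch under stated assumptions: the `SvnCommit` class, the `toy:revision-id` property name and the `svn-r<N>` fallback id format are all made up for illustration and are not bzr-svn's actual data model or property names.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SvnCommit:
    """A commit as stored on the (simulated) Subversion side."""
    revnum: int
    message: str
    properties: Dict[str, str] = field(default_factory=dict)


def push(repo: List[SvnCommit], revision_id: str, message: str, lossy: bool) -> None:
    """Append a commit; keep the local revision id only when not lossy."""
    # "toy:revision-id" is a hypothetical property name for this sketch.
    props = {} if lossy else {"toy:revision-id": revision_id}
    repo.append(SvnCommit(len(repo) + 1, message, props))


def revision_id_seen_by_others(commit: SvnCommit) -> str:
    """Reconstruct a revision id from what the Subversion side stored."""
    # Without the stored id, others can only derive one from the revnum.
    return commit.properties.get("toy:revision-id", f"svn-r{commit.revnum}")


local_id = "alice-20080101-abc"  # made-up local revision id

# True push: the extra property lets others rebuild the exact revision.
true_repo: List[SvnCommit] = []
push(true_repo, local_id, "Fix bug", lossy=False)
same_after_true_push = revision_id_seen_by_others(true_repo[0]) == local_id

# Lossy push: no extra properties, so the remote revision gets a new identity
# and the local branch has to be rebased onto it.
lossy_repo: List[SvnCommit] = []
push(lossy_repo, local_id, "Fix bug", lossy=True)
same_after_lossy_push = revision_id_seen_by_others(lossy_repo[0]) == local_id
```

In this model the lossy push leaves no Bazaar-specific properties behind, which is exactly why the pushed revision no longer matches the local one.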

Currently Playing: Brandi Carlile - The Story


bzr-svn: now with its own Subversion Python bindings

bzr-svn has always been using the standard Python bindings that were provided with Subversion itself. Unfortunately, I had to fix some issues in these bindings since they were incomplete or broken and thus bzr-svn has always depended on a development snapshot of Subversion.

As of today, bzr-svn is using its own Python bindings for Subversion.

There were several reasons for switching to our own bindings:

  • There are no requirements for backwards compatibility within bzr-svn. This means the API can be made sane without worrying about the mess it was in the past and users who still rely on that.
  • Deployment. It took 2 years for my fixes to the Subversion Python bindings to be part of a release. It'll be even longer before Subversion 1.5 makes it into most available distributions. That makes it very hard to just download and install bzr-svn.
  • They're in plain C, not SWIG. SWIG has a big advantage for the Subversion folks since it can generate Python, Ruby, Java or Tcl bindings all at once without a lot of overhead per language. However, it also has issues that make it a bad choice for bzr-svn:
    • It generates inefficient code - it generates proxy classes that add more layers in the stack
    • Bindings tend to be very much like the C API rather than "Pythonic". To make them more Pythonic, you need tons of typemaps. For example, the Python bindings in bzr-svn provide an iterator when browsing the revision history rather than a callback as C and the SWIG bindings do.
    • Hard to write - personally at least, I write bindings in C faster than in SWIG
    • Adds an extra dependency to the build process. Several people had trouble building Subversion on their Mac machines because they didn't have the right version of SWIG available.
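
The iterator-versus-callback point from the list above can be illustrated with a small, self-contained Python sketch. The names `walk_log_callback` and `iter_log` and the sample history are hypothetical; they are not the API of bzr-svn's bindings or of the SWIG bindings, and only contrast the two calling styles.

```python
from typing import Callable, Iterator, List, Tuple

# Hypothetical sample history: (revision number, commit message) pairs.
HISTORY: List[Tuple[int, str]] = [
    (3, "Fix typo"),
    (2, "Add feature"),
    (1, "Initial import"),
]


def walk_log_callback(receiver: Callable[[int, str], None]) -> None:
    """Callback style, as a C-like API would expose it: the library
    drives the loop and calls `receiver` once per revision."""
    for revno, message in HISTORY:
        receiver(revno, message)


def iter_log() -> Iterator[Tuple[int, str]]:
    """Iterator style: the caller drives the loop and can stop early."""
    for revno, message in HISTORY:
        yield revno, message


# Callback style: results have to be collected through a side effect.
collected: List[int] = []
walk_log_callback(lambda revno, message: collected.append(revno))

# Iterator style: ordinary Python control flow, including early exit.
first_feature = next(
    revno for revno, message in iter_log() if "feature" in message
)
```

With the iterator, stopping after the first match needs no extra machinery, whereas a callback API would need some way to signal "stop" back to the library.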

Since all of the patches that bzr-svn depended on previously were in the Python bindings for Subversion, it is now possible to use bzr-svn with any version of Subversion newer than 1.4.0. Of course, you do need to have the development headers installed as well.

Currently Playing: Kathleen Edwards - Independent Thief


Bazaar in the GNOME world

I was happy to see that John Carr has set up a Bazaar Mirror of all projects in GNOME Subversion, all created using bzr-svn. There's also a quick introduction to using Bazaar for GNOME developers on the GNOME Wiki.

Wouter, long-time Bazaar user and GNOME dude, recently blogged about pushing Bazaar branches into GNOME Subversion, working around the restrictions imposed by the pre-commit hooks in GNOME Subversion.

The problems John ran into with memory usage in the Python Subversion bindings encouraged me to continue the work on bzr-svn's own Python bindings, thus avoiding any dependency on unreleased versions of Subversion and several other issues.


Git cutting corners

My relationship with git is still one of love and hate. It cuts corners to increase performance in a couple of places and that can be really bloody annoying.

For example, jerry renamed one of the top-level directories in Samba 3 (revision 9f672c26d63955f613088489c6efbdc08b5b2d14). Git will skip rename detection in this revision because of the number of files it affects, thus causing the output of "git log <path>" of this particular directory to be useless.

I'm the first to admit "bzr log" on directories and files in large history projects is painfully slow, but at least it gets the output right.

Currently Playing: Brandi Carlile - The Story


Using bzr-builddeb as a svn-buildpackage replacement

This slightly evil hack to bzr-svn allows using bzr-builddeb as a drop-in replacement for svn-buildpackage, making it recognize the "mergeWithUpstream" property svn-buildpackage uses.

Currently Playing: Jeff Healey - Mess O' Blues


Adoption blockers Bazaar sprint

The London Bazaar sprint is over again for this year. It was really good to meet everybody in person again and also to meet some of the folks who hadn't been to a sprint before.

Last year's sprint was mainly about improving performance; this year, we discussed adoption blockers and how to remove them. A short summary of the brainstorming is on the wiki.

Martin's Blog has some pictures.


The Mars Volta concert we went to last night in Tilburg was absolutely brilliant. Very energetic and definitely one of the best acts I've ever seen live. We were standing in the back of a completely packed venue for 3 hours, but it was very much worth it.

Currently Playing: Soft Machine - Teeth


Bazaar: Need for a "Product" object?

This is something that's been lingering in the back of my head for the last year or so. I think I am missing something in the sequence of [Branch, Repository, WorkingTree]. Here are some of the reasons why I think this is the case:

  • Tags should ideally be shared amongst a set of related branches. This has come up often during discussions about where tags belong.
  • Management of sets of bzr branches is hard
    • It seems to make sense for the configuration of several plugins to be project-specific:
      • bzr-pqm-submit's pqm address
      • bzr-email target address
      • bzr-cia's project setting
      • ...
    • may be useful to override whoami
    • having a way to group branches allows mass-pushes/pulls
  • I often find that the public_location I set is almost the same for related branches, with only the last part of the url differing and containing the branch nick
  • It would be nice if "bzr register-branch" could automatically determine what product to register as

I'm not looking for repositories:

  • Repositories can contain data from multiple totally unrelated branches. A tag "1.0" could conflict because there are multiple unrelated projects that have it.
  • Repositories are a storage optimization and I like them that way,

although other projects (mercurial, git) seem to be using repositories to allow talking about a group of related branches.

I'm not looking for "just" directories:

  • There's no place in a directory to store settings or tags
  • Having a long list of settings in ~/.bazaar/locations.conf doesn't scale and the settings won't be able to propagate

Having another semantic object ("Product"?) on which options/tags can be set would help. Perhaps based on the root id (where available)?
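
To make the idea a bit more concrete, here is a rough Python sketch of what such an object could look like. Everything in it is hypothetical: no `Product` class like this exists in bzrlib, and the `Branch` class below is a stand-in for illustration, not bzrlib's `Branch`. The option name `pqm_email` and the URLs are likewise invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Branch:
    """Stand-in for a branch: just a nick and branch-local settings."""
    nick: str
    config: Dict[str, str] = field(default_factory=dict)


@dataclass
class Product:
    """Hypothetical object grouping related branches."""
    name: str
    base_url: str
    config: Dict[str, str] = field(default_factory=dict)
    tags: Dict[str, str] = field(default_factory=dict)  # tag name -> revision id
    branches: List[Branch] = field(default_factory=list)

    def add_branch(self, nick: str) -> Branch:
        branch = Branch(nick)
        self.branches.append(branch)
        return branch

    def get_option(self, branch: Branch, key: str) -> Optional[str]:
        """Branch-level settings win; otherwise fall back to the product."""
        return branch.config.get(key, self.config.get(key))

    def public_location(self, branch: Branch) -> str:
        """Derive the public location from the shared base URL and the nick."""
        return f"{self.base_url.rstrip('/')}/{branch.nick}"


product = Product(
    "example", "http://example.com/bzr/", config={"pqm_email": "pqm@example.com"}
)
trunk = product.add_branch("trunk")
feature = product.add_branch("feature-x")
feature.config["pqm_email"] = "other@example.com"  # per-branch override
product.tags["1.0"] = "rev-id-1"  # tag shared by all branches of the product
```

The sketch covers three of the points above: settings shared across a set of branches with per-branch overrides, tags that live on the group rather than on a single branch, and public locations that differ only in the branch nick.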

Currently Playing: Symphony X - The Odyssey



The next major release of bzr-svn, 0.4, has now been released. The main change in this release is that the behaviour of push is now intuitive. The big hack that allowed push to somewhat work in the previous release has been replaced by proper push which behaves the same way as it would against a Bazaar branch.

It's now also possible to branch from non-standard branch locations such as /foo in a repository, rather than only from standard locations like /trunk or /trunk/foo. See the release announcement for a list of other changes.

It's interesting to see what other people are saying about bzr-svn on their blogs:

Note that bzr-svn 0.4 has been tested on Windows and that branching schemes are now more flexible.


Ohloh - Statistics on Free Software projects

Ohloh is a nice web 2.0 site that contains stats on various Free Software projects. At the moment, they only support Subversion, CVS and Git. They're open to feature requests though. If enough people ask for it, hopefully they'll support Bazaar at some point.


Using a pqm with Subversion

One of the things that I've always missed in DVCS is the ability to refuse commits in a branch that's shared by multiple people based on a test suite run. Sure, it's possible to have a pre-commit hook - but that would mean that you'd have to wait for the full test suite to run before the commit finishes. With the time it takes to run the Samba test suite, this is not really an option.

One of the things that would work is to have everybody work in a separate branch and then have some sort of tool that merges those revisions from everybody's personal branches that worked ok. However, to my knowledge, there is no such tool for Subversion.

Bazaar uses a tool called PQM (Patch Queue Manager). PQM usually controls the main branch (for example for Bazaar, it controls, and waits for GPG-signed requests to merge a specific revision into that main branch. Before accepting such a revision, it will try to run the testsuite to make sure it passes. This guarantees that the main branch never contains broken code (as far as can be indicated by the testsuite).
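
The gatekeeping idea can be sketched in a few lines of Python. This is a toy model and not PQM's actual code: the `process_queue` function, the list-of-strings "branch" and the `tests_pass` callable are assumptions made for illustration, and the GPG signature check is left out entirely.

```python
from collections import deque
from typing import Callable, Deque, List


def process_queue(
    mainline: List[str],
    requests: Deque[str],
    tests_pass: Callable[[List[str]], bool],
) -> List[str]:
    """Merge each requested revision only if the test run succeeds.

    Returns the list of rejected revisions.
    """
    rejected: List[str] = []
    while requests:
        revision = requests.popleft()
        candidate = mainline + [revision]  # trial merge, mainline untouched
        if tests_pass(candidate):
            mainline.append(revision)  # tests passed: land the revision
        else:
            rejected.append(revision)  # tests failed: mainline stays as it was
    return rejected


mainline = ["r1"]
queue = deque(["good-fix", "broken-change", "another-fix"])

# Stand-in for a real test suite run: fails whenever the broken change is present.
rejected = process_queue(
    mainline, queue, lambda tree: "broken-change" not in tree
)
```

Because the tests run against a trial merge, a failing request never touches the main branch, and later requests are still evaluated against a known-good state.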

Now that bzr-svn supports true push, it is possible to actually use a PQM with a Subversion branch. I've tried it on a smaller branch last week, and am now looking into using this for my Samba work.



I'm currently doing a bit of sightseeing in London, after attending the Bazaar sprint at the Canonical office. It was a good sprint, and quite different from the previous ones - in that there was only a limited amount of actual coding involved. The view from the Canonical office is magnificent, so we were even able to do some sightseeing while working...

Bazaar's focus has previously mainly been on correctness and features. The first has always been one of our strengths, and we're in pretty good shape regarding the second. Performance has been one of the main complaints from users about Bazaar and so we have recently tried to improve in that area.

Since 0.12, we have already tried to optimise some of the common code paths and some people have been working on a high performance smart server (to speed up remote operations).

During the first two-and-a-half days of the sprint, we analysed 20 of the most common use cases with Bazaar and determined what complexity they should ideally require to be able to work. After this analysis, we looked at ways to change our data structures to reach these goals.

I have mainly been a spectator during the latter parts of these discussions, but they were interesting to follow.

One of the things I worked on was support for true push in bzr-svn. This was one of the bugs that has bitten a lot of users of bzr-svn. The upcoming bzr-svn 0.4 now supports true push as well as commits in heavyweight checkouts. I hope to release 0.4 after adding nested tree and ignores support so that I don't have to change the internal mapping mechanisms again.

And now, it is time for some more sightseeing. After that I hope to get back to the reason I'm doing all of this in the first place: Samba!