You Don’t Have to Ask Your Boss for a Fast Build (Lean Software Development Part 6)

A slow build costs money. In fact, it costs a lot of money, all the time !

Spending some time to speed up the build is an investment : you pay now, and it’s only a matter of time until you get a return on that investment. Here is the trick : if the payback comes quickly enough, no one will even notice that you spent time making the build faster !

With a bit of maths, you can even derive what Reinertsen calls a “Decentralized Decision Rule”, making it possible for anyone in the organization to figure out whether they should spend some time on the build, without asking anyone for permission.

Our example

Our team consists of 5 pairs, each running the build at least 10 times per day. Let’s figure out the value of a 1 minute build time speed-up :

  • The whole team would save : 1m x 5 pairs x 10 builds = 50 minutes per day
  • In a 2 week sprint, this amounts to around 1 day of work

This means that if a pair spends half a day to get a 1 minute build speed-up, it would not change the output of the sprint, and it would in fact increase the throughput of the team for further sprints.

Anyone in our team who spots a potential 1 minute build time speed-up that would take less than 1 day to implement should do it right away, without asking anyone for permission.
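To make the rule concrete, here is a small sketch of the arithmetic (the 8 hour work day is my assumption; the other figures come from the example above) :

```python
# Hypothetical figures from the example above : 5 pairs, 10 builds per day
# each, a 10-working-day sprint, and an assumed 8 hour work day.
PAIRS = 5
BUILDS_PER_DAY = 10
SPRINT_DAYS = 10
WORK_DAY_MINUTES = 8 * 60

def payback_days(speedup_minutes, implementation_days):
    """Days of builds needed before the speed-up repays its implementation cost."""
    saved_per_day = speedup_minutes * PAIRS * BUILDS_PER_DAY  # minutes saved per day
    return implementation_days * WORK_DAY_MINUTES / saved_per_day

def should_do_it(speedup_minutes, implementation_days, horizon_days=SPRINT_DAYS):
    """Decentralized decision rule : do it if it pays back within the horizon."""
    return payback_days(speedup_minutes, implementation_days) <= horizon_days

# A 1 minute speed-up that takes half a day pays back well within the sprint.
print(should_do_it(1, 0.5))  # True
```

With these figures, a 1 minute speed-up taking half a day pays back in under 5 days of builds, which is exactly why the pair in the example can go ahead without asking.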

Other Benefits

A direct benefit is that the issue will not have to be re-discussed every time someone spots a build time improvement. This will save some management time, and more build speed-up actions will eventually be undertaken.

The astute lean reader will have noticed that I completely ignored the second effect of fast feedback :

  • if the build is faster
  • we will run it more often
  • we’ll spot errors earlier
  • fewer errors will be submitted
  • the overall throughput will be increased even more

Another hidden benefit concerns the Cost of Delay (the cost of not selling the product NOW). As Cost of Delay typically trumps development costs, any improvement to the build time will bring an even greater ROI in the long term.


If your sponsor agrees, you can negotiate a longer return on investment period for your decision rule. For example, if they agreed to extend the horizon to 2 sprints, you could undertake more build speed-up tasks. You might also prefer to only discuss really long ROI investments with them.

While designing the 777, Boeing used a similar decision rule to meet the required weight of the plane : any engineer could increase the production cost by up to 300$ provided it saved a pound of weight on the plane. This fixed the issues they previously had with departmental weight budgets and escalation.

Finally, it would be great to have the same kind of rule for technical debt ! If you knew both the cost of fixing and the cost of not fixing your technical debt, you could decide whether it makes sense to work on the debt right now. But that’s for a later experiment.

This was part 6 of my Lean Software Development Series. Part 5 was What optimization should we work on ?, Part 7 will be “A Plan for Technical Debt”.

The Agile Change Management Viral Hack

We just discovered a hack to spread agile (BDD by the way) through an organization. It works provided :

  • There is a BDD testing expert in your team
  • Your team is using the work of another software team from your company

If this team does not use agile practices, they are likely to regularly introduce regressions or to be late delivering new versions.

Use your client role in the relationship, and offer your help ! Spend some time with them to help them put automated customer tests in place. Be genuinely nice, lead by example, be available and, obviously, bring improvement. With a bit of luck, they’ll soon be asking for more.

Given there are too many bugs
When you can help
Then DO IT !

Real Programmers Have Todo Lists

Productive programmers maintain a todo list. No Exception.

Why is it so important ?

As programmers, here is the typical internal discussion we have all day long :

- Why the hell am I doing this again ?
… hard thinking …
- Yes ! I remember now :
- Encapsulate this field
- In order to move it to another class
- In order to move this other function there too
- In order to be able to remove that other static variable
- In order to refactor the login module
- In order to remove the dependency between the custom search query generator and the login module
- In order to refactor the query generator
- In order to be able to optimize it
- In order to speed up the whole website !

Phew, now that’s a list ! A 9 frame stack, all in our heads, and that’s only a simple example. Knowing that we humans usually have around 7 ‘registers’ in our brains, that’s a lot of clutter to remember.

Maintaining all this in a todo list frees up some of our brainpower !

What happens when you use a todo list

Quite a lot in fact :

  • It’s satisfying to check something as done !
  • Our programming gets better, because we can fully concentrate on it
  • We have a clear idea about what’s done, what’s still to be done, and why we are doing it
  • We avoid getting lost in things that don’t really need to be done
  • We can make better choices about what to do, what to postpone, and what not to do
  • We can make more accurate estimates about the time it will take to finish the job

In the end, all this makes you feel less stressed and more productive !

How to do it

There are many ways to maintain a todo list. Which to choose is not as important as having one. Here are my 2 design principles for a todo list system :

  • It goes into finer detail than typical bug tracking software
  • It should help you to concentrate on the few items you can do in the coming hours

For example, I am now using a simple TODAY … TOMORROW … LATER … scheme. I tend to avoid deep hierarchies as they get in the way of my second principle. I like to keep DONE items visible for up to 1 day, to keep track of what I did.

Here is a list of tools you can use to set up a todo list :

  • Any text editor using a simple format convention will do
  • Dropbox or any other synchronization tool can be helpful to access it from different places
  • Org Mode of Emacs has built-in support for todo lists. It’s a simple text file, but with color highlighting and shortcuts
  • Google Keep might do just fine for you
  • Google Docs can also be useful, especially if you need to share your todo list with others (when pair programming for example)
  • Trello is also a good one, it can even be used as a personal kanban board
  • Any other todo list tool that suits you !

If you are not already using a todo list, start now and become more productive ! No excuse !

Trellospectives : Remote Retrospectives With Trello

As a distributed team working from Paris and Beirut, after pair programming, it was time for our retrospectives to get remote !

Initial setup

At first we were using the video conference system. The retrospective facilitator would connect with the remote participants through instant chat and forward their post-its. We also used an extra webcam connected to a laptop in order to show the whiteboard to the other room.


The good :

  • Anyone can do it now
  • Kind of works

The bad :

  • We often used to lose 5 minutes setting all the infrastructure up
  • The remote room cannot see the board clearly through the webcam
  • The facilitator has to spend his time forwarding the other room’s points
  • There is a ‘master’ and a ‘slave’ room

Sensei Tool

When Mohamad joined the team in Beirut, we thought that this was not going to scale … We decided to try something else. With the availability of the new conferencing system, we had the idea to use a web tool to run the retro. We found and tried the Sensei tool. After creating accounts for every member of the team and scheduling a retrospective through the tool, we could all participate equally using our smartphones. The retrospective follows a typical workflow that is fine for teams new to the practice.


The good :

  • Even easier to set up
  • Works fine
  • Everyone could participate as easily

The bad :

  • The website was a bit slow
  • The retrospective was too guided for an experienced team, we did not get outputs as good as we used to


Trello

Asking Google, we discovered that some teams were having success using Trello for their remote retrospectives. We decided to give it a try. Ahmad from Beirut got to work on our first retrospective with it. He had to prepare it beforehand (as we always have). In practice :

  • Ahmad created an organization for our team
  • We all registered to Trello and joined the organization (we like joining !)
  • Ahmad created a custom board for each activity
  • During the meeting, we used the video conferencing system and the instant chat to have both visio and screen sharing
  • The animator used a laptop to manage the Trello boards
  • Every one of us could add post-its through the smartphone app


The good :

  • The setup is easy
  • The retrospective worked well and delivered interesting output
  • We can actually all see the board
  • The smartphone app works well
  • It is possible to vote directly through Trello
  • Everyone could participate as easily
  • We can classify post-its with labels
  • We can insert pictures and photos
  • There are a lot of Chrome extensions for Trello (Vertical Lists for Trello, Card Color Titles for Trello)

The bad :

  • There is nothing to ‘group’ post-its together
  • We need to prepare custom boards for every activity
  • We would need to pay for the gold version to get custom backgrounds and stickers


While missing a few features that would make it awesome, Trello is the best tool we have found for remote retrospectives, and it works better than our initial physical setup. We’re continuing to use it, and we now have to figure out :

  • If we could find a way to speed up the meeting preparation
  • How to handle ‘graph oriented’ activities such as the ‘5 whys’

What Optimization Should We Work on (Lean Software Development Part 5)

At work, we are building a risk aggregation system. As it deals with a large bunch of numbers, it’s a huge heap of optimizations. Now that its most standard feature set is supported, our job mostly consists of making it faster.

That’s what we are doing now.

How do we choose which optimization to work on ?

The system still being young, we have a wide range of options to optimize it. To name just a few : caches, better algorithms, better low level hardware usage …

It turns out that we can use the speedup factor as a substitute for business value and use known techniques to help us to make the best decisions.

Let’s walk through an example

I. List the optimizations you are thinking of

Let’s suppose we are thinking of the following 3 optimizations for our engine

  • Create better data structures to speed up the reconciliation algorithm
  • Optimize the reconciliation algorithm itself to reduce CPU cache misses
  • Minimize boxing and unboxing

II. Poker estimate the story points and speedup

Armed with these stories, we can poker estimate them, by story points and by expected speedup. As a substitute for WSJF, we will then be able to compute the speedup rate per story point. We will then just have to work on the stories with the highest speedup rate first.

Title | Story Points | /10 | /2 | -10% | ~ | +10% | x2 | x10 | Expected speedup ratio* | Speedup rate / story point**
Data Structures | 13 | | | | | 4 votes | 5 votes | | x 1.533 | x 1.033
Algorithm | 13 | | 1 vote | 1 vote | 2 votes | 1 vote | 2 votes | 2 votes | x 1.799 | x 1.046
Boxing | 8 | | | | | 9 votes | | | x 1.1 | x 1.012

* The expected speedup ratio is the logarithmic (geometric) average of the voted speedups
** The speedup rate is speedup^(1 / story points)
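The vote arithmetic above can be sketched in a few lines of Python (the vote-to-multiplier mapping is assumed from the poker scale in the table) :

```python
from math import prod

# Planning-poker buckets mapped to speedup multipliers (assumed from the table).
SCALE = {"/10": 0.1, "/2": 0.5, "-10%": 0.9, "~": 1.0, "+10%": 1.1, "x2": 2.0, "x10": 10.0}

def expected_speedup(votes):
    """Logarithmic (geometric) average of the voted speedups."""
    multipliers = [SCALE[v] for v in votes]
    return prod(multipliers) ** (1 / len(multipliers))

def speedup_rate(votes, story_points):
    """Speedup per story point : expected_speedup ** (1 / story_points)."""
    return expected_speedup(votes) ** (1 / story_points)

# Data Structures : 4 votes at +10% and 5 votes at x2, 13 story points.
data_structures = ["+10%"] * 4 + ["x2"] * 5
print(round(expected_speedup(data_structures), 3))  # 1.533
print(round(speedup_rate(data_structures, 13), 3))  # 1.033
```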

So based on speedup rate, here is the order in which we should perform the stories :

  1. Algorithm
  2. Data Structures
  3. Boxing

III. And what about the risks ?

This poker estimation tells us something else …

We don’t have a clue about the speedup we will get by trying to optimize the algorithm !

The votes range from /2 to x10 ! This is the perfect situation for an XP spike.

Title | Story points | Expected speedup rate
Algorithm spike : measure out-of-context CPU cache optimization speedup | 2 | ?

In order to compute the expected speedup rate, let’s suppose that there are 2 futures : one where we get a high speedup and another where we get a low one.

They are computed by splitting the votes into 2 halves :

  • low_speedup = 0.846
  • high_speedup = 3.827

If the spike succeeds

We’ll first work on the spike, and then on the algorithm story. In the end, we would get the speedup of the algorithm optimization.

  • spike_high_speedup = high_speedup = 3.827

If the spike fails

We’ll also start by working on the spike. Afterwards, instead of the algorithm story, we’ll tackle other optimization stories, yielding our average speedup rate for the duration of the algorithm story. The average speedup rate can be obtained from historical benchmark data, or by averaging the speedup rate of the other stories.

  • average_speedup_rate = (1.033 * 1.012)^½ = 1.022
  • spike_low_speedup = average_speedup_rate^story_points = 1.022^13 = 1.326

Spike speedup rate

We can now compute the average expected speedup rate for the full period ‘spike & algorithm’ stories. From this we will be able to get the speedup rate and finally, to prioritize this spike against the other stories in our backlog.

  • spike_speedup = (spike_low_speedup * spike_high_speedup)^½ = 2.253
  • spike_speedup_rate = spike_speedup^(1/(spike_story_points + algorithm_story_points)) = 2.253^(1/(2 + 13)) = 1.056
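Putting the last few bullets together (figures taken from this section, with Boxing’s rate of 1.012 from the estimation table) :

```python
# Spike arithmetic from this section, computed without intermediate rounding.
average_speedup_rate = (1.033 * 1.012) ** 0.5             # other stories' rate per point
spike_low_speedup = average_speedup_rate ** 13            # fallback work during 13 points
spike_high_speedup = 3.827                                # upper half of the votes
spike_speedup = (spike_low_speedup * spike_high_speedup) ** 0.5
spike_speedup_rate = spike_speedup ** (1 / (2 + 13))      # spike + algorithm story points
print(round(spike_speedup_rate, 3))  # 1.056
```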

IV. Putting it all together

Here are all the speedup rates for the different stories.

Title | Speedup rate / story point
Data Structures | x 1.033
Algorithm | x 1.046
Boxing | x 1.012
Algorithm spike | x 1.056

Finally, here is the optimal order through which we should perform the stories :

  • Algorithm spike
  • Algorithm (only if the spike proved it would work)
  • Data Structures
  • Boxing


The maths are not that complex, and a simple formula can be written to compute the spike speedup rate :

spike_speedup_rate = (spike_low_speedup * spike_high_speedup)^(1 / (2 * (spike_story_points + algorithm_story_points)))

I think most experienced engineers would have come to the same conclusion by gut feeling …

Nevertheless, I believe that systematically applying such a method when prioritizing optimizations can lead to a greater speedup rate than the competition in the long run. This is a perfect example where taking measured risks can pay off !

This was part 5 of my Lean Software Development Series. Part 4 was Measure the business value of your spikes and take high payoff risks, Part 6 will be You don’t have to ask your boss for a fast build.

Setting Up Octopress With Vagrant and Rbenv

I recently got my hands on an abandoned laptop that was better than the one I was using for my personal hacking, so I decided to switch to it. I felt this was the time to learn Vagrant and save myself some time later on. I settled on creating a Vagrant environment for this Octopress blog. That proved a lot longer than I thought it would.

If you want to jump to the solution, just have a look at this git change. Here is the slightly longer version.

  • Add a Vagrantfile and set up a VM. There are explanations about how to do this all over the web, that was easy.

  • Provision your VM. That proved a lot more complex. There are a lot of examples using variants of Chef, but Chef’s steep learning curve seemed unnecessary for what I wanted to do. Eventually, I figured it out using simple shell provisioning.

  config.vm.provision "shell", inline: <<-SHELL
    echo "Updating package definitions"
    sudo apt-get update

    echo "Installing git and build tools"
    sudo apt-get -y install git autoconf bison build-essential libssl-dev libyaml-dev libreadline6-dev zlib1g-dev libncurses5-dev libffi-dev libgdbm3 libgdbm-dev
  SHELL

  config.vm.provision "shell", privileged: false, inline: <<-SHELL
    git config --global user.name "john.doe"
    git config --global user.email ""

    if [ ! -d "$HOME/.rbenv" ]; then
      echo "Installing rbenv and ruby-build"

      git clone https://github.com/sstephenson/rbenv.git ~/.rbenv
      git clone https://github.com/sstephenson/ruby-build.git ~/.rbenv/plugins/ruby-build

      echo 'export PATH="$HOME/.rbenv/bin:$PATH"' >> ~/.bashrc
      echo 'eval "$(rbenv init -)"' >> ~/.bashrc
    else
      echo "Updating rbenv and ruby-build"

      cd ~/.rbenv
      git pull

      cd ~/.rbenv/plugins/ruby-build
      git pull
    fi

    export PATH="$HOME/.rbenv/bin:$PATH"
    eval "$(rbenv init -)"

    if [ ! -d "$HOME/.rbenv/versions/2.2.0" ]; then
      echo "Installing ruby"

      rbenv install 2.2.0
      rbenv global 2.2.0

      gem update --system
      gem update

      gem install bundler
      bundle config path vendor/bundle

      rbenv rehash
    fi

    cd /vagrant
    bundle install

    if [ ! -d "/vagrant/_deploy" ]; then
      bundle exec rake setup_github_pages[""]
      git checkout . # Revert github deploy url to my domain
      cd _deploy
      git pull origin master # pull to avoid non fast forward push
      cd ..
    fi
  SHELL
  • Set up port forwarding. That should have been simple … after forwarding port 4000 to 4000, I still could not access my blog preview from the host machine. After searching the web for a long time, I eventually fixed it by adding --host to the rackup command line in the Octopress Rackfile
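For reference, the forwarding itself is a one-liner in the Vagrantfile (4000 being the Octopress preview port used above) :

```ruby
# Forward the Octopress preview port from the guest to the host.
config.vm.network "forwarded_port", guest: 4000, host: 4000
```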

  • Set up ssh forwarding. In order to be able to deploy to github pages with my local ssh keys, I added the following to my Vagrantfile.

  # The path to the private key to use to SSH into the guest machine. By
  # default this is the insecure private key that ships with Vagrant, since
  # that is what public boxes use. If you make your own custom box with a
  # custom SSH key, this should point to that private key.
  # You can also specify multiple private keys by setting this to be an array.
  # This is useful, for example, if you use the default private key to
  # bootstrap the machine, but replace it with perhaps a more secure key later.
  config.ssh.private_key_path = "~/.ssh/id_rsa"

  #  If true, agent forwarding over SSH connections is enabled. Defaults to false.
  config.ssh.forward_agent = true
vagrant plugin install vagrant-vbguest
vagrant reload

I’ll tell you if this does not do the trick.

I admit it was a lot longer than I expected it to be, but at least now it’s repeatable !

Next steps will be to use Docker providers and a Dockerfile to factorize provisioning and speed up VM startup.

Measure the Business Value of Your Spikes and Take High Payoff Risks (Lean Software Development Part 4)

Lately at work, we’ve unexpectedly been asked by other teams whether they could use our product for something we had not foreseen. As we are not sure whether we’ll be able to tune our product to their needs, we are thinking about doing a short study to find out. This looks like a great opportunity to try out Cost of Delay analysis on uncertain tasks.

Unfortunately, I cannot write the details of what we are creating at work in this blog, so let’s assume that we are building a Todo List Software.

We have been targeting the enterprise market. Lately, we’ve seen some interest from individuals planning to use our todo list system for themselves at home.

For individuals, the system would need to be highly available and live 24/7 over the internet; latency will also be critical to retain customers, but the product could gain market share with a basic feature set.

On the other hand, enterprise customers need advanced features and absolute data safety, but they can cope with nightly restarts of the server.

In order to know if we can make our todo list system available and fast enough for the individuals market, we are planning to conduct a pre-study, so as not to waste time on an unreachable goal. In XP terms, this is a spike, and it’s a bunch of experiments rather than a theoretical study.

When should we prioritize this spike ?

If we are using the Weighted Shortest Job First metric to prioritize our work, we need to estimate the cost of delay of a task to determine its priority. Hereafter I will explain how we could determine the value of this spike.

Computing the cost of delay

The strategy to compute the cost of delay for such a risk mitigation task is to compute the difference in cost of delays with or without doing it.

1. The products, the features, the MVP and the estimates

As I explained in a previous post, for usual features, the cost of delay is equivalent to the feature’s value. Along with our gross estimates, here are the relative values that our product owner gave us for the different products we are envisioning.

Feature | $ Enterprise | $ Individuals | Estimated work
Robustness | 20* | 20* | 2
Availability | 0 | 40* | 2
Latency | 0 | 40* | 1
Durability | 40* | 13 | 2
Multi user lists | 20* | 8 | 2
Labels | 20 | 13 | 2
Custom report tool | 13 | 0 | 3
TOTAL Cost of Delay of v1 | 80 | 100 |

Starred (*) features are required for the first version of the product. Features with a value of 0 are not required at all. Eventually, unstarred features with a non-null business value would be great for a second release.

It seems that the individuals market is a greater opportunity, so it’s worth thinking about it. Unfortunately for the moment, we really don’t know if we’ll manage to get the high availability that is required for such a product.

The availability spike we are envisioning would take 1 unit of time.

2. Computing the cost of delay of this spike

The cost of delay of a task involving some uncertainty is the probabilistic expected value of its cost of delay. We estimate that we have a 50% chance of reaching the availability required by individuals. This means that CoD of the spike = 50% * CoD if we reach the availability + 50% * CoD if we don’t.

2.a. The Cost of Delay if we get the availability

Let’s consider the future in which we manage to reach the required availability. The cost of delay of a spike task is the difference in cost with and without doing the spike, per relevant month.

2.a.i. The cost if we don’t do the spike

Unfortunately, at this point in this future, we don’t yet know that we’ll manage to get to the availability.

Feature | $ Enterprise | $ Individuals | $ Expected | Estimated work | WSJF
Latency | 0 | 40* | 20 | 1 | 20
Durability | 40* | 13 | 26 | 2 | 13
Robustness | 20* | 20* | 20 | 2 | 10
Availability | 0 | 40* | 20 | 2 | 10
Labels | 20 | 13 | 17 | 2 | 9
Multi user lists | 20* | 8 | 14 | 2 | 7
Custom report tool | 13 | 0 | 8 | 3 | 3

We’ll resort to WSJF to prioritize our work. Here is what we’ll be able to ship :

Product | Delay | CoD | Cost
Individuals | 7 | 100 | 700
Individuals Durability | 7 | 13 | 91
Individuals Labels | 9 | 13 | 117
Enterprise | 11 | 80 | 880
Enterprise Labels | 11 | 20 | 220
Individuals Multi user lists | 13 | 8 | 104
Enterprise Custom reports | 16 | 13 | 208
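The WSJF ordering used above can be sketched in a few lines (figures are the expected values and estimates from this section; ties keep their table order) :

```python
# Expected cost of delay ($) and estimated work, from the table above.
features = {
    "Latency": (20, 1),
    "Durability": (26, 2),
    "Robustness": (20, 2),
    "Availability": (20, 2),
    "Labels": (17, 2),
    "Multi user lists": (14, 2),
    "Custom report tool": (8, 3),
}

# WSJF = cost of delay / duration ; schedule the highest ratio first.
# Python's sort is stable, so ties keep their insertion order.
ordered = sorted(features, key=lambda f: features[f][0] / features[f][1], reverse=True)
print(ordered)
```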

2.a.ii. The cost if we do the spike

In this case, we would start by the spike, and it would tell us that we can reach the individuals availability and so that we should go for this feature first. Here will be our planning

Feature | $ Enterprise | $ Individuals | Estimated work | Enterprise WSJF | Individuals WSJF
Feasibility spike | | | 1 | |
Latency | 0 | 40* | 1 | | 40
Availability | 0 | 40* | 2 | | 20
Robustness | 20* | 20* | 2 | 10 | 10
Durability | 40* | 13 | 2 | 20 | 7
Multi user lists | 20* | 8 | 2 | 10 | 4
Labels | 20 | 13 | 2 | 10 | 7
Custom report tool | 13 | 0 | 3 | 4 |

Here is how we will be able to ship :

Product | Delay | CoD | Cost
Individuals | 6 | 100 | 600
Individuals Durability | 8 | 13 | 104
Individuals Multi user lists | 10 | 8 | 80
Enterprise | 10 | 80 | 800
Individuals Labels | 12 | 13 | 156
Enterprise Labels | 12 | 20 | 240
Enterprise Custom reports | 15 | 13 | 195

2.a.iii. Cost of delay of the spike if we reach the availability

By making the spike, we would save 2320 – 2175 = 145$

Without doing the spike, we would discover whether we would reach the availability when we try it, around time 7 (see 2.a.i).

So the cost of delay for the spike would be around 145/7 = 21 $/m

2.b. The Cost of Delay if we don’t get the availability

Let’s now consider the future in which we don’t manage to increase the availability.

Using the same logic as before, let’s now see what happens

2.b.i. The cost if we don’t do the spike

Unfortunately, at this point in this future, we don’t yet know that we’ll not manage to get to the availability.

Feature | $ Enterprise | $ Individuals | $ Expected | Estimated work | WSJF
Latency | 0 | 40* | 20 | 1 | 20
Durability | 40* | 13 | 26 | 2 | 13
Robustness | 20* | 20* | 20 | 2 | 10
Availability | 0 | 40* | 20 | 2 | 10
Multi user lists | 20* | 8 | 14 | 2 | 7
Labels | 20 | 13 | 17 | 2 | 9
Custom report tool | 13 | 0 | 8 | 3 | 3

When we fail at the availability, we’ll swap multi user lists and labels to be able to ship to enterprises as quickly as possible. Here is what we’ll ship :

Product | Delay | CoD | Cost
Enterprise | 9 | 80 | 720
Enterprise Labels | 11 | 20 | 220
Enterprise Custom reports | 14 | 13 | 182

2.b.ii. The cost if we do the spike

In this case, we would start by the spike, and it would tell us that we won’t reach the availability required for individuals, and so that there’s no need to run after it now.

Feature | $ Enterprise | Estimated work | WSJF
Feasibility spike | | 1 |
Durability | 40* | 2 | 13
Robustness | 20* | 2 | 10
Multi user lists | 20* | 2 | 7
Labels | 20 | 2 | 9
Custom report tool | 13 | 3 | 3

Here is how we will be able to ship :

Product | Delay | CoD | Cost
Enterprise | 7 | 80 | 560
Enterprise Labels | 9 | 20 | 180
Enterprise Custom reports | 12 | 13 | 156

2.b.iii. Cost of delay of the spike if we don’t reach the availability

By making the spike, we would save 1122 – 896 = 226$

As before, without doing the spike, we would discover whether we would get the availability when we try it, around time 7.

So the cost of delay for the spike is around 226/7 = 32 $/m

2.c. Compute overall Cost of Delay of the Spike

Given that we estimate there is a 50% chance of reaching the availability, the overall expected cost of delay is :

50% * 21 + 50% * 32 = 26.5 $/m
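As a quick sanity check, the expectation is a one-liner (figures from sections 2.a.iii and 2.b.iii) :

```python
# Expected cost of delay of the spike, weighting the two futures.
p_success = 0.5
cod_if_available = 21      # $/month, from 2.a.iii
cod_if_not_available = 32  # $/month, from 2.b.iii
expected_cod = p_success * cod_if_available + (1 - p_success) * cod_if_not_available
print(expected_cod)  # 26.5
```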

Inject the spike in the backlog

With the Cost of Delay of the spike, we can compute its WSJF and prioritize it against other features.

Feature | $ Enterprise | $ Individuals | Expected $ | Estimated work | WSJF
Feasibility spike | | | 26.5 | 1 | 26.5
Latency | 0 | 40* | 20 | 1 | 20
Durability | 40* | 13 | 26 | 2 | 13
Robustness | 20* | 20* | 20 | 2 | 10
Availability | 0 | 40* | 20 | 2 | 10
Multi user lists | 20* | 8 | 14 | 2 | 7
Labels | 20 | 13 | 17 | 2 | 9
Custom report tool | 13 | 0 | 8 | 3 | 3

The spike comes at the top of our backlog, which confirms our gut feeling.


Doing this long study confirmed some classic rules of thumb :

  • Don’t develop many products at the same time
  • Do some Proof Of Concepts early before starting to work on uncertain features
  • Tackle the most risky features first

By improving the inputs, we could get higher quality results :

  • If we had access to real sales or finance figures for the costs
  • If we did some sort of poker risk estimation instead of just guessing at 50% chances

Obviously, the analysis itself is not perfect, but it hints at the good choices. And as Don Reinertsen puts it, using an economic framework, the spread between people’s estimates goes down from 50:1 to 2:1 ! This seems a good alternative to the experience and gut feeling approach, which :

  • can trigger heated unfounded discussions
  • often means high dependence on the intuition of a single individual

As everything is quantitative though, one could imagine that with other figures, we could have come to another conclusion, such as :

  • The spike is not worth doing (it costs more than it might save)
  • The spike can wait a bit

This was part 4 of my Lean Software Development Series. Part 3 was How to measure your speed with your business value, continue on Part 5 : What optimization should we work on ?.

From Zero to Pair Programming Hero

In my team at Murex, we’ve been doing Pair programming 75% of our time for the past 9 months now.

Before I explain how we got there, let’s summarize our observations :

  • No immediate productivity drop
  • Pair programming is really tiring
  • Quality expectations throughout the team soared up
  • As a result, the quality actually increased a lot
  • But existing technical debt suddenly became incompatible with the team’s own quality criteria. We went on to pay it back, which slowed us down for a while
  • Productivity is regularly going up as the technical debt is reduced
  • It helped us to define shared coding conventions
  • Pair programming is not for everyone. It has likely precipitated the departure of one team member
  • It certainly helped the team to jell
  • Newcomers can submit code on day 1
  • The skills of everyone increase a lot quicker than before
  • Bonus : it improves personal skills of all the team members

If you are interested in how we got there, read on, here is our story :

Best Effort Code Reviews

At the beginning, only experienced team members were reviewing the submitted code, and making remarks for improvement over our default review system : Code Collaborator.

This practice proved tedious, especially with large change lists. As it was not systematic, reviewers constantly had to remind the reviewees to act on the remarks, which hindered empowerment.

Systematic Code Reviews

Observing this during a retrospective, we decided to add code review to our Definition of Done. Inspired by best practices in the Open Source world, we created a ruby review script that automatically creates Code Collaborator reviews based on the Perforce submits. Everyone was made observer of every code change, and everyone was to participate in the reviews.

At first, to make this practice stick, a few benevolent review champions had to comment all the submitted code; once the habit was taken, everyone participated in the reviews.

Code Collaborator spamming was certainly an issue, but using Code Collaborator System Tray App helped each of us to keep up to date with the remaining reviews to do.

Bonus : as everyone was doing reviews, and reviews of small changes are easier, submits became smaller.

This was certainly an improvement, but it remained time consuming. We believed we could do better.

Pair Programming

After 1 or 2 months of systematic code reviews, during a retrospective (again) nearly all the team members decided to give pair programming a try.

We felt the difference after the very first day : pair programming is intense, but the results are great. We never turned back.

With pair programming in place, we had to settle a pair switching frequency. We started with the full story, tried a one day rotation, and eventually settled on “MIN(1 week, the story)”.

This is not set in stone and is team dependent. It may vary depending on the learning curve required to work on a story. We might bring it down later maybe.

Remote Pair Programming

Ahmad, a Murex guy from Beirut joined the team a few months ago. We did not want to change our way of working, and decided to try remote pair programming.

Initial Setup

At the beginning, we were using Lync (Microsoft’s chat system) with webcams, headphones and screen sharing. It works, but Lync’s screen sharing does not allow seamless remote control between Paris and Beirut. Here is how we coped with this :

  • Only exceptionally use Lync’s “Give Control” feature : it lags too much
  • Do small submits, and switch control at submits
  • When you cannot submit soon, shelve the code on Perforce (with git, you would just have your buddy pull from your repo), and switch control

As a result, Ahmad became productive a lot more quickly. We are not 2 sub teams focusing on their own area of expertise, but 1 single distributed team sharing everything.

Improved Setup

Remote pair programming as such is workable, but does not feel as easy as being collocated. Here are a few best practices we are now using to improve the experience :

  • Keep your pair’s video constantly visible : either on your laptop or in a corner of your main screen, but it’s important to see their facial expression all the time
  • In order to allow eye contact, place your cam next to the window containing the video of your pair.
  • Using 2 cameras, ManyCam and a small whiteboard makes it possible to share drawings !

Future Setup

We are currently welcoming a new engineer in Beirut, and as we will be doing more remote pair programming, we’ll need to make this more seamless. Control sharing and lag through Lync remain the main issues. We don’t have a solution for that yet, but here are the fixes we are looking into

  • Saros is an Eclipse plugin for remote, concurrent, real-time editing of files. Many people can edit the same files at the same time. We are waiting for the IntelliJ version, which is still under development

  • Floobits is a commercial equivalent of Saros. We tried it and it seems great. It’s not cheap though, especially with in-house servers.
  • Screenhero is a commercial low-lag, multi-cursor screen sharing tool. Unfortunately, it currently does not work behind a proxy, so we have not been able to evaluate it yet.

Final thoughts

I believe that co-located and remote pair programming are becoming key skills for the modern software engineer.

I hope this will help teams considering pair programming. We’d love to read about your best practices as well!

Can Agile Teams Commit?

Making commitments to deliver software is always difficult. Whatever margin you take, it invariably seems wrong afterward…

Most Scrum, XP, or Kanban literature does not address the issue, simply saying that commitment is not required as long as you continuously deliver value (faster than your competition). That’s kind of true, but sometimes you need commitments, for example when your customer is not yet ready to deploy your new software worldwide every Friday…

So how can you do it while remaining agile?

Roughly speaking, you have two options:

Do it as usual

Discuss with your experts, take some margin, do whatever voodoo you are used to. This will not be worse than it used to be, and it might even turn out better: thanks to your agile process, you should be able to deploy with a reduced scope if needed.

Use your agile process metrics

This technique is explained in The Art of Agile Development, in section “Risk Management”, page 227.

Let’s estimate the time you’ll need before the release:

  • First, list all the stories you want in your release
  • Then estimate them with story points
  • Now that you have the total number of story points to deliver before the release, apply a generic risk multiplier:

| Chances of making it | Using XP practices | Otherwise | Description                 |
|----------------------|--------------------|-----------|-----------------------------|
| 10%                  | x1                 | x1        | Almost impossible (ignore)  |
| 50%                  | x1.4               | x2        | 50-50 chance (stretch goal) |
| 90%                  | x1.8               | x4        | Virtually certain (commit)  |

As explained in The Art of Agile Development, page 227, these numbers come from DeMarco’s Riskology system. Using XP practices means fixing all bugs every iteration, sticking rigorously to DONE-DONE, and having a stable velocity across iterations.

This multiplier accounts for unavoidable scope creep and wrong estimations.

  • Finally, use your iteration velocity to figure out how many sprints you’ll need to finish.

For example :

Suppose you have 45 stories that amount to a total of 152 story points, and that your velocity is 23 story points per iteration. You need to make a commitment, but fortunately you are applying all the XP practices, so you can compute:

Number of sprints = 152 x 1.8 / 23 ≈ 12 sprints (24 weeks, or about 5.5 months)
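The arithmetic above is easy to script. Here is a minimal Python sketch; the function name is mine, and the default multiplier of 1.8 is the 90% / “using XP practices” factor from the table:

```python
import math

def sprints_to_commit(story_points, velocity, risk_multiplier=1.8):
    # risk_multiplier = 1.8: 90% confidence while using XP practices;
    # without them, the "Otherwise" column of the table says to use 4.
    return math.ceil(story_points * risk_multiplier / velocity)

# The example above: 152 story points, velocity of 23 points per iteration
print(sprints_to_commit(152, 23))  # -> 12 sprints
```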

What about unknown risks ?

Unfortunately, using just the previous calculation, you might miss some unavoidable important tasks you’ll need to do before you can release. Think about monitoring tools and stress testing: when did your product owner prioritize these? These are risk management activities that need to be added to your backlog in the first place. Here is how to list most of them:

  • Do a full team brainstorming about anything that could possibly go wrong for your project
  • For every item discovered, estimate
    • Its probability of occurrence (low, medium, high)
    • Its consequences (delay, cost, cancellation of the project)
  • For every item, decide whether to
    • avoid it: find a way to make sure it will not happen
    • contain it: deal with the risk when it occurs
    • mitigate it: find a way to reduce its impact
    • ignore it: don’t bother with unlikely risks of no importance
  • Finally, create stories to match your risk management decisions. For example:
    • A monitoring system helps contain a risk
    • Logging helps mitigate a risk
    • An automated scaling script for situations of high demand helps both mitigate and contain a risk
  • Simply add these stories to your backlog, and repeat the previous section. You can now make your commitment
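Such a risk register can be kept as plain data. Here is a minimal Python sketch of the idea; the risks, decisions, and resulting stories below are purely illustrative assumptions, not taken from a real project:

```python
# Illustrative risk register: each entry records the brainstormed risk,
# its estimated probability and consequence, the decision taken,
# and the backlog story (if any) implementing that decision.
risks = [
    # (description, probability, consequence, decision, resulting story)
    ("production outage goes unnoticed", "medium", "delay",
     "contain", "Set up a monitoring system"),
    ("hard-to-diagnose bug in production", "high", "cost",
     "mitigate", "Add structured logging"),
    ("office coffee machine breaks down", "low", "delay",
     "ignore", None),
]

# Avoid/contain/mitigate decisions become backlog stories; ignored risks do not.
backlog_additions = [story for *_, decision, story in risks
                     if decision != "ignore" and story is not None]
print(backlog_additions)
```

Re-running the estimation of the previous section over the backlog plus `backlog_additions` gives the number of sprints you can actually commit to.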


Contrary to widespread belief, agile practices and metrics actually help in making commitments.

It would be better if we had project-specific statistics instead of generic risk multipliers. It’s a shame that task tracking tools (JIRA & friends) still don’t help us with this.

We should keep in mind, though, that estimating your whole backlog in advance takes some time and is itself a kind of waste. If possible, simply sticking to building (and selling) whatever is most useful now is simpler (this guy calls it the drunken stumble).


Performance Is a Feature

Now that is a widespread title for blog articles! Just search Google, and you’ll find “Performance is a feature” on Coding Horror and elsewhere.

What’s in it for us ?

If performance is indeed a feature, then it can be managed like any other feature:

  • It should result from use cases:

    During use case X, the user should not wait more than Y seconds for Z

  • It can be split into user stories:
    • Story 1: During use case X, the user should not wait more than 2*Y seconds for Z
    • Story 2: During use case X, the user should not wait more than Y seconds for Z
  • These stories can be prioritized against other stories:
    • “Let’s forget about performance for now and deliver functionality A as soon as it is ready; we’ll speed things up later.”
    • “Let’s fix basic performance constraints for use case X now; every later story will have to comply with them.”
  • The performance of these use cases should be covered by automated regression tests:
    • If we slow things down too much and these tests break, we’ll have to optimize the code.
    • But as long as we don’t break the tests, it’s OK to de-optimize the code!
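Such a performance constraint can be guarded by an ordinary automated test. Here is a minimal Python sketch, where `process_order` and the 0.5-second budget are assumptions standing in for the real use case X:

```python
import time

def process_order(order_id):
    # Stand-in for the real use case X (hypothetical implementation)
    time.sleep(0.01)
    return order_id

def test_use_case_x_within_budget():
    # Story: "During use case X, the user should not wait more than
    # Y seconds for Z" -- here Y = 0.5s, an assumed budget
    start = time.perf_counter()
    process_order(42)
    elapsed = time.perf_counter() - start
    assert elapsed < 0.5, f"use case X took {elapsed:.2f}s, budget is 0.5s"
```

As long as this test stays green, the team is free to keep the code simple; when it breaks, optimizing becomes the next story.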

Maybe that’s a chance to put an end to performance-related gut-feeling quarrels!