How to Subscribe to an ActionCable Channel on a Specific Page With Custom Data?

In my spare time, I’m writing a Planning Poker App. As a reminder, planning poker is a group estimation technique designed to eliminate influence bias. Participants keep their estimates secret until everyone unveils them at the same time (see Wikipedia for more details).

The driving idea behind my app is for team members to connect together and share a view of the current vote happening in their team. Each team has an animator, who is responsible for starting new votes. This is the aspect I’ve been working on during the last few days. I want all team members to be notified when a new vote starts, by displaying a countdown on their page.

I am building the app with Rails 5, but I did not have a clear idea of which technology to use to build this feature. After some googling, I found that ActionCable provides just the kind of broadcasting I am looking for (have a look at the ActionCable Rails guide for more details).

A Specific Page

The Rails guide is pretty clear, as usual I would say, but all the examples show subscriptions happening on every page load. As explained above, I only want participants to subscribe to their own team’s votes: until they have joined a team, it is not possible to subscribe to a particular channel.

As the app currently behaves, once identified, participants get to a specific team page. I wanted to use this page as the starting point for my subscription. After some more googling about page-specific JavaScript in Rails, I found this page from Brandon Hilkert that explains how to do this cleanly. The idea is to add the controller and action names to the body tag, and to filter JavaScript code at page load. This is what I ended up doing:

First, I adapted the app layout to keep track of the controller and action names in the HTML body:

<!-- app/views/layouts/application.html.erb -->
<!DOCTYPE html>
<html>
  ...
  <body class="<%= controller_name %> <%= action_name %>">
    ...
  </body>
</html>

Then I replaced the default channel subscription with a function:

# app/assets/javascripts/channels/team.coffee
window.App.Channels ||= {}
window.App.Channels.Team ||= {}

App.Channels.Team.subscribe = ->
  App.cable.subscriptions.create "TeamChannel",
    received: (data) ->
      # Do something with this data

As a reminder, here is what the server-side channel would look like:

# app/channels/team_channel.rb
class TeamChannel < ApplicationCable::Channel
  def subscribed
    stream_from "team_channel"
  end
end

Finally, I called this subscribe function from some page-specific JavaScript:

# app/assets/javascripts/team_members.coffee
$(document).on "turbolinks:load", ->
  return unless $(".team_members.show").length > 0

  App.Channels.Team.subscribe()

That’s it. By playing around in your browser’s JS console, you should be able to test it.

Custom Data

That’s just half of the story. The code above subscribes on a specific page, but it does not specify any particular team channel to subscribe to. This means that all participants would receive notifications from all teams!

In his article about unobtrusive JavaScript in Rails, Brandon Hilkert also suggests using HTML data attributes to pass parameters to a JavaScript button event handler. There’s no button in our case, but we can still use the same technique. Let’s add specific data attributes to the HTML body.

To subscribe to a specific team channel, the plan is to add the team name to the HTML body tag through a data attribute, then to capture and use this team name when subscribing.

Again, let’s enhance the layout:

<!-- app/views/layouts/application.html.erb -->
<!DOCTYPE html>
<html>
  ...
  <body class="<%= controller_name %> <%= action_name %>" <%= yield :extra_body_attributes %> >
    ...
  </body>
</html>

I had to adapt my views. In the team members show view (the one doing the subscription), I added an extra data attribute for the team name:

<!-- app/views/team_members/show.html.erb -->
<% provide(:extra_body_attributes, raw("data-team-name=\"#{@team.name}\"")) %>

...

With this done, it is possible to capture the team name from the page load event and feed it to the subscribe method:

# app/assets/javascripts/team_members.coffee
$(document).on "turbolinks:load", ->
  return unless $(".team_members.show").length > 0

  App.Channels.Team.subscribe($('body').data('team-name'))

I then used the team name to subscribe to a specific channel:

# app/assets/javascripts/channels/team.coffee
window.App.Channels ||= {}
window.App.Channels.Team ||= {}

App.Channels.Team.subscribe = (teamName) ->
  App.cable.subscriptions.create {channel: "TeamChannel", team_name: teamName},
    received: (data) ->
      # Do something with this data

The last piece is to actually stream from a team-specific channel on the server:

# app/channels/team_channel.rb
class TeamChannel < ApplicationCable::Channel
  def subscribed
    stream_from "team_channel_#{params[:team_name]}"
  end
end

Same as before, by hacking a bit in your browser’s console, you should be able to check that it’s working.
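
You can also trigger a broadcast from the server side to see the received callback fire. Here is a minimal sketch to run from a Rails console; the team name and payload are hypothetical, and assume a client is subscribed with that same team name:

# Run from `rails console`: every client subscribed to this team’s
# channel should receive the payload in its `received` callback.
ActionCable.server.broadcast "team_channel_demo", countdown: 30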

Last thoughts

This is not exhaustive; depending on your situation, there might be other things you’ll need to handle, like unsubscriptions for example.
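
On the server side, for instance, the channel can clean up its streams when a client goes away. Here is a minimal sketch; the unsubscribed callback and stop_all_streams are standard ActionCable, the rest mirrors the channel above:

# app/channels/team_channel.rb
class TeamChannel < ApplicationCable::Channel
  def subscribed
    stream_from "team_channel_#{params[:team_name]}"
  end

  # Called when the subscription ends or the connection is lost
  def unsubscribed
    stop_all_streams
  end
end

On the client side, App.cable.subscriptions.create returns a subscription object; keeping a reference to it would let you call unsubscribe() on it when participants leave their team page.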

I’d also like to give a word of feedback about ActionCable after this first look at it. Overall, it worked great both in development and in production. Everything seemed to work almost out of the box … except testing: I did not manage to write robust unit tests around it. There is a pull request for that which should be merged into a future Rails 5 release sometime soon. For the moment, I’m sticking to large-scale Cucumber tests.

How I Finally Use Docker on Small Open Source Side Projects

A few months ago, I started Philou’s Planning Poker, an open source side project to run planning poker estimate sessions remotely. The main technology is Rails, and I’d been planning to use Docker as much as possible, as a way to learn it. Indeed, I learned that Docker is no silver bullet!

The Docker love phase

At first, everything seemed great about Docker. I’d used it on toy projects, and it proved great for quickly setting up cheap and fast virtual machines. I even created the Rubybox project on GitHub to clone new Ruby VMs in a matter of seconds. I also used Docker to host my Octopress environment to write this blog. As a long-time Linux user, my dev machines have repeatedly suffered from pollution: after some time, they get plagued with all the stuff I installed for my various dev experiments, and at some point, re-installing seems easier than cleaning up all the mess. If I could use containers for all my projects, Docker would be a cure for this.

Buoyed by all these successes, when I started my planning poker app, I decided to go all in on Docker: development, CI and deployment. You can read the log of how I did that in these posts. Fast-forward through a bit of searching, experimenting and deploying, and everything was set up: my dev env was in containers, my CI was running in containers on CircleCI, and the app was pushed to containers on DigitalOcean.

Reality strikes back

At first, everything seemed to be working fine, even if there were a few glitches that I would have to fix down the road:

  • Whenever I wanted to update my app’s dependencies, I had to run bundle update twice, and not incrementally. Surely, I would manage to fix that with a bit of time.
  • Obviously, the CI was slower, because it had to build the containers before deploying them to Docker Hub, but that was the price to pay in order to know exactly what was running on the server … right?
  • And … Guard notifications did not appear on my desktop. I was accessing my dev env through ssh, so I would have to fix that; just a few hours and it should be working.

After a while, I got used to my work environment and became almost as productive as I used to be … but you know, shit happens!

  • I had to install PhantomJS on my CI, and while that comes out of the box on TravisCI, you’re all alone in your own containers. Installing it on the Debian container proved unnecessarily complex, but I figured it out.
  • Then all of a sudden, my CI started to break … You can read a summary of what I did to fix it here. Long story short: I had forgotten to clean up old Docker images, and after enough deployments, the server ran out of space, which somehow corrupted the Docker cache. I eventually re-installed and upgraded the deployment VM. That made me lose quite some time though.
  • Finally, as I started to play with ActionCable, I could not get the WebSocket notifications through my dev host. There must be some settings and configuration to make this work, for sure, but it’s supposed to work out of the box.

Eventually, this last issue convinced me to change my setup. All these usages of Docker were definitely worth it from a learning point of view, but as my focus moved to actually building the app, it was time to make pragmatic decisions.

My use of Docker now

There were 2 main ideas driving the changes to my dev env for this open source side project:

  1. Use the thing most people do
  2. Use commercially supported services & tools

Both should keep me from losing time that could go into being productive. My setup is now almost boring! To summarize, I now use TravisCI, Heroku, and rbenv on my physical machine. I kept Docker where it really shines: all the local servers required for development are managed by Docker Compose. Here is my docker-compose.yml:

db:
  image: postgres:9.4.5
  volumes:
    - planning-poker-postgres:/var/lib/postgresql/data
  ports:
    - "5432:5432"

redis:
  image: redis:3.2-alpine
  volumes:
    - planning-poker-redis:/var/lib/redis/data
  ports:
    - "6379:6379"

This saves me from installing PostgreSQL or Redis on my dev machine, and I can start all the services required for the app with a single docker-compose up command!

My future uses of Docker

More generally, in the near future, here is when I’ll use Docker:

  • As I just said, to manage local servers
  • To boot quick and cheap VMs (check rubybox)
  • To handle CI and deployment of large or non-standard systems, where Docker can provide a lot of benefits in terms of price, scaling or configurability

Docker came from the deployment world, and this is where it is so great. As of today though, even if it is usable as a dev VM, it is still not up to a standard dev machine. Despite that, all the issues I ran into could be fixed, and I’m pretty sure they will be some day.

Developer! Are You Losing Your Rat Race?

A rat race is an endless, self-defeating, or pointless pursuit. It conjures up the image of the futile efforts of a lab rat trying to escape while running around a maze or in a wheel.

Are we building our own self-defeating maze through our exacerbated focus on technology? Let me explain.

The context

As Marc Andreessen famously said, “Software is eating the world”, which means that there is more and more demand for software. At the same time, giant countries like China, India, Russia or Brazil are producing more and more master’s degrees every year, which also means more and more software engineers. The consequence is that there have never been so many new technologies emerging as there are today. The software landscape is huge, growing and complex.

That’s great for progress, but it’s a puzzle for hiring. In this chaotic environment, years of experience with a particular technology is something that remains easy to measure, which is why employers (and developers) tend to use keywords to cast for a job.

The effects

As a result, developers tend to pick a few technologies to master, put them on their CV and get job offers. There’s a danger with specializing in a particular technology: eventually, it will become deprecated, and in this keyword-driven world, it’s almost as if you had to start from zero again. Even if a specialization is wide enough now, as time goes on and more and more technologies are created, any area of expertise will become a tiny spot in the whole landscape. One might think this is only an issue for old guys who did not stay up to date … I strongly believe this is wrong: it happened to all past technologies, and I don’t see why today’s latest .js framework wouldn’t be legacy stuff one day.

One could think that sticking to a good employer is a good fix for that. It is … for some time! Sticking to a company actually means betting on this company. What would happen if it went out of business, or through difficult times, and you were asked to leave? When you reach the job market after so long with a single employer, you’ll be a de-facto specialist, in proprietary stuff that no one is interested in.

Finally, you might work hard not to specialize, but it’s going to be a lot more difficult to get a job as a generalist; only a few shops actually hire this way.

To summarize, we are forced into specialization, which is great in the short term, but risky in the long run.

1€ advice

So what can we do about this? Obviously, we cannot change the world … The only ones we can act on are ourselves!

Learning

In our fast-moving tech world, learning remains key! But instead of trying to keep up with all the cool new techs that are invented every day, we should study fundamental skills, and only learn just enough specific skills to get the job done. To me, fundamental skills are all the things you’ll apply whatever the language and technology you are using, for example:

  • design
  • architecture (whatever that is …)
  • clean code
  • refactoring
  • legacy code
  • testing
  • tooling
  • mentoring & coaching
  • programming paradigms (functional, dynamic, static, imperative, OO, concurrent …)
  • process flow
  • communication
  • product definition
  • concurrency
  • performance

I wrote this post that explains how I learned some of these skills (by no means would I say that this is the only way). Good mastery of these skills should be enough to quickly get up to speed in any project you are involved in. This other article, In 2017, learn every language, which I found through the excellent hackernewsletter, explains how this is possible.

Unfortunately, knowing is not enough …

Selling

How do you convince others that you are up to the job in a particular technology? Unfortunately, I don’t have a definitive answer yet …

Regularly, people try to coin a word to describe the competent generalist developer: polyglot, full stack, craftsman … If it’s good enough, it usually gets taken over quite fast by the industry and just becomes yet another buzzword (the only exception being eXtreme Programming, but who would like to hire an eXtreme Programmer?).

In Soft Skills, John Sonmez says the trick is to explain to people that you might not have experience in a technology ‘yet’. This might work, if your resume gets through, which is not guaranteed.

Here’s my try: the next time I polish my resume, I’ll try to put my fundamental skills forward first, for example with 5-star self-assessments. Only then will I add something like “By the way, I could work with tech X, Y, Z …”.

Independence

Being your own boss could be a solution in the long term. I recently listened to The End of Jobs, in which the author explains that entrepreneurship is an accessible alternative these days, and that, like any skill, it’s learnable. The catch is that there are no schools, no diplomas, and it seems a lot riskier in the short run. Despite that, he makes the point that the skills you’ll learn make it quite safe in the long run!

Questions

I feel like my post asks more questions than it provides answers :–). Honestly, I’d really love to read other people’s opinions and ideas. What are your tricks to market yourself on new technologies? As a community, what could we do to fight our planned obsolescence? Do you think I’m totally wrong and that the problem does not exist? What do you think?

How I Fixed ‘Devicemapper’ Error When Deploying My Docker App

A few months ago, I started continuously deploying my latest side project to a DigitalOcean box. If you are interested, here is the full story of how I did it. All was going pretty well until last week, when the builds unexpectedly started to fail. I wasn’t getting the same error at every build, but it was always the Docker deployment that failed. Here are the kinds of errors I got:

# At first, it could not connect to the db container
PG::ConnectionBad: could not translate host name "db" to address: Name or service not known

# Then I started to have weird EOF errors
docker stderr: failed to register layer: ApplyLayer exit status 1 stdout:  stderr: unexpected EOF

# Eventually, I got some devicemapper errors
docker stderr: failed to register layer: devicemapper: Error running deviceCreate (createSnapDevice) dm_task_run failed

You can read the full error logs here.

That’s what happens when you go cheap!

After searching the internet a bit, I found this issue, which made me understand that my server had run out of disk space because of old versions of my Docker images. I tried to remove them, but the commands were failing. After some more searching, I found this other issue and came to the conclusion that there was no solution except resetting Docker completely. Fortunately, DigitalOcean has a button for rebuilding the VM.

Once the VM was rebuilt, the first thing I did was try to connect from a shell on my local machine. I had to clean up my known hosts file, but that was simple enough.

nano ~/.ssh/known_hosts

Once this was done, I just followed the steps I had documented in my previous blog post.

Was I all done?

Almost … I ran into another kind of error this time. Processes kept getting killed on my VM.

INFO [cc536697] Running /usr/bin/env docker-compose -f docker-compose.production.yml run app bundle exec rake db:migrate as root@104.131.47.10
rake aborted!
SSHKit::Runner::ExecuteError: Exception while executing as root@104.131.47.10: docker-compose exit status: 137
docker-compose stdout: Nothing written
docker-compose stderr: Starting root_db_1
bash: line 1: 18576 Killed

After some more Google searching, I discovered that this time, the VM was running out of memory! The quick fix was to upgrade the VM (at an extra cost of $5/month).

After increasing the memory (and disk space) of the VM, deployment worked like a charm. Others have fixed the same issue for free by adding a swap partition to the VM.

The end of the story

I wasted quite some time on this, but it taught me some lessons:

  1. I should have taken care of cleaning up the old images and containers, at least manually, at best automatically
  2. I should write a script to provision a new server
  3. The cheap options always come at a cost
  4. For an open source side project like this one, it might be a better strategy to only use Docker to set up my dev env, and use free services like Travis-ci and Heroku for production
  5. Doing everything myself is not a good recipe for getting things done … it is well past time I swap my developer hat for an entrepreneur’s cap
  6. In order to keep learning and experimenting, focused 20h sessions of deliberate practice might be the most time-effective solution

5 Minutes Hack to Speed Up RSpec in Rails 5 Using In-memory SQLite

Here is the story: you have a Rails 5 app that uses RSpec, but your RSpec suite is getting slower and slower to run. You’ve already considered some solutions:

  • Use SQLite in memory for your test env.
# config/database.yml
test:
  adapter: sqlite3
  database: ":memory:"

That’s the most straightforward thing to do, but unfortunately, if you are sharing your test env with Cucumber, you might want to use a production-like DB with Cucumber (PostgreSQL or whatever). So unless you are ready to set up a new env for Cucumber (which I tried and don’t recommend), you’re stuck.

  • Use mocks. That’s surely going to work, and it’s going to make your tests a hell of a lot faster! It will also make your tests a lot more fragile and more expensive to maintain … If you want to read more about why I think mocks are a bad idea, just have a look at these posts.

The hack

Here is a third alternative. I’ve already written about it, but here it comes, updated and tested for Rails 5:

  1. Don’t change anything in your config/database.yml
  2. Obviously, you’ll need to add sqlite3 to your Gemfile
  3. At the beginning of your spec/rails_helper.rb, replace
# Checks for pending migration and applies them before tests are run.
# If you are not using ActiveRecord, you can remove this line.
ActiveRecord::Migration.maintain_test_schema!

with

# In order to keep the same RAILS_ENV for rspec and cucumber,
# patch the connection to use sqlite in memory when running rspec
ActiveRecord::Base.establish_connection(adapter: 'sqlite3', database: ':memory:')
ActiveRecord::Schema.verbose = false
load "#{Rails.root.to_s}/db/schema.rb"

That’s it! Run your specs … not bad for a 5-minute investment!

One more thing …

If you need even more speed, you can now run your specs in parallel in different processes! Each in-memory SQLite DB is bound to its process, so unlike with a real PostgreSQL dev DB, you won’t get any conflicts between your tests ;–)
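
To make the idea concrete, here is a rough sketch of manual parallelization. It is my own illustration (not from a library), assumes the Rails environment is already loaded (e.g. via rails runner), and splits the spec files into 4 arbitrary groups:

# Each forked process opens its own private in-memory SQLite DB,
# so the groups cannot step on each other's data.
require 'rspec/core'

spec_files = Dir["spec/**/*_spec.rb"]
groups = spec_files.each_slice([(spec_files.size / 4.0).ceil, 1].max).to_a

pids = groups.map do |files|
  fork do
    ActiveRecord::Base.establish_connection(adapter: 'sqlite3', database: ':memory:')
    ActiveRecord::Schema.verbose = false
    load "#{Rails.root.to_s}/db/schema.rb"
    exit RSpec::Core::Runner.run(files)
  end
end
pids.each { |pid| Process.wait(pid) }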

A Plain English Introduction to Paxos Protocol

A few weeks ago, I had to have a look at the distributed consensus protocol Paxos. Even though I know its purpose, and I’ve built and used distributed systems and databases in the past, Paxos remains mind-boggling at first!

The hard way

The best overall description I found is this answer by Vineet Gupta on Quora. After turning it around in my head for a while, I finally gained the instinctive understanding that comes when you ‘get’ something.

As a way both to help others understand Paxos faster and to burn all this into my own memory, I thought it would be a good idea to illustrate it as a story (I was inspired by A plain English introduction to CAP Theorem, which I found really instructive; I also later discovered that the original Paxos paper itself related the protocol using the metaphor of a parliament).

Once upon a time …

… there were 3 siblings, Kath, Joe & Tom, living happily. They lived far away from each other, and it was not easy for them to meet and spend some time together. Nor did they have phone or internet, for this was a long time ago. All they had to discuss and share news was good old mail …

Unfortunately, one day, the worst happened: their parents died. All 3 were informed by a letter from the notary, telling them that they needed to sell the family house in order to pay off their inherited debts. It also advised them to use Paxos to agree on a price (note: I never said the story was going to be chronologically sound!).

The happy end

As the oldest in the family, Kath decides to take things in hand and starts the whole process. She knows Paxos consists of 2 phases: ‘prepare’ and ‘accept’.

Prepare Phase

Kath sends a signed and dated price proposal to her brothers, by mail.

Joe and Tom both receive the letter from Kath, and they think the price is fair. In order to send their agreement back to Kath, they make a copy of the proposition, mark it as agreed, date it, sign it, and send it back.

Accept Phase

Joe lives a bit further away from Kath than Tom does, so correspondence between Kath and Tom is usually faster. Kath indeed receives the agreement from Tom first. She knows she can go on with the protocol straight away, because Paxos relies on a majority, not on unanimity. In his letter, Tom agreed to the same price she proposed, so she just picks this one as the final price to agree on.

She sends new letters, called accept letters this time, to her brothers to finalize the agreement. In these letters, she specifies the price that they are agreeing on, plus the date at which it was first suggested (see Prepare Phase). When Tom and Joe receive the accept letter, they simply need to check the time and price of the proposal to make sure it is what they agreed on, before sending back their final accept letter.

By the time Kath receives the accept letters from her brothers, everyone knows that the price has been agreed.


After

She then informs the notary of the agreed price, and the notary sends an information letter to Kath, Tom & Joe. The house is sold pretty quickly, leaving the family free of financial problems for the rest of their lives …

Shit happens

That story went unexpectedly well! Let’s look at some variations of what would happen in real life.

Joe is particularly slow to answer

Joe has never been good at paperwork … he’s always out partying and having fun, and he does not want to bother answering letters. When Joe receives the prepare letter from Kath, he does not reply straight away but leaves it on his desk to handle later. Meanwhile, Tom answers as soon as he gets the letter. As mentioned before, Paxos relies on a majority: as soon as Kath gets Tom’s answer, she can continue to the next phase. In fact, the accept phase also relies on a majority, so she can continue to the end of the protocol if Tom keeps answering.

In this case, Joe would receive the accept letter before he sent his answer to the prepare letter, and would know that the consensus is moving on without him. He can try to catch up or not, but the consensus can be reached without him.

Tom wants to speed things up by becoming the master

Tom has always been the hurried brother. He does not like it when things linger forever, but prefers things to be done quickly. As soon as he receives the letter from the notary, he starts waiting impatiently for the prepare letter from his sister. Kath, on her part, takes a lot of time to settle on a price. Not knowing what is going on, Tom decides to take action and takes on the master role: he sends his own copies of the prepare letters. While these letters are in the mail, Kath finally settles on a price, and sends hers.

Joe gets Tom’s proposal first. Thinking that it’s a change in the plan, he responds straight away by signing the proposal and keeping a copy for himself. The following day, he receives Kath’s proposal! He’s a bit surprised, but luckily, Paxos tells him exactly what to do in this situation. By agreeing to Tom’s proposal, he made a promise to stick to it whatever happens later. Here, the date on Kath’s proposal is later than the one on Tom’s, so Joe is going to answer Kath that he agrees, but to Tom’s proposed price, of which he encloses a copy.

After receiving Joe’s agreement on his proposal, Tom has the majority, and should be able to end the protocol.

What about Kath?

She should have received Tom’s proposal, and rejected it, because she had already proposed a later value. That will not prevent Tom from reaching a consensus.

She should have received Joe’s agreement to Tom’s proposal. In the same way, she might as well have received Tom’s agreement to his own proposal as an answer to hers. She’d get a majority of agreements, so she might then want to push on. For the accept letter, she must pick a value that has already been accepted; in this case, it’s Tom’s proposed value! Everything ends as expected, as she’ll reach the same price as Tom.

Tom wants a higher price and becomes the master

Imagine Tom is obsessed with money! When he receives Kath’s proposal, he’s outraged! Believing the house has a lot more value than the proposed price, he sets out to act as a master in Paxos and sends his own proposal letters to his brother and sister.

Unfortunately, when they receive his proposal, they have already agreed to Kath’s older proposal, so they send him back a copy of it as an agreement. Having received agreements to Kath’s value only, he cannot push forward his own value. Whether he continues his Paxos round or not does not really matter, as he would reach the same value as Kath would.

A river flood splits the brothers from Kath

There’s a wide river that separates Kath from Joe and Tom. While they were trying to reach consensus, the river flooded, cutting all communication between the brothers and their sister. Kath might abort the consensus, as she won’t be able to get answers from a majority. On their side, Joe or Tom can take over the consensus, take on the master role, and still reach a price, as together they form a majority. As soon as the river settles, the messages will arrive at both sides, eventually informing Kath that a price was accepted.

Lots of others

You can imagine zillions of ways in which the consensus between Kath, Joe and Tom could go wrong. For example:

  • Mail is so slow that Kath sends new proposals
  • One letter gets lost and arrives after Kath made a new proposal
  • Kath is struck by lightning

Go ahead and execute Paxos step by step on all of them; you’ll see that whatever happens, Kath, Joe and Tom will reach a price.

More Formally

Now that you have an instinctive understanding of Paxos, I encourage you to read the full explanation I found on Quora. Here is an extract with the protocol part:

Protocol Steps:

1) Prepare Phase:

  • A node chooses to become the Leader and selects a sequence number x and value v to create a proposal P1(x, v). It sends this proposal to the acceptors and waits till a majority responds.

  • An Acceptor on receiving the proposal P1(x, v1) does the following:

    • If this is the first proposal to which the Acceptor is going to agree, reply ‘agree’ – this is now a promise that the Acceptor would reject all future proposal requests < x
    • If there are already proposals to which the Acceptor has agreed: compare x to the highest seq number proposal it has already agreed to, say P2(y, v2)
      • If x < y, reply ‘reject’ along with y
      • If x > y, reply ‘agree’ along with P2(y, v2)

2) Accept Phase

  • If a majority of Acceptors fail to reply or reply ‘reject’, the Leader abandons the proposal and may start again.

  • If a majority of Acceptors reply ‘agree’, the Leader will also receive the values of proposals they have already accepted. The Leader picks any of these values (or if no values have been accepted yet, uses its own) and sends a ‘accept request’ message with the proposal number and value.

  • When an Acceptor receives a ‘accept request’ message, it sends an ‘accept’ only if the following two conditions are met, otherwise it sends a ‘reject’:

    • Value is same as any of the previously accepted proposals
    • Seq number is the highest proposal number the Acceptor has agreed to
  • If the Leader does not receive an ‘accept’ message from a majority, abandon the proposal and start again. However if the Leader does receive an ‘accept’ from a majority, the protocol can be considered terminated. As an optimization, the Leader may send ‘commit’ to the other nodes.
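
To make these rules concrete, here is a minimal Ruby sketch of the acceptor side. It is my own illustration, not code from the Quora answer, and it glosses over the accept-phase value check:

# A bare-bones Paxos acceptor following the quoted rules.
Proposal = Struct.new(:number, :value)

class Acceptor
  def initialize
    @promised = nil # highest-numbered proposal agreed to so far
    @accepted = nil # proposal accepted in an accept phase, if any
  end

  # Prepare phase: agree unless a higher number was already promised;
  # when agreeing, also send back any previously accepted proposal.
  def prepare(proposal)
    if @promised.nil? || proposal.number > @promised.number
      @promised = proposal
      [:agree, @accepted]
    else
      [:reject, @promised.number]
    end
  end

  # Accept phase: accept only a proposal matching the latest promise.
  def accept(proposal)
    if @promised && proposal.number == @promised.number
      @accepted = proposal
      :accept
    else
      :reject
    end
  end
end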

And here are the key concepts to map my story to this formal description of Paxos.

  • proposal letter (and copy of) → P(x, v)
  • date (and time) → sequence number

At the time of slow mail-based communication, using the date and time down to the second is enough to build unique sequence numbers. In our current world of digital messages, it’s another story: a typical Paxos implementation assigns a different, disjoint, infinite set of integers to every participant. It does not exactly follow ‘time’, but it’s enough for the algorithm to work.
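
For example, a common scheme (my illustration, not from the original paper) hands each of the N participants the integers equal to its id modulo N:

# With 3 participants: participant 0 draws 0, 3, 6, …,
# participant 1 draws 1, 4, 7, …, and so on.
def next_sequence_number(participant_id, round, participant_count)
  participant_id + round * participant_count
end

next_sequence_number(1, 0, 3) # => 1
next_sequence_number(1, 1, 3) # => 4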

What Happens to Non-Enthusiast Programmers in the Long Run?

A few months ago, after receiving good feedback from my regular readers, I posted my latest article, Is There Any Room for the Not-Passionate Developer?, on Hacker News and Reddit. I got a huge number of visits, a lot more than I typically get!

I also got a lot more comments: some nice, some tough, some agreeable and some challenging!

First, a summary

In this previous article, I wanted to contrast the different views about work/life balance in the software industry.

Some, like agile gurus, companies like Basecamp, and various studies, strongly advocate for sane work hours. They explain that these result in greater productivity and a healthy life.

On the other hand, the software field is always bubbling with novelty, and keeping up to date with technologies is by itself a challenge that takes time. For some companies, which might already be fighting for their survival against competition, it is almost impossible to grant extra training time to their employees. The problem becomes particularly difficult when engineers get older, become parents and cannot afford to spend extra time learning the latest JavaScript framework.

As a conclusion, I said that for most of us, it’s really difficult to remain a developer in the long run without the grit that only passion for programming brings. I encourage you to read it for more details.

What I learned from the comments

First of all, thanks a lot for all of these; they were very valuable and forced me to think even more about the issue.

People have been burnt!

The word ‘passion’, in particular, triggered engaged comments. As some pointed out, ‘enthusiast’ or ‘professional’ should be favored. It seems that some companies have asked their employees for unquestionable passion for their business (and not for engineering or programming), at the cost of those people’s own lives. As a commenter said, a lot of shops do not integrate the absolute necessity for their programmers to learn continuously into their business model. It made me kind of sad to feel once more this state of our industry.

As a result, people are wary of any mention of ‘passion’ in the workplace, and would prefer to be seen as very skilled professionals, dedicated to keeping their skills up to date.

The particular question of France

I received comments from all over the world, but my observations came from where I work: France. Here, all in all, we have at least 7 weeks of paid leave per year. That’s a lot more than in other parts of the world; I think it’s around 2 weeks in the US (other sources point to the same fact). Imagine two companies, one from France and one from the US. The one in the US can invest 5 weeks per year in exploratory learning (which can result in good things for both the business and the employee) while still producing as much as the French one.

Obviously, there are other parameters to take into account for overall productivity, like hours per day, the effects of holidays or long hours on creativity, or funding … but here are some facts about software engineering in France:

  • 20% time policies, hackathons and other exploratory learning are extremely rare (I’ve seen it once in 15 years)
  • It’s slowly getting better, but if you remain a programmer in your thirties, you’re seen as a loser
  • France has no software powerhouse like Microsoft, Google, Apple …

This leads me to an open question: what is the effect of these 7 weeks of paid leave on the French software industry?

By no means will I try to give an answer, I just don’t know. Plus, for those who might be wondering: I love my 7 weeks of holidays!

The conclusion I came to

Yet, I can try to draw a conclusion at the individual level. In France, if you’re not really enthusiastic about programming, you won’t put in the extra off-the-job effort to learn the latest technologies. Within a few years, you’ll be ‘deprecated’, which will leave you with mainly 2 options:

  • become a manager
  • stick to your current codebase (and become completely dependent on your employer)

To me, the sad truth is that if you want to make a career as a professional developer in France, you’d better be ready to spend some of your free time practicing!

Verify the Big O Complexity of Ruby Code in RSpec

It might be possible to discover performance regressions before running your long, large-scale benchmarks!

complexity_assert is an RSpec library that determines and checks the big O complexity of a piece of code. Once you’ve identified the performance-critical sections of your system, you can use it to verify that they perform with the complexity you expect.

How does it work?

The gem itself is the result of an experiment to learn machine learning in 20 hours (you can read more about that experiment in my previous post if you want).

Suppose you have a method, let’s call it match_products_with_orders(products, orders), which is called in one of your processes with very large arguments. Badly written, this method could be quadratic (O(n²)), which would lead to catastrophic performance in production. When coding it, you’ve taken particular care to make it perform in linear time. Unfortunately, it could easily slip back to a slower implementation through a bad refactoring … Using complexity_assert, you can make sure that this does not happen:

# An adapter class to fit the code to measure in complexity assert
class ProductsOrdersMatching

    # Generate some arguments of a particular size
    def generate_args(size)
        # Let's assume we have 10 times less products than orders
        [ Array.new(size / 10) { build_a_product() }, Array.new(size) { build_an_order() } ]
    end

    # Run the code on which we want to assert performance
    def run(products, orders)
        match_products_with_orders(products, orders)
    end
end

describe "Products and Orders Matching" do

    it "performs linearly" do
        # Verify that the code runs in time proportional to the size of its arguments
        expect(ProductsOrdersMatching.new).to be_linear()
    end

end

That’s it! If ever someone changes the code of match_products_with_orders and makes it perform worse than linearly, the assertion will fail! There are similar assertions to check for constant and quadratic execution times.
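
To make the risk concrete, here is a pair of hypothetical implementations (mine, not the gem’s); products and orders are assumed to be objects with id and product_id attributes:

# O(n²): scans the whole product list for every order
def match_products_with_orders_quadratic(products, orders)
  orders.map { |order| [order, products.find { |p| p.id == order.product_id }] }
end

# O(n): indexes the products by id first
def match_products_with_orders_linear(products, orders)
  products_by_id = products.map { |p| [p.id, p] }.to_h
  orders.map { |order| [order, products_by_id[order.product_id]] }
end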

Internally, the code will be called a number of times with different (smallish) sizes of arguments, and the execution times will be logged. When this is over, by trying different flavors of linear regression, it should determine whether the algorithm performs in O(1), O(n) or O(n²). Depending on your code, this can take time to run, but it should still be faster than running large-scale benchmarks.
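
In spirit, the detection looks something like this minimal sketch; this is my own simplification, and the gem’s actual code is more careful:

# Fit the timings against each candidate shape with a least-squares
# regression through the origin, and keep the shape that fits best.
def fit_error(xs, ys)
  slope = xs.zip(ys).sum { |x, y| x * y } / xs.sum { |x| x * x }.to_f
  xs.zip(ys).sum { |x, y| (y - slope * x)**2 }
end

def guess_complexity(sizes, timings)
  shapes = {
    'O(1)'  => sizes.map { 1.0 },
    'O(n)'  => sizes.map(&:to_f),
    'O(n²)' => sizes.map { |n| n.to_f**2 }
  }
  shapes.min_by { |_name, xs| fit_error(xs, timings) }.first
end

guess_complexity([100, 200, 400], [0.021, 0.039, 0.084]) # => "O(n)"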

Just check the README for more details.

Did you say experiment?

It all started as an experiment, so the gem itself is still experimental! It’s all fresh, and it could receive a lot of enhancements, like:

  • Allow the assertion to specify the sizes
  • Allow the assertion to specify the warm-up and run rounds
  • Robustness against garbage collection: use GC-intensive Ruby methods, and see how the regression behaves
  • Find ways to make the whole thing faster
  • O(ln x): pre-treat with exp()
  • O(? ln x): use exp, then a search for the coefficient (aka polynomial)
  • O(x ln x): there is no well-known inverse for that, but we can compute it numerically
  • Estimate how deterministic the assert is

As you see, there’s a lot of room for ideas and improvements.

How I Got My Feet Wet With Machine Learning With ‘the First 20 Hours’

I’m currently wrapping up an alpha of a unit-testing Ruby gem that lets you assert the complexity of a piece of code. It’s the result of an experiment to learn some Machine Learning skills in 20 hours … not bad for a first try at Data Science! This is the story of this experiment.

How did it all start?

A few months ago, I read The First 20 Hours. The book describes a technique to get up to speed and learn some practical skills on any particular subject in only 20 hours. As examples, the author details how he managed to teach himself a pretty decent level of Yoga, Ukulele, Wind Surfing, Programming, Go and touch typing.

I decided to give it a try. In order to get a boost, I found a few motivated people at work to do it with me. I started by presenting the technique described in the book to them, and asked everyone what they wanted to learn. After a quick vote, we set out to learn more about Machine Learning.

The technique

The method is meant to allow anyone to learn the skills necessary to accomplish a specific task in about 20 hours. In my case, I could expect to get a basic understanding of Machine Learning concepts, as well as some practical skills to do something involving Machine Learning. Here are the details of the technique:

  1. H0: deep dive into the main concepts and theory of machine learning
  2. H6: define an ambitious and practical goal or skill level to achieve by the end, and an outline of how to get there
  3. H6 to H20: learn by doing

As you see, the technique is pretty simple!

How did it work?

For the group

The plan for the group was:

  • to meet weekly for 2 hours
  • to share what we learned at the end of every session
  • to be bound by similar goals

At first, people were enthusiastic about learning machine learning. After a while, though, I started to get the following remarks:

  • “I don’t really see the point of doing this together rather than independently”
  • “I’m feeling a bit lost by not having a concrete goal and a plan from H0”
  • “I picked up a target that’s too large for me”

The learning curve must have proven too steep, because as time went by, a lot of people dropped out, and we ended up being only 2!

For me

The first phase was the toughest. As the author had warned in his book, “You’ll get deep above your head in theory and concepts you don’t know”, “You’ll feel lost”. He had some reassuring words though: “The steeper the learning curve, the more you’ll be learning!” I actually like this feeling of unknown things to learn, and that’s why I stuck with it.

It took me 8 hours, and not 6, to get a good overall grasp of Machine Learning techniques. The theory was just too wide and interesting, and I could not cut the learning short after just 6 hours :–). I studied Machine Learning for developers plus a few other pages for details on specific points. I took and kept notes about what I learned. I chose my subject, “unit testing algorithm complexity”, for the following reasons:

  • I could imagine some utility
  • I had been writing benchmarks at work for 3 years, and I knew the practice well enough
  • It’s pretty easy to generate data for this subject: just run your algorithm!
  • It seemed a good first step, doable with basic Machine Learning techniques like linear regression
  • It seemed small enough to get something working in 12 hours
  • I could use Ruby, which I find both fast and pleasant to program in

This is the plan I set out:

  1. Generate data with a linear algorithm (O(n))
  2. Run a linear regression on the data
  3. Compute the RMSE of the model
  4. Deal with Garbage Collection in order to reduce its noise
  5. Deal with interpreter warm-up for the same reason
  6. Generate data for a constant-time (O(1)) algorithm and build a model for it
  7. Find a way to identify whether an algorithm is constant or linear from its execution timings
  8. Generate data for a quadratic (O(n²)) algorithm and build a model for it
  9. Identify whether an algorithm is constant, linear or quadratic
  10. Package all this in an RSpec library

It started well, and I made good progress. Unfortunately, as people dropped out of the group and I got more urgent things to do at work, I had to pause the project for a while. It’s only last week that I got some time during my holidays to finish it off. I’m currently at H18, and I’ve completed all steps from 1 to 9.

As I said, the project is still in early alpha. There are a lot of points on which it could be improved (more complexities, faster, more reliable …). Even though I did not tackle the more advanced machine learning techniques, I now understand the overall process of ML: explore to get an intuitive grasp of the data, try out a model, see what happens, and repeat … I feel that learning these more advanced techniques would be easier now.

My opinion on the method

Overall, I found the experiment really effective; it’s possible to learn quite a lot by focusing on a subject for 20 hours. A word of warning though: you need to be really motivated and ready to stick through difficulties.

It’s also been very pleasant. I’ve always loved learning new things, so I might be a little biased on that aspect. I liked the first part, when I felt that there was a lot to learn in a large subject I knew almost nothing about. I loved the second part too, although this might be more related to machine learning itself, because I felt like I was exploring an unknown data set and trying to understand it.

I’ve never been afraid of learning something new, and doing this experiment taught me I can learn anything fast! I’ll definitely re-use the technique.

One last word about doing this in a group: my own experiment did not work very well. Most people were not comfortable with the first ‘explore’ phase. I guess one could make it work better by starting 6 or 8 hours before the rest of the group, enough to grasp the basic concepts and come up with a few end goals. Having concrete targets from day 1 should help people stick through and collaborate. The ‘guide’ could also help the others through the first phase.

Stay tuned, I’ll present my gem in a following post.

Overclocking a Scrum Team to 12

From Wikipedia:

Overclocking is configuration of computer hardware components to operate faster than certified by the original manufacturer …

It is said that Scrum teams work best at 7 people, and that they break at about 10. The trouble is that sometimes there is just too much work for 7 people, but not enough for a full Scrum of Scrums. What if there was a simple way to hack this number up to 12?

An Idea

The Surgical Team

In his classic The Mythical Man-Month, Fred Brooks presents an idea to organize software development the way surgeons work. The master performs the surgery while the rest of his team (an intern or junior surgeon and the nurses) is there to support him. Fred Brooks imagined an organization where master developers would be the only ones with access to the production code, while other, more junior developers would have the task of providing them with tools and technical libraries.

I admit that this idea sounds out of fashion in contrast with modern agile teams of generalists … Still …

Tools

At work, we are working on a pretty technical and complex product, which requires some time getting into both the code and the domain. We have taken a few interns over the past years, and a bit like Fred Brooks, we came to the conclusion that internships yield more results when focused on building supporting tools rather than on joining the team and working on production code.

We’ve also been doing retrospectives for 3 years now; we’ve stolen a lot of best practices from the industry, and the team is working a lot better than it used to. The flip side is that nowadays, the opportunities for improvement that we discover are a lot more specific, and they often require us to take some time to build new tools to support our work.

The Agile Surgical Team

Agile methods such as Scrum or XP are all about creating real teams instead of collections of individuals. That means that if we wanted to adopt the surgical team idea, we could use teams instead of individuals: a team of experts, and a tooling team of apprentice developers!

Why not? There’s nothing really new here, but the challenge is to run such a tooling team efficiently!

  • 3 people or less: there’s evidence in the industry that micro teams can self-organize in an ad-hoc manner
  • Mandate ScrumBan, Continuous Delivery and DevOps: an on-site customer makes this possible; it should reduce project management overhead to almost nothing, and enforce quality
  • A sandbox for junior developers: there’s no risk of messing up production code here, the domain (tools for software developers) is straightforward, and the fast feedback provides a great environment for learning

Obviously, for this to work, you’ll also need to have enough tooling work for a 3-person team. That’s usually the case: the CI alone can take quite some time (see Jez Humble’s talk Why Agile Doesn’t Work), and any team will have its own custom tools to build. For example, in our team, we built our own benchmark framework, and we could benefit a lot from Saros on IntelliJ.

Not quite there yet

I promised to scale up to 12. Let’s do the maths:

  • 3 people in the tooling team
  • 8 people in the product team if we push Scrum a bit

That’s only 11; 1 is missing. This one is more specific to each team’s context.

As I said earlier, the product we are building is pretty technical and complex. Sometimes, we simply don’t know how we are going to do something: we need to try different ways before finding the good one. The typical agile way of doing that is by conducting time-boxed spikes. Spikes are fine for code- and design-related issues, but way too short to deal with hard R&D problems. These need a lot of uninterrupted time for research and experiments, so it’s not really possible to split them into backlog stories that anyone can work on either …

The R&D Role

Here is what you want: some uninterrupted time to learn and experiment with different ways of doing something difficult.

Here is what you don’t want:

  • specialists in the team
  • people out of sync with the daily production constraints
  • a never-ending ‘research’ topic

Here is a simple solution in the context I describe: add someone to the product team, and do a 2-month round-robin on hard subjects. This should leave plenty of time to study something different, but not so much time that one loses connection with the rest of the team. Plus, it brings a bit of diversity into everyone’s daily work. One issue I can think of is that working in isolation might leave someone on a bad track; regularly presenting what was done to the rest of the team might alleviate this concern.

A final word

Obviously, this has a smell of specialization; we’re bending Scrum principles a bit. So take it for what it is: just like overclocking, it’s a hack to get a bit of extra juice before investing in something a lot more expensive (Scrum of Scrums, LeSS or whatever).