sql-migrate slides

I recently gave a small lightning talk about sql-migrate (a SQL Schema migration tool for Go), at the Go developer room at FOSDEM.

Annotated slides can be found here.



Posted: February 9, 2016 17:14 Tags: go fosdem

Show me the way

If you need further proof that OpenStreetMap is a great project, here’s a very nice near real-time animation of the most recent edits: https://osmlab.github.io/show-me-the-way/


Seen today at FOSDEM, at the stand of the Humanitarian OpenStreetMap team which also deserves attention: https://hotosm.org


Posted: January 31, 2016 20:28 Tags: openstreetmap fosdem

Kubernetes from the ground up

I really loved reading Git from the bottom up when I was learning Git, which starts by showing how all the pieces fit together. Starting with the basics and gradually working towards the big picture is a great way to understand any complex piece of technology.

Recently I’ve been working with Kubernetes, a fantastic cluster manager. Like Git it is tremendously powerful, but the learning curve can be quite steep.

But there is hope. Kamal Marhubi has written a great series of articles that take the same approach: start from the basic building blocks, build with those.

Currently available:

Highly recommended.



Posted: November 20, 2015 20:31 Tags: kubernetes

Custom attributes in angular-gettext

Kristiyan Kostadinov recently submitted a very neat new feature for angular-gettext, which was just merged: support for custom attributes.

This feature allows you to mark additional attributes for extraction. This is very handy if you’re always adding translations for the same attributes over and over again.

For example, if you’re always doing this:

<input placeholder="{{ 'Input something here' | translate }}">

You can now mark placeholder as a translatable attribute. You’ll need to define your own directive to do the actual translation (an example is given in the documentation), but it’s now a one-line change in the options to make sure that placeholder gets recognized and hooked into the whole translation string cycle.

Your markup will then become:

<input placeholder="Input something here">

And it’ll still internationalize nicely. Sweet!

You can get this feature by updating your grunt-angular-gettext dependency to at least 2.1.3.

Full usage instructions can be found in the developer guide.


Posted: August 14, 2015 08:15 Tags: angular

Google Photos - Can I get out?


Google Photos came out a couple of days ago and well, it looks great.

But it raises the question: what happens with my photos once I hand them over? Should I want to move elsewhere, what are my options?

Question 1: Does it take good care of my photos?

Good news: if you choose to back up originals (the non-free version), everything you put in will come back out unmodified. I tested this with a couple of different file types: plain JPEGs, RAW files and movies.

Once uploaded, you can download each file one-by-one through the action buttons on the top-right of your screen:

Photo actions

Downloaded photos have matching checksums, so that’s positive. It does what it promises.

Update: not quite, see below

Question 2: Can I get my photos out?

As mentioned before there’s the download button. This gives you one photo at a time, which isn’t much of an option if you have a rather large library.

You can make a selection and download them as a zip file:

Bulk download

Only downside is that it doesn’t work. Once the selection is large enough, it silently fails.

There is another option, slightly more hidden:

Show in Google Drive

You can enable a magic “Google Photos” folder in the settings menu, which will then show up in Google Drive.

Combined with the desktop app, it allows you to sync back your collection to your machine.

I once again did my comparison test. See if you can spot the problem.

Original file:

$ ls -al _MG_1379.CR2 
-rwxr-xr-x@ 1 ruben  staff  16800206 Oct 10  2012 _MG_1379.CR2*
$ shasum -a 256 _MG_1379.CR2 
fbfb86dac6d24c6b25d931628d24b779f1bb95f9f93c99c5f8c95a8cd100e458  _MG_1379.CR2

File synced from Google Drive:

$ ls -al _MG_1379.CR2 
-rw-------  1 ruben  staff  1989894 May 30 18:38 _MG_1379.CR2
$ shasum -a 256 _MG_1379.CR2 
0769b7e68a092421c5b8176a9c098d4aa326dfae939518ad23d3d62d78d8979a  _MG_1379.CR2

My 16 MB RAW file has been compressed into something under 2 MB. That’s… bad.

Question 3: What about metadata?

Despite all the machine learning and computer vision technology, you’ll still want to label your events manually. There’s no way Google will know that “Trip to Thailand” should actually be labeled “Honeymoon”.

But once you do all that work, can you export the metadata?

As it stands, there doesn’t seem to be any way to do so. No API in sight (for now?).

Update: It’s supported in Google Takeout. But that’s still a manual (and painful) task. I’d love to be able to do continuous backups through an API.


The apps, the syncing, the sharing, it works really really well. But for now it seems to be a one-way story. If you use Google Photos, I highly recommend you keep a copy of your photos elsewhere. You might want them back one day.

What I’d really like to see:

  • A good API that allows access to all metadata. After all, it is my own data.
  • An explanation of why my RAW files were compressed. That’s exactly what you don’t want with RAW files.

Keeping an eye on it.


Posted: May 30, 2015 19:18

dupefinder - Removing duplicate files on different machines

Imagine you have an old and a new computer. You want to get rid of that old computer, but it still contains loads of files. Some of them are already on the new one, some aren’t. You want to get the ones that aren’t: those are the ones you want to copy before tossing the old machine out.

That was the problem I was faced with. Not willing to do the tedious task of comparing and merging files manually, I decided to write a small tool for it. Since it might be useful to others, I’ve made it open-source.

Introducing dupefinder

Here’s how it works:

  1. Use dupefinder to generate a catalog of all files on your new machine.
  2. Transfer this catalog to the old machine.
  3. Use dupefinder to detect and delete any known duplicates.
  4. Anything that remains on the old machine is unique and needs to be transferred to the new machine.
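
The catalog idea behind these steps can be sketched in a few lines. This is an illustration in JavaScript (Node.js), not dupefinder's actual Go code: the SHA-256 hash, the function names and the in-memory file maps are all assumptions made for the example.

```javascript
var crypto = require('crypto');

// Hash file contents. SHA-256 is an assumption; the real tool may use another hash.
function hashContents(contents) {
    return crypto.createHash('sha256').update(contents).digest('hex');
}

// Step 1: build a catalog (a set of content hashes) on the new machine.
function generateCatalog(files) {
    var catalog = new Set();
    Object.keys(files).forEach(function (name) {
        catalog.add(hashContents(files[name]));
    });
    return catalog;
}

// Step 3: on the old machine, list the files whose contents are already known.
function detectDuplicates(catalog, files) {
    return Object.keys(files).filter(function (name) {
        return catalog.has(hashContents(files[name]));
    });
}

// In-memory stand-ins for the two machines (file name -> contents).
var newMachine = { 'photos/a.jpg': 'AAA', 'docs/report.txt': 'BBB' };
var oldMachine = { 'backup/a-copy.jpg': 'AAA', 'notes.txt': 'CCC' };

var catalog = generateCatalog(newMachine);
console.log(detectDuplicates(catalog, oldMachine)); // [ 'backup/a-copy.jpg' ]
```

Because files are matched on content rather than on name or path, a renamed copy is still recognized as a duplicate.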

You can get it in two ways: there are pre-built binaries on Github or you may use go get:

go get github.com/rubenv/dupefinder/...

Usage should be pretty self-explanatory:

Usage: dupefinder -generate filename folder...
    Generates a catalog file at filename based on one or more folders

Usage: dupefinder -detect [-dryrun / -rm] filename folder...
    Detects duplicates using a catalog file in on one or more folders

  -detect=false: Detect duplicate files using a catalog
  -dryrun=false: Print what would be deleted
  -generate=false: Generate a catalog file
  -rm=false: Delete detected duplicates (at your own risk!)

Full source code on Github

Technical details

Dupefinder was written using Go, which is my default choice of language nowadays for these kinds of tools.

There’s no doubt that you could use any language to solve this problem, but Go really shines here. The combination of lightweight threads (goroutines) and message passing (channels) makes it possible to have clean and simple code that is extremely fast.

Internally, dupefinder looks like this:

Each of these boxes is a goroutine. There is one hashing routine per CPU core. The arrows indicate channels.

The beauty of this design is that it’s simple and efficient: the file crawler ensures that there is always work to do for the hashers, the hashers just do one small task (read a file and hash it) and there’s one small task that takes care of processing the results.

The end-result?

A multi-threaded design, with no locking misery (the channels take care of that), in what is basically one small source file.

Any language can be used to get this design, but Go makes it so simple to quickly write this in a correct and (dare I say it?) beautiful way.

And let’s not forget the simple fact that this trivially compiles to a native binary on pretty much any operating system that exists. Highly performant cross-platform code with no headaches, in no time.

The distinct lack of bells and whistles makes Go a bit of an odd duck among modern programming languages. But that’s a good thing. It takes some time to wrap your head around the language, but it’s a truly refreshing experience once you do. If you haven’t done so, I highly recommend playing around with Go.

Random questions


Posted: May 23, 2015 11:44

An API is only as good as its documentation.

Your APIs are only as good as the documentation that comes with them. Invest time in getting docs right. — @rubenv on Twitter

If you are in the business of shipping software, chances are high that you’ll be offering an API to third-party developers. When you do, it’s important to realize that APIs are hard: they don’t have a visible user interface and you can’t know how to use an API just by looking at it.

For an API, it’s all about the documentation. If an API feature is missing from the documentation, it might as well not exist.

Sadly, very few developers enjoy the tedious work of writing documentation. We generally need a nudge to remind us about it.

At Ticketmatic, we promise that anything you can do through the user interface is also available via the API. Ticketing software rarely stands alone: it’s usually integrated with e.g. the website or some planning software. The API is as important as our user interface.

To make sure we consistently document our API properly, we’ve introduced tooling.

Similar to unit tests, you should measure the coverage of your documentation.

After every change, each bit of API endpoint (a method, a parameter, a result field, …) is checked and cross-referenced with the documentation, to make sure a proper description and instructions are present.

The end result is a big documentation coverage report which we consider as important as our unit test results.

Constantly measure and improve the documentation coverage metric.
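
To make the idea concrete, here is a minimal sketch of such a coverage check. It is not the Ticketmatic tooling: the API description, the documentation store and all names below are hypothetical, chosen only to show how a coverage percentage and a list of undocumented items can be derived.

```javascript
// Hypothetical API description: every endpoint with its parameters and result fields.
var api = {
    'GET /events': { params: ['since', 'limit'], fields: ['id', 'name'] },
    'POST /orders': { params: ['eventId'], fields: ['orderId'] }
};

// Hypothetical documentation store, keyed by the item being described.
var docs = {
    'GET /events': 'Lists events.',
    'GET /events param since': 'Only return events after this date.',
    'GET /events field id': 'Unique event identifier.',
    'GET /events field name': 'Display name of the event.',
    'POST /orders': 'Creates an order.',
    'POST /orders param eventId': 'Event to order tickets for.'
};

// List every documentable item: the endpoint itself, its parameters and its fields.
function documentableItems(api) {
    var items = [];
    Object.keys(api).forEach(function (endpoint) {
        items.push(endpoint);
        api[endpoint].params.forEach(function (p) { items.push(endpoint + ' param ' + p); });
        api[endpoint].fields.forEach(function (f) { items.push(endpoint + ' field ' + f); });
    });
    return items;
}

// Cross-reference the items with the docs and report what is missing.
function documentationCoverage(api, docs) {
    var items = documentableItems(api);
    var missing = items.filter(function (item) {
        return !docs[item] || docs[item].trim() === '';
    });
    return {
        total: items.length,
        missing: missing,
        percentage: 100 * (items.length - missing.length) / items.length
    };
}

var report = documentationCoverage(api, docs);
console.log(report.percentage + '% documented, missing:', report.missing);
```

With the sample data, two of the eight items lack a description, so the report shows 75% coverage and names the two missing items. Running something like this after every change is what turns documentation into a metric you can track.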

More than just filling fields

A very important thing was pointed out while circulating these thoughts on Twitter.

Shaun McCance (of GNOME documentation fame) correctly remarked:

@rubenv I’ve seen APIs that are 100% documented but still have terrible docs. Coverage is no good if it’s covered in crap. — @shaunm on Twitter

Which is 100% correct. No amount of metrics or tooling will guarantee the quality of the end result. Keeping quality up is a moral obligation shared by everyone on the team, and that can never be replaced with software.

Nevertheless, getting a slight nudge to remind you of your documentation duties never hurts.


Posted: March 29, 2015 09:39

Surviving winter as a motorsports fan.

Winter is that time of the year where nothing happens in the motorsport world (one exception: Dakar). Here are a few recommendations to help you through the agonizing wait:

Formula One

Start out with It Is What It Is, the autobiography of David Coulthard. It only goes until the end of 2007, but nevertheless it’s a fascinating read: rarely do you hear a sportsman speak with such openness. A good and honest insight into the mind of a sportsman and definitely not the politically correct version you’ll see on the BBC.


Next up: The Mechanic’s Tale: Life in the Pit-Lanes of Formula One by Steve Matchett, a former Benetton F1 mechanic. This covers the other side of the team: the mechanics and the engineers.


Still feel like reading? Dive into the books of Sid Watkins, who deserves huge amounts of credit for transforming a very deadly sport into something surprisingly safe (or as he likes to point out: riding a horse is much more dangerous).

He wrote two books:

Both describe the efforts on improving safety and are filled with anecdotes.

And finally, if you prefer movies, two more recommendations. Rush, an epic story about the rivalry between Niki Lauda and James Hunt. Even my girlfriend enjoyed it and she has zero interest in motorsports.


And finally Senna, the documentary about Ayrton Senna, probably the most mythical Formula One driver of all time.


Le Mans

On to that other legend: The 24 hours of Le Mans.

I cannot recommend the book Le Mans by Koen Vergeer enough. It’s beautiful, it captures the atmosphere brilliantly and seamlessly mixes it with the history of this event.

But you’ll have to go the extra mile for it: it’s in Dutch, it’s out of print and it’s getting exceedingly rare to find.


Nothing is lost if you can’t get hold of it. There’s also the 1971 movie with Steve McQueen: Le Mans.

It’s everything that modern racing movies are not: there’s no CG here, barely any dialog and the story is agonizingly slow if you compare it to the average Hollywood blockbuster.

But that’s the beauty of it: in this movie the talking is done by the engines. Probably the last great racing movie that featured only real cars and real driving.



Motorcycles aren’t really my thing (not enough wheels), but I have always been in awe of the street racing that happens during the Isle of Man TT. Probably one of the craziest races in the world.

Riding Man by Mark Gardiner documents the experiences of a reporter who decides to participate in the TT.


And to finish, the brilliant documentary TT3D: Closer to the Edge gives a good insight into the minds of these riders.

It seems to be available online. If nothing else, I recommend you watch the first two minutes: the onboard shots of the bike accelerating on the first straight are downright terrifying.


Rounding up

By the time you’ve read/seen all of the above, it should finally be spring again. I hope you enjoyed this list. Any suggestions about things that would belong in this list are greatly appreciated, send them over!


Posted: January 16, 2015 18:07 Tags: formula1 motorcycles sports

Release notes: May 2014

What’s the point of releasing open-source code when nobody knows about it? In “Release Notes” I give a round-up of recent open-source activities.

angular-rt-popup (New, github)

A small popover library, similar to what you can find in Bootstrap (it uses the same markup and CSS). Does some things differently compared to angular-bootstrap:

  • Easier markup
  • Better positioning and overflows
  • Correctly positions the arrow next to anchor



grunt-git (Updated, github)

  • Support for --depth in clone.
  • Support for --force in push.
  • Multiple file support in archive.


angular-gettext (Updated, github, website)

Your favorite translation framework for Angular.JS gets some updates as well:

  • You can now use $count inside a plural string as the count variable. The older syntax still works though. Here’s an example:
    <div translate translate-n="boats.length" translate-plural="{{$count}} boats">One boat</div>
  • You can now use the translate filter in combination with other filters:
    {{someVar | translate | lowercase}}
  • The shared angular-gettext-tools module, which powers the grunt and gulp plugins is now considered stable.


Posted: June 1, 2014 18:12 Tags: javascript angular

Release Notes: Apr 2014

What’s the point of releasing open-source code when nobody knows about it? In “Release Notes” I give a round-up of recent open-source activities.

Lots of small bugfixes left and right this month, but just one big module that’s worth pointing out:

angular-optimistic-cache (New, github)

Usually you have something like this in your Angular.JS application:

angular.module('myApp').controller('PeopleCtrl', function ($scope, $http) {
    $http.get('/api/people').then(function (result) {
        $scope.people = result.data;
    });
});

<ul>
    <li ng-repeat="person in people">{{person.name}}</li>
</ul>


This simple example is a page that fetches a list of people from the backend and shows it.

Unfortunately, it suffers from the “uncomfortable silence”. Here’s a diagram to explain:


When you arrive on the page, it’ll first show a blank page. After some time, this gets swapped with the data. Your app feels fast because navigation between screens is instant, but it feels jarring.

This is especially annoying when switching back-and-forth between pages, as it happens every time.

A similar thing happens when going from the list to a detail page:


Isn’t it a bit strange that you know the name of the person on which the user clicked, but upon navigation that suddenly gets lost, forcing us to wait until all the info is loaded? Why not start out with showing the name while the rest of the data loads?

The angular-optimistic-cache module is a very lightweight module to add some of that to your application. It’s probably the least intrusive way to avoid uncomfortable silences.
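
The underlying idea can be sketched without Angular. The snippet below is not the module's API: optimisticGet, the callback-style load function and the cache object are hypothetical names, used only to show the "cached copy first, fresh data afterwards" pattern.

```javascript
// Hypothetical helper, not the module's API: show cached data first, then refresh.
var cache = {};

// `load(url, callback)` is an assumed callback-style loader (for example a thin
// wrapper around an HTTP request). `onData(data, isStale)` is called with the
// cached copy if there is one, and again when fresh data arrives.
function optimisticGet(url, load, onData) {
    if (Object.prototype.hasOwnProperty.call(cache, url)) {
        onData(cache[url], true);
    }
    load(url, function (fresh) {
        cache[url] = fresh;
        onData(fresh, false);
    });
}

// Simulated backend for the example.
function fakeLoad(url, callback) {
    callback([{ name: 'Alice' }]);
}

function show(people, isStale) {
    console.log(isStale ? 'cached:' : 'fresh:', people.length, 'people');
}

optimisticGet('/api/people', fakeLoad, show); // first visit: only fresh data
optimisticGet('/api/people', fakeLoad, show); // later visits: cached copy first, then fresh data
```

On a repeat visit the page can render right away from the cached copy and quietly swap in the fresh data once it arrives, which is what removes the blank-page moment.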

More on Github.


Posted: May 1, 2014 17:14 Tags: javascript angular

Benchmarking on OSX: HTTP timeouts!

I’ve been doing some HTTP benchmarking on OSX lately, using ab (ApacheBench). After a large volume of requests, I always ended up with connection timeouts. I used to blame my application and mentally filed it as “must investigate”.

I was wrong.

The problem here was OSX, which seems to only have roughly 16000 ports available for connections. A port that was used by a closed connection is only released after 15 seconds. Quick calculation shows that you can only do a sustained rate of 1000 connections per second. Try to do more and you’ll end up with timeouts.

That’s not acceptable for testing pretty much anything that scales.
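
The quick calculation, spelled out as a small sketch. The numbers are the rough figures quoted above, not measured values, and the function name is made up for the example.

```javascript
// Figures quoted in the text: roughly 16000 usable ports, each held for
// 15 seconds (15000 ms) after its connection closes.
var availablePorts = 16000;
var releaseDelayMs = 15000;

// Highest connection rate (per second) that can be sustained before every
// port is stuck waiting to be released.
function maxSustainedRate(ports, delayMs) {
    return Math.floor(ports * 1000 / delayMs);
}

console.log(maxSustainedRate(availablePorts, releaseDelayMs)); // 1066
```

The ceiling scales inversely with the release delay: halve the delay and the sustainable rate doubles.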


Here’s the workaround: you can control the 15-second release delay with sysctl:

sudo sysctl -w net.inet.tcp.msl=100

There’s probably a good reason why it’s in there, so you might want to revert this value once you are done testing:

sudo sysctl -w net.inet.tcp.msl=15000


Alternatively, you could just use Linux if you want to get some real work done.


Posted: April 5, 2014 18:57 Tags: performance

Release Notes: Mar 2014

What’s the point of releasing open-source code when nobody knows about it? In “Release Notes” I give a round-up of recent open-source activities.

A slightly calmer month. Nonetheless, here are some things you might enjoy:


angular-debounce (New, github)

Tiny debouncing function for Angular.JS. Debouncing is a form of rate-limiting: it prevents rapid-firing of events. You can use this to throttle calls to an autocomplete API: call a function multiple times and it won’t get called more than once during the time interval you specify.

One distinct little feature I added is the ability to flush the debounce. Suppose you are periodically sending the input that’s being entered by a user to the backend. You’d throttle that with debounce, but at the end of the process, you’ll want to immediately send it out, but only if it’s actually needed. The flush method does exactly that.

Second benefit of using an Angular.JS implementation of debounce: it integrates in the event loop. A consequence of that is that the testing framework for E2E tests (Protractor) is aware of the debouncing and it can take it into account.
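
For readers new to the concept, here is a framework-free sketch of a debounce with a flush method. It is not the angular-debounce source: the timers argument is a hypothetical injection point, there so that a framework timer service (or a fake timer in tests) can stand in for setTimeout and clearTimeout.

```javascript
// Minimal trailing-edge debounce with a flush() method.
function debounce(fn, wait, timers) {
    // Default to the global timers; wrapped so they are never called with a wrong `this`.
    timers = timers || {
        set: function (callback, ms) { return setTimeout(callback, ms); },
        clear: function (handle) { clearTimeout(handle); }
    };
    var handle = null;   // pending timer, or null when nothing is queued
    var lastArgs = null; // arguments of the most recent call

    function run() {
        var args = lastArgs;
        handle = null;
        lastArgs = null;
        fn.apply(null, args);
    }

    function debounced() {
        lastArgs = Array.prototype.slice.call(arguments);
        if (handle !== null) {
            timers.clear(handle); // restart the waiting period
        }
        handle = timers.set(run, wait);
    }

    // Invoke the pending call right away, but only if there is one.
    debounced.flush = function () {
        if (handle !== null) {
            timers.clear(handle);
            run();
        }
    };

    return debounced;
}

var save = debounce(function (text) { console.log('saving', text); }, 500);
save('h');
save('he');
save('hel');
save.flush(); // logs "saving hel" immediately
save.flush(); // nothing pending, so nothing happens
```

Routing the timer through an injectable object mirrors the point about the event loop: when the framework owns the timer, its test tooling can see that work is still pending.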


angular-gettext (Updated, website, original announcement)

A couple of small feature additions to angular-gettext, but nothing shocking. I’m planning a bigger update to the documentation website, which will describe most of these.


ensure-schema (New, github)

Working with a NoSQL store (like MongoDB) is really refreshing in that it frees you from having to manage database schemas. You really feel this pain when you go back to something like PostgreSQL.

The ensure-schema module is a very early work-in-progress module to lessen some of that pain. You specify a schema in code and the module ensures that your database will be in that state (pretty much what it says on the box).

var schema = function () {
    this.table("values", function () {
        this.field('id', 'integer', { primary: true });
        this.field('value', 'integer', { default: 3 });
    });

    this.table("people", function () {
        this.field('id', 'integer', { primary: true });
        this.field('first_name', 'text');
        this.field('last_name', 'text');

        this.index('uniquenameidx', ['first_name', 'last_name'], true);
    });
};

ensureSchema('postgresql', db, schema, function (err) {
    // Do things
});

It supports PostgreSQL and SQLite (for now). One thing I specifically do not try to do is database abstractions: there are other tools for that. This means you’ll have to write specific schemas for each storage type.

There’s a good reason for that: you should pick your storage type based on its strengths and weaknesses. Once you pick one, there’s no reason not to fully use all of its capabilities.

This module is being worked out in the context of the project where I use it, so things could change.


Testing with Angular.JS (New, article)

Earlier last month I gave a presentation for the Belgian Angular.JS Meetup group:


The slides from this presentation are now available as an annotated article. You can read it over here.


Posted: April 1, 2014 08:06 Tags: angular javascript nodejs

Testing with Angular.JS

On Mar 5, 2014, I gave a presentation for the Belgian Angular.JS Meetup group:



The slides from this presentation are now available as an annotated article. You can read it over here. Now is a good time to start testing your code, if you aren’t already doing so.


Posted: March 10, 2014 21:12 Tags: javascript angular

Release Notes: Feb 2014

What’s the point of releasing open-source code when nobody knows about it? In “Release Notes” I give a round-up of recent open-source activities.

Since this is the first instalment of what will hopefully be a regular thing, I’ll look back a couple months into the past. A long (and not even complete) list of things, mostly related to web technology. I hope you’ll find something useful among it.


angular-encode-uri (New, github)

A trivial plugin for doing URI encoding in Angular.JS views, something it oddly doesn’t do out of the box.


angular-gettext (Updated, website, original announcement)

The nicest way to do translations in Angular.JS is getting even nicer, with improved coverage of strings and filetypes, built-in support for asynchronous loading and more flexibility.

But most of all: rock-solid stability. Angular-gettext is now in use for some nice production deployments and it just works.


  • The website is now open-source and on github.
  • There’s an ongoing effort to split the grunt plugins up into the actual grunt bits and a more generic library. There’s also a Gulp plugin coming, so you can use any tooling you want.
  • Functionality for loading translations asynchronously.
  • Now usable without jQuery loaded.
  • Better handling of translation strings in directives.


angular-import-scope (New, github)

Angular.JS structures your data in nested scopes. Which is great, except when page structure doesn’t work like that and you need the inner data on a much higher level (say in the navigation). With import-scope, you can import the scope of a lower-level ui-view somewhere higher up.



angular-select2 (New, github)

A select2 plugin for Angular.JS that actually works, with ng-options support.


connect-body-rewrite (New, github, DailyJS coverage)

A middleware plugin for Connect.JS that helps you transform request bodies on the fly, typically on the result of a proxied call. Used in connect-livereload-safe and connect-strip-manifest (see below).


connect-livereload-safe (New, github)

A Connect.JS middleware plugin to inject livereload. What’s wrong with connect-livereload? Well, I ran into some freak issues where modifying the DOM during load breaks Angular.JS. This plugin avoids that.


connect-strip-manifest (New, github)

Connect.JS middleware to strip the HTML5 app cache manifest. Makes it possible to disable the caching in development, without having weird tricks in your HTML file.


grunt-git (Updated, github)

A pile of new git commands supported, with a much improved test suite.


grunt-unknown-css (New, github)

Lets you analyze HTML files to figure out which classes don’t exist anymore in the CSS. Good for hunting down obsolete style declarations.


grunt-usemin-uglifynew (New, github)

A plugin for grunt-usemin that reuses existing .min.js files. This speeds up compilation of web apps and lets you use the minified builds provided by library authors.


json-inspect (New, github)

Get JSON context information out of a string. Lets you build text editors that are aware of the structure of the JSON underneath it.

Suppose you have this:

{
  "string": "value",
  "number": 3,
  "object": {
    "key": "val"
  },
  "array": [
    1,
    2
  ]
}

With json-inspect you can figure out what it means if the cursor is at a given position:

var context = jsonInspect(myJson, 2, 6); 
// { key: 'string', start: 4, end: 21, value: 'value' }

var context = jsonInspect(myJson, 9, 5); 
// { key: 'array.1', start: 93, end: 102, value: 2 }


mapbox-dist (New, github)

A compiled version of Mapbox.JS, which you can use with Bower.


Nested Means (New, github)

A data quantization scale that handles non-uniform distributions gracefully.

Or in human language: a Javascript module that calculates how you can make a meaningful legend for colorizing a map based on long tail data. If you’d use a linear scale, you’d end up with two colors: maximum and minimum. Nested means tries to adjust the legend to show more meaningful data.

A linear scale would map everything to white and dark-green. Nested means calculates a scale that maps to many colors.
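
The general technique works by recursive splitting: take the mean, divide the values into those below and those above it, then repeat on each half. The sketch below illustrates that idea; it is not the module's actual code or API, and the function names and sample data are made up.

```javascript
function mean(values) {
    return values.reduce(function (sum, v) { return sum + v; }, 0) / values.length;
}

// Returns sorted class breaks by recursively splitting the values at their mean.
// A depth of 2 yields up to 3 breaks, which means up to 4 classes.
function nestedMeansBreaks(values, depth) {
    if (depth === 0 || values.length === 0) { return []; }
    var m = mean(values);
    var lower = values.filter(function (v) { return v < m; });
    var upper = values.filter(function (v) { return v >= m; });
    // Stop when the mean no longer separates the values (for example, all equal).
    if (lower.length === 0 || upper.length === 0) { return []; }
    return nestedMeansBreaks(lower, depth - 1).concat([m], nestedMeansBreaks(upper, depth - 1));
}

// Index of the class (and therefore the legend color) a value falls into.
function classify(value, breaks) {
    var index = 0;
    while (index < breaks.length && value >= breaks[index]) { index++; }
    return index;
}

// Long-tail sample data: many small values, a few large ones.
var data = [1, 1, 2, 2, 3, 3, 4, 8, 16, 60];
var breaks = nestedMeansBreaks(data, 2);
console.log(breaks); // [ 3, 10, 38 ]
console.log(data.map(function (v) { return classify(v, breaks); }).join(' '));
// 0 0 0 0 1 1 1 1 2 3
```

Four equal-width classes over the same range (breaks at 15.75, 30.5 and 45.25) would put eight of the ten values in the lowest class. The nested means breaks spread them over all four.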


node-trackchange (New, github)

An experiment in using Harmony Proxies for tracking changes to objects. Here’s an example:

var orig = {
  test: 123
};

// Create a wrapper that tracks changes.
var obj = ChangeTracker.create(orig);

// No changes initially:
console.log(obj.__dirty); // -> []

// Do things
obj.test = 1;
obj.other = true;

// Magical change tracking!
console.log(obj.__dirty); // -> ['test', 'other']

You can even wrap constructors. This ensures that each created instance automatically has change tracking built-in:

TestType = ChangeTracker.createWrapper(OrigType);

var obj = new TestType();
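
For the curious, the mechanism can be approximated with the standardized ES2015 Proxy object. This is a sketch under that assumption, not the module's implementation (which targets the older Harmony proxy API); trackChanges is a made-up name.

```javascript
// Wrap an object so that every property write is recorded in a dirty list.
function trackChanges(target) {
    var dirty = [];
    return new Proxy(target, {
        get: function (obj, prop) {
            if (prop === '__dirty') { return dirty; }
            return obj[prop];
        },
        set: function (obj, prop, value) {
            if (dirty.indexOf(prop) === -1) { dirty.push(prop); }
            obj[prop] = value;
            return true; // signal that the assignment succeeded
        }
    });
}

var tracked = trackChanges({ test: 123 });
console.log(tracked.__dirty); // []

tracked.test = 1;
tracked.other = true;
console.log(tracked.__dirty); // [ 'test', 'other' ]
```

Only writes that go through the wrapper are seen by the set trap, which is why changes must be made on the tracked object rather than on the original.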


pofile (New, github)

A gettext .po parser and serializer, usable in the browser and on the backend. The angular-gettext module is powered by this library.


Blog it or it didn’t happen.


Posted: March 1, 2014 10:05 Tags: angular nodejs javascript

Deploying Node.JS the modern way, everywhere

In a shocking move, Ubuntu decided to follow the long-strung-out decision of Debian to adopt systemd as their init system. This is a good thing: everyone can now get together and work on one great solution. I applaud them for making this move.

It’s also a good thing for those who depend on systemd for having a fantastic modern deployment environment: soon you’ll be able to depend on systemd everywhere, regardless of the distribution being used.

In this light it seems like a good idea to shamelessly mention the write-up I wrote a while back: Deploying Node.js with systemd. Everything in there is still highly relevant and relying on systemd for deploying Node.JS is still (in my humble opinion) one of the best possible setups.

Good times for Node.JS developers that also need to administer infrastructure.


Posted: February 15, 2014 12:44 Tags: nodejs ubuntu