Post-mortems gone meta

Randal Moore
Thoughts from TravelPerk
4 min read · Jul 23, 2020

TravelPerk has been very successful. Our business has been one of the fastest-growing in Europe. To further capitalize on our success, we have been scaling up our engineering team. As we ship additional features, customer-facing bugs inevitably surface. Like many other top-notch engineering teams, we hold a blameless post-mortem in response to each critical bug. After the presentation, the post-mortem document is made available to the whole team. We have amazing engineers, excelling even in humility, eager to present lessons learned to colleagues. Over the past 2 years, we’ve had 64 critical incidents requiring serious reflection. The post-mortem documents are a treasure trove of lessons learned, but due to their sheer number, they’re not easily digestible.

Too many post-mortem documents to digest!
Source: Freepik.com

The goal

We needed to distill lessons learned from our valuable-but-intractable ocean of post-mortem documents. New team members could then easily consume these lessons learned, supercharging the expertise and scalability of our team.

The process

The magical transformation from a set of documents to distilled lessons learned is done through the process of coding, a technique from Qualitative Data Analysis (QDA), the discipline of turning subjective (qualitative) input into quantitative output. Codes are labels that are applied to specific parts of the qualitative input, in this case our post-mortem documents. Codes are similar to tags, but they are applied to snippets within a document rather than to the entire document, as tags would be. Statistical analysis may then be done on the quantitative output to gain actionable insight.
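As a rough illustration of how coding turns text into countable data, the sketch below builds a tiny coding table by hand in base R. Everything in it is hypothetical: the file names, the snippets, and the code name missing_test_coverage are invented for this example and are not taken from our post-mortems.

```r
# Hypothetical coding table: one row per coded snippet, not per document.
# All file names and snippets are invented for illustration.
codings <- data.frame(
  filename = c("pm-001.txt", "pm-001.txt", "pm-002.txt", "pm-003.txt"),
  codename = c("existing_code_misunderstood", "missing_test_coverage",
               "missing_test_coverage", "missing_test_coverage"),
  snippet  = c("the refactor removed a side effect we relied on",
               "no test exercised this path",
               "the change shipped without tests",
               "coverage for the module was low"),
  stringsAsFactors = FALSE
)

# A single document can carry several codes, one per snippet,
# which a single document-level tag cannot express.
codes_per_file <- table(codings$filename)

# The quantitative output: how often each code was applied.
code_counts <- sort(table(codings$codename), decreasing = TRUE)
print(code_counts)
```

Once the snippets are in this shape, any ordinary statistical tooling can be applied to the codename column.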

Highlighting snippets of text within a document
Source: Freepik.com

We use an inductive coding process. Codes are “discovered” based on the content of the documents rather than defined a priori (which would be deductive coding). Inductive coding is messy and hard. The process is iterative: you modify existing codes to fit new concepts as you come across them. The end goal is a small set of codes that clearly and accurately summarize the large number of concepts found in the documents. The evolution of your codes is best done with the help of a computer program unless you enjoy annihilating pencil erasers. For those who enjoy buzzwords: using a computer turbocharges QDA and turns it into CAQDA (Computer-Aided QDA).
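One iteration of that refinement can be pictured as merging draft codes that turned out to describe the same concept. The sketch below does this with a lookup vector in base R. The draft code names and the merge decisions are hypothetical, and a QDA tool would normally track such changes for you.

```r
# Hypothetical first-pass codes applied to five snippets (invented for illustration).
draft_codes <- c("no_unit_tests", "untested_edge_case", "misread_legacy_code",
                 "no_unit_tests", "refactor_broke_side_effect")

# Merge decisions made while reviewing the drafts: old code -> refined code.
merge_map <- c(
  no_unit_tests              = "missing_test_coverage",
  untested_edge_case         = "missing_test_coverage",
  misread_legacy_code        = "existing_code_misunderstood",
  refactor_broke_side_effect = "existing_code_misunderstood"
)

# Apply the mapping; a code without an entry in merge_map keeps its name.
refined_codes <- unname(ifelse(draft_codes %in% names(merge_map),
                               merge_map[draft_codes], draft_codes))

# Fewer distinct codes now summarize the same snippets.
print(length(unique(draft_codes)))
print(length(unique(refined_codes)))
```

Keeping the merge decisions in one explicit mapping also leaves a record of how the code set evolved between iterations.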

Use a computer to perform Qualitative Data Analysis!
Source: Freepik.com

RQDA

RQDA is one of many options to facilitate the coding process. The benefit of using an application built on top of a data analysis environment (R) is that statistics based on your codes are immediately available.

Screenshot of using RQDA to perform Qualitative Data Analysis

Results

Using RQDA to do the coding results in an .rqda file, in our case “PostMortems.rqda”. Anyone can load this file and perform an analysis in R, limited only by their imagination. An example frequency analysis:

library(RQDA)
# Open the project file produced by the coding process
openProject("~/PostMortems.rqda", updateGUI = TRUE)
# One row per coded snippet; the codename column holds the code that was applied
coding <- getCodingTable()
par(mar = c(17, 2, 2, 2)) # Widen the bottom margin so long code names fit
# Count snippets per code and plot the most frequent codes first
barplot(sort(table(coding$codename), decreasing = TRUE), las = 2)

Root causes of critical issues ordered by frequency

A simple frequency analysis makes the worst offenders clearly visible. An example action item in response to this is to have your continuous integration pipeline reject changes that do not have enough automated test coverage.

Bonus insight

A surprise bonus of performing inductive analysis was the insight gained while trying to “fit” codes to the lessons learned in the post-mortem documents. The fitting process makes relationships between common causes of issues pop out. The people performing the coding process will gain wisdom, and be able to easily recognize conditions likely to result in a critical bug. For the benefit of everyone else, relationships may be noted in the code descriptions.

existing_code_misunderstood: The author was working with code already in place, often refactoring it, and misunderstood the intent or side effects of the code. Often seen together with missing automated test coverage, which would have informed the author of their misunderstanding.

Having people review the code descriptions along with the graphical analysis will quickly supercharge your team with the wisdom from hard-earned lessons.
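The relationships that surface during fitting can also be checked against the data. As a minimal sketch, the base R code below counts how often two codes appear in the same post-mortem. The table is hypothetical stand-in data, the filename and codename columns are assumed to match what the coding table provides, and missing_test_coverage is an invented code name.

```r
# Hypothetical stand-in for a coding table (invented data).
coding <- data.frame(
  filename = c("pm-001", "pm-001", "pm-002", "pm-002", "pm-003"),
  codename = c("existing_code_misunderstood", "missing_test_coverage",
               "existing_code_misunderstood", "missing_test_coverage",
               "missing_test_coverage"),
  stringsAsFactors = FALSE
)

# Document-by-code incidence matrix: 1 if the code appears in the document, else 0.
incidence <- unclass(table(coding$filename, coding$codename) > 0) * 1

# crossprod(incidence) is t(incidence) %*% incidence: entry [a, b] is the number
# of documents containing both code a and code b; the diagonal counts the
# documents in which each code appears at all.
cooccurrence <- crossprod(incidence)
print(cooccurrence)
```

A large off-diagonal count relative to the diagonal suggests two codes worth cross-referencing in their descriptions.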

Tips if you want to try QDA:

  • Be aware of bias: having a single person do the coding is faster, but they may choose codes based on pet peeves or to further their agenda.
  • Structure your post-mortem documents: in our case, we follow a template including sections for “5 Whys” and “Actions Taken”. This is where most of the codes are found; focusing on these 2 sections greatly sped up the coding process.
