Online Experimentation: With Great Power Comes Great Responsibility

Recently we had the honor of giving a talk at this year’s Turing Festival. The festival, which is held in Edinburgh, is one of Europe’s biggest cross-functional tech conferences and is named after the late Alan Turing, who devised the Turing machine, the conceptual foundation of the modern-day computer. His codebreaking work on the German Enigma cipher also helped to end World War II.

So many of the technological advancements we benefit from today are a result of Alan Turing’s amazing achievements. But with the great power that Alan left us comes great responsibility. The responsibility we decided to speak on at the festival is ensuring you are not being misled by data.

Engineering teams are delivering faster than ever before. In the 1990s, a new feature would be released every few years, whereas today, apps such as Netflix see new releases hundreds of times a day. However, we need to make sure the value of the software we are delivering does not suffer as a result of this increase in speed.

A great way to build this confidence and put out Turing-worthy features is by utilizing online controlled experimentation. Product managers can now utilize controlled experimentation across the full stack of their codebase. Experimentation can change the way in which we deliver new features. Instead of focusing so much on the speed at which new features are being released, it allows us to fine-tune the feature to meet the needs and demands of users.

All this speed of delivery doesn’t mean companies are getting it right 100% of the time: apps still crash and websites still fail. For example, we’ve seen companies such as Salesforce, US Bank and even Instagram suffer downtime following new feature releases. Online experimentation allows DevOps teams to better understand the impact an upcoming feature release will have on their users.

Online experimentation requires you to split your traffic at random, in an unbiased way, serving a new feature to one portion of your users while the rest do not experience it. You can then compare the metrics of these two user groups to make an informed decision on whether to kill the feature or ramp it up to 100% of your users. You can also run an experiment on a specific user segment, such as a demographic or market, by utilizing feature flags within your code.
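To make the traffic split concrete, here is a minimal sketch of one common way to do it: hashing a user ID into a bucket so that each user always lands in the same group. The function names, the experiment name and the `market` field are hypothetical and chosen for illustration; this is not how any particular feature flagging product implements it.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, treatment_fraction: float = 0.5) -> str:
    """Deterministically place a user in 'treatment' or 'control'.

    Hypothetical helper: hashing the experiment name together with the
    user ID gives each experiment its own independent split.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # Map the first 8 hex digits (32 bits) onto the interval [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "treatment" if bucket < treatment_fraction else "control"


def feature_enabled(user: dict, experiment: str, market: str, treatment_fraction: float = 0.5) -> bool:
    """Feature flag check that limits the experiment to one market segment."""
    if user.get("market") != market:
        # Users outside the targeted segment never see the new feature.
        return False
    return assign_variant(user["id"], experiment, treatment_fraction) == "treatment"


if __name__ == "__main__":
    user = {"id": "user-42", "market": "UK"}  # made-up user for illustration
    print(assign_variant(user["id"], "new-checkout"))
    print(feature_enabled(user, "new-checkout", market="UK"))
```

Because the assignment depends only on the user ID and the experiment name, a returning user keeps seeing the same experience, and ramping up is a matter of raising `treatment_fraction`.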

Online controlled experimentation also allows real signals to be differentiated from noise. By noise we mean the randomness that is always present in the data because we use a sample to represent the whole population. The statistics from online controlled experimentation allow us to determine whether the differences in the metrics are unlikely to be due to noise alone, in which case we can be confident our feature really did have an impact. This is how online controlled experimentation decreases the chance of mistaking noise for a real, meaningful signal. In other words, it allows us to control and limit the chances of a false positive: a result that falsely indicates the change you introduced impacted your metrics.
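As a rough illustration of separating signal from noise, the sketch below runs a two-sided two-proportion z-test on conversion counts from a control and a treatment group. This is one standard textbook test, swapped in here as an example rather than the method used in the talk, and the counts in the usage section are made up.

```python
import math


def two_proportion_z_test(conv_control: int, n_control: int,
                          conv_treatment: int, n_treatment: int) -> tuple[float, float]:
    """Return (z, p_value) for the difference in conversion rates.

    The p-value is the probability of seeing a difference at least this
    large from noise alone, assuming the feature has no real effect.
    """
    p_control = conv_control / n_control
    p_treatment = conv_treatment / n_treatment
    # Pooled rate under the assumption that both groups behave the same.
    pooled = (conv_control + conv_treatment) / (n_control + n_treatment)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_treatment))
    if std_err == 0:
        raise ValueError("conversion rate of 0% or 100% in both groups; test is undefined")
    z = (p_treatment - p_control) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: conversions and users per group.
    z, p = two_proportion_z_test(1000, 10_000, 1100, 10_000)
    alpha = 0.05  # the false positive rate we are willing to accept
    verdict = "unlikely to be noise alone" if p < alpha else "consistent with noise"
    print(f"z = {z:.2f}, p = {p:.4f}: {verdict}")
```

The threshold `alpha` is the knob that limits false positives: with 0.05, a feature that truly does nothing would be flagged as impactful about one time in twenty.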

We could spend ages discussing the benefits of using online controlled experimentation for feature releases (and we did at Turing Fest). However, the bottom line is experimentation allows you to be an outcome organization rather than an output organization. Taking the time to conduct an online controlled experiment means you will truly understand a new feature’s impact on your users and therefore you can be more confident in making your product decisions.

This article was co-authored by Sophie Harpur, product manager at Split. Sophie is passionate about making the complex world of experimentation accessible to everyone with core engineering, growth, design and product principles in mind. After graduating from the University of Strathclyde, Sophie worked for Skyscanner before eventually joining Split.

— Lizzie Eardley