Startups

How to run data on Kubernetes: 6 starting principles

Comment

Containers in the cloud; kubernetes
Image Credits: SerrNovik (opens in a new window) / Getty Images

Sylvain Kalache

Contributor

Sylvain Kalache is the co-founder of Holberton, an edtech company training digital talent in more than 10 countries. An entrepreneur and software engineer, he has worked in the tech industry for more than a decade. Part of the team that led SlideShare to be acquired by LinkedIn, he has written for CIO and VentureBeat.

More posts from Sylvain Kalache

Kubernetes is fast becoming an industry standard, with up to 94% of organizations deploying their services and applications on the container orchestration platform, per a survey. One of the key reasons companies deploy on Kubernetes is standardization, which lets advanced users double productivity gains.

Standardizing on Kubernetes gives organizations the ability to deploy any workload, anywhere. But there was a missing piece: The technology assumed that workloads were ephemeral, meaning that only stateless workloads could be safely deployed on Kubernetes. However, the community recently changed the paradigm and brought features such as StatefulSets and Storage Classes, which make using data on Kubernetes possible.

While running stateful workloads on Kubernetes is possible, it is still challenging. In this article, I provide ways to make it happen and why it is worth it.

Do it progressively

Kubernetes is on its way to being as popular as Linux and the de facto way of running any application, anywhere, in a distributed fashion. Using Kubernetes involves learning a lot of technical concepts and vocabulary. For instance, newcomers might struggle with the many Kubernetes logical units such as containers, pods, nodes and clusters.

If you are not running Kubernetes in production yet, don’t jump directly into data workloads. Instead, start with moving stateless applications to avoid losing data when things go sideways.

Understand the limitations and specificities

Once you are familiar with general Kubernetes concepts, dive into the specifics for stateful concepts. For example, because applications may have different storage needs, such as performance or capacity requirements, you must provide the correct underlying storage system.

What the industry generally calls storage “profiles” is termed Storage Classes in Kubernetes. They provide a way to describe the different types of classes a Kubernetes cluster can access. Storage classes can have different quality-of-service levels, such as I/O operations per second per GiB, backup policies or arbitrary policies such as binding modes and allowed topologies.

Another critical component to understand is StatefulSet. It is the Kubernetes API object used to manage stateful applications and offers key features such as:

  • Stable, unique network identifiers that let you keep track of volume, and allows you to detach and reattach them as you please.
  • Stable, persistent storage so that your data is safe.
  • Ordered, graceful deployment and scaling, which is required for many Day 2 operations.

While StatefulSet has been a successful replacement for the infamous PetSet (now deprecated), it is still imperfect and has limitations. For example, the StatefulSet controller has no built-in support for volume (PVC) resizing — which is a major challenge if the size of your application dataset is about to grow above the current allocated storage capacity. There are workarounds, but such limitations must be understood well ahead of time so that the engineering team knows how to handle them.

Come up with a plan

Once you are comfortable with Kubernetes stateful concepts, you can progressively migrate your data workloads in a specific order. This allows you to learn from your mistakes and avoid being overwhelmed, because not all data technologies are equally easy to run on Kubernetes.

Established technologies, such as databases and storage, should be migrated first, and emerging tech, such as AI and ML, should be done last. This is reflected in a recent report, which found database and persistent storage are the two most-run data workloads on Kubernetes. The main reason is the lack of tooling for Day 2 operations. We will explore this in the next section.

Check for operator availability

Moving your stateful workloads to Kubernetes is only half the job — also known as Day 1. Now you need to handle Day 2 operations (one of the most discussed topics at the last KubeCon). This is where things get tricky. There are tons of Day 2 operations that Kubernetes cannot handle natively such as patching and upgrading, backup and recovery, log processing, monitoring, scaling and tuning.

All these operations are application specific. For example, a PostgreSQL and MySQL cluster will require two completely different approaches when picking a new primary server in an HA cluster configuration. Kubernetes cannot possibly know all the application’s specific Day 2 operations. This is where operators come in.

Operators are programmable extensions that perform operations that Kubernetes cannot handle natively. Operators provide intelligent, dynamic management capabilities by extending the functionality of the Kubernetes API. One of the most common uses is conducting these Day 2 operations. These operators aren’t developed by the Kubernetes maintainers but by third-party developers and organizations.

Before moving a data workload to Kubernetes, make sure there is an operator for it. OperatorHub does a great job of indexing them. With 282 operators available on the site, the distribution echoes what we discussed earlier: Some workloads have supporting tools, and some don’t. For example, the database category has 38 operators — there are eight for PostgreSQL alone — while the entire ML/AI category only has seven.

Pick the right level of operator capability

Having an operator for your technology isn’t enough, because they can have different capabilities and often exist at various levels of maturity. The OperatorFramework suggests a capability model that categorizes operators according to their features:

  • Level 1: Works for basic installation, such as automated application provisioning and configuration management.
  • Level 2: Supports seamless upgrades, patches and minor version upgrades.
  • Level 3: Handles the full app and storage lifecycle (backup, failure recovery, etc.).
  • Level 4: Provides deep insights, metrics, alerts, log processing and workload analysis.
  • Level 5: Offers automatic horizontal/vertical scaling, auto-config tuning, abnormality detection and scheduling tuning.

When choosing an operator, make sure its capabilities match your needs. If you are unsure which level is right for you, the Data on Kubernetes Report 2022 found that most organizations are looking for operators that are at least at Level 3. Having a backup for your stateful workloads sounds like a good idea.

If you can’t find an operator that matches your needs, don’t worry because most of them are open source. You can extend existing operators’ capabilities with internal development or, even better, contribute to the open source project.

Understand the operator

Operators’ extensibility is their strength, but it’s also their weakness. The lack of standards means they are programmed differently, so you must look at their config files to pick the format you like best.

What’s more, operators may use different technical routes to achieve the same goal. For example, one of the eight PostgreSQL operators, CloudNativePG, does not use StatefulSets, and instead uses its own custom controller. That’s quite unexpected considering that StatefulSets is the foundation for stateful workloads on Kubernetes.

Its developers decided to go with this design because of the inability of StatefulSet to resize PVCs (as we discussed earlier). As the operator documentation explains, picking “different [design directions] lead to other compromises.” So when picking an operator, be sure to understand its implementation and trade-offs, and go with the one you are the most comfortable with.

It’s worth the effort

As you can see, running data on Kubernetes isn’t always easy, but the good news is that it’s worth the hard work: 54% of surveyed organizations attributed more than 10% of their revenue to the fact that they run data on Kubernetes. What’s more, 33% said it has a transformative impact on productivity and another 51% saw a significant positive impact.

As organizations increasingly adopt multicloud infrastructure to optimize their cost and infrastructure performance, Kubernetes has become the tool of choice. With an estimated 66% of countries having some sort of data privacy and consumer rights legislation, which often requires enforcing data sovereignty, companies must increasingly host user data in the countries they operate in. Kubernetes is here to stay.

More TechCrunch

Google is preparing to launch a new system to help address the problem of malware on Android. Its new live threat detection service leverages Google Play Protect’s on-device AI to…

Google takes aim at Android malware with an AI-powered live threat detection service

Users will be able to access the AR content by first searching for a location in Google Maps.

Google Maps is getting geospatial AR content later this year

The space is available from the launcher and can be locked as a second layer of authentication.

Google’s new Private Space feature is like Incognito Mode for Android

Gemini, the company’s family of generative AI models, will enhance the smart TV operating system so it can generate descriptions for movies and TV shows.

Google TV to launch AI-generated movie descriptions

When triggered, the AI-powered feature will automatically lock the device down.

Android’s new Theft Detection Lock helps deter smartphone snatch and grabs

The company said it is increasing the on-device capability of its Google Play Protect system to detect fraudulent apps trying to breach sensitive permissions.

Google adds live threat detection and screen-sharing protection to Android

This latest release, one of many announcements from the Google I/O 2024 developer conference, focuses on improved battery life and other performance improvements, like more efficient workout tracking.

Wear OS 5 hits developer preview, offering better battery life

For years, Sammy Faycurry has been hearing from his dietician mom and sister about how poorly many Americans eat and their struggles with delivering nutritional counseling. Although nearly half of…

Dietitian startup Fay has been booming from Ozempic patients and emerges from stealth with $25M from General Catalyst, Forerunner

Apple is bringing new accessibility features to iPads and iPhones, designed to cater to a diverse range of user needs.

Apple announces new accessibility features for iPhone and iPad users

TechCrunch Disrupt, our flagship startup event held annually in San Francisco, is back on October 28-30 — and you can expect a bustling crowd of thousands of startup enthusiasts. Exciting…

Startup Blueprint: TC Disrupt 2024 Builders Stage agenda sneak peek!

Mike Krieger, one of the co-founders of Instagram and, more recently, the co-founder of personalized news app Artifact (which TechCrunch corporate parent Yahoo recently acquired), is joining Anthropic as the…

Anthropic hires Instagram co-founder as head of product

Seven orgs so far have signed on to standardize the way data is collected and shared.

Venture orgs form alliance to standardize data collection

As cloud adoption continues to surge toward the $1 trillion mark in annual spend, we’re seeing a wave of enterprise startups gaining traction with customers and investors for tools to…

Alkira connects with $100M for a solution that connects your clouds

Charging has long been the Achilles’ heel of electric vehicles. One startup thinks it has a better way for apartment dwelling EV drivers to charge overnight.

Orange Charger thinks a $750 outlet will solve EV charging for apartment dwellers

So did investors laugh them out of the room when they explained how they wanted to replace Quickbooks? Kind of.

Embedded accounting startup Layer secures $2.3M toward goal of replacing QuickBooks

While an increasing number of companies are investing in AI, many are struggling to get AI-powered projects into production — much less delivering meaningful ROI. The challenges are many. But…

Weka raises $140M as the AI boom bolsters data platforms

PayHOA, a previously bootstrapped Kentucky-based startup that offers software for self-managed homeowner associations (HOAs), is an example of how real-world problems can translate into opportunity. It just raised a $27.5…

Meet PayHOA, a profitable and once-bootstrapped SaaS startup that just landed a $27.5M Series A

Restaurant365, which offers a restaurant management suite, has raised a hot $175M from ICONIQ Growth, KKR and L Catterton.

Restaurant365 orders in $175M at $1B+ valuation to supersize its food service software stack 

Venture firm Shilling has launched a €50M fund to support growth-stage startups in its own portfolio and to invest in startups everywhere else. 

Portuguese VC firm Shilling launches €50M opportunity fund to back growth-stage startups

Chang She, previously the VP of engineering at Tubi and a Cloudera veteran, has years of experience building data tooling and infrastructure. But when She began working in the AI…

LanceDB, which counts Midjourney as a customer, is building databases for multimodal AI

Trawa simplifies energy purchasing and management for SMEs by leveraging an AI-powered platform and downstream data from customers. 

Berlin-based trawa raises €10M to use AI to make buying renewable energy easier for SMEs

Lydia is splitting itself into two apps — Lydia for P2P payments and Sumeria for those looking for a mobile-first bank account.

Lydia, the French payments app with 8 million users, launches mobile banking app Sumeria

Cargo ships docking at a commercial port incur costs called “disbursements” and “port call expenses.” This might be port dues, towage, and pilotage fees. It’s a complex patchwork and all…

Shipping logistics startup Harbor Lab raises $16M Series A led by Atomico

AWS has confirmed its European “sovereign cloud” will go live by the end of 2025, enabling greater data residency for the region.

AWS confirms will launch European ‘sovereign cloud’ in Germany by 2025, plans €7.8B investment over 15 years

Go Digit, an Indian insurance startup, has raised $141 million from investors including Goldman Sachs, ADIA, and Morgan Stanley as part of its IPO.

Indian insurance startup Go Digit raises $141M from anchor investors ahead of IPO

Peakbridge intends to invest in between 16 and 20 companies, investing around $10 million in each company. It has made eight investments so far.

Food VC Peakbridge has new $187M fund to transform future of food, like lab-made cocoa

For over six decades, the nonprofit has been active in the financial services sector.

Accion’s new $152.5M fund will back financial institutions serving small businesses globally

Meta’s newest social network, Threads, is starting its own fact-checking program after piggybacking on Instagram and Facebook’s network for a few months.

Threads finally starts its own fact-checking program

Looking Glass makes trippy-looking mixed-reality screens where things look 3D without the need of special glasses. Today it launches a pair of new displays, including a 16-inch mode that runs…

Looking Glass launches new 3D displays

OpenAI co-founder and chief scientist Ilya Sutskever has left the company. Replacing Sutskever is Jakub Pachocki, OpenAI’s director of research.

Ilya Sutskever, OpenAI co-founder and longtime chief scientist, departs