3 questions to ask before adopting microservice architecture


Madison Friedman

Contributor

Madison Friedman is an investor intern at Vertex Ventures US and an MBA candidate at the Wharton School of Business.

As a product manager, I’m a true believer that you can solve any problem with the right product and process, even one as gnarly as the multiheaded hydra that is microservice overhead.

Working for Vertex Ventures US this summer was my chance to put this to the test. After interviewing 30+ industry experts from a diverse set of companies — Facebook, Fannie Mae, Confluent, Salesforce and more — and hosting a webinar with the co-founders of PagerDuty, LaunchDarkly and OpsLevel, we were able to answer three main questions:

  1. How do teams adopt microservices?
  2. What are the main challenges organizations face?
  3. Which strategies, processes and tools do companies use to overcome these challenges?

How do teams adopt microservices?

Of the dozens of companies we spoke with, only two had not yet started their journey to microservices, and both were actively considering it. Industry trends mirror this: In an O’Reilly survey of 1,500+ respondents, more than 75% had started to adopt microservices.

It’s rare for companies to start building with microservices from the ground up. Of the companies we spoke with, only one had done so. Some startups, such as LaunchDarkly, planned to build their infrastructure using microservices but turned to a monolith once they realized the high cost of overhead.

“We were spending more time effectively building and operating a system for distributed systems versus actually building our own services so we pulled back hard,” said John Kodumal, CTO and co-founder of LaunchDarkly.

“As an example, the things we were trying to do in Mesosphere, they were impossible,” he said. “We couldn’t do any logging. Zero downtime deploys were impossible. There were so many bugs in the infrastructure and we were spending so much time debugging the basic things that we weren’t building our own service.”

As a result, it’s more common for companies to start with a monolith and move to microservices to scale their infrastructure with their organization. Once a company reaches ~30 developers, most begin decentralizing control by moving to a microservice architecture.

Large companies with established monoliths are keen to move to microservices, but costs are high and the transition can take years. Atlassian’s platform infrastructure is in microservices, but legacy monoliths in Jira and Confluence persist despite ongoing decomposition efforts. Large companies often get stuck in this transition. However, a strong top-down strategy combined with bottom-up dev team support can help companies, such as Freddie Mac, make substantial progress.

Some startups, like Instacart, first shifted to a modular monolith that allows the code to reside in a single repository while beginning the process of distributing ownership of discrete code functions to relevant teams. This enables them to mitigate the overhead associated with a microservice architecture by balancing the visibility of having a centralized repository and release pipeline with the flexibility of discrete ownership over portions of the codebase.

What challenges do teams face?

Teams may take different routes to arrive at a microservice architecture, but they tend to face a common set of challenges once they get there. John Laban, CEO and co-founder of OpsLevel, which helps teams build and manage microservices, told us that “with a distributed or microservices-based architecture your teams benefit from being able to move independently from each other, but there are some gotchas to look out for.”

Indeed, the O’Reilly survey shows that the top 10 challenges organizations face when adopting microservices are shared by 25%+ of respondents. While we discussed some of the adoption blockers above, feedback from our interviews highlighted issues around managing complexity.

The lack of a coherent definition for a service can cause teams to generate unnecessary overhead by creating too many similar services or spreading related services across different groups. One company we spoke with went down the path of decomposing their monolith and took it too far. Their service definitions were too narrow, and by the time decomposition was complete, they were left with 4,000+ microservices to manage. They then had to backtrack and consolidate down to a more manageable number.

Defining too many services creates unnecessary organizational and technical silos while increasing complexity and overhead. Logging and monitoring must be present on each service, but with ownership spread across different teams, a lack of standardized tooling can create observability headaches. It’s challenging for teams to get a single-pane-of-glass view with too many different interacting systems and services that span the entire architecture.

For example, years ago, one company had 10 different monitoring systems, three different logging systems and additional third-party vendors throwing their own data into the mix. The sprawl of observability tooling creates issues that flow downstream, making critical operations like incident response far more difficult.

“It’s important to get the number of services right,” said Andrew Miklas, co-founder of PagerDuty. “There’s a basic sanity check for the number of services in your organization. If each dev supports three services, you’re probably creating too many. On the other hand, if a full team of 10 devs supports one service, it may be time to break it apart.”
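Miklas’ sanity check can be written down as a rough heuristic. The sketch below is only an illustration of the ratios he mentions: the function name, the labels it returns and the decision to treat his figures as hard thresholds are all assumptions, not a rule used by PagerDuty or anyone we interviewed.

```python
def service_load_check(
    num_services: int,
    num_devs: int,
    max_services_per_dev: float = 3.0,   # assumed cutoff, taken from the quote
    max_devs_per_service: float = 10.0,  # assumed cutoff, taken from the quote
) -> str:
    """Return a rough verdict on whether a team has too many or too few services."""
    if num_services <= 0 or num_devs <= 0:
        raise ValueError("num_services and num_devs must be positive")

    # Many services per developer suggests the service definitions are too narrow.
    if num_services / num_devs >= max_services_per_dev:
        return "consolidate"

    # Many developers per service suggests a service that has grown too large.
    if num_devs / num_services >= max_devs_per_service:
        return "split"

    return "ok"


# Hypothetical teams used only to exercise the heuristic.
print(service_load_check(num_services=30, num_devs=10))  # prints "consolidate"
print(service_load_check(num_services=1, num_devs=10))   # prints "split"
print(service_load_check(num_services=5, num_devs=10))   # prints "ok"
```

A check like this is a conversation starter rather than a verdict; the right ratio depends on how large and how critical each service is.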

Managing a microservice architecture also implies managing multiple codebases, each with its own set of dependencies. Each codebase also needs to be hooked up to its own release pipeline. As the practice of CI/CD matures and companies can deploy multiple times per day, if something breaks, it becomes increasingly difficult to determine which change from which codebase caused the issue.

Problems are plentiful, but our interviewees also found creative solutions to overcome challenges with microservices.

Which strategies and tools help companies overcome these challenges?

Speaking with developers and engineering managers helped us identify essential strategies and tools to help companies manage their microservice pain points:

1. Embracing the silos that form in your organization. The key is to make it easy for teams to break out of their silo when needed. Establishing best practices and standardized formats for cross-team contracts helps break down silos when teams need to work together. For example, Laban said he has “seen a lot of companies lay out high level guidelines and say ‘if you follow these best practices then you get a license to operate autonomously.’”

On the tooling side, Kodumal says newer technologies like feature flagging can help organizations embrace siloed teams and services: “Once you can decouple deploying a change from exposing it to end users, you have a greater flexibility in how you roll things out.”
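To make the idea of decoupling deployment from exposure concrete, here is a minimal sketch of a feature flag check. Everything in it is hypothetical: the `FlagStore` class, the `new-discount-engine` flag and the checkout function are invented for illustration and do not represent LaunchDarkly’s API or any other vendor’s product.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FlagStore:
    """In-memory flag store; a real system would fetch flag state from a service."""
    enabled: Dict[str, bool] = field(default_factory=dict)

    def is_enabled(self, flag: str) -> bool:
        # Unknown flags default to off, so newly deployed code stays dark.
        return self.enabled.get(flag, False)


def checkout_total(prices: List[float], flags: FlagStore) -> float:
    """Compute an order total; the new pricing path ships dark behind a flag."""
    total = sum(prices)
    if flags.is_enabled("new-discount-engine"):  # hypothetical flag name
        total *= 0.9  # new behavior, already deployed but not yet exposed
    return round(total, 2)


flags = FlagStore()
before = checkout_total([10.0, 20.0], flags)   # flag off: old behavior

flags.enabled["new-discount-engine"] = True    # expose the change, no redeploy
after = checkout_total([10.0, 20.0], flags)    # flag on: new behavior
```

The point of the sketch is that the same deployed code serves both behaviors, so a team can release on its own schedule and turn the change on, or back off, without coordinating a deploy with other teams.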

2. Balancing your team with generalists and specialists can also help overcome the organizational drawbacks of distributed architecture. Every team has specialists that know their codebase inside and out. But adding in generalists can help facilitate connections between teams for features that span multiple codebases, share best practices across the company and help educate the team on how all the codebases work together. Sourcing internal generalist candidates from platform teams and setting up rotational programs between groups can set up teams for long-term success.

3. Standardization helps simplify microservices. Independent service ownership gives teams the flexibility to choose the best technologies that fit their requirements, but too much autonomy breeds too much complexity.

Laban recommended that “you should only introduce new technologies if they have a large impact. Don’t look for a 10% improvement, look for a 2x improvement.” Before switching to microservices, specify a preferred set of technologies that engineering teams can standardize on.

“When making technical decisions, try to keep the needs of the broader company in mind and don’t just focus on what works best for the piece of software you’re writing today,” said Miklas. This makes it easier to hire devs, share knowledge and focus development on the most useful tech stacks.

4. The most exciting solution I’ve seen that helps increase visibility across the microservice architecture is the service catalog. The service catalog’s goal is to reduce the pain of managing the associated overhead of microservices by having one place for developers to get all the information they require about their infrastructure. From costing, to observability, to team ownership, a successful service catalog helps developers understand how infrastructure maps to their company’s organizational structure.

Companies like OpsLevel, Cortex and Effx help dev teams build better software by painting a picture of how their services fit together and how they map to their organizational structure. John Laban laid out his vision for how teams should manage their microservice architectures and the challenges that follow: “The ideal end state is where the development teams have full ownership of their software end to end both in design and operation. This works well with a distributed architecture where people get autonomy and independence and they can move a lot faster. But this independence comes with responsibility in reliability, security and compliance, which takes a lot of added effort.”

The best products do far more than just track service SLOs; they help teams collaborate to build better services. Products like OpsLevel help teams understand what language a given service is written in, who to contact, how to contact them if something goes wrong and what changes are coming down the pipeline.

These tools can also solve practical headaches such as finding and eliminating orphaned services, cataloging services for comprehensive legal and security audits, resolving incidents and driving standardization of technologies across infrastructure.

For example, let’s say your organization needs to adopt a new technology like Kubernetes or move to a newer version of a language like Java or Python. This process might sound simple when you’ve only got a few services to keep track of. But once your organization scales beyond ~30 services, it’s nearly impossible to ensure standardization across the entire stack.

With organizational silos keeping teams focused on only their specific slice of infrastructure, cross-team engineering efforts require a microservice catalog to ensure all teams are on the same page when it comes to understanding who has yet to adopt. Companies like OpsLevel keep an up-to-date list of all your services and their owners with metadata on languages and frameworks to help you reach 100% adoption.
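As a rough illustration of what that looks like in practice, the sketch below models a tiny catalog and asks which teams still have to upgrade. The `Service` fields, the sample entries and the function names are assumptions made for this example; they are not the data model of OpsLevel or any other catalog product.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Service:
    name: str
    owner: str             # owning team
    language: str
    language_version: str  # dotted version string, e.g. "3.11"


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '3.11' into (3, 11) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def pending_upgrades(catalog: List[Service], language: str, target: str) -> Dict[str, List[str]]:
    """Group services still below the target version by owning team."""
    pending: Dict[str, List[str]] = {}
    for svc in catalog:
        if svc.language == language and parse_version(svc.language_version) < parse_version(target):
            pending.setdefault(svc.owner, []).append(svc.name)
    return pending


def adoption_rate(catalog: List[Service], language: str, target: str) -> float:
    """Share of services in the given language already at or above the target."""
    in_scope = [svc for svc in catalog if svc.language == language]
    if not in_scope:
        return 1.0  # nothing to migrate
    done = [svc for svc in in_scope if parse_version(svc.language_version) >= parse_version(target)]
    return len(done) / len(in_scope)


# Hypothetical catalog entries.
catalog = [
    Service("billing", "payments", "python", "3.8"),
    Service("search", "discovery", "python", "3.11"),
    Service("invoices", "payments", "python", "3.9"),
    Service("gateway", "platform", "java", "17"),
]

todo = pending_upgrades(catalog, "python", "3.11")
rate = adoption_rate(catalog, "python", "3.11")
```

With this sample data, `todo` shows that only the payments team has work left, which is exactly the question a cross-team migration lead needs answered.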

Another instance where microservice catalogs are a must is in incident response. Our interviews highlighted the magnitude of pain teams face when trying to resolve incidents in a microservice architecture. With services spread across so many teams and monitoring data spread across so many siloed products, it’s difficult to get context on the timeline of events leading up to an incident.

Tools like Effx embrace the service catalog to aggregate data across services and data sources to give the complete picture of an incident. For example, let’s say I’m a developer on support rotation. I can use Effx to pull in monitoring data from tools like Datadog to scan my services and identify any incidents quickly. If one pops up, I can build a timeline of events using monitoring data, past deployments, active feature flags and provisioning changes to diagnose the problem.
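The timeline-building step can be sketched in a few lines. This is not how Effx works internally, and it calls no real Effx or Datadog API; the `Event` type, the source labels and the one-hour lookback window are assumptions chosen to show the general idea of merging events from several sources into one ordered view.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass(frozen=True)
class Event:
    at: datetime
    source: str   # e.g. "deploy", "feature_flag", "monitoring" (assumed labels)
    service: str
    detail: str


def build_timeline(
    events: Iterable[Event],
    incident_at: datetime,
    lookback: timedelta = timedelta(hours=1),  # assumed window size
) -> List[Event]:
    """Return events in the lookback window before the incident, oldest first."""
    window_start = incident_at - lookback
    relevant = [e for e in events if window_start <= e.at <= incident_at]
    return sorted(relevant, key=lambda e: e.at)


# Hypothetical events pulled from three different tools.
incident_at = datetime(2021, 1, 1, 12, 0)
deploys = [
    Event(datetime(2021, 1, 1, 9, 0), "deploy", "search", "v7 released"),
    Event(datetime(2021, 1, 1, 11, 40), "deploy", "checkout", "v42 released"),
]
flag_changes = [
    Event(datetime(2021, 1, 1, 11, 50), "feature_flag", "checkout", "new-cart enabled"),
]
alerts = [
    Event(datetime(2021, 1, 1, 11, 55), "monitoring", "checkout", "error rate above threshold"),
]

timeline = build_timeline(deploys + flag_changes + alerts, incident_at)
for event in timeline:
    print(event.at.isoformat(), event.source, event.service, event.detail)
```

In this sample, the morning search deploy falls outside the window, leaving a checkout deploy, a flag change and an alert in sequence, which narrows the list of suspects quickly.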

Tying everything together

Microservices are now widely accepted as a way to help companies scale their infrastructure with their org structure.

Most teams start with a monolith to lower overhead. As the company scales to naturally develop distinct areas of focus and ownership, microservices help link the right services to the right teams.

Increasing complexity creates challenges in managing this infrastructure. Service sprawl, technical and organizational silos, and dependency management slow development and take the fun out of building software.

But development teams are problem solvers by nature, and they’ve devised strategies to overcome these issues. Standardized cross-team contracts, balancing dev teams, standardization of practices/tooling and service cataloging help teams manage the complexity.

These strategies have helped our interviewees scale their architectures to support some of today’s most popular products. We’re excited to see what new products startups come up with to help teams fully realize the power of microservices. If your company is looking for feedback on how to uplevel microservice infrastructure or if you’re working on a new product to tame microservices, leave a comment below or get in touch.

Vertex Ventures US has a financial interest in LaunchDarkly and OpsLevel.
