Accelerating AppDirect Developer Workflow with Ambassador

Daniel Bryant
Published in Ambassador Labs
5 min read · Sep 11, 2018

The AppDirect engineering team has embraced Kubernetes since the project first emerged, and has been serving production traffic on the platform since the release of Kubernetes 1.1. In a post on the AppDirect blog, “Evolution of the AppDirect Kubernetes Network Infrastructure”, Alex Gervais discussed the team’s journey and their approach to exposing internal services via an external gateway using the Kubernetes-native Ambassador Edge Stack API gateway.

I sat down with Alex and discussed the challenges and benefits of Kubernetes, how their ingress solution matured as they embraced the microservice architectural style, and how they are working to improve the developer experience and associated CI/CD pipeline.

Ambassador: Could you describe your developer workflow for registering and managing services and routes before moving to Kubernetes? Did the use of Terraform and Chef present challenges for developers, who had to learn new tools?

Alex Gervais: Before Kubernetes, our architecture was simpler: it was a single monolithic Java application. Since the infrastructure was very static and evolved slowly, developers had a firm grasp of the services and their published APIs. Every entry point was coded and checked at compile time by the automated CI process.

I wouldn’t say it was easy for developers, all working on the same code base and facing scale and domain design challenges, but extending our platform capabilities was a straightforward task. Then, when developers were ready to ship their unit of code, they would turn to our Operations team to manage the runtime configuration, expose the application on a single port, manage the load balancers and SSL termination, and make sure the DNS records pointed to the right location.

Teams worked in silos: developers were shipping services and routes configured by code without any knowledge of the underlying infrastructure, and sys-admins were running a magic process for which all they cared about was uptime. We ran on a weekly release schedule, with no apparent continuous delivery process beyond a few bash scripts.

Ambassador: How did Kubernetes impact the ability to provision new services, and what were the corresponding challenges around the routing of traffic?

AG: With bare-minimum infrastructure automation in place (a mix of Terraform and Chef), Kubernetes enabled our architecture to move to microservices, at least as the runtime platform.

Challenges remained around traffic routing since Kubernetes does not offer one-size-fits-all solutions to manage operational networking components: edge load balancers, SSL termination, public DNS registration, etc. With developers still not having the keys to the infrastructure, onboarding a new microservice in all environments meant they had a huge dependency on existing Terraform recipes and the Ops team to provision the ingress controller components.

At this point, it also meant that every service running outside the monolith required its own public domain name! This introduced so much complexity and duplication in our cloud infrastructures and domain modeling. Eventually, we reduced part of this new complexity by moving every service behind the same edge load balancer with SSL termination and relying on HAProxy, running inside Kubernetes, to route traffic based on the request’s hostname.
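As a rough sketch of that interim setup, hostname-based routing in HAProxy boils down to a few ACLs on the request’s Host header, which can be shipped into the cluster as a ConfigMap. The hostnames, backends, and service addresses below are hypothetical, not AppDirect’s actual configuration, and TLS is assumed to be terminated upstream at the edge load balancer:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: haproxy-config           # hypothetical name
data:
  haproxy.cfg: |
    frontend http-in
        bind *:80                # TLS already terminated at the edge LB
        # Pick a backend based on the request's Host header
        acl host_billing hdr(host) -i billing.example.com
        acl host_catalog hdr(host) -i catalog.example.com
        use_backend billing if host_billing
        use_backend catalog if host_catalog
        default_backend monolith # everything else stays on the monolith
    backend billing
        server billing billing-svc:8080
    backend catalog
        server catalog catalog-svc:8080
    backend monolith
        server monolith monolith-svc:8080
```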

We were moving forward, yet some new challenges emerged: unified authentication, multi-tenancy, and rate limiting, to name a few. So we had all this new complexity and challenges, and yet we were not shifting any traffic away from the monolith.

Ambassador: How has using Ambassador changed your developer workflow? Have you integrated Ambassador into your CI/CD process?

AG: Our choice to implement an API gateway, Ambassador more specifically, as a first-class component of our software architecture was mostly driven by our need to let developers reroute traffic away from the monolith to a new microservice implementation without impacting the published public API.

Ambassador Edge Stack API Gateway also unlocked some key features by unifying utility services such as auth, multi-tenancy, and rate-limiting — removing these duplicate responsibilities from each service implementation. In a parallel effort to deploy an API gateway in our infrastructure, we completely automated our Kubernetes workflow through git.

Kubernetes YAML manifests are stored in git and follow the same review/approval process as any other code unit. The CD pipeline listens to changes to the git repo and applies the diff to Kubernetes, ensuring we always have an exact snapshot of production manifests stored in source control. Since Ambassador’s state relies 100% on configurations from the Kubernetes API through service annotations, this fits our workflow perfectly.
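At the time of this interview, Ambassador read its routing rules from getambassador.io/config annotations on ordinary Kubernetes Services, so a route lives in the same manifest as the service it exposes. A minimal, hypothetical example of such an annotated Service (the service name and prefix are placeholders, not AppDirect’s actual routes):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: catalog
  annotations:
    getambassador.io/config: |
      ---
      apiVersion: ambassador/v0
      kind: Mapping
      name: catalog_mapping
      prefix: /catalog/          # public path routed to this service
      service: catalog:8080
spec:
  selector:
    app: catalog
  ports:
    - port: 8080
      targetPort: 8080
```

Because the route is just another field in the manifest, it flows through the same git review and CD apply steps as the rest of the deployment.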

Compared to other API gateway solutions, Ambassador allowed us to shift the ownership of the router configuration from the infrastructure and Operations team back to the developers with git access. We even have built-in CI checks that validate the Ambassador annotations and report errors or misconfigurations to developers before they ship.
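As an illustration, such a check might be wired into CI as a job like the following GitLab-CI-style sketch. kubeval validates manifests against the Kubernetes API schemas; the annotation lint script is an assumed in-house step, named here purely for illustration:

```yaml
validate-manifests:
  stage: test
  script:
    # Validate manifests against the Kubernetes API schemas
    - kubeval manifests/*.yaml
    # Hypothetical in-house check: parse getambassador.io/config
    # annotations and fail the build on malformed Mappings
    - ./scripts/lint-ambassador-annotations.sh manifests/
```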

Ambassador: How does the local development loop work? Do you find yourselves spinning up multiple services locally, or do you work with dependencies in a remote cluster? How can you ensure fast feedback here?

AG: As a side effect of running such a huge monolithic application, developers were already offloading part of their workload onto remote clusters, especially for running long regression test suites and demoing their features.

Building on our knowledge of Terraform and the Kubernetes API, we built an internal tool, similar in spirit to Datawire’s Kubernaut, that allows developers and QA engineers to run short-lived, customized test environments remotely. This development flow works alongside our existing CI process, which performs automated checks on every code change.

With more and more inter-service dependencies, we are pushing toward quicker automated Pact contract testing rather than full end-to-end service integration tests, leaving the latter to a shared test environment.

Ambassador: Do you have an idea of your ultimate developer workflow? What steps are you taking to get there?

AG: Our target is to ship code to production on every change to the master branch and eliminate the duplicate local, dev, and test environments. Achieving this will improve our ability to ship fast in a reliable, confident manner. Therefore, we are not only investing our time and energy in our CI/CD pipelines but also focusing on observability through tracing, metrics and monitoring, canary releases, and smoke testing on live production systems.

Ambassador: How will you implement canary releases? Would you canary only deployments (checking for operational issues) and release new functionality using, for example, feature flags? And who would be responsible for the canarying: the dev team or the platform team?

AG: Greatly inspired by Cindy Sridharan’s blog post series on testing in production, we now distinguish between “deployment” and “release” phases. The platform team will provide the automation tooling and dashboards to enable dev and QA teams to run smoke tests on the canary deployment, observe, and then release their changes independently of other teams. Product dev teams will be encouraged to use feature flags in order to test upcoming features, quickly disable unstable functionality, and preserve backward compatibility.
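On the traffic side, Ambassador supports this deployment/release split directly: a Mapping can carry a weight, so two Mappings share the same prefix and the weighted one receives only that fraction of requests. The following is a minimal sketch under that assumption; the service names, prefix, and percentage are hypothetical, not AppDirect’s actual configuration:

```yaml
# Canary Service: a second Mapping on the same prefix as the stable
# "catalog" service. Ambassador sends ~10% of matching requests here;
# the unweighted stable Mapping continues to serve the remaining ~90%.
apiVersion: v1
kind: Service
metadata:
  name: catalog-canary          # hypothetical service name
  annotations:
    getambassador.io/config: |
      ---
      apiVersion: ambassador/v0
      kind: Mapping
      name: catalog_canary_mapping
      prefix: /catalog/
      service: catalog-canary:8080
      weight: 10                # percentage of traffic for the canary
spec:
  selector:
    app: catalog-canary
  ports:
    - port: 8080
      targetPort: 8080
```

Dialing the weight up or down, or deleting the canary Mapping entirely, is then just another reviewed git commit, which fits the deployment/release separation described above.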

The Ambassador Labs team would like to express our sincere thanks to Alex and the AppDirect team for sharing their knowledge here. You can read more from them on the AppDirect blog, and follow Alex Gervais on Twitter.

Learn more about Ambassador Labs, join our Slack Community, drop us a line in the comments below, or find us on Twitter @ambassadorlabs.
