KubeCon EU 2019: Top 10 Takeaways

Daniel Bryant
Published in Ambassador Labs · 11 min read · May 30, 2019

The Datawire team and I have returned home from an awesome week at KubeCon and CloudNativeCon in Barcelona. Together, we were part of six talks at KubeCon, staffed a packed booth with amazing T-shirts (if I do say so myself!), spoke to dozens of community members, and attended some fantastic sessions. As there was so much goodness on offer at KubeCon EU, I’ve tried to summarise some of my key observations in this blog post.

In no particular order, here are my top ten takeaways:

  1. Multi-Platform and Hybrid-Cloud is (Still) a Thing
  2. Notable Increase in Technology Bundling
  3. Service Mesh Interface (SMI) Announcement: Stay Tuned
  4. The (Uncertain?) Future of Istio
  5. Policy as Code is Moving Up the Stack
  6. Cloud Native DevEx is Still Challenging
  7. Enterprises Are (Still) Early in the Technology Adoption Lifecycle
  8. On-Premises Kubernetes is Real (But Challenging)
  9. Treat Clusters Like Cattle
  10. Community is Still Core to the Success of Kubernetes

Multi-Platform and Hybrid-Cloud is (Still) a Thing

There were several talks that specifically covered the topic of multi-cloud (and the related sub-topics of networking and security), but I also observed that many of the introductory slides within end-user talks showed that their infrastructure / architecture included at least two cloud vendors. At the Datawire booth we also had a lot more conversations (in comparison with previous KubeCons) that anecdotally backed up this serious shift to embracing multi-cloud.

The success of Kubernetes has undoubtedly made creating a multi-cloud strategy much easier by providing a solid abstraction for deployment / orchestration. The functionality and APIs within Kubernetes have become more stable over the past two years, and the platform is widely adopted across vendors. In addition, functionality relating to storage management and networking has become more mature, and there are now viable open source and commercial products in these spaces. The “Debunking the Myth: Kubernetes Storage is Hard” keynote by Saad Ali, Senior Software Engineer at Google, was of interest in regard to storage, and “Kubernetes Networking: How to Write a CNI Plugin From Scratch”, by Eran Yanay at Twistlock, was a good overview of networking.

Of particular interest to me was the number of chats I had that focused on the combination of Azure with existing on-prem deployments. I recently wrote a piece for InfoQ on how multi-platform deployments relate to application modernisation efforts, and broadly speaking there are three approaches: extend the cloud to the datacenter, as seen with Azure Stack, AWS Outposts and GCP Anthos; homogenise the deployment (orchestration) fabric across multiple vendors/clouds using a platform like Kubernetes; or homogenise the service (network) fabric using the combination of an API gateway and service mesh, like Ambassador and Consul.

As the Datawire team work extensively within the API gateway space, we are obviously leaning towards the flexibility of the third approach. This provides the ability to incrementally and securely migrate from a traditional stack to a more cloud native way of operating. Nic Jackson from HashiCorp and I presented a related session at KubeCon, “Securing Cloud Native Communication, From End User to Service”.

Notable Increase in Technology Bundling

Many vendors are now offering bundles of Kubernetes tooling and complementary technologies. The announcement of the Rio “MicroPaaS” from the Rancher Labs team caught my eye, and Rancher have been releasing a bunch of interesting things recently; I wrote a summary of the Submariner multi-cluster bridge and the k3s lightweight Kubernetes distro on InfoQ. I’m also keen to explore Supergiant’s Kubernetes Toolkit in more detail, which is “a collection of utilities for automating the deployment and management of Kubernetes clusters in the cloud.”

In the enterprise space the bundling was focused on storage, and a good example of this is VMware’s Velero 1.0 (building on the initial “Ark” work acquired from Heptio), which allows engineers to back up and migrate Kubernetes resources and persistent volumes.

On a related topic, many more storage and data management Kubernetes Operators were on display at KubeCon, for example, CockroachDB, Elastic Cloud, and StorageOS. Rob Szumski from Red Hat talked about the evolution of the Operator SDK and associated community in his keynote, in which he also announced the Operator Hub. Operator support appears to be a key part of Red Hat’s OpenShift enterprise bundle.

Service Mesh Interface (SMI) Announcement: Stay Tuned

The announcement of the Service Mesh Interface (SMI) within Gabe Monroy’s Microsoft keynote certainly created quite a bit of buzz. There’s no denying that the service mesh space has been super hot of late, and the SMI aims to consolidate core features into a standard interface, and provide “a set of common, portable APIs that provide developers with interoperability across different service mesh technologies including Istio, Linkerd, and Consul Connect”.

Gabe’s on-stage recorded demonstration highlighted the key areas that the specification will focus on:

- Traffic policy — applying policies such as identity and transport encryption across services (demonstrated via Consul and Intentions)
- Traffic telemetry — capturing top-line metrics like error rate and latency between services (illustrated via Linkerd and the SMI metrics server)
- Traffic management — shifting and weighting traffic between different services (demonstrated via Istio, with Weaveworks’ Flagger)
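To make the traffic management piece a little more concrete, here’s a minimal sketch of shifting traffic with an SMI TrafficSplit resource, created via the official Kubernetes Python client. I’m assuming the early v1alpha1 schema here (the split.smi-spec.io group, with millis-style weights), and the service names are hypothetical, so treat this as an illustration rather than the definitive API:

```python
# A minimal sketch of creating an SMI TrafficSplit custom resource using the
# official Kubernetes Python client. The group/version and field names follow
# the early v1alpha1 draft of the spec, and the service names are made up.
from kubernetes import client, config

config.load_kube_config()  # use the current kubectl context
api = client.CustomObjectsApi()

# Shift 10% of the traffic for the root 'reviews' service to a canary backend.
traffic_split = {
    "apiVersion": "split.smi-spec.io/v1alpha1",
    "kind": "TrafficSplit",
    "metadata": {"name": "reviews-canary", "namespace": "default"},
    "spec": {
        "service": "reviews",  # the root service that clients address
        "backends": [
            {"service": "reviews-v1", "weight": "900m"},
            {"service": "reviews-v2", "weight": "100m"},
        ],
    },
}

api.create_namespaced_custom_object(
    group="split.smi-spec.io",
    version="v1alpha1",
    namespace="default",
    plural="trafficsplits",
    body=traffic_split,
)
```

The appeal of the interface approach is that, in theory, the same resource should drive Istio, Linkerd, or Consul Connect underneath, without the application team caring which mesh is installed.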

The concept of defining an interface within this highly competitive space is very interesting, but after looking at the SMI website and the corresponding specifications, I’m wondering if this abstraction may lean towards the lowest common denominator of functionality (which is always tricky to avoid with these kinds of specifications, as I know from my Java Community Process days). The potential danger is that although everyone will implement the spec, vendors will provide the most interesting value-adds via custom extensions.

After a few chats with the rest of the Datawire team, I’m wondering if the service mesh space is a “winner takes most” market, and therefore SMI could end up being a distraction for some, while another technology simply ploughs ahead and captures all of the value (arguably like Kubernetes did in relation to Mesos, Docker Swarm, etc.). Stay tuned to see what happens!

The (Uncertain?) Future of Istio

Although there was a lot of buzz around service meshes, the topic of Istio — arguably the best known of the current service meshes — was discussed with mixed sentiment. Some folks were conflating Istio with service meshes in general (think Docker and containers), focusing on the solution rather than defining the problem; some were enjoying the functionality that Istio provided; and some were not particularly positive about the technology.

There has been quite a bit of discussion recently around benchmarking Istio, and the 1.1 release of Istio deliberately addressed some of the issues with the Mixer component. Anecdotally, I had a few chats with folks who had each been evaluating Istio for several months (and one team for nearly a year), and they reported that running Istio is still operationally complex and resource intensive. Several teams mentioned that the release of hosted Istio via GKE addressed a lot of these issues, but not everyone could utilise GCP.

At a media lunch one of the attendees asked a panel member from Google what the future of Istio was, and inquired whether it would eventually be hosted as a CNCF project. The response given was polished but slightly vague and non-committal: basically, Istio is currently open source, which means people can get involved, but you’ll have to wait and see in regard to the CNCF. Currently only Linkerd is an official CNCF service mesh project, although Envoy Proxy is also a CNCF project (and is a foundational building block for Istio, the Ambassador API gateway, and quite a bit of other tech). We did have a lot more people asking about the integration of Linkerd 2 and Ambassador at our booth (and thanks to Oliver Gould, CTO at Buoyant, for his mention of Ambassador in his Linkerd Deep Dive talk), and attendees appeared to like the operational simplicity of Linkerd.

On a related note, I did have to smile when several members of the Knative team made a big reveal at the end of their “Extending Knative for Fun and Profit” talk, showing that they had replaced Istio with Ambassador within Knative due to Ambassador being simpler to operate (check out the t-shirt reveal in the tweet picture below; the related GitHub issue “Remove Istio as a dependency” has more detail).

Policy as Code is Moving Up the Stack

As an industry we’ve become used to thinking about policy as code in regard to identity and access management (IAM), iptables config, network ACLs, and security groups, but this is obviously quite low-level and close to the infrastructure. Ever since I saw the Netflix folks talking about their use of the Open Policy Agent (OPA) at KubeCon Austin in 2017, I’ve been intrigued by the use of this project for defining “policy as code”.

At this KubeCon I witnessed policy as code moving up the stack, with a lot more discussion about the use of OPA, including a good intro session from Rita Zhang and Max Smythe, and a more advanced “Unit Testing Your Kubernetes Configurations Using Open Policy Agent” talk by Gareth Rushgrove (who is typically quite good at focusing his attention on projects with a lot of potential).
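For anyone who hasn’t experimented with OPA, the general shape of “policy as code” at this level is straightforward: policies live in OPA, and your services (or admission controllers) query it for decisions over a REST API. Here’s a rough sketch that queries a locally running OPA server (started with `opa run --server`, listening on its default port 8181); the policy package name and the shape of the input document are hypothetical:

```python
# A rough sketch of querying a locally running OPA server via its REST API.
# The policy package ("kubernetes/admission") and the shape of the input
# document are hypothetical examples.
import requests

admission_input = {
    "input": {
        "kind": "Deployment",
        "namespace": "production",
        "image": "registry.example.com/shop/checkout:1.4.2",
    }
}

resp = requests.post(
    "http://localhost:8181/v1/data/kubernetes/admission/allow",
    json=admission_input,
)
resp.raise_for_status()

# OPA wraps the policy decision in a "result" field; the field is absent if
# nothing is defined at the queried path, so default to denying.
decision = resp.json().get("result", False)
print("allowed" if decision else "denied")
```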

I’ve also been tracking HashiCorp’s Sentinel for defining policy at the infrastructure level for quite some time, and now the use of Intentions within the Consul service mesh is targeting technology further up the stack. Intentions enable policy to be defined at the service level, e.g. service A can communicate with service B, but not with service C. When the Datawire team and I started working with HashiCorp on our Ambassador and Consul integration, we all soon realised that combining Intentions with mTLS (for service identity) and ACLs (to prevent identity spoofing, and for defence in depth) offers a lot of potential.
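As a quick illustration of that service-level policy, here’s a sketch that defines intentions via the Consul agent’s HTTP API (the /v1/connect/intentions endpoint, assuming a local agent on the default port 8500); the service names are, of course, hypothetical:

```python
# A sketch of defining service-level policy with Consul intentions via the
# Connect HTTP API of a local agent (default port 8500). The service names
# are hypothetical, and error handling is kept minimal for brevity.
import requests

CONSUL = "http://localhost:8500"

def create_intention(source: str, destination: str, action: str) -> None:
    """Create an intention; 'action' is either 'allow' or 'deny'."""
    resp = requests.post(
        f"{CONSUL}/v1/connect/intentions",
        json={"SourceName": source, "DestinationName": destination, "Action": action},
    )
    resp.raise_for_status()

# Service A may call service B, but its traffic to service C is denied.
create_intention("service-a", "service-b", "allow")
create_intention("service-a", "service-c", "deny")
```

The policy reads at the level engineers actually think about (“can A talk to B?”), while the mesh enforces it underneath via mTLS-based service identity.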

Cloud Native DevEx is Still Challenging

The “Don’t Stop Believin’” conference closing remarks by Bryan Liles, Senior Staff Engineer at VMware, explored the importance of developer experience (DevEx), and this topic was mentioned in several talks throughout the event. Kubernetes and the surrounding platform ecosystem have matured nicely, but the inner development loop and the integration of delivery pipelines with Kubernetes are still relatively immature.

In relation to this topic, Christian Roggia presented “Reproducible Development and Deployment with Bazel and Telepresence” and discussed how the team at Engel & Völkers use the CNCF-hosted Telepresence tool within their inner development loop, effectively removing the need to build and push a container after every change.

There was also an interesting panel with the #usualsuspects from Weaveworks and CloudBees, “GitOps & Best Practices for Cloud Native CI/CD”, that explored the continuous delivery space nicely. GitOps has been bounding along quite successfully, and I frequently bump into teams using this approach to configuration and deployment during my work at Datawire and through conversations at conferences. For example, Jonathan and Rodrigo mentioned this in their great talk “Scaling Edge Operations at Onefootball with Ambassador: From 0 to 6000 rps”.

Enterprises Are (Still) Early in the Technology Adoption Lifecycle

This was the first KubeCon where I had a noticeable amount of discussion at the Datawire booth with engineers from large enterprise organisations that were investigating cloud native technology for the first time. The majority had heard about or experimented with Kubernetes, but many were exploring how to map their old technology landscape onto this new world.

Cheryl Hung, Director of Ecosystem at the CNCF, moderated several panels in this space, including “Leveraging Cloud Native Technology to Transform Your Enterprise”, and it was interesting to hear from trailblazers such as Intuit. Laura Rehorst also presented an excellent keynote, “From COBOL to Kubernetes: A 250 Year Old Bank’s Cloud-Native Journey”, that demonstrated the planning and strategic resources applied by ABN AMRO bank.

As the Ambassador API gateway branding was front and center on our booth, most of the questions from enterprise engineers were in relation to how a modern Kubernetes-native gateway differs from the existing full lifecycle API management solutions. We are currently working on our next commercial offering in this space, “Ambassador Code”, and so it was fun to explore the requirements and expectations with attendees in relation to the new cloud paradigms.

On-Premises Kubernetes is Real (But Challenging)

There were several announcements in relation to installing Kubernetes on-prem, particularly within an enterprise context, such as Kublr’s VMware integration and VMware talking about kubeadm. Red Hat were ever-present in discussing OpenShift, and several people I chatted with were more than happy to buy into the OpenShift abstractions, saying that any potential lock-in is mitigated by the overall developer experience and SLAs they receive as part of the package.

However, one resounding message I received when chatting to attendees was to not install or operate Kubernetes yourself unless you really — really, really — have to. And even if you think your organization is special, then think again: the vast majority of companies that can leverage public cloud can also leverage Kubernetes-as-a-Service offerings.

Treat Clusters Like Cattle

In a great keynote “How Spotify Accidentally Deleted All its Kube Clusters with No User Impact”, Spotify Engineer David Xia discussed the lessons learned from deleting several(!) production clusters. I won’t do the talk justice by trying to describe it here (just go watch it), but one of the core messages I heard from David was to “treat Kubernetes clusters like cattle”. I’m sure many of us have heard the phrase “treat servers like cattle, rather than pets”, but David argued that as we’re moving up through the levels of compute abstraction (and treating the “Datacenter as a Computer”), we should still apply the principle of not getting too attached to our compute infrastructure.

I also heard Purvi Desai and Tim Hockin suggest, in their talk “Co-Evolution of Kubernetes and GCP Networking”, that organisations should continually destroy, recreate, and migrate their Kubernetes clusters to prevent them becoming pets. The core argument from all of these speakers was that unless you are regularly verifying your ability to rebuild clusters and migrate data, you probably won’t be able to do this successfully when a problem occurs. Think of this as chaos engineering for your clusters.
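To sketch what “chaos engineering for your clusters” might look like in practice, here’s a deliberately simple script that destroys and rebuilds a disposable GKE cluster and then verifies that workloads can be reapplied. The cluster name, zone, manifest path, and deployment are all placeholders, and a real pipeline would obviously need backups, data migration checks, and far more guard rails:

```python
# A deliberately simple sketch of regularly rebuilding a disposable cluster to
# verify that it never becomes a pet. All names and paths are placeholders; a
# real pipeline would need backups, data checks, and many more guard rails.
import subprocess

CLUSTER = "disposable-test-cluster"
ZONE = "europe-west1-b"
MANIFESTS = "k8s/"  # declarative config, ideally pulled from a GitOps repo

def run(*args: str) -> None:
    print("+", " ".join(args))
    subprocess.run(args, check=True)

# Tear the cluster down and rebuild it from scratch.
run("gcloud", "container", "clusters", "delete", CLUSTER, "--zone", ZONE, "--quiet")
run("gcloud", "container", "clusters", "create", CLUSTER, "--zone", ZONE, "--num-nodes", "3")
run("gcloud", "container", "clusters", "get-credentials", CLUSTER, "--zone", ZONE)

# Reapply the workloads and wait for a representative deployment to be ready.
run("kubectl", "apply", "-f", MANIFESTS)
run("kubectl", "rollout", "status", "deployment/frontend", "--timeout=300s")
```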

Community is Still Core to the Success of Kubernetes

From the keynotes to the lunchtime discussions, the importance of community and diversity was front and center everywhere I looked. The Thursday morning keynote from Lucas Käldström and Nikhita Raghunath, “Getting Started in the Kubernetes Community”, not only detailed two amazing journeys within the community, but also removed any excuses for not getting involved with open source and CNCF projects.

Cheryl Hung’s keynote “2.66 Million” also acknowledged the number of contributions the Kubernetes project has amassed, and she made a compelling argument for the benefits of diversity and strong leadership. It was impressive to hear that over 300 diversity scholarships have so far been funded by donations from CNCF member organizations.

Wrapping Up KubeCon EU

Thanks again to everyone we connected with in Barcelona. If you didn’t get a chance to attend all of our presentations, here’s the full list:

Here at Datawire, we’re keenly watching these trends to ensure that Ambassador continues to evolve to meet the ever-changing needs of the cloud-native community. If you haven’t used Ambassador recently, check out our latest release, Ambassador 0.70, which features integrated Consul service mesh support, Custom Resource Definition support, and more.

If you run into any problems with the update, please open an issue or join our Slack for some help. If you need an out-of-the-box Ambassador setup that includes integrated authentication, rate limiting, and support, check out Ambassador Pro, our commercial product.

And, if Ambassador is working well for you, we’d love to hear about it. Drop us a line in the comments below, or @ambassadorlabs on Twitter.
