Extending GitOps to the Enterprise

GitOps is a relatively new term, but it continues to gain momentum as more organizations embrace the paradigm. In DevOps, GitOps lands on the software engineering side of the development and operations continuum. Even though GitOps is more likely to be adopted by smaller organizations, larger enterprises can successfully implement GitOps with just a few tweaks to the overarching framework.

Let’s start at the beginning. Wouldn’t it be great if we could combine DevOps and GitOps approaches and use the Git distributed version control system as the ultimate source of truth? Then, when there’s a dispute over the correct state, people know where to go for the correct version.

In GitOps, the system’s desired configuration is stored in a revision control system such as Git. Any difference between the desired state stored in Git and the system’s actual state indicates to DevOps teams that not all changes have been deployed. These changes can be reviewed and approved through standard revision control processes such as pull requests (PRs). When a PR is approved and merged to the main branch, an operator software process is responsible for changing the system’s current state to the desired state based on the configuration in Git.

GitOps doesn’t require a particular set of tools, but the tools must:

Operate according to the desired system state stored in Git.
Detect differences between the desired and actual states.
Perform required operations on the infrastructure to synchronize the actual and desired states.
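
The three requirements above can be pictured as a small reconciliation loop. The following is a minimal sketch, not the behavior of any particular GitOps tool: it assumes both states are plain dictionaries keyed by resource name, and the names Operation, diff_states and reconcile are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    action: str  # "create", "update" or "delete"
    name: str    # resource name (hypothetical key scheme)


def diff_states(desired: dict, actual: dict) -> list[Operation]:
    """Detect differences between the desired (Git) and actual states."""
    ops = []
    for name, spec in desired.items():
        if name not in actual:
            ops.append(Operation("create", name))
        elif actual[name] != spec:
            ops.append(Operation("update", name))
    for name in actual:
        if name not in desired:
            ops.append(Operation("delete", name))
    return ops


def reconcile(desired: dict, actual: dict) -> dict:
    """Apply the detected operations and return the new actual state."""
    result = dict(actual)
    for op in diff_states(desired, actual):
        if op.action == "delete":
            del result[op.name]
        else:
            result[op.name] = desired[op.name]
    return result
```

After reconcile runs, diff_states on the result returns an empty list, which is the condition a real operator would keep checking on a schedule.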

In an ideal implementation of GitOps, manual changes to the system are not permitted and all changes to configuration must be made to files stored in Git. The infrastructure and operations engineers’ roles in a GitOps model shift from performing infrastructure changes and application deployments to developing and maintaining GitOps automation and helping teams review and approve changes through Git.

GitOps works well in non-critical environments, but several challenges make enterprises less likely to adopt it.

Challenges with GitOps

GitOps is applicable only to a subset of the software development life cycle (SDLC). This is important because GitOps tools are sometimes marketed as a one-size-fits-all solution that will solve all release problems; this is simply not true. GitOps requires that your deployment artifacts already exist. This means that tasks such as

Compiling code
Running unit/integration tests
Security scanning
Static analysis

are not a concern of GitOps tools and are assumed to already be in place. GitOps also doesn’t address promotion of releases between environments.

Auditability

A default GitOps process is not fully auditable because Git allows force pushes. Unless the central repository is configured to reject them, a force push lets a user rewrite history and remove unwanted commits from it.
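
To make the audit gap concrete: a force push is a branch update whose old tip is no longer an ancestor of the new tip. The sketch below checks that condition on a toy commit graph. The parents mapping and both function names are assumptions for illustration; with a real repository, the same question can be answered with git merge-base --is-ancestor.

```python
def is_ancestor(parents: dict, ancestor: str, descendant: str) -> bool:
    """Return True if `ancestor` is reachable from `descendant`.

    `parents` maps each commit id to a list of its parent commit ids
    (a hypothetical in-memory stand-in for the Git history).
    """
    stack = [descendant]
    seen = set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return False


def rewrites_history(parents: dict, old_tip: str, new_tip: str) -> bool:
    """A branch update rewrites history when the old tip is left behind."""
    return not is_ancestor(parents, old_tip, new_tip)
```

An auditing hook could run a check like this on every update to the main branch and raise an alert, or reject the push, when it returns True.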

Business Approvals

PRs are the typical approval model used in GitOps. Developers make changes and create a PR, and an approver may then accept it. The approval is recorded and the change is deployed. A PR-based approval model assumes peers are already using Git and are familiar with PRs.

Generally this works great until we need to include business approvals. Business personnel may not be familiar with Git or PRs. If a team needs business approvals, they have to build a process on top of GitOps. This drastically increases lead time.

Governance

GitOps depends on a purely PR-based approval model, in which the PR approver is responsible for reviews. After a PR is approved, there is no way to enforce enterprise policy rules. Enterprises need to control what end users can do on specific clusters to ensure those clusters comply with policies that meet governance and legal requirements or enforce best practices.

An Enterprise Approach to GitOps

There are, however, ways to address these challenges at the enterprise level.

Alerting on Drift Detection

Enterprise teams should be alerted whenever the state in Git drifts from the state of the cluster, so that they can approve or reject the drift. If the drift is rejected, the merged PR should be reverted so that Git returns to its previous state and stays in sync with the desired state of the cluster.
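
One way to picture this approve-or-reject flow is sketched below. Everything here is an assumption made for illustration: the Git history is modeled as a list of configuration dictionaries with the newest last, and the approve and notify callbacks stand in for whatever approval and alerting systems an enterprise already uses.

```python
from typing import Callable


def handle_drift(history: list, cluster: dict,
                 approve: Callable[[dict, dict], bool],
                 notify: Callable[[str], None]) -> tuple:
    """Alert on drift, then sync the cluster or revert Git.

    Returns the (possibly extended) history and the cluster state.
    """
    head = history[-1]
    if head == cluster:
        return history, cluster  # no drift, nothing to do
    notify(f"Drift detected: Git wants {head}, cluster has {cluster}")
    if approve(head, cluster):
        return history, dict(head)  # approved: cluster follows Git
    if len(history) < 2:
        return history, cluster  # nothing earlier to revert to
    # Rejected: record a new revert commit instead of deleting history,
    # so the rejected change stays visible to auditors.
    return history + [dict(history[-2])], cluster
```

Modeling the rejection as an additional commit rather than a history rewrite keeps the rejected change on the record.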

Use a Pipeline for Promotion of Releases Between Environments

To promote an artifact between environments, users can run a pipeline with deployment stages, performance monitoring, log analysis and approvals.

Let’s say the developer commits the source code and the GitOps deployment happens in the QA environment. The PR can be automatically approved upon meeting the policy standards set by the company. Then it can be pushed to a production environment.
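
A promotion pipeline of this kind can be sketched as an ordered list of stages, each guarded by gates that must all pass before the artifact moves on. This is an illustrative sketch only: the promote function, the stage names and the fields on the artifact dictionary are hypothetical, and real gates would query test results or monitoring systems rather than read static values.

```python
from __future__ import annotations

from typing import Callable

# A gate inspects the artifact (plus any attached results) and says yes or no.
Gate = Callable[[dict], bool]


def promote(artifact: dict, stages: list[tuple[str, list[Gate]]]) -> list[str]:
    """Deploy to each stage in order; stop at the first stage whose gates fail.

    Returns the names of the environments the artifact reached.
    """
    reached = []
    for environment, gates in stages:
        if not all(gate(artifact) for gate in gates):
            break  # policy standards not met: do not promote further
        reached.append(environment)
    return reached


# Hypothetical stages: QA requires passing tests, production a low error rate.
STAGES = [
    ("qa", [lambda a: a["tests_passed"]]),
    ("production", [lambda a: a["qa_error_rate"] < 0.01]),
]
```

With these stages, an artifact whose QA error rate is too high stops at "qa", which mirrors the automatic approval step described above.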

Governance

There are many reasons why developers and platform engineers integrate policy-as-code into their current GitOps processes. For one, this strategy helps to accelerate application development and deployment because it helps solve many of the change management hurdles that slow development pipelines.

On the development side, policy-as-code helps app developers understand and abide by the company’s configuration, security and compliance policies. For example, a developer may not remember—or may have no reason to know—when deploying a load balancer onto Kubernetes in AWS is or is not sanctioned. Policy-as-code solves this problem automatically.

Enterprises need the ability to run policies during GitOps deployments and to enforce them as part of each deployment. On any violation, the system should alert, audit, notify and automatically roll back the changes in Git to the previous successful compliant state so that Git remains in sync with the cluster.
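
As a rough illustration of policy-as-code with rollback, the sketch below expresses one policy as a plain function over Kubernetes-style manifests and finds the newest commit that passes every policy. The rule itself (no Services of type LoadBalancer) is a hypothetical company policy, and the function names and the (sha, manifests) commit layout are assumptions; dedicated policy engines provide richer versions of the same idea.

```python
from typing import Callable, Optional

# A policy inspects one manifest and returns a violation message, or None.
Policy = Callable[[dict], Optional[str]]


def no_load_balancer_services(manifest: dict) -> Optional[str]:
    """Hypothetical company rule: Services may not use type LoadBalancer."""
    is_service = manifest.get("kind") == "Service"
    if is_service and manifest.get("spec", {}).get("type") == "LoadBalancer":
        name = manifest.get("metadata", {}).get("name", "<unnamed>")
        return f"Service '{name}' uses type LoadBalancer"
    return None


def violations(manifests: list, policies: list) -> list:
    """Run every policy against every manifest and collect the messages."""
    found = []
    for manifest in manifests:
        for policy in policies:
            message = policy(manifest)
            if message is not None:
                found.append(message)
    return found


def last_compliant(commits: list, policies: list) -> Optional[str]:
    """Return the newest commit SHA whose manifests pass every policy.

    `commits` is a list of (sha, manifests) pairs, oldest first.
    """
    for sha, manifests in reversed(commits):
        if not violations(manifests, policies):
            return sha
    return None
```

The messages returned by violations can feed the alert and audit steps, while last_compliant identifies the commit a rollback would target.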

GitOps + AIOps

There is a notion that once GitOps is in place, everything will work seamlessly with no problems. Not so much; it is important to continually monitor deployment state to ensure everything stays as it should. That is where AIOps comes into the picture.

GitOps enables DevOps and development teams to see any manifest change, infrastructure change or artifact change. Post-deployment, the GitOps operator can instantly understand the changes in the service health, see what caused those changes and automate the process of investigating them using AIOps. If any severe anomalies are found, the operator can determine the current state of the system and, should anything differ, revert to the previous compliant Git commit.
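
The post-deployment check described above can be sketched with a very simple stand-in for AIOps. The example assumes a single health metric (an error rate) and uses a z-score against a pre-deployment baseline in place of whatever anomaly detection a real AIOps tool would provide. The function names, the threshold and the list of compliant commits are all hypothetical.

```python
from statistics import mean, pstdev


def is_anomalous(baseline: list, observed: float, threshold: float = 3.0) -> bool:
    """Flag `observed` if it lies more than `threshold` standard deviations
    from the mean of the pre-deployment baseline."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        return observed != mu
    return abs(observed - mu) / sigma > threshold


def commit_to_run(compliant_commits: list, baseline: list, observed: float) -> str:
    """Pick the commit the cluster should run after a health check.

    `compliant_commits` lists compliant commit SHAs, oldest first; the last
    entry is the one currently deployed.
    """
    if is_anomalous(baseline, observed) and len(compliant_commits) > 1:
        return compliant_commits[-2]  # revert to the previous compliant commit
    return compliant_commits[-1]
```

In practice the baseline would come from monitoring data collected before the deployment, and a revert would go through Git like any other change.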