Choosing your CI/CD tools wisely

“Too much freedom undercuts freedom” — William Raspberry

Eliran Barnoy
ProdOpsIO

--

It is no secret that Continuous Integration and, in most cases, Continuous Delivery (CI/CD for short) are among the most sought-after development practices at technology companies nowadays.

These practices, provided they are applied correctly, are said to improve software development efficiency.
This, in turn, leads to faster development and fewer integration issues than integrating everything at once, and it helps prevent a well-known phenomenon called “integration hell”.

One of the many challenges in implementing these practices, and the one we will discuss in detail in this post, is the starting point.

There is a variety of products to choose from, and selecting the most suitable one is not simple without a deep understanding of the tools and what each was designed for.
Organizations often pick a solution without that knowledge and then face serious methodological incompatibilities, which lead to a painful migration, to undesirable patchwork they have to maintain, or even to organizational catastrophes.

In the rest of this article, I would like to offer a fresh view of the different approaches taken by some of the industry-leading CI/CD tools, which approach is more suitable for you, and how to deal with what you already have.

As presented in the graph below, there are 3 contestants, but not much of a competition when it comes to the market share of one of them.

Source: https://www.datanyze.com/market-share/ci

Based on the graph above, Jenkins dominates the scene and claims most of the pie for itself. I will address this oddity and offer my opinionated explanation for it shortly. I will also focus on a second solution, because the two encourage different approaches to achieving the same practice.
These tools are Jenkins and Drone.io.
As a side note, many tools are very similar to these two. I have chosen them based on my experience with them and their popularity, and what follows will most likely apply to most, if not all, of the currently available tools for the job.

To give a bit of background on both of them:
Jenkins is an open-source automation server written in Java. It was released back in 2005 under the name “Hudson”, and to this day it retains a very high market share compared to its competitors.
Drone.io is open-source software written in Go. It is a Continuous Delivery system built on container technology, and it can either be self-hosted for free or used as a SaaS. It was created by a single person and is maintained mostly by him, with the help of over 200 other maintainers. Its community is constantly growing, and it becomes better known with every release.

Normally, when a tool dominates the market like this, it possesses something that other tools lack, does things significantly better, or happened to be there at the right time with the right solution.
There are many explanations for this phenomenon, but I would like to focus on two major aspects that define Jenkins as I see it: its extensibility and its flexibility.
These two aspects, delivered largely through plugins, have led to almost unlimited functionality and customizability, and plugins are one of Jenkins’s main strengths nowadays.
Of course, other tools have their own solutions, but they fall short of what can be done with the variety of plugins available for Jenkins, combined with its built-in abilities and its strong community.

For all these incredible abilities, plugins and integrations, a few downsides emerge when so many integrations and features are possible. One major downside I would like to address is complexity.

With a tool like Jenkins, we commonly get overwhelmed by the number of integrations we could implement, and we start building more and more on top of it. While doing so in a managed environment with clear boundaries can end well, it concentrates a great deal of power and responsibility in a single link in the chain, which creates the potential for a major and dreadful failure.

“With great power comes great responsibility.” - Uncle Ben (Spider-Man)

As comics have taught many of us, and as experience has proved, when an entity gains power, it gains responsibility in proportion.
In the same sense, when a Jenkins server is packed with integrations and plays an increasingly important role within our organization, its inevitable failure (all software fails at times) packs an increasingly painful “punch”.

To give a few examples that my colleagues and I have experienced: a simple task of cleaning up old workspaces on the server to free some space turned into a manual Jenkins job recovery mission. Some very experienced workers mistakenly deleted the jobs and had to rebuild everything from the ground up, because the backups failed to restore the configuration.
This event is not as uncommon as you might think. It has happened a few times in other variations, each of them leaving a company unable to deploy anything for an extended period of time. The deployment sequence had to be recreated for every service, along with every fine-tuning that had been applied, for whatever reason, to the old configuration.
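One way to make a cleanup like this harder to get wrong is to separate planning from deleting. The following is only a minimal sketch under my own assumptions: the function names, the 30-day threshold in the test, and the protected directory name `jobs` are hypothetical choices for illustration, not part of Jenkins. It only returns the directories that would be removed, so a person can review the list before anything is deleted.

```python
import time
from pathlib import Path


def stale_workspaces(root, max_age_days, now=None):
    """List direct child directories of `root` not modified within `max_age_days`."""
    root = Path(root).resolve()
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return sorted(
        child
        for child in root.iterdir()
        # Skip files and symlinks so the plan never follows a link elsewhere.
        if child.is_dir() and not child.is_symlink() and child.stat().st_mtime < cutoff
    )


def plan_cleanup(root, max_age_days, protected=("jobs",), now=None):
    """Return the directories that would be removed; nothing is deleted here.

    `protected` is a hypothetical safety list of directory names that must
    never appear in a cleanup plan, no matter how old they are.
    """
    return [
        path
        for path in stale_workspaces(root, max_age_days, now)
        if path.name not in protected
    ]
```

The actual deletion would be a separate, deliberate step that consumes the reviewed plan, which keeps a careless second from turning into a recovery mission.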

Another example: Jenkins was put in charge of deploying CloudFormation stacks on AWS. A simple bug in the deployment job, involving an empty parameter in a parameterized job execution, nearly deleted the AWS resources of the whole organization; luckily, it was stopped 60% of the way into the deletion. The job had been run by another very experienced worker, who was careless for a couple of seconds that cost the organization more than he thought they would.
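I do not have the original job, so the following is only a sketch of the kind of guard that could stop an empty parameter before it reaches anything destructive. The naming pattern, the exception, and both function names are assumptions I made up for illustration; the point is that an empty value is rejected outright instead of being interpreted as "everything".

```python
import re

# Hypothetical naming rule for this sketch: names look like "team-service-env".
STACK_NAME_PATTERN = re.compile(r"^[a-z][a-z0-9]*(-[a-z0-9]+)+$")


class UnsafeParameterError(ValueError):
    """Raised when a job parameter is too broad to act on safely."""


def require_stack_name(raw_value):
    """Return a validated stack name, or raise before anything destructive runs."""
    name = (raw_value or "").strip()
    if not name:
        # An empty value must never be read as "match everything".
        raise UnsafeParameterError("stack name is empty; refusing to continue")
    if not STACK_NAME_PATTERN.fullmatch(name):
        raise UnsafeParameterError(
            f"stack name {name!r} does not match the expected pattern"
        )
    return name


def select_stacks_to_delete(all_stacks, raw_value):
    """Pick a stack by exact name only; never by prefix or wildcard."""
    name = require_stack_name(raw_value)
    return [stack for stack in all_stacks if stack == name]
```

Selecting by exact match rather than by prefix is a deliberate choice here: with prefix matching, an empty string matches every stack in the account.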

More examples of this misuse: in one case, detailed Groovy code had been written directly into a job through the UI without being backed up as a file anywhere, and the job was later removed without anyone considering that the code in it would be deleted as well. In another company, the Jenkins server stopped working due to lack of storage midway through a production deployment, which cost the company 2 hours of stressful downtime.
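The storage incident is the kind of failure a small preflight check can catch before a deployment starts rather than in the middle of it. This is a rough sketch, assuming the deployment can be wrapped in a script; the function names and the threshold are hypothetical, while `shutil.disk_usage` is the standard-library call that reports free space.

```python
import shutil


def free_space_ok(path, min_free_bytes, usage=shutil.disk_usage):
    """Return True when the filesystem holding `path` has enough free space."""
    # `usage` is injectable so the check can be exercised without a real disk.
    return usage(path).free >= min_free_bytes


def preflight(path, min_free_bytes, usage=shutil.disk_usage):
    """Raise before the deployment starts instead of failing halfway through."""
    if not free_space_ok(path, min_free_bytes, usage):
        raise RuntimeError(
            f"not enough free space under {path!r}; aborting before deploy"
        )
```

Calling something like `preflight("/var/lib/jenkins", 5 * 1024**3)` as the first step of a job would be one possible use; the path and the 5 GiB figure are placeholders, not recommendations.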

Since there are so many different use cases, incidents of this nature are not uncommon, and there have been many more of them.

To sum up a bit: what Jenkins essentially gives you is the ability to do everything, but the ‘catch’ is that the result is not necessarily fast, efficient or logically correct. As long as we continue abusing Jenkins and using it for purposes for which it was not originally meant, we keep increasing its responsibility and have it execute far more complex scenarios than it was planned for. That becomes increasingly dangerous as the responsibility accumulates, as the examples above demonstrate.

Another topic that normally arises in this context is that tools like Jenkins tend to take on the roles of completely unrelated services, because doing so seems easier or faster to implement in the short run. Later on, these seemingly simple implementations tend to evolve into issues that are a lot harder to mitigate.

A different approach, one that is used in many places in our lives and that the community has knowingly or unknowingly arrived at as a solution, is directed tools.
These are tools that limit your actions to some extent, in some directions, in order to improve your experience, speed, consistency or anything else the creator found justified.
For a better understanding of the concept, a good example of an everyday device that does this very well is the iPhone, along with most of Apple’s devices.

Now, to show how the directed approach works with CI tools, I’ll take Drone.io as an example. It can be used as a SaaS or deployed on-premises just like Jenkins, but when it comes to the way it behaves, you can spot some major differences.
One difference is that it cannot initiate a build or deploy process at the click of a button, only through a commit that triggers a webhook. The integration itself, however, is activated and deactivated with a click of a button.
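To make the constraint concrete, here is a toy model of a commit-only trigger. It is not Drone.io's code or API; the class, the event fields (`type`, `repo`, `commit`) and the method names are all invented for this sketch. What it illustrates is the shape of the rule: a repository can be switched on or off, and the only path to a build is a push event for a repository that is switched on.

```python
class CommitOnlyTrigger:
    """Builds start only from push events for repositories that are switched on."""

    def __init__(self):
        self.active_repos = set()

    def activate(self, repo):
        """Switch the integration on for one repository."""
        self.active_repos.add(repo)

    def deactivate(self, repo):
        """Switch the integration off; later events for this repo are ignored."""
        self.active_repos.discard(repo)

    def handle_webhook(self, event):
        """Return the commit to build, or None when the event must be ignored."""
        if event.get("type") != "push":
            # There is deliberately no manual or "run now" path.
            return None
        if event.get("repo") not in self.active_repos:
            return None
        return event.get("commit") or None
```

Because every build maps back to a commit, the history of what was built and why lives in version control rather than in someone's memory of which button was pressed.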

The lack of this feature, in turn, forces us to work in a certain way.
The way we are being directed towards is essentially what the authors had in mind when they first wrote the tool, which means that software updates and new features will be better aligned with the way we work and with our needs.

Another important point is that other creators are more likely to release complementary solutions for our directed tools, and these can be almost tailor-made, since their authors know that if you are using Drone.io, for example, you are working in a specific manner.

Drone.io and other directed tools essentially pave the right way for you by lacking functionality in areas that fall outside their scope and that would add needless responsibility and power to them. That, in turn, keeps you from going in that direction in the first place and pushes you to look for suitable solutions that work better and do not place Drone.io in such a critical position.
This does not mean that Drone.io offers no freedom. As a matter of fact, it offers quite a lot of it, and even that much freedom is enough for people to turn it into something its author never intended it to become. The chances decrease dramatically, though, when the product itself directs us towards the intended usage.

The effects of the above are clear: you might not get the freedom you get with Jenkins, but you fall in line with a tool that is constantly updated with new features and that responds to new events and trends in the field. These will most likely be relevant to you, and implementing them is normally a lot less complex.

I would like to summarize a bit and focus on the points I think are important to take from this post. Generally, every tool has its use case and its downsides, and you must take them into account when you choose any tool for your organization. If you think there is “one tool to rule them all”, you are probably missing something or not seeing the picture in its entirety.
When you consider Jenkins, as well as other tools, consider the time it would take you to create a stable solution of that size and importance, and consider using a simpler tool with a more ‘directed’ approach to avoid needlessly wild and complex solutions when possible.

Lastly, if you have a strong and well-planned foundation whose center of mass is properly thought out, then even if you find out that the ground you’ve built on is soft, you might still make it through, and even become arguably the world’s most famous tower, one that millions of people know by name.
