Learn how a novel attack vector in GitHub Actions allows attackers to distribute malware across repositories using a technique that exploits the actions dependency tree and puts countless open-source projects and internal repositories at risk. Get an in-depth look at the attack vectors, technical details and a real-world demo in this blog post highlighting our latest research.
GitHub is the premier platform for hosting open-source projects, and its popularity has carried over to its CI/CD platform, GitHub Actions. That popularity extends beyond the DevOps community, however, and attracts attackers eager to exploit the platform’s expanding attack surface.
In recent supply chain attacks, we see attackers repeatedly executing the same scheme, ultimately compromising a repository or a software library and infecting it with a malicious payload targeting its direct dependents. The payload tries to steal secrets or create a reverse shell, whether running in pipelines or production environments.
To achieve control or write permissions on a repository, attackers draw from a range of established techniques:
When a repository moves to a new owner (organization or user), or the owner changes its name, GitHub automatically redirects requests sent to the old repository name to the new owner and repository. But if the old owner name becomes available, for example because the account was deleted or renamed, an attacker can register it on GitHub and create a repository containing malicious code under the previous repository name. Reusing the previous name disables GitHub’s automatic redirection, so consumers of the project unknowingly pull from the malicious repository. This technique is known as repojacking.
To protect against repojacking, GitHub employs a security mechanism that disallows the registration of previous repository names that had more than 100 clones in the week before the owner's account was renamed or deleted. The 100-clone security measure, though, often proves inadequate for repositories hosting actions. When a GitHub Actions workflow uses an action, it downloads a zip of the repository via the GitHub API rather than cloning it. In other words, the downloaded zip file doesn’t contribute to the repository's tally of clones.
Actions written in JavaScript usually involve dependencies maintained by developers who typically use email addresses to sign into NPM. Should attackers acquire a maintainer's email domain, and the maintainer's NPM account lack two-factor authentication (2FA), the attackers could reset the password. This would allow the attackers to create a malicious package version that, when used by the action, will enable the attackers to execute malicious code within GitHub workflows as the action runs.
Workflows receive context about the triggering event, including information like issue title, pull request title, and commit message. Since attackers can control some of these fields, developers should treat them as untrusted input. Workflows that use the untrusted input in bash commands — when using bash’s command substitution, for example — can be vulnerable to command injection.
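To make the injection mechanism concrete, here is a minimal sketch in Python. It assumes a simplified model in which workflow expressions such as `${{ github.event.issue.title }}` are substituted into the script text before the shell runs it. The `render_expressions` helper, the step strings, and the `ISSUE_TITLE` variable name are hypothetical and exist only for illustration.

```python
import re


def render_expressions(script: str, context: dict[str, str]) -> str:
    """Substitute ${{ name }} placeholders into the script text.

    Simplified model: the substitution is purely textual and happens
    before any shell sees the script.
    """
    def replace(match: re.Match) -> str:
        return context[match.group(1)]

    return re.sub(r"\$\{\{\s*([\w.]+)\s*\}\}", replace, script)


# Attacker-controlled issue title containing a Bash command substitution.
context = {"github.event.issue.title": "hello $(id)"}

# Vulnerable pattern: the untrusted value becomes part of the script itself.
unsafe_step = 'echo "New issue: ${{ github.event.issue.title }}"'

# Safer pattern: the script only references an environment variable, and the
# untrusted value would be handed to the step through that variable instead.
safe_step = 'echo "New issue: $ISSUE_TITLE"'

rendered_unsafe = render_expressions(unsafe_step, context)
rendered_safe = render_expressions(safe_step, context)

print(rendered_unsafe)  # the attacker's $(id) is now inside the script text
print(rendered_safe)    # unchanged; no attacker text reaches the script
```

In the first case Bash would evaluate `$(id)` as a command substitution. In the second, the title arrives as the value of a variable, and Bash does not re-evaluate the contents of an expanded variable as code.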
Other attack methods used to compromise repositories include:
Regardless of the initial attack vector, once attackers gain a foothold in an action's repository, they set out to create a worm. The worm allows the attackers to infect any GitHub Actions workflow that consumes this action with malware. But how can the attackers extend their reach and infect more repositories?
After creating the worm, the next step involves finding a path for it to spread. For a GitHub Actions worm, the path travels from action to action, which could involve any of the three types of actions — JavaScript, Docker, or Composite. Additionally, actions can depend on actions in one of two ways. The first way uses composite actions, which combine multiple workflow steps within one action. All action types, though, use an action.yml file to define the action's inputs, outputs and main entrypoint.
In figure 1, a sample action.yml file instructs a composite action to run actions/checkout, a dependency of the composite action.
The second way actions can depend on other actions is through the action’s CI/CD pipeline. It’s common to see actions that use GitHub Actions workflows to build and test their code.
We can see that the workflow file uses the trilom/file-changes-action action during its run, which makes it an implicit dependency of the action.
Note that this dependency isn’t used by the action itself. It runs only in the action’s own workflow, as one step in the normal flow of its CI process.
These two ways for actions to depend on other actions form a tree of dependencies that interconnect the actions in the GitHub Marketplace.
We can now use this knowledge to parse the action.yml files that define actions and the CI workflows of those actions. In this way, we can identify actions dependent on other actions and create the GitHub Actions dependency tree over a Neo4j graph:
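The parsing step can be sketched as follows. This is not the research tooling: it swaps the Neo4j graph for an in-memory dictionary and uses a regular expression in place of a full YAML parser. The sample file contents, the `example-org/sample-action` repository name, and the `@v1` tags are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Matches "uses: owner/repo@ref" and "- uses: owner/repo@ref" lines.
USES_PATTERN = re.compile(
    r"^\s*(?:-\s*)?uses:\s*['\"]?([^\s'\"#]+)", re.MULTILINE
)


def extract_action_repos(yaml_text: str) -> set[str]:
    """Return the owner/repo of every remote action referenced in the text."""
    repos = set()
    for ref in USES_PATTERN.findall(yaml_text):
        if ref.startswith(("./", "docker://")):
            continue  # local actions and container images are not repositories
        parts = ref.split("@", 1)[0].split("/")
        if len(parts) >= 2:
            repos.add("/".join(parts[:2]))
    return repos


def build_dependency_graph(files_by_repo: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each repository to the action repositories its YAML files use."""
    graph = defaultdict(set)
    for repo, files in files_by_repo.items():
        for text in files:
            graph[repo].update(extract_action_repos(text))
    return dict(graph)


# Hypothetical action.yml of a composite action (explicit dependency).
ACTION_YML = """\
name: sample-composite
runs:
  using: composite
  steps:
    - uses: actions/checkout@v1
    - run: echo hello
      shell: bash
"""

# Hypothetical CI workflow of the same action (implicit dependency).
WORKFLOW_YML = """\
on: push
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: changed files
        uses: trilom/file-changes-action@v1
"""

graph = build_dependency_graph(
    {"example-org/sample-action": [ACTION_YML, WORKFLOW_YML]}
)
print(graph)
```

Both kinds of dependency end up as edges from the same repository node, which is what lets the two relationships be merged into one tree.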
A purple node represents a repository. An orange node represents an instance of an action used in another action’s workflow or an action directly referenced in an action configuration file (action.yml).
In figure 3, we see several repos (purple) with a workflow. Each workflow uses an action (orange), and the action is hosted in another repo (purple). The action stored in this repository might have its own workflow, which uses another action (orange), and so on.
We demonstrated how actions can be dependent on other actions. Let’s now explore how attackers can abuse these dependencies to spread their worm.
To secure secrets in GitHub Actions, you can use its encrypted secrets feature, which allows you to define secrets in the organization, repository or environment settings.
When a job starts, the GitHub Actions runner receives all the secrets used in the job. Because the runner receives the secrets when the job starts, we can dump the runner’s memory to reveal all secrets defined in the job even before they’re used. This means that no matter when we achieve code execution by compromising an action used in the job, we can read from memory all secrets referenced in the job.
Also noteworthy, jobs can use a secret called GITHUB_TOKEN — uniquely generated for each workflow run — to allow jobs to authenticate and use GitHub’s API against the repository. Is the GITHUB_TOKEN as accessible as other secrets? We’ll soon find out.
The workflow seen in figure 4 contains two jobs. Each job runs on a different runner and uses a different secret. In the second job, we see that the second step dumps the runner’s memory to retrieve its secrets.
In the decoded base64 that the second job prints, you’ll notice two interesting details:
Incidentally, we often see a GitHub personal access token (PAT) stored as a secret and used by steps in the workflow to perform tasks against the repository.
To understand how we can infect an action’s repository, we need to understand how actions are used.
The common format for calling an action follows {owner}/{repo}@{ref}. The “ref” key has three forms:
1. Reference a commit hash.
2. Reference a branch.
3. Reference a tag.
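The reference format above can be parsed with a few lines of Python. The sketch below assumes that only a full 40-character commit SHA counts as an immutable reference and treats everything else as a mutable branch or tag. The string alone does not reveal whether a name is a branch or a tag, so the code does not try to tell them apart. The `ActionRef` type and `parse_action_ref` function are hypothetical names.

```python
import re
from typing import NamedTuple

FULL_SHA = re.compile(r"[0-9a-f]{40}")


class ActionRef(NamedTuple):
    owner: str
    repo: str
    path: str     # optional subdirectory inside the repository
    ref: str
    pinned: bool  # True only for a full commit SHA


def parse_action_ref(uses: str) -> ActionRef:
    """Split an {owner}/{repo}@{ref} string and flag mutable references."""
    target, sep, ref = uses.partition("@")
    parts = target.split("/")
    if not sep or not ref or len(parts) < 2 or not all(parts[:2]):
        raise ValueError(f"not an owner/repo@ref reference: {uses!r}")
    return ActionRef(
        owner=parts[0],
        repo=parts[1],
        path="/".join(parts[2:]),
        ref=ref,
        pinned=bool(FULL_SHA.fullmatch(ref)),
    )


print(parse_action_ref("actions/checkout@v3"))
print(parse_action_ref("actions/checkout@" + "a" * 40))
```

A reference with `pinned=False` resolves to whatever commit the branch or tag points at when the workflow runs.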
We can use the secrets exfiltrated in the flow to infect the repository with malicious code. Overwriting a commit while keeping its hash the same isn’t possible, so we can’t abuse a commit hash reference. We still have two options, pushing code to a referenced branch or overwriting a referenced tag:
Successfully pushing code also depends on the GITHUB_TOKEN permissions, branch protection rules and protected tags configured in the repository.
You can calculate the GITHUB_TOKEN permissions of a workflow even before running it. Until recently, all GitHub Actions workflows had default read and write permissions against their repositories. In February 2023, GitHub changed the default configuration to read-only permissions on the contents, packages and metadata scopes. The change applies only to new repositories and organizations, so most repositories created before it still have default write permissions.
Besides the permissions granted to the GITHUB_TOKEN in the repository settings, permissions can be overridden by configuring them inside the workflow file.
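One way to reason about the resulting permissions is sketched below. It is a simplified model, not GitHub's implementation, and rests on three assumptions: a job-level `permissions` block takes precedence over a workflow-level one, a workflow-level block takes precedence over the repository default, and any scope left out of an explicit block falls back to no access. `SCOPES` is an illustrative subset, and all function names are hypothetical.

```python
from typing import Optional, Union

# Illustrative subset of GITHUB_TOKEN scopes, not the complete list.
SCOPES = ("contents", "issues", "packages", "pull-requests")

PermissionBlock = Union[str, dict]


def default_permissions(restricted: bool) -> dict:
    """Repository default: read/write everywhere, or the restricted default."""
    if restricted:
        return {s: "read" if s in ("contents", "packages") else "none" for s in SCOPES}
    return {s: "write" for s in SCOPES}


def expand(block: PermissionBlock) -> dict:
    """Expand a `permissions` value from a workflow file into a per-scope map."""
    if block == "read-all":
        return {s: "read" for s in SCOPES}
    if block == "write-all":
        return {s: "write" for s in SCOPES}
    # Explicit mapping: scopes that are not listed get no access.
    return {s: block.get(s, "none") for s in SCOPES}


def effective_permissions(
    repo_restricted: bool,
    workflow_block: Optional[PermissionBlock] = None,
    job_block: Optional[PermissionBlock] = None,
) -> dict:
    """Resolve the token permissions for a job, most specific setting first."""
    if job_block is not None:
        return expand(job_block)
    if workflow_block is not None:
        return expand(workflow_block)
    return default_permissions(repo_restricted)


# A restricted repository whose workflow file asks for write access to contents.
print(effective_permissions(True, workflow_block={"contents": "write"}))
```

Under this model, a workflow file can grant write access on the contents scope even in a repository that uses the restricted default, which is why the repository setting alone does not settle whether a push would succeed.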
A real worm doesn’t need to know the permissions, since it simply tries to infect any repository it encounters. Our research, however, was limited to static analysis, so we wanted to discover the permissions granted to the GITHUB_TOKEN in each workflow. To do that, we examined the workflow run log.
We can see the write permission on the contents scope — and many other granted permissions — and use them to push code to the repository.
Let’s pause and look at what we have so far.
Imagine what would happen if we did this recursively. We have the potential to create a GitHub Actions worm across the Actions dependency tree.
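The recursive spread can be modeled as a breadth-first traversal over the reversed dependency graph, where an edge points from an action to the repositories that consume it. The sketch below uses made-up repository names and assumes, optimistically for the attacker, that every hop succeeds. In practice each hop also depends on token permissions and protection rules.

```python
def infection_reach(dependents: dict[str, set[str]], start: str) -> list[set[str]]:
    """Group the repositories reachable from `start` by number of hops."""
    seen = {start}
    frontier = {start}
    waves = []
    while frontier:
        next_wave = set()
        for repo in frontier:
            for consumer in dependents.get(repo, ()):
                if consumer not in seen:  # skip repositories already infected
                    seen.add(consumer)
                    next_wave.add(consumer)
        if next_wave:
            waves.append(next_wave)
        frontier = next_wave
    return waves


# Hypothetical graph: action -> repositories whose workflows use it.
dependents = {
    "org-a/base-action": {"org-b/build-action", "org-c/app"},
    "org-b/build-action": {"org-d/service", "org-c/app"},
    "org-d/service": {"org-a/base-action"},  # a cycle back to the start
}

waves = infection_reach(dependents, "org-a/base-action")
for hop, repos in enumerate(waves, start=1):
    print(hop, sorted(repos))
```

Tracking visited repositories keeps the traversal finite even when the dependency graph contains cycles.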
Now that we have all the pieces of the puzzle, we can start scanning targets for dependencies vulnerable to exploitation.
First, we gathered two sets of targets:
To create a Neo4j graph, we then automated a process that accomplished the following for each target:
Figure 11 depicts an attack graph of repositories we can’t disclose. In this graph, you can see two viable initial attack vectors to execute on two repositories. These two repositories are actually the same repository but moved from one org to another. By attacking the initial repository, we can directly infect 18 actions that depend on it. From infecting these 18 actions, we can infect 72 of the target repositories.
The scale of this infection chain is larger than presented in the graph. We’ll explain why through a practical, real-world example.
We reported the issues we found to all vulnerable projects we could contact. Hangfire and Veracode allowed us to publicly disclose their cases.
In the figure 12 attack graph, you can see the HangfireIO/Hangfire (8.4k ⭐) public GitHub repository at the bottom-left side. This repository used two actions in one of its workflow files: veracode/veracode-pipeline-scan-results-to-sarif and papeloto/action-zip. The veracode action also used papeloto/action-zip in its workflow file.
The papeloto/action-zip action was moved from its original repository to the vimtor organization, and the papeloto organization was available for registration, making the action-zip repository vulnerable to repojacking. Our team registered the organization to prevent malicious actors from exploiting this vulnerability.
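A first-pass check for this condition is to ask whether the owner named in each action reference still exists. The sketch below uses GitHub's public `GET /users/{username}` REST endpoint, which also resolves organization names, and treats a 404 response as a missing account. The function names are hypothetical. A missing account is only a candidate: GitHub may have retired the name, and unauthenticated API calls are rate limited.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable


def owner_exists(owner: str) -> bool:
    """Return True if a GitHub user or organization with this name exists."""
    url = f"https://api.github.com/users/{owner}"
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # rate limiting and other errors are not evidence either way


def find_missing_owners(
    action_refs: Iterable[str],
    exists: Callable[[str], bool] = owner_exists,
) -> list[str]:
    """List owners referenced by `uses:` strings whose accounts are missing."""
    owners = {
        ref.split("/", 1)[0]
        for ref in action_refs
        if "/" in ref and not ref.startswith(("./", "docker://"))
    }
    return sorted(owner for owner in owners if not exists(owner))


# Offline usage with a stubbed lookup that mimics the state described above,
# in which the papeloto name was unregistered.
refs = ["papeloto/action-zip@v1", "actions/checkout@v3", "./local-action"]
print(find_missing_owners(refs, exists=lambda owner: owner != "papeloto"))
```

Passing the lookup as a parameter keeps the scan testable without network access and lets a caller substitute an authenticated client.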
By repojacking this repository, we successfully attacked the HangfireIO/Hangfire repository directly. Infecting the veracode/veracode-pipeline-scan-results-to-sarif repository achieved the same ends.
The potential scale of this vulnerability is massive. We can attack the veracode repository’s 1,600 dependents, represented in figure 14 by orange circles. The action-zip action has about 600 dependents that will execute our malicious code. But that’s not all.
The Hangfire repository deploys a NuGet package that has 9,400 daily downloads we can attack. Additionally, the dependents have dependents, and the NuGet package consumers likely have their dependents, so the infection chain continues ad infinitum. That’s still not all, though.
We’re talking only about the public repositories we analyzed in the GitHub public ecosystem. A real worm would run on a vast number of private repositories — and impose an immediate impact. If the worm encounters a private repository granting minimal read-only permissions to the GITHUB_TOKEN, it could steal source code. If the repository’s code can be modified, the attack escalates. Disastrous best describes the scope of this attack scenario.
Now take all of that and imagine multiple possible attack graphs like the one in figure 14.
Veracode and Hangfire acted on the findings we reported:
We created a closed demo environment to show how a GitHub Actions worm takes advantage of the methods used to spread malware to any infectable repository across the GitHub Actions dependency tree.
Our demo includes four repositories:
rev-action is the weakest link of the chain. The attacker had previously compromised the repository to gain write access using an initial attack vector.
Video: The infection action chain demonstrated
As seen in the video showing the infection chain, the attacker infects the rev-action action by committing code to its action.yml file, which downloads and executes the worm.
In software development, the concept of dependencies is well understood. When you build a software application, you rely on a variety of software components such as libraries, frameworks and tools. Development teams can track and manage these dependencies using a software bill of materials (SBOM).
The same concept applies to pipelines. A pipeline is a set of steps used to automate a task, such as building, testing and deploying software. Pipelines can also have dependencies in the form of other pipelines, tools and services.
As in software development, the dependencies of pipelines can pose security risks. If one of the dependencies up the dependency chain is compromised, it could affect the entire pipeline. This is why teams need to track and manage the dependencies of pipelines as carefully as they would track and manage the dependencies of software applications.
Multiple security controls can prevent or raise the difficulty of successfully attacking repositories using the worm. In order of effectiveness, controls include:
The CI/CD attack surface has changed considerably in recent years, making it challenging to know where to get started with CI/CD security. If you’re looking for practical help, check out the Top 10 CI/CD Security Risks: The Technical Guide.
And if you’re now thinking about Prisma Cloud, take it for a free 30-day test drive and discover the advantage.