For DevSecOps leaders, 2021 will be the year of the open source supply chain attack. It’s already starting, in fact. On January 7, security researchers at Sonatype identified three malicious Java components in the Maven Central repository. Then on January 20, the same research team found three more malicious packages in the Node package manager (NPM) registry. In all cases, the malicious components had names nearly identical to those of common Java and NPM packages.
The bad packages were downloaded thousands of times before they were removed, compromising the open source software supply chain. Such incidents are increasingly common. The practice of deploying poorly vetted code, intentionally or not, is inevitable. We call it shadow code. Shadow code happens when developers include third-party software code in applications without approval or sufficient safety validations. Developers do this for the same reason that they use libraries and other components: it is quicker than writing the functionality from scratch. Often, these components offload compute-intensive or specialized features. Some examples are shopping cart software, payment systems and responsive design. Sometimes, too, developers mistakenly include a fake package whose name closely resembles that of a legitimate one (this is referred to as typosquatting), as in the examples described above.
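To make the typosquatting problem concrete, here is a minimal sketch of how a team might flag dependency names that sit suspiciously close to names it already trusts. It uses plain Levenshtein edit distance, which is only one of several possible similarity measures. The package names, the `possible_typosquats` helper and the `max_distance` threshold of 2 are all assumptions made up for illustration, not part of any real tool.

```python
from typing import Iterable, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def possible_typosquats(
    dependencies: Iterable[str],
    trusted_names: Iterable[str],
    max_distance: int = 2,  # assumed threshold; tune for your own package list
) -> List[Tuple[str, str]]:
    """Return (dependency, trusted_name) pairs that are close but not equal."""
    trusted = list(trusted_names)
    trusted_set = set(trusted)
    flagged = []
    for dep in dependencies:
        if dep in trusted_set:
            continue  # exact match with a vetted name, nothing to flag
        for name in trusted:
            if edit_distance(dep, name) <= max_distance:
                flagged.append((dep, name))
    return flagged


if __name__ == "__main__":
    # Hypothetical package names, invented for this example.
    trusted = ["acme-http", "acme-json"]
    deps = ["acme-http", "acme-htttp", "acme-jsn", "unrelated-pkg"]
    for dep, name in possible_typosquats(deps, trusted):
        print(f"{dep} looks like {name}")
```

A check like this produces false positives for legitimately similar names, so it is better suited to prompting a human review than to blocking a build outright.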
Shadow code, plus a growing reliance on open source, magnifies the security risk and attracts more bad actors. Modern applications are composed of anywhere from 50% to 90% open source code, depending on which code audit you consult. Rapid code iterations make it impossible for security reviews to police all code changes and dependencies. With many open source libraries containing hundreds of internal dependencies, even the best automated code-checking tools cannot keep up.
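The way dependencies multiply is easy to underestimate. The sketch below walks a small dependency graph and counts everything an application pulls in indirectly. The graph, the package names and the `transitive_dependencies` helper are hypothetical; a real project would read this information from its lockfile or package manager rather than a hand-written dictionary.

```python
from typing import Dict, List, Set


def transitive_dependencies(package: str, graph: Dict[str, List[str]]) -> Set[str]:
    """Return every package reachable from `package`, excluding itself."""
    seen: Set[str] = set()
    stack = list(graph.get(package, []))
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # already visited; also protects against cycles
        seen.add(name)
        stack.extend(graph.get(name, []))
    seen.discard(package)  # a cycle could lead back to the root package
    return seen


if __name__ == "__main__":
    # Hypothetical dependency graph: package -> direct dependencies.
    graph = {
        "my-app": ["ui-kit", "http-client"],
        "ui-kit": ["color-utils", "dom-helpers"],
        "http-client": ["url-tools", "dom-helpers"],
        "color-utils": [],
        "dom-helpers": [],
        "url-tools": [],
    }
    direct = graph["my-app"]
    everything = transitive_dependencies("my-app", graph)
    print(f"{len(direct)} direct, {len(everything)} total dependencies")
```

Even in this toy graph, two direct dependencies expand to five packages that need vetting; real graphs are far deeper.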
How Shadow Code and Supply Chain Attacks Play Off Each Other
The compounding risks of shadow code and supply chain attacks are among the most pressing issues that DevOps and DevSecOps teams will face in the coming year. Security researchers at Sonatype tracked a 430% increase in supply chain attacks against 24,000 open source software components in 2020. The report blamed the growth of these attacks on two factors. First, DevOps teams are increasing code velocity to accelerate time to market. Second, security teams are responding to traditional zero-day vulnerabilities more quickly, shortening the window for exploits. Combined, the two factors encourage more supply chain attacks.
Attacking the Modern Extended Software Supply Chain
In modern web applications, most front-end and middleware application code comes from third parties. Some is proprietary, but most is open source, due to cost considerations. Open source component code is a massive attack surface that is often poorly monitored, measured and inventoried. DevSecOps teams should create plans to reduce the impact of these attacks in 2021 and beyond.
Because supply chain attacks are often used either as launch pads for other attacks or to steal sensitive information, the business and financial risk from these compromises is considerable. In its annual “Cost of a Data Breach” report, IBM found that the average data breach cost $3.86 million and took 280 days to identify and contain.
Supply chain attacks are likely an even more expensive subset of attack, due to several factors. Because they take advantage of trusted relationships, these attacks may go undetected for months or even years. Supply chain attacks also tend to be sophisticated and target an organization’s most valuable assets.
Evidence is emerging of coordinated attacks attempting to take advantage of supply chain weaknesses at the same time. Multiple attackers have targeted known users of vulnerable and end-of-life (EOL’d) versions of the online shopping platform Magento, in what are known as Magecart attacks. We suspect that advanced hacking groups are either running the same scanning software packages to identify vulnerable targets, or are buying lists of viable targets via the dark web.
The Open Source Supply Chain Risk
The benefits of using open source software in building web and mobile applications are clear. Development, DevOps and DevSecOps teams can reuse code or libraries maintained by someone else to ship code faster and reduce the amount of code they must maintain.
That reuse also carries risk, and nowhere more so than in JavaScript and Node.js. The meteoric rise of Node.js as middleware has dramatically increased the number of packages and libraries that can be used by developers. NPM, the package manager for Node.js, is the most frequently downloaded piece of software in use today.
The rapid growth in usage of Node.js and JavaScript has created a chaotic environment conducive to supply chain attacks. Attackers can inject code into trusted open source packages in a variety of ways. (To be clear, Java, Python, PHP, Golang and Ruby have the same potential supply chain vulnerabilities. But JavaScript is, by far, the largest target and it dominates the front-end and near-front-end, where user data is easiest to skim.)
Some big, corporation-backed libraries, such as React (backed by Facebook), are well maintained. But many other commonly used libraries are maintained by under-resourced groups of developer volunteers. These unpaid maintainers have little incentive to do the “grunt work” of following rigorous security best practices, such as setting up automated code analysis or responding quickly to reports of security problems.
The cost of running all widely used open source JavaScript libraries through a proper code review would likely be in the tens or hundreds of billions of dollars. Because the maintainers of this code rarely profit from their work, they are not inclined to pay for and set up automated review systems. This is true even when maintainers work at large companies; the sheer amount of work behind maintaining a popular open source project can be backbreaking without major organizational support. Even code reviews and code analysis tools are no panacea. Many popular open source projects change rapidly due to numerous and frequent code commits. So, a clean code review is no guarantee that the component remains safe a week or a month later.
For projects run by unpaid developers, commit access to core repositories is often poorly regulated, because maintainers are happy to get any help at all on their project. Reviews of pull requests may be superficial and miss well-obfuscated attacks. Worse, sometimes developers just get tired and stop maintaining their library. They may or may not notify users. Their ghost library may sit in GitHub or NPM, still downloading properly, but with no one paying any attention to bug reports or other suspicious activity.
The problem of abandoned or outdated code is often underestimated. According to a May 2020 study by Synopsys, 91% of commercial applications contain abandoned or outdated open source components. For these reasons, compromising open source libraries that are abandoned, outdated or lightly policed is a favorite tactic of malicious hackers looking to mount supply chain attacks, and that threat will only grow in 2021.
Learning to Live with Shadow Code and Supply Chain Risk
Getting rid of all shadow code is not an option. There’s just too much of it in use; halting the process entirely would negatively impact development cycles. Restricting developer behaviors in a heavy-handed manner usually ends badly (with more shadow code). Likewise, getting rid of the supply chain risk is impossible. Modern applications are too complex, with dependencies on open source components that themselves contain numerous third-party dependencies.
That doesn’t mean DevSecOps should do nothing. Developers can and should be educated about the risks of shadow code and supply chain attacks. Taking some simple steps to reduce the risk can go a long way. For example, ask developers to create simple checklists for every piece of third-party code they use to ensure the component meets the most basic risk standards. A checklist might include, “recently updated, responds to emails, has more than X downloads.” For open source code, use modern software composition analysis tools, such as those from Snyk and Sonatype, that check open source libraries and can be dropped into the DevOps pipeline as part of the continuous integration process.
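A checklist like the one above can be encoded so that it runs the same way every time. The following is a minimal sketch, not a substitute for a software composition analysis tool. The `ComponentInfo` fields, the one-year freshness window and the 10,000-download floor are assumed values chosen for illustration; each team would set its own thresholds and gather the underlying data from its own sources.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class ComponentInfo:
    """Facts a developer records about a third-party component (hypothetical)."""
    name: str
    last_release: date
    maintainer_replied: bool  # did anyone answer a test email or issue?
    total_downloads: int


def checklist_failures(
    component: ComponentInfo,
    today: date,
    max_age_days: int = 365,      # assumed meaning of "recently updated"
    min_downloads: int = 10_000,  # assumed value for "X" downloads
) -> List[str]:
    """Return the checklist items the component fails; empty means it passes."""
    failures = []
    if today - component.last_release > timedelta(days=max_age_days):
        failures.append("not recently updated")
    if not component.maintainer_replied:
        failures.append("maintainer does not respond to emails")
    if component.total_downloads <= min_downloads:
        failures.append(f"does not have more than {min_downloads} downloads")
    return failures


if __name__ == "__main__":
    # Hypothetical component, invented for this example.
    candidate = ComponentInfo(
        name="ghost-lib",
        last_release=date(2017, 3, 1),
        maintainer_replied=False,
        total_downloads=250_000,
    )
    for problem in checklist_failures(candidate, today=date(2021, 2, 1)):
        print(f"{candidate.name}: {problem}")
```

Passing `today` explicitly keeps the check reproducible, which matters if the result is stored alongside a build or an approval record.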
From the other direction, DevSecOps teams should attack the shadow code/supply chain problem by thinking about what application behaviors might indicate a compromise. To this end, teams should create a list of anomalous behaviors and set up filters or notifications for them. For example, if a sign-up page starts taking consistently longer for users to complete, that might be a red flag for formjacking. As a final backstop, look at using machine learning systems to spot anomalous application behaviors. These systems can monitor all application behaviors, at scale, in a way that humans never can.
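The sign-up page example can be turned into a simple notification rule long before any machine learning is involved. The sketch below compares recent completion times against a known-good baseline and flags the page only when most recent samples are unusually slow, which is one way to read "consistently longer." The function name, the three-standard-deviation cutoff and the 80% fraction are assumptions for illustration, and the timing values are invented.

```python
from statistics import mean, stdev
from typing import Sequence


def is_consistently_slower(
    baseline: Sequence[float],
    recent: Sequence[float],
    z_threshold: float = 3.0,   # assumed: how far above normal counts as slow
    min_fraction: float = 0.8,  # assumed: share of recent samples that must be slow
) -> bool:
    """Return True when most recent samples exceed the baseline cutoff."""
    if len(baseline) < 2 or not recent:
        raise ValueError("need at least two baseline samples and one recent sample")
    # Cutoff is the baseline mean plus z_threshold standard deviations.
    cutoff = mean(baseline) + z_threshold * stdev(baseline)
    slow_count = sum(1 for value in recent if value > cutoff)
    return slow_count / len(recent) >= min_fraction


if __name__ == "__main__":
    # Invented sign-up completion times, in seconds.
    baseline = [30, 32, 29, 31, 30, 33, 28, 31]
    one_slow_user = [31, 29, 40, 30, 32]
    everyone_slow = [41, 44, 39, 42, 40]
    print(is_consistently_slower(baseline, one_slow_user))
    print(is_consistently_slower(baseline, everyone_slow))
```

Requiring a fraction of slow samples, rather than reacting to a single outlier, keeps one slow user from triggering an alert. A slowdown alone does not prove formjacking; it is a prompt to inspect what scripts the page is loading.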
Combined, these defensive practices should reduce shadow code and supply chain risks for DevOps and DevSecOps teams. While thinking about these problems is not fun, a proactive approach to tackling this malicious duo will keep your development process and applications far more secure.