I've always been a little uncomfortable about build systems and languages
that start the build by going out to a package repository and pulling down
the most recent (minor or patch) version of every one of the package's
dependencies, followed by all of their dependencies. The
best-known of these are probably Python's pip package manager,
JavaScript's npm (node package manager), and Ruby's gems. They're quite
impressive to watch, as they fetch
package after package from their repository and include it in the program
or web page being built. What could possibly go wrong?
Plenty, as it turns out.
The best-known technique for taking advantage of a package manager is typosquatting -- picking a name for a malware package that's a
plausible misspelling of a real one, and waiting for someone to make a
typo. (It's an adaptation of the same technique from DNS -- picking a
domain name close to that of some popular site in hopes of siphoning off
some of the legitimate site's traffic. These days it's common for
companies to typosquat their own domains before somebody else does --
facbook.com redirects to FB, for example.)
A few days ago, Alex Birsan published "Dependency Confusion: How I Hacked Into Apple, Microsoft and Dozens of
Other Companies", describing a new attack that relies on two things: the way
package managers like npm resolve dependencies, by looking for and fetching
the most recent compatible version (i.e. with the same major version) of
every package, and the fact that they can be made to look in more than one
repository.
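To make "most recent compatible version" concrete, here is a minimal sketch in Python. It assumes plain `MAJOR.MINOR.PATCH` version strings and treats "compatible" as "same major version, and not older than what was asked for". The function names are mine, and real package managers use richer range rules than this.

```python
from __future__ import annotations


def parse(version: str) -> tuple[int, int, int]:
    """Turn 'MAJOR.MINOR.PATCH' into ints so versions compare numerically."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def latest_compatible(wanted: str, available: list[str]) -> str | None:
    """Return the newest available version with the same major version as
    `wanted` that is not older than `wanted`, or None if there is none."""
    base = parse(wanted)
    candidates = [
        v for v in available
        if parse(v)[0] == base[0] and parse(v) >= base
    ]
    return max(candidates, key=parse, default=None)


# 2.0.0 is skipped (different major); 1.4.3 is the newest 1.x on offer.
print(latest_compatible("1.2.0", ["1.1.9", "1.2.0", "1.4.3", "2.0.0"]))
```

Note that nothing in this rule asks who published the version or where it came from; it only compares numbers.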
Fetching the most recent minor version of a package is usually perfectly
safe; packages have owners, and only the owner can upload a new version to
the repository. (There have been a few cases where somebody has gotten
tired of maintaining a popular package, and transferred ownership to
someone who turned out to be, shall we say, less than reliable.)
The problem comes if, like most large companies and many small ones, you
have a private repository that some of your packages come from. The
package manager looks in both places, public and private, for the most
recent version. If an attacker somehow gets the name and version number
of a private package that doesn't exist in the public repository, they can
upload a bogus package with the same name and a later version. The package
manager sees the bogus package as the most recent one and fetches it
instead of yours.
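The attack can be simulated in a few lines of Python. This is a toy model, not how any particular package manager is implemented: the two dictionaries stand in for the private and public repositories, and `acme-billing` is a made-up private package name.

```python
# Hypothetical repository contents, for illustration only.
PRIVATE_INDEX = {"acme-billing": ["1.3.0", "1.3.1"]}
PUBLIC_INDEX = {"acme-billing": ["1.99.0"]}  # uploaded by the attacker


def version_key(version):
    """Turn 'MAJOR.MINOR.PATCH' into a tuple of ints for numeric comparison."""
    return tuple(int(part) for part in version.split("."))


def resolve(name, major, indexes):
    """Naive resolver: pool candidates from every index and take the highest
    version with the requested major number, wherever it came from."""
    candidates = []
    for index_name, index in indexes.items():
        for version in index.get(name, []):
            if version_key(version)[0] == major:
                candidates.append((version_key(version), version, index_name))
    if not candidates:
        raise LookupError(f"no {major}.x version of {name} found")
    _, version, source = max(candidates)
    return version, source


# With both indexes configured, the attacker's higher version number wins.
print(resolve("acme-billing", 1, {"private": PRIVATE_INDEX, "public": PUBLIC_INDEX}))
# With only the private index configured, the real package is chosen.
print(resolve("acme-billing", 1, {"private": PRIVATE_INDEX}))
```

The attacker never has to touch the private repository; a bigger version number in the public one is enough.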
It turns out that the names and versions of private packages can be
leaked in a wide variety of ways. The simplest is to look in
your target's web apps -- apparently it's not uncommon to find a copy of a
`package.json` left in the app's JavaScript by the build process. Birsan
goes into detail on this and other sources of information.
Microsoft has published "3 Ways to Mitigate Risk When Using Private Package Feeds", so that's a
good place to look if you have this problem and want to fix it. (Hint:
you really want to fix it.) Tl;dr: by far the simplest fix is to
have one private repo that includes both your private packages,
and all of the public packages your software depends on. Point
your package manager at that. Updating the repo to get the most
recent public versions is left as an exercise for the reader; if I were
doing it I'd just make a set of dummy packages that depend on them.
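As a rough sketch of the single-repo idea, again in Python with dictionaries standing in for repositories: the mirror starts from the private packages and copies in only the public packages you explicitly list. Skipping any listed name that collides with a private package is my own assumption about how you'd want this to behave, and all the package names are hypothetical.

```python
def build_mirror(private_index, public_index, public_allowlist):
    """Build the one repository the package manager will be pointed at."""
    # Private packages go in first, exactly as they are.
    mirror = {name: list(versions) for name, versions in private_index.items()}
    for name in public_allowlist:
        if name in private_index:
            # Assumption: a private name is never filled from the public side.
            continue
        mirror[name] = list(public_index[name])
    return mirror


# Hypothetical contents, for illustration only.
PRIVATE = {"acme-billing": ["1.3.0", "1.3.1"]}
PUBLIC = {
    "some-public-lib": ["2.1.0", "2.2.0"],
    "acme-billing": ["1.99.0"],  # attacker's upload
}

mirror = build_mirror(PRIVATE, PUBLIC, ["some-public-lib", "acme-billing"])
print(mirror["acme-billing"])     # only the private versions
print(mirror["some-public-lib"])  # copied from the public side
```

Since the package manager only ever sees the mirror, a bogus public upload has no path into the build unless someone copies it in.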
Happy hacking!
Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).