Transcript
This transcript was autogenerated. To make changes, submit a PR.
Good evening. My name is Darren Richardson, and I'm here to give you a brief introduction to the threat modeling process and how we may apply it to DevOps platforms.
Also, we'll take a quick look at the STRIDE threat model and how it's changed, or in this case hasn't changed, over time. So we're going to dive straight into the big question: what is threat modeling? The answer to this is simple. Threat modeling is a methodology used to identify, understand, and communicate threats.
At its core, the idea is to understand the flow of data in a system, understand how the trust changes for that data, to define the boundaries where that trust changes, and then to define mitigations for those switches of trust. The threat model represents all this in a structured manner, so data flow diagrams lead into trust boundary diagrams and so on. But in its most condensed form, a threat model is just a structured approach to mapping a system and its security.
And now we've discussed what it is, let's discuss what it is not. And that is: restricted. Threat modeling can be applied pretty much anywhere. You can apply it to software, applications, networks, systems and even business processes. If there is any area
where you want to examine how things are done securely,
you can apply threat modeling. And in our case, we're going to
use a DevOps environment.
There are three aspects of threat modeling. You initially map and diagram the data flow. So, for example, in our system, we're going to be switching between a lot of different tools, and the core of threat modeling is understanding this data flow: knowing how your data moves, how your data switches hands, as it were. Then we break it down into trust boundaries, which are, in short, where the control or ownership of the data changes. And then we apply the threat model to this list of trust boundaries.
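To make those three steps a little more concrete, here's a minimal sketch in Python. It's not from the talk, and every system, node, and owner name in it is illustrative.

```python
from dataclasses import dataclass

@dataclass
class Flow:
    """One hop of data through the system."""
    source: str
    target: str
    owner_before: str  # who controls the data before the hop
    owner_after: str   # who controls the data after the hop

# Step 1: map the data flow (all names are made up for the example).
flows = [
    Flow("developer laptop", "source repo", "developer", "repo admins"),
    Flow("source repo", "CI pipeline", "repo admins", "CI system"),
    Flow("CI pipeline", "artifact storage", "CI system", "operations"),
]

# Step 2: trust boundaries are the hops where ownership changes.
boundaries = [f for f in flows if f.owner_before != f.owner_after]

# Step 3, applying the threat model to each boundary, comes later in the talk.
for b in boundaries:
    print(f"trust boundary: {b.source} -> {b.target}")
```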
We'll move on to a little quote from the Security Development Lifecycle from Microsoft. I like to include this here; it's basically the best summary of threat modeling I think there is, if you had to summarize it in one sentence. So let's move quickly on to STRIDE analysis. Now, STRIDE analysis is the threat model I tend to use, and I use it for a few reasons, but let's quickly dive into it and see what it is. STRIDE itself is a threat model published by Microsoft on April 1, 1999, and it's still in use today.
And I understand that's quite weird. I don't think any of us are thinking, well, I want something secure, so I'm going to use something from 1999. That's just extremely unusual, and the fact it was released on April the first also makes me kind of worry about it. But in my opinion, it's a great model that breaks threats down into six distinct categories, and it's got a good mnemonic, so I like to use it. There are other interesting models you can use that all have really good acronyms. You can use PASTA if you like. You can do a VAST analysis. You can have DREAD, that's always a good one. But I prefer STRIDE.
We're going to quickly go through the six parts of the STRIDE mnemonic. First we have authentication, which is attacked by spoofing: impersonating a user to log in, or generating fake websites, anything to trick people's traffic into going somewhere it shouldn't.
We have tampering, which is an attack against integrity.
So modifications made to a system where they shouldn't have been possible.
Repudiation is the only word on here that's slightly unusual. What repudiation means is the ability to deny that something has happened, so plausible deniability is a great way to describe it. If you keep your logs locally on the server and they are not properly protected, then you are vulnerable to repudiation.
Information disclosure is another big one: an attack against confidentiality. So, access to source code, publishing of customer or client information, or any confidential information, really. Most companies have confidentiality levels set, and if anything that's not categorized as public reaches someone it shouldn't, that's information disclosure.
Denial of service is of course a standard one.
Degrading or denying access to the service through
any means, even cutting cables, when you think about it,
can be considered a denial of service attack.
And then of course the elevation of privilege, or escalation of privilege, whichever term you prefer: an attack against authorization. So allowing the execution of remote code while unauthorized, or, for example, elevating a limited user to admin level, or even picking a lock, because if you're applying this methodology to, for example, a physical premises, then that's another escalation of privilege.
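Each of those six categories lines up one-to-one with the security property it attacks, which is worth keeping at hand while you work. Here's that mapping as a small Python reference table:

```python
# The classic STRIDE mapping: threat category -> property attacked.
STRIDE_PROPERTIES = {
    "Spoofing": "Authentication",
    "Tampering": "Integrity",
    "Repudiation": "Non-repudiation",
    "Information disclosure": "Confidentiality",
    "Denial of service": "Availability",
    "Elevation of privilege": "Authorization",
}

for category, prop in STRIDE_PROPERTIES.items():
    print(f"{category}: an attack against {prop.lower()}")
```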
So before we go on, because threat modeling can get quite complicated, I decided we would start with STRIDE as it was originally intended. We're going to head back to 1999, and we're going to start by making a data flow diagram for a system as it might have been in 1999.
Now, I have a minor caveat here, in that I was twelve years old in 1999, so I was not an IT professional at the time. I was a slightly disobedient secondary school student; I spent most of my time skipping homework and playing video games. So this entire section is based on wild speculation about what I assume the IT business was like in the late 90s, which is based on my personal cultural touch point, which is of course the American sitcom, more specifically Friends and Frasier.
So we're going to assume that every data handover that happened at the time was done in a 90s-style coffee shop. Here is our data flow diagram, circa '99. We have our developer Steve, who keeps his source code on his laptop. He gives the data to Dave, who works in operations, in the 90s-style coffee shop. The data transitions through operations to installation via Dave's car. And I seem to recall that floppy disks were this big; you can see this is Dave driving his floppy disk to the server room or the data center.
The data is installed through the three-and-a-half-inch floppy drive. Maybe I got inches and feet confused with this illustration, I'm not sure. And then the communication interface, HTTP port 80, and our end user Barry can watch highly compressed cat videos over dial-up. This is how I'm going to assume these services existed in the 90s for the sake of the next section, which is trust boundaries.
Now let's pause here a minute and discuss what a trust boundary actually is. There's a wiki definition here of the trust boundary being a term used in computer science, blah, blah, blah. I want to break it down into very simple terms: a trust boundary is where ownership of the data changes, where control of the data changes. You have the installation interface with the three-and-a-half-inch floppy disk, and you have the user interface over HTTP, and as data passes through each of them, the ownership of that data changes.
In most practical cases, that means where data transits. It doesn't always mean that, because you can go very deep in these threat models and, for example, take a look at some kind of single system down at the process level, and at that point the data is not really transiting. But I would go, in this case, with number one here: in most practical places, it means data transit.
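As a tiny sketch of that rule, here's the owner-change test applied to hops like the ones in the '99 diagram. The hop and owner names are my own illustrative labels, not the talk's:

```python
# Each hop: (source, target, owner before the hop, owner after the hop).
flows = [
    ("Steve's laptop", "coffee shop handover", "developer", "operations"),
    ("coffee shop handover", "install interface", "operations", "server"),
    ("install interface", "HTTP port 80", "server", "end user"),
]

# A trust boundary is wherever control of the data changes hands.
boundaries = [(src, dst) for src, dst, before, after in flows
              if before != after]
print(boundaries)  # in this example, every hop crosses a boundary
```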
So when you draw the trust boundaries, you end up with something like this. There is a trust boundary as the data is handed over in our 90s coffee shop. There is another as the data transitions over to the installation interface. And there is a third trust boundary where the end user is connecting to the interface. So this is the context of threat modeling. This is what we care about: these are the places where data shifts and the controller, the owner of the data, changes. And then the next, and final, step in this process is to apply the STRIDE analysis. So, as we've gone through: spoofing, tampering, repudiation, information disclosure, denial of service, and escalation of privilege. These six things are the STRIDE threat model, and all we do here is apply STRIDE. So for each trust boundary, we should look at the inputs and outputs, consider and note potential threats based on STRIDE, and then plan mitigations for the identified threats.
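Mechanically, that last step is just a cross product: every boundary gets walked through every category. A minimal sketch, reusing illustrative boundary names from the '99 example:

```python
STRIDE = ("Spoofing", "Tampering", "Repudiation",
          "Information disclosure", "Denial of service",
          "Elevation of privilege")

# Illustrative boundary names from the '99 example.
boundaries = ["coffee shop handover", "install interface", "HTTP port 80"]

# Every boundary is checked against every category; the threats and
# mitigations are what you fill in as you think each cell through.
checklist = {(boundary, category): {"threats": [], "mitigations": []}
             for boundary in boundaries
             for category in STRIDE}

print(len(checklist))  # 3 boundaries x 6 categories = 18 cells to consider
```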
So I'm not going to do the whole 90s data flow, because I think that would take a bit too much time, and I want to get into actual modern systems. So I've just made up a short example. For example, in the handover boundary here, spoofing: is it possible to establish the identity of either the developer or operations? Maybe with ID cards, but failing that, it would be possible to pretend to be one of them. Information disclosure: they're in a public coffee shop, so it's very easy to overhear what's happening, overhear the installation information, overhear basically everything that's going to happen over the installation process. For tampering, we can look at, for example, the data transiting as ops heads off to do the installation. Does he go straight to the data center and install the software on the server, or does he leave it in his car overnight? Can it be tampered with? Is it possible to distract him? Maybe the Spice Girls walk by, because it's the late 90s, and he's distracted long enough to have a disk switched.
Repudiation is the threat that hasn't changed; in my opinion, it's the only one, and it still relies on logging and everything. And denial of service: say he loses disk two of eight and the installation corrupts. The service is down, and our end user, Barry, cannot watch his compressed cat videos.
So here is how I believe the threat model was supposed to be applied. It probably was a little more complex, but, you know, I think we should hop straight to modern times. So we have a data flow diagram in 2022; now, 23 years later, here we are. It's kind of impossible to sum up how much security has changed over the last two decades. I defy anyone to fit that amount of change into a 25-minute talk.
So if we ask what has changed, I think we can just say everything, except the STRIDE threat model, because it hasn't been updated since its release in '99. So when we ask how this changes the application of it: it simply doesn't. The only thing that happens is it gets more complicated. The transition to modern day is not as jarring as you might initially think when you see this. Now, this is a very simple diagram of a more modern workflow. There are some parts missing here, but this is just a very simplified flow. Don't let the scale fool you: everything we're going to do is exactly the same.
So we have the developer, Steve Jr. Steve's son has joined the game now, and he's developing on his laptop. He pushes his code to GitHub, GitLab, Bitbucket, wherever you store code. We have a pipeline system: Jenkins collects the source code, tests it with SonarQube, secures the pipelines with a couple of open source tools, builds a Docker image, and stores that Docker image in Artifactory. We have some infrastructure as code, an installation interface, maybe a server in the middle here which is pushing out these updates through SSH using something like Ansible, containers which contain the workloads, and a communication interface. And now our end user, 23 years later: Barry is able to enjoy extremely HD cat videos over a 100-megabit connection. So he's extremely happy with this.
The trust boundaries in 2022 are exactly the same. Now, this slide, if you're wondering, is actually exactly the same slide as I showed in the trust boundaries for '99. The reason for this is that this is going out as a video, so I want people to have the references nearby in case they are going back and forward. So here is the trust boundary again: in most practical places, it means data transit. As the data moves and the owner, the controller of that data, changes, that's what we care about. And then, applying this methodology, we end up with something that looks a lot like this.
And I understand this might look a little bit complex initially. We have things we never had before: entirely internal trust boundaries. Here we have a nested trust boundary. We have systems within systems, built upon systems. So it's not just one person handing over code and another person installing it. This is automated infrastructure, this is DevOps, and this is the trust boundary as we would apply it to this system. So, getting to the meat of this conversation: applying the STRIDE analysis, circa 2022, to our system. We're going to go through all of these boundaries, so we'll get started with the source code repository.
The source code repository: obviously, potential IPR. That's what we're worried about here. This is where your DNA is stored; this is where your company's core code is collected. So spoofing any kind of authentication against it might cause problems. Tampering could lead to the modification of source, which is problematic if that source then makes it through to the end product. Repudiation threats, such as logs being deleted, would make changes untrackable, so you have no chain of authority, no way to track what is happening. Information disclosure: IPR leakage, mostly. If someone leaks the source code, loss of reputation, loss of innovation. Then denial of service: service loss prevents builds being committed. And elevation of privilege: unauthorized admin access leads to codebase modification.
The thing we haven't talked about at any point here is mitigations, and we should. Let me clear that up: the threat modeling process should always include mitigations. But these are examples, and these are things that are mostly handled by the source code repository itself. So spoofing: already covered. Tampering: handled by the principle of least privilege. Repudiation: non-changeable logs. All these things are very clear, so in most situations, in this example, the mitigations are quite obvious. But this shouldn't be all you have. You should go as deep as you can. Basically, any concern you have for this particular section should be brought up here. And there's no rule saying it should just be one point: if you have four or five concerns for spoofing or tampering, write four or five concerns here. That's absolutely fine.
Next, pipelines. So this was our pipeline boundary. This contains the pipeline configuration, with read access from Git and write access to binary storage. So theoretically there's a lot of damage that could be done from this point, because, well, one of my colleagues said something the other day that hadn't really struck me before: that Jenkins, or pipelines in general, are just formalized remote code execution. And he's absolutely right. So the ability to edit the code after it's been read from Git, the ability to write additional binaries to the storage: there are a lot of dangerous things an attacker could do if they got hold of the pipelines. So spoofing the codebase could lead to injected code. Tampering: modification of the pipelines. I mean, if you can modify the pipelines, you can build basically anything you want.
Repudiation is the one threat that stays essentially the same throughout the entire time. It's just: log everything centrally, and make sure it's not editable. That's the mitigation for all repudiation threats. If someone can tell me a repudiation threat that is not mitigated by central logging, I would love to hear it, because I've so far not encountered one.
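As a sketch of what "not editable" can look like, here's a minimal hash-chained log in Python. This is my illustration rather than anything the talk names; in practice you'd ship logs to a central, append-only service, but the idea is the same: each entry commits to the previous one, so any edit breaks the chain.

```python
import hashlib
import json
import time

class HashChainedLog:
    """Append-only log where each entry commits to the previous entry's
    hash, so altering or deleting any record breaks the chain."""

    def __init__(self):
        self.entries = []
        self._prev = "0" * 64  # genesis value before the first entry

    def append(self, event: dict) -> None:
        record = {"ts": time.time(), "event": event, "prev": self._prev}
        digest = hashlib.sha256(
            json.dumps(record, sort_keys=True).encode()).hexdigest()
        record["hash"] = digest
        self.entries.append(record)
        self._prev = digest

    def verify(self) -> bool:
        """Recompute the whole chain; False means something was altered."""
        prev = "0" * 64
        for rec in self.entries:
            body = {"ts": rec["ts"], "event": rec["event"], "prev": rec["prev"]}
            digest = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()).hexdigest()
            if rec["prev"] != prev or rec["hash"] != digest:
                return False
            prev = rec["hash"]
        return True

log = HashChainedLog()
log.append({"actor": "jenkins", "action": "pipeline config changed"})
assert log.verify()
```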
Information disclosure: same risks as the source code. It's IPR loss, so losing your intellectual property rights. Denial of service, in acute situations, could prevent the deployment of emergency patches. And elevation of privilege: altered build pipelines, disabled or skipped tests, this sort of thing.
The static code analysis tool: I'll quickly go through this one, because in my opinion it's a, let's say, less than critical area. It only returns pass or fail values, so the best you can do really is force the tests to be skipped.
Now there's an in-pipeline security tool that I want to talk about. Aqua Security put out this tool, Trivy; it's one of my favorite tools for dealing with Docker images as they're being built, but it opens the spoofing angle to an external attack. If you're able to spoof the domains or download the wrong tools, this would give you a lot of power inside the pipeline. Tampering: obviously this is the build step, so any tampering could lead to corrupted software. Information disclosure: it's again more about the IPR here. Denial of service would prevent the security checks, and then you're basically forced into the question: if we can't do the security check, should we skip it, or should we pause the deployment?
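As a rough sketch of that gating decision, here's how a pipeline step might call Trivy and refuse to continue rather than quietly skip the check. The image name is made up, the Trivy CLI has to be installed, and exact flags may vary by version:

```python
import subprocess
import sys

# Hypothetical image name; replace with whatever your pipeline builds.
IMAGE = "registry.example.com/myapp:latest"

# --exit-code 1 makes Trivy return non-zero when findings match --severity.
result = subprocess.run(
    ["trivy", "image", "--exit-code", "1",
     "--severity", "HIGH,CRITICAL", IMAGE]
)

if result.returncode != 0:
    # Pause the deployment rather than silently skipping the check.
    sys.exit("image scan failed or found HIGH/CRITICAL issues")
```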
Storage is another big area. This is where our binaries are stored for deployment, and if we have a compromise here, then someone can upload their own binaries, or maybe even download ours and reverse engineer them. So spoofing: credentials that allow upload and download when we don't want them. Tampering: changes to the images, which could then be distributed. Information disclosure here might be through reverse engineering the binaries or examining the images. Denial of service: if this is our single source of truth for our deployments and we deny service here, then basically all our later containers are stuck, which might lead to some problems trying to update. And then upload of altered images: if you are able to elevate privileges and then upload a new image, that might pose a problem.
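A common mitigation on the tampering side of this boundary is to verify artifact digests before deployment. Here's a minimal sketch; the file name and placeholder digest are illustrative, and the expected value has to come from somewhere you trust, such as a record made at build time, not from the storage you're checking:

```python
import hashlib

def sha256_of(path: str) -> str:
    """Hash the file in chunks so large binaries don't have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Placeholder: in practice this is recorded at build time and fetched
# from a trusted source, not from the storage being verified.
EXPECTED = "<sha256 recorded at build time>"

if sha256_of("myapp.bin") != EXPECTED:
    raise RuntimeError("artifact digest mismatch: possible tampering")
```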
The installation interface is kind of a very basic SSH setup, so standard sorts of threats here: tampering with the SSH config, spoofing the credentials, leaking server information, basic stuff. Also, killing the service prevents any kind of access. Then the container boundary, where obviously, if we install the wrong containers, anything can happen. That could include, say, a coin miner, basically any kind of malicious container we can think of. Tampering: if we're able to download the wrong images. Information disclosure is again leaked IPR. Denial of service: this is the service itself, so deny access to the containers and the service is down. And elevation of privilege might look like container escape, something along those lines.
And then finally we have the communications interface, which is accepting traffic from the whole Internet. So if there is a login, obviously spoofed logins are a concern. Tampering: the site could be tampered with to serve malware or any kind of illicit material. Information disclosure: for example, paths in robots.txt that should be removed; this kind of information could be private. And then elevation of privilege through the login.
So this was the STRIDE analysis. Now, this isn't supposed to be a kind of catch-all for STRIDE. This is more me talking for half an hour to get you thinking about STRIDE, because in your systems, you're going to have a much more realistic use case. You're going to have a better understanding of your data flow, you're going to have a much more in-depth analysis, and you may have many more trust boundaries. So the idea here is not to give, like, an in-depth understanding of an exact model, but just to get you into the right headspace to be able to generate your own. That being said, I like the STRIDE model, but there is a problem.
In my experience, it's extremely effective in certain situations, and in most cases it's great. But STRIDE, in my opinion, has been built to look externally. STRIDE, in my opinion, only seeks external threats, and it makes the assumption that every attacker is an external attacker, that every threat is malicious instead of, let's say, clumsy. But history, which in 1999 they didn't have the benefit of knowing, has demonstrated that not every attacker is external and not every problem is malicious.
So we have these two areas, mishandling and malice, and I don't have a good answer for where they should sit in the STRIDE threat model. Mishandling, misconfiguration, mismanagement: those are generally what lead to the rest of the STRIDE model, really. Most of those issues are problems with configuration or some kind of mishandling. And then internal malice: if you have an employee with an axe to grind, then it's the principle of least privilege that should be keeping them out. But they're going to have a much easier time with threats that you have perhaps not considered, because STRIDE looks externally.
So if you are doing a STRIDE analysis, I would keep these two in mind. I bring this up because every time I've done STRIDE, I've had this question from someone. Someone has asked malice and mishandling questions: how do we deal with a misconfiguration? Where do we put that? And in truth, I don't have a good answer for you. My recommendation is: keep this in mind, because, as I've written here, 82% of attacks are caused by the human element. So the threat model is great, but make sure you aim the threat model internally, too.
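If you want a concrete way to keep them in mind, one pragmatic option, and this is my suggestion rather than anything the talk prescribes, is to track the two internal concerns as extra rows alongside the six STRIDE categories when you fill in each boundary:

```python
# The six STRIDE categories plus two internal prompts STRIDE doesn't name.
CATEGORIES = (
    "Spoofing", "Tampering", "Repudiation",
    "Information disclosure", "Denial of service",
    "Elevation of privilege",
    "Mishandling (misconfiguration, mismanagement)",
    "Internal malice (insider threats, least privilege)",
)

def boundary_worksheet(boundary: str) -> dict:
    """Empty worksheet for one trust boundary, ready to fill in."""
    return {(boundary, category): {"threats": [], "mitigations": []}
            for category in CATEGORIES}

worksheet = boundary_worksheet("source code repository")
print(len(worksheet))  # 8 categories to consider per boundary
```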
And then, finally, we have the steps of what's next. Now, the obvious thing is to do it all over again, because STRIDE is an iterative process. You do it, you validate it, you do it again. You schedule it every six months, sit down, and ask if your service or platform or whatever has changed enough to warrant a revisit of the threat model. The idea is that this is not something you do once and then forget about; you keep doing it. As your service evolves, you continue to revisit this threat model. Or, if you prefer a reference to 90s sitcoms: join us again in the next season of threat modeling.
Thank you. If you have any comments or questions, I will be lurking around the Conf42 Discord for a while, or you can send me a message on LinkedIn. My information is here; my LinkedIn is LinkedIn.com, greatbushybeard, so feel free to look me up. Bye.