Transcript
This transcript was autogenerated. To make changes, submit a PR.
All right, well, welcome everybody. Thanks for joining. As I mentioned,
we're diving into taking your DevOps tools to the dark side and stepping up your security
game in the DevOps space. Quick background
on who I am. I've done a mix of software development and DevOps for the
past 13 years. I started in software engineering doing
support and web development. I moved on to infrastructure
monitoring. I managed CI/CD for
a shop that had over 100 devs, and it was just two
or three of us DevOps guys trying to support 100 of them. So I'm used
to things being busy and moving fast. After that, I moved on to building
a site reliability department at a billion dollar company where
we built a site reliability team. We built a NOC and all
the visualizations and response procedures and so forth from the ground up.
Currently, I'm leading the DevOps team at NetFoundry. We call it
our RAV team, which stands for reliability, automation and visibility.
And I've used a lot of DevOps tools over the years. I would
say I'm generally pretty opinionated about which tools I like, which ones I don't.
And that just comes from having experienced both the gains
and the pain of working with all sorts of different technology, ranging anywhere
from config management, monitoring, automation, you name it.
I love learning new tech. I love constantly growing and evolving
my skill set and just continuing to learn about new tools that
are out there and are available. So let's
talk about zero trust networking. This is a term
that is emerging in the industry, and a lot of people are
throwing this word around. And what is tricky about
it is it's become a marketing term. And so there's a lot of different definitions
out there. And so it's one of those things, it's like DevOps, where it's
almost grown into a term that has no real meaning. But I'd like
to cover how we define it at NetFoundry just because
we're in the business of zero trust networking. So traditionally,
with network security, the security
is built around a perimeter based model. It's the
idea that I've got a firewall and I put everything that's important and needs
to be protected inside of that firewall.
It's kind of like a castle model where it's like I've got a castle wall.
Everything inside it I just consider to be safe. I trust it,
and I don't need to worry too much as long as I've got the firewall
up and my doors closed to things that are
on the outside that are threats. In a zero trust networking
world, what we assume is that the network itself is
already compromised.
There's already bad actors inside the castle walls,
and we should not assume that anything inside those walls
is safe. And so it's assuming compromise in those resources
already, and it forces you to think about security differently in
terms of what really needs access to what.
And let's not assume that these resources
inside the perimeter are safe. All right?
So in the DevOps world, in the DevSecOps world,
why do we care about zero trust networking? Why does this matter?
Why is this important? Why are you even here at this talk? So we have
a problem in DevOps, and a lot of us are not necessarily conscious
of it, or we just kind of close our eyes to it and we're not
really thinking about it. But the problem in the DevOps space is
pretty much every single tool that we use is an absolute gold mine
for an attacker. Let's take our CI/CD system.
Well, that is a fantastic target for an intruder
to compromise, because then they can begin to inject code
anywhere throughout the infrastructure, pretty much anywhere where our deployment systems
go. Those are the places that matter in our infrastructure. Imagine if
that was compromised. Now you got somebody that can deploy and execute
their own code anywhere throughout your infrastructure. That's a critical
problem. Let's take monitoring. Well, monitoring is
just a data mining platform for your infrastructure. Anything that's important is
going to be monitored. And so you've got one central stop where all your data
is going to give you a fantastic inventory
with various forensics about your systems and generally IP addresses,
address information. It's a fantastic way to gather
everything important about your infrastructure and store all that data in one place.
Let's talk about ETLs. For those of you who work in data warehousing,
it is a collection of loosely hacked together scripts, typically all
sorts of jobs that mine data from all your important data sources and store that
typically into a data warehouse, which is a one stop shop for all your data.
Again, fantastic target, because this needs all sorts of access
to everything. All of your important data sources are exposed
to your ETL system, to your data warehousing system, and brought together into one place.
If any of those get compromised, somebody's pretty much got your whole infrastructure.
Config management. This one's my favorite because it's your one stop shop,
typically for root access to everything. If you want to take everything down in your
infrastructure or exploit the infrastructure,
plant something, whether it be crypto mining,
a rootkit, you name it, config management is a fantastic target.
Once they get into the config management system, you're done. They've got everything.
And then the last one is developer access management.
Most people that I talk to are not comfortable with how they've
set up their developer access. Typically, people are having
to grant all or nothing. They got to give prod access to it, and there's
not a great way to specify any kind of granularity and so forth.
So if you ask them, are you comfortable with the way all your developers have
access to your systems, most people will give you kind
of a look and shake their head and so forth, or say they're
not comfortable with it. And you wouldn't turn an auditor loose on your
developer access or even your support access half the time. Typically, you grant
a whole lot of access to the people that you know need it, because when
things are broken, you don't want to introduce friction. So what do you do? You throw
the gates wide open, to the point where everybody's a little uncomfortable,
but at the end of the day, you got to get your job done.
All right, so how do we deal with security in this world where at the
end of the day, things move fast, we got to get stuff done, we have
to move forward? Well, we've got audits and so forth
to try and protect our systems, and we
got to pass checkboxes for the security team. And so what
do we do to survive the audits? Well, typically, most places
I've seen, they apply the audits to the production application.
So in DevOps, we're often in charge of deployments and monitoring,
but usually we're monitoring some sort of production system.
And that production system, typically is
where we apply the scope of our audits. And we don't usually
include the DevOps tools and the monitoring tools and things like
that. Those are peripheral systems. And at some point it's fair,
because you have to apply a logical scope to things.
You can't just include your entire ecosystem in a security
audit. It's too hard. You got to separate non prod and prod.
You got to separate CI/CD and so forth. You have to create
some sort of scope as a safety net. I've seen a
lot of liberty taken at some places more than others to where
they will make that scope really narrow to pass the audit. Most places
are not doing a really broad scope that includes all of their DevOps tools because
they're support systems. They're not actually the production applications that are exposed to
the public. And so we do what we need to do to pass the
audit. But really, I want you to think about this as kind of a gut check.
Would you turn a pen tester loose on your monitoring system?
Would you turn a pen tester loose on your CI/CD system or your
data warehousing? Would you be comfortable with
introducing somebody that was looking for exploits throughout your system?
And most people would probably say no.
And the same thing with your developer access. Are we comfortable with the way
that we've locked down their permissions? Most people
are not, because these are things that we use for support. And again,
they're just things that we use to get the job done. But they're not the
things that we want to show to the world we put out there in public.
Because at the end of the day, we need to access systems,
we need to fix things when they're broken, we need to monitor them,
we need to wire systems together. It's a lot of what we do within DevOps.
So what I'm introducing is that we start to think about this problem
differently, the way that
we secure our DevOps tools in particular. Because every one
of them is a gold mine, we got to figure out a way to step
up our game. Because the truth is, in the industry, this is how
people are getting into systems. This happened with, I'm
forgetting the name of it now, but we've seen it where they got in
through the monitoring system, where they injected
exploits through the monitoring system through automatic updates. I've seen
a major breach get in through the CI/CD system. Why? Because it allowed
them to inject code in all sorts of people's infrastructure.
It's just a fantastic way to get in. So how do we secure these things
and lock them down and step up the game and make them more secure?
The context in which I am talking about zero trust and making these things
dark is a tool that I've learned to use called OpenZiti.
It is a zero trust networking platform. It's a
way to connect systems together while stepping up security.
The idea behind it is that,
first and foremost, stop leaving ports open, stop leaving
IPs open. Making it dark means we basically cut
off all ingress. You start with the firewall rule or
security group policy that says no ingress, nothing gets in.
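As a minimal sketch of that starting point, here is one way to check that a host's rules really are "no ingress". The `Rule` shape and the sample rule sets are made up for illustration and do not correspond to any particular cloud provider's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    # Hypothetical rule shape, loosely modeled on a security group entry.
    direction: str  # "ingress" or "egress"
    cidr: str
    port: int


def ingress_rules(rules):
    """Return every rule that allows inbound traffic."""
    return [r for r in rules if r.direction == "ingress"]


def is_dark(rules):
    """A host is 'dark' when nothing at all is allowed in."""
    return not ingress_rules(rules)


dark_host = [Rule("egress", "0.0.0.0/0", 443)]
legacy_host = [Rule("ingress", "0.0.0.0/0", 22), Rule("egress", "0.0.0.0/0", 443)]

print(is_dark(dark_host))    # True
print(is_dark(legacy_host))  # False
```

Outbound rules stay in place on purpose: the host still has to dial out to the mesh, it just never accepts a connection coming in.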
And some people ask about zero trust: VPC peering,
is that zero trust? No. If you've got VPCs peered
together or peering between your private data center and
a cloud data center, if one of those gets compromised, the other one
gets compromised. We got to step up, we got to lock things down tighter than
this, we can't leave it like this anymore. So the key concept with zero
trust is that you get away from IPs and ports entirely.
The only thing that matters in the zero trust world is you've got services
and you've got identities. You've got basically destination
addresses that people need to access. And then you've got some form of identity,
whether it's a cell phone, whether it's a laptop, whether it's a server,
everything's an identity. And the only thing that matters is that certain identities need to
talk to certain services. And so we manage that access through
service policies, or at NetFoundry we call them AppWANs.
So in a zero trust world like this, with no ingress, how do things actually
talk? So what we've got here is
a diagram. So I want you to imagine that this is a diagram
of simply connecting one data center to another data
center without peering them, without compromising both
of them, if you've got a compromised entity. So the idea is that,
imagine the Ziti fabric: you've got this fabric mesh,
it's running in a public cloud. And the idea is that everything dials into
this mesh. And initially it may dial in, it's got a
connection to it, a persistent connection, but nothing can talk
as of yet. You still have to create policies that define
what talks to what. Kubernetes has things like this,
but it only has it within the cluster. This is something that you can put
anywhere, across any cloud, across any region,
across any cluster, anywhere that you need it to be.
You can place this type of mesh and define policies, because again,
everything is just a service and an identity. You define what needs to talk
to what. And zero trust comes in so that,
for example, in this image where I've got a compromised entity,
it's not explicitly defined in the policy, therefore it's not
able to access the private resource on the right. So on the
data center on the right, imagine you've got no inbound ports
open at all. There's no ingress at all into the entire
data center. The only thing that can get in are identities that are explicitly
listed in the policy. So the way a service policy works,
Ziti is built around a concept of using tags to
tie things together. So what I would do is if I've got a set of
identities that I wanted to talk to my data warehouse, I might tag them
with a data warehouse tag. And then I've got various services related
to the data warehouse. I'm going to tag those with data warehouse. All I'm doing
in creating an AppWAN or a service policy is just saying, okay,
this is my policy. Data warehouse talks to data warehouse, and I'm
done. If you've ever worked with Active Directory groups
or permissions for users, it's the same kind of thing. You put them in groups
and that determines their permissions. It's the same kind of thing,
except you're doing it at the network access level.
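To make the tag idea concrete, here is a small sketch of default-deny, tag-based access. The class names, the `can_dial` function, and the sample identities are all hypothetical; this models the concept only and is not the OpenZiti API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identity:
    name: str
    tags: frozenset = frozenset()


@dataclass(frozen=True)
class Service:
    name: str
    tags: frozenset = frozenset()


@dataclass(frozen=True)
class ServicePolicy:
    name: str
    identity_tag: str  # identities carrying this tag ...
    service_tag: str   # ... may reach services carrying this tag


def can_dial(identity, service, policies):
    """Default deny: allow only if some policy matches both tags."""
    return any(
        p.identity_tag in identity.tags and p.service_tag in service.tags
        for p in policies
    )


policies = [ServicePolicy("warehouse-access", "data-warehouse", "data-warehouse")]
etl_worker = Identity("etl-worker", frozenset({"data-warehouse"}))
intruder = Identity("compromised-host")  # no tags, so no policy can match it
warehouse_db = Service("warehouse-db", frozenset({"data-warehouse"}))

print(can_dial(etl_worker, warehouse_db, policies))  # True
print(can_dial(intruder, warehouse_db, policies))    # False
```

Granting or revoking access then comes down to adding or removing a tag, with no IP or port involved anywhere.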
All right, so just how dark can we get with this?
Typically, there's three different models. We call them
network access, host access, and application access.
And each one of these gets a little bit more secure, gets a little bit
more tightly locked down the farther we go. But let's dig into this.
All right. The first one, I would define it as good.
It's the idea of creating controlled access
between data centers or,
we'll say, you can access from anywhere, but you want to grant access inside
of a private data center. You don't want to open up firewalls.
So the way that this would work is that you'd put a tunneler or an
edge router in a data center. There's no ingress and there's
no peering between the two. But it's Ziti that's actually granting access
through that. And so anything that's a
device or a host on the left can talk to specified
services on the right. It's a decent level of access.
It lets you grant access to identities. So at a
previous place that I was at, everything that was important was inside the
private data center. Where it was tricky was when, say we
acquire a new company or we brought in outside contractors,
how do we get them in? Well, typically we'd use something like a VPN, which
was probably the most common method, but these got really difficult to manage,
really clunky very quickly. They weren't very reliable,
and they were very broad in terms of the access they granted.
What Ziti offers as a major improvement with this is that everything
is explicit access, so that if you
just need an identity to be able to talk to one specific service inside
of that data center, you can set up your policy to do that. They only
have access to the things that are specifically granted. You're not exposing
your entire data center in this model. So it's a massive step
up from traditional networking or traditional VPN. This is
totally like a VPN on steroids,
but a lot easier to manage as well, because, again, you're not managing IPs
and ports, you're just creating simple policies that say these
identities can talk to these services. A better model,
we'll call this the host access model. This is where you've
got tunnelers installed on the
hosts or identities themselves. And so this is where
you might have a Ziti tunneler running as like a systemd
service on a Linux box, or it might be running as an agent on a
Windows machine. And you actually set up your services to terminate
on localhost. So the traffic is never going
around unencrypted anywhere. So it terminates on the host, so it gets tunneled
through the mesh, which is completely encrypted. It exits
on localhost. You can use this for SSH access, you can use
this for web access, you name it. If you're running a container,
you'd set this up as a sidecar container. And again, your service is terminating
on a localhost address, so it's never going outside of the host itself.
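On the service side, terminating on localhost just means the service binds to the loopback address instead of all interfaces. The sketch below shows only that bind and uses plain Python sockets; the tunneler that would forward mesh traffic to this address is not shown, and `listen_on_loopback` is a name made up for the example.

```python
import socket


def listen_on_loopback(port=0):
    """Open a TCP listener that is reachable only from this host."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("127.0.0.1", port))  # loopback only, never 0.0.0.0
    server.listen()
    return server


server = listen_on_loopback()      # port 0 lets the OS pick a free port
host, port = server.getsockname()
print(f"service listening on {host}:{port}")
server.close()
```

Because nothing is bound to a routable interface, a scanner elsewhere on the network has nothing to connect to, even before any firewall rule is considered.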
This is a better model because you never have
kind of a period of time where your traffic can
be intercepted unless something is on the host and the host itself
is already compromised. Well, how do we get around that scenario if we need
to go even further? That's where the best model comes in,
and that is a fully application embedded implementation.
One thing that's cool about Ziti, and this is something that's pretty unique to Ziti,
is that there's an SDK so that, if you want to
create full end to end encryption from one application to another,
you can actually do that. Each application becomes an identity,
and it can access a service that is essentially inside
of the other application. You never have a point of unencrypted
traffic ever. You never have to open up any ports ever. There's no
ingress anywhere in this model. And essentially everything is dark the whole
time. This is what we at NetFoundry, and
the developers of OpenZiti, consider to be true
zero trust, where nothing has access. There's never a
point where you trust the traffic to be floating around in the open. The traffic
is always encrypted from the source all the way to
the destination. And this is the ideal scenario in a zero trust world.
All right, so our internal use cases at NetFoundry:
we were challenged to try to dogfood OpenZiti
and figure out how to use it internally
and learn from it. And what are the gains, what are the strengths? What are
the weaknesses? And how do we leverage this
and learn from it? And so I
was admittedly a little bit skeptical when tasked with doing this.
Just as a full disclaimer, I'm not a sales guy. I'm not an OpenZiti
developer, I'm a DevOps guy. And so for me,
I'm constantly trying to keep up with the
rat race of everything that needs to be done and constantly under deadlines. I need
to get things done, I need to wire systems together, I need to automate stuff,
and I need to move fast. That's the nature of my job.
And so when tasked with things like, okay,
we need to lock things down and tighten up security inside,
I'm groaning because historically, this has always been like a really painful
process for me. So I
went and started looking at my systems, taking an honest
look in terms of what do we have and how
can we make our systems more secure, and what systems do we need to
make more secure. So I began to go through the lists,
and these are the lists that we kind of chose internally to try
and dogfood. We did a data warehouse, we did our CI/CD
system, we did our SSH access. With Ziti,
we actually moved away from Slack and used an open source
replacement called Mattermost, where our internal
chat for our company is actually accessible only with Ziti.
Grafana was something that we made dark. It's tied to lots of different data sources
that are important. And so because it's a one stop shop for data, we decided
to lock that down and make it dark. And then we've begun
using Ziti internally for support access to applications.
We're running Ziti in a sidecar so that we can get to
things without opening up additional ports and security groups.
We don't have to make any security group changes. If we need to access something,
we just grant access through Ziti instead. All right,
so my reactions to doing this, like I said, I was skeptical.
I'm not a huge Kool-Aid drinker, even of places that I work most
of the time. Like I said, I'm super opinionated about the tools that I like
and that I don't like. I like tools that allow me to get a lot
of mileage. They allow me to get a lot of things done
quickly, and I can reuse it for lots of different
projects. What I found when I started working
with OpenZiti was just that it
allowed me to massively step up my security game without introducing
all kinds of friction. In previous
places and more traditional networking shops, typically when
we would lock down things and introduce more access controls,
that inevitably meant breaking things and
making our job a lot harder. And what I found
as we started to shift towards this, was that we
could actually do it very easily and we could
actually pre-validate everything ahead of time, because the way Ziti works was
that we would set up all the policies and set up all the networking,
and that everything would work prior to us locking it down.
So even though we were moving towards zero
trust in a brownfield environment where we had existing deployments,
it really wasn't difficult to migrate into it because we
were able to enroll everybody. We got all the identities set up, we got all
the policies set up, and we were actually able to verify that the
zero trust networking was working, using the traffic as an indicator, where
we could actually see: are people migrated, and
are they using Ziti to access things instead? And we could see based on the
traffic that they were. And so what happened on the
switchover day? Typically I was used to this being like a
really fireworks intensive day where there was lots of broken stuff.
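The "use the traffic as an indicator" check can be sketched roughly like this. The connection records and the `overlay` / `direct` labels are invented for the example; a real version would pull them from whatever flow logs or access logs are available.

```python
from collections import Counter


def still_on_legacy_path(connections):
    """Return the users who made at least one direct (non-overlay) connection."""
    by_user = {}
    for user, path in connections:
        by_user.setdefault(user, Counter())[path] += 1
    return sorted(user for user, counts in by_user.items() if counts["direct"] > 0)


# Hypothetical observations: (user, path taken to reach the service)
observed = [
    ("alice", "overlay"),
    ("bob", "overlay"),
    ("bob", "direct"),
    ("carol", "overlay"),
]

print(still_on_legacy_path(observed))  # ['bob']
```

Once that list comes back empty for long enough, closing the old ingress path should change nothing for anyone.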
When we switched over our first tool, which was the data warehouse,
to zero trust, it was a complete non event because we
were able to validate everything ahead of time, and we were able to
see, based on the traffic patterns, that everybody was already using the zero
trust model to access it. And we already
knew that it was working because the traffic was going by. And so
when realizing that this was really not a big deal, that's when we moved
on to our CI/CD system and Grafana and other such
things, to where now that's just how we set up access from day one,
where every new tool that we stand up, every support
system that we stand up, is zero trust from day one. And it really
doesn't make the job any harder, because in terms
of the user's ability to access it, as long as they have
their Ziti agent running on their machine,
the way that they access systems is really no different than it was before.
But the systems, for anybody who doesn't have Ziti, are completely dark.
Most of the time, you can't even resolve the DNS for the addresses that they
need to access, because with Ziti you can use fake
DNS addresses for your intercept, and so you
can have completely phony addresses where you don't even know the actual
IPs and you don't even know the actual addresses of the services that you're accessing.
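A toy model of that intercept idea: the made-up hostname only means something on a machine that holds the mapping. The hostnames, service names, and the `resolve` helper are hypothetical and only illustrate the concept, not how the Ziti agent actually implements it.

```python
# Hypothetical intercept table: made-up hostnames mapped to overlay services.
INTERCEPTS = {
    "warehouse.internal.example": "data-warehouse-db",
    "grafana.internal.example": "grafana-ui",
}


def resolve(hostname, enrolled):
    """Return the overlay service for a hostname, or None if it does not resolve."""
    if not enrolled:
        return None  # without the agent, the name simply does not exist
    return INTERCEPTS.get(hostname.lower())


print(resolve("grafana.internal.example", enrolled=True))   # grafana-ui
print(resolve("grafana.internal.example", enrolled=False))  # None
```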
The other thing I think that I've appreciated with all this too, is that when
I need to set up access to things,
I minimize the amount of red tape that is needed, because I'm no
longer punching holes in firewalls or opening up security groups,
because with Ziti, if anything, you can lock things down further. And so it's
really easy to pass an audit using OpenZiti because you
say, yeah, here's my security audit, my security group
rules and my firewall rules are no ingress,
nothing can get in. So I would feel perfectly comfortable turning
a pen tester loose on my CI/CD system
now because I can look at it and say, yeah, go ahead and try to
attack it. There's no open ports, you can't get in unless you've been issued an
identity and issued trust within the Ziti network.
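A quick way to sanity-check the "no open ports" claim yourself, before a pen tester does, is a plain TCP connect probe. This is a minimal sketch using only the Python standard library; it tests itself against a listener it opens on loopback, and it should only ever be pointed at hosts you own.

```python
import socket


def open_ports(host, ports, timeout=0.5):
    """Return the subset of ports that accept a TCP connection."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            if s.connect_ex((host, port)) == 0:  # 0 means the connect succeeded
                found.append(port)
    return found


# Self-test on loopback: open a listener, probe it, close it, probe again.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen()
live_port = listener.getsockname()[1]

while_open = open_ports("127.0.0.1", [live_port])
listener.close()
after_close = open_ports("127.0.0.1", [live_port])

print(while_open, after_close)
```

For a dark system, the goal is for a probe like this, run from outside, to come back empty for every port.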
So if you're more interested in learning about OpenZiti in terms of
what it can do, this is an open source project. It's at
openziti.github.io. There's also a blog at
openziti.io. If you're interested in more of a cloud hosted solution,
we've got cloud hosted Ziti at nfconsole.io,
and there's a free sign up option, up to ten endpoints,
so that you can try it out, as well as a getting started wizard and so
forth. But the idea is: get started
with it, try it. And when you realize that you can tie systems together very
quickly without friction, it allows you to still
be in a world where you're moving fast, you're tying systems together and you're not
creating this classic security problem and you're not opening up your tools to the world.
So I challenge you. As you work in the DevOps
space and the DevSecOps space, lock down your tools,
treat them as a first class citizen. Stop leaving them open to
the world. The way that people are still getting into systems,
more commonly than anything else, is scan and exploit. When you
leave your IPs open, and when you leave your ports open to the world,
assume somebody's going after them. Even if it's
inside the firewall, assume something's inside
already scanning and exploiting. So step up your security game.
And that concludes my talk for today.
Thanks, everybody, for joining.