Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello and welcome to Securing the Endpoint with Open Source
Software. This talk is going to be a little bit of a
reinterpretation of DevSecOps, which is commonly focused
on bringing security practices to developers.
What I'd like to do in this talk is a little bit of bringing
developer practices and DevOps practices to
security. So before we get started with the talk,
I'll tell you a little bit about myself. I'm the CTO and co-founder
at Fleet, and I'm a co-creator of osquery and on the steering committee
for that project. Those are both projects that we'll be talking
about more later in this talk. But before we get
going on the talk, I'd like to make sure that we're clear on some
definitions. In particular, what do we mean by endpoint?
There are a lot of definitions that folks use for this.
So an endpoint is basically a computing
device. It could be one of these macOS laptops,
it could be a desktop computer running macOS,
Windows, or Linux, and it could even be a Chromebook.
Any sort of end user computing device could be
considered an endpoint. And for our purposes, we
think of servers as endpoints as well,
and containers. These can
also be endpoints. And the strategies and techniques that we talk about
using here can potentially be applied to any of these environments.
And that also includes environments like
operational technology, the control planes running
the robots inside factories,
and IoT sorts of devices such as the
Raspberry Pi. This software can potentially run on and manage
all of these things. So first, I'll talk about osquery,
which is the agent that we use to collect data from all
of these different endpoints. So osquery lets us write queries
to collect logs on the state of endpoints, on how that state is
changing, and also on the events taking place on
those endpoints. It supports macOS,
Linux, and Windows, which covers a lot of the computing environments
that I just mentioned. And we'll
talk a little bit more about Chromebooks later as well.
And osquery enables non-developers to access
and aggregate data from disparate sources across these systems.
And we'll talk a bit about how it uses SQL to do that.
And osquery was designed explicitly with
the goals of performance and reliability, so that we are able to
deploy it across both the corporate and production parts
of an organization's infrastructure. And osquery is
fully open source, licensed with an MIT license
that allows users to do essentially
whatever they'd like with it and the source code. And I mentioned
SQL. So here's a very basic example of an osquery query.
So SELECT * FROM users will give us, across all three of
the different supported systems, the information about the user accounts
that exist on those systems. And there are a
huge number of data sources that are available in osquery.
For example, there are things like the /etc/hosts file,
the crontab file, and the known_hosts file:
flat files that can be parsed and typically have specific configuration
formats. There are SQLite files, which are becoming
increasingly common on systems to store configuration
and state for applications. There is the data available
from system APIs, for example the Apple
System Log and the keychain on macOS. These are
common sources, and there are certainly many other
system APIs on Windows and Linux that are useful and are also
abstracted into this SQL concept with
osquery. There are application APIs that are exposed, such
as Docker's API and the Carbon Black APIs;
many applications are exposing these APIs on local
systems. And then we have access to
event-based APIs that are exposing lots of information, such as
FSEvents, which can be used for file integrity monitoring purposes,
the BPF subsystem on Linux and
the older audit subsystem, and Windows events
and other systems like that on Windows.
And then there's a lot of data that we might get just as metadata from
the file system, such as information about shared folders,
hashes of files, and the permissions set on files. These can all be interesting
and relevant pieces of information to have from a security perspective.
And then there are more specific file formats such as plists
on macOS, which come in both XML and binary forms.
And again, osquery is just a nice way to abstract all of those. And these are all
abstracted under that same SQL interface.
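As a rough illustration of that SQL interface, here is a minimal Python sketch. It does not talk to osquery at all: it uses Python's built-in sqlite3 module with a hand-made stand-in for the users table. The three columns are only a subset of what osquery reports, and the sample rows are made up for illustration.

```python
import sqlite3

# Stand-in for osquery's `users` table. In osquery this is a virtual table
# backed by the operating system; here it is an ordinary in-memory table
# holding a few made-up rows and only a subset of the columns.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE users (uid INTEGER, username TEXT, shell TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        (0, "root", "/bin/sh"),
        (501, "alice", "/bin/zsh"),
        (502, "bob", "/bin/bash"),
    ],
)

# The basic query from the talk, with an ORDER BY added so the output is stable.
rows = [dict(r) for r in conn.execute("SELECT * FROM users ORDER BY uid")]
for row in rows:
    print(row)
```

The point of the sketch is only that the same SELECT reads the same way regardless of where the rows come from, which is the abstraction osquery provides across macOS, Linux, and Windows.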
And one of the pieces of value that we get from that is that we
can start to combine together the tables that are available
from each of these sources. And so for example here we
can take a query that joins the processes and
process_open_sockets tables, and it does that
by matching up the rows
where the two tables share the same PID.
And then we can do filtering as well in the SQL
query. And in this case, what we're doing is looking for sshd processes
that are listening on a port that's not port 22.
And essentially we could interpret this as sshd
running on a nonstandard port. Now next I'll
talk about Fleet, which is a system that allows
us to package, deploy and manage osquery at scale.
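For reference, the sshd join described just above can be sketched as follows. The exact SQL from the slide is not in this transcript, so the query text is an assumption about its general shape. The tables are sqlite3 stand-ins with made-up rows, and the local_port column name is assumed to match osquery's process_open_sockets table.

```python
import sqlite3

# Stand-ins for two osquery tables, populated with made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE processes (pid INTEGER, name TEXT, path TEXT);
    CREATE TABLE process_open_sockets (pid INTEGER, local_port INTEGER);

    INSERT INTO processes VALUES
        (100, 'sshd', '/usr/sbin/sshd'),
        (200, 'sshd', '/tmp/sshd'),
        (300, 'nginx', '/usr/sbin/nginx');
    INSERT INTO process_open_sockets VALUES
        (100, 22), (200, 2222), (300, 443);
    """
)

# Join the two tables on PID, then keep only sshd processes whose
# socket is bound to something other than port 22.
QUERY = """
SELECT p.pid, p.name, p.path, s.local_port
FROM processes AS p
JOIN process_open_sockets AS s ON p.pid = s.pid
WHERE p.name = 'sshd' AND s.local_port != 22
"""

suspicious = conn.execute(QUERY).fetchall()
print(suspicious)  # [(200, 'sshd', '/tmp/sshd', 2222)]
```

Only the sshd process on port 2222 comes back. The sshd on port 22 and the nginx process are filtered out by the WHERE clause.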
So remember, osquery is our agent. Fleet is essentially
our coordinator for the agent. It helps us manage these
agents across thousands, tens of thousands, and hundreds of thousands
of machines, and it helps us drive insights out of the data available
with osquery. Fleet can run live queries,
detect vulnerable software, detect compliance
with organizational policies, and trigger
automations. Fleet also allows us to configure
scheduled queries via configuration as code, so
the queries that we were just looking at can be run on intervals
and the resulting logs shipped into our logging pipelines.
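To give a feel for what ends up in the logging pipeline, here is a small Python sketch that parses one result log line. The line is made up, and its shape is assumed to follow osquery's differential result logs (a query name, a host identifier, an action, and a columns object with string values). The query name and host name are hypothetical.

```python
import json

# One made-up log line in the assumed shape of an osquery differential
# result log. All values here are illustrative, not real output.
SAMPLE_LOG_LINE = json.dumps(
    {
        "name": "sshd_nonstandard_port",  # hypothetical query name
        "hostIdentifier": "laptop-01.example.com",  # hypothetical host
        "calendarTime": "Mon Jan  1 00:00:00 2024 UTC",
        "unixTime": 1704067200,
        "action": "added",
        "columns": {"pid": "200", "name": "sshd", "local_port": "2222"},
    }
)


def summarize(line: str) -> str:
    """Turn one JSON log line into a short human-readable summary."""
    record = json.loads(line)
    cols = record["columns"]
    return (
        f'{record["hostIdentifier"]}: {record["action"]} '
        f'{cols["name"]} on port {cols["local_port"]}'
    )


summary = summarize(SAMPLE_LOG_LINE)
print(summary)  # laptop-01.example.com: added sshd on port 2222
```

Because each line is self-contained JSON, any destination that accepts text can store it, and downstream tools can key off the query name and the columns.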
And this is all also available via API.
So I think that an important part of bringing developer
concepts into the security realm
is this configuration as code, with everything available via
API. These allow us to build the kind of automations that
are richer, more robust and more forward-looking.
And I mentioned that we can get logs to our logging destinations. Commonly these
are things like S3, Elastic, Splunk, and
Snowflake, and potentially any logging destination
is viable as long as there's some way to get text
(in fact, these are JSON-based logs) into the
system. And as a bonus,
Fleet also includes, as I mentioned, support for Chromebooks.
So Fleet has an open source Chrome extension that essentially
mimics osquery's functionality and provides that
same SQL interface on the information
provided by the ChromeOS APIs.
And Fleet is open core, so part of it is licensed
with an MIT license and then part is available
only on an enterprise license. Everything that I'll talk about in this talk
today is available in the fully open source,
MIT-licensed portion of Fleet. So this can all be taken and
applied immediately. And just for an example,
here is some of the Fleet user interface, in which
we can take an osquery query and
we can save that query,
check compatibility and generally get a friendlier
UI on top of what we're learning
from osquery. And this is more of the Fleet user interface.
This is what you see when you get a host enrolled into
Fleet. There's a whole bunch of information that's collected
by default, and this can become a great baseline
for understanding the data that's available from osquery and
for starting to understand some of the concepts that are exposed by Fleet.
So for example, we get the software inventory
collected from the host, we get the policy compliance, and here
in this example, this host is failing two policies.
We also get inventory of software across
the entire organization or all of the hosts that are enrolled.
We can filter that software across multiple axes,
but right now in this case, we've got it filtered by software that's
vulnerable. And so we can see that
we have some Google Chrome instances that probably need updates here
because they've got some CVEs associated
with them. And we talked about policies a
little bit, but again, Fleet provides a way to
define the organizational policies that we have and allows us
to keep track of compliance across our hosts. This is also
a good example of where automations can be enabled so that we can
start triggering other systems to respond
to policies that have failed.
And what I'd like to do now is show a bit of a demo of
what a sort of modern configuration as code practice
could look like with Fleet.
So in this case we've put up a pull request that adds a detection
using the osquery query that we talked about, looking for SSH
processes listening on something other than the standard port 22.
And when we come over and look at this, we can see that there's a
YAML file that defines the query,
with the name and description and the query SQL that
we looked at just a few slides ago,
and we have configured this to run on a ten second
interval and turn automations on so that we can get
those logs into our logging pipeline. Now in this example,
I've also used another tool to build a detection
on top of that. That tool is Matano and that allows us to
trigger alerts anytime that logs are generated
from this query. So in this case we've also
configured the further
details about the alert that we want to fire off here.
And essentially we want to investigate whether this
nonstandard SSH is an intended practice or
possibly some malicious activity that could be happening on
the system. And because of all this CLI
and API support built into Fleet, we're able to
configure all of these things through a GitOps
workflow. So in this case, I've requested
review now from someone else on the team, and this is
going to generate an audit trail that allows us to keep track
of why changes were made, who made them
and who approved them. So I've switched over
to a different browser where I'm
logged in as the reviewer. And in this example I
can now take a look and provide my
review. I'd probably be looking in this case to ensure
that this is going to be generating what we think will be a low
number of false positives and a low number of false negatives
so that we're getting a very high fidelity detection
in place here. And when I submit this review that
will allow the pull request to be merged because we've configured our repository
to only allow approved pull requests to
be merged. And what I really want to highlight here, now that
we've got this pull request approved
and we're able to merge it, is the CI actions
that enable this GitOps workflow. I think those are the really powerful
thing here. So we'll come down here and
we'll merge our pull request and that's going to kick
off the CI. We click through GitHub's interface
to pull up the actions that are now running.
And we've configured our repository to apply these
new configurations as soon as they are merged to our main branch.
So when we click into this we'll be able to see the status of the
job. And essentially our CI action just installs
the tools and then it applies the
configurations into both Fleet and Matano.
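The two pieces of configuration applied here can be sketched roughly in Python. This is not the content of the pull request from the demo: the query name, description, and SQL are illustrative, the spec field names (interval, automations_enabled) are assumptions about how the YAML file is laid out, and the record shape passed to detect is hypothetical.

```python
import json

# Roughly what the query definition might contain, written as a Python
# dict instead of YAML. Field names and values are assumptions.
QUERY_SPEC = {
    "apiVersion": "v1",
    "kind": "query",
    "spec": {
        "name": "sshd on nonstandard port",  # hypothetical name
        "description": "Finds sshd processes bound to a port other than 22.",
        "query": (
            "SELECT p.pid, p.path, s.local_port FROM processes AS p "
            "JOIN process_open_sockets AS s ON p.pid = s.pid "
            "WHERE p.name = 'sshd' AND s.local_port != 22"
        ),
        "interval": 10,  # seconds between runs, as in the demo
        "automations_enabled": True,  # ship results to the log pipeline
    },
}


def detect(record: dict) -> bool:
    """Alert on any log record produced by the query above.

    The record shape (a top-level "name" key holding the query name)
    is an assumption made for this sketch.
    """
    return record.get("name") == QUERY_SPEC["spec"]["name"]


print(json.dumps(QUERY_SPEC, indent=2))
print(detect({"name": "sshd on nonstandard port"}))  # True
print(detect({"name": "some other query"}))  # False
```

Keeping both the query and the detection as files in one repository is what lets a single reviewed pull request change them together.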
So effectively what we've done is we've used configuration as
code practices to build our security detections.
And this enables all of the advantages that hopefully many of
you are sold on already from your
familiarity with DevSecOps practices. And so
now we'll talk a bit about what deployment of these tools looks like.
So Fleet is typically deployed in AWS
via Terraform that Fleet, the organization, provides,
but it can really be deployed to any suitable infrastructure by any suitable
means: any place where we can run MySQL and Redis
and where we can run a Linux server binary. So it's a
pretty minimal set of requirements. Fleet also does provide
a SaaS offering of this, but mostly we're focused on the open source version
in this talk. And then you can expose it to the public Internet
or not. And the considerations around this are essentially
whether you have workstations that will be off of a VPN and that
you want to be able to reach the server
so that they can receive queries and
send logs up. So depending on the kinds of
devices that you want to enroll, you may or may not decide to expose it
to the public Internet. Then you'd want to install the fleetctl
command line tool, which is used for managing the
server and packaging up installation binaries.
There's more about this in the docs at fleetdm.com,
so feel free to check that out. And this is an
architecture diagram of what that deployment looks like. On the top left
we can see the osquery agent, which checks
in via HTTPS with the Fleet server to
find out if there's any work to be done, essentially, and to send any logs
that it's generated. On the bottom left we see the API
clients, which could be the user interface that I
showed earlier, which is a web browser user interface that
uses the same APIs that the fleetctl command line tool uses.
And those are the same APIs that are accessible to
any user of Fleet who wants to write code
or integrations there. And the Fleet
server has its MySQL and Redis dependencies and then
optionally is able to send logs out to any of those logging destinations
that we discussed. Those and more are
available. And for osquery, the deployment essentially
looks like generating the installation packages
via the fleetctl command line tool. That would be MSI on
Windows, PKG on macOS, DEB for our
Debian-flavored Linuxes and RPM for
our Red Hat-flavored Linuxes.
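As a sketch of that packaging step, the Python below only builds and prints the fleetctl package command for each platform. It does not run anything. The server URL and enroll secret are placeholders, and the flag set shown is a minimal assumption; check the Fleet docs for the options your version supports.

```python
import shlex

# Placeholder values; substitute your own server URL and enroll secret.
FLEET_URL = "https://fleet.example.com"  # hypothetical
ENROLL_SECRET = "REPLACE_ME"  # hypothetical

# Package type per platform, as listed in the talk.
PACKAGE_TYPES = {
    "Windows": "msi",
    "macOS": "pkg",
    "Debian-flavored Linux": "deb",
    "Red Hat-flavored Linux": "rpm",
}


def package_command(pkg_type: str) -> list[str]:
    """Build the argument list for generating one installer package."""
    return [
        "fleetctl",
        "package",
        f"--type={pkg_type}",
        f"--fleet-url={FLEET_URL}",
        f"--enroll-secret={ENROLL_SECRET}",
    ]


commands = {name: package_command(t) for name, t in PACKAGE_TYPES.items()}
for name, argv in commands.items():
    print(f"{name}: {shlex.join(argv)}")
```

Building the commands as argument lists keeps the sketch easy to hand to subprocess.run later, once real values are filled in.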
And then typically you'd install those packages via the standard
management workflows, and that often looks
like Chef for servers and MDM for
workstations. It doesn't really matter how we do this as long
as we get those packages out there, but it could also
be baked into the master virtual
machine or container images instead, so that whenever those VMs
or containers do start up, they are automatically connecting
up to the Fleet server and securing their data as well.
And there's more about this enrollment process
and deployment of osquery, again, in the Fleet docs,
so feel free to check that out there. Hopefully you
found all of this a useful introduction
to the possibilities of using Fleet and osquery,
these open source tools, for building a
more DevSecOps-oriented security
program and bringing some of these interesting DevOps practices
to securing endpoints. Feel free to reach out
to me on any of these platforms and thank you very much for
attending this talk.