Transcript
            
            
              This transcript was autogenerated. To make changes, submit a PR.
            
            
            
            
              Hi and welcome to my talk entitled Use Falco
            
            
            
              and eBPF to protect your applications. First, who am



              I? I'm Thomas Labarussias. I'm currently OSS and ecosystem



              advocate at Sysdig, the original creator of Falco.



              I was an SRE for over eight years, so I know what it is to



              run stuff in production. I'm also a contributor to



              Falco and the creator of Falcosidekick and Falcosidekick UI, two major



              components of the Falco ecosystem, and you can
            
            
            
              reach me on these social networks if you want.
            
            
            
              First we need to define what runtime security is. What it means:



              runtime security is all the tools and procedures



              you can put in place to secure an application, in a



              container or not, during its lifetime in production.



              It's different from what we currently do in



              our CI pipelines with image scanning.
            
            
            
              It's also different from what we can do with Kyverno or Gatekeeper
            
            
            
              to create policies to enforce good practices
            
            
            
              in our clusters. It's totally focused on what
            
            
            
              happens when your application is serving real



              customers and handling real traffic.



              For that, Falco relies on syscalls.



              Syscalls, or system calls, are basically the way your



              program has to ask the kernel for



              access to resources. For example,



              if your application needs to create a process, access



              the network, or read or write



              a file, it needs to ask the kernel



              for that access, and to ask for it



              you use system calls. Basically you can see
            
            
            
              the system calls as the kernel API.
            
            
            
              If you are familiar enough with the Linux ecosystem,
            
            
            
              you already know about glibc, or musl for Alpine.



              Basically glibc is the



              library used by your applications to



              call the system calls. You can see the system calls



              as an API and glibc as an SDK.
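To make the API and SDK analogy concrete, here is a minimal Python sketch. On Linux, the low-level functions in Python's `os` module go through the C library to the open, write, read, and close system calls. The function name `roundtrip`, the file name, and the message are made up for this illustration.

```python
import os
import tempfile


def roundtrip(message: bytes) -> bytes:
    """Write a message to a file and read it back with low-level calls."""
    # Hypothetical scratch file inside a fresh temporary directory.
    path = os.path.join(tempfile.mkdtemp(), "demo.txt")

    # os.open asks the kernel for a file descriptor (open/openat syscall).
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
    try:
        os.write(fd, message)  # write syscall
    finally:
        os.close(fd)  # close syscall

    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, len(message))  # read syscall
    finally:
        os.close(fd)


if __name__ == "__main__":
    print(roundtrip(b"hello kernel"))
```

Running a script like this under a tracer such as `strace` is one way to see the underlying system calls, which are the same kind of events Falco collects.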
            
            
            
              So, Falco. Falco is a CNCF



              incubation-level project. It's a cloud
            
            
            
              native project in the CNCF landscape for
            
            
            
              securing running applications. Right now it's the most advanced
            
            
            
              threat detection engine you can run inside Kubernetes. eBPF:



              eBPF stands for extended Berkeley Packet Filter.



              It's a Linux kernel feature which allows



              you to run a program in the kernel without



              any change to the kernel code and without



              loading a module, like we did before.



              It reinforces stability and security.



              It's really useful for security, monitoring,



              and troubleshooting. You also have to know that right now the



              core maintainers of Falco are developing a new



              Falco eBPF probe. Basically the features
            
            
            
              will be exactly the same as the current ones,
            
            
            
              but it will also use the CO-RE paradigm: compile



              once, run everywhere. Right now you need to



              build the eBPF



              probe for the exact version of your kernel. In the future, since



              kernel version 5.8,



              you will use the same probe for any kernel.
            
            
            
              You just have to download it or build it only once and it will
            
            
            
              run everywhere. The eBPF probe does



              the collection of events. Basically, in the



              eBPF world you have hooks. Hooks are endpoints.



              For example, you can attach



              your probe to a hook and collect events. These events can



              be syscalls. They can be related to the file system, they can be related



              to the network, almost anything. If a hook is not
            
            
            
              already there by default, you can create your own.
            
            
            
              It's really convenient. And to ensure



              stability and security, all the code



              you write for your eBPF probe will be verified



              by the Linux kernel. So you code your



              probe with everything: the
            
            
            
              hook you want to use, the data enrichment,
            
            
            
              everything. It will be checked by the
            
            
            
              kernel. If the code is approved, it will be
            
            
            
              compiled into bytecode and injected into the kernel,



              and it will run inside a sandbox.



              The verification is there to ensure you



              don't have any security flaws: you don't create infinite loops,



              you don't create overhead and



              bad performance in your system. Everything is



              there by default, by design, to ensure stability and
            
            
            
              security. For Falco itself, the architecture
            
            
            
              is this: you have the kernel, and the eBPF



              probe is there to collect the syscalls from the kernel.



              And then Falco, thanks to its rule set,



              will trigger alerts. If one event from the kernel,



              from the syscalls, matches a rule,
            
            
            
              Falco will output an alert.
            
            
            
              This alert can go to standard output, a file, a program, syslog,



              be sent to an HTTP endpoint, or go over gRPC.



              If we take a deeper look at the Falco architecture,



              Falco is composed of three key elements:



              two libraries, libscap and libsinsp, and the rule engine. So the



              rule engine is basic, and libscap is in charge



              of the event collection, libsinsp of the data enrichment and the extraction



              of fields. You can see we have the eBPF probe
            
            
            
              in the kernel space and Falco itself in user space.
            
            
            
              It's really important for us, as Falco is



              a security component, to be as secure as



              possible. This is why Falco itself runs in user



              space, so with fewer privileges.



              The eBPF



              probe runs in kernel space, but thanks to eBPF it is



              secure and stable by default. So we



              have the first library, libscap, aka library



              for system captures. libscap is a user space



              library. It communicates with the drivers. Basically it reads the syscall



              events from a ring buffer exposed
            
            
            
              by the driver and then these events are forwarded
            
            
            
              to libsinsp. libsinsp, aka library for system inspection,



              is in charge of receiving the events from libscap and



              enriching these events with machine state. Basically, if your



              application is running inside a container, and this container



              is part of a pod in a Kubernetes cluster, you



              will have, for your rules and for the alerts,



              the container ID, the container name, the pod name, the pod



              namespace, the pod labels. All these elements
            
            
            
              will be there to create nice rules
            
            
            
              and to be able to know what is
            
            
            
              the context of the alert. It will also perform some



              event filtering and extract fields from these events.



              These fields are then used by the rules. So if
            
            
            
              we take a look at our first rule, for example, this one terminal
            
            
            
              shell in container, we have the name of the rule,



              the description, for us human beings. It will



              not be used by any system and it will not be the



              final output. We have the condition, which we'll
            
            
            
              see later, and an output. The output is the exact message we
            
            
            
              will get. At the end you can see some
            
            
            
              fields starting with a percent sign.



              These fields will be automatically replaced by Falco in the output.
            
            
            
              It means at the end, in the alert, you will get a real username and
            
            
            
              not this token.
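The token replacement described above can be sketched in a few lines of Python. This only illustrates the idea and is not how Falco implements it; the function `render_output`, the template text, and the field values are assumptions made up for the example, while the field names follow Falco's `%field.name` style.

```python
import re

# Hypothetical event fields, named like Falco's built-in fields.
fields = {
    "user.name": "root",
    "container.id": "3f2a9c1d7e5b",
    "proc.cmdline": "bash",
}


def render_output(template: str, fields: dict) -> str:
    """Replace each %field token with its value; unknown tokens stay as-is."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        return str(fields.get(name, match.group(0)))

    # A token is '%' followed by dot-separated lowercase names.
    return re.sub(r"%([a-z0-9_]+(?:\.[a-z0-9_]+)*)", substitute, template)


template = (
    "A shell was spawned in a container "
    "(user=%user.name container_id=%container.id cmdline=%proc.cmdline)"
)
print(render_output(template, fields))
```

The alert a user receives then contains the real user name and container ID instead of the tokens.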
            
            
            
              Each rule comes with a priority. In this case,
            
            
            
              Warning. These priorities are useful for you to filter



              which rules you want to receive. And we
            
            
            
              also have tags. The tags are useful to understand the
            
            
            
              context of the rule, what it is supposed to detect,
            
            
            
              and you can also set Falco to
            
            
            
              just enable a subset of rules. For example, you can enable
            
            
            
              only the rules which concern the
            
            
            
              containers, or the network, or something else.



              So for the rules you can use lists



              and macros. Lists are pretty obvious:



              a list is just an array of



              items. In this situation it is a list of possible shells
            
            
            
              you can use in your system. Remember, Falco rules
            
            
            
              are basically YAML files, so you can



              override anything. And you can also append items



              to lists, or append to rules or macros. It's really convenient and



              it will allow you to reuse



              macros across your rules and not copy-paste or duplicate



              code. We also have this macro,



              shell_procs, and you see proc.name. proc.name is



              a built-in field from Falco you



              can use in your rules. Even if you are not really familiar with Falco,



              if you're not familiar with Linux, syscalls, that sort of stuff,



              it's quite easy to understand that proc.name means the



              name of the process. You also have proc.pid for



              the ID of the process, or proc.ppid for the
            
            
            
              id of the parent of the process. It's really convenient and easy
            
            
            
              to read even if you are not a specialist. We also
            
            
            
              have this macro, container: if container.id, also a



              built-in field, is different from host. It just means



              if we have something different from host,



              the application or the event happened



              inside a container. Pretty obvious. And we have spawned_process



              with evt.type, and the event types we



              use, execve and execveat, are real



              system calls. You can see these exact names inside



              the kernel code base if you want. And we have evt.dir.



              It's just to specify if we want the request to the kernel or the response
            
            
            
              from the kernel. Even if the rules are
            
            
            
              convenient and easy to read, we know it
            
            
            
              can be complicated to create new rules.



              This is why Falco comes with a default rule set.
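To show how the list and the macros described above combine into a rule condition, here is a small Python model. It is a simplified sketch and not Falco's rule engine: real rules are written in YAML, the real condition of the terminal shell rule has more clauses, and the function names and sample event below are assumptions made for the example.

```python
# A Falco "list": a plain array of items, here shell binary names.
SHELL_BINARIES = {"ash", "bash", "csh", "ksh", "sh", "tcsh", "zsh", "dash"}


def spawned_process(evt: dict) -> bool:
    # A process was started: the response side ("<") of execve or execveat.
    return evt.get("evt.type") in {"execve", "execveat"} and evt.get("evt.dir") == "<"


def container(evt: dict) -> bool:
    # Events coming from the host itself carry the container id "host".
    return evt.get("container.id", "host") != "host"


def shell_procs(evt: dict) -> bool:
    # The process name is one of the known shells.
    return evt.get("proc.name") in SHELL_BINARIES


def terminal_shell_in_container(evt: dict) -> bool:
    # Simplified condition built only from the three macros above.
    return spawned_process(evt) and container(evt) and shell_procs(evt)


# Hypothetical enriched event, shaped like a flat dict of Falco fields.
event = {
    "evt.type": "execve",
    "evt.dir": "<",
    "proc.name": "bash",
    "container.id": "3f2a9c1d7e5b",
}
print(terminal_shell_in_container(event))
print(terminal_shell_in_container({**event, "container.id": "host"}))
```

Because each macro is a named predicate, several rules can reuse it instead of repeating the same condition.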
            
            
            
              Right now it has almost 80 rules
            
            
            
              and they cover most of the techniques
            
            
            
              and practices used by attackers to do



              privilege escalation, to read or write sensitive files or directories,



              to spawn a shell,



              exfiltrate data, start ransomware,



              that kind of pattern. For example,



              right now we have all these rules, 79,



              and we can see some of them are disabled by default.



              It's just because they can be noisy if you don't adapt
            
            
            
              the exception list with your own context.
            
            
            
              So we prefer to disable them, but they are there and
            
            
            
              you can use them. We also have tags,
            
            
            
              so if we take a look at the full rule,
            
            
            
              the condition is a little bit different because my slide is quite old
            
            
            
              now, but basically the idea is the same. We have macros,
            
            
            
              the spawned_process macro is there, container,



              shell_procs, et cetera, et cetera. And the output with the tokens



              to replace, everything is there. You also have



              tags, and if you are familiar with the MITRE framework, we are trying to



              cover as many techniques as possible, and



              you can find which rule is related to which technique with



              the tags: mitre, underscore, and the T number.



              Having



              alerts is nice,
            
            
            
              but we need to use them, we need to exploit these alerts.
            
            
            
              Here comes Falcosidekick. It basically
            
            
            
              forwards the alerts from your Falco instances
            
            
            
              to your ecosystem so you
            
            
            
              can forward the alerts



              to a chat system, a log system like Elasticsearch or



              Loki, or a queue or streaming system like Kafka,



              NATS, or Pub/Sub. You can also forward



              alerts to a function as a service or serverless platform.



              Falcosidekick also exposes a Prometheus endpoint.



              It's useful if you want to create and do some statistics about the number of



              alerts and so on, for the SRE or



              DevSecOps teams, or to check the health of the setup.



              You can also trigger your on-call system with Falco;



              with Falcosidekick right now we have PagerDuty,



              Opsgenie, and Grafana OnCall, and you



              can also do cold storage in S3 or similar.
            
            
            
              Basically we have one Falco instance per node
            
            
            
              because it relies on the kernel and the kernels are
            
            
            
              not distributed. So we have one Falco
            
            
            
              instance per node. They can forward all their events
            
            
            
              a single



              deployment of Falcosidekick. You can scrape Falcosidekick to



              get metrics, and you can send all the events to Elasticsearch



              for data analysis and long-term storage, but only



              alerts with priority critical or above to your on-call system.
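The routing idea above, everything to long-term storage but only the most severe alerts to the on-call system, can be sketched as a threshold per output. This is an illustrative model, not Falcosidekick's code or configuration format; the output names and the `destinations` function are assumptions, while the priority names are Falco's standard levels.

```python
# Falco priorities, ordered from most to least severe.
PRIORITIES = [
    "emergency", "alert", "critical", "error",
    "warning", "notice", "informational", "debug",
]

# Hypothetical outputs, each with the minimum priority it accepts.
OUTPUTS = {
    "elasticsearch": "debug",  # keep everything for analysis and storage
    "oncall": "critical",      # page someone only for critical or above
}


def destinations(alert_priority: str, outputs: dict) -> list:
    """Return the outputs whose minimum priority the alert reaches."""
    rank = PRIORITIES.index(alert_priority.lower())
    return sorted(
        name
        for name, minimum in outputs.items()
        if rank <= PRIORITIES.index(minimum)
    )


print(destinations("Notice", OUTPUTS))
print(destinations("Critical", OUTPUTS))
```

A lower index in `PRIORITIES` means a more severe alert, so an alert is forwarded when its rank is at or before the output's minimum.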
            
            
            
              You can also add static fields and so on. It's really
            
            
            
              convenient. So with Falco we have the detection.
            
            
            
              With Falcosidekick we have the notification.



              If you forward these events to a
            
            
            
              serverless or to a function as a service system, you can react
            
            
            
              as long as you are able to write your own
            
            
            
              reaction. You can do whatever you need with
            
            
            
              Lambda, OpenFaaS, Knative, Argo



              Workflows, Google Cloud Functions,



              anything. For example, you can terminate a pod.



              You can create a network policy to isolate a pod.
            
            
            
              You can also scale in or scale out an autoscaling
            
            
            
              group, whatever you need, as long as you are able to write your
            
            
            
              own function.
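As a rough idea of what such a reaction function could look like, here is a Python sketch that receives one Falco alert as JSON and decides what to do. It only returns a plan and does not call any Kubernetes API; the function name `plan_reaction`, the action names, and the sample payload are assumptions, and the payload is a trimmed, made-up example of an alert with its `rule`, `priority`, and `output_fields` keys.

```python
import json


def plan_reaction(payload: str) -> dict:
    """Decide how to react to one Falco alert given as a JSON string."""
    alert = json.loads(payload)
    fields = alert.get("output_fields", {})
    pod = fields.get("k8s.pod.name")
    namespace = fields.get("k8s.ns.name")

    if alert.get("rule") == "Terminal shell in container" and pod and namespace:
        # A real function would call the Kubernetes API here to delete the pod.
        return {"action": "delete_pod", "namespace": namespace, "pod": pod}
    return {"action": "ignore"}


# Made-up sample alert, trimmed to the keys used above.
sample = json.dumps({
    "rule": "Terminal shell in container",
    "priority": "Notice",
    "output_fields": {"k8s.ns.name": "default", "k8s.pod.name": "wordpress-0"},
})
print(plan_reaction(sample))
```

Keeping the decision separate from the side effect makes the function easy to test before wiring it to a real cluster.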
            
            
            
              Falcosidekick comes with a specific output called



              Falcosidekick UI. Basically it's



              a basic interface with statistics, with pie charts,



              and so on, to have in a few minutes



              an overview of what



              has been detected by Falco in your environment. It's pretty convenient.
            
            
            
              It's not used for long
            
            
            
              term storage or anything else, but at least you have a quick overview.
            
            
            
              It's pretty convenient to use.
            
            
            
              At the beginning Falco was only for
            
            
            
              system calls. Then we introduced a web server to collect the
            
            
            
              Kubernetes audit logs, but it came with
            
            
            
              a lot of drawbacks. So in the last year we also
            
            
            
              introduced a plugin framework. Right now we
            
            
            
              are able to collect syscalls thanks to eBPF.
            
            
            
              But Falco is also able to collect
            
            
            
              any kind of events you may have. So by
            
            
            
              events we often think about logs for example.
            
            
            
              So plugins are shared
            
            
            
              libraries used by Falco to collect events from third-party sources.



              Right now we have plugins to collect Amazon



              EKS audit logs and CloudTrail logs, to collect GitHub



              webhooks, Docker events, and even Nomad



              events. We developed this plugin with HashiCorp.



              So with eBPF you



              collect the syscalls. So with eBPF and Falco you protect your



              applications. With the plugins, for example the



              Kubernetes audit logs plugin, you enable Falco to protect



              your Kubernetes clusters. With the AWS CloudTrail plugin
            
            
            
              you are able to protect and detect suspicious behaviors
            
            
            
              at your account level. And with the GitHub plugin you are
            
            
            
              able to detect strange
            
            
            
              situations in your CI or in your pipelines
            
            
            
              or in your repositories. It means right now with Falco
            
            
            
              you can protect all stages from the
            
            
            
              development to the production.
            
            
            
              So the situation now with Falco is
            
            
            
              we have the eBPF probes for the syscall collection, we have



              the plugins for the event collection,
            
            
            
              Falco and its rule engine, and to manage
            
            
            
              the plugins and the lifecycles of the plugins and of
            
            
            
              the rules, we introduced a few months ago



              a tool called falcoctl.
            
            
            
              Basically it will install plugins and rules and it
            
            
            
              will also track new versions of the rules to automatically
            
            
            
              download them and reload Falco. So your cluster, your Falco
            
            
            
              fleet will always be up to date.
            
            
            
              So, another view of the architecture. Basically the



              same idea behind it. And once again the plugins are



              running in user space, so without any



              big privileges, once again for security purposes.



              Time for a demo.
            
            
            
              So in this demo cluster I have
            
            
            
              two nodes and, like I said, Falco relies on the
            
            
            
              kernel. So two nodes means two Falco
            
            
            
              pods.
            
            
            
              Basically they are deployed as a DaemonSet to have one



              Falco per node. It's quite obvious.



              I also installed Falcosidekick, Falcosidekick UI, the



              front end, and for Falcosidekick UI the storage



              backend is Redis, and another deployment
            
            
            
              of Falco with the Eks plugin.
            
            
            
              So imagine you
            
            
            
              have this pod, which is your critical
            
            
            
              application. It can be WordPress,
            
            
            
              Drupal, anything you can run and
            
            
            
              expose to the Internet. So an attacker gains



              access to this pod, to this container.
            
            
            
              As you can see, when I created
            
            
            
              my shell, it has been detected immediately. So we have the priority,
            
            
            
              we have the exact output message with the
            
            
            
              user root, the namespace default,
            
            
            
              the pod name, even the container ID, and
            
            
            
              what shell has been used and what command line has been used
            
            
            
              to start the shell. All these elements are there also as
            
            
            
              output fields. They are used by Falco and Falcosidekick



              for routing. So now I will



              add curl. You can



              see it's automatically detected in



              real time, once again thanks to eBPF. So right now it's



              an Error, with "Package management process launched



              in container", and once again the user, the exact command



              that has been run, and



              the container name is there, the image,
            
            
            
              everything. So we'll try to reach
            
            
            
              the Kubernetes API. Now thankfully
            
            
            
              in that situation the API is protected,
            
            
            
              but at least we have detected it.
            
            
            
              "Unexpected connection to Kubernetes API server from a container."
            
            
            
              We have the exact command once
            
            
            
              again, the namespace, and the pod name.
            
            
            
              Imagine overwriting a critical
            
            
            
              file.
            
            
            
              "File below /etc opened for writing", and
            
            
            
              we have once again all the elements:
            
            
            
              the name, the image, the pod,
            
            
            
              et cetera. And if we take a look,
            
            
            
              at Falcosidekick UI.
            
            
            
              So we have all the
            
            
            
              things that happened in the last five minutes,
            
            
            
              15 minutes. We have the
            
            
            
              pie charts, statistics by the priorities, the tags,
            
            
            
              the source. We can filter on the source,
            
            
            
              we see what I did, and if we want more
            
            
            
              details, they are there.
            
            
            
              Right now we also have "Terminal shell in
            
            
            
              container". It's exactly what we saw in the logs, but in a more
            
            
            
              formatted and nicer way. With the tags we
            
            
            
              can filter, for example, on the namespace.
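The filtering and routing on output fields described here can be sketched in a few lines of Python. This is only an illustration, not how Falcosidekick is implemented: the sample alert follows the general shape of Falco's JSON output, but every value in it (pod name, container ID, timestamp) is made up, and the `matches` helper is a hypothetical name.

```python
import json

# Illustrative alert shaped like Falco's JSON output; all values are invented.
SAMPLE_ALERT = json.dumps({
    "time": "2023-01-01T12:00:00.000000000Z",
    "rule": "Terminal shell in container",
    "priority": "Notice",
    "source": "syscall",
    "tags": ["container", "shell"],
    "output": "A shell was spawned in a container with an attached terminal",
    "output_fields": {
        "user.name": "root",
        "k8s.ns.name": "default",
        "k8s.pod.name": "example-pod",
        "container.id": "0123456789ab",
        "proc.cmdline": "bash",
    },
})

# Falco priorities, ordered from most severe (index 0) to least severe.
PRIORITIES = ["emergency", "alert", "critical", "error",
              "warning", "notice", "informational", "debug"]


def matches(alert, namespace=None, min_priority="debug"):
    """Hypothetical filter: keep alerts from a namespace at or above a priority."""
    fields = alert.get("output_fields", {})
    if namespace is not None and fields.get("k8s.ns.name") != namespace:
        return False
    level = PRIORITIES.index(alert["priority"].lower())
    return level <= PRIORITIES.index(min_priority.lower())


alert = json.loads(SAMPLE_ALERT)
if matches(alert, namespace="default", min_priority="notice"):
    f = alert["output_fields"]
    print(f'{alert["priority"]}: {alert["rule"]} '
          f'(pod={f["k8s.pod.name"]}, cmd={f["proc.cmdline"]})')
```

Because the namespace, pod name, and command line arrive as separate fields rather than only inside the message string, a consumer can filter or route on them without parsing the text.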
            
            
            
              Then we have the installation of curl, the attempt to reach
            
            
            
              the Kubernetes API, and the overwrite
            
            
            
              of the file. Everything is there. And we also have, thanks to the
            
            
            
              Kubernetes audit logs,
            
            
            
              the details about someone attaching
            
            
            
              to or executing
            
            
            
              something in a pod. We have all these details.
            
            
            
              In the real world it would be a web shell
            
            
            
              or something else. But in my example, since I did an exec,
            
            
            
              we can detect it. And once again we have the
            
            
            
              target name, the pod name, the data, the namespace. And so
            
            
            
              if you want to start with Falco, the easiest way
            
            
            
              to install and start with Falco is to use the official
            
            
            
              Helm chart. By setting these
            
            
            
              values you will install Falco,
            
            
            
              Falcosidekick, and Falcosidekick UI, and use the eBPF
            
            
            
              probe, in a namespace called falco.
            
            
            
              In less than two minutes everything will be up
            
            
            
              and running, and you will be able to access
            
            
            
              the web UI with a port-forward.
            
            
            
              If you want to contribute or know more about Falco,
            
            
            
              you can join us in our Falco Slack
            
            
            
              channel. You can take a look at our new website.
            
            
            
              A total revamp has been made in the last month, so we hope
            
            
            
              it's better for everybody, and we are also on
            
            
            
              GitHub. Thank you and
            
            
            
              have a good day.