Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello, let's start with a few numbers. More than 4,000 publicly disclosed
data breaches occurred in 2021, and that implied 22 billion records
being exposed with private information. That's a lot of private information
being publicly disclosed, don't you think? I'm going to talk about
zero trust and why you should not trust anyone in your system
and should validate everyone.
What can you expect from this presentation? Well, it's a simple introduction
to zero trust. Don't expect super detailed information about it. What I show
is not the only way that you can implement this kind of approach.
And finally I will give you an example with a live demo using Istio,
Quarkus, Java and a local cluster. My goal for this presentation is
very simple: just to light a spark of curiosity about zero trust
architecture in you. So if, by the time this presentation finishes,
you have this feeling of: I want to explore more, let's do a PoC
and see how this works, then I will be more than happy. Obviously,
if you have any doubts or anything to discuss about zero trust,
I will be more than happy to answer or find someone who can help you.
I'm Jonathan Vila. I'm a Java Champion and have long experience
with the community, first as one of the leaders of the Java Barcelona
community and also as one of the founders of the JBCNConf and DevBcn
conferences. I've been working in a few conferences and meetups.
I've been a developer for more than 30 years in several languages,
and I'm currently working as a developer advocate for Sonar.
Sonar is a company that has a few products around clean code.
It analyzes your code and gives you hints and issues about things
that you can solve. It has three main products: SonarLint, which is
completely free, which you can install in your IDE and which will
detect issues in your code as you type it; SonarQube, which is an
on-premise solution that can analyze your different projects, and it's
open source and you can download it; and SonarCloud, which is the hosted
solution and is free for open source projects. So if you want to
try them, just go to sonarsource.com or come to me and ask me about it.
So let's start with the usual context for security: trust on the perimeter.
When we have trust on the perimeter, as you can see here in this diagram,
we have a user that tries to connect to service A, but this passes
through a gateway. This gateway will check with an identity and access
management tool, and it will say: okay, you are validated, you can go
to service A. Service A will in turn call service B, and then service B
will connect to database one. It can also happen that service A needs
to connect to service C, and this will connect to DB two. So this is
the happy path, when everything works fine, right? And in case the user
is not validated by the system, the call from the gateway to service A
will not be allowed, so it will be rejected. So, yeah, everything's fine.
Only validated users can go through the gateway to service A.
That's perfect.
But the problem that we find here is that everything is controlled
by the gateway. It's the gateway that receives the request and then
decides, according to what the IAM application has answered, whether
it is going to allow the connection to service A or not. But what if,
for some reason, someone can reach service B? For instance, Log4j had
a CVE that allowed remote code execution. So let's imagine that someone
reaches service B and adds an application there. What we have here is
that we trust everything that is in our system, because we thought
the only way to enter the system was through the gateway.
Let's imagine an analogy. We have a building, and we have security on
the main floor, at the entrance. You go to that building to work, as you
do every day, and you show them your identification. They check it,
either with a machine or with a manual system, and they say: okay,
you are allowed to enter the building. Once you are inside the building,
you can go to any room or any floor. If you want to prevent that,
you have to create more identity validations: a validation for each room,
or for each floor, say a validation in the elevator. What usually happens
is that once you enter a building you have access to everything, except
for two or three rooms that need private access. But what if I leave
the toilet window open? The person who gets into the building through
that window immediately has access to everything except for those
two rooms, because the identity validation happens only at the main gate.
In this case, you can see someone has added a malicious application
on service B, and then this service B can talk to service C and get
information from DB two, or even connect to DB one and get information.
That's not what we want.
So what is the zero trust approach, and how does it solve this?
Basically, we are going to enforce identity validation in every service,
not just in two services that are crucial, but in every one. We are going
to enforce mutual TLS or token validation in every call between services,
not only from the outside to the inside, but internally too, because
you can never be sure who is really calling a service. It could be
a good service, one of our systems, but it could be a fake service
that someone installed there through whatever backdoor they used.
We are even going to have a list of callers and destinations. So, is this
service allowed to call the outside world? For most of the services
the answer is probably no, because they are restricted to calls between
services. Probably there's one that is sending emails or checking some
information from the outside world, so only that one will be allowed
to send information to the outside world. To do this we can use
the zero trust approach, which is perimeterless security. What we are
going to do is assume that everyone in the system could be an attacker,
so we are going to enforce verification of identity for each one of
the services that are calling inside the system.
The core principles of zero trust are these. You need strong
identification for every service that is calling. You need to
authenticate, as I've said several times now, access everywhere in
the network: not only on the perimeter, not only on the edge, but
everywhere in the network. It's important to know the whole architecture,
so that you know which elements can connect to which other elements.
If not, you could be restricting access for services that should have it.
We are going to set several policies in order to allow or reject
connections to the outside world, or even to define services that cannot
talk to other services. For instance, I have a web service that is simply
answering with information about products, and that service should
probably not have access to the service that handles salaries.
Yes, we know that, as the application is written, that service is not
going to request anything from the salary service. But what if someone
gets inside that service and executes a call to the salary service?
We need to explicitly define which accesses are allowed,
or at least which accesses are not allowed.
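The idea of explicitly listing which service may call which other service can be sketched in a few lines of Python. This is only an illustration of a default-deny allow-list, not how any particular mesh implements it; the service names (`gateway`, `products`, `payroll`, `salaries`) and the function `is_call_allowed` are hypothetical.

```python
from typing import FrozenSet, Tuple

# Hypothetical (caller, destination) pairs; anything not listed is denied.
ALLOWED_CALLS: FrozenSet[Tuple[str, str]] = frozenset({
    ("gateway", "products"),
    ("payroll", "salaries"),
})


def is_call_allowed(caller: str, destination: str) -> bool:
    """Default deny: a call passes only if the exact pair is listed."""
    return (caller, destination) in ALLOWED_CALLS
```

With this table, `is_call_allowed("products", "salaries")` is false even though the products service sits inside the network, which is the point of the salary example: being inside is not enough, the pair has to be declared.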
Again, never trust the network. Being inside the network does not mean
you have access to services. You need to say: hey, I am service A and
I'm trying to call service F. Then the system will decide whether you
have access or not. And basically, always use services that are designed
for zero trust, so that when you implement the zero trust architecture
those services are aligned with it. But implementing a zero trust
architecture has some challenges. If you need to implement all this
security in all of your services, it's going to cost you a lot of time
and money. You are going to suffer from legacy software compatibility
issues, because you try to enforce mutual TLS in software that is so many
years old that, if you want to update that code to use the latest
libraries, you are going to be touching a lot of code and entering
a path of uncertainty. You may also be using third-party technologies
for which you don't have the source code, so what if they cannot easily
be updated to let you check for all the security issues? And again,
it means continuous maintenance and monitoring requirements for all
the services that you have touched. Every time a new version of
mutual TLS comes out or a CVE is fixed, you need to update all
the libraries for all the services that you are using
in your mesh or in your system.
So in summary, it would mean requiring SSL transport for all your
services; authorization and authentication; observability; rules to check
which service can talk to which other services; using a clean code
approach in your code, because you don't want to expose private
information outside; and even inspecting whether your libraries and
your code are affected by CVEs, and therefore changing libraries,
changing approaches and updating every application.
That's a lot of work to keep your whole cluster secure.
Or what we can do is use the zero trust approach and not touch
any application's code.
So let's see how we can implement this zero trust architecture without
touching our applications' code. For that, we are going to introduce
the Istio service mesh. It works over a collection of microservices, and
the basic thing about the Istio service mesh is that it will install
a sidecar in each of our services' pods, which will handle all the traffic
coming in and going out. This also allows us to implement observability
or traffic management without touching the application, because in the end
the application has no knowledge of Istio or of anything going on.
Istio is simply capturing the network traffic coming into and going out of
the pod. This also allows us to implement A/B testing and canary
deployments, because we can define which traffic is going to which
service version. It even allows us to implement rate limiting, I mean
we can decide how much traffic is going to hit a service at a certain
point, but everything is done transparently for the service.
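To make "deciding how much traffic is going to hit a service" concrete, here is a minimal token bucket in Python, one common technique for rate limiting. It is a sketch of the general idea only, not the sidecar's actual code; the class name `TokenBucket` and its parameters are assumptions made for this example, and the clock is injectable so the behavior is easy to check.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `refill_per_second`."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.last = clock()

    def allow(self) -> bool:
        """Return True and consume one token if the request may pass."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(float(self.capacity),
                          self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A proxy sitting in front of the service would call `allow()` once per incoming request and reject the request when it returns false, so the service itself never has to know a limit exists.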
We can even define filters that will modify the connections between
services going through the network. One use case would be encrypting
the traffic. Another would be adding, modifying or checking headers
on the messages that travel between the services.
And obviously one of the main use cases is to add authentication and
authorization between all the services in a way that is transparent
to our service.
Here we see how Istio works. As I mentioned before, it has a proxy,
the Envoy proxy, and all the traffic across the mesh passes through
the Envoy proxy. What Istio is going to do is translate its configuration
files into Envoy configuration files, which are more complicated.
Istio allows us to divide all that configuration into simpler, smaller
pieces, which are the ones that we are going to use. Finally, there's
a control plane, in this case istiod, that is going to handle all
the different Envoy proxies in the different clusters. We can even merge
clusters, or incorporate virtual machines into these meshes, and configure
the networking among them, again in a way that is transparent
to the applications. So let's go now to a demo, and I will show you
the services that we are going to handle. Those are made in Java
using Quarkus. But let's see the demo.
So let's take a look at the files that we are going to use in the demo.
We need a Kubernetes cluster, and we are going to have this Quarkus DTA
service, another service that is exactly the same service with a different
name just for demo purposes, and the gateway. First we are going to use
no security and see that everything goes through fine from anywhere.
Then we are going to add security, applying the zero trust approach to
our Kubernetes cluster using the Istio service mesh and an external
identity and access management tool. It's a free hosted Keycloak, where
we can create a configuration that we are going to use in our validation,
using the tokens coming from that service.
So basically the steps that I'm going to follow are these. First we are
going to test a call from outside the cluster, then a call from inside,
so from one service to another service, and then a call from one service
to the external world. Then we are going to move to a secured approach,
where we are going to replicate exactly the same steps.
The files that are going to be involved are: the definition of
the Quarkus service in Java; a gateway; a virtual service; a config map
that we are going to touch for Istio; then a request authentication and
an authorization policy, in order to enforce the validation for every
connection to any service; and then a service entry that will allow or
prevent requests to services external to the cluster.
In our case, the service is a simple service with just two methods:
one returning a hard-coded text on the endpoint hello, and another one
returning a text concatenated with a parameter on the endpoint echo.
That's it, nothing else. It's a REST endpoint and nothing else:
no security, nothing, in our service.
For the security we are going to use a gateway, where we are going to
define the port on which it is going to accept traffic, and a virtual
service that configures an endpoint that is going to be called by
the gateway. So we are exposing endpoints that are connected to services.
As you can see, the host is a service, and we are going to target port 80.
Then we are going to change something in the config map, a value that
decides whether we allow or reject. The default policy when calling
endpoints outside of the cluster is allow any, but we are going to change
it to registry only, since we want to allow only certain connections and
not the rest. The request authentication is the file that configures
who is issuing the tokens. In it we also define which workloads, which
deployments, are affected by this request authentication. With Istio
we can always filter who is affected by a configuration using labels.
In this case we are configuring the external Keycloak service
and matching all the workloads with app Quarkus. The authorization policy
is saying: okay, we require a valid token provided by this external
Keycloak service. The service entry effectively configures
an external service.
In this case we are prohibiting all connections to external services
except for those that are defined as a service entry.
With this service entry we are saying: okay, connecting to Google.com
is allowed, and any other host is forbidden.
It's easy to configure which external services are allowed by the mesh,
provided you have configured it to reject by default, so registry
only in the config map, which we will see in a minute.
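The difference between the two outbound modes can be modeled as a small decision function. This Python sketch only mirrors the behavior described in the talk and is not Istio's implementation; `ALLOW_ANY` and `REGISTRY_ONLY` follow the mode names mentioned above, while `egress_allowed` and the host names in the usage note are made up for illustration.

```python
from typing import Set

ALLOW_ANY = "ALLOW_ANY"
REGISTRY_ONLY = "REGISTRY_ONLY"


def egress_allowed(host: str, mode: str, registry: Set[str]) -> bool:
    """Decide whether a workload may call an external host.

    `registry` stands in for the hosts declared through service entries.
    """
    if mode == ALLOW_ANY:
        # Default behavior described in the talk: any external host passes.
        return True
    if mode == REGISTRY_ONLY:
        # Only hosts that were explicitly registered are reachable.
        return host.lower() in registry
    raise ValueError(f"unknown outbound mode: {mode}")
```

For example, with `registry = {"google.com"}` and `REGISTRY_ONLY`, a call to `google.com` passes and a call to a hypothetical `oracle.com` does not, whereas under `ALLOW_ANY` both pass regardless of the registry.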
Okay, so let's play directly with our local cluster and see how we can
implement the zero trust architecture using Istio. I already have
a cluster running, and I have also installed Istio. It's very easy
to install Istio: just download the istioctl command and that will allow
you to install it. You can find all the steps in this Git repository.
So let's deploy our first service. What I'm going to do is build
the service that we saw previously, specifying the namespace where it has
to be deployed, the label that we are going to use for this workload,
and the name of the app. We are going to build it using Maven.
In this case it's a regular Java application using the JVM, but we could
even use a native artifact because we are using Quarkus. That would take
longer to build but be much quicker to execute.
And now we are going to install another service. It's exactly the same
but with a different application name. That's it.
So we are going to have two identical services doing exactly the same
thing with two different names, just to demo the requests and calls
from one service to another, and then from one service
to the outside world.
Okay, now we are going to check the IP of the node, because our services
are using NodePort, as you can see here. The cluster is running on kind.
Therefore what we can do is simply use the cluster's IP and the port.
The port is 31591, and this will redirect to port 80.
So if I do this curl from the outside world, yeah, I get the response
from that service, and if I do exactly the same for the other service,
okay, I receive a response from it.
Now I'm going to open a shell in this pod. So I'm going to get a shell
inside one of the services, and what I'm going to do is a curl to
the other service, basically using the name of the service.
Well, we can see there is no problem, everything is working fine.
And even if I call an external endpoint, we receive a response.
So that's fine.
But now let's implement the security for that.
The first step is to add a label to our namespace, default,
saying that Istio injection is enabled.
Okay, nothing has happened, in fact. What we need to do
is delete the pods.
And now what we see is that instead of having one container per pod
we have two: our application and also the proxy.
But we don't have anything yet in terms of security. So if I try
to connect to the application, everything is working as before.
So what we need now is to apply the files that are going to configure
Istio for this security. Let's see what we have.
First we are going to add a request authentication, which in fact
is saying: okay, the JWT tokens are issued by this application.
So what we are going to do is apply the request authentication. Then we
are going to apply the policy that will enforce having a token in each
connection. So let's apply it.
With this, if we try to run exactly the same curl command, it says
access denied.
Why? Because my request
is not passing any token.
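The two-step check that the request authentication and the authorization policy perform together can be summarized in a short Python sketch. It is a simplified model, not Istio's code: `decide` and `token_is_valid` are hypothetical names, the 401 response for an invalid token and its message text are assumptions not shown in the demo, and the 403 text follows the RBAC access denied message seen in the demo.

```python
from typing import Callable, Mapping, Optional, Tuple


def decide(headers: Mapping[str, str],
           token_is_valid: Callable[[str], bool]) -> Tuple[int, str]:
    """Return an (HTTP status, message) pair for an incoming request."""
    auth = headers.get("Authorization", "")
    token: Optional[str] = None
    if auth.startswith("Bearer "):
        token = auth[len("Bearer "):].strip()

    # Step 1, request authentication: a token that is present must be valid.
    if token and not token_is_valid(token):
        return 401, "invalid token"

    # Step 2, authorization policy: a validated identity is required,
    # so a request without any token is rejected here.
    if not token:
        return 403, "RBAC: access denied"

    return 200, "OK"
```

The split matters because validating tokens and requiring tokens are separate concerns: the first step only rejects bad tokens, and it is the second step that turns "no token" into a denial, which is what the curl without a header runs into.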
So for that, what we need is to get a token.
We are going to connect to my Keycloak, which is online and working,
and we are going to get a token. Okay, so what we have is a token,
and we are going to do exactly the same call, but in this case passing
the token in the Authorization header. Let me change this.
Now it is answering as expected. If I do this again without passing
the token, it says RBAC access denied.
If I go to the pod, open a shell and do exactly the same as we did
before, it says exactly the same. So the access denied is raised both
for a call from outside and for a call from inside.
But if we copy the token and do exactly the same,
then we get a response from the other service
because the token is valid.
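What "the token is valid" involves can be illustrated with a small, standard-library-only Python sketch that checks a JWT's signature, issuer and expiry. To stay self-contained it uses HS256 with a shared secret, which is a simplification: an identity provider such as Keycloak typically signs with an asymmetric key, and the mesh would verify against the issuer's published keys. The function names, the secret and the issuer URL used in the usage note are all hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Any, Dict, Optional


def _b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: Dict[str, Any], secret: bytes) -> str:
    """Build an HS256-signed JWT (demo helper standing in for the issuer)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"),
                         hashlib.sha256).digest()
    return signing_input + "." + _b64url_encode(signature)


def verify_token(token: str, secret: bytes, expected_issuer: str,
                 now: Optional[float] = None) -> Optional[Dict[str, Any]]:
    """Return the claims if signature, issuer and expiry check out, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(signature_b64)
    except (ValueError, UnicodeDecodeError):
        return None  # malformed token
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return None
    if header.get("alg") != "HS256":
        return None  # refuse algorithms this sketch does not implement
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode("ascii"),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None  # signature mismatch
    if claims.get("iss") != expected_issuer:
        return None  # issued by someone we do not trust
    current = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= current:
        return None  # missing or expired
    return claims
```

A token created with `sign_token({"iss": issuer, "sub": "service-a", "exp": ...}, secret)` passes `verify_token` only while it is unexpired, signed with the same secret, and carries the expected issuer; changing any of the three makes the function return `None`.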
And now what we need is to check whether we have access to the outside
world. From the inside we are going to do a call to Google. Okay, it is
working. And if I do the same for Oracle, it is working. But now I apply
a virtual service, sorry, a service entry. Let's see what this service
entry is doing. It is saying: okay, we are going to create a service
entry for Google.com.
So what it is going to do is allow me to connect to google.com.
But for that what we need is to change a config map.
In this case we are going to change this config map in Istio, saying:
okay, instead of the allow any mode, we want registry only.
Anything that is not in the registry will be rejected.
So we apply that config map. Now let's go inside my pod and,
oh, I cannot connect to Oracle. But if I go to Google,
that's interesting.
Oh, because I didn't apply it.
Now if I try to connect to Google I get an answer.
If I try to connect to Oracle,
it is not working at all.
I also wanted to show you another tool, called Kiali, that can help us
inspect our cluster and see how connections are working.
It's very easy to install Kiali using Helm. You can install the operator
and then install Prometheus, and it's fine and easy. Then we only need
to do a port-forward, and finally we connect to our Kiali service.
With this we have our applications and our services, and we can even see
how connections are working. So if I run the same curl that I did before
without passing the header, it says access denied.
And if I refresh, I effectively see that there's an error trying
to connect to the service. We can see our applications and services and
their elements: inbound traffic, outbound traffic, metrics, traces.
So there's a lot of information that we can get from Kiali.
And that's it, basically, regarding the demo with the Istio service mesh.
As you saw in the demo, it's very easy to handle all this configuration
with Istio.
Let's talk about the conclusions that we can draw from this presentation.
As you can see here, CVEs have been increasing year after year, and
in 2022 alone there were more than 800 CVEs with a score similar to that
of the famous Log4Shell issue, which allowed remote code execution.
A lot of services were at risk because they were using Log4j,
a logging library, to log information.
A very simple thing, but it allowed this remote execution,
and therefore a lot of issues were generated by it.
So security has to be taken very seriously, because even from the most
innocent library that we can use, a logging library, a very high
severity issue can come into your system and allow a third party to
put something in your system that has access to all of your services.
Zero trust is definitely the way to go in order to minimize security
issues, because you are enforcing validation in every call between
services, not just assuming that the perimeter is the only way
attackers can get into your system.
It can also be costly to implement, because you need to modify your
applications: adding SSL transport, dealing with libraries, third-party
libraries, all libraries, and making a lot of modifications in order to
increase the level of security of your applications.
And that's not even considering that you may not be able to modify those
third-party libraries because you don't have the code.
It involves security inside your cluster.
So it's not only security against external attackers, it is also security
inside your cluster from one service to another. Sometimes a third party
may have put something inside your cluster in order to take advantage
of it and get information. Or it could be that one service is
malfunctioning or making a request that it shouldn't make.
From a configuration point of view, with zero trust
you can configure who can talk to whom.
So it's more secure this way.
A service mesh can definitely help you, because it allows you to
implement security without touching your applications, since it's
something that is running beneath your applications.
They don't even know you are using a service mesh.
It's transparent for existing applications, because you apply the service
mesh to certain namespaces and then security is implemented and enforced,
and the applications are not affected or modified by anyone.
You can even add more features to your system, because you can add
observability, you can add logging, you can even add headers to
the communication, encrypt the payload, do filtering: a lot of features
that can be done transparently for your applications.
It introduces network complexity, that's for sure. Using a service mesh
adds another layer of complexity to your system. Before, you had your
application, your cluster, probably your gateway, but now you need
a service mesh that is controlling the network. So a misconfiguration
can mean that applications cannot talk to the applications they need.
But yeah, nothing comes for free.
You can even implement security in gradual steps, because you can enforce
mutual TLS now, then you can enforce token validation, then you can
enforce encryption, then you can enforce policies of who can talk to
whom, but nothing has to be done at once. You can add more restrictions
as you become more mature in service mesh management. And it involves
a high level of customization, because you can do a lot of filtering and
modification from the Envoy proxy. There are several filters that are
already built into Envoy, but you can even use Wasm in order to create
new filters that you can use in your Envoy proxies, doing whatever
enrichment, filtering or transformation you want with
your messages going from one service to another.
Finally, here are the references that I've used and that I think
can help you if you want to dig more into the details of
zero trust architecture. And that was it. Thank you very much for being
patient with this presentation, and I hope it lit a spark of curiosity
in you, if you are not already using the zero trust approach.
Whatever question you have, or if you want to comment on anything
about zero trust, I will definitely be more than glad to answer.
Use my Twitter handle or my email, and you can even see some posts
about this on my blog.
Thank you very much, and I hope you enjoyed the presentation.