Conf42 Kube Native 2023 - Online

Don't Trust anyone... Secure your Microservices with ZeroTrust approach

Abstract

Let's see how to secure K8s and VM clusters following the Zero Trust approach, covering concepts like Zero Trust security, SSL transport, observability, authz and authn, without touching a single line of our Java (Quarkus) microservices.

Summary

  • More than 4,000 data breaches occurred in 2021, exposing 22 billion records of private information. I'm going to talk about zero trust and why you should not trust anyone in your system and validate everyone. Finally, I will give you an example with a live demo using Istio, Quarkus (Java) and a local cluster.
  • Jonathan Vila is a Java Champion and developer advocate for Sonar. Sonar is a company with a few products for clean code. They analyze your code and give you hints about issues that you can fix. There are three main products: SonarLint, SonarQube and SonarCloud.
  • The zero trust approach aims to enforce identity validation in every service, not just in two crucial services, but in every one. We are going to set several policies to allow or reject connections to the outside world, or to stop services from talking to other services.
  • The demo uses the Istio service mesh with an external identity and access management service. First we use no security and see that every call works from anywhere. Then we move to a secured setup and replicate exactly the same steps.
  • Let's see how we can implement the zero trust architecture using Istio. We have two services doing exactly the same thing under two different names. We call from one service to another, and then from one service to the outside world. Then we implement the security for that.
  • I also wanted to show you another tool called Kiali that can help us inspect our cluster and see how connections are working. It's very easy to install Kiali using Helm. As the demo shows, it's very easy to handle all this configuration with the Istio service mesh.
  • Zero trust is definitely the way to go in order to minimize security issues. It can be costly to implement because you need to modify your applications. Using a service mesh adds another layer of complexity to your system.
  • And that was it. If you are not already using the zero trust approach, or if you have any question or comment about zero trust, I will be more than glad to answer. Use my Twitter handle or my email, and you can also find some posts about this in my blog.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello, let's start with a few numbers. More than 4,000 publicly disclosed data breaches occurred in 2021, and they exposed 22 billion records with private information. That's a lot of private information being publicly disclosed, don't you think? I'm going to talk about zero trust and why you should not trust anyone in your system and validate everyone. What can you expect from this presentation? It's a simple introduction to zero trust, so don't expect super detailed information about it. What I show is not the only way to implement this kind of approach. And finally I will give you an example with a live demo using Istio, Quarkus (Java) and a local cluster. My goal for this presentation is very simple: to light a spark of curiosity about zero trust architecture in you. If, by the time this presentation finishes, you have the feeling of "I want to explore more, let's do a PoC and see how this works", then I will be more than happy. Obviously, if you have any doubt or anything to discuss about zero trust, I will be more than happy to answer or to find someone who can help you.
I'm Jonathan Vila. I'm a Java Champion and have long experience with the community, first as one of the leaders of the Java Barcelona community and also as one of the founders of the JBCNConf and DevBcn conferences. I've been working in a few conferences and meetups. I've been a developer for more than 30 years in several languages, and I'm currently working as a developer advocate for Sonar. Sonar is a company with a few products for clean code. They analyze your code and give you hints about issues that you can fix. It has three main products. The first is SonarLint, which is completely free; you install it in your IDE and it detects issues in your code as you type.
Then there is SonarQube, an on-premises solution that can analyze your different projects; it's open source and you can download it. And there is SonarCloud, the hosted solution, which is free for open source projects. If you want to try them, just go to sonarsource.com or come to me and ask me about it.
So let's start with the usual context for security: trust on the perimeter. With trust on the perimeter, as you can see in this diagram, we have a user who tries to connect to service A, but the request passes through a gateway. This gateway checks with an identity and access management (IAM) tool, which says, okay, you are validated, you can go to service A. Service A will then call service B, and service B will connect to database one. It can also happen that service A needs to connect to service C, which connects to DB two. This is the happy path, when everything works fine. And if the user is not validated by the system, the call from the gateway to service A is not allowed, so it is rejected. So everything's fine: only validated users can go through the gateway to service A. That's perfect.
But the problem here is that everything is controlled by the gateway. It's the gateway that receives the request and then decides, according to what the IAM application has answered, whether to allow the connection to service A or not. But what if, for some reason, someone can reach service B directly? Log4j, for instance, had a CVE that allowed remote code execution. So let's imagine that someone reaches service B and adds an application there. What we have here is that we trust everything that is in our system, because we thought the only way to enter the system was through the gateway. Let's imagine an analogy.
We have a building with security at the entrance on the main floor. You go to that building to work, as you do every day, and you show them your identification. They check it, either with a machine or with a manual system, and they say, okay, you are allowed to enter the building. Once you are inside the building, you can go to any room or any floor. If you want to prevent that, you have to create more identity validations: a validation for each room, or for each floor, or a validation in the elevator. What usually happens is that you enter a building and have access to everything except for two or three rooms that need private access. But what if I leave the toilet window open? A person who gets into the building through that window immediately has access to everything except for those two rooms, because the identity validation happens only at the main gate. In this case, you can see someone has added a malicious application on service B, and this service B can talk to service C and get information from DB two, or even connect to DB one and get information. That's not what we want.
So what is the zero trust approach, and how does it solve this? Basically, we are going to enforce identity validation in every service, not just in two crucial services, but in every one. We are going to enforce mutual TLS or token validation in every call between services, not only from the outside to the inside, but internally too, because you can never guess who is really calling a service. It could be a good service, one of our own systems, but it could be a fake service that someone installed there through whatever backdoor they used. We are also going to have a list of callers and destinations. So is this service allowed to call the outside world? For most of the services the answer is no, because they are restricted to calls between services.
Probably there's one that sends emails or checks some information from the outside world, so only that one will be allowed to send information to the outside world. To do this we can use the zero trust approach, which is perimeter-less security. We assume that everyone in the system could be an attacker, so we enforce identity verification for every service that is calling inside the system.
The core principles of zero trust are the following. You need a strong identity for every service that is calling. As I've said several times now, you need to authenticate access everywhere in the network, not only on the perimeter, not only at the edge. It's important to know the whole architecture, so you know which elements can connect to which other elements; if not, you could be restricting access for services that should have it. We are going to set several policies to allow or reject connections to the outside world, or to stop services from talking to other services. For instance, I have a web service that simply answers with information about products, and that service should probably not have access to the service that handles salaries. Yes, we know that, as the application is written, that service is not going to request anything from the salary service. But what if someone gets inside that service and executes a call to the salary service? We need to explicitly define which accesses are allowed, or at least which accesses are not allowed. Again, never trust the network: being inside the network does not mean you have access to services. You need to say, hey, I am service A and I'm trying to call service F.
Then the system will decide whether you have access or not. And, basically, always use services that are designed for zero trust, so that when you implement the zero trust architecture, those services are aligned with it.
But implementing a zero trust architecture has some challenges. If you need to implement all this security in all of your services, it's going to cost you a lot of time and money. You are going to suffer from legacy software compatibility issues, because you are trying to enforce mutual TLS in software that is so many years old that, if you want to update the code to use the latest libraries, you are going to touch a lot of code and enter a path of uncertainty. You may also use third-party technologies for which you don't have the source code, so what if they cannot easily be updated to cover all the security issues? And you take on continuous maintenance and monitoring requirements for all the services that you have touched: every time there is a new version of the mutual TLS libraries or a CVE has been fixed, you need to update the libraries for all the services that you are using in your mesh or in your system.
So in summary, you would need to add SSL transport to all your services, authorization and authentication, observability, and rules to check which service can talk to which other services. You would need to use a clean code approach, because you don't want to expose private information outside, and inspect whether your libraries and your code are affected by CVEs, and therefore change libraries, change approaches and update everything in every application. That's a lot of work to keep your whole cluster secure. Or, what we can do is use the zero trust approach without touching any application's code. So let's see how we can implement this zero trust architecture without touching our applications' code. For that, we are going to introduce the Istio service mesh.
It's a collection of microservices, and the basic idea of the Istio service mesh is that it installs a sidecar in each of our services' pods that handles all the traffic coming in and going out. This also allows us to implement observability or traffic management without touching the application, because in the end the application has no knowledge of Istio or of anything going on. Istio simply captures the network traffic coming to and going from the pod. This also allows us to implement A/B testing and canary deployments, because we can define which traffic goes to which service version. It even allows us to implement rate limiting, meaning we can decide how much traffic hits a service at a certain point, and everything is done transparently for the service. We can even define filters that modify the connection between services going through the network. One use case would be encrypting the traffic. Another would be adding, modifying or checking headers for the messages that travel between the services. And obviously one of the main use cases is to add authentication and authorization between all the services in a way that is transparent for our service.
Here we see how Istio works. As I mentioned before, it has a proxy, the Envoy proxy, and all the traffic across the mesh passes through the Envoy proxy. What Istio does is translate its configuration files into Envoy configuration files, which are more complicated. Istio allows us to divide all that configuration into simpler, smaller pieces, which are the ones that we are going to use. Finally, there's a control plane, in this case istiod, that manages all the different Envoy proxies in the different clusters. We can even merge clusters or incorporate virtual machines into these meshes and configure the networking among them, again as something transparent for the applications.
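The idea of a list of callers and destinations, enforced in front of a service that stays unchanged, can be sketched in a few lines of Java. This is only a toy model under stated assumptions, not how Istio or Envoy are implemented: the class name `ZeroTrustPolicy`, the service names and the "403 access denied" text are hypothetical, and the caller identity is assumed to have been verified already (for example by mutual TLS or a token).

```java
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Toy model of a mesh-enforced policy; NOT Istio's or Envoy's actual implementation. */
final class ZeroTrustPolicy {
    // Hypothetical allow-list: verified caller identity -> destinations it may call.
    private final Map<String, Set<String>> allowed;

    ZeroTrustPolicy(Map<String, Set<String>> allowed) {
        this.allowed = allowed;
    }

    /** Deny by default: a call passes only if the (caller, destination) pair is listed. */
    boolean isAllowed(String caller, String destination) {
        return allowed.getOrDefault(caller, Set.of()).contains(destination);
    }

    /** Wraps an unchanged handler, the way a sidecar sits in front of a service. */
    Function<String, String> guard(String caller, String destination,
                                   Function<String, String> handler) {
        return request -> isAllowed(caller, destination)
                ? handler.apply(request)
                : "403 access denied"; // assumed rejection text for this sketch
    }
}

final class ZeroTrustPolicyDemo {
    public static void main(String[] args) {
        // Assumed example services: "web" may call "products", "payroll" may call "salaries".
        ZeroTrustPolicy policy = new ZeroTrustPolicy(Map.of(
                "web", Set.of("products"),
                "payroll", Set.of("salaries")));

        // The handler knows nothing about the policy that protects it.
        Function<String, String> salaries = employee -> "salary data for " + employee;

        System.out.println(policy.guard("payroll", "salaries", salaries).apply("alice"));
        System.out.println(policy.guard("web", "salaries", salaries).apply("alice"));
    }
}
```

The point of the wrapper is that the handler itself contains no security code, which mirrors the claim that the mesh adds authorization transparently. Any pair that is not listed is rejected, so a compromised "web" service cannot reach "salaries" just because it is inside the network.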
So let's go now to a demo, and I will show you the services that we are going to handle. They are made in Java using Quarkus. Let's take a look at the files that we are going to use in the demo. We need a Kubernetes cluster, and we are going to have this Quarkus DTA service, another service that is exactly the same but with a different name, just for demo purposes, and the gateway. First we are going to use no security and see that every call works from anywhere. Then we are going to add security, applying the zero trust approach to our Kubernetes cluster using the Istio service mesh and an external identity and access management service. It's a free hosted Keycloak, where we can create a configuration that we will use in our validation, using the tokens coming from that service.
The steps that I'm going to follow are these. First we are going to test a call from outside the cluster, then a call from inside, so from one service to another service, and then a call from one service to the external world. Then we are going to move to the secured setup and replicate exactly the same steps. The files involved are the definition of the Quarkus service in Java, a gateway, a virtual service, a config map that we are going to modify for Istio, then a request authentication and an authorization policy to enforce validation for every connection to any service, and then a service entry that allows or prevents requests from the cluster to external services.
Our service is a simple one with just two methods. One returns a hard-coded text on the endpoint hello, and the other returns a text concatenated with a parameter on the endpoint echo. That's it, nothing else. It's a REST endpoint and nothing else: no security, nothing of that kind in our service.
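To make the shape of that service concrete, here is a minimal stand-in written with the JDK's built-in `com.sun.net.httpserver` package. It is a sketch, not the code from the talk, which is a Quarkus REST resource: the class name `GreetingService`, the response texts, the `/echo/{text}` path layout and port 8080 are all assumptions for illustration.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

/** Plain-JDK stand-in for the demo service; the talk's version is a Quarkus REST resource. */
final class GreetingService {
    // Hard-coded text for /hello (the actual wording in the demo is not shown; assumed here).
    static String hello() {
        return "hello";
    }

    // Text concatenated with the caller-supplied parameter for /echo/{text} (assumed path layout).
    static String echo(String text) {
        return "echo: " + text;
    }

    // Writes a plain 200 response; note there is no security code anywhere in this service.
    private static void reply(HttpExchange exchange, String body) throws IOException {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        exchange.sendResponseHeaders(200, bytes.length);
        try (OutputStream out = exchange.getResponseBody()) {
            out.write(bytes);
        }
    }

    public static void main(String[] args) throws IOException {
        // Port 8080 is an assumption; the demo exposes the service on port 80 through the cluster.
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/hello", exchange -> reply(exchange, hello()));
        server.createContext("/echo/", exchange -> {
            String path = exchange.getRequestURI().getPath();
            reply(exchange, echo(path.substring("/echo/".length())));
        });
        server.start();
    }
}
```

Nothing in this class checks tokens or certificates, which is the starting point the demo relies on: every protection added later comes from the mesh configuration, not from the service code.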
For the security we are going to use a gateway, where we define the port that it is going to accept, and a virtual service that configures an endpoint to be called by the gateway. So we are exposing endpoints that are connected to services. As you can see, the host is a service and we are going to target port 80. Then we are going to change a value in the config map that decides whether we allow or reject outbound calls. The default policy when calling endpoints outside of the cluster is allow any, but we are going to change it to registry only, because we want to allow only certain connections and not the rest.
The request authentication is the file that configures who is issuing the tokens. Here we also define which workloads or deployments are affected by this request authentication; with Istio we can always filter who is affected by a configuration using labels. In this case we are configuring the external Keycloak service and matching all the workloads with app Quarkus. The authorization policy says, okay, we are going to use this external Keycloak service to provide the valid token. The service entry effectively configures an external service. In our case we are prohibiting all connections to external services except for those that are defined as a service entry. With this service entry we say that connecting to Google.com is allowed, and any other host is forbidden. So it's easy to configure which external services are allowed by the mesh, provided that you have configured it as rejecting, that is, registry only in the config map, as we will see in a minute.
Okay, so let's play directly with our local cluster and see how we can implement the zero trust architecture using Istio. I already have a cluster running and I have also installed Istio. It's very easy to install Istio.
Just download the istioctl command, and that will allow you to install it. You can find all the steps in this Git repository. So let's deploy our first service. I'm going to build the service that we saw previously, specifying the namespace where it has to be deployed, the label that we are going to use for this workload, and the name of the app. We are going to build it using Maven. In this case it's a regular Java application running on the JVM, but we could even build a native artifact, because we are using Quarkus; that takes longer to build but is much faster to start. Now we are going to install another service, which is exactly the same but with a different application name. So we have two identical services doing exactly the same thing under two different names, just to demo the calls from one service to another, and then from one service to the outside world.
Now we are going to check the IP of the node, because our services are using NodePort, as you can see here, and the cluster is running on kind. So we can simply use the cluster's IP and the port; the port is 31591, and it redirects to port 80. If I do this curl from the outside world, I get the response from that service, and if I do exactly the same for the other service, I receive a response from it too. Now I'm going to open a shell inside one of the pods and do a curl to the other service, using the name of the service. We can see there is no problem, everything is working fine. Even if I call an external endpoint, we receive a response. So that's fine.
But now let's implement the security for that. The first step is to add a label to our default namespace saying that Istio injection is enabled. Okay, nothing happened, in fact.
What we need is to delete the pods so that they are recreated. And now we see that instead of having one container per pod, we have two: our application and the proxy. But we don't have anything yet in terms of security, so if I try to connect to the application, everything works as before. What we need now is to apply the files that configure Istio for this security. First we are going to add a request authentication, which says, okay, the JWT tokens are issued by this application. So we apply the request authentication. Then we apply the authorization policy that enforces having a token in each connection. With this applied, if we run exactly the same curl command, it says access denied. Why? Because my request is not passing any token. So we need to get a token. We connect to my Keycloak, which is online and working, and we get a token. Now we do exactly the same call, but this time passing the token in the Authorization header. Now it answers what is expected. If I do this again without passing the token, it says RBAC: access denied. If I open a shell in the pod and do exactly the same as before, it says exactly the same. So access is denied whether the call comes from outside or from inside. But if we copy the token and repeat the call, we get a response from the other service, because the token is valid.
Now we need to check whether we have access to the outside world. From the inside we do a call to Google. Okay, it is working. And if I do the same for Oracle, it is working.
But now I apply a service entry. Let's see what this service entry does. It says, okay, we are going to create a service entry for Google.com, so it is going to allow me to connect to google.com. But for that to restrict anything, we need to change a config map. We change this config map in Istio so that, instead of the allow any mode, it says registry only: anything that is not in the registry will be rejected. So we apply that config map. Now let's see. If I go inside my pod, I cannot connect to Oracle. But if I go to Google... that's interesting. Oh, because I hadn't applied the service entry. Now if I try to connect to Google I get an answer, and if I try to connect to Oracle, it does not work at all.
I also wanted to show you another tool called Kiali that can help us inspect our cluster and see how connections are working. It's very easy to install Kiali using Helm. You can install the operator and then install Prometheus, and it's fine and easy. Then we only need to do a port-forward, and finally we connect to the Kiali service. With this we see our applications and our services, and we can even see how connections are working. If I run the same curl that I did before without passing the header, it says access denied, and if I refresh, I effectively see that there's an error trying to connect to the service. We can see our applications and services and their elements: inbound traffic, outbound metrics, traces. So there's a lot of information that we can get from Kiali.
And that's it, basically, for the demo with the Istio service mesh. As you saw, it's very easy to handle all this configuration with Istio. Let's talk about the conclusions that we can draw from this presentation.
As you can see here, the number of CVEs has been increasing year after year, and in 2022 alone there were more than 800 CVEs with a score similar to that of the famous Log4Shell issue, which allowed remote code execution. A lot of services were at risk because they were using Log4j, a logging library, to log information. It is a very simple library, but it allowed this remote execution, and therefore a lot of issues were generated by it. So security has to be taken very seriously, because even through the most innocent library that we use, such as a logging library, a very severe vulnerability can come into your system and let a third party put something in it and gain access to all of your services.
Zero trust is definitely the way to go in order to minimize security issues, because you are enforcing validation in every call between services, instead of assuming that the perimeter is the only way attackers can get into your system. It can also be costly to implement, because you need to modify your applications: adding SSL transport, dealing with libraries, including third-party libraries, and making a lot of modifications to increase the level of security of your applications. And that is without considering that you may not be able to modify those third-party libraries because you don't have the code. It involves security inside your cluster. So it's not only security against external attackers; it is also security inside your cluster, from one service to another. Sometimes a third party may have put something inside your cluster to take advantage of it and get information, or one service may be malfunctioning or making a request that it shouldn't. From a configuration point of view, with zero trust you can configure who can talk to whom, so it's definitely more secure this way.
A service mesh can help you because it allows you to implement security without touching your applications, since it runs beneath them. They don't even know you are using a service mesh. It's transparent for existing applications: you apply the service mesh to certain namespaces, security is implemented and enforced, and the applications are not modified by anyone. You can even add more features to your system, because you can add observability, logging, headers in the communication, payload encryption and filtering, a lot of features that can be added transparently for your applications.
It introduces network complexity, that's for sure. Using a service mesh adds another layer of complexity to your system. Before, you had your application, your cluster and probably your gateway, but now you also have a service mesh controlling the network, so a misconfiguration can mean that applications cannot talk to the applications they need. But, yeah, nothing comes for free. You can also implement security in gradual steps: you can enforce mutual TLS now, then token validation, then encryption, then policies for who can talk to whom. Nothing has to be done all at once; you can add more restrictions as you become more mature in managing the service mesh.
It involves a high level of customization, because you can do a lot of filtering and modification from the Envoy proxy. There are several filters already built into Envoy, and you can even use Wasm to create new filters for your Envoy proxies, doing whatever enrichment, filtering or transformation you want on the messages going from one service to another.
Finally, here are the references that I've used, which I think can help you if you want to dig into the details of zero trust architecture. And that was it. Thank you very much for being patient with this presentation; I hope it lit a spark of curiosity in you. If you are not already using the zero trust approach, or if you have any question or comment about zero trust, I will be more than glad to answer. Use my Twitter handle or my email, and you can also find some posts about this in my blog. Thank you very much, and I hope you enjoyed the presentation.

Jonathan Vila

Developer Advocate @ Sonar
