Conf42 Kube Native 2024 - Online

- premiere 5PM GMT

Utilizing ArgoCD & Kubernetes to provide and expand Suicide Prevention and Crisis Intervention Services for LGBTQ+ Youth

Abstract

Discover Trevor Project’s global expansion for suicide prevention. On National Coming Out Day 2022, they expanded life-saving services to Mexico with ArgoCD, Kubernetes, and Redis. Explore their switch from Salesforce to an open-source stack, showcasing resilience and innovation on a tight budget.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hey, everyone, welcome to our session. Today we're going to be talking about utilizing Argo CD and Kubernetes for suicide prevention for LGBTQ+ young people. My colleague Alfredo and I will be presenting this topic on behalf of the Trevor Project. The Trevor Project is a nonprofit dedicated to the mental health of the LGBTQ+ community. If you want to learn more about who we are and what we do, please feel free to scan the QR code. We'll also have it at the end of the presentation, so you can scan it again then. With that said, let's move to the next slide.

So let's talk about what the Trevor Project is. As I mentioned, the Trevor Project is the leading suicide prevention and crisis intervention service for LGBTQ+ young people. Here at the Trevor Project, we provide five major services for people within the U.S., and now also in Mexico.

The first service is Crisis Services, which we'll talk more about later in this session. Crisis Services is where we connect with LGBTQ+ young people who are experiencing suicidal ideation or other crises they want to talk through, whether that's through text, SMS, or web chat. Our second service is advocacy: we have a team that advocates for laws that are more helpful for the LGBTQ+ community. Our third service is research. We conduct research studies to learn about trends within the community and to understand where mental health is going for these particular individuals. The fourth service is TrevorSpace, a social platform where young people between the ages of 13 and 24 can go online, get support, talk about things, and help each other out; it's pretty much a safe space for everyone on the platform. And last but not least is education and public awareness.
This is the service where we provide mental health resources and materials for both the youth and the allies who come to our page to learn more about mental health, suicide prevention, and ultimately, LGBTQ+ communities. And with that said, I'll pass it over to Alfredo to introduce himself.

Thank you so much, Paul. I'm Alfredo Pizana, tech lead here at Trevor. I'm currently focusing on MERN technologies and part of the infrastructure, and I'm leading the development of this amazing application we are going to talk about. Thank you.

Awesome. Thanks, Alfredo. On my side, my name is Paul Pham. I use he/him pronouns, and I'm the engineering manager who gets to work closely with Alfredo on our day-to-day operations, which support many of our verticals here at the Trevor Project.

So let's talk about what we're going to be sharing. We're going to share four things. First is the project: what did we do here at the Trevor Project, and how did we utilize Argo CD and Kubernetes, as well as many other tools within our system? That leads us to the architecture, where we'll share the high-level view and then dive a little deeper into the open-source tools. Then we'll go through the challenges and lessons learned from the project; we want to share them so that if you go down this path, or a very similar one, you can avoid certain things. And last but not least, we want to share the impact of the solution we have implemented so far.

So let's start with the project context. Here at the Trevor Project, our mission is to end suicide among LGBTQ+ young people. That means we want to serve the youth, but at the same time also help others learn the importance of suicide prevention.
The Trevor Project's journey started on August 11, 1998, over 25 years ago, when we took our first phone call to provide support for LGBTQ+ young people. Ever since then, we have been iterating on our technologies and services, moving from phone calls to text messages to, ultimately, web chats for youth. But we wanted to do more: to expand these services beyond the United States to other nations, such as the one we're talking about today, Mexico. The reason we wanted to do this is that our research studies have shown how impactful our services are, as well as the need for our services in a global market. As a result, we did more research to understand where we should expand. We started with 194 countries, and based on five criteria (safety, scale, equity, language, and learning) we narrowed it down to two countries: Mexico and the Philippines. We started with Mexico because it is a neighboring country, close enough that we could set up operations safely and successfully.

Our goal was to deliver this new platform, which expanded our services beyond the United States, on October 11, 2022, and we did so successfully. That date is important because it is National Coming Out Day in Mexico, and we wanted to align with it to market the launch and have the platform up and running.

Before, for our services in the United States, we were using what this screenshot captures: a managed service within Salesforce that allows us to connect with LGBTQ+ young people and help them by updating case records, all through one platform. However, this was not going to work for an expansion.
It would take a lot of time to scale up and to maintain various instances globally. This led us to create what we called the global platform, or CSM, the Crisis Services Management platform, which allows us to scale up more quickly, maintain it more easily, and have better continuous development and delivery. That leads us to this new UI, which we built with open-source technologies. You can see the similarities, but at the same time the major differences, and those are more on the back-end side of things. With that said, I'll pass it over to Alfredo to talk more about the back end, pretty much the fun side of this project.

Thank you so much, Paul. Let's talk a little about the architecture. First, we want to mention the tools we use for the application. For infrastructure, we are using GCP. For source code management, we use GitHub, along with GitHub Actions workflows for our CI/CD pipelines. For infrastructure deployments, we decided to use Terraform as our infrastructure as code, and Argo CD for continuous delivery to our Kubernetes application. We are using Kubernetes, and Twilio as our contact center platform. Next slide, please.

During those conversations about expanding our services, we planned to create a platform that was scalable, open source, and highly available, aiming to support multiple languages and different types of digital channels and interactions. In case you see the name "global platform" somewhere, it is the application we created to support multi-language and multi-layer. Here on the architecture, on the left you will see the users, that is, our youth, and we currently support SMS, WhatsApp, and web chat as digital channels.
We support these through Twilio, our contact center platform, which helps us handle all of the active interactions in real time. Now, talking a little about the database: we chose MongoDB as our persistent storage and Redis as our cache, which helps us scale on demand if needed. Now let's dive a little into the global platform. Could you turn the slide, please?

You will see that we are using Kubernetes for the infrastructure. On the left is the UI. We are using one cluster for the UI and another cluster for the back end, and we are using one cluster per environment. So we have QA, dev, stage, production, and so on, and there is one cluster for each. In the middle, you can see Argo CD, which is our continuous delivery platform that helps us smoothly release different versions of the application to our different clusters and pods. At the bottom, you will see GitHub, which helps us automatically trigger those events and deploy using Terraform, our infrastructure-as-code platform, which lets us define what we want to create on GCP and how we want to create it. Could you go to the next slide, please?

Here's an example of a configuration file for Terraform. Specifically, this one is a config file for a cluster. We just want to show how we define it, because this is a template in which we use different variables to define the environment, region, and so on. Could you go to the next slide, please?

Now, talking a little more about customization and scalability: we decided to use Kustomize because we had a challenge, which was finding a method to deploy our apps to Kubernetes automatically. So we decided to use Kustomize files, which help us create templates for Terraform.
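The base-plus-overlay idea behind Kustomize can be illustrated with a minimal Python sketch. This is not Kustomize itself, nor the Trevor Project's actual configuration: the `chat-api` name, the labels, and the replica counts are hypothetical, and real Kustomize patching handles lists and other cases that this simplified merge ignores.

```python
import copy


def apply_overlay(base: dict, overlay: dict) -> dict:
    """Return a copy of `base` with `overlay` merged in recursively.

    Nested dictionaries are merged key by key; any other value in the
    overlay replaces the value from the base.
    """
    merged = copy.deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_overlay(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical base manifest shared by every environment.
BASE = {
    "kind": "Deployment",
    "metadata": {"name": "chat-api", "labels": {"app": "chat-api"}},
    "spec": {"replicas": 1},
}

# Hypothetical per-environment overlays: only the differences are listed.
OVERLAYS = {
    "dev": {"metadata": {"labels": {"env": "dev"}}},
    "production": {
        "metadata": {"labels": {"env": "production"}},
        "spec": {"replicas": 3},
    },
}


def render(environment: str) -> dict:
    """Build the final manifest for one environment."""
    return apply_overlay(BASE, OVERLAYS[environment])
```

Calling `render("production")` yields the base manifest with the production label added and the replica count overridden, while `BASE` itself is left unchanged, so each environment only has to describe how it differs from the shared base.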
On the right, you will see an example of how we organize different applications, each with its own files: application one, application two, and so on. This is just a template to show you how we decided to create the base and the different applications. Could you go to the next slide, please?

And this is a real example of the results. By using different config maps, deployments, external secrets, and so on, we were able to create a real manifest, shown on the right, with real values that can be used to deploy automatically through Argo CD. Could you go to the next slide, please? Thank you.

So why did we decide to use Argo CD? Argo CD helps us automatically deploy the desired application to a specific environment. It also helps us try different updates and versions through different branches by connecting with GitHub. Argo CD also follows the GitOps pattern of using the code repository as the source of truth for defining the desired application state, which means we can create different branches and tags to maintain and support different versions of the application, depending on the environment. And just to mention, Kubernetes manifests can be specified in different formats; the one we chose was Kustomize applications, but there are others, such as Helm and JSON. Could you go to the next slide, please?

This is the result, with our infrastructure up and running. You can see the names of the different pods that are currently running, and on the right you will see the versions we are maintaining, depending on the environment. For example, you can see that there's a version 0.2.0.7.2 that is healthy and synced, but there are others that are out of sync, and there could be others that are crashing or starting, depending on the tests we are running.
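The GitOps comparison described above, where the repository holds the desired state and the tool reports whether each application matches it, can be sketched in Python. This is a conceptual illustration only, not Argo CD's API or implementation; the `AppState` type, the application names, and the version numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class AppState:
    """Simplified stand-in for what a manifest declares about an app."""
    version: str
    replicas: int


def sync_status(desired: Dict[str, AppState],
                live: Dict[str, AppState]) -> Dict[str, str]:
    """Compare the desired state (from Git) with the live state (in the cluster).

    An app is "Synced" when the live state equals the desired state,
    and "OutOfSync" when it differs or does not exist yet.
    """
    status = {}
    for name, want in desired.items():
        status[name] = "Synced" if live.get(name) == want else "OutOfSync"
    return status


def reconcile(desired: Dict[str, AppState]) -> Dict[str, AppState]:
    """Return a live state that matches Git, as an automated sync would aim for."""
    return dict(desired)


# Hypothetical desired state, as it would be read from a Git branch or tag.
DESIRED = {
    "web-ui": AppState(version="1.4.0", replicas=2),
    "chat-api": AppState(version="2.1.0", replicas=3),
}

# Hypothetical live state: chat-api is still running an older version.
LIVE = {
    "web-ui": AppState(version="1.4.0", replicas=2),
    "chat-api": AppState(version="2.0.9", replicas=3),
}
```

With these values, `sync_status(DESIRED, LIVE)` marks `web-ui` as synced and `chat-api` as out of sync; after `reconcile(DESIRED)` replaces the live state, every application reports as synced. Pointing each environment at a different branch or tag amounts to giving each one its own `DESIRED` dictionary.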
So this gives us great visibility into what's happening during releases and during the development of the project and the code. We can easily identify whether a service is up and running, out of sync, crashing, restarting, or whatever. Could you go to the next slide, please?

Here is an example of what's inside those elements of the list. This is a manifest file, which contains all of the config maps, versions, and so on. We can inspect it and identify whether there is any error. We could also run scripts to generate those files, but that was a tedious, manual process; it consumed time and was prone to human error. That's why we decided to implement Argo CD: to automate everything and transition smoothly from one version to another. As you can see, there is an example of a short command line that we can use to generate that file. Could you go to the next slide, please?

Talking a little about challenges and lessons learned: we faced three main challenges. The first was the strict launch date set by the business. It was a fixed date, so we had to align to it. We had to decide what we wanted to achieve by that day, the MVP, and how we were going to do it. That drove us to choose open-source technologies such as the ones we just mentioned, Kubernetes and Argo CD, which helped us automate the process and reduce delivery times, so we could dedicate more time to development. Another challenge, once we had defined the technologies we wanted, was a lack of knowledge about building with, or using, open-source technologies. So it was a journey, because we had to learn and decide why we wanted to use each one.
As we just mentioned, for example, the decision to use Argo CD was a learning journey, but it also helped us reduce time in other areas. And last but not least, at some point we were maintaining up to 10 repositories, because we were migrating from the old platform to the new one. It was a challenge to transition smoothly from the old platform while maintaining all the data and creating scripts in different repositories to move that data to the new one. At some point we had to reduce that and create monorepos, and so on.

Thanks for sharing that, Alfredo, both the architecture and the challenges. As you can tell, Kubernetes and Argo CD have made it significantly easier for us to deploy. But as Alfredo also mentioned, a lot of challenges did come our way. So what did we learn from this whole project, utilizing all these technologies?

The first lesson is that we needed to be resilient and introduce a lot of these open-source technologies ahead of time, so that folks can learn about them and use them to their full potential. During this whole project, we were learning and building at the same time. So sometimes we didn't use best practices or, as mentioned earlier, we deployed manually instead of going through the whole pipeline. This is something we had to learn the hard way. So if you can, make sure that onboarding, knowledge transfer, and all that good stuff about these technologies happens beforehand.

The second lesson is that we needed to add more observability to our different systems and applications. As Alfredo mentioned, we have quite a few clusters to manage. We needed to understand whether those pods were up and running successfully and for how long; all those metrics are super important.
That allows us to debug and find root causes a lot more quickly, and to prevent some of the system outages that did happen when we scaled or when errors came up. So it's more of a preventative measure. In the future, if we were to redo this, we would add observability as part of the launch, so that we could get it all set up end to end.

And the last lesson I'd like to share is that it's also important not just to track observability, but also to track your costs. As Alfredo mentioned, we're a nonprofit, and it's really important for us to keep track of our costs. Like all companies, we're trying to reduce costs but still deliver a great-quality service. One thing we want to do in the future is to autoscale not just the deployments, but also the pods: scale up if there's more traffic and reduce the pods if there's less traffic, ultimately fluctuating based on traffic to give the best service while reducing computing costs.

Moving to our next slide, on impact. By incorporating all these tools and technologies, two great impacts came out of this project using Kubernetes, Argo CD, and many other technologies. First, we were able to support over 20,000 people in Mexico since we launched. That is a huge number of people who reached out to us and ultimately got the help they needed. Second, our system was able to scale to support over 1,500 chats per month. Ideally, we want to continually support more, but this is a great number to start with.

And with that said, that concludes our session. Hopefully you learned a little about what the Trevor Project is and about our expansion project, moving from Salesforce to more open-source technology.
Hopefully you also learned a little about how you can utilize Argo CD and Kubernetes within your own systems to do better, faster, and safer deployments for your Kubernetes clusters and pods. If you ever want to learn more about what we do, feel free to contact Alfredo or myself, or scan the QR code and reach out to us through there. Thank you so much, and I hope you all have a good rest of your day.

Thank you so much.

Paul Pham

Engineering Manager @ The Trevor Project

Jesus Alfredo Pizana Espinosa

Technical Lead @ The Trevor Project


