Conf42 DevOps 2025 - Online

- Premiere: 5PM GMT

Architecting Real-Time Analytics: Building Scalable Data Pipelines for Customer Experience Excellence


Abstract

Discover how to architect real-time analytics pipelines that process 1M+ events/second with 99.99% uptime. Through hands-on case studies, learn how top companies built resilient data systems that reduced incident response by 67%.


Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hey everyone, I'm Amar, and I'm thrilled to welcome you to this session. As a senior manager of software engineering at Lowe's, I have had the privilege of leading multiple teams that design and deliver platforms capable of scaling to millions of users. Today, I'm going to dive into something even more exciting and transformative: how to build a resilient real-time analytics pipeline, a topic that's redefining the way business operates in our fast-paced digital world. Think about this: what if you could process over a million events per second while maintaining near-perfect uptime? Imagine the competitive edge that would give your business, whether it's delivering lightning-fast personalized recommendations, enabling real-time decision making, or proactively addressing customer needs before they even arise. In this session, I'm going to cover the cutting-edge architectures and strategies that leading companies are using to achieve these goals. I'll share real-world success stories, walk you through the technical challenges, the design, and the solutions, and provide a roadmap to help you implement these systems in your own projects. Whether you are here to supercharge your technical expertise, gather actionable insights for your current challenges, or simply explore the possibilities of real-time analytics, you are absolutely in the right place. Before we get to the next slides, you can connect with me on LinkedIn at Amarnath Immadisetty.

Let's start with customer expectations. In today's digital era, 82 percent of customers expect instant, personalized responses, which makes real-time analytics an absolute competitive advantage: organizations using it see a 26 percent boost in customer satisfaction metrics, and that leads to 3.2 times higher customer lifetime value.

Now, the core benefits of real-time analytics. The first thing to look at is processing speed: a real-time analytics system has to handle a huge amount of data, on the order of a million events per second. Instant personalization is also embedded in real-time analytics, allowing a business to tailor customer experiences in the blink of an eye. And that boils down to business agility: companies using real-time analytics can adapt to changes within milliseconds.

Let's look at scenarios where companies are actually using this. Imagine a large e-commerce site; I'll take Amazon as an example, since most of you have probably shopped there. Think of the number of transactions Amazon gets: a customer clicks on a product, adds it to the cart, searches for something, views the cart, makes a payment, or browses the deals. All of this data has to be collected instantaneously. With the power of real-time processing, the system can track millions of these events happening at the same time, all across the globe, ensuring a smooth and fast experience for every user.
Take instant personalization. If you have ever shopped on Amazon for, say, a refrigerator, your browsing history captures that refrigerator search and the site starts recommending related products. Another classic example: most of us have watched Netflix or Amazon Prime Video. When we open the app, the platform quickly analyzes our past viewing history and suggests shows or movies we might like. Similarly, in online shopping, as we browse products we all notice personalized recommendations based on our preferences, such as "people who bought this also bought this," all made possible by real-time analytics.

When it comes to business agility, real-time analytics empowers the business to adapt to the situation within milliseconds. Take ride-sharing apps like Uber or Lyft: they adjust prices dynamically based on demand, traffic conditions, and the number of available drivers. When demand spikes during rush hour or a special event, the system quickly raises prices to encourage more drivers to get on the road. This kind of fast decision making helps a business stay competitive and responsive. These benefits show that real-time analytics is not just a technical capability but a game changer in creating better, faster, and more personalized experiences for our customers. I'll sketch the surge-pricing logic in code right after this.

The same pattern shows up across industries: major retailers do it, and the travel industry does dynamic pricing adjustment. If you are searching for a flight ticket, as more searches, clicks, and demand come in, the ticket price goes up or down. Beyond instant personalization and business agility, real-time analytics also helps us anticipate customer needs. Imagine walking into a Starbucks where they call your name as soon as they see you and already know your order and what goes with it; you would start to feel like a far more valued customer of that store. And for churn reduction, it helps proactively address issues and friction before they escalate, which improves the whole customer engagement journey.
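To make the dynamic-pricing idea concrete, here is a minimal Python sketch. The demand/supply formula, the cap, and the function name are my own illustrative assumptions, not Uber's or Lyft's actual algorithm.

```python
# Illustrative surge-pricing sketch -- the formula and the cap are
# assumptions for demonstration, not any vendor's real algorithm.
def surge_multiplier(ride_requests: int, available_drivers: int,
                     cap: float = 3.0) -> float:
    """Scale the price with the demand/supply ratio, capped at `cap`."""
    if available_drivers == 0:
        return cap  # no supply at all: charge the maximum multiplier
    ratio = ride_requests / available_drivers
    return min(cap, max(1.0, ratio))

base_fare = 12.50
print(base_fare * surge_multiplier(ride_requests=180, available_drivers=60))
# 180 requests / 60 drivers = 3.0x (capped), so the fare becomes 37.50
```

A real system would recompute this continuously, per zone, as new demand and driver-location events stream in.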
So let's dive into how we design a real-time analytics pipeline and what components are required to build it. I'll go high level first. Data collection is where you start gathering the data; unless you have data to act upon, you cannot predict or react to anything, so the first step in the process is to collect it. Next, process the data: processing could mean cleansing the data, normalizing it, and deriving valuable insights from it. Then we obviously need data storage to hold the data. Then comes delivery: the synthesized data needs to be put into a business format with actionable insights around it, which is the data delivery component. And it all rests on monitoring and resilience: unless we have monitoring and resilience across the overall application, the reliability, stability, and accuracy of the data cannot be guaranteed. Monitoring plays a big role in ensuring data quality, data governance, data stability, and data accuracy.

Let's deep dive into the first section, the data source. Data collection has two aspects: what is the data source, and what do we capture? The data source is the starting point of any real-time analytics pipeline, where the data originates: the platforms, systems, or devices generating the information that needs to be analyzed, most of which we use almost daily. Websites, where we capture user activity such as what the user viewed, clicked, or submitted. Mobile apps, which give us interactions like app navigation, product searches, and purchases. IoT devices, which over the last 10 to 15 years have become part of our lives: home sensors, wearables, or industrial equipment that continuously generate telemetry for application health and activity monitoring. At times we also need to parse logs, whether server logs or application logs that track system performance, errors, and user interactions. There is social media, where real-time customer opinions, trends, and mentions from platforms like Twitter, Meta, or Instagram matter a lot. And there are payment systems, whose transactional data records real-time purchases, refunds, and subscription renewals. The data is available in numerous places, and these are the sources most widely used to derive real-time analytics.

Yes, the data is available; now, how do we capture it? In real time, you have to understand this is a continuous mechanism, not a batch job. As soon as an event happens, the system acts on it: data is collected continuously and transmitted in real time.
Some of this can be achieved with technologies like APIs, event trackers, or data streaming frameworks. Each data source I just mentioned contributes unique insights that help a business better understand its operations and customers. Take healthcare: they might not be as interested in payment systems as in their website and the transactions they are handling. In retail, they might care more about social media to see how guest purchases are trending, along with the sites customers browse and the competing sales in progress. So the relevant sources vary with the business you are in.

As a real-world example, on Amazon, every time a user searches for a product, clicks an item, adds it to the cart, goes to the payment page, and completes a purchase, the data from that whole journey flows continuously into the real-time analytics system, allowing it to act immediately: serving recommendations, tracking inventory, and analyzing customer preferences for future recommendations. For instance, if the user was looking for wireless headphones, the system collects that input and immediately processes it to suggest related products, specific brands, or matching accessories. These days most large companies have both a mobile app and a web application, so continuity of the session between the app and the web is essential: if we search for a product in the Amazon app, we want to see the same recommendations on the Amazon website as well. So once we understand the business operation, agility, and need, the data gets captured and transmitted into the system, as in the sketch below.
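As a concrete illustration of continuous capture, here is a minimal sketch using the kafka-python client to stream website interaction events into a Kafka topic. The broker address, topic name, and event schema are assumptions I made for the example.

```python
# Minimal clickstream-capture sketch using kafka-python.
# Broker address, topic name, and event fields are illustrative assumptions.
import json
import time
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def track_event(user_id: str, action: str, product_id: str) -> None:
    """Send one user-interaction event to the 'clickstream' topic."""
    event = {
        "user_id": user_id,
        "action": action,          # e.g. "search", "click", "add_to_cart"
        "product_id": product_id,
        "ts": time.time(),
    }
    producer.send("clickstream", value=event)

track_event("u-123", "search", "wireless-headphones")
producer.flush()  # ensure the event actually leaves the client buffer
```

In production you would add batching, retries, and security, but the shape is the same: every interaction becomes an event on a stream the rest of the pipeline can consume.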
The next step, data processing, is the bread and butter of the application. We have data, but if we cannot make use of it, it does not make a lot of sense; we need to derive actionable insights from it. In this step the raw data collected from the different sources is analyzed, transformed, and cleansed into something useful, which makes it understandable and actionable for real-time decision making. Say we want to anticipate what a customer expects: unless the customer has a continuous pattern of behavior, we cannot derive real-time insights. Suppose we want to know what Amazon will make in revenue tomorrow. Unless we have historical data that is continuous and actionable, from which we can compute averages, means, and other statistics, we cannot act on it. Starting on day one, we may not be able to derive many insights; we can do some geo-based analysis and a few other attributes, but continuity, a historical record, is always required for real-time decision making. During this step we also determine whether a given piece of data is actionable at all.

So what exactly happens during data processing? When data enters the system it might be messy or quite unstructured, because we receive it from websites, mobile apps, payment systems, IoT devices, and many other sources, each with its own structure; sometimes there is too much data, sometimes too little. Processing primarily involves data cleaning: fixing errors, removing duplicates, and checking whether the data received is complete. A simple example: if we receive an identical purchase from two sources, the system should keep only one. Yes, there are cases where duplicated data is useful, but for most real-time decision making, duplication should be avoided. Then comes data transformation, converting the data into the format required by the subsequent steps, whether for visualization or delivery. For example, clickstream data captured on the website can be translated into something like "the most clicked product in the last 10 minutes." Finally there is analysis and insight: deciding whether the data can be used, perhaps applying an algorithm or a machine learning model, for instance predicting which products a user is most likely to buy based on their behavior. So, running down the steps: data cleaning, data transformation, and analysis and insight.

What do the industry tools look like? For data streaming, Apache Flink and Apache Spark are without question the pioneers among open source frameworks; they handle large amounts of data in real time. I have tested this myself: I have seen use cases where we had to process more than a million records per second, and we did not have any problem. AWS Lambda can execute small tasks quickly as new data arrives, which is quite helpful for the delta changes we talked about. And Google Dataflow processes streams of data from various sources and transforms them, as a full-fledged managed product from Google.
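Going back to the "most clicked product in the last 10 minutes" example, here is a plain-Python sketch of the sliding-window logic. A framework like Flink or Spark would run this at scale with proper state management; this version only illustrates the idea, and all names are my own assumptions.

```python
# Sliding-window "most clicked product in the last 10 minutes" sketch.
import time
from collections import Counter, deque

WINDOW_SECONDS = 600  # 10 minutes
events: deque = deque()      # (timestamp, product_id) pairs, oldest first
counts: Counter = Counter()  # live click counts per product

def record_click(product_id: str, ts: float | None = None) -> None:
    """Add one click, then evict anything older than the window."""
    ts = ts if ts is not None else time.time()
    events.append((ts, product_id))
    counts[product_id] += 1
    while events and events[0][0] < ts - WINDOW_SECONDS:
        _, old = events.popleft()
        counts[old] -= 1
        if counts[old] == 0:
            del counts[old]

def most_clicked():
    return counts.most_common(1)[0][0] if counts else None

record_click("sku-42"); record_click("sku-42"); record_click("sku-7")
print(most_clicked())  # -> "sku-42"
```

The eviction loop is the whole trick: every arriving event pays the cost of expiring old ones, so the answer is always current without any batch job.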
Why is this important? Without data processing, the raw data being collected is too voluminous and too chaotic to understand or reuse; unless we process, organize, and analyze it to extract meaningful patterns and trends, it cannot drive actionable insights. Some real examples: we just talked about a user searching for a product, and within milliseconds the system processes that data to update the personalized recommendations. As simple as that: when we buy a washer, the next recommendation should be a dryer, not something unrelated. The system also analyzes how many searches are happening at the same time, to adjust inventory, pricing, and recommendations. In the case of Uber, Lyft, or any popular ride-sharing app, the app processes the rider's location, driver availability, traffic conditions, any events happening nearby, traffic volume, and weather, and derives the estimated fare. It has to process all of it, draw a reasonable inference, and in the majority of cases do so in real time, which is how we get accurate ETAs and pricing.

Then there are financial institutions. In banking, when a transaction occurs, fraud detection processes the data and checks for particular patterns. Many of us have encountered this: use your credit card in a shop you have never visited, or get tagged for unusual spending, and the card is blocked then and there. The system evaluated our usage pattern, the location, the type of transaction, and the amount, and it also checked against the limits we have configured. There are countless examples like this; acting in real time has empowered banking and many other industries, and it has great potential. Data processing plays a huge role in understanding all the attributes and translating the data, as in the toy example below.
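Here is a toy rule-based version of that fraud screen. The thresholds, fields, and rules are invented for illustration; real systems use far richer features and machine-learned models.

```python
# Toy rule-based fraud screen -- thresholds and rules are invented
# for illustration, not how any real bank scores transactions.
from dataclasses import dataclass

@dataclass
class Txn:
    amount: float
    country: str
    merchant_category: str

def looks_fraudulent(txn: Txn, home_country: str,
                     avg_spend: float, limit: float) -> bool:
    """Flag a transaction that breaks simple behavioral rules."""
    if txn.amount > limit:
        return True                       # exceeds the configured limit
    if txn.country != home_country and txn.amount > 3 * avg_spend:
        return True                       # unusually large foreign spend
    return False

print(looks_fraudulent(Txn(950.0, "FR", "electronics"),
                       home_country="US", avg_spend=120.0, limit=2000.0))
# Foreign purchase nearly 8x the average spend -> True (alert or block)
```

In a streaming pipeline this check runs on every transaction event within milliseconds of ingestion, which is what makes the block-at-the-register experience possible.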
Now, data storage. This is where the processed data is saved so it can be accessed quickly, in near real time, whenever needed. This piece is crucial because the store has to be reliable, fault tolerant, highly available, and geo-distributed: a dependable place where data can be retrieved and acted on without delay. Once data has been processed, it must be stored in a way that supports real-time access, instantly available for analytics and decision making. And consider the volume: in the credit card example, there are countless transactions across countless cards and locations, so the storage has to handle extremely high volumes and still allow fast reads and fast reactions.

It also has to be stored securely. Beyond compliance regimes like PCI or HIPAA, which depend on the industry and the data involved, security here also means high availability across multiple zones with data replication, so that if one data center or system goes down, a backup copy of the data is available to act upon.

For real-time analytics there are a few classic storage styles. First, in-memory stores, where data sits in memory for ultra-fast, often TTL-based access. Suppose a user adds an item to the cart and then lingers on that page for a while: we can act on that event immediately and surface a promotion, if that's what the business wants. Redis and Memcached play the biggest role here; they are commonly used to pull recommendations out of cached data and respond instantly, because the moment we land on a website we expect quick recommendations. Second, NoSQL databases, popular for high-scale, large-volume storage where we may need to run many conditions to derive an insight; MongoDB, Cassandra, and DynamoDB are popular choices. A ride-sharing app like Uber stores trip details and driver availability this way for quick updates and retrieval.
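Returning to the in-memory case, here is a short redis-py sketch of caching freshly computed recommendations with a TTL. The key naming and the five-minute expiry are illustrative assumptions.

```python
# Caching computed recommendations in Redis with a TTL.
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def cache_recommendations(user_id: str, product_ids: list) -> None:
    # setex stores the value and expires it after 300 seconds,
    # so stale recommendations age out on their own.
    r.setex(f"recs:{user_id}", 300, json.dumps(product_ids))

def get_recommendations(user_id: str):
    cached = r.get(f"recs:{user_id}")
    return json.loads(cached) if cached else None

cache_recommendations("u-123", ["sku-42", "sku-7"])
print(get_recommendations("u-123"))  # ["sku-42", "sku-7"] until the TTL hits
```

The TTL is doing real work here: it bounds how stale a recommendation can get without requiring any cleanup job.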
Third, with machine learning and AI booming, time-series predictions are used heavily, and time-series databases are ideal for data that changes over time: the x-axis is time, the y-axis is your attribute. Take the earlier example: what will my company make tomorrow, the day after, in three days? "What my company makes" covers many attributes: what do my sales look like, what do my orders look like, what does each store bring in? From that data I can project inventory, project incoming data volume, and project footfall into the store, so I can be equipped well in advance. InfluxDB is one of the popular time-series databases, along with TimescaleDB and Druid, and monitoring systems like Prometheus use a time-series database to store server performance data in real time.

Fourth, sometimes we want to store data at a much larger scale: the historical data I mentioned. For machine learning models to work accurately, we often feed in a lot of input data to account for seasonality, weather, and the system's whole history, and for that we need data lakes, where we can keep raw or processed data for large-scale processing later. Amazon S3 and Hadoop HDFS are common choices. Netflix, for example, stores user data in data lakes for later analysis, such as finding long-term viewing trends.

Across these four styles, in-memory, NoSQL, time-series, and data lakes, what are the key features to look for? Absolute low latency: whatever database we use, it should offer fast reads, and fast writes too, for real-time access. Scalability: the ability to handle growing traffic and data volume without slowing down. Fault tolerance, as I said earlier: the storage should not lose any data even through a hardware failure. And flexibility of access: do we need structured, semi-structured, or unstructured data? Fault tolerance, low latency, scalability, and accessibility play the biggest roles.

From my reading on how Amazon stores inventory data: real-time item inventory is kept in Amazon's DynamoDB, ensuring accurate stock updates for millions of products; when an item is purchased, the system instantly updates the inventory level so customers only see what is in stock. For Netflix streaming recommendations, data on viewing habits, the last movies and shows watched and the user's preferences, lives in a distributed database like Cassandra and is read in real time to suggest titles tailored to each user. And for IoT, most of us have smart homes in one way or another: smart thermostats and cameras send readings to a database like InfluxDB, where they are stored and processed to raise alerts and adjust settings in real time; some thermostats track temperature and usage patterns in near real time and apply eco adjustments automatically. As much as every component matters, data storage plays a huge role: it is where all of this data lives. Without efficient storage, even the best real-time collection and processing we build would fail, because the data would not be available when required; high latency would frustrate users, who expect instant responses, and lost data would make long-term insights unreliable.
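To make the inventory example concrete, here is a boto3 sketch of an atomic, real-time stock decrement in DynamoDB. The table name, key schema, and attribute names are my assumptions; the condition expression is what prevents overselling.

```python
# Real-time inventory decrement in DynamoDB via boto3 (illustrative names).
import boto3
from botocore.exceptions import ClientError

table = boto3.resource("dynamodb", region_name="us-east-1").Table("inventory")

def reserve_unit(sku: str) -> bool:
    """Atomically decrement stock; fail cleanly instead of overselling."""
    try:
        table.update_item(
            Key={"sku": sku},
            UpdateExpression="SET stock = stock - :one",
            ConditionExpression="stock >= :one",  # never drop below zero
            ExpressionAttributeValues={":one": 1},
        )
        return True
    except ClientError as e:
        if e.response["Error"]["Code"] == "ConditionalCheckFailedException":
            return False  # out of stock -- surface that to the customer
        raise

print(reserve_unit("sku-42"))
```

Because the check and the write happen in one conditional update, millions of concurrent purchases can hit the same item without two buyers claiming the last unit.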
Now that the data is stored, how do we make it available for actionable insights or share it with users? This is arguably the final step, though I list monitoring and resilience separately; that concern really runs through every component. Data delivery is the last stage of the real-time analytics pipeline, where the processed, stored insights reach the users, systems, and applications that act on them. In simple terms, the data is made accessible to the required destinations. What might those be? You could feed it into visualization tools like Superset or Looker, giving teams and stakeholders insight into the user journey and analytics so they can act. Sometimes you deliver the data through APIs and microservices. You might push it into systems like Grafana that do real-time alerting: when you identify an anomaly, as in the credit card case where a fraudulent transaction is detected, you want to alert the user immediately. And sometimes you want to fire a trigger: in the fraud case you alerted the user and blocked the transaction attempt; in the ride-sharing case, when an anomaly or event is detected, you update the price of the ride on the fly.

How is the data delivered? Through multiple mechanisms: a RESTful API, a WebSocket, a Kafka topic with message brokers, or real-time notifications; there are plenty of ways to deliver data to downstream systems for visualization, notifications, or alert triggers. Let me take a minute on the types of delivery, with real examples. Push delivery sends data proactively to users or systems: a notification pops up on your phone when your Uber ride is about to arrive. Pull delivery lets users or systems request data when needed: a stock trading app lets you refresh the page to update market prices.
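Here is a small FastAPI sketch showing both styles side by side: a REST endpoint clients poll (pull) and a WebSocket the server streams on (push). The paths, the payload, and the in-memory "store" are illustrative assumptions.

```python
# Push vs. pull delivery sketch with FastAPI (illustrative endpoints).
import asyncio
from fastapi import FastAPI, WebSocket

app = FastAPI()
latest_price = {"symbol": "ACME", "price": 101.5}  # stand-in for real storage

@app.get("/price/{symbol}")
async def pull_price(symbol: str):
    """Pull delivery: the client asks whenever it wants fresh data."""
    return latest_price

@app.websocket("/price-stream")
async def push_prices(ws: WebSocket):
    """Push delivery: the server sends updates without being asked."""
    await ws.accept()
    while True:
        await ws.send_json(latest_price)
        await asyncio.sleep(1)  # in reality, send on each new event

# Run with: uvicorn delivery_sketch:app
```

Which style to choose is mostly a latency-versus-cost question: push gets data out the instant it changes, while pull is simpler and lets the client control the rate.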
Those are the two ways delivery happens: with push we proactively send the data; with pull, the data is available and the consumer fetches it when required. Just as with data storage, what are the key features of effective delivery? Low latency again plays a big role, ensuring near-instantaneous delivery. Reliability matters even more here, because this is not a fire-and-forget asynchronous request: there must be a guarantee that the data is delivered correctly even during system failures. Imagine waiting for your Uber and never receiving the notification; it becomes a struggle to figure out which driver, which car, and which details are yours. And it comes down to security: when we send data, we have to protect its sensitivity in transit.

That wraps up the design section: how we design end to end and the popular tools available in the market. When it comes to implementing real-time analytics, first assess the current state: what infrastructure is available. Kafka might not be a feasible solution for everyone, and neither might AWS Lambda; it depends on where and how the data is available, so start by understanding the existing data infrastructure and deriving the possible capabilities. Then define the objective you are trying to achieve: is it business-driven, and can you make data-driven decisions from it? Once you understand the objective, the state of the data, and the options available within your company or system, pick the appropriate technologies from the components we just discussed. And then it all comes down to how you deploy and optimize.

When you deploy, as you head into production, the final piece, monitoring and resilience, plays a huge role. It ensures the key properties required in data collection, processing, storage, and delivery actually hold: the system operates smoothly, handles unexpected issues, and meets performance goals like the 99.99 percent uptime I mentioned, which effectively means no downtime. What is monitoring? Primarily, continuously observing the health and performance of the system: tracking metrics like response time, throughput, errors, and resource utilization.
If something goes wrong, monitoring systems trigger alerts for quick intervention. What is resilience? It's the system's ability to recover from failures and continue functioning without downtime or data loss. Together they ensure the pipeline we are building stays reliable and keeps customers satisfied even under high stress or unexpected events. Some components of monitoring: infrastructure monitoring, checking CPU, memory, disk usage, and network performance, making sure the servers don't get overloaded during a Black Friday sale on an e-commerce platform. Application performance monitoring, tracking an application's latency, errors, and response times, for example making sure a ride-sharing app processes ride requests in milliseconds. Log collection, so we can debug errors. And alerts and notifications: tools like Grafana and Prometheus help build dashboards and raise real-time alerts, so if the data pipeline crashes, the operations team, or whichever team owns the work, is notified through Teams, Slack, email, SMS, PagerDuty, or similar channels.

While we do all of this, we also want to ensure the system can handle failures: is traffic evenly distributed across servers to prevent overload, is auto-scaling in place to adjust the number of resources automatically, is data replicated accurately across multiple regions or servers, and does traffic redirect to a backup system when the primary fails? There are plenty of performance testing and chaos engineering tools to exercise all of this. I have read about how Netflix uses chaos engineering to test resilience, deliberately injecting failures to ensure the system can handle real-world disruptions. And most retailers have to test server performance extremely thoroughly ahead of the big sale days, Amazon Prime Day, Black Friday, Cyber Monday, making sure auto-scaling can absorb the load. However well we build these systems, monitoring and resilience must be built into every component of the real-time analytics pipeline to handle the volume of data flowing through it.
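As a small taste of instrumentation, here is a prometheus_client sketch exposing a latency histogram and an error counter that Prometheus can scrape and Grafana can alert on. The metric names, the port, and the simulated workload are my assumptions.

```python
# Minimal service instrumentation with prometheus_client.
import random
import time
from prometheus_client import Counter, Histogram, start_http_server

REQUEST_LATENCY = Histogram("request_latency_seconds",
                            "Time spent handling a request")
ERRORS = Counter("request_errors_total", "Requests that failed")

@REQUEST_LATENCY.time()  # records the duration of every call
def handle_request() -> None:
    time.sleep(random.uniform(0.01, 0.1))  # simulated work
    if random.random() < 0.05:             # simulated 5% failure rate
        ERRORS.inc()
        raise RuntimeError("simulated failure")

if __name__ == "__main__":
    start_http_server(8000)  # metrics served at http://localhost:8000/
    while True:
        try:
            handle_request()
        except RuntimeError:
            pass
```

An alert rule on the error counter's rate, or on a high latency quantile, is what turns this raw telemetry into page-the-on-call-before-the-customer-notices behavior.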
What about sentiment analysis on top of this? With real-time monitoring, you continuously analyze customer emotions across social media, reviews, support tickets, and chat interactions using AI-powered sentiment detection. You can then proactively reach out to understand how the customer is feeling and turn negative feedback into a positive experience; before the customer contacts you, you can reach out and identify what's happening. At the same time, you can check whether it's a trend across the board, uncovering larger problems the application is facing. Applied to customer support, sentiment analysis brings 67 percent faster request processing, a dramatic improvement in response time to customer issues, and 90 percent first-contact resolution, where tickets are no longer bounced between support levels. It also enables chatbot applications, so a physical presence isn't needed to handle all of this. Industry articles report impressive average ROI: some companies have reported 287 percent within the first year of deployment, and 94 percent of customer feedback indicates that real-time personalization plays a big role in maintaining brand loyalty. Expect about a year to implement all of this, stabilize the data, and learn what works and what doesn't.

As we reach the end, let's walk through the complete real-time analytics pipeline end to end, again using Amazon as the example. A customer searches for a product on an e-commerce website. That search event is the data collection aspect: the website is the data source, and the event is sent to a Kafka topic, which is the data capture piece. As soon as the event is ingested into Kafka, Apache Flink processes it to determine the customer's preferences and suggested products, and the result is stored in DynamoDB for quick access. Data delivery is the real-time recommendation that appears on the customer's screen within milliseconds. And Prometheus monitors the whole pipeline to keep the system resilient. So what does real-time analytics deliver here? It helps you stay competitive in today's fast-paced digital era, deliver personalized experiences at scale, and drive measurable results in customer satisfaction, loyalty, and revenue.
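Tying the stages together, here is a compact sketch of the processing hop in that walkthrough: consume search events from Kafka, derive a naive recommendation, and store it in DynamoDB for fast retrieval. The topic, table, and hard-coded lookup table are assumptions; in the architecture described in the talk, this role belongs to Apache Flink rather than a hand-rolled consumer.

```python
# End-to-end wiring sketch: Kafka -> (toy) processing -> DynamoDB.
import json
import boto3
from kafka import KafkaConsumer

# Toy stand-in for a real recommendation model.
RELATED = {"washer": ["dryer"], "wireless-headphones": ["headphone-case"]}

consumer = KafkaConsumer(
    "clickstream",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
table = boto3.resource("dynamodb", region_name="us-east-1").Table("recs")

for msg in consumer:
    event = msg.value
    if event.get("action") != "search":
        continue  # this toy job only reacts to search events
    recs = RELATED.get(event["product_id"], [])
    # Persist so the website can fetch recommendations in milliseconds.
    table.put_item(Item={"user_id": event["user_id"], "recs": recs})
```

Swap the dictionary for a Flink job or a model service and this is the same collect-process-store-deliver loop, with Prometheus watching each hop.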
Companies like Netflix, Amazon, and Uber, along with many retailers, healthcare organizations, and other industries, have set the gold standard for real-time analytics. By following the design patterns and architecture discussed here, any business can transform its operations and customer experience. With that, I'd like to thank you again for taking the time to join my session. If you have any questions, please connect with me on LinkedIn at Amarnath Immadisetty; I'd be glad to connect and help out as much as I can. Thank you all, have a good day and good evening.
...

Amarnath Immadisetty

Engineering Leader @ Lowe’s Companies Inc



