Conf42 Python 2025 - Online

- premiere 5PM GMT

Python at Scale: Building Real-Time Customer Analytics Engines for the Digital Age

Abstract

Discover how Python is revolutionizing real-time customer analytics at enterprise scale. From processing millions of interactions per second to building ML-powered recommendation engines, learn battle-tested patterns for high-performance data pipelines.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hey everyone, I'm Amar. I'm thrilled to welcome you to this session. As a senior manager of software engineering at Lowe's, I have had the privilege of leading teams that design and deliver platforms capable of supporting millions of users and millions of transactions. Today we will deep dive into how real-time data is transforming customer experience at scale: how millions of interactions are processed in near real time, how we build ML-powered recommendation engines, and how Python helps create fast, efficient data systems.

Just think about this: if you could process over a million transactions every second while maintaining near-perfect uptime and providing actionable insights quickly, imagine the competitive advantage that brings, whether it is delivering lightning-fast personalized recommendations, enabling real-time decision making, or proactively addressing customer needs before they even arise. In this session we will uncover the architectures and strategies companies are using to achieve these goals. I'll share real-world examples, walk you through the architectures and technical designs, and cover the challenges and how we overcome them. Whether you are here to supercharge your technical expertise, gather actionable insights to tackle the challenges your team is facing, or simply explore the incredible possibilities of real-time analytics, you're in the right place. Let's get going.

So let's understand customer expectations. What are customers expecting lately? In today's hyperconnected digital landscape, a survey indicates that 82 percent of customers now expect real-time, personalized experiences that instantly adapt to their individual preferences and immediate needs. Companies that have strategically deployed real-time personalization have seen a remarkable 26 percent boost in customer satisfaction and an extraordinary 3.2x increase in customer lifetime value.

So how do we deal with these scenarios to unlock customer insights? In today's competitive digital space, industry leaders, whether Amazon, Netflix, or leaders in other spaces, are harnessing the power of advanced algorithms and machine learning models to process and analyze massive streams of data in real time. By managing over one million customer transactions every second, these companies can extract valuable insights instantly to understand what the guest or customer is expecting.

So how do these algorithms work, and how do we get to a state of real-time monitoring, predictive analysis, and sentiment analysis? I'll deep dive into the technical aspects pretty soon, but the first step in the process is data collection: data is collected from various sources such as web activity, IoT devices, and transactions. This data arrives in raw format, so it has to be processed and cleansed. Some examples are removing duplicates, handling missing values, or adding synthetic data where real data is not available; all of these steps happen during the preprocessing stage.
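To make the collection step concrete, here is a minimal sketch of what a normalized event might look like before it enters the pipeline. The field names are illustrative assumptions for this sketch, not a schema from the talk.

```python
# A hypothetical normalized event schema; field names are assumptions.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class CustomerEvent:
    customer_id: str
    source: str           # e.g., "web", "mobile", "pos", "iot"
    event_type: str       # e.g., "page_view", "add_to_cart", "purchase"
    payload: dict
    occurred_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

# Example: a raw web click normalized into the common schema before ingestion.
event = CustomerEvent(
    customer_id="c-123",
    source="web",
    event_type="page_view",
    payload={"url": "/products/42", "session_id": "s-9"},
)
```

And a hedged sketch of the cleansing just described, assuming the events land in a pandas DataFrame with hypothetical column names:

```python
# Preprocessing sketch: deduplicate and handle missing values with pandas.
import pandas as pd

def preprocess(raw: pd.DataFrame) -> pd.DataFrame:
    df = raw.copy()
    # Remove exact duplicate events (e.g., double-fired client beacons).
    df = df.drop_duplicates(subset=["customer_id", "event_type", "occurred_at"])
    # Handle missing values: fill numeric gaps, drop rows missing the key.
    df["session_duration"] = df["session_duration"].fillna(0.0)
    return df.dropna(subset=["customer_id"])
```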
This cleansed data is fed to pre-trained ML models. At this point we analyze behavior patterns, using techniques such as collaborative filtering for recommendations, and predict outcomes. In a retail situation, that means: what is the guest looking to buy, and which item is the guest likely to purchase? We can also segment customers dynamically based on these real-time interactions.

These models enable a decision engine: algorithms that trigger actions such as personalized offers, product recommendations, or alerts. Once such an event is triggered, we also want a feedback loop, where user responses are fed back into the system to improve the models through continuous learning.

If I jot down what we just talked about: there is a data collection system, a data processing system, and an ML model that determines behavioral patterns, predicts outcomes, and performs segmentation as required. The outcome of that is a decision engine, which in retail could drive personalized offers, personalized pricing, or dynamic personalization, and the models are retrained through the feedback loop, as the sketch below illustrates.
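As a toy illustration of such a decision engine, the rules below map hypothetical model scores to actions. The thresholds and action names are assumptions made for the sketch, not production values.

```python
# A toy decision engine: turn model scores into actions (offers, alerts).
def decide(customer_id: str, scores: dict) -> list:
    actions = []
    if scores.get("churn_risk", 0.0) > 0.8:
        actions.append({"action": "personalized_offer", "customer": customer_id})
    if scores.get("fraud_score", 0.0) > 0.95:
        actions.append({"action": "alert_fraud_team", "customer": customer_id})
    if scores.get("purchase_intent", 0.0) > 0.6:
        actions.append({"action": "recommend_products", "customer": customer_id})
    return actions

print(decide("c-123", {"churn_risk": 0.9, "purchase_intent": 0.7}))
```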
So let's dive into how industry retailers are using this, and some of the success stories. Here I have put down two examples, retail and dynamic pricing, and I want to take a minute to walk through the use cases that become possible. Take Amazon, which recommends products dynamically based on browsing, purchase history, and similar-user data; the outcome is higher conversion, and it also enhances customer loyalty. In banking and finance, fraud detection algorithms flag unusual activity or transactions in real time; the outcome is preventing losses and improving trust. Some of us may have shopped in a different state or place where the system, not having seen such a transaction in our history, tries to block it; in near real time this helps avoid fraudulent transactions, and if our cards are lost, it helps detect that and stop the wrong transactions. In healthcare, monitoring patient vitals through IoT devices and instantly alerting doctors about anomalies helps with faster response and better patient outcomes.

When it comes to media, Netflix plays a big role in a person's day-to-day life. Netflix personalizes content recommendations based on viewing history, derives time-based patterns (even inferring the viewer's mood), and factors in demographics; this increases viewer engagement and retention. In retail, personalized discount offers to in-store customers based on their mobile app usage and browsing data translate into improved sales and better in-store experiences. Airlines dynamically adjust pricing and suggest add-ons based on user search behavior, which optimizes revenue and improves customer targeting. All of this directly empowers an enhanced customer experience and also grows customer revenue, so it's a win-win wherever real-time analytics is deployed.

Let's move on to the next slide and go through a quick rundown of the components of a real-time analytics framework in a little more detail. It is primarily a five-to-six-step process. First is data ingestion, which handles millions of transactions per second from diverse sources: web, IoT devices, transaction systems, payment systems, and so on (I'll talk through the design in detail shortly). The next step is stream processing: we want the data processed as a stream, not as a batch. In this layer we process and cleanse the data, remove duplicates, and prepare it for actionable insights; this is the critical layer where data cleansing happens. Then comes data storage: to derive insights in real time, the data is stored in NoSQL or SQL stores, and depending on the data, sometimes in time-series databases. Next comes the machine learning engine, where we generate real-time recommendations and predictions and provide customer insights. Those insights then have to be delivered for consumers to consume, and we expose this data through APIs or a serving layer. At this phase there are two mechanisms, push delivery, where the system emits data so consumers can receive it, and pull delivery, where we keep an endpoint open and consumers request data as they need it; teams choose the option depending on the use case (a sketch of both follows below). And as we implement this framework, we also want to ensure the monitoring and observability pieces are covered, so that the application runs stable, accurate, and scalable.
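To make the push-versus-pull distinction concrete, here is a minimal, framework-free sketch; the names are illustrative assumptions. In push delivery the pipeline publishes each insight as it is produced, while in pull delivery consumers fetch the latest insight on their own schedule.

```python
# Push vs. pull delivery of model insights; names are illustrative.
import queue

# Push delivery: the pipeline emits each insight as soon as it is ready,
# and subscribers (e.g., a message topic or webhook) receive it unprompted.
insights_topic = queue.Queue()

def push_insight(customer_id, insight):
    insights_topic.put({"customer_id": customer_id, **insight})

# Pull delivery: the pipeline keeps the latest insight available, and
# consumers request it when they need it (e.g., via an HTTP GET).
latest_insights = {}

def store_insight(customer_id, insight):
    latest_insights[customer_id] = insight

def pull_insight(customer_id):
    return latest_insights.get(customer_id)
```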
Now let's dive deep into the data ingestion pipeline. I'm going to take one technology as an example, and also mention the other tools available to do this work. I'm taking Kafka as the streaming platform: the various data-emitting systems send their data to Kafka, and a processing layer reads from Kafka. Imagine a web application, IoT devices, mobile apps, social media, or transactional systems; all of these emit data to Kafka, either through an API layer the system provides or through other mechanisms.

When we choose Kafka for the data ingestion, there are some things to consider for a better-performing setup. Partitioning by customer or event type is required if you need to maintain sequencing. Kafka also comes with several compression codecs (gzip, snappy, lz4, and zstd), and depending on your needs, using one of these algorithms helps with faster throughput when reading and writing. I personally like snappy for faster throughput, but the other codecs work well too.

For stream processing, the popular options in the market are Apache Spark and Flink. Things to consider with one of these stream processing systems: enrich the data by joining it against customer profiles held in Redis or DynamoDB; aggregate metrics such as clicks, purchases, and session duration; and filter out invalid data, or in simpler words, noisy records, using Python rule-based engines. (Sketches of the ingestion and streaming layers follow below.)

When it comes to storage, there are multiple variations. There is hot storage for real-time data, such as DynamoDB or Elasticsearch. And if you want to do historical analysis and create year-on-year or month-on-month comparisons, storing the data in S3, HDFS, or GCS buckets plays a bigger role.

A critical part of getting insights out is feeding the data to the machine learning model. For training and inference, we train models offline on customer interactions, typically from data lakes, using frameworks like PyTorch. It is always better to store these models in a centralized model registry, serve them using a FastAPI-based REST service or a model-serving framework like TensorFlow Serving, and absolutely use a feature store, like Feast, to handle the real-time features. The API layer can be written in many languages and frameworks, but to integrate closely and keep using the same libraries across the board, it is better to expose REST or gRPC APIs to serve these real-time recommendations and insights.
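Here is a minimal producer sketch using the kafka-python client, assuming a local broker and a hypothetical customer-events topic. Keying messages by customer ID keeps each customer's events on one partition, which preserves their ordering, and snappy compression trades a little CPU for throughput.

```python
# Ingestion sketch with kafka-python; broker address and topic are assumptions.
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=["localhost:9092"],
    compression_type="snappy",            # trades a little CPU for throughput
    key_serializer=str.encode,
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

producer.send(
    "customer-events",
    key="c-123",                          # partition key: the customer id
    value={"event_type": "page_view", "url": "/products/42"},
)
producer.flush()
```

And a hedged PySpark Structured Streaming sketch of the processing layer: read the topic, parse the JSON, filter out records missing a customer ID (the noisy records), and aggregate clicks per customer per minute. The schema and topic name are assumptions, and running it requires the Spark-Kafka connector package.

```python
# Stream processing sketch with PySpark Structured Streaming.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json, window
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("customer-stream").getOrCreate()

schema = StructType([
    StructField("customer_id", StringType()),
    StructField("event_type", StringType()),
    StructField("occurred_at", TimestampType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "customer-events")
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
    .filter(col("customer_id").isNotNull())   # drop noisy/invalid records
)

clicks_per_minute = (
    events.withWatermark("occurred_at", "10 minutes")
    .groupBy(window(col("occurred_at"), "1 minute"), col("customer_id"))
    .count()
)

query = clicks_per_minute.writeStream.outputMode("update").format("console").start()
```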
So certainly we can use FastAPI or Flask to serve some of this information. For the monitoring aspects, we want to use Prometheus to monitor the Kafka metrics, such as whether there is consumer lag or whether data is flowing through properly; Grafana dashboards for pipeline performance; and the ELK stack to search and review errors.

To achieve high-performance patterns, one thing to consider is parallel processing: use Python's multiprocessing or tools like Ray for CPU-bound tasks, and ensure thread-safe data processing. Look for the places where caching can be enabled: caching frequently used data, like user profiles, in Redis reduces database queries. For backpressure handling, use Kafka consumer groups to dynamically scale consumers, and autoscale compute resources in the cloud using Kubernetes, AWS Lambda, and so on. (A sketch combining the serving, caching, and monitoring pieces follows below, along with one for fault tolerance.)

As for the tech stack, as I was just saying: for data ingestion, Kafka or AWS Kinesis; for stream processing, Apache Spark or Apache Flink; for storage, DynamoDB, S3, Cassandra, InfluxDB, Druid, and many variations; for ML training and inference, TensorFlow or PyTorch; for what I call the data delivery or serving layer, FastAPI, Flask, or TensorFlow Serving; and for monitoring and observability, the ELK stack.

As we talk about millions of transactions, how do we ensure the scalability considerations are in the picture? As we just discussed, ensure Kafka has enough partitions, partitioned by customer or region, to handle load distribution. For sharding, use a sharded Redis cluster for low-latency lookups. For horizontal scaling, scale stream processing by adding more worker nodes.

Some of the challenges in delivering high-performance, highly scalable, near-real-time systems with 99.9 percent uptime include: handling high data ingestion rates (use Kafka with multiple partitions); low-latency predictions (deploy the ML model at the edge, as close to the consumer or customer as possible); data consistency (use exactly-once processing in Flink or Spark during stream processing); and fault tolerance (use retries and dead-letter queues in Kafka). This framework ensures that we are building a scalable system for real-time customer analytics that powers actionable insights immediately.

So what is the ROI of implementing this in the real world? As we just saw, there is increased revenue: sales boosted by roughly 30 to 50 percent through data-driven personalization and targeted marketing strategies. There is improved customer loyalty: customer churn reduced by up to 25 percent with predictive insights and tailored engagement, plus higher customer satisfaction through personalization and proactive support. And it all boils down to operating efficiently, cutting support and marketing expenses by 20 to 30 percent.
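Pulling the serving, caching, and monitoring pieces together, here is a hedged sketch: a FastAPI endpoint that checks Redis for a cached profile before scoring and exposes a Prometheus counter. The model call is a placeholder, and the key names and ports are assumptions.

```python
# Serving-layer sketch: FastAPI + Redis cache + Prometheus metrics.
import json

import redis
from fastapi import FastAPI
from prometheus_client import Counter, make_asgi_app

app = FastAPI()
app.mount("/metrics", make_asgi_app())       # Prometheus scrape endpoint
cache = redis.Redis(host="localhost", port=6379)
REQUESTS = Counter("recommendation_requests_total", "Recommendation requests")

def score_model(profile: dict) -> list:
    """Placeholder for a real model call (e.g., a registry-loaded model)."""
    return ["sku-1", "sku-2"]

@app.get("/recommendations/{customer_id}")
def recommendations(customer_id: str) -> dict:
    REQUESTS.inc()
    cached = cache.get(f"profile:{customer_id}")   # cache hit avoids a DB query
    profile = json.loads(cached) if cached else {"customer_id": customer_id}
    return {"customer_id": customer_id, "items": score_model(profile)}
```

And a small sketch of the retries and dead-letter-queue idea for fault tolerance, again with kafka-python and assumed topic names: records that fail processing are parked on a DLQ topic rather than blocking the stream.

```python
# Fault-tolerance sketch: failed records go to a dead-letter topic.
import json
from kafka import KafkaConsumer, KafkaProducer

consumer = KafkaConsumer(
    "customer-events",
    bootstrap_servers=["localhost:9092"],
    group_id="analytics",                 # consumer group enables scaling out
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
dlq = KafkaProducer(
    bootstrap_servers=["localhost:9092"],
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def process(event):
    ...  # enrichment / scoring would happen here

for message in consumer:
    try:
        process(message.value)
    except Exception:
        # Poison messages go to the DLQ instead of halting the consumer.
        dlq.send("customer-events.dlq", message.value)
```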
So what's the future of customer experience? It used to be the case that the customer experience was the same for every customer entering a store or requesting a service. The future is hyper-personalized, predictive experiences per customer: if Amar goes to, say, Target, he gets to see what he wants to see. It brings intelligent, AI-driven customer support, where answers to customer queries come quicker, with no wait times and better customer interactions. It enables real-time omnichannel interactions, where a customer moving from web to mobile app to a store and back, or arriving from a marketing channel, gets a seamless experience across the board. And using adaptive analytics, we can derive strategic insights. Advanced real-time data analytics is revolutionizing the complete customer experience, delivering intelligent insights and deeply personalized interactions between customers and customer reps across every touch point.

So what are the key takeaways from real-time data analytics? Enhanced customer expectations met and enhanced customer satisfaction, and both boil down to increasing ROI. And what are the next steps in this journey? First, understand what data is available, understand the critical touch points, and identify the high-potential analytics opportunities. Not all data can be used; the data that is continuous, that the business requires, and that enhances business operations, the customer experience, and ROI is where we want to spend time understanding the customer interactions before taking the next step. Then, depending on the technology stack that already exists and the data volumes available, start deploying an advanced real-time analytics platform, ensuring seamless technological adoption and cross-functional adaptability.

We don't need Kafka in every case. If you are getting, say, ten transactions every hour, you might emit directly to a database; you don't need Kafka there. As I called out earlier, we don't need the same set of technologies or streams in every case. I was talking about how to handle the high-performance scenarios, but there will be cases where the data volume is low and the insights we need are not immediate; batch processing could be the answer. So understand the business acumen and the business use case, and derive the architecture around those patterns.

As we near the end of our session, let's embrace the real-time revolution. Real-time data analytics, as we discussed multiple times, is transforming what the customer experience should look like; businesses gain competitive advantage, build customer relationships, and drive sustainable growth on this journey.
As we come to the end, thank you for taking the time to attend this session. If you would like to learn more or have any questions, feel free to reach out to me on LinkedIn at amarnathimmadisetti. Thank you again, and have a good day.

Amarnath Immadisetty

Engineering Leader @ Lowe’s Companies



