Conf42 Prompt Engineering 2024 - Online

- premiere 5PM GMT

Leveraging Machine Learning to Create a Reliable Anti-Fraud System

Abstract

For FinTech companies with high turnover, fraud and abusive customer behavior are common challenges: users degrade product quality or exploit its vulnerabilities. I propose a framework of key strategies that achieves roughly 80% of the result with just 20% of the effort in combating unwanted user actions.

Summary

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello everyone. Today we'll talk about my favorite topic: the anti-fraud platform. We'll mainly focus on a FinTech case study, but we'll also touch on examples from other industries and the philosophy of a generalized framework. It would not be an exaggeration to say that I've been working on anti-fraud for more than four years across a variety of B2C product companies. I have successfully applied the exact math I studied at university to build effective ML anti-fraud pipelines in Yandex advertising, and to work against cheaters at Playrix and fraudsters at Exness.

So, the plan for today. To begin with, we'll look at an example of what fraud is and why anti-fraud is needed: we'll analyze payment fraud in the FinTech industry and cheaters in games. Next, we'll go from simple to complex, the way any ML engineer would think when solving the anti-fraud problem: what the pitfalls are and how to overcome them. We'll talk about ML system design, real-time processing, and even about the other steps of the anti-fraud pipeline besides ML.

To begin, let's delve into the concept of anti-fraud and figure out why a company can benefit from it. To understand what anti-fraud is, you first need to define fraud, of course. Let's say fraud is customer actions that are not intended by the company and that result in a deterioration of key metrics. If the client exploits an internal inefficiency of the company's mechanisms, we call the fraud endogenous; otherwise, exogenous. Accordingly, we call anti-fraud the mechanisms and processes that prevent these malicious actions. I'd say the fight against fraud can be largely automated, with decisions made based on data and an understanding of the product, but it is almost impossible to run an anti-fraud platform entirely without manual work.

The definition through a metric was not chosen by chance. It is convenient for us to formalize fraud as a set of complex actions by an agent within our internal system that eventually has an effect on our metric. So for a unique user ID, for our fraudster, we can measure the corresponding drop in money, retention, and so on.

Let's start with a simple example of fraud in mobile game development. Free-to-play games are usually built around paywalls: difficult places that only a very skilled player can pass, or where players must pay in the in-game shop for additional benefits. A fraudster can use software to hack the game and bypass the most difficult levels. At the same time, if the game has online competition with other participants, the same player can gain a huge advantage over others, which certainly spoils the experience for everyone else. Here the key detection metric will be the number of attempts per level: how quickly did a person pass a difficult level, and did it happen on the first try, for example? If yes, it's worth looking into.

Now it gets a bit harder. What does any FinTech company have? Of course, a place where customers' money is stored, which means there is also a mechanism for how that money gets there. We'll discuss a case study in payments: transactional fraud linked to foreign exchange conversion rates. Let's imagine a tech-savvy customer from Europe uses a payment system to deposit euros into a USD account. At the time the application is submitted, the billing system generates a document with a fixed EUR/USD price. However, the person can choose whether to actually pay it or not, so they have time to see where the EUR/USD rate will move.
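To make the client's edge concrete, here is a small illustration I've added (the quoted rate, notional, and volatility are made-up numbers, not figures from the talk): the stale fixed quote behaves like an option the client gets for free, so even under symmetric rate moves their expected gain is positive.

```python
# Hedged, illustrative sketch (not from the talk): why a fixed EUR/USD quote
# with optional payment behaves like a free option for the client.
import random

def client_gain_usd(quoted_rate: float, market_rate: float, notional_eur: float) -> float:
    """USD advantage of converting EUR at the stale fixed quote instead of the market.
    If EUR got cheaper (market < quote), the old quote hands out extra dollars;
    otherwise the client simply abandons the payment (payoff 0)."""
    return max((quoted_rate - market_rate) * notional_eur, 0.0)

# Toy Monte Carlo: drift-free, symmetric rate moves still give a positive expectation.
random.seed(0)
quoted = 1.10          # EUR/USD fixed on the invoice (illustrative)
notional = 10_000      # EUR amount (illustrative)
moves = [random.gauss(0.0, 0.01) for _ in range(100_000)]  # roughly +/- 1% moves
ev = sum(client_gain_usd(quoted, quoted + m, notional) for m in moves) / len(moves)
print(f"Client's expected gain per attempt: ~{ev:.2f} USD, paid by the company")
```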
If the euro becomes cheaper, the person completes the transaction; if it becomes more expensive, they simply don't send the money at all and skip the transaction. In effect, the payment billing creates an option for the client to buy dollars and sell euros, and this option costs the client nothing. All that remains for the client is to monitor the fixed rate and commit fraud with a positive mathematical expectation.

Okay, let's build an automatic anti-fraud. Assume we have a payment cost metric that takes into account the money that comes to us in one currency, whether euro or Indian rupee or whatever, and the money that goes to the payment provider in another currency, plus commissions, some normalized anomaly measure, and many other indicators: a complex, big metric. We draw a time series for this metric as a blue line over time. Let's take classic gradient boosting, train a prediction of the value, and also predict the variance of the time series by autoregression. Then we look at segments and at individual people; we draw an orange line for a specific segment, let's call it the India segment, by country. Already at that stage, if we made a good classical ML prediction of the time series, we can identify the problematic segments in detail. Now we have a hypothesis that payment fraud may be occurring between the rupee and the dollar.

A few words about ML. I will be blunt here and suggest that you don't waste your time trying to optimize neural networks to catch fraud. There is a very simple and understandable reason why: probability theory, and ML algorithms in particular, work when the same event can happen many times under similar conditions. If that holds, we can build a probability function, compute statistics, test hypotheses, whatever. Obviously, it is quite difficult and slow to train a neural network, but it is even harder to make it stable. Here I can compare anti-fraud with a crazy gladiatorial arena in which you never know what will happen in the next moment. The same fraud can have many different complex variations, and it adapts to anti-fraud very quickly, so neural networks go bad faster than you can set them up. Better to forget about them here.

At first glance, the system design is more or less simple. The model is trained offline on a data lake; Kubernetes launches the model in a persistent pod that reads data from Kafka, and this data is user data. Why is it so easy? Because if we don't have heavily loaded services and we only need client data, technically any ML engineer can set this up nowadays. But it's not that easy. In a realistic situation, prediction is usually quite a difficult task, and we need a lot of different representative data to increase accuracy, reduce variance, and so on. In our case, market data is necessary by the very definition of our fraud. What then? Then we need to hire a team of C developers who will design an efficient reader of market quotations. You can imagine how much the load on services will increase, along with the complexity of data security and so on. It is already worth thinking about the infrastructure, so that the anti-fraud system doesn't add load to the product itself and customers do not experience delays. Therefore, we'll have a market data provider and an aggregator block in our ML design. In fact, this is an extremely simplified scheme.
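As a rough illustration of the simplified scheme just described, here is a hedged sketch of such a scoring service: a long-lived process (for example, one Kubernetes pod) that reads user events from Kafka, attaches the latest market quote, and scores events with an offline-trained model. The topic names, feature fields, model file, and threshold are assumptions for the example, not details from the talk.

```python
# Minimal sketch of a streaming anti-fraud scorer: offline-trained model,
# persistent consumer, user events joined with the freshest market quote.
import json
import joblib
from kafka import KafkaConsumer  # kafka-python

model = joblib.load("antifraud_gbm.joblib")   # trained offline on the data lake (hypothetical file)
latest_eurusd = None                          # updated from the market-quotes topic

consumer = KafkaConsumer(
    "payment-events", "market-quotes",        # hypothetical topic names
    bootstrap_servers="kafka:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

for msg in consumer:
    event = msg.value
    if msg.topic == "market-quotes":          # keep only the freshest EUR/USD rate
        latest_eurusd = event["EURUSD"]
        continue
    if latest_eurusd is None:                 # no market context yet, skip scoring
        continue
    features = [[
        event["amount_eur"],
        event["quoted_rate"],
        event["quoted_rate"] - latest_eurusd, # staleness of the fixed quote
        event["minutes_since_quote"],
    ]]
    score = model.predict_proba(features)[0][1]   # probability of fraud
    if score > 0.9:                               # threshold reflects risk appetite
        print(f"flag user {event['user_id']} for review, score={score:.2f}")
```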
Aggregation itself includes many processes, starting from data quality checks and a proper microservice architecture that lets you combine data into a convenient tabular form with correct timestamps.

What other problems could there be? The metric is too generalized, isn't it? Of course, we are trying to come up with a function that reflects the bad actions of customers well, but often an ensemble of models is needed. You need to clearly understand that in order to achieve high accuracy in anti-fraud with decent recall, it is not enough just to train the model well. You have to study very carefully what kinds of fraud you have and associate a separate set of metrics with each kind of fraud. So in our example, the generalized payment cost can be divided into commission costs, market arbitrage across different platforms, arbitrage in time, costs of interaction with the payment system and counterparties, and so on. Thus our infrastructure expands and requires an individual analyst and data scientist approach to identify the clusters of metrics and models that will give the best forecast accuracy. On top of the ensemble of models, you need to choose the algorithmic decision rule that matches your risk appetite, namely to answer the question: how often can we afford to make a mistake?

Now it's time to talk about how to make this ensemble perform fast. In the beginning, CPU Spark was used as the default solution, but it was sorely lacking. Now Trino is used for modeling and computing in my current company, and it shows itself much better thanks to how it works with memory. Another option here would be GPU Spark calculations, but they are not always trivial to run properly.

A huge problem, especially with transactional and market data, is the uneven load on infrastructure. We have periods with high load; in our case these are news periods, but in your company it may be different. During these high-priority moments, all resources are given to the vital functions of the product, such as making transactions, while the model politely waits for its data. That is simply not efficient, and we can't compute properly. From the ML specialist's point of view, we have two solutions: either intelligently adapt the model to the high-load intervals, or build a separate, non-real-time model that looks for fraud during the high load in a second wave of calculation. In total, we will have a fast, efficient, and accurate calculation almost always, but in the most difficult moments for the product we reduce the amount of resources allocated to the anti-fraud block and either run it in a simplified version, without wasting time, catching only the biggest fraudsters, the most painful in terms of the metric, or recalculate this entire block retrospectively a few minutes after the load stabilizes.

Now, the support team. In fact, we come to a very important part of today's story: what about the parts of the pipeline before and after the ML model? These are people, the analysts and anti-fraud officers standing there. Look at the probability formula, please. If we competently build a pipeline for filtering good users from fraudsters, and the filters are independent, then with the right processes we will get a huge improvement in the accuracy of the entire anti-fraud platform, because our analytical, ML, and operational filters catch fraud in independent ways.
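To illustrate the probability argument with made-up numbers: if each layer catches a given fraudster independently with probability p_i, the platform misses only when every layer misses, so the overall catch rate compounds quickly. The catch rates below are hypothetical.

```python
# Illustration of the "independent filters" argument with made-up catch rates.
# P(miss) = (1 - p_analytics) * (1 - p_ml) * (1 - p_ops) when the layers are independent.
catch_rates = {"analytics": 0.60, "ml_model": 0.80, "ops_team": 0.50}  # hypothetical

p_miss = 1.0
for p in catch_rates.values():
    p_miss *= (1.0 - p)

print(f"miss probability: {p_miss:.3f}")        # 0.4 * 0.2 * 0.5 = 0.040
print(f"overall catch rate: {1 - p_miss:.3f}")  # 0.96, far above any single layer
```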
All the different teams have their own researchers and their own ideas about how to capture fraudsters. Let's look at the full pipeline using an example. The analytical department signals that strange activity is happening in India, for example, dragging our metric down a little deeper, and they notice that people associated with one particular payment system have dropped it significantly on average. Okay. Now the ML engineers step into the game: they look at the metric, study the data, predict the metric, and catch the quantiles of people with particularly strange behavior. After correct validation of the model, it becomes obvious that some of them can be blocked immediately; these are the clear-cut fraudsters. But for the part where the accuracy is close to average, the ML engineers are not sure, and they hand these cases to the operational team. The operational team sorts out the borderline cases, digs into the deepest essence of why the fraud occurred, and makes decisions with their heads and hands.

What next? Monitoring, also a very important part. Monitoring both online and offline metrics is crucial to ensure the system operates efficiently. One of the most important online metrics should reflect the potential business cost of fraud; such costs related to the payment system may help us prevent fraud or handle it in an efficient way. In particular, tracking not just the value of the cost but also its behavior can provide an early detection mechanism for fraud, helping the business, together with the operational team, to balance fraud prevention. Offline metrics provide a deeper evaluation of the model's quality over time. Given the class imbalance often present in fraud detection, metrics like PR AUC are particularly useful for assessing the model's performance. Additionally, monitoring the false positive rate is crucial, because we want to minimize false positives and maintain a positive user experience while effectively fighting fraud. Visualization of these metrics allows the analysts and the operational team to identify emerging fraud patterns and spikes in activity. Therefore, as you can see, online metrics, through visualization and special tools, go directly to the rapid-response people, who check the complex edge cases, either when the model is uncertain or when the metrics are contradictory and need detailed analysis. Offline metrics, as usual, help us determine the degree of degradation of the model over time, and we retrain it once needed.

So, thank you for your attention, and thanks to Conf42 for the opportunity to talk about ML in anti-fraud. The main conclusion to draw from my story is that you don't need to overcomplicate things with mathematics where it is not required, while a proper process approach allows you to improve the quality of the model many times over. Thank you.
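As a closing illustration of the offline monitoring described above, here is a minimal sketch of the PR AUC and false-positive-rate checks; the labels, scores, class ratio, and threshold are synthetic placeholders, not data from the talk.

```python
# Minimal sketch of offline monitoring: PR AUC for the imbalanced fraud class
# plus the false-positive rate at the blocking threshold. Data is synthetic.
import numpy as np
from sklearn.metrics import average_precision_score

rng = np.random.default_rng(42)
y_true = rng.binomial(1, 0.02, size=10_000)                              # ~2% fraud, heavy imbalance
y_score = np.clip(0.7 * y_true + rng.normal(0.2, 0.15, 10_000), 0, 1)    # fake model scores

pr_auc = average_precision_score(y_true, y_score)   # area under the precision-recall curve

threshold = 0.6                                      # blocking threshold (risk appetite)
flagged = y_score >= threshold
fpr = (flagged & (y_true == 0)).sum() / max((y_true == 0).sum(), 1)

print(f"PR AUC: {pr_auc:.3f}")
print(f"false-positive rate at threshold {threshold}: {fpr:.4f}")
```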
...

Pavel Zapolskii

Senior Quantitative Researcher @ Exness

Pavel Zapolskii's LinkedIn account


