Transcript
Good morning, good afternoon, good evening. From wherever you're watching, I welcome you to my talk. I'm Jubin Thomas. I work as a technical architect at Signet Jewelers, and I have 14 years of experience in supply chain and retail. I'm also a senior member of IEEE. First of all, I want to thank all of you for joining my talk today, and I want to thank Mark for inviting me to this wonderful Conf42 Machine Learning conference. Today I want to discuss optimizing omnichannel order fulfillment with AI and advanced analytics.
Before I delve into the machine learning side of things, I want to say something about the ecommerce industry. As you all know, the ecommerce industry encompasses businesses that operate on the Internet to sell goods. According to expertmarketresearch.com, its current value is USD 1.1 trillion, which is expected to grow to USD 3.85 trillion by 2032, a compound annual growth rate of roughly 14.8 percent. And I believe that artificial intelligence and machine learning are going to play a major role in this growth.

Coming to omnichannel retailing: omnichannel retailing, or in other words an order management system, is basically a system that takes a customer order and delivers it to one of multiple destinations based on the customer's choice. That could be the customer's preferred store, the customer's address, or a collection point. As the name "omni" implies, the system is connected to multiple channels; in this case we have different sales channels like store, online, mobile, kiosk, and call center. The system has a lot of fulfillment rules, orchestration rules, and processing and monitoring rules, which is how it is able to deliver the customer order.
Now, though there are many vendors who implement this kind of software, there are two major ones: IBM Sterling Commerce and Manhattan. My experience is with IBM Sterling Commerce. The system is a big giant and gives a lot of benefits to retailers, but there is a problem statement, and this is one of the things I would like to highlight here. The first thing I have mentioned here is real-time inventory visibility. For example, you walk into a Walmart or a Target store, look at an aisle, and see that a product has been lying there for, say, weeks or months. That means the product is overstocked at that store. But if you go to another Walmart or Target store and see that the same product is missing, that means the store is understocked. Now, in a non-AI world, it is very hard for a retailer to get that kind of inventory visibility. This is where machine learning and artificial intelligence can help retailers improve how they position their inventory.

The second problem is suboptimal sourcing, which drives up shipping costs for customers, has an impact on sustainability, and causes supplier and stock management issues, lost sales, and customer dissatisfaction.
Today, I want to talk about four topics: building a central inventory visibility system using AI and predictive algorithms; using machine learning to optimize order sourcing and routing; predictive analytics to allocate omnichannel inventory dynamically; and finally, reinforcement learning to optimize allocation across fulfillment centers.

Moving on to the first topic, building a central inventory visibility system using AI and predictive algorithms.
To have this system in place, there are two major forecasting systems a retailer should have. One is demand forecasting, which forecasts demand at the level of a store-and-item combination. The other is sales forecasting, which forecasts future sales. Now, for each of the four use cases I just listed, I'm going to show a small example. These are not real production use cases, because real retailer training data can run into millions of records; here we will use a small data set of 10,000 records for each of the four scenarios, and we will see how we can use some machine learning algorithms to identify, predict, and forecast in each case.
So here I have put together a use case to build a sales forecasting system. Let us imagine we have a retailer named Company A, and Company A wants to implement a sales forecasting system. The data you see on your right is the transaction data, also known as the sales data. Some of the key columns here are transaction date, sales quantity, and price. Based on this, we are going to implement the sales forecasting system, which means it is going to forecast future sales for Company A.
To implement this, I have used three different models, known as time-series models, that are provided by Python libraries such as statsmodels: ARIMA, SARIMAX, and exponential smoothing. Now, regardless of which model or machine learning algorithm we are trying to build, the steps remain pretty much the same. The first step is always to gather the data; in this case, we gather the historical data set of Company A. Once we have that, the second step is to analyze the data, which is known as EDA, or exploratory data analysis. As part of the exploratory data analysis we try to understand the data, and finally we feed the data to the machine learning algorithms. Once we have this in place, the models produce the output, which is the forecast.
As you can see in the flow diagram on the right side, the ARIMA model works on the concept of p, d, and q: p is the autoregressive order, d is the differencing order, and q is the moving-average order. I'm going to talk about how we identify these parameters. Once we identify the parameters, we build the model, then there are diagnostic checks where we fit the model, and finally the model does the forecasting. And if you look there, there is a continuous improvement cycle, which runs in a loop. That means once the forecasting results are out, we adjust the model parameters based on the results we get, and then we fit the model again. This is a constant loop, which means we are always training our model. It also requires that we feed the model the daily sales data, and based on that daily data the model gets better day by day.
Now you might ask me, "Jubin, how do you identify these p, d, and q parameters?" To answer that question: Python libraries provide a way to determine the values of p, d, and q automatically, starting from default values and searching over candidate orders. This process is known as the auto-ARIMA process. So what I have done here is run my training and test data through the auto-ARIMA process, and based on that data it determined that the best values for p, d, and q in this case are 5, 0, and 1. The screen is a bit hard to read right now, but I think you should be able to see it: it identified (5, 0, 1) as the best parameters, and it gave me a mean absolute error of 2.6, which is the best I could get from this training and test data.
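As a rough illustration, here is a minimal sketch of that auto-ARIMA search using the pmdarima library; the file and column names are assumptions, not the actual code from the talk:

```python
# Minimal auto-ARIMA sketch (pmdarima); file/column names are assumed.
import pandas as pd
import pmdarima as pm

# Daily sales history for Company A (hypothetical file).
train = pd.read_csv("train.csv", index_col="transaction_date", parse_dates=True)
train = train.asfreq("D")

# Search candidate (p, d, q) orders and keep the best-scoring model.
model = pm.auto_arima(
    train["sales_quantity"],
    start_p=0, max_p=5,
    start_q=0, max_q=5,
    d=None,            # let the search choose the differencing order
    seasonal=False,
    stepwise=True,
)
print(model.order)     # e.g. (5, 0, 1), as found in this use case
```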
Now let me talk about EDA. EDA, as I said, is exploratory data analysis, which gives us time-series plots and correlation matrices; we can use these kinds of plots to understand our data much more clearly. Once we have this in place, the next step is writing the code, and this is where I want to show you how we can write the code for the sales forecasting system. First I load the training data and the test data. Then, if you look, the index frequency is set to 'D', which means it is daily sales data. Then we have the EDA, the exploratory data analysis, where we analyze the data. And if you look at the ARIMA call, I have passed (5, 0, 1), which is what we got from the auto-ARIMA process. Once we have this in place, we call model.fit(), which fits the model, and then we call the model's forecast method, which forecasts the results for us. At the end we do the forecasting, print the forecast results, and also print the different metrics the Python libraries generate.
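For readers following along, here is a minimal sketch of the script just described, using statsmodels; the file and column names are assumptions:

```python
# Minimal sales-forecasting sketch (statsmodels ARIMA); names are assumed.
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

train = pd.read_csv("train.csv", index_col="transaction_date", parse_dates=True)
test = pd.read_csv("test.csv", index_col="transaction_date", parse_dates=True)
train = train.asfreq("D")  # index frequency 'D' = daily sales data

# Order (5, 0, 1) comes from the auto-ARIMA search shown earlier.
model = ARIMA(train["sales_quantity"], order=(5, 0, 1))
fitted = model.fit()

# One forecast per test record (20 in this use case).
forecast = fitted.forecast(steps=len(test))
print(forecast)

# Mean absolute error against the actual test values.
errors = test["sales_quantity"].values - forecast.values
print("MAE:", abs(errors).mean())
```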
Now, coming back to the results. As you can see, the forecasting results are displayed here for all three models: ARIMA, SARIMAX, and exponential smoothing. Each gave me 20 records, and these 20 records correspond to the number of records in my test data: because I had 20 records in the test data, each model generated 20 forecasts. And you can see that the mean absolute error is 1.8 for ARIMA, whereas the other models are on the higher side. Mean absolute error is a metric used to understand whether the model's performance was good or bad. In this case, the ARIMA model gave me 1.8, meaning its forecasts are off by an average of 1.8 units from the actual values in the test CSV file, which tells us that the ARIMA model performed better than the other models in this use case.
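For reference, the mean absolute error over $n$ forecasts is just the average absolute gap between each forecast $\hat{y}_i$ and the actual value $y_i$:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert$$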
Now moving on to my next slide, which is using machine learning to optimize order sourcing and routing. As part of this, I want to talk about two different topics: one is the predictive delivery date, and the second is sourcing and routing. Before I talk about the predictive delivery date, I want to talk about something called the estimated delivery date. An estimated delivery date is the date displayed on the retailer's website or storefront; it shows when the customer's order, or a given item, would be delivered. In a non-AI world, the estimated delivery date is calculated from a number of factors, such as the lead time at the store plus the carrier's own lead time. Adding up all these lead times, the retailer displays a buffered date, or lead days, on the website, and this is known as the estimated delivery date. Now, the problem with these dates is that the estimated delivery date can be much later than the actual delivery date. That is why machine learning can help us reduce these lead days, so that we can display a near-accurate delivery date on the storefront or the web page.
On the other hand, sourcing and routing is the concept where the order management system tries to source and route the customer order to the customer's address or the customer-provided zip code. This is another area where machine learning can help: it can figure out which node or store is nearest to the customer's address and can therefore offer lower shipping costs.
Now, to build this use case, I have used two different models: one is the random forest regressor and the second is the k-neighbors classifier. The random forest model, as the name says, builds a forest of decision trees, and it takes two key parameters. In the previous use case I talked about the ARIMA model, which requires p, d, and q; here the random forest requires n_estimators and random_state. I passed n_estimators as 100 and random_state as 42. n_estimators=100 means the model creates a hundred different decision trees from the input provided to it, and it uses those trees to identify the best predicted value. At the same time, random_state=42 is a parameter that makes the model always produce the same result, no matter how many times we pass it the same input. On the other hand, the k-neighbors model works on the concept of neighbors: based on the input we provide, it fetches the five nearest neighbors matching that input, and it returns those. So these are the two models we will use.
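As a quick sketch, the two models described above can be instantiated in scikit-learn, where these model names come from, like this (a minimal illustration, not the talk's actual code):

```python
# The two models described above, with the parameters from the talk.
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsClassifier

# 100 decision trees; random_state=42 makes results reproducible.
rf = RandomForestRegressor(n_estimators=100, random_state=42)

# Classify by majority vote among the 5 nearest neighbors.
knn = KNeighborsClassifier(n_neighbors=5)
```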
On the right side of the screen you can see an Amazon page, and there is a date displayed on it: Saturday, April 27. This is what's known as the estimated delivery date. According to one ecommerce article I read, 47% of customers abandon their carts because this estimated delivery date does not match their expectations. So it is that important for a retailer to display this date correctly.
Now, in this use case, we are going to build a predictive system. The first use case was a forecasting system; here we are building a predictive system. The first data set you saw was the transactional sales data; in this use case, the data is the delivery data of the retailer, Company A. The key fields here are customer zip code, store zip code, shipping option, and fulfillment success. And this is a small code snippet that will help us understand how this works. First we load the data, along with the US zip code data. As you can see, there are three key fields that I dropped from the data frame: customer zip code, store zip code, and fulfillment success. I'm going to talk about why I did that. There are two variables that are pretty important here.
One is the X variable and the other is the y variable. In the machine learning world, the y variable is the target variable: the columns we want our model to learn to predict. We drop those columns from our data frame and add them to the y variable, and that's what I'm doing here. I dropped customer zip code, store zip code, and fulfillment success because I want these three attributes to be the targets of my training; I'm telling my model that these are the target variables it has to be trained against. All the other columns, on the other hand, become part of the X variable for this training. Now, you might ask: there is no test data here. That's a valid question. What I'm telling my model here is that the test size is 0.2, which means 20 percent of the training data is held out and used as the test data.
Now, as I explained before, I passed n_estimators as 100 and random_state as 42, and for k-neighbors, n_neighbors is five. Once I fit the models, I gave them three inputs: the shipping option "standard", the customer zip code 19063, and item 2. I wanted my model to predict the nearest store that can deliver the customer's order, and it should also give me a predicted delivery date.
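Here is a condensed, hedged sketch of that pipeline; the file name, column names, and feature encoding are assumptions based on the description above:

```python
# Sourcing/delivery prediction sketch (scikit-learn); names are assumed.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsClassifier

df = pd.read_csv("delivery_data.csv")  # numeric-encoded delivery history

# Drop the target columns from the features, as described above.
targets = ["customer_zip_code", "store_zip_code", "fulfillment_success"]
X = df.drop(columns=targets)

# Hold out 20% of the data as the test set (test_size=0.2).
X_train, X_test, y_store_train, y_store_test = train_test_split(
    X, df["store_zip_code"], test_size=0.2, random_state=42)

# KNN picks the fulfilling store; the regressor predicts delivery days
# (hypothetical 'delivery_days' column, aligned with X_train's rows).
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_store_train)
rf = RandomForestRegressor(n_estimators=100, random_state=42).fit(
    X_train, df.loc[X_train.index, "delivery_days"])

sample = X_test.iloc[[0]]  # e.g. standard shipping, customer zip 19063
print("store:", knn.predict(sample)[0])
print("delivery days:", rf.predict(sample)[0])
```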
When I ran my script, this is what the model generated. For the shipping option "standard" and the customer zip code 19063, it was able to figure out that the store with zip 89402 has an available quantity of 110, which means it can fulfill my order, because my requested quantity is 100. Based on the training the model received, it predicted a delivery date of 2024-05-26, which is around six days out from today's date. At the same time, if the shipping option is overnight, it gave me a very early date, tomorrow's date, 2024-05-21. This again comes from the training: the training data has a lot of overnight records, and the model understood from them that if the shipping option is overnight, the delivery date is 2024-05-21. Likewise, if I reduce the quantity to ten, it found me a store, 19052, which is about 2.4 miles from my place, and that is pretty close; the model has learned to fetch a store near the customer's address. And if I pass a requested quantity higher than anything in my training data, it gives me a message saying there is no available inventory for item 2 at a nearby store. Let me come to my third topic,
predictive analytics to allocate omnichannel inventory dynamically.
If you remember, in one of the initial slides I talked about the problem statement of real-time inventory visibility. In a non-AI world it is next to impossible to have such a system in place, but with the help of machine learning, we can. In this use case, I will talk about how we can build a real-time inventory visibility dashboard that helps us understand what inventory levels we have to maintain. So, without further ado, let us help the poor fellow who is worried about the overstocking and understocking issue, and build a system for him. The first use case we saw, the sales forecasting use case, used transactional sales data; the second use case used delivery data; and in this use case we have product data. Company A has various products, and this is the product data: there is the product ID, then the cost per unit, and various other columns like that. Based on this data, we are going to create a model that predicts what inventory level we have to maintain for a particular item.
Here I am going to give a demo; I am not going to show the code. I built this use case using Python and Flask. Flask is a micro web framework for Python. The model I trained was the random forest regressor, and I'm not going to explain the random forest regressor again, because I explained it in a previous use case. The way the application works is this: the user accesses the application via a web browser and enters data using the form fields. There is data validation in place; if the data is invalid, the app displays an error message and stops. If there are no errors in the data, the model predicts the inventory levels, calculates the mean absolute error, and displays both on the screen.
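As a rough sketch, the flow just described might look like this in Flask; the route, form fields, and messages here are hypothetical, not the demo's actual code:

```python
# Minimal Flask sketch of the dashboard flow; names are hypothetical.
from flask import Flask, request, render_template_string

app = Flask(__name__)

PAGE = """
<form method="post">
  Product ID: <input name="product_id">
  Cost per unit: <input name="cost_per_unit">
  Revenue: <input name="revenue">
  <button type="submit">Predict status</button>
</form>
{% if message %}<p>{{ message }}</p>{% endif %}
"""

@app.route("/", methods=["GET", "POST"])
def dashboard():
    message = None
    if request.method == "POST":
        try:
            # Data validation: every field must be present and numeric.
            features = [float(request.form[f])
                        for f in ("product_id", "cost_per_unit", "revenue")]
        except (KeyError, ValueError):
            message = "Error: all fields must be filled in with numbers."
        else:
            # In the real app, a trained RandomForestRegressor predicts here:
            # level = model.predict([features])[0]
            message = f"Predicted inventory level for {features}"
    return render_template_string(PAGE, message=message)

if __name__ == "__main__":
    app.run(debug=True)
```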
Now moving on to the demo. This is the real-time inventory dashboard for Company A, and it has various fields, as you can see on the screen right now. To run this demo, I'll quickly switch my screen to show you one of the records in the training data. We're going to use the same data to check whether our model predicts correctly. Let's take the first record, product ID 30, and put it in here. We have the product cost as 275 and the revenue as 5784. For the day of the week, two is Tuesday, one is Monday, and so on; in this case we have Tuesday. If you look at the data, the generic holiday flag is zero, meaning it is not a holiday, and the expected delivery days value was seven. Based on this, when I click on "predict status", it says the product inventory to be maintained is 23, which is pretty close to what I have in my actual test data. That means my model is able to predict a value that is near-accurate to the actual training data.
Now, if I want to increase the revenue: the product cost is still 275, and I increase the product revenue to 9999 for a particular store, keeping all the other fields as is. My model says the inventory to be maintained is 41. One last thing I want to show here: if I pass the same product ID with the same price and the same combination of fields we had, when I click "predict status" it gives me exactly 24 again. That means no matter how many times I pass the same input, the model gives me the same value. This is what we get from the random forest regressor's random_state parameter; in this case we passed 42, which tells the model it has to give the same result every time it sees the same input.
Let me move to the last topic, which is reinforcement learning to optimize allocation across fulfillment centers. Reinforcement learning is a branch of machine learning in which an agent makes sequential decisions in an environment to maximize cumulative reward. In this case I've used Q-learning, one of the simplest reinforcement learning algorithms. It learns the optimal action-selection policy for a Markov decision process (MDP) by iteratively updating the Q-values of state-action pairs.
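The heart of Q-learning is its update rule, which nudges each state-action value toward the observed reward plus the discounted best value of the next state:

$$Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]$$

where $\alpha$ is the learning rate and $\gamma$ the discount factor.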
Now, Q-learning works with certain parameters, namely gamma (the discount factor), alpha (the learning rate), and epsilon (the exploration rate), along with a set of states. It evaluates actions and makes the decisions. While it makes decisions, certain things happen: the reward is calculated, the Q-values are computed, and based on what it learns, it updates the Q-values again. This iteration keeps going until it reaches its final decision. So here, in this case, I tried to build a route calculation using Q-learning. As I said, Q-learning has these components: agent, environment, actions, rewards, and learning.
This is the code snippet, as you can see here. I passed various east coast states, and I asked the Q-learning algorithm to find me the best route from North Carolina to New York. Based on the reward matrix and the Q-values, which got updated as it learned, the Q-learning algorithm was able to figure out the best route. If you look at the output, it gave me North Carolina to Virginia, to Washington DC, to Baltimore, to Maryland, to Delaware, to Pennsylvania, to New Jersey, to New York. And against a real map, this looks pretty accurate.
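For illustration, here is a minimal Q-learning route search in the same spirit; the state graph, rewards, and hyperparameters are assumptions, simplified from the demo:

```python
# Minimal Q-learning route sketch; the graph and rewards are assumed.
import random

# Simplified east-coast adjacency (the demo used more locations).
graph = {
    "NC": ["VA"], "VA": ["NC", "DC"], "DC": ["VA", "MD"],
    "MD": ["DC", "DE"], "DE": ["MD", "PA"], "PA": ["DE", "NJ"],
    "NJ": ["PA", "NY"], "NY": ["NJ"],
}
goal = "NY"
alpha, gamma, epsilon = 0.8, 0.95, 0.2  # learning rate, discount, exploration
Q = {s: {a: 0.0 for a in nbrs} for s, nbrs in graph.items()}

for _ in range(2000):  # training episodes
    state = "NC"
    while state != goal:
        # Epsilon-greedy choice among neighboring states.
        if random.random() < epsilon:
            action = random.choice(graph[state])
        else:
            action = max(Q[state], key=Q[state].get)
        reward = 100 if action == goal else -1  # reward only at New York
        best_next = max(Q[action].values())
        # Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])
        state = action

# Follow the greedy policy from North Carolina to New York.
state, route = "NC", ["NC"]
while state != goal:
    state = max(Q[state], key=Q[state].get)
    route.append(state)
print(" -> ".join(route))  # NC -> VA -> DC -> MD -> DE -> PA -> NJ -> NY
```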
So this is how reinforcement learning can help find the best routes or the best fulfillment center locations, and this kind of learning will definitely help supply chains find the best routes for their warehouses. That is pretty much what I have to share today. Thank you so much for having me, and thank you for listening. This is my email address, Jubin Thomas.org, and those are my LinkedIn and GitHub IDs. Please reach out to me if you have any questions or if you want to connect. Thank you so much.