Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hi, everyone.
Welcome to the talk on optimizing retail forecasting through advanced AI
models: the role of prompt engineering.
In this talk, I'll be talking about how generative AI has had an impact
on a traditional field such as forecasting, which was always approached
with statistical, machine learning, and deep learning models,
and how these generative AI applications can now enhance
the development of these models.
Before getting into it, I'd like to introduce myself.
I'm Sijo Valek at Money Content.
I'm a data scientist with over seven years of experience.
I have experience in traditional ML, natural
language processing, and other areas within data science,
but my primary focus and interest in the past couple of years has been
forecasting at scale and how to use these different areas to enhance forecasting
in the retail, QSR, and other spaces.
I'm actively involved in research on bringing these multidisciplinary
developments in AI and machine learning into other fields.
I've been a mentor to a few junior data scientists,
and I'm also part of several startup accelerator programs, advising on how to use AI
to improve their businesses and to develop intelligent automation systems.
I have a Master of Science from the University of Texas at
Austin, and I am an engineer.
So let's look into what we are going to talk about today.
This is a quick overview of what we'll be covering in today's talk.
First, I'll talk about why we forecast.
We'll start by discussing the fundamental reasons for forecasting in retail, such
as predicting demand, optimizing inventory, and improving customer experience.
Then I will go into the forecasting challenges, and then I will get to
the specifics of prompt engineering and how prompt engineering can help
forecasting.
We will dive into how prompt engineering specifically applies to forecasting,
covering techniques like chain of thought, tree of thought, and self-refined
prompting that can improve model performance and forecasting accuracy.
Finally, we'll explore emerging trends and future possibilities in retail
forecasting, including integration of external data, AI-driven supply
chains, and fully automated forecasting systems. By the end, you will have
a comprehensive understanding of how prompt engineering can
enhance AI-driven retail forecasting
and the exciting potential of these advancements in the retail industry.
First, why do we forecast?
We know that the retail value chain is made up of the flow of
goods from manufacturers to consumers.
Take any good, for instance a t-shirt. It
is usually manufactured in a different place from
the store where you buy it.
So this product flows from the manufacturer to the store through various channels.
There is a lot of transportation involved, there's distribution
involved, and there are different modes of transportation, and it could
all happen in a different order as well.
I have shown a very simplified diagram here with one distribution
center and a store, but it can get extremely complicated, with hundreds
of intermediate steps.
To give one example, it could be
a manufacturing plant somewhere in China. The manufacturer
ships the product, it goes by road to a port, and then from the
port it could go onto a ship, and the ship could come to,
let's say, a port in
Europe. From that port it again travels by
truck to a distribution center, and from the distribution
center it would sometimes go to another warehouse, and from the warehouse
it will come to your doorstep or go to a store. So it can get extremely
complex with respect to how a product travels to consumers.
Now that we know the value chain,
there are other factors that affect the retail landscape.
We saw that there are different components, and when you have
a lot of components, any uncertainty in each of those components can affect how
the consumers receive the product. Imagine ordering something on an
e-commerce website like Amazon.
Amazon tells you the product will reach you in, let's say,
one or two days.
So you have the expectation, but there are a lot of things going
on behind the scenes to get the product to the consumer on time.
So it's extremely important to forecast when it's going to arrive,
so that the customers know when to expect it. That contributes
a lot to customer experience and satisfaction and to developing
a relationship with the customers.
One of the key challenges here is that we need to start
producing products well before the anticipated demand in a lot of cases.
If you're talking about a t-shirt, it takes a while for
the t-shirts to get manufactured.
And even before manufacturing, there's product design involved.
So the retailers want to predict the demand in advance
so that they can get a head start.
Another thing is that these product designs
are customer driven.
Maybe a gray shirt with some
abstract design is trending now, but what is going to be trending, let's
say, one year from now is hard to predict. You can, however, get a sense of the
trends from consumer behavior. There is also omnichannel:
a lot of people are shopping online.
Sometimes they go in store and look at the products and try them
out, but they don't want to carry them out of the store, so they order online.
So over the years, the retail landscape has become extremely
complex, and forecasting is a critical component to identify these
signals in consumer preferences.
There are also various components when it comes to
forecasting. Let's look at some of the challenges. I'm going to touch upon
the most critical ones with respect to forecasting. One
of them is horizon.
Even if you are forecasting something like consumer demand,
it looks simple, it looks like it's the same thing, but forecasting demand
is different when you are looking at different horizons. At greater than 24
months from now, we are making decisions on product design, contracts, strategy
planning, and other things.
If you are looking at something between three and
24 months, we are looking at assortment planning, space optimization,
and sales and operations planning.
When you get into a three to 12 week period, we are looking at
completely different use cases:
workforce optimization, capacity management, inventory allocation,
distribution center replenishment, and other aspects.
And at one to three weeks, we are extremely operational: inventory allocation,
clear-outs, discounts, and other things.
So horizon is a very important
concept, because it adds a lot of complexity. For instance, how can you
predict something five years from now?
It's extremely hard.
Nobody predicted COVID.
Nobody predicted a pandemic.
There are always errors.
Nobody can predict the future.
But having forecasting systems gives a direction in which a
corporation has to move forward
in order to gain customers and increase revenue.
There are other challenges, for instance granularity.
You could forecast at different levels of granularity.
You can look at an individual product and say the demand for this
product is going to be five units, or you can
look at it at a store, region, or channel level.
Generally, forecasting at an aggregated level is much
easier than forecasting at a finer-grained level,
because of the volatility, right?
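The granularity point can be illustrated with a small sketch. The weekly unit sales below are made up for three hypothetical products, and they are deliberately constructed so that the product-level swings partly cancel out at store level. The sketch compares the coefficient of variation (standard deviation divided by mean) of each product with that of the store total.

```python
from statistics import mean, pstdev

# Made-up weekly unit sales for three hypothetical products in one store.
weekly_sales = {
    "tshirt_gray": [5, 0, 9, 2, 7, 1],
    "tshirt_blue": [1, 8, 0, 6, 2, 7],
    "tshirt_red": [6, 4, 3, 5, 3, 4],
}


def coefficient_of_variation(series):
    """Relative volatility: population standard deviation divided by the mean."""
    return pstdev(series) / mean(series)


# Aggregate to store level by summing the products week by week.
store_total = [sum(week) for week in zip(*weekly_sales.values())]

product_cv = {name: coefficient_of_variation(s) for name, s in weekly_sales.items()}
store_cv = coefficient_of_variation(store_total)

for name, cv in product_cv.items():
    print(f"{name}: CV = {cv:.2f}")
print(f"store total: CV = {store_cv:.2f}")
```

With this invented data the store-level series is far less volatile than any single product, which is why the aggregated forecast is the easier one.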
Then we have, as we discussed, short term versus long term
forecasting, and the other big challenge is data fragmentation.
Think about how the data is collected.
If you have an e-commerce website, it's easier
to capture data: everything is online, and the data capture systems
feed into different systems online.
But when there are IoT devices, like payment
devices and tracking devices in the stores,
it gets complex with respect to the data, and
that adds a lot of noise.
Sometimes the data goes missing, and so on.
And of course there is demand volatility: consumers change their preferences,
and we need to handle the fluctuations in customer behavior. One
of the biggest challenges was COVID-19.
There was a drastic shift in consumer behavior due to lockdowns and
economic uncertainty, along with a lot of supply chain disruptions
that affected inventory management.
This was an eye-opening moment for retailers,
and it created an understanding of the importance of real-time
forecasting and adaptability in supply chain systems.
So now there is accelerated adoption of AI and ML models to
handle this unpredictability.
So now let's look into prompt engineering.
I'll first introduce what prompt engineering is,
and then how it's going to help forecasting and how we can use
prompt engineering to help develop these AI forecasting systems.
Let's start with an understanding of what prompt engineering is and why
it's so vital in retail forecasting.
Prompt engineering is the process of guiding generative AI models
to produce high quality, targeted outputs.
This involves carefully choosing the right format, phrases, and structures that
shape the AI system's response.
When we talk about prompt engineering, we know that there have been a lot of
generative AI applications, like ChatGPT and Anthropic's Claude,
and every tech company is building its own LLMs and generative
AI solutions.
This has boosted the productivity of developers, data scientists,
data analysts, and so on.
Prompt engineering allows us to build a library of prompts that can be reused
across different forecasting scenarios.
This scalability is extremely important because it enables us to
tailor AI models to various retail forecasting needs efficiently without
starting from scratch every time, right?
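One way to picture a reusable prompt library is a set of named templates with placeholders. This is only a minimal sketch: the template names, the placeholder fields, and the `build_prompt` helper are hypothetical choices made for illustration, not something shown in the talk.

```python
from string import Template

# A small, reusable prompt library. The template names and placeholder
# fields are hypothetical choices made for this sketch.
PROMPT_LIBRARY = {
    "poc_model": Template(
        "Develop a Python script to create a proof of concept time series "
        "model using $model on the dataset $dataset with columns $columns. "
        "Use $libraries."
    ),
    "eda": Template(
        "Design Python code that performs exploratory data analysis on the "
        "time series dataset $dataset with columns $columns. Use $libraries."
    ),
}


def build_prompt(name, **fields):
    """Fill a named template; raises KeyError if a placeholder is missing."""
    return PROMPT_LIBRARY[name].substitute(**fields)


prompt = build_prompt(
    "eda",
    dataset="sales_series",
    columns="date and sales",
    libraries="plotly, statsmodels, and scipy",
)
print(prompt)
```

Using `substitute` rather than `safe_substitute` means a forgotten field fails loudly instead of sending a half-filled prompt to the model.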
That has saved a lot of developer hours.
I have personally used it to create a few systems, and we will get into some
of the key techniques and how we can use those to quickly develop these systems.
There are three key techniques in prompt engineering.
One is chain of thought prompting.
This technique is particularly useful for breaking down complex forecasting
questions into a series of logical steps.
For example, if you want to forecast
seasonal sales trends, the model can first analyze historical data patterns,
then consider external factors like holidays, and finally produce a forecast.
By prompting the AI to tackle each component sequentially, we get a well
reasoned and comprehensive forecast.
The next one is tree of thought prompting.
With this technique, the model explores multiple solutions
or ideas before reaching a final conclusion. In a retail forecasting context,
this could mean evaluating several forecasting methods, such as ARIMA,
seasonal ARIMA, and different machine learning models, and then
comparing their suitability based on the retail requirements: volatility, demand
fluctuations, seasonality, and so on.
This approach allows the AI to consider different paths and select
the most suitable forecasting approach.
The third one is self-refined prompting.
This method encourages the AI to refine its answers through an iterative process.
For instance, after generating the initial forecast, the AI can critique
the results, identify where it may have underestimated the demand spikes,
and adjust the models accordingly.
This iterative refinement is meant to improve the forecast with
each pass, leading to more accurate predictions over time.
Each of these techniques brings unique strengths to forecasting
in retail, helping us harness generative AI to produce reliable, data-driven
insights tailored to complex
and dynamic retail environments.
Next, we will look at each of these techniques, and I will give you one
or two examples of how you can use each in your forecasting tasks.
The first one is chain of thought prompting.
As we mentioned, the purpose is to break down the forecasting task into
logical, step by step components.
This is an example prompt that I've written:
"Develop a retail sales forecasting model by breaking
down the process into key steps.
Start by preparing the data, then explore and evaluate different
forecasting models, and conclude with an evaluation on the test dataset.
Outline each step and include key considerations."
Here, I'm telling the model each individual step that
it has to perform, and then asking it to outline each step
and include key considerations.
Depending on your AI model, you can expect different results, but this is a sample
result that the AI could come up with.
First, it'll start with data preparation, as I mentioned.
It could convert dates into datetime format, set them as the index, check for missing
values and fill them using interpolation, aggregate the data, and so on.
Then it'll move on to model exploration and tuning, and then model evaluation.
I'm not saying the models are perfect, but currently the generative
AI models can do a lot of tasks by themselves by generating code, and that saves a
lot of time, for instance when converting dates to datetime format or checking
for missing values.
You get the scripts
and the inputs from these models quickly.
On your own you may also miss certain things.
The benefit of chain of thought prompting is that it guides the AI
system to systematically approach each component of the task, ensuring that all
the critical tasks are covered logically.
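The data preparation steps named in that sample result (convert dates, check for missing values, interpolate, aggregate) can be sketched as follows. This is a dependency-free illustration on made-up daily sales, not the output of any particular model; in practice the same steps are usually done with pandas.

```python
from datetime import date

# Made-up daily sales keyed by ISO date string; None marks a missing value.
raw = {
    "2024-01-29": 10.0,
    "2024-01-30": None,
    "2024-01-31": 14.0,
    "2024-02-01": 16.0,
    "2024-02-02": None,
    "2024-02-03": None,
    "2024-02-04": 22.0,
}

# Step 1: convert the date strings to date objects and sort them (the index).
days = sorted(date.fromisoformat(d) for d in raw)
values = [raw[d.isoformat()] for d in days]


def interpolate_linear(series):
    """Fill interior None gaps by linear interpolation between known neighbours."""
    filled = list(series)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


# Step 2: check for missing values, then fill them.
missing_count = sum(v is None for v in values)
filled = interpolate_linear(values)

# Step 3: aggregate the daily values to monthly totals.
monthly = {}
for d, v in zip(days, filled):
    key = (d.year, d.month)
    monthly[key] = monthly.get(key, 0.0) + v

print(missing_count, filled, monthly)
```

Note that this helper only fills gaps that have known values on both sides; leading or trailing gaps would need a separate decision.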
Then we have tree of thought prompting.
This is one of my favorite approaches because it explores multiple ways
of solving the forecasting problem before selecting a final model.
One example here is: "Consider three different approaches for forecasting
retail sales: a traditional one, a machine learning one, and a hybrid one.
For each approach, provide a basic outline of steps,
advantages, and limitations. Finally, recommend the most suitable one."
So the model could first start with the traditional approach, then the
machine learning models, such as XGBoost and random forest, and then
a hybrid one, which here combines ARIMA with XGBoost.
Then it would also suggest that, for large seasonal data, the hybrid model
may offer the best performance, and so on.
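The "compare several candidates, then recommend one" pattern can be sketched in code. To keep the example self-contained, three very simple forecasters stand in for the traditional, machine learning, and hybrid approaches; they are not ARIMA or XGBoost, and the sales numbers are made up.

```python
# Made-up sales with a repeating 4-period season; the last season is the test set.
history = [20, 35, 50, 30, 22, 37, 52, 32, 24, 39, 54, 34]
season = 4
train, test = history[:-season], history[-season:]


def naive(train, horizon):
    """Repeat the last observed value."""
    return [train[-1]] * horizon


def seasonal_naive(train, horizon):
    """Repeat the value from the same position in the last full season."""
    return [train[-season + (h % season)] for h in range(horizon)]


def season_mean(train, horizon):
    """Repeat the mean of the last full season."""
    return [sum(train[-season:]) / season] * horizon


def mape(actual, forecast):
    """Mean absolute percentage error in percent (actuals must be non-zero)."""
    return 100 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


candidates = {
    "naive": naive,
    "seasonal_naive": seasonal_naive,
    "season_mean": season_mean,
}

# Score every branch on the same holdout, then pick the lowest error.
scores = {name: mape(test, fn(train, len(test))) for name, fn in candidates.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: MAPE = {score:.1f}%")
print("recommended:", best)
```

On this strongly seasonal toy series the seasonal candidate wins, which mirrors the kind of recommendation the prompt asks the model to justify.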
So it actually gives you
a lot. This one is more about what you can do: these are
the steps that you could take. It gives a structured approach to the project
itself. You can also go ahead and ask the next-level question, which is
to generate code for these steps, so that you can keep adding to your thought
process. The next one is self-refined prompting. The purpose of this is to
guide the AI system to iteratively refine the forecasting model by critiquing
its performance and making adjustments.
An example is: "Create a retail sales forecasting model using a SARIMA
model, and then critique the model based on its test performance.
Identify issues such as underestimating seasonal peaks, refine the model to
improve accuracy, and finally evaluate the refined model and report the improvements."
So the AI system can build an initial model such as SARIMA to capture the
seasonality, then it could evaluate the model on the test data with metrics
such as MAPE, and then it can critique.
It can say the SARIMA model captures seasonality but underestimates
peaks during the holiday season, leading to higher errors.
The refinement could be adjusting the seasonal parameters to account for
sharper peaks and introducing exogenous variables to improve peak forecasts.
And then a final evaluation.
This approach allows the AI to assess and refine its forecast iteratively,
leading to a more accurate, tailored model for retail forecasting.
This is the approach that we as data scientists use in our
day to day work as well.
We start with a baseline model, and then we iterate, conduct
experiments, and evaluate to finally come up with our final solution.
So we are prompting the AI systems to do the same, to select the best model.
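The draft, critique, revise cycle can be written as a short loop. In this sketch `llm` is a placeholder for whatever model client you use (any callable that takes a prompt string and returns text); it is not a real library API, and `fake_llm` is a canned stand-in so the loop can run without a model.

```python
def self_refine(task, llm, max_rounds=3):
    """Draft, critique, and revise until the critique reports no issues.

    `llm` is any callable mapping a prompt string to a text response. It is a
    placeholder for a real model client, not an actual library API.
    """
    draft = llm("TASK: " + task)
    history = [draft]
    for _ in range(max_rounds):
        critique = llm("CRITIQUE this approach; list issues or reply NO ISSUES:\n" + draft)
        if "NO ISSUES" in critique.upper():
            break
        draft = llm("REVISE to fix these issues:\n" + critique + "\n\nPrevious approach:\n" + draft)
        history.append(draft)
    return draft, history


def fake_llm(prompt):
    """Canned responses that mimic the SARIMA example from the talk."""
    if prompt.startswith("TASK"):
        return "SARIMA baseline"
    if prompt.startswith("CRITIQUE"):
        if "holiday" in prompt:
            return "No issues"
        return "Underestimates peaks during the holiday season"
    return "SARIMA with a holiday exogenous variable"


final, history = self_refine("Forecast weekly retail sales", fake_llm)
print(final)
print(history)
```

The `max_rounds` cap is a design choice in this sketch: it stops the loop even if the critique never comes back clean.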
What are the applications of prompt engineering in forecasting, and
how can we actually use them?
It's an unusual question, because people still think about prompt
engineering as something that helps with NLP tasks, and they see it
used in chatbots, on different websites, in product recommendation,
or in generative AI applications.
How can we use this in a traditional field such as forecasting?
People are getting creative with many new applications,
but there are a few areas where we can greatly enhance the forecasting process.
The first one is proof of concept development.
When building a new forecasting model, prompt engineering can help
quickly generate a proof of concept.
This allows us to test ideas, validate model feasibility, and demonstrate
initial forecasting capabilities before full implementation.
Next is generating scripts. Writing scripts to preprocess data or evaluate
models can be extremely time consuming,
but with prompt engineering we can guide AI to generate clean, efficient
code for tasks such as data cleaning, feature engineering, and model evaluation.
This accelerates development and reduces manual effort.
One of my favorites is exploratory data analysis.
For forecasting, understanding the data patterns is critical,
and prompting the AI to perform EDA helps visualize trends,
seasonality, anomalies, and so on.
The visualization scripts can be particularly tedious.
It's not complex with respect to the conceptual understanding, but
it can get cumbersome with respect to the different libraries,
data types, and so on.
But AI is good at just extracting the logical data, right?
So it saves a lot of time.
It has personally saved me a lot of time in developing scripts and
visualizations to identify patterns.
Then we can use AI to debug.
I saw a stat that Stack Overflow's
traffic has been going down a lot since ChatGPT was introduced,
and there's a reason for that.
You can easily ask these systems to debug your script, and with the right
prompt engineering it gets better.
It can show you things that you have missed, and it also
increases data science productivity. One other interesting and
new area is getting signals from textual data. Retail forecasting can
benefit from external textual data like news headlines, social media sentiment,
and so on. With the right prompts, we can extract key signals from textual
data, such as sentiment trends that correlate with sales performance, and
this data can be incorporated into our forecasts to capture the impact of market
sentiment or economic events on retail.
Each of these areas demonstrates how prompt engineering supports the efficient
development of robust forecasting models, making the AI-driven forecasting
process more scalable, insightful, and responsive to complex
retail needs.
Next, I'm going to show some use cases of how to use some prompts.
This one is around proof of concept development:
how do you develop a comprehensive time series forecasting model?
One prompt that we could use is: "Develop a Python script to create a
comprehensive proof of concept time series model using SARIMA."
Then we outline each and every step that the AI model should follow.
Begin with data processing steps: include time-based indexing, handle missing
values, and apply outlier detection.
Then we are talking about how to set certain parameters, such as seasonality,
and what sort of technique to use to select them, like
grid search or AIC minimization.
Then we are talking about the backtesting strategy: split the data into training
and test sets, use error metrics such as MSE and MAPE, provide visualizations of
actual versus forecasted values, and include residual diagnostics.
Then conclude with a summary report assessing model assumptions and limitations
and suggesting potential next steps.
Use pandas, statsmodels, and matplotlib to implement this robustly.
So there is a lot of information that's given to the AI model
so that it doesn't hallucinate
and doesn't stray from the task.
We give it very specific things to work on, but it is still your job to provide
these specific instructions so that AI models can easily generate them.
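The backtesting part of that prompt can be illustrated with a rolling-origin (expanding window) backtest scored with MSE and MAPE. This is a sketch under simplifying assumptions: a last-value forecaster stands in for SARIMA so the example has no dependencies, and the series is made up.

```python
def rolling_origin_backtest(series, forecaster, initial_train, horizon=1):
    """Expanding-window backtest: forecast from each growing prefix, score the next points."""
    actuals, forecasts = [], []
    for cutoff in range(initial_train, len(series) - horizon + 1):
        train = series[:cutoff]
        forecasts.extend(forecaster(train, horizon))
        actuals.extend(series[cutoff:cutoff + horizon])
    return actuals, forecasts


def mse(actual, forecast):
    """Mean squared error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error in percent (actuals must be non-zero)."""
    return 100 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


def last_value(train, horizon):
    """Placeholder forecaster standing in for a fitted model such as SARIMA."""
    return [train[-1]] * horizon


series = [100, 110, 121, 133, 146, 161]  # made-up, steadily growing sales
actuals, forecasts = rolling_origin_backtest(series, last_value, initial_train=3)

print("actuals:  ", actuals)
print("forecasts:", forecasts)
print(f"MSE = {mse(actuals, forecasts):.2f}, MAPE = {mape(actuals, forecasts):.2f}%")
```

Because the forecaster is passed in as a function, a real model could be swapped in without changing the backtest loop.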
The other thing is around script generation.
This is a prompt for a sophisticated data transformation and aggregation script.
The prompt is: "Generate Python code that performs
advanced data transformation for time series analysis on a dataset, sales
data, containing date, product category, and transaction value."
The prompt goes on with what the code should do, so we are already giving data specifics.
I made up these features and this dataset, but we are giving very specific
column names that match the data that I have, so that it's easy for me to
just copy, paste, and get started. Then I'm again giving
specifics: convert date to a time-based index, handle any missing or inconsistent
timestamps, and adjust for time zone differences if necessary. Then we
are talking about aggregation and then calculating specific features
like month over month growth rate, moving averages, seasonal indicators, and so on.
Then we are telling the AI system to output a structured format, and
also to implement this using pandas, ensuring high flexibility
for different time zones and potential categorical feature explosion.
It's a well-specified prompt: the AI has room to work on certain things, but we limit
hallucinations so that we get a very specific output.
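Two of the features named in that prompt, month over month growth rate and moving averages, are simple enough to sketch directly. The monthly totals below are made up, and the choice to return None where a value is undefined is an assumption of this sketch.

```python
def month_over_month_growth(values):
    """Percent change versus the previous month; None for the first month."""
    growth = [None]
    for prev, cur in zip(values, values[1:]):
        growth.append(None if prev == 0 else 100 * (cur - prev) / prev)
    return growth


def moving_average(values, window):
    """Trailing mean over `window` points; None until the window is full."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


monthly_sales = [200.0, 220.0, 198.0, 250.0]  # made-up monthly totals
growth = month_over_month_growth(monthly_sales)
smoothed = moving_average(monthly_sales, window=3)

print(growth)
print(smoothed)
```

Checking a generated script against small hand-computable cases like this is a quick way to catch errors before running it on real data.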
This is an exploratory data analysis
prompt to get an in-depth
visualization of trends, seasonality, and anomalies. It's: "Design
Python code that performs exploratory data analysis on a time series dataset,
sales series, with columns date and sales."
We are telling the script exactly what to do: generate an interactive
line plot, perform a decomposition, annotate anomalies using the z-score method, and
summarize insights from the EDA in a report section, highlighting seasonal patterns,
cyclic trends, and any anomalies.
And we are again mentioning libraries to use, such as plotly for
interactive visualization, statsmodels for decomposition, and scipy
for anomaly detection, right?
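The z-score method mentioned in that prompt flags points that sit far from the mean in units of standard deviation. Here is a minimal standard-library sketch on made-up sales; the threshold of 2.0 is an assumption chosen for this short series, not a value from the talk.

```python
from statistics import mean, pstdev


def zscore_anomalies(values, threshold=2.0):
    """Return indices whose absolute z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # a constant series has no outliers by this definition
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


sales = [100, 102, 98, 101, 99, 180, 100, 97]  # made-up series with one spike
anomalies = zscore_anomalies(sales)
print(anomalies)
```

On very short series a single spike inflates the standard deviation, so a stricter threshold such as 3.0 may never trigger; that is worth stating in the prompt when asking an AI system to generate this kind of check.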
By limiting the AI system to
certain libraries that you are familiar
with, it's easy to take this work as an initial starting point and then
build upon it. This is the last example.
It's one of the most prominent upcoming trends:
capturing signals from textual data.
This is another favorite of mine.
I would call it advanced signal extraction from text for time series
modeling. The prompt is: "Generate a Python script to extract meaningful
signals from a dataset, news headlines, containing date and headline, and integrate
them into a time series stock prices dataset with date and closing price." This
is not exactly a retail forecasting use case, but it's still a time series one.
A retail forecasting use case would look something like collecting
social media trends and creating an index that captures
the customer sentiment.
So here we have a few steps.
Preprocess the text data with advanced NLP models, with specific
steps like tokenization, removal of stop words, lemmatization, and
extracting sentiment scores using fine-tuned models for finance
or industry-specific sentiment.
Then we are talking about how to aggregate sentiment scores and keyword frequency
metrics that reflect public
sentiment and event frequency, and merge these with stock prices.
Then we are talking about causality, to assess the relationship between sentiment
signals and stock price movements.
And finally, generating a visual correlation matrix and a time series
plot overlaying sentiment score and stock price to illustrate
alignment and divergence trends.
Implement spaCy or transformers for text processing, pandas
for merging and transformation, and matplotlib or seaborn for
visualization and correlation analysis.
These are some of the things that add a lot of value.
You can quickly generate scripts, and you can quickly ask these AI systems to
get trends from a large amount of text data and integrate it with stock prices.
At least you can see whether there is any signal or not without
wasting much development time, right?
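For the retail variant of this idea (a daily sentiment index compared against sales), a toy sketch looks like this. Everything here is an assumption made for illustration: the headlines and sales are invented, and a tiny keyword lexicon stands in for the fine-tuned sentiment model the prompt asks for. The sketch only measures correlation; it says nothing about causality.

```python
from collections import defaultdict
from math import sqrt

# Toy lexicon standing in for a real sentiment model (assumption of this sketch).
POSITIVE = {"surge", "record", "strong", "growth"}
NEGATIVE = {"slump", "shortage", "weak", "recall"}

headlines = [
    ("2024-03-01", "Record holiday demand lifts retailers"),
    ("2024-03-01", "Strong growth in online orders"),
    ("2024-03-02", "Supply shortage hits apparel stores"),
    ("2024-03-03", "Weak traffic and product recall weigh on sales"),
    ("2024-03-04", "Sales surge after promotion"),
]
daily_sales = {
    "2024-03-01": 130.0,
    "2024-03-02": 95.0,
    "2024-03-03": 80.0,
    "2024-03-04": 120.0,
}


def headline_score(text):
    """Count positive minus negative lexicon hits in one headline."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def daily_sentiment(rows):
    """Average headline score per day."""
    totals, counts = defaultdict(float), defaultdict(int)
    for day, text in rows:
        totals[day] += headline_score(text)
        counts[day] += 1
    return {day: totals[day] / counts[day] for day in totals}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))


sentiment = daily_sentiment(headlines)
days = sorted(set(sentiment) & set(daily_sales))  # merge on shared dates
r = pearson([sentiment[d] for d in days], [daily_sales[d] for d in days])
print(sentiment)
print(f"correlation with sales: {r:.3f}")
```

A quick check like this is the "is there any signal at all" step; a strong correlation on four invented days proves nothing by itself.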
Having said that, the AI systems have a lot of benefits, as we saw. Feel
free to use any of these prompts in your forecasting tasks, but there are a few
things that we need to be aware of when we practically implement these prompts.
The quality and accuracy of the forecasting outputs is
heavily dependent on the clarity of the
prompt. Vague or incomplete prompts can
result in models that overlook factors like seasonality, trends, or specific time
intervals, leading to unreliable forecasts. If you look at the
prompts that I have shown, I'm not giving any open-ended prompts.
They are all closed-ended.
I'm even restricting the libraries, limiting how far an AI system can go.
That's very important, because if you don't give the right
prompt, you can end up with scripts that would take you a long time to interpret, or
the AI systems can even tell you things which are not actually true, which may be
hard to debug at a later point in time.
Also, many advanced forecasting models generated by prompt engineered
scripts, especially deep learning approaches, can be difficult to interpret.
That's because the AI has the liberty to choose them, and also it's very hard for us to
look inside a deep learning model, since it's a black box type of model.
Even though we have weights, it's very hard to go through all the nonlinearities
within different models.
So it'll be hard to debug.
You need to be careful when you are doing such modeling tasks. Also, prompt
engineered models may struggle to account for unexpected external shocks like
economic crises or pandemics, because the AI models themselves are trained on limited data.
So you need to have an understanding of
the date up to which the model was trained, what the limitations are, and what
these models can't do.
There is also the reliance on high quality data and preprocessing.
Accurate forecasts require clean and well processed data.
The AI models generated from prompts may not automatically address issues
like missing values, irregular intervals, and outliers unless
you are explicitly prompting these systems.
So you need to be extremely careful.
That applies beyond forecasting, but with respect to forecasting you need to be
extremely careful, because the risk of inaccurate forecasts increases
if data issues go undetected, right?
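Irregular intervals are one of the issues that a generated script may silently ignore, so it is worth checking for them yourself. A minimal sketch, assuming the data is expected to be daily (the dates are made up):

```python
from datetime import date, timedelta


def find_date_gaps(days, expected_step=timedelta(days=1)):
    """Return (previous, next) pairs where consecutive dates are not one step apart."""
    ordered = sorted(days)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a != expected_step]


# Made-up observation dates with May 3 and May 4 missing.
observed = [date(2024, 5, 1), date(2024, 5, 2), date(2024, 5, 5), date(2024, 5, 6)]
gaps = find_date_gaps(observed)
print(gaps)
```

The same check also flags duplicated dates, since a zero-day step is not equal to the expected one-day step.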
So my recommendation, from my practical experience, is that these AI
models are extremely good at doing the initial analysis, and initial analysis
such as visualizing data is easy to debug even if there is a mistake.
You can spend time writing prompts to generate tons of visualizations to understand your
data really well, which you can then take and use to develop models.
Visualization scripts usually take a lot of time, but these
systems can reduce the effort required to build these visualizations,
exploratory reports, and analyses.
Next, I want to touch on the future of AI in retail
forecasting, the trends.
As we look into the future, there are several emerging
trends in retail forecasting.
One of the key things I see is the need
for incorporating social media and external data for demand sensing. With
the vast amount of data generated on social media and external platforms,
retailers can capture real-time insights on customer
sentiment and market trends.
I'm not saying that it's not being done at this point, but
there's a lot more that can be done.
By incorporating this data into forecasting models, we can improve demand
sensing, especially around events, product launches, and seasonal shifts.
The next thing is around AI agents.
2024 has been the year where
AI agents have been on the rise, and AI-driven
supply chain agents are enabling faster and more agile forecasting solutions
for inventory and logistics management.
For instance, these AI agents can generate rapid forecasts,
adjust inventory levels, and even recommend reordering strategies in
response to demand fluctuations, optimizing the entire supply chain.
We also see a lot of trends around predictive analytics
for customer personalization and marketing optimization.
This field has existed for a while, but now we see the
integration of AI systems that make sense
of a lot of customer reviews and customer
support conversations. They can take feedback, quickly analyze
it, and turn it into some action.
For instance, let's say you have a food ordering app, and
some issue happened while delivering the food.
You can just type in the issue; you don't need to wait for a customer
support agent to be available.
You can have an AI agent or an LLM system resolve the query for you
by understanding the sentiment,
understanding the actual issue,
and then taking action.
It can also give the customers some promotion or incentive
to stick around, to increase
customer satisfaction.
Having said all of this, there are some ethical concerns as well.
There's a big new field that's cropping up with all these AI systems:
how much should we be concerned about these systems and about giving
them the autonomy and decision-making power that humans
have had? It's such a new area,
but it's getting more and more attention, including with
respect to privacy regulations.
The other trend is around fully automated demand
forecasting systems driven by AI.
Full automation is still far off, but I see it happening sometime in the future,
at least in non-critical areas. We can also have
systems with uncertainty measures, like probabilistic forecasting, so that
when we have a forecast, we know how to read it and it is interpretable,
so that businesses do not have to rely on a tech team or a data science
team to provide them with a forecast or develop a forecasting system.
So it's enabling businesses to generate technical solutions for their needs and
quickly act upon business decisions.
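One simple way to attach an uncertainty measure to a point forecast is to add empirical quantiles of past forecast errors. This is only a sketch of that one idea, with made-up residuals and a hypothetical `interval_80` helper; it assumes future errors will look like past ones, which often fails around shocks.

```python
from statistics import quantiles


def interval_80(point_forecast, residuals):
    """Add the 10th and 90th percentile of past residuals to a point forecast."""
    deciles = quantiles(residuals, n=10)  # nine cut points: 10%, 20%, ..., 90%
    return point_forecast + deciles[0], point_forecast + deciles[-1]


# Made-up residuals (actual minus forecast) from a hypothetical backtest.
past_residuals = [-12, -8, -5, -2, 0, 3, 4, 7, 10]
lower, upper = interval_80(100.0, past_residuals)
print(f"forecast 100.0, approximate 80% interval [{lower:.1f}, {upper:.1f}]")
```

Reporting the range alongside the single number is what makes the forecast readable for a business user: it shows how much the plan should tolerate being wrong.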
That's it for the talk today.
I hope you enjoyed the talk, and I hope you can use these prompt
engineering techniques and this understanding to quickly
develop your forecasting solutions.
If you have any questions, feedback, or thoughts, you can email me at
cjoe.vmanikandan at gmail.com.
I'll be happy to answer any questions and hear any thoughts
or feedback on this talk.
Thank you. I hope you enjoy the rest of the conference.