Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello everyone. Today we are going to talk about deploying machine learning models. I will tell a few stories from my experience, mostly from the beginning of my career as a data scientist, and you will probably find yourself in these stories. Or you can simply learn some of these things if you were lucky and didn't have any failures like that in your career.
So first of all, we are all used to training models, because this is like the main task for us as data scientists. We all know these stages: data gathering, exploratory data analysis, then feature engineering and model training. We can fine-tune the model by optimizing its hyperparameters, and we also evaluate it.
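To give a rough idea of those stages, a tiny generic sketch of that loop in scikit-learn could look something like this (a toy dataset and a made-up parameter grid, not code from any project I'll mention):

```python
# Toy version of the usual research loop: split -> train -> tune hyperparameters -> evaluate.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_breast_cancer(return_X_y=True)          # stand-in for the gathered data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

search = GridSearchCV(                               # hyperparameter optimization
    RandomForestClassifier(random_state=42),
    param_grid={"n_estimators": [100, 300], "max_depth": [None, 5]},
    scoring="roc_auc",
    cv=3,
)
search.fit(X_train, y_train)                         # model training and fine-tuning
print(roc_auc_score(y_test, search.predict_proba(X_test)[:, 1]))  # evaluation
```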
This is something that we are so used to, and in this process we have a specific mindset, I would say a researcher mindset: we are focusing not just on the end result, like an accurate model, but also on keeping the experiment right and checking everything at every stage. And basically, deployment requires another set of skills, another set of tools, and another mindset as well. For example, in most cases you
probably will be building some kind of prediction service that communicates with an end-user application or with the rest of the system, depending on what kind of problem you solve and how your solution is integrated into the system in general. But the main idea is that during model training you are basically focused on getting the model and making it better, improving it all the time, while during deployment you are focused on preserving your model, because you don't want to get different responses, in the form of prediction scores, from the same model for the same input data. This is one of the main concerns to think about and watch during deployment.
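To give a rough idea of what "preserving the model" can mean in practice, a minimal consistency check could look something like the sketch below: record reference scores on a fixed input sample when the model is approved, and verify the deployed artifact still reproduces them. The joblib/NumPy usage and the file paths are just illustrative assumptions.

```python
# Sketch of a "same input -> same scores" check for a deployed model.
# Assumes a scikit-learn-style classifier saved with joblib; paths are made up.
import joblib
import numpy as np

def check_model_consistency(model_path, sample_path, expected_path, atol=1e-8):
    model = joblib.load(model_path)          # the deployed artifact
    sample = np.load(sample_path)            # fixed input rows saved at release time
    expected = np.load(expected_path)        # scores recorded when the model was approved
    scores = model.predict_proba(sample)[:, 1]
    if not np.allclose(scores, expected, atol=atol):
        raise AssertionError("Deployed model no longer reproduces its reference scores")
```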
Also, there is a completely different set of issues you have to focus on during deployment, for example infrastructure. You don't really think about infrastructure when you're training models. Yes, you have to think about the capabilities of your service, how you are going to deploy, and what kind of models you train, but at that moment you don't really think about the production service: what the load on it will be, what you can do about that, how you can plan and organize the infrastructure for your solution. You also don't have to think about the integration of the models with the whole system, because you work separately on your research; you're basically alone with your own environment, you can run all kinds of experiments you want, and you don't have to think about the influence of what you do on the other components of the system, or about the quality of the code, which again is
not that important during research. I wouldn't say that it is unimportant, but it's not really the focus, because we are focusing on the experiments, and sometimes it can get messy, especially if you are using Jupyter notebooks, as you may already know. But for deployment it's highly important to keep the quality of your code high and also to write tests. In my opinion, tests for data scientists are crucial, especially for deployment, because during research you have this kind of luxury of being able to check your dataset, see how the data changes, and notice if something goes wrong and the calculations are not working as you expected. You don't have this luxury during deployment, and somehow you have to be able to evaluate whether your calculations were conducted in the expected way, to save the states of your data somehow, and to check them. Also, there is logging, which is
again, not really required during research. Probably because quite a lot of data scientists come from a research background, from applied mathematics and so on, we usually don't really have a software engineering background, and that's why we use tools that don't really require logging. For example, if all of us were using PyCharm, we would probably use some kind of logging to track the experiments. But if we use Jupyter notebooks, we don't need that, because we can look at the data at any moment and see what's going on in there. During deployment, again, that's unfortunately impossible, and we have to think about other ways to handle it.
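As a rough illustration, logging inside a prediction function could look something like this minimal sketch with Python's standard logging module (the service name and feature handling are invented):

```python
# Minimal logging inside a prediction function, so you can still "see the data"
# in production the way you would in a notebook.
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("prediction_service")   # hypothetical service name

def predict(model, features: dict) -> float:
    logger.info("request features: %s", features)                       # what came in
    score = float(model.predict_proba([list(features.values())])[0][1])
    logger.info("prediction score: %.4f", score)                        # what went out
    return score
```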
I want to share some of my mistakes from deploying machine learning models, and unfortunately quite a lot of them I made in production. But that's how you learn a lot of things, and sometimes you learn them the hard way. Let's hope that we're never going to repeat them, and that you can learn just by listening to what I'm going to talk about instead of going through it yourself. The first story is about scaling. Basically, it's about the moment when I developed my first ever model; it was my first job as a data scientist and I needed to deploy it.
And I was working in a startup, and you know how it sometimes happens: really tight deadlines, this fast-paced startup culture, investors coming to the office and trying to get everything they need in a few days. So it's quite a lot of pressure, and that's why, especially if you are doing your first project as a data scientist in a startup, you have to do something valuable and at the same time do it fast, because it's really hard to prove that this kind of investment is worth it. And that's why, I would say, a lot of mistakes happen to many different data scientists in their careers. For me, the story was about deploying my first model together with my teammate. We basically created a prediction service that was built entirely around just one model, and it wasn't really able to scale to any other model. Scaling it was super inconvenient; it wasn't made for that, it was made for one model only.
But in reality, I think I deployed more than 20 models using that prediction service, and trying to scale it was a big pain because, again, it wasn't made for it. I constantly had different issues related to the fact that I didn't really try to plan it thoroughly, because I was out of time, I had deadlines, and I felt like, okay, I'm just going to deploy this one model and I'll think later about how I will deploy other models. But later it gets harder and harder to prove that you need more time to change something, especially to redo something from scratch.
So basically, the lesson that I learned there is that it's super important to plan your deployment thoroughly. Try to communicate with teammates from other teams as well, especially if you don't really have experience in architecture or in software development. Try to talk to people who are more experienced in that area; they can help you create a great architecture, a great plan, or even take over some of the tasks that you would otherwise be doing for the first time ever and help you out with those too. Also, there is always a possibility that you will have to scale your solution. Even if you think you're going to build just one model and never add any other model to this prediction service, you will probably still have to maintain it. That means you will have to release different versions of the same model and retrain it, so you will have to add other versions to the same service. And if it's not made for that, it's going to be a huge pain for you for a long, long time. Also, it may seem that it
takes quite a lot of time to plan a good prediction service for deployment, but actually, if you do it later, it takes even more time and gets even more expensive. Like I said in my story, I deployed over 20 models using that service, and already at that time we all understood that it was the wrong way to do it, that it was inconvenient, that it was holding us back, that it wasn't great in terms of resource usage, and so on and so on. And when we started to build another solution, we basically had to throw away everything that was done before and create something completely from scratch, which took a really long time. It was especially hard since we already had a lot of models running in production and we had to keep them consistent, which was again a big task in terms of testing and implementation, and it was harder to move all the models to another service. So it may seem at the beginning that you're spending too much time on planning, but it's never too much time if it saves you from the huge headache you might have in the future.
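To make that lesson concrete, a prediction service built around a registry of models, instead of one hard-coded model, could look something like this minimal sketch (all names are hypothetical, and a real system would probably use a proper model registry):

```python
# Sketch of a service that can hold many models and versions behind one interface,
# so adding model number 21 is one call instead of a rewrite.
import pickle
from pathlib import Path

class ModelRegistry:
    def __init__(self):
        self._models = {}  # (name, version) -> loaded model object

    def register(self, name: str, version: str, path: Path) -> None:
        with open(path, "rb") as f:
            self._models[(name, version)] = pickle.load(f)

    def predict(self, name: str, version: str, features):
        model = self._models.get((name, version))
        if model is None:
            raise KeyError(f"no model {name!r} with version {version!r} is deployed")
        return model.predict(features)

# Hypothetical usage:
# registry = ModelRegistry()
# registry.register("churn", "v3", Path("models/churn_v3.pkl"))
# registry.predict("churn", "v3", features)
```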
Another story that I have is about the time when I became a team lead in a team of data scientists. I'm a perfectionist, and for me it's really hard to give away some of my tasks. At that moment, I was still participating in model development as much as all of my other teammates; I wasn't just managing the team. And that's why there were quite a lot of things which I was simply doing by myself. Because there was a lot of pressure, no time, and I was really worried about the quality of what I was doing, I was taking some things on alone, not trying to delegate them at all. And there was a time when I went on vacation, and I remember thinking it would be great to hang out with my friends in Barcelona, in the park, have fun and enjoy the weather. But on the contrary,
I had to go through the work chats and try to help my team debug some of the bugs that occurred in production, which I had to help them with because they were seeing them for the first time and had no experience with that code, because I was the one maintaining this deployment service and I didn't let anybody else touch it, since I was sure I would be the one who would do it best. And that taught me that you have to split your work somehow, mostly because you are a human being: sometimes you go on vacation, sometimes you get sick, and so on and so on. And when there is just one person responsible for something, it never ends well.
So, talking about the lessons that I learned: delegation is super important, so delegate if you can. It's more relevant for managers, I would say, than for people who just work on their own tasks. But again, for data scientists I would say it's also quite important, because we often do so much work that is usually done by software developers, by QA engineers, by DevOps, and we could just learn to cooperate with different types of people instead of trying to wear all the hats and do everything ourselves.
Another important issue here is synchronization of tasks. And honestly, it feels great when you can actually delegate something and feel like your team is working as one algorithm, I don't know, as one process. Even when you are not there, it keeps working without your participation, and that's great not just for you, but also for the product. And again, it helps you feel like you are part of the team instead of juggling everything by yourself. As I said, you can't be everywhere at once; you can't always respond to everything that is happening. So even though it's not super popular among data scientists, the models and deployment services should all be shared within the team. Another story
is about not being alone, and by that I mean not being focused only on your team of data scientists, because there are so many people with different roles around you as well. At some point we had a roadmap, it was like three months of tough work, and we had to focus on quite a lot of tasks. Other teams, like backend development and frontend development, all had their own tasks as well, and we were all on a tight schedule. And we didn't even think at that moment that, since we cooperate with the backend team only occasionally, we would have to include them and tell them that they should be doing some small things for us, and that our tasks depend on what they do. And so it happened that, since it was too late to do anything earlier, it was literally the last night before the deadline and we were all sitting together with the backend engineers trying to do everything, just because I had totally forgotten that my team and I are not working on our solution alone. We are building a part of the system, and we had to cooperate with really a lot of people, with QA engineers as well, with DevOps, and especially with the backend engineers during deployment, because our component was just one part of the system that communicated with another part of it. They needed to send us requests, we had to respond to those requests, and they had to somehow process our responses.
So after that, I think I always created a specific task in Jira which included communication with the backend engineers and planning together all the places where their help would be required. So basically, the lesson that I learned is that you shouldn't rely only on your own team. There are a lot of things that you do as part of a bigger team, and again, taking all of the roles on yourself is not efficient, and you're probably not going to be that great at them compared to getting help from someone who is more experienced. It's also important to remember that you are not alone in the company; you're not just doing your own part, which is somehow the most important one. You're all doing something together.
Also, helping other people understand what kind of work you are doing helps you as well. For example, educating your team, and not just your team of data scientists, I mean, but the whole company, helps them to come to you when something happens and ask, "what can we do for you?" Instead of being the one who forces this kind of cooperation and tries to push everyone to do something for you, you try, on the contrary, to build that kind of relationship from the start and to explain from the start what the essence of a data scientist's work is, how they can help you and how you can help them. So the next story is about data scientists and the idea that they don't need tests.
So, the prediction service that I was talking about in the previous stories: we deployed it without tests, like zero tests. When we do research, again, as I said, we don't really use tests, because we can check everything at any point. But during deployment there are so many things we need to be attentive to: not just the code itself, but also the data, how it is processed, how it is dispatched. There are so many things we should test about the state of the data.
And so what happened? It basically wasn't just one story; I had quite a lot of cases like that where we had to debug something, trying to understand what went wrong. For example, a new category was added to one of the categorical variables that we used in the models, we didn't expect that, and everything just broke down. Or, for example, you suddenly get missing values in a feature that didn't have any missing values when you trained the model, and again something breaks and you don't get any response from the model, because these cases were never handled. So the lesson that I learned is that,
basically, in my opinion, software engineers don't need tests as much as data scientists do, because we don't just work with code; we work so often with data, and it's super important that everything we do with the data, every kind of feature, is calculated the way it should be. We have to check the different cases related to the different ways we calculate the features, and there are so many things like that, as well as testing later how the distribution of the data changes, and so on. Also, it's easier to write tests than to fix bugs in production, which makes sense, because sometimes you're out of time, you have a tight schedule and you need to fix something really fast, and you don't even have logging, you don't have tests, you can't understand what is happening, you just have to debug through it. That takes quite a lot of time and a lot of nerves, and it's better to be prepared for it.
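As a rough sketch of the kind of tests I mean, here is what checks for those two failure modes, an unseen category and unexpected missing values, could look like with pytest; the feature step and the guard are made-up examples, not the real code from that service:

```python
# Pytest-style checks for data cases that break models in production.
import numpy as np
import pandas as pd
import pytest

KNOWN_CITIES = {"london", "berlin"}

def encode_city(df: pd.DataFrame) -> pd.DataFrame:
    """Example feature step: map known categories, send the rest to 'other'."""
    out = df.copy()
    out["city"] = out["city"].where(out["city"].isin(KNOWN_CITIES), "other")
    return out

def validate_features(df: pd.DataFrame) -> None:
    """Example guard: refuse to score rows with missing values."""
    if df.isna().any().any():
        raise ValueError("input contains missing values")

def test_unseen_category_does_not_break_encoding():
    df = pd.DataFrame({"city": ["london", "madrid"]})   # 'madrid' never seen in training
    assert list(encode_city(df)["city"]) == ["london", "other"]

def test_missing_values_are_rejected():
    df = pd.DataFrame({"age": [25.0, np.nan]})
    with pytest.raises(ValueError):
        validate_features(df)
```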
And another story I wanted to talk about is monitoring the models. Basically, it's about the past being in the past and us just forgetting that we have to maintain our solutions. It's interesting that even some quite big companies have this habit of not monitoring what is happening with their models. I had an experience like that: you come to a big company and they don't monitor anything for years, and they only fix something when one of the clients tells them that something is going wrong, which I think is a really bad approach. So one of the things that I did, again, the first time I deployed models, was that I didn't monitor how they performed.
I monitored my metrics when I needed to, so I would say more on request rather than constantly. And at some point my models were performing worse and I didn't understand why, because these were my first models, and I was thinking maybe I should change something about how they were built, or, I don't know, change something about the features. There were so many options for what could have gone wrong. And then I found out about such a thing as dataset shift, which happens when the data that you feed to the models changes over time. For example, patterns of user behavior change: instead of ordering some products in one sequence, users change the order in which they buy those products, and it breaks the model, but not in a way that you can see directly. That's why monitoring is crucial for so many things. It helps you understand when you need to improve your solution, or maybe change it completely, and it's kind of another way to test your model and to catch errors that are more logical errors rather than bugs in the code.
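As a rough sketch of one simple way to catch that kind of shift (an illustration, not the monitoring we actually built), you could compare the live distribution of a feature, or of the prediction scores, against the training distribution with a two-sample Kolmogorov-Smirnov test from scipy; the threshold and the data here are placeholders:

```python
# Toy drift check: flag a feature whose live distribution no longer matches training.
import numpy as np
from scipy.stats import ks_2samp

def drift_alert(train_values, live_values, p_threshold: float = 0.01) -> bool:
    """True when the two samples look like they come from different distributions."""
    _, p_value = ks_2samp(train_values, live_values)
    return p_value < p_threshold

# Made-up example: the live feature has drifted upwards, so the alert fires.
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=5_000)
live = rng.normal(0.8, 1.0, size=5_000)
print(drift_alert(train, live))  # True -> time to investigate or retrain
```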
So basically, the lesson that I learned is that monitoring is a lot about the result of your solution. We as data scientists don't build models just for the sake of building models, even though it's a lot of fun, but because we are trying to achieve something with them. And if we don't track how the models work, we never learn whether we took the right approach, whether we have to improve it somehow, or what we can do about it in general; basically, we don't really understand whether our solution is valuable enough, whether it's worth the effort that we put in. Another thing: we talked in the beginning about the process of model development, and there is this lifecycle of a model. It doesn't just include model training: the next stage is usually model deployment, and the stage after that is monitoring of the model. So basically it's
just one part of the lifecycle. The model doesn't die when you deploy it; it doesn't cease to exist. It's still there, it's working, and all the work that you did before was just a part of all that. It's not like you finished the project and that's it. You have to maintain your model over time and check, for example, how the data changes, how your model changes, what the distribution of prediction scores looks like, and so on and so on. Another thing
is that you may have tests, which is great, but they don't really catch mistakes like "the model started to work less accurately"; you don't really have tests like that. So monitoring basically helps you check the logic of your solution and catch more, I would say, data science errors rather than plain bugs in the code. And another point is that I genuinely think a data scientist's work is not just about building models and predicting something, or making something more efficient. It's more about finding answers to questions like: why is that, why can we use that, how can we improve that? And that is where monitoring helps us answer these questions. So, summing up all the stories that I told you and what I learned from them:
you should always expect and plan for the fact that your solution will have to be scaled later on. There is no situation where your solution is worse off because you already tried to prepare it for that. Even now, when I build something and think about scaling, I sometimes still make mistakes; I still see that I didn't predict some things and should have done something better, even when I already tried to plan for scaling and imagined in my head that I would add more pipelines here, or, for example, more models there, so that I would be deploying more models with this service. Also, it's super
important to work as a team, and unfortunately I don't see that it's really a thing in the data science field, because we are usually used to doing our research on our own and we don't really share tasks that much. But in my opinion, it actually helps when someone can at least review what you are doing, because sometimes we tend to focus too much on some things and miss something, or we just get stuck at some point.
Also, for managers, or probably for any data scientist, it's important to remember that you don't have to do everything. You should delegate, and there are a lot of people who can be better at it than you might think.
Another important issue is to be able
to cooperate and to work with other teams
as well as with other departments. There is no such thing as a data science team that works completely separately from everyone else and does something that isn't related in any way to the work of other teams; that's usually not true. And we especially get to cooperate quite a lot with software engineers, with QA engineers, with DevOps, so it's super important to educate your teammates in the company, to build communication, and to be ready to cooperate. Also, as I said, data scientists probably need tests even more than software engineers or anyone else, because we don't just deal with code (and usually our software engineering background isn't that strong anyway); we also deal with data, and it changes state all the time, so we have to be able to track those states. And another point
is that model monitoring is a key to understanding
whether we are doing something valuable, whether our model is
performing in a good way, whether it's
performing in an expected way, and how we can improve it. So it basically answers all of the questions for future planning, and we don't have to feel like we don't know what's happening when the business side asks us what's going on, what we can fix, why it's not working the way it should. Monitoring is something that helps us stay in control, maintain the models, and remember that their lifecycle goes on and they don't just disappear after we do our research and get the most accurate model, and that's it. So thank you for your attention, and I hope
that this presentation was useful
for you. And I hope that you're not going to make all of those mistakes. But even if you do, remember that all of them are just lessons that we learn. Thank you.