Transcript
This transcript was autogenerated. To make changes, submit a PR.
Welcome. In this session I will discuss how you can take your data science and machine learning projects from idea to production by automating your machine learning workflows with pipelines.
Before I start, I want to point out two great learning resources
to follow up on this topic after today's session.
Besides working as a developer advocate, I'm also an O'Reilly author
and Coursera instructor. The O'Reilly book Data Science on AWS, which I coauthored, discusses in over 500 pages and hundreds of code samples how to implement end-to-end, continuous AI and machine learning pipelines. Another great resource is the newly launched Practical Data Science Specialization, built in partnership with DeepLearning.AI and Coursera. This three-course specialization teaches you practical skills in how to take your data science and ML projects from idea to production using purpose-built tools in the AWS cloud, and it also includes on-demand, hands-on labs for you to practice.
So we're talking about automating machine learning.
Hmm. I have an idea.
"Alexa, deploy my model."
"Which multi-armed bandit strategy would you like to use? Thompson sampling, epsilon-greedy, or online cover?"
Well, I'm pretty sure someone already thought about developing
this Alexa skill, but unfortunately,
getting your machine learning projects ready for production is
not just about technology. A term you
will likely hear in this context of getting your ML applications
ready for production is MLOps. MLOps builds on DevOps practices that encompass people, process, and technology. However, MLOps also includes considerations
and practices that are really unique to machine learning
workflows. So while most of the time we
tend to focus on the technology, people and process
are equally, if not more important.
Let's take a look at a few key considerations in ensuring
your models and machine learning workloads have
a path to production.
First of all, the machine learning development lifecycle is very
different from a software development lifecycle.
For example, model development includes longer experimentation
cycles compared to what you would typically see
in an agile software development process.
You need to consider choosing the right datasets, performing data transformations, and feature engineering.
So besides the actual model training code, you also
need to develop the data processing code.
Next, the model is typically only a small part
of an overall machine learning solution, and there are often
more components that need to be built or integrated.
For example, maybe the model needs to be integrated into an existing
application to trigger further processing depending on the prediction results. This leads to the next
consideration.
There are typically multiple personas involved in the
machine learning development lifecycle, often with competing needs and
priorities. A data scientist might feel comfortable building a model that meets the expected
model performance metrics, but might not know how
to host that model in a way that it can be consumed
by another system or application.
This part might require a DevOps engineer or the infrastructure
team. You also need to integrate
the projects with existing IT systems and practices, such as change management. This could mean that, as part of
the pipelines, you automatically open a change ticket
anytime a new model is ready to get deployed
into production. Or you might want to
add a manual approval step in your pipeline before deploying any model into production.
If we look at the goal of MLOps, you want to move away
from manually building models, which is often still
the status quo. In the first phase, you can accelerate the path to production by building and managing pipelines, instead of building and managing individual models. You also want
to improve the quality of deployed models. To do
this, you need to be able to detect model decay,
maybe due to a drift in the statistical data distributions. You should also monitor
the models for any drifts in bias or explainability
from a set baseline. This can be accomplished
in a second phase, and ultimately
this should lead into building AI ML solutions
that are resilient, secure, performant,
operationally efficient, and cost optimized.
Let's have a closer look at each phase.
Today we often still manually build and manage individual
models. We also execute
each step in the model development workflow individually.
Here is an example workflow where a data engineer may create
a raw data set and manually send it to a data scientist.
Then the data scientist iteratively performs data preparation and feature engineering, and runs multiple experiments until a trained model is actually performing well according to the objective metrics. Then the data scientist
may hand it off to a deployment team or an
ML engineer who is then responsible for deploying the model.
If there has been limited communication between teams,
this part could result in a lot of delays because the model is
essentially not transparent to the deployment engineer or
the DevOps team, meaning there is limited visibility into
how the model is built or how you consume that model.
Then a software engineer potentially needs to make
changes to the application, which consumes that
model for prediction. And finally,
someone ultimately needs to operate the model in production,
which includes making sure the right level of monitoring is
set up. We can see the challenges
in this setup. The workflow includes multiple handoffs between teams and personas who might not all be familiar with machine learning workloads. Limited cross-team collaboration could lead to limited visibility and transparency, causing increased code rework, and ultimately slows down the ability to
get the model to production quickly. So what can we do? In a first phase, we can improve the situation by orchestrating the individual steps as a pipeline. We can also look at automating tasks in each step. For example,
we can build a model training pipeline that orchestrates the data preparation, model training, and model evaluation steps. We could also build a deployment pipeline which grabs a model from a model registry and
deploys it into a staging environment.
The software engineers could then use this model to
run unit or integration tests before approving
the model for production deployment.
Let's see a demo of a model training pipeline.
All right, here I am in my AWS demo account,
and I want to show you how you can leverage Amazon SageMaker Pipelines to automate the individual steps of building a machine learning model. Amazon SageMaker is
a fully managed service that helps you build, train,
tune, and deploy your machine learning models.
All right, so the use case we're going to build is
I want to train a natural language processing model to classify product reviews. So I'm going to pass in raw product review text, for example, "I really enjoyed reading this book," and my NLP model should classify this into a star rating. So in this case, hopefully a star rating of five, the best, or a star rating of four, three, two, or one, with one being the worst.
And the way we're doing this, I'm going to use a pre-trained BERT model. BERT is a very popular model architecture in the NLP space, and what you can leverage is actually pre-trained models that have been trained on millions of documents already, for example Wikipedia. And then you can fine-tune it to your specific dataset, which I will do in the training step in this pipeline, fine-tuning it to my specific product review text.
So the DAG we're going to build here first processes the raw text data to generate the embeddings that the BERT model expects as inputs. Then in the training step, I'm going to fine-tune the model to my dataset, and I'm also evaluating the model performance. So in this case, I'm checking the validation accuracy of my model, and I'm defining a threshold, a condition. And if my model performs above this threshold, then I'm going to register it in a model registry. Think of it as a catalog of your models to compare the different versions. And I'm also preparing for deployment by creating a model object here in SageMaker. All right,
so let's see how we can build this.
First of all, here I'm importing a couple of SDKs and libraries. One of them is the SageMaker Python SDK, and a couple of additional libraries used here in the
AWS environment. All right, first of all,
I'm going to import or set the location of the raw dataset, which is my reviews data here, hosted in a public S3 bucket.
And what I'm going to do here is I'm pulling a subset of the
data just for demo purposes here into my own AWS account.
So I'm setting a path here to my own bucket,
and I'm going to pull in just a subset here so the model training doesn't
run for too long, and I'm just pulling in three different categories
of the product reviews data.
All right, let's start building the actual pipeline.
So first of all, I'm creating a name for the pipeline.
Let's call this my BERT pipeline, and add a timestamp.
And then one nice thing about pipelines is that you
can define parameters to parameterize individual executions.
And now let's start with building the first step, which is the
feature engineering. Here's a little bit of explanation of what we're going to do. So my raw dataset here on the left has star ratings and the review text. So for example, "this is a great item" or "I love this book," and the corresponding label, which is the star rating. And what I'm going to do in this first feature engineering step is use a SageMaker processing job, which helps me to execute code on data, so it's specifically suited if you want to run feature engineering. And I'm converting this raw input data into embeddings that the BERT model expects as inputs. All right, so what
I'm going to do here again, I'm making sure I have access
to my input data, which is now in my own bucket.
And I start by preparing a couple of those parameters which I
want to be able to parameterize. So one
is definitely where to find the input data in case the location changes,
or I want to use a different data set.
I'm also specifying the processing job instance count.
So what I could do, depending on how much data I need to process,
I can run this distributed, so I can run it across one
AWS cloud instance. I could run it across two instances, five instances,
et cetera. And it's as easy as just setting the value to one
or five or ten. And the processing job will make sure to
distribute the data across the instances and work in parallel.
In this case, I have a small subset of the data, so I'll stick to
one instance. I can also specify here
the instance type. So this is the AWS EC2 instance type managed by SageMaker to run this processing job,
and I'm just specifying one particular instance type here.
Then I'm also defining a couple of parameters, which my feature
engineering script requires.
For example, I could set parameters such as whether I want to balance the dataset beforehand, and the percentages for how to split the data into a training, validation, and test dataset. So in this case, I'm taking 90% for my training data, and I keep a split of 5% for validation and another 5% for test.
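As a rough sketch, this is roughly what the pipeline name and those execution parameters can look like with the SageMaker Python SDK; the S3 path, instance type, and split values below are illustrative placeholders, not the exact values from the demo.

```python
import time

from sagemaker.workflow.parameters import (
    ParameterFloat,
    ParameterInteger,
    ParameterString,
)

# Unique pipeline name per demo run
pipeline_name = "BERT-pipeline-{}".format(int(time.time()))

# Placeholder S3 location of the raw reviews data (adjust to your bucket)
raw_input_data_s3_uri = "s3://<your-bucket>/amazon-reviews/raw/"

# Parameters that can be overridden for each pipeline execution
input_data = ParameterString(name="InputData", default_value=raw_input_data_s3_uri)
processing_instance_count = ParameterInteger(name="ProcessingInstanceCount", default_value=1)
processing_instance_type = ParameterString(name="ProcessingInstanceType", default_value="ml.c5.2xlarge")
train_split_percentage = ParameterFloat(name="TrainSplitPercentage", default_value=0.90)
validation_split_percentage = ParameterFloat(name="ValidationSplitPercentage", default_value=0.05)
test_split_percentage = ParameterFloat(name="TestSplitPercentage", default_value=0.05)
```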
All right, then the step actually needs to perform the
feature engineering. So what I've done in preparation is
I wrote a Python script which performs the actual transformations,
and this is here in this Python file.
So I'm not going to go into all of the gory details. If you're curious
how to do this, have a look at the GitHub repo.
All right, then I can start creating this
processing job, and for that I'll
define a processor. Here I'm using a prebuilt processor based on scikit-learn. I'm defining the framework version, passing in my IAM role, the instance type and the instance count, and also the region I'm operating
in. And I need one more thing,
because before I wrap it into the official workflow and pipeline step, I need to define the inputs and the outputs for the job. The input here is the raw input data, and the outputs are S3 folders where the generated features will be stored. And I'm also going to split this again into training, validation, and test datasets. So I'm putting here the three locations where the data will be written to, and those internal container paths will get mapped to an S3 location later. All right,
and with that I can define the official step as part of
my pipeline. So here you can see I'm defining a
processing step. I'll give it a name; this is what you saw in the DAG. I point to the actual Python code to execute my feature engineering, and then I'm passing in the scikit-learn processor which I defined, the inputs, and the outputs. And here I'm passing the specific job arguments that my script requires, so for example, the training, validation, and test split percentages, et cetera. All right, this defines my processing step.
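A minimal sketch of that processor and processing step, assuming the parameters from the earlier sketch and an IAM execution role; the script name and container paths are illustrative.

```python
import sagemaker
from sagemaker.processing import ProcessingInput, ProcessingOutput
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.workflow.steps import ProcessingStep

role = sagemaker.get_execution_role()  # IAM role with SageMaker permissions

# Prebuilt scikit-learn processor that runs the feature engineering script
processor = SKLearnProcessor(
    framework_version="0.23-1",
    role=role,
    instance_type="ml.c5.2xlarge",
    instance_count=1,
)

processing_step = ProcessingStep(
    name="Processing",
    processor=processor,
    code="preprocess-scikit-text-to-bert.py",  # feature engineering script (name assumed)
    inputs=[
        ProcessingInput(
            input_name="raw-input-data",
            source=input_data,  # the InputData pipeline parameter
            destination="/opt/ml/processing/input/data/",
        )
    ],
    outputs=[
        ProcessingOutput(output_name="bert-train", source="/opt/ml/processing/output/bert/train"),
        ProcessingOutput(output_name="bert-validation", source="/opt/ml/processing/output/bert/validation"),
        ProcessingOutput(output_name="bert-test", source="/opt/ml/processing/output/bert/test"),
    ],
    job_arguments=[
        "--train-split-percentage", "0.90",
        "--validation-split-percentage", "0.05",
        "--test-split-percentage", "0.05",
    ],
)
```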
Now let's move on to the second step, which is fine-tuning the model with the help of a SageMaker training job. And this is pretty similar. So here you can see again I'm defining parameters, for example, the training instance type and count. And then I'm setting up the hyperparameters, which will depend on the model you're using and the use case. So in my case, some general parameters: the number of epochs, which is the number of runs through the whole dataset, and I'm just going to keep this at one for this demo purpose. I'm setting a learning rate and then additional values, for example, the epsilon value, the train batch size, the validation batch size, et cetera. Again, this will highly depend on the type of model and use case you are training. All right,
next, what I'm going to do is I'm also going to capture
the performance of my model during training.
So I can specify here regex expressions that match what the model training code will output in the logs. So for example, my script will output the validation loss and the validation accuracy, and I'm using those regex expressions to capture them from the logs. They are then also available in the UI and for me to check later on in the evaluation step.
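Those metric definitions are just name/regex pairs; a small sketch of what they might look like, assuming the training script prints Keras-style `val_loss` and `val_accuracy` lines to the logs.

```python
# Regexes SageMaker uses to scrape metrics from the training job logs.
# The patterns must match whatever your training script actually prints.
metrics_definitions = [
    {"Name": "train:loss", "Regex": r"loss: ([0-9\.]+)"},
    {"Name": "train:accuracy", "Regex": r"accuracy: ([0-9\.]+)"},
    {"Name": "validation:loss", "Regex": r"val_loss: ([0-9\.]+)"},
    {"Name": "validation:accuracy", "Regex": r"val_accuracy: ([0-9\.]+)"},
]
```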
All right, and I've talked about the training script.
So again, I've prepared a Python file which contains the
code to train my model. This is here in the tf_bert_reviews.py file. And again, I'm not going into details. If you're interested in seeing how to do this, in particular how to use this pre-trained Hugging Face model and then just fine-tune it to the data, please check out
the code in the GitHub repo. All right,
so we can now prepare the training estimator and
then build the model training step. So first
of all, I actually need to define the estimator which performs
the training. And I'm using a built-in TensorFlow estimator with SageMaker, which has optimizations to run TensorFlow on AWS. And as you can see here, I'm defining this TensorFlow estimator, pointing to the training script which I just highlighted, and also passing in the additional parameters: the role, instance count and type, the Python version, and the TensorFlow framework version I want to use. And again, my hyperparameters, which I defined earlier.
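A sketch of that estimator, assuming the role and metric definitions from the earlier snippets; the framework versions, instance size, and hyperparameter values are illustrative only.

```python
from sagemaker.tensorflow import TensorFlow

estimator = TensorFlow(
    entry_point="tf_bert_reviews.py",        # training script shown above (name assumed)
    role=role,                               # IAM role from the earlier snippet
    instance_count=1,
    instance_type="ml.c5.9xlarge",
    py_version="py37",
    framework_version="2.3.1",
    hyperparameters={
        "epochs": 1,
        "learning_rate": 0.00001,
        "epsilon": 0.00000001,
        "train_batch_size": 128,
        "validation_batch_size": 128,
    },
    metric_definitions=metrics_definitions,  # regexes defined earlier
)
```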
One more thing that I'm actually going to do is activate step caching. With step caching, what you can do is make sure that, if you're rerunning the pipeline and individual steps have not changed, the previous results are reused. So SageMaker will apply this caching to help you accelerate and run the different executions more efficiently. All right, and with that, I can define the training step.
So here you can see I define the official training step. I give it a name; again, this name will appear in the DAG as you could see before. Then I'm passing in this estimator that has all of the TensorFlow and training script configuration, and I'm also passing in the inputs. And here you can see I'm referring to the previous processing step output and using it as an input for the training. So those are the features that I generated for BERT training, for BERT validation, and for BERT test.
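Putting the caching and the training step together might look roughly like this; the channel names and output names assume the processing step sketched earlier.

```python
from sagemaker.inputs import TrainingInput
from sagemaker.workflow.steps import CacheConfig, TrainingStep

# Reuse results of unchanged steps on re-runs; cache entries expire after one hour here
cache_config = CacheConfig(enable_caching=True, expire_after="PT1H")

training_step = TrainingStep(
    name="Train",
    estimator=estimator,
    inputs={
        # Feed the features generated by the processing step into training
        "train": TrainingInput(
            s3_data=processing_step.properties.ProcessingOutputConfig.Outputs["bert-train"].S3Output.S3Uri
        ),
        "validation": TrainingInput(
            s3_data=processing_step.properties.ProcessingOutputConfig.Outputs["bert-validation"].S3Output.S3Uri
        ),
        "test": TrainingInput(
            s3_data=processing_step.properties.ProcessingOutputConfig.Outputs["bert-test"].S3Output.S3Uri
        ),
    },
    cache_config=cache_config,
)
```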
All right, after the training, there comes the model evaluation.
So let's see how I can do this.
The model evaluation I can also execute as a processing
job again. So I'm going to use this scikit-learn processor again, specifying the framework version, instance type, and count. The difference here is that I do have another script to execute. So instead of the feature engineering, I've now written a script that evaluates the model performance, and again, here is a link to the Python script. Basically, what I'm going to do is use the pre-trained and fine-tuned model from the previous step, and I'm running some test predictions and seeing how the performance is. So in this case, I'm specifically looking at the validation performance of my model.
All right, and the results get written into a JSON file, which I call the evaluation JSON; this is the official evaluation report from the step. And here's the official definition. So this is actually implemented as another processing step; in this case, I call it "evaluate the model." I'm pointing to this new script which runs the validation. And again, I'm pointing it here to the inputs, which in this case are the fine-tuned model, the model artifact from the training step, and also, again, input data which I can use to run the validation. And I'm also pointing to an output location to store the evaluation results.
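A sketch of the evaluation step, reusing the objects from the earlier snippets; the evaluation script name, the report file name, and the container paths are assumptions.

```python
from sagemaker.processing import ProcessingInput, ProcessingOutput
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.workflow.properties import PropertyFile
from sagemaker.workflow.steps import ProcessingStep

evaluation_processor = SKLearnProcessor(
    framework_version="0.23-1",
    role=role,
    instance_type="ml.m5.xlarge",
    instance_count=1,
)

# The evaluation script writes evaluation.json; the PropertyFile lets
# later steps (the condition) read values out of that report.
evaluation_report = PropertyFile(
    name="EvaluationReport",
    output_name="metrics",
    path="evaluation.json",
)

evaluation_step = ProcessingStep(
    name="EvaluateModel",
    processor=evaluation_processor,
    code="evaluate_model_metrics.py",  # evaluation script (name assumed)
    inputs=[
        ProcessingInput(
            source=training_step.properties.ModelArtifacts.S3ModelArtifacts,  # fine-tuned model artifact
            destination="/opt/ml/processing/input/model",
        ),
        ProcessingInput(
            source=input_data,  # raw data used for the test predictions
            destination="/opt/ml/processing/input/data",
        ),
    ],
    outputs=[
        ProcessingOutput(output_name="metrics", source="/opt/ml/processing/output/metrics"),
    ],
    property_files=[evaluation_report],
)
```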
All right, the official model metrics are defined in an object, which I do here, which contains the evaluation JSON file. All right,
and the last part now is to define the
condition step, checking whether my model meets the expected quality gate. And if yes, then register the model in the model registry and also prepare it for deployment. And what I'm going to do here is first create those steps that come afterwards, and then I can reference them in the actual condition step. So let's see this.
And one nice thing that you can do with SageMaker Pipelines is to actually set a model approval status. So when you get a model, you evaluate it, and you can specify whether it has, for example, a pending manual approval, so somebody has to look at the metrics and then approve it for deployment. You can set it to always approved, but in this case, to show you how it works, I keep it at manual approval only. All right, then I'm
going to define again the instance types and counts, where to later
deploy my model and host it for live predictions.
And I'm also specifying the model package group, which is
registered in the model registry. And I'm
defining an image that is used to later deploy
the endpoint and then run the model and the
inference code. So in this case, it's going to be a TensorFlow-based Docker image again. All right, so here is the step that registers
the model. It's taking the estimator object and
the information about the inference image to use.
It actually points to the S3 location of the fine-tuned model, and it also defines specific input format types. For example, you know that your model expects JSON Lines as input, and the response is also going to be in JSON Lines. This might vary, of course, depending on your model. And you also set here the model package group to register it with, and the approval status. This one will be pending manual approval.
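This registration step could be sketched roughly as follows, assuming the estimator and training step from before; the inference image URI, instance types, package group name, and the S3 path of the evaluation report are placeholders.

```python
from sagemaker.model_metrics import MetricsSource, ModelMetrics
from sagemaker.workflow.step_collections import RegisterModel

# TensorFlow inference image, e.g. looked up via sagemaker.image_uris.retrieve(...)
inference_image_uri = "<tensorflow-inference-image-uri>"

# Attach the evaluation report as the official model metrics
model_metrics = ModelMetrics(
    model_statistics=MetricsSource(
        s3_uri="s3://<your-bucket>/<evaluation-output-prefix>/evaluation.json",  # placeholder
        content_type="application/json",
    )
)

register_step = RegisterModel(
    name="RegisterModel",
    estimator=estimator,
    image_uri=inference_image_uri,
    model_data=training_step.properties.ModelArtifacts.S3ModelArtifacts,
    content_types=["application/jsonlines"],
    response_types=["application/jsonlines"],
    inference_instances=["ml.m5.4xlarge"],
    transform_instances=["ml.m5.4xlarge"],
    model_package_group_name="BERT-Reviews",  # placeholder group name
    approval_status="PendingManualApproval",
    model_metrics=model_metrics,
)
```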
All right, then what I'm also going to do here is create a
step to prepare the model for deployment later.
So I'm preparing a model object in SageMaker, again passing in the inference image and also the model artifact. All right, and then I define here the official create model step for the pipeline and pass in the model which I just created.
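And the create-model preparation could be sketched like this, reusing the inference image and training step output from above; the instance type is just an example.

```python
from sagemaker.inputs import CreateModelInput
from sagemaker.model import Model
from sagemaker.workflow.steps import CreateModelStep

# Wrap the fine-tuned artifact and the inference image into a SageMaker Model object
model = Model(
    image_uri=inference_image_uri,
    model_data=training_step.properties.ModelArtifacts.S3ModelArtifacts,
    role=role,
)

create_model_step = CreateModelStep(
    name="CreateModel",
    model=model,
    inputs=CreateModelInput(instance_type="ml.m5.4xlarge"),
)
```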
So now we can start creating the condition check that comes before those steps. So what I'm going to do here is import the conditions
and corresponding functions that are available. For example,
a condition greater than or equal to, and I'm
defining a minimum accuracy value I want to check
against. As this is a demo and I'm just training on a little bit of
data, I'll keep this low so all the model training runs will actually
pass. But obviously in other use cases,
you definitely want to bump up the accuracy threshold. In this case, I'm using
20% as my minimum accuracy to check against.
Then here is the definition. So it's going to execute this check, condition greater than or equal to. And here is my evaluation step, and I'm also pointing to the report file that gets generated and to the threshold value I created. The official step is then running this condition check. And if I meet the condition, so if I'm passing my quality threshold, I'm going to register the model, and I'm also going to create the model in preparation for later deployment. In the else branch, you can say what to do if I fail the test, right, send a message to the data scientist or whatever you want to do. In my case, I'm just keeping it empty, so it will fail and end the pipeline.
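A sketch of that condition step, assuming the evaluation step and property file from before; note that the JSON path depends on how the evaluation script structures its report, and in newer SDK versions JsonGet lives in sagemaker.workflow.functions and takes a step name instead of a step object.

```python
from sagemaker.workflow.condition_step import ConditionStep, JsonGet
from sagemaker.workflow.conditions import ConditionGreaterThanOrEqualTo

# Compare the accuracy from evaluation.json against the minimum threshold
minimum_accuracy_condition = ConditionGreaterThanOrEqualTo(
    left=JsonGet(
        step=evaluation_step,
        property_file=evaluation_report,
        json_path="metrics.accuracy.value",  # path inside the report (assumed)
    ),
    right=0.20,  # intentionally low threshold for the demo
)

condition_step = ConditionStep(
    name="AccuracyCondition",
    conditions=[minimum_accuracy_condition],
    if_steps=[register_step, create_model_step],  # register and prepare for deployment
    else_steps=[],  # nothing on failure; the pipeline simply ends
)
```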
So what we've done now is define each and every step, from the preprocessing, to the training, to the condition and preparation for deployment if I pass my quality threshold. So now I can wrap this in the end-to-end pipeline definition. First of all, again, I'm importing some of the needed functions and objects from the SDK. And as you can see here, I'm now creating the official pipeline object, passing in the name that we created and all of the parameters which I specified in the above code. And then the steps here will actually line up the individual steps in this DAG. So let's start with the processing, then move to the training step, do the evaluation step, and then you have this condition step to evaluate the model, which will in itself trigger the two different paths depending on whether I pass the quality check. I'm also adding this to official experiment tracking, so I can keep track of my pipeline runs, and that's the definition of my pipeline. And then I can submit the pipeline for execution.
So I'm calling the pipeline create, passing a role that has the permissions to execute everything. And then what I can do is call the pipeline start, which will start an individual execution run of this pipeline. And you can see here you can also pass in your parameter values now, and that will kick off the pipeline in the background. This will now run on AWS with the SageMaker processing job, the training job, and also the model evaluation.
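The end-to-end definition, creation, and start might look roughly like this, using the steps and parameters sketched above.

```python
from sagemaker.workflow.pipeline import Pipeline

pipeline = Pipeline(
    name=pipeline_name,
    parameters=[
        input_data,
        processing_instance_count,
        processing_instance_type,
        train_split_percentage,
        validation_split_percentage,
        test_split_percentage,
        # ...plus any training and deployment parameters
    ],
    steps=[processing_step, training_step, evaluation_step, condition_step],
)

# Register the pipeline definition, then kick off one execution
pipeline.create(role_arn=role)
execution = pipeline.start(
    parameters={"InputData": raw_input_data_s3_uri}  # override a parameter for this run
)
# execution.wait()        # optionally block until the run finishes
# execution.list_steps()  # inspect the status of each step
```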
And what I also want to show you is that SageMaker keeps track of the individual artifacts that are generated in each step. So what I'm doing here is listing all artifacts generated by this pipeline, and this is super helpful if you want to keep track of the individual steps. For example, for the inputs of the processing job, I do have the raw data and I do have the Docker image that's used to process the data. And the output is the generated features, and you can see here whether each artifact contributed to or was produced by the step.
Then in the training step, you have the generated training features as the input and the image to
execute the training job. And the output here would be
the model artifact. So really nice to keep track of
those artifacts for each step. And with that,
we can come back to actually checking on our pipeline.
So I'm going to go back here into the pipelines. You can check on the graph, and this is the overall graph that we defined, so it matches our steps. You can check the parameters again and the settings you set. You can click into the individual graph, and the color coding here shows green, meaning it completed successfully. You can
click into each step and again see the parameters here
that were used to run and execute the step.
All right, what I also want to show you is the model registry.
So let me go here in the navigation also to the model
registry. And we do see here our model group, the BERT reviews, and here is my model version. And again, I've set this to manual approval, so this one here will still need my approval to be deployed into production. I can update the status, set it here to approved, say "this is good for deployment into staging," and hit update. And with that, the model is now approved for deployment, and this completes the first demo.
In the second phase, we could automatically run pipelines
and include automated quality gates.
So here the model training pipeline could automatically
evaluate the model in terms of model performance
or bias metrics and thresholds.
Only models that fall into acceptable performance
metrics get registered in the model registry and
approved for deployment. The deployment pipeline
could leverage deployment strategies such as A/B testing
or bandits to evaluate the model in comparison
to existing models. The software engineer can
automate the integration tests with pass/fail quality gates, and only models that pass get deployed into production. And finally,
the operations team sets up model monitoring and analyzes it for any data drift or model drift. If the drift violates defined threshold values,
this could actually trigger a model retraining pipeline.
Again, another trigger to rerun a model
training and deployment pipeline could be code changes
as part of continuous integration and continuous delivery, CI/CD for short, automation. Let's see another demo of a code-change-triggered pipeline run. All right, I'm back in my AWS demo account. Now I want to
show you how you can leverage SageMaker Projects to automate workflow pipeline runs. SageMaker Projects helps you to set up all of the automation needed in the AWS account to build a continuous integration, continuous deployment workflow.
The easiest way to get started is if you navigate here
in the menu to projects and then
create project. You can see that it already comes with a
couple of prebuilt templates which you can use.
So one template will set up all of the automation for a model building and training pipeline. The other one is for model deployment. And what I've pre-provisioned here is a CI/CD environment for model building, training, and the actual model deployment. I've already pre-built everything, so let me walk you through the steps here programmatically. A SageMaker project is based on an AWS Service Catalog product, so the first step is to enable SageMaker Projects here in the Studio environment.
Then I'm again importing all of the needed SDKs and libraries and setting up the clients, and I'm doing this here programmatically. But again, if you're creating the project through the UI, this is done for you. I've provisioned the Service Catalog item and the template here programmatically through my notebook.
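If you create the project programmatically rather than through the Studio UI, the call might look roughly like this with boto3; the project name and the Service Catalog product and provisioning artifact IDs below are placeholders that come from your own account.

```python
import boto3

sm = boto3.client("sagemaker")

# Create a SageMaker project from an MLOps template provisioned via AWS Service Catalog
response = sm.create_project(
    ProjectName="bert-reviews-mlops",  # placeholder project name
    ProjectDescription="Model build, train, and deploy project",
    ServiceCatalogProvisioningDetails={
        "ProductId": "prod-xxxxxxxxxxxxx",             # placeholder product ID
        "ProvisioningArtifactId": "pa-xxxxxxxxxxxxx",  # placeholder template version ID
    },
)
print(response["ProjectArn"])
```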
Once the project is created,
you can see here on the left that I do have this entry now for projects, and I can select it. And what will happen here in the first step is that SageMaker Projects will create two CodeCommit repos for you in the AWS account you're using. So here you can see I do have two repos, one for the model build and training pipeline, and one for the model deployment pipeline. And all of the automation is set up, so whenever I commit code or push code to those repos, that will actually trigger a pipeline execution. And I'll show you what this looks like. So, back here in my main notebook, the first thing I'm doing in this notebook is to clone those CodeCommit repos locally into my SageMaker Studio environment.
So this is what I'm running here for both code repos.
I'm also removing some of the sample code that is in here. And I'm triggering the first execution of the pipeline by actually copying over my sample pipeline that I built before in the demo. So if I go to the file browser here on the left, you can see I've cloned down those model-build and model-deploy CodeCommit repos, and I can click into those. And I do have all of the needed code already set up, and all of the triggers in the AWS account. So what I can do here is, in pipelines, I just need to add my own pipeline execution code that I want to run on the code change. So if I go in here, you will see I'm using the exact same Python scripts for preprocessing, for model training, and for inference that I've shown you before when I was manually building this pipeline, especially the pipeline.py file. If I open this one, it should look pretty familiar to you. Hopefully, this contains exactly the code that I've been showing you before when I was building the pipeline in the notebook. The only difference here is that I'm now running this in the Python file programmatically when I trigger the pipeline run. Back to my notebook
here. So what I've done is clone the repos here, and I'm copying over my sample code into those repos, and then I'm committing them into CodeCommit. And you can see here it detected that I have removed some sample code, and I've also added my own pipeline code. So that should hopefully be enough changes to the repo. I'm making sure I keep track of all the variables I'm using here. And again, here's the pipeline.py file, which I just showed you, which contains the pipeline code to run, and this is exactly what I have set up before.
And when I did the code commit and the code push to the repo, this triggered all of the CI/CD automation that the template set up for you in the AWS account and started the pipeline run. So if I go down here into my projects again and click on pipelines, I can actually see that here is a pipeline that got started, and I can select it, and it has a succeeded execution run already. I started this some time before the session. All right, so let me show
you how this automation works. So what happens
is that projects integrate with the developer tools, for example with CodePipeline, and have this automation built in. So the first step, as you can see here, is getting the source code needed, and then it's going to build and start the pipeline execution run. And this is done with CodeBuild. So if I now jump into the build projects, you can see here that our build projects are already in place. And the first one is the model training pipeline, which just succeeded. You can also see that the model deploy is currently in a failed state, and this is because it doesn't have an approved model yet to actually deploy. This template is also set up to have a two-phase deployment.
Phase one is deploying in a staging environment,
which for example a data scientist could approve after
evaluating the model. And then also it comes with a
second stage to deploy into a production environment, which would most likely be approved by another team, for example the integrations team, the DevOps team, or the infrastructure team. So I can click here into the CodePipeline, and I can see that my latest run here succeeded. The first one is the one I stopped, coming from the sample code. So let's go back
here to my environment.
So once the pipeline has executed, I can list the steps again here, and I can see everything looks good. I can also list the artifacts again, and this looks familiar to the one I showed before; it's the exact same pipeline, with all the artifacts that contributed to each step. What I do have here now is a last step: an approval that is needed to actually deploy this model into the staging environment.
And if you remember, in the previous demo I showed you how to approve the model through the Studio UI. What I can do here as well is to approve it through the API programmatically, and this is what I'm going to do here. So, in here, I'm looking for the executions and I'm grabbing the model package ARN where we registered the model. And then I'm going to update the model package here and approve it for deployment into this first stage, which is the staging environment.
So here I'm going to update the model package, setting the status to approved, and then I can check here for the model name, and we'll see that the model starts to get deployed into the staging environment.
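A sketch of that programmatic approval with boto3, assuming the model package group name created by the project; approving the latest package is what kicks off the staging deployment.

```python
import boto3

sm = boto3.client("sagemaker")

# Find the latest model package the pipeline registered in the group (name is a placeholder)
packages = sm.list_model_packages(
    ModelPackageGroupName="bert-reviews-mlops-models",
    SortBy="CreationTime",
    SortOrder="Descending",
)
model_package_arn = packages["ModelPackageSummaryList"][0]["ModelPackageArn"]

# Approving the package triggers the deploy pipeline for the staging environment
sm.update_model_package(
    ModelPackageArn=model_package_arn,
    ModelApprovalStatus="Approved",
)
```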
Let's see the deployment pipeline. So what happens here is
once I've approved the model for staging,
it actually started the second pipeline here, which is the model deploy.
And you can see here it started building the source, and it's
currently in progress deploying the model into the defined staging environment.
I can also have a look here. So I'm looking at the endpoint that this pipeline will set up. If I click here, I will see now that there is an endpoint being created in SageMaker for the staging environment. It will take a few minutes for the endpoint to be ready. All right, the endpoint is now in service. And if I click in here, I can see the REST API I could call to get predictions from my model. Now, let's check
this. In the notebook here, you can see the endpoint is in
service. And what I do here is create a TensorFlow predictor object, and then I'm going to pass in some sample reviews. Let's say "This is great!", and I can run this, pass it to the predictor, and you can see I get a prediction result back from my model deployed in the staging environment, predicting this is a five-star rating.
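Invoking the staging endpoint from the notebook might look roughly like this; the endpoint name is a placeholder, and the JSON Lines request format depends on what the inference script expects.

```python
from sagemaker.deserializers import JSONLinesDeserializer
from sagemaker.serializers import JSONLinesSerializer
from sagemaker.tensorflow.model import TensorFlowPredictor

predictor = TensorFlowPredictor(
    endpoint_name="<staging-endpoint-name>",  # endpoint created by the deploy pipeline
    serializer=JSONLinesSerializer(),
    deserializer=JSONLinesDeserializer(),
)

# Send raw review texts and get back predicted star ratings
reviews = [
    {"features": ["This is great!"]},        # request shape assumed by the inference script
    {"features": ["This is not good."]},
]
predictions = predictor.predict(reviews)
print(predictions)
```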
Let's have a look at the CodePipeline that we executed.
So you can see here that the staging
succeeded. But there is one more approval
needed here for the actual deployment into a
production environment. So this could really be something
that another team handles. So if I'm the DevOps engineer or the integration engineer, I could make sure I'm running all of the tests that I need with this model, now that it is hosted in the staging environment. And if I agree that it's good to be deployed into production, I could either use CodePipeline here to approve the model and deploy to production, or I can also do this programmatically, which is what I'm doing here now in the notebook. So again, I review the pipeline, and what I'm doing here is the exact same thing: I'm programmatically approving this for deployment into production.
You can see this succeeded. Let's actually check our pipeline, and there we go. You can see it took the approval and is currently working on deploying this model into the production environment. As pointed out earlier, the ultimate goal is to build AI/ML solutions that are secure, resilient, cost-optimized, performant, and operationally efficient. So in addition to the operational excellence which we discussed in the context of MLOps, you also need to incorporate standard
practices in each of these areas.
Here are a few links to get you started. First,
a link to the Data Science on AWS resources and the GitHub repo which contains all of the code samples I've shown. Also, here are links to Amazon SageMaker Pipelines and a great blog post. Again, if you are looking for more comprehensive learning resources, check out the O'Reilly book Data Science on AWS, which covers how to implement end-to-end, continuous AI and machine learning pipelines in over twelve chapters, 500 pages, and hundreds of additional code samples. Another great training resource is our newly launched Practical Data Science Specialization, in partnership with DeepLearning.AI and Coursera. This three-course specialization teaches you practical skills in how to take your data science and machine learning projects from idea to production using purpose-built tools in the AWS cloud. It also includes on-demand, hands-on labs for you to practice.
This concludes the session. Thanks for watching.