Transcript
Good morning, good afternoon, good evening, wherever you are in our virtual world. My name is Ron Dagdag. I'm a lead software engineer at Spacee. Today I will be talking about leveraging the power of machine learning with ONNX. For all you Pokemon fans out there, I will not be talking about the Pokemon Onix, nor will I be talking about the mineral onyx. I will be talking about ONNX, the Open Neural Network Exchange. All right, let's go back to basics.
What is programming? Traditionally, you have an input, you write an algorithm, you combine them together, run it, and it spits out answers for you. In the machine learning world, you have an input, you have examples of what the answers would be, and the computer's goal is to provide an algorithm for you. So as a primer: you have traditional programming on one side and machine learning on the other. You still have your input, your answers, and your algorithm. In the machine learning world, we call the input and the answers your training data. You use a training framework in order to get a machine learning model. Then you use that model in your application, and that's what we call inferencing: you use a runtime to process your input against your model, and it gives you the answers. And now that you have more answers, they can be a good feedback loop to improve your training data.
So typically data scientists would create a program in PyTorch and run it locally on their machine using the CPU. And of course, if you are a JavaScript developer, you've seen all these different JavaScript frameworks and all these different ways you can create applications or web applications. It's the same way in the machine learning world: there are all these different machine learning training frameworks that you can use, and the ecosystem is growing.
And of course, we're not just limited to deploying locally on our devices, on our laptops. Sometimes you have to deploy to a phone or to the cloud. Sometimes you want better performance, so you run it on a GPU, an FPGA, or an ASIC. You might also want to run it on a microcontroller.
And that's when ONNX comes into the picture. ONNX is the bridge between how you train and where you deploy. ONNX is short for Open Neural Network Exchange. It is an open format for machine learning models. Notice that it's not just limited to neural networks; it's also capable of representing your traditional machine learning models. It is on GitHub at github.com/onnx, and the best place to learn more about ONNX is through onnx.ai. When you go to this website, you'll notice that every time I go there, there are new partners coming in to improve the ecosystem.
It started as a partnership between Microsoft and Facebook, and I've noticed that more and more partners are joining. On GitHub it has about 11,000 stars, almost 2,000 pull requests, about 200 contributors, and about 2,000 forks, and there's also a model zoo of ONNX models available out there. It is a graduated project of the Linux Foundation AI. So there's a lot of traction going on around ONNX. When would you use ONNX?
When you have something that's trained in Python and you want to deploy it to a C# application, or maybe you want to incorporate it into your Java application or JavaScript application. When you have inference latency that's too high for production use: meaning it's too slow to run, or you want enough performance that you can use it in production. Let's say you have it in some training framework and it's not good enough and you want to improve it; if you have high inference latency, that would be a good use case. If you want to deploy it to an IoT device or an edge device, it might make sense to convert it to ONNX and deploy it to those devices. When it's trained on one OS and you want to run that model on a different OS or different hardware, that is a good use case. When you want to combine models: let's say you have a team of data scientists, some of your models were created in, say, PyTorch and some in Keras, and you want to create a pipeline that combines models trained in different frameworks. That is another way. Another one is when training takes too long, and that's when you start talking about transformer models; let's say you want to train locally at the edge, that is one way you can use ONNX too. All right, so we did talk about what ONNX is. One thing I want to point out: one good way to describe ONNX is that it's kind of like PDF, right? You create your document in Microsoft Word or some word processing application, and you convert it to PDF. Now you can display it on different types of devices using Acrobat or a PDF viewer.
And then we did talk about when to use ONNX. Next we'll talk about how to create ONNX models and how to deploy ONNX models. Creating ONNX models is step one, and then we'll talk about step two. Step one, let's focus on that. Have you ever baked a cake? Of course, there are a lot of different ingredients and different procedures, and bakers specialize in this. My analogy here is that the bakers are your data scientists: your team, they're the ones who make the secret recipe for your business. They try different tweaks, different ingredients, and different procedures to create these AI models, which are going to be your secret recipe.
So how do you create these ONNX models? One way is to use the ONNX Model Zoo: there are existing models out there that you can just download off the Internet and start incorporating into your application. You can use Azure Custom Vision or some other service that exports to ONNX. You can convert an existing model, and you can also train models in Azure Machine Learning or some automated machine learning service. With the ONNX Model Zoo, someone has already pre-converted all these different popular AI models and machine learning models out there. If you're interested in ResNet, it's already converted to ONNX for you and you can just download it. These are some examples of the different sizes of those models once they're converted to ONNX. And it's not just limited to images; there's also sound. There are different models out there you can just download. Another one is Custom Vision, a low-code vision service where you upload some photos, start tagging them, train it to create a machine learning model for you, and then you can export it to ONNX.
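Once you've downloaded a model from the Model Zoo, or exported one from Custom Vision, you can sanity-check it in Python with the onnx package. A minimal sketch; the file name here is just a placeholder:

```python
import onnx

# load a downloaded .onnx file and validate its graph structure
model = onnx.load("resnet50-v1-7.onnx")
onnx.checker.check_model(model)

# print a short, human-readable summary of the graph
print(onnx.helper.printable_graph(model.graph)[:500])
```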
Another way is to convert a model from an existing training framework. Let's say you have it in PyTorch, Keras, TensorFlow, or scikit-learn; there's a way to convert it to an ONNX model. There are three steps: load the existing model into memory, convert it to ONNX, and save that ONNX model.
Here's an example of how you would convert from PyTorch to ONNX: you load the model, provide some sample input, and use torch.onnx.export to export it.
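As a minimal sketch of what that looks like (the torchvision model and the input shape here are illustrative assumptions, not the exact code from the slide):

```python
import torch
import torchvision

# load an existing PyTorch model into memory
model = torchvision.models.resnet18(pretrained=True)
model.eval()

# provide a sample input so the exporter can trace the graph
dummy_input = torch.randn(1, 3, 224, 224)

# convert and save the ONNX model
torch.onnx.export(model, dummy_input, "resnet18.onnx",
                  input_names=["input"], output_names=["output"])
```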
Another way: if you have it in Keras, it's the same steps. You load the Keras model, convert the Keras model into ONNX, and then save it as a protobuf file. And there's onnxmltools, which you can pip install.
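A minimal sketch of that flow, assuming the keras2onnx converter package and a saved Keras model named model.h5 (both are assumptions, not from the slides):

```python
import onnx
import keras2onnx
from tensorflow.keras.models import load_model

# load the existing Keras model into memory
model = load_model("model.h5")

# convert the Keras model into ONNX
onnx_model = keras2onnx.convert_keras(model, model.name)

# save the ONNX model as a protobuf file
onnx.save_model(onnx_model, "model.onnx")
```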
You can also convert a TensorFlow model using the command line, where you specify, of course, your input, where your saved model is, and then your output. There are a lot of good examples of how you would do this on GitHub.
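For instance, with the tf2onnx converter, the command line looks something like this; the paths are placeholders:

```
python -m tf2onnx.convert --saved-model ./my_saved_model --output model.onnx
```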
Another one is through scikit-learn. Notice that there's skl2onnx, which you can use to convert a scikit-learn model into ONNX format, as in the sketch below.
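A self-contained sketch with a trivial model; the data here is made up purely for illustration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

# train a trivial scikit-learn model
X = np.array([[1.0], [2.0], [3.0]], dtype=np.float32)
y = np.array([10.0, 20.0, 30.0], dtype=np.float32)
model = LinearRegression().fit(X, y)

# declare the input name/shape, convert, and save
initial_type = [("float_input", FloatTensorType([None, 1]))]
onnx_model = convert_sklearn(model, initial_types=initial_type)
with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())
```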
I keep talking about ONNX models. Let me go back real quick.
There's this tool called Netron (netron.app) that visualizes an ONNX model for you. It also helps software engineers know what the inputs and outputs of an existing model are without going back to the data scientist or to the original code it was trained from to figure out how to use an ONNX model. It visualizes the inputs and lets you see what the graph of operations looks like. If you go to netron.app and open an ONNX file there, you can visualize it. All right.
You can also use ONNX as an intermediate format. Let's say you have a PyTorch model and you want to convert it to TensorFlow: you can convert from PyTorch to ONNX, and then from ONNX to TensorFlow. That is one way.
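As a hedged sketch of that second hop, assuming the onnx-tf package is installed and a model.onnx file already exists:

```python
import onnx
from onnx_tf.backend import prepare

# load the ONNX model and prepare a TensorFlow representation
onnx_model = onnx.load("model.onnx")
tf_rep = prepare(onnx_model)

# export it as a TensorFlow graph
tf_rep.export_graph("model_tf")
```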
There's also ONNX to Core ML that you can use. There are also ways you can fine-tune an ONNX model and do transfer learning on an existing ONNX model.
If you're interested, of course, you can train models in the cloud, where you have GPU clusters. But the important part here for me, and what I wanted to talk more about, is your typical end-to-end machine learning process: you run your experiments and build from your IDE, or you create your training application. Once you train it, you have a machine learning model. You register it somewhere in the cloud and manage these models; you can have versioning. Then, kind of like when you have a Docker image and Docker Hub where you can store all those images, Azure Machine Learning gives you a way to manage your models: you can upload models there, version them, and also build a pipeline to download them, incorporate them, and create the image.
So we did talk about step one, creating. Once we have an ONNX model, we can start deploying it. Okay, so we did talk about your data scientists being kind of like chefs or bakers, building your secret recipe.
Now, let me ask you one thing. What is the difference between a baker and starting a bakery? The main difference is that they require different skill sets. In order to create a successful business, a successful bakery, you need both: you need the baker, and you also need someone who actually manages the bakery. Software engineers are great at figuring out how to start the bakery. They know where to put the cash register, how to collect money, how to create these pipelines, how you would display or use the application, and how to create the different areas of the business system: how you would use the machine learning model, how to create the whole application itself, what the customer experience is, all these different things. So whenever we create these machine learning models, it is important to think about where we're going to deploy them. Some things to think about, right?
You might deploy it on a VM, or on a Windows device, a Linux device, or a Mac. You can deploy it on edge devices or phones. There are different ways you would deploy, create, and use these AI models. Of course, every time we think about deployment, we think about where we can deploy: are we going to deploy to the cloud or at the edge? Edge meaning how close it is to your customers or your users.
My analogy for that is McDonald's and Subway. What's the difference in how they make the bread? At McDonald's, most likely it's outsourced; it's not at the edge, meaning it's not made at the store. Compare that to Subway, where they bake the bread at the store. It's a different experience. So what I'm trying to get at is that whenever we talk about deployment, we ask where we're going to run these AI models. Do we send the data to the cloud, run the inferencing there, and return the results? Or do we run it at the edge, meaning closer to the user: maybe on the phone, maybe on the camera itself, or in a gateway closer to the user? Those are things we have to consider when we deploy machine learning models, and especially ONNX models.
Of course, you can also deploy them in the cloud. Since you've already registered your machine learning models, your ONNX models, in the cloud, you build your image and create your pipeline. That is one way: you can deploy it through an app service, or you can deploy it and run it in a Docker container or in a Kubernetes service.
Speaking of Docker images, there are ONNX Docker images that you can start using. There's onnx-base, which has minimal dependencies, that you can use if you want to run ONNX in your application. There's onnx-ecosystem, which allows you to convert models without installing anything. So let's say you just want to convert an existing model, say a machine learning model written in PyTorch: you don't want to download all the converters locally onto your machine, so you can just use these Docker images. So whenever we talk about the edge, what is the edge? Remember, the definition is how close it is to your customers or your users. Of course, every time we think about the edge, we talk about deployment. When we deploy to the cloud, most likely you're deploying to data centers, maybe thousands of devices. If we talk about deploying to 5G infrastructure, to the fog, that's maybe millions of devices, millions of models you're going to deploy. And when you talk about the edge, it might be billions of devices, depending on the need, because each device may have a different deployment structure. So why would you want to deploy your machine learning model to the edge, or run it at the edge? One is low latency.
Think about it: let's say you're collecting video, and you're doing inferencing based on video or sound. You want it faster, so it makes sense to run it locally on that device itself. That's low latency. Another is cost and scalability: if you have to ship that to the cloud, you have to ship every image, every frame. That might cost you money and reduce scalability, so it might make sense to run it at the edge to provide scalability. Another one is flexibility. It might make sense to run it locally so you don't need an Internet connection. There are also privacy rules: you may not want to send any personally identifiable information (PII), and local laws may limit data to certain geographical areas. So running at the edge gives you that flexibility in where you deploy and where you run the inferencing.
There is ONNX Runtime, a high-performance inference engine for your ONNX models. It was open sourced by Microsoft under the MIT license. It's not just limited to neural networks; it also supports traditional machine learning. It has an extensible architecture that allows for different hardware accelerators. It's part of Windows 10 as WinML. And if you want to learn more about ONNX Runtime, there's the onnxruntime.ai website.
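To give a feel for the API, here's a minimal Python sketch of loading a model and running inference with ONNX Runtime; the file name, input name, and shape are illustrative assumptions:

```python
import numpy as np
import onnxruntime as ort

# create an inference session from a saved ONNX model
session = ort.InferenceSession("model.onnx")

# look up the model's first input name, then run with dummy data
input_name = session.get_inputs()[0].name
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {input_name: dummy})
print(outputs[0].shape)
```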
One part of this site that I think is pretty neat is where you pick your platform. Let's say I'm going to create a Linux application, I want to use the C# API, and my architecture is x86; or if you want to run ARM64, you can select that. Then you have these different hardware accelerators: if you want to use the GPU, select CUDA, or you can just use the default CPU, and it will give you instructions for incorporating it into your application.
Notice that there are different hardware accelerators. So, for example, if you wanted to run OpenVINO, you don't have to convert, say, a PyTorch model to something that's compatible with OpenVINO. You can go PyTorch to ONNX, and then use ONNX Runtime with the OpenVINO hardware accelerator.
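In the Python API, picking an execution provider looks something like this sketch; it assumes you've installed an ONNX Runtime build that includes the provider you ask for:

```python
import onnxruntime as ort

# request providers in priority order; ONNX Runtime falls back
# to the next one if an accelerator isn't available
session = ort.InferenceSession(
    "model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
```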
Like I said, ONNX Runtime ships with the Windows AI platform as part of the WinML API, which is a practical, simple, model-based API for inferencing on Windows. So let's say you have an existing WinForms application and you want to add a machine learning model: this allows you to do that. There's also the DirectML API, so if you're creating a game, there is a way to use DirectML, which runs on top of DirectX 12 and gives you a real-time, high-control machine learning operator API. And of course there are robust driver models: it automatically knows if you have a GPU, a VPU, or another accelerator, and it automatically switches. If it can run on a CPU, it will use it; if it can run on any one of these accelerators, it will use that. That's how it's able to access those drivers. There is also ONNX.js, a JavaScript library to run ONNX models in the browser or even in Node. It uses WebGL and WebAssembly, and it can automatically use the CPU or GPU. So think about it: let's say you have it in your browser. What it does is download the ONNX model to the browser locally and then use ONNX.js to do the inferencing. So instead of sending data to the cloud, the ONNX model is actually local in the Chrome browser, or whatever browser, and it does the inferencing that way. It is compatible with Chrome, Edge, Firefox, and Opera. If you want an Electron app, you can also integrate it with your Node application. And it's not just desktop; on mobile you can use Chrome, Edge, and Firefox too.
All right, I'll do a little bit of a demo. If you're interested in getting what I'm using to demo, here is the link, and I will show it again later. Let me pull up this application for you.
Okay. So if you go to that link, it will get you to this website. If you want to try out the demo today, you click this to try it out. What it'll do is pull up the Dockerfile and create an instance of a Jupyter notebook using Binder. This is what it looks like. So now that the kernel is ready: this is a C# application. What I have here is a Jupyter notebook running .NET Interactive, so I can have a C# application.
And what I want to demo here today is converting a model trained in ML.NET into ONNX. So this is how I get some NuGet packages and download them. While they're downloading, let me talk a little bit about the code. This one right here is System.IO; I'm using System.IO, Microsoft.Data.Analysis, and XPlot.Plotly. And this one right here allows me to format output properly for display in this Jupyter notebook. It's just a library. Okay, let's wait until that's done. So this one right here: I have a CSV file, salary.csv. Let me open that for you. This is what it looks like: I have two columns in this CSV file, years of experience and salary. I wanted the simplest example. I mean, this is not the best example if you're going to create a machine learning application, but I wanted one input and one output. Input is your years of experience; output is salary. So we want to create a machine learning model where, given years of experience as the input, it would guess the salary based on that experience. It's just a contrived example.
Okay, let's go back here. Now that that's done, it was able to download all my NuGet packages. Now I'm going to run this part of the application right here so that I can load the CSV file using a DataFrame. And based on that, this is what my data looks like, right? Notice that as the years increase, so does the salary. So it's just a simple example. And looking at this description, it gives me the min and the max. Right now I only have 30 items on my list. And so at the end of the day, what this is trying to do, using ML.NET, is create a pipeline to be able to train a model. In order to train, there are two things you have to do: you use the ML.NET context, you create the pipeline, and then you do fit and transform; there's always that pair. So once you have a transformer model, it creates that model for you. So what I want to do now, now that I have that model, is convert it to ONNX. So I use context.Model.ConvertToOnnx; I pass in the stream and my data, and what it does is create this model.onnx for me. See where that one is? model.onnx, put it in here.
So let's try to open it again. There you go. See, an ONNX model was generated for me. Let's see if I can open that ONNX model. Okay, so now that I have an ONNX model, let me try to verify it and see how I can run it. I'm going to open another project. This time I have this ONNX inference Python notebook. This is not a C# application, so I want to change my kernel. Let's change this kernel to Python.
So this time I want to use ONNX Runtime in Python to do the inferencing on that model.onnx file. So I do pip install onnxruntime, and what it does is download all the necessary requirements to get the ONNX Runtime library. Of course, here I'm just importing it, and I create this inference session.
So notice that with this model.onnx, if you go to netron.app, it displays something like this, where you can view the contents of your ONNX model. And notice that it gives me the inputs and then the outputs. The inputs here are years of experience and salary, and the output looks like this. What we're interested in is the input; in this case, it's only years of experience. Salary is the one we're trying to guess, so at the point when you're doing inferencing it gets ignored, but the graph still takes all your inputs. So even if you place a number there, it will be ignored, it won't be used; notice how that one is not connected. So it's only using years of experience, feeding a feature vectorizer into a linear regressor to get the output. Of course, all these other outputs are not going to be used anyway; they're just stubs. What we're interested in is the output right here. Okay, so this one: what I wanted to do is get the name, the shape, and the type of the years-of-experience input, and the same for the salary input, its name, shape, and type. That kind of gives me the descriptor, and of course the output gives me the shape and type. In this case, how did I get four? It's the position of the output: counting 0, 1, 2, 3, 4, that's the output at index four.
Let's run this one too. So now that I have that, I can pass in the data: in this case, I pass in the experience input and the salary input, and I specify the years, because I know the type and I need to place them into these arrays. So I've got years and salary. Notice how I put zero for salary, because it's going to be ignored anyway. So let's say I change this to ten, and the output shows up here; notice that if I put in ten, that's reflected in the result. And now I can grab that one, and that would be my output. So if I change it again to, say, three and a half years, see what happens, and then I'll have a different output.
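Putting that together, the Python side of the demo looks roughly like this sketch. The input and output names below are assumptions for illustration; check the actual names in Netron for your own model.onnx:

```python
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("model.onnx")

# both graph inputs must be supplied, even though the salary
# input is disconnected and ignored at inference time
feeds = {
    "YearsExperience": np.array([[3.5]], dtype=np.float32),
    "Salary": np.array([[0.0]], dtype=np.float32),  # placeholder, ignored
}

# run all outputs and read the one holding the predicted score
outputs = sess.run(None, feeds)
print(outputs[4])  # index 4 was the score output in this model
```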
Okay, so what happened so far? What I did was train a model and export it to an ONNX file using ML.NET in C#. And from that ONNX model, I used ONNX Runtime to use it in my Python application and do the inferencing that way. This is how it feels after learning all these things: it all connects. ONNX is a way for software engineers to talk to data scientists, and for data scientists to talk to software engineers, so that we can take these secret recipes, these machine learning models, and integrate them into our applications. At the end of the day, that's where all our best efforts and all our programs pay off: as long as we can integrate them into our applications, whether an existing application or any greenfield application, we can start incorporating these machine learning models through ONNX. So, as a recap:
What is ONNX? It's an open standard. Use the right tool for the right job, and run it efficiently on your target platform: ONNX separates how you train a model from how you run it and do inferencing on it. How do you create an ONNX model? I showed you how to download one from the ONNX Model Zoo, and how you can create one by converting an existing model using one of the ONNX converters; there are different ways you can create an ONNX model. You can also deploy ONNX models: through Windows, through .NET, through JavaScript using ONNX.js. And I did a demo of how you would use ONNX Runtime in Python to run it with high performance. All right,
if you want to learn more about me, my name is Ron Dagdag.
I'm a lead software engineer at Spacee and a Microsoft MVP. The best way to contact me is through LinkedIn or Twitter, @rondagdag. I appreciate you geeking out with me about ONNX, ONNX Runtime, Jupyter notebooks, bakeries, bakers, and bread.
Thank you very much. Have a good day.