Transcript
Hello everyone, welcome to this session on natural language modelling with Amazon SageMaker and the BlazingText algorithm. My name is Dinesh Kumar. I'm an aspiring ML specialist, and I'm hoping to convince you by the end of this session that you don't need a machine learning degree to take advantage of the tooling AWS has put at your disposal. As part of this session, we will purposefully pick a foreign language likely unfamiliar to you and apply machine learning to create some magic.
We will use the Tamil language. Tamil is one of the world's longest-surviving classical languages, with a history dating back to 300 BC. Tamil is rich in literature and is one of the languages that has evolved over thousands of years. The reason I picked it is that I'm familiar with this language, and I was keen to see whether machine learning was able to figure out the relationships between the words of this language. I wanted to apply a known machine learning algorithm, in this case BlazingText within SageMaker, to discover these relationships.
What's also cool about today's talk is that you will be able to apply these techniques after this session to a language of your own choice and discover the relationships yourself. Now, what are the prerequisites for this session? It will be good to have the fundamentals of machine learning or deep learning, an understanding of typical natural language processing problems such as search, and some familiarity with AWS services, not necessarily SageMaker; I will cover that as part of my session. Knowledge of Python and NumPy helps, but as long as you know any other coding language this is straightforward and you will not be lost; knowledge of the Jupyter notebook environment would also be very handy. Now, to make this magic happen, we need to introduce word embeddings with the word2vec algorithm, then understand a bit about Amazon SageMaker and its unique capabilities, especially around the NLP area, and then we can spend the bulk of our time on the BlazingText algorithm and a quick demo. So that takes us to the interesting topic of word embeddings. Let's dive in. Natural language text consists of words, and so we need to represent individual words, sentences and collections of words in some way. Now, couldn't we just use strings
containing the words? First of all, words have different lengths, and even written representations differ dramatically from language to language. If you look at Tamil, the same word may be spelled, pronounced or represented differently in different regions of Tamil Nadu. Beyond these complications, and more importantly, many machine learning techniques require numeric rather than text input; as you would expect, computers are good with numbers and not so good with a natural language vocabulary like we are.
So that brings us to the topic of representing these words in a numerical form. If I want to represent this particular phrase you see there, "to be or not to be", in a numerical form, how can I go about it? Let's say I represent each word with an individual label: "to" with zero, "be" with one, "or" with two, "not" with three. That gives me zero, one, two, three, zero, one, a rather arbitrary set of numbers. Now, representing words with just unique labels is actually going to introduce random relationships. For example, "be" as a word is now closer to "to" and "or" but further away from "not". Is that true? No. So how did we even arrive at these labels when there is no such relationship? Just randomly naming words with numerical labels may not be a great way of solving the problem. Now let's
say we use a vector instead of a single number. The typical approach is one-hot encoding: each word acts as an index into a vector of all zeros, with only a single element set to one. You will have already guessed the problem with the one-hot encoding approach. The example here has just four words, while a typical language will have hundreds of thousands of words. In the case of Tamil, I don't have a count, but it's a very rich language, as I said earlier, and there are quite a lot of words in its vocabulary. So representing them by ones and zeros may not be an efficient way of solving the problem at hand.
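To make the contrast concrete, here is a minimal sketch of one-hot encoding for the four-word vocabulary from the example; the function name and vocabulary are purely illustrative.

    vocab = ["to", "be", "or", "not"]

    def one_hot(word):
        # a vector of all zeros, with a single 1 at the word's index
        vec = [0] * len(vocab)
        vec[vocab.index(word)] = 1
        return vec

    print(one_hot("be"))  # [0, 1, 0, 0]
    # With a real vocabulary of hundreds of thousands of words,
    # each vector would be almost entirely zeros, which is wasteful.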
Now, with that in mind, let's look at the other side of the coin. Given a sentence, what are our chances of maximizing the probability of predicting the context words? Let's say I introduce the words "Tom Hanks". How can I predict the context of these words? In this example, let's say we want to predict the context of "Tom Hanks". What is the probability that somewhere around "Tom Hanks" we will find words like "great" or "actor"? Quite high, because he's an actor, so there is a good chance those words would appear somewhere near "Tom Hanks". But what are the chances of me finding something like "quantum physics" next to "Tom Hanks"? Much lower. I would not say zero, but it is relatively low when you compare it to the word "actor". That's the point we are trying to make: how do I figure out that one particular word has a stronger relationship with, and hence sits contextually closer to, another word?
In the typical deep learning world, we would have a fully connected network. For every input word it receives, say from a vocabulary of 10,000 words, the output of this network should be able to figure out which of those 10,000 words this particular input word is closer to. So there will be hidden layers in the network that try to extract the context words, and then it will spit out the probability of the words that might sit contextually closer to that input word. After training such a network, we can quickly compute a dense output vector for the sparse input vector that we had after our one-hot encoding.
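As a rough illustration of what such a network learns, here is a sketch using the open-source gensim library (assuming gensim 4.x; the toy corpus and parameters are made up for illustration and are not from the demo), which trains skip-gram word2vec embeddings:

    from gensim.models import Word2Vec

    # toy corpus: a list of tokenized sentences
    sentences = [
        ["tom", "hanks", "is", "a", "great", "actor"],
        ["the", "actor", "won", "an", "award"],
        ["quantum", "physics", "is", "hard"],
    ]

    # sg=1 selects the skip-gram architecture; vector_size is the embedding dimension
    model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1)

    # words seen in similar contexts end up with similar vectors
    print(model.wv.most_similar("actor", topn=3))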
So we have now grasped a bit about the problem itself that we are trying to solve. The problem statement is: we want to understand the probability of a word being in a close relationship with, or the probability of having a set of words that are in a close relationship to, the input word that comes in for inference. Now it's time to understand the word2vec algorithm itself. The dimensionality of the output vector is a parameter we choose. This is why we say we embed: we embed a high-dimensional object, a one-hot encoded word in this case, into a smaller-dimensional space. Turning a sparse vector into a much denser representation is what we are trying to achieve. Once this representation is computed, we can simply map every word into an n-dimensional space. In the end,
words that appear in similar contexts will likely be mapped to similar vectors. Words close to each other in vector space are likely to be similar in meaning, as they tend to be used in similar contexts. This is where we get closer to the magic: a machine learning system automatically discovering words that appear to have similar meaning. In theory, we also expect certain vector relationships to hold. This doesn't have to be exact and depends totally on the corpus being used for training. We can discover word relationships with simple vector arithmetic, as shown on the slide. For example, take the vector that corresponds to the word "king"; that's the classic example we usually see.
Let's say I have found the vector for the word "king" and subtract the vector for the word "man". I get a meaning that says there is some sort of royalty associated, and then if I add the vector for "woman" to it, I arrive at the vector for "queen". This is the magic here: we have managed to capture the meaning of a word and then do the typical addition and subtraction that we play with on numbers, but in this case with words and the inherent meaning they bring with them. That's exactly what the word2vec algorithm is trying to achieve.
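With embeddings like these, the classic analogy can be expressed directly; again using gensim as a stand-in, and assuming a model trained on a large English corpus rather than the toy one sketched above:

    # vector("king") - vector("man") + vector("woman") ~= vector("queen")
    result = model.wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
    print(result)  # with a large enough corpus, "queen" typically tops the list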
You may have heard about newer models such as BERT and RoBERTa, but the end goal is still to produce word embeddings. Here, I'll show you a completed word embedding for the English language, where each word has been mapped into a 100-dimensional vector. How can we ideally visualize this result in a two-dimensional picture? This is where we can use another trick of the trade: a t-distributed stochastic neighbor embedding (t-SNE) plot. That might sound like a mouthful, but it is simply a way of visualizing all those relationships in a two-dimensional space.
As you see here, the model has figured out that "American", "British", "English", "London", "England", "French", "France" and "German" are all close to each other. It has learned that they can all appear contextually close, or that they are all related in some form or shape. These are interesting clusters it has figured out because of the corpus that was thrown at it. A very important one I would point out is that it has found a relationship between "son", "family", "children", "father", "death" and "life". It is able to figure out that these words are related and are close in context, and that there is a higher probability of a word from this particular cluster appearing next to a word it has just come across.
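A plot like the one on the slide can be produced with scikit-learn and matplotlib; this is a minimal sketch, assuming you already have a dictionary named word_vectors mapping each word to its (say) 100-dimensional vector:

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.manifold import TSNE

    # word_vectors: dict of {word: 100-dimensional vector}, e.g. loaded from a trained model
    words = list(word_vectors.keys())
    X = np.array([word_vectors[w] for w in words])

    # project the high-dimensional embeddings down to 2D for plotting
    # (perplexity must be smaller than the number of words)
    X_2d = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

    plt.figure(figsize=(12, 12))
    plt.scatter(X_2d[:, 0], X_2d[:, 1], s=2)
    for word, (x, y) in zip(words, X_2d):
        plt.annotate(word, (x, y), fontsize=6)
    plt.show()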
Now, enough of word2vec and word embeddings, the actual problem statement we were trying to solve. Let's try to understand the tooling that is at our disposal, how SageMaker itself as a service is able to help us solve this problem, and where the BlazingText algorithm then fits into it. This is the typical AWS AI/ML stack. At the top you see AI services.
Most of these are out-of-the-box services, in the sense that you do not need any machine learning skill. Let's say you are a team of developers with zero machine learning skills, but you want to see how you could use it for your own application, problem or workload; then these are your go-to AI services. They cover different areas: we have AI services for vision, speech, language, chatbots, forecasting and recommendations. All you need to do is make an API call. These are models that are up and running and always learning, because so many of our other customers are using them, and with an API call you get the inferences back. You do not have to go down the route of building a model, training it, validating it and monitoring it; you just consume it. But let's say you are a bunch of developers who are already into machine learning and want to make your lives simpler; that's where ML services, with Amazon SageMaker as a platform, come into the picture. Amazon SageMaker is a platform that was built from the ground up for developers. The primary ambition was to make machine learning development easy for developers, and hence it packs a lot of services that make the development lifecycle a lot easier than it was before.
We will dive into that part in detail. Otherwise, if you are an expert and want to do it at your own pace with the frameworks of your choice, we fully support TensorFlow, MXNet and all those popular frameworks, with quite a lot of instance types, including GPU-based instances, available for you to leverage. But we will primarily focus on SageMaker as a platform as part of this session, because it's a very big world to explore on its own. So let's take ourselves to the main hero of today's subject, SageMaker. So, Amazon SageMaker. As I said,
you can build, train and deploy ML models at scale, and "at scale" is the key term there. Whatever part of the journey you are on, be it building, training or deploying, SageMaker as a platform offers you the right set of APIs for each stage. With a bunch of Python code, you will be able to build your model, train your model, validate your model and deploy it. As part of today's demo, we will show you how it is done; that's how easy it is to get going in SageMaker.
To start with, we offer pre-built notebooks. These pre-built notebook examples are readily available within the SageMaker environment in your own AWS console, and machine learning developers can simply use these examples to start experimenting with their own specific use cases. So if you are new to SageMaker, you do not have to start from scratch; you can leverage these pre-built notebook examples as a starter to explore the SageMaker environment.
Now, let's say you have explored it and want to move forward. First you settle in with your training data; of course, that's the most important bit. Your models are only as good as your training data, so quite a lot of our customers spend a lot of time ensuring they have the right data. Let's say you have the right training data that you need. You then need to choose the right algorithm to go with it. Now, when it comes down to algorithms, we have lots of choices within the SageMaker world. As you see, whether it is regression, classification or image and vision based models, you have quite a lot of built-in algorithms that come in very handy. These algorithms are delivered in the form of containers: all you need to do is refer to the container registry and pull the right container for the algorithm, and you are good to go. Otherwise,
if none of the existing built-in algorithms caters to your needs, you can always bring in your own algorithms in the form of containers and still leverage SageMaker as a platform. Now, what's unique about these built-in algorithms is that they can take advantage of the distributed computing infrastructure we have in the cloud. You might sometimes find equivalents of these algorithms in open source as well, but the ones built into SageMaker are validated for their efficiency in utilizing the distributed compute environment and infrastructure that the cloud offers, and hence they are generally a lot more performant than the open-source versions of the equivalent algorithms you will find in the open-source market. So let's
say you have determined the algorithm; then you need to train the model. You tell SageMaker the number of machines you want to use for your training, and you can then kick-start the training with just one line of Python code. Yes, you heard that right: it is just one line of Python code with the SageMaker SDK, or a click in the console. That's all it takes to get the training going, run your training and create your model out of it.
There are many deep learning framework containers supported in SageMaker. It uses Docker containers, as I said earlier, each designed for a specific algorithm and designed to support a particular framework. So let's say you have a particular algorithm in TensorFlow that you want to use for your use case; you will find a container for that framework and that algorithm, and as long as you refer to it in your code, you will be able to pick it up and run with it. When executing the training, as I said, you just select the right container and provide the data in the form of a file in Amazon S3, the Simple Storage Service. What SageMaker then does is launch a cluster of training machines. It will not just launch a lot of machines arbitrarily; you do not have to worry about the cost. It launches only the number of machines you specified in your code, of the instance type you advised it to pick for the training, and then uses those machines to train with the data it pulls from S3. It can also perform distributed training.
As I said earlier, if the algorithm supports it, you can train across multiple instances, or you can always choose to train on a single machine; at the end of the day, the resulting model will land back in S3. You can then look at the model, see how accurately it gives inferences, and decide to go ahead or not, or retrain the model again with a new set of data. Now, what if the model is not optimal enough? There are several optimization techniques available to you.
Let's say you create a model and it is not really performing well. What ML developers usually do is tune the hyperparameters associated with that algorithm. For people who may not know what these are: there is usually a bunch of parameters that control the behavior of a given algorithm. Machine learning developers are often left in the dark as to what values they should use for these different parameters to get their model optimized. That's where automated model tuning comes to the rescue. The reality is that even machine learning practitioners, even experienced ones, often don't know what to do; in these cases they just rely on a random or grid based hyperparameter search strategy.
Now, SageMaker lets you do better by kicking off a so-called hyperparameter optimization job. You specify how many machines it should run on to control the cost, and ultimately it helps you identify the right parameters to optimize your model. It does this with a Bayesian search: say it ran the training with a bunch of parameters and got a model accuracy that does not fit the bill; it will remember which parameters it ran with in the previous iteration, and choose to move away from them or towards them based on each iteration and the learning it picks up. It is an efficient way of tuning the model, and many of our customers leverage this for their machine learning training. Amazon SageMaker Neo is something that I will briefly touch
on. With Neo, you can compile your models to be ported to any of the target processors you may choose to run your model on.
This way your model will not only be smaller and deployment-ready, but also more performant. It doesn't matter which end architecture you want to port your model to: Neo can help you do that, and you can carry your model with you and deploy it in a much lighter form on any architecture of your choice. The list of architectures you can port to is always being added to, so you can check the AWS pages to see which ones we support. So let's say we
have got the data, trained the model, validated the model after fine-tuning thanks to hyperparameter optimization, and now have the accuracy and performance we need to deploy it in production. What we will also see as part of the demo is that with one line of Python code, you can take this model to production. You can then manage it and easily scale it with Amazon Web Services. That's the beauty of SageMaker: everything is simplified so that developers can leverage this distributed computing platform and focus on the business outcomes they want, rather than all the undifferentiated heavy lifting they would otherwise have to do in building, training and validating these models.
So that's the bit about SageMaker that I wanted to cover before we dive into the world of the BlazingText algorithm.
As I said, the BlazingText algorithm was published back in 2017 by a couple of Amazonians; this is the paper released in 2017 that discusses how BlazingText goes about this particular problem. The key thing to note here is that this algorithm provides highly optimized implementations of word2vec and text classification. Using BlazingText, you can train a model on more than a billion words in a couple of minutes using a multi-core CPU or a GPU, and you can achieve performance on par with state-of-the-art deep learning text classification algorithms out there. The other important thing it offers is an implementation of a supervised multi-class, multi-label text classification algorithm, extending the fastText implementation by using GPU acceleration with custom CUDA kernels, while also relying on multiple CPUs for certain modes of operating the algorithm.
In this particular demo, we will be using distributed training, just so you know, but there is no hard rule; you can always do it on a single machine if that suffices for your needs. These are some of the highlights of BlazingText that I would love to call out. As I said, you can run it on single CPU instances, or you can run it with GPU acceleration if needed.
The interesting thing you will see from this slide is that it can be 21 times faster and 20% cheaper than fastText on a single C4 instance. And if you go down the distributed training route, it can achieve a training speed of up to 50 million words per second, a speed-up of eleven times over the single C4 instance we saw in the previous case. The kind of efficiency they have managed to harness from the BlazingText algorithm is amazing. Now, how do they do that? In our demo, as I said, we will use BlazingText on multiple CPUs, but even on a single CPU, BlazingText takes certain steps to optimize its performance, and that's what we are seeing here.
It uses level-2 BLAS routines from Intel and hence is a lot more optimized in terms of CPU utilization. As you see here, this picture shows how word2vec is optimized by sharing the k negative samples across a batch, taking advantage of those Intel BLAS routines. Now, this is a slide that compares throughput on the one-billion-word benchmark dataset. Here you are seeing the throughput characteristics of BlazingText; on the right-hand side you see the implementation of fastText, roughly as published out there. Because fastText cannot be distributed across multiple CPUs or GPUs, we run it for benchmarking on a single machine. You can compare and contrast the performance when you look at the left-hand side, where the algorithm has been run on multiple multi-core GPU machines, and the middle section, the yellow bars, where we performed batch skip-gram benchmarking; here again you are seeing the results of running the algorithm in a distributed fashion on multiple CPUs. But of course throughput is not the only factor on which you would choose to run a particular algorithm, because we also need to consider accuracy and cost.
So I would love to show you another set of benchmark results. As you see here, the right way to interpret this diagram is that the size of each circle is the throughput. When you compare number eight and number two (number eight is the one that was run with skip-gram BlazingText), you get a lot of throughput and a lot of accuracy, in fact almost the same accuracy, for far less cost. The horizontal axis here is cost, the vertical axis is accuracy, and the size of the circle denotes the throughput that a particular algorithm is able to achieve with the given configuration.
This just confirms the earlier claim that it is a lot more performant and a lot cheaper when compared to the fastText algorithm.
With that we will move on to the demo part of our discussion, which, by the way, is called "semurai" in Tamil. Let me bring up the demo page for you. I hope you're able to see this. This is the notebook that I have created in Amazon SageMaker. You could just go to the Amazon SageMaker service; I could probably show you that quickly, but I think we'll just continue with the notebook, because it might be a bit tricky for me to share that screen.
It's very simple: if you go to the Amazon SageMaker service within the AWS console, you go to Notebooks in the left panel and create a Jupyter notebook. This is as simple as any Jupyter notebook you would have seen, nothing special about it. For anyone not aware of it, it is just an environment for data scientists to share and do data science in a collaborative way. That's all there is to this environment. Now,
if you look at this, as I said, we are going to take a large corpus of text in Tamil and use it for data ingestion. In this case, I have taken the dump from this particular URL. You could very well get it from a source of your choice, but as I said, your model is only as good as your data, so please always be careful with the data that you choose. In my case, I've taken it from the Wikipedia dump. It's totally up to you where you get it from, but the more data you have, the more the model can learn and the better the inferences can be. Now, we have downloaded this wiki dump, and there is this WikiExtractor.py script, created by Attardi, that you can find on GitHub. What we are doing here is passing in the dump we have got and extracting the data from it. So we have downloaded that extractor script and we are using it to extract the information from the dump. What this extractor does is simply cleanse the data and make it easy for us to do the machine learning model training.
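For reference, the data preparation steps look roughly like the following notebook cells; the dump URL, file names and extractor invocation here are illustrative assumptions, not necessarily the exact ones used in the demo:

    # Hypothetical notebook cells (run as shell commands with "!"); names and URL are examples.
    # !wget https://dumps.wikimedia.org/tawiki/latest/tawiki-latest-pages-articles.xml.bz2
    # !python WikiExtractor.py -o extracted tawiki-latest-pages-articles.xml.bz2
    # WikiExtractor.py is the script from the attardi/wikiextractor GitHub repository;
    # it strips the wiki markup and leaves plain text suitable for training.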
So as you see, it has picked up that file and given us the list of words that we can pass in for our training stage. We have all these Tamil words: mudarpakam, katirakalai, katirangalin, patiyal, poviyal, varala, arupuri. These are all Tamil words it has picked up from the dump. Now, this is a very big dump that I had downloaded, so I didn't want to waste time during the demo; I have done the hard task of downloading this and training the model prior to the session. So I'll just scroll to the bottom of the extraction part, or rather the cleansing part, and there we go, the extractor is done. In fact, it was just going on for a long time; I thought I had enough data, so I killed it to get on to the next stage. But you can leave it running for a long time if you want a lot more data. As I said, the more data, the better.
Now this is where SageMaker comes to the party. If you see here, we are importing the SageMaker SDK, creating a SageMaker session and creating a default bucket that we will use for this particular training. Then we upload the data: the cleansed data we now have after running WikiExtractor.py is pushed to the bucket, and we also set the S3 output location. Those are the basic constructs we need from the SageMaker service before we can go on to the algorithm side of things.
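In code, that setup looks roughly like this (the bucket prefix and file name are illustrative, not the exact ones from the demo):

    import sagemaker

    sess = sagemaker.Session()
    bucket = sess.default_bucket()      # default SageMaker bucket for this account/region
    prefix = "blazingtext/tamil"        # example prefix

    # upload the cleansed corpus and define where the trained model should land
    s3_train_data = sess.upload_data(path="ta_corpus.txt", bucket=bucket,
                                     key_prefix=f"{prefix}/train")
    s3_output_location = f"s3://{bucket}/{prefix}/output"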
Now, as I said, using the built-in algorithm is just one line of Python code. There you see that from SageMaker we import image_uris and ask for the BlazingText image. This gives you a pointer to the BlazingText algorithm container, and here you can see it is using the SageMaker BlazingText container from eu-west-1.
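With the SageMaker Python SDK (v2), that call looks something like this:

    from sagemaker import image_uris

    # resolve the BlazingText container image for the region we are running in
    region = sess.boto_region_name
    container = image_uris.retrieve("blazingtext", region)  # e.g. the eu-west-1 image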
We then have the container object, which essentially holds the kind of algorithm that is going to be applied for this training. Again, as I said, in this case we have chosen BlazingText. You could go and choose some other built-in algorithm of your choice within the Amazon SageMaker world, or you could bring in your own containers that you might have on premises and use those for your training. Now we create the estimator object.
The estimator is where we pass the container object we just created, along with the role that the training job will assume when it is doing the training. This role is what allows it to retrieve the data from S3, push the trained model back to S3, and interact with any other service it has to work with; it controls the permissions associated with that particular training job. We also give it the number of instances we want to use for training. As I said, it is totally dictated by you how many instances are used, what instance type is used and what the input mode is. You can choose the input mode to be File, or there is another, more performant input mode option you could go for.
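Put together, the estimator definition might look like the following sketch; the instance count and type mirror what is described here, while the other values are illustrative:

    from sagemaker.estimator import Estimator
    from sagemaker import get_execution_role

    bt_estimator = Estimator(
        image_uri=container,             # the BlazingText container retrieved above
        role=get_execution_role(),       # IAM role the training job will assume
        instance_count=4,                # distributed training across 4 machines
        instance_type="ml.c4.2xlarge",
        volume_size=30,                  # GB of EBS storage per instance
        input_mode="File",
        output_path=s3_output_location,
        sagemaker_session=sess,
    )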
Once you choose these parameters and create the estimator object, you then pass the hyperparameters. In this case, we have set the hyperparameters ourselves, but as I said, we could instead use the hyperparameter tuning, or hyperparameter optimization, option we discussed earlier, which iterates to lock in the hyperparameters that give you the best accuracy and the best-performing model to choose from.
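If you do want SageMaker to search the hyperparameters for you, the tuner API sketched below is one way to do it; the ranges and objective metric name here are illustrative, so check the BlazingText documentation for the metrics it actually emits:

    from sagemaker.tuner import HyperparameterTuner, ContinuousParameter, IntegerParameter

    tuner = HyperparameterTuner(
        estimator=bt_estimator,
        objective_metric_name="train:mean_rho",   # example metric name; verify against the docs
        objective_type="Maximize",
        hyperparameter_ranges={
            "learning_rate": ContinuousParameter(0.005, 0.05),
            "window_size": IntegerParameter(3, 10),
        },
        max_jobs=10,          # total training jobs the tuner may launch
        max_parallel_jobs=2,  # limits concurrency, which also controls cost
    )
    # tuner.fit({"train": s3_train_data})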
Once you set the hyperparameters, you point the estimator at the training data it needs to use and kick-start the training by calling the fit method.
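In this demo, setting the hyperparameters by hand and launching the job looks roughly like this (the exact values are illustrative):

    from sagemaker.inputs import TrainingInput

    bt_estimator.set_hyperparameters(
        mode="batch_skipgram",   # distributed word2vec across the CPU instances
        epochs=5,
        min_count=5,             # ignore words that appear fewer than 5 times
        vector_dim=100,          # dimensionality of the word embeddings
        negative_samples=5,
    )

    train_data = TrainingInput(s3_train_data, content_type="text/plain",
                               distribution="FullyReplicated")
    bt_estimator.fit({"train": train_data})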
Now, once you call model.fit, as you see here, it starts the training job and completes the training, and you are charged only for the time that the training has run. As you see here, the total training time in seconds is 3286, so the four instances you had chosen, of type c4.2xlarge, are going to be charged only for those seconds that the actual training job took.
Once the training is completed, the trained model is uploaded to S3 and will reside there. As I said earlier in our session, right after training is completed, usually our customers choose to have a validation stage: one of your data engineers or data scientists, whoever controls what model gets deployed in production, might get an approval task. They will check whether the accuracy of the model is good enough to be deployed to production, and then give it a go. This whole thing can be orchestrated in a CI/CD fashion; we have something separate called SageMaker Pipelines, a feature within SageMaker that you can leverage. There is no charge for it; it's simply the way you can do CI/CD for machine learning. All of these tasks, from ingestion and training through validation and deployment, can be orchestrated in a totally automated fashion if you want. But deployment itself is just that one line of code that you see there: I'm happy with this model and I want to deploy it on this particular instance type, and that's it, it gets deployed.
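That one line of deployment code is essentially the following (the instance type here is just an example):

    # stands up a real-time HTTPS endpoint backed by the chosen instance type
    bt_endpoint = bt_estimator.deploy(initial_instance_count=1, instance_type="ml.m4.xlarge")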
Now, once it is deployed, you have an endpoint to do inference against. If you see here, I am creating a set of words that I want to use for inference. The first word is "Tamil"; the second is "language", which in Tamil is "mozhi"; "music", which in Tamil is "isai"; "song", which in Tamil is "paadal"; "politics", in Tamil "arasiyal"; "leader", in Tamil "thalaivar"; "year", in Tamil "aandu"; and "century", in Tamil "nootraandu". These are somewhat random words; some of them are related, some are not. We will see how the inference behaves based on the context that the BlazingText algorithm explored during training.
We are pointing it to the endpoint that we just created by doing the deployment. What is happening here is that as I pass in these words, it creates a vector representation of each of them. For example, from here to here you see the vector representation of one word: it is trying to map the word "Tamil" into an n-dimensional space, and that's why you have all these numbers represented as a list here. You will get this kind of list for each word you are trying to vectorize. Then, later, we can map these words from the n-dimensional space into a two-dimensional space for visualization. But this is where word2vec is doing the magic: once this vectorization is completed, you have the numerical representation of those words. It's not just zeros and ones; it's the dense list of numbers we see there at the top.
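The inference call itself is a small JSON payload sent to the endpoint; a sketch, assuming the words are held in a Python list called words and that the response format documented for the algorithm is plain JSON:

    import json

    payload = {"instances": words}   # words is the list of Tamil words shown above

    response = bt_endpoint.predict(
        json.dumps(payload),
        initial_args={"ContentType": "application/json", "Accept": "application/json"},
    )
    vectors = json.loads(response)   # one result per word, each carrying its embedding vector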
Now this is the real fun. If you look at music, the word "isai" is close to the word for song, "paadal"; they are close together, and hence the vector distance comes out at 6.17. Forget the exact number; what it has inferred is how close they are. If you look at politics and leader, yes, they are close, and the vector of "leader" minus the vector of "politics" gives you 5.8; they are a lot closer because these are words that appear together in context. Now, when I try music and politics, it has figured out that they are further apart than politics and leader. So what we have achieved is this: we have created the vector representation of each of these words, and we have identified how far apart they are contextually and where they would sit in an n-dimensional space in terms of context. That's what our inference has achieved.
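The distances discussed above can be computed with plain NumPy once you have the two vectors; a minimal sketch, assuming vec_of is a dict from word to its NumPy vector, built from the endpoint response above:

    import numpy as np

    def distance(word_a, word_b):
        # Euclidean distance between two word embeddings:
        # a smaller distance means the words tend to appear in similar contexts
        return np.linalg.norm(vec_of[word_a] - vec_of[word_b])

    # e.g. distance("isai", "paadal") comes out smaller than distance("isai", "arasiyal")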
Now, because I restricted myself to fewer words within our corpus and did not bother much about accuracy, we are seeing what we are seeing; with a bit more effort on hyperparameter optimization this can be a lot more accurate, and very interesting inferences could be made from it. Another trick I mentioned earlier is that you could bring in the model, unpack it and apply matplotlib techniques to create a two-dimensional representation of these words, as in the t-SNE sketch shown earlier. But with that, I think we come to the end of this session. Just to summarize,
we started with a possibly unfamiliar language, Tamil, and then we understood what word2vec and word embeddings are and why we need them. We then introduced SageMaker as a platform and the many features that come packed into it, and how SageMaker as a platform can help make your machine learning development lifecycle a lot simpler by offloading the undifferentiated heavy lifting you would otherwise be doing. We also explored how easy it is to apply the BlazingText algorithm to the data of your choice, through the simple demo we saw in the later part of the session. I believe this will have created some interest in you to go and explore natural language processing using some of the built-in algorithms we have within SageMaker, or an algorithm of your choice, play with a favorite language of your choice, and explore the world of machine learning within the AWS ecosystem. Thanks for joining the session. It was my pleasure to give this session for you, and I wish you a great day ahead. Bye now.