Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello and welcome to this talk on building cloud infrastructure
using Python. My name is Sohan Maheshwar. I'm a dev advocate
with AWS, and I'm here to talk to you about this very
cool new cloud concept called infrastructure as
code. So today, like I said, we will be talking about infrastructure as
code. We will talk a little bit about AWS CDK,
we will talk about how it works, and also about getting
started. Now, this talk is aimed at a beginner to
intermediate level. So if you don't know too much about
cloud or cloud infrastructure, or even Python for that matter,
this is a good talk for you, right? So let's get started.
Now, a couple of years ago, or maybe it was a few years ago,
what is time even right now? Our CTO, that is AWS's
CTO, Dr. Werner Vogels, said this at our annual conference,
re:Invent. He asked, so what does the future look like? All the
code you ever write is business logic. Now, for those of you
in the crowd who code on a regular basis,
you will know that you end up writing a lot of code to
actually scaffold the business logic that your
company typically needs. But we are slowly moving past that.
If you really look at how coding has evolved, especially when you're building
applications, we've gone from building huge monoliths, which is
this massive application with so many interdependencies,
to a set of microservices. And this is what
modern applications look like: they have a lot of bits. So you have your
database, which is talking to your data lake, which is talking to your
serverless app, which is talking to your data warehouse,
queuing service, notification service, and all of that. This is
how a modern application typically looks. Now,
to create all of this, how do you actually do it?
How do you actually create a serverless function or a database or a
queuing system? And that's really where infrastructure as code
comes in. How do you create these pieces of cloud infrastructure
using code? The lowest level of actually doing this is to
create this infrastructure by hand. Typically, you have
one awesome person in your team, and this person's doing all of this using
a console. So if they're using AWS, maybe the AWS console,
and they're doing all of this by hand, and they're creating
your organization's infrastructure. Now, this is not
recommended at all, because this is very manual, it's time
consuming, it's hard to audit. I mean, even the
best of employees can make a mistake once in a while,
so it can be error prone. And most importantly,
it's not reproducible. This is literally the equivalent of saying,
hey, ask Alice, she knows what to do, because one
person's doing all of this and then they have to teach another person, or there's
a wiki, but it's not reproducible and it's not programmable.
A layer above that is something that organizations
have been using for a fair amount of time. And this in
a way is infrastructure as code, but it is imperative
infrastructure as code. So you have your employees who create,
say, something like a deploy script, a typical shell script,
which has things like, if your resource is equal to this,
do this, else if, and so on. So there's a lot of boilerplate code for
this, but it is in a way automating how you create cloud infrastructure.
With imperative infrastructure as code, what if
something fails and we need to retry it? Now this is a straight-up script,
so if something fails and something happens,
that's it. You can't do much. Also, what if two people
try to run the same script at once? And the larger the organization,
the higher the chances of this actually happening. So this could
lead to things like race conditions where two things are trying to create
cloud infrastructure at the same time, and of course could
lead to a lot of errors and could lead to a lot of downtime,
and we don't want that. So the level above this was something called
declarative infrastructure as code, and a lot of
organizations even now use this, and of course, I highly recommend
it. So what happens is you write infrastructure
as code in something like a text file, and you use services like
AWS CloudFormation or HashiCorp's Terraform.
Now this speaks to something like the AWS SDK,
which then goes ahead and creates your organization's infrastructure.
Let me give you a real world example. So suppose you're programming
a robot to get you coffee at 9:00 a.m. on a Monday
morning. Now this is very declarative, so you have to declare
each and every step that the robot has to do. So you have to say,
robot, walk ten steps down the hallway, take the first
left, find the coffee machine, add milk to it.
I like milk in my coffee. Add coffee powder, put a
cup on the mug and bring it back. So you're actually defining
each and every step. You're being declarative about every step that
that robot has to do. That is what declarative infrastructure
as code is. So typically that text file contains
YAML or JSON code where you define literally
every single thing about your infrastructure.
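As a rough sketch of what such a template holds, here is a two-resource fragment written as a Python dict and printed as JSON. The layout follows the CloudFormation template format, but the logical IDs (`MyVpc`, `PublicSubnet`) and the CIDR ranges are made up for illustration, and a real VPC template would go on to spell out route tables, gateways and much more.

```python
import json

# A minimal CloudFormation-style template as a plain Python dict.
# Logical IDs and CIDR ranges are illustrative assumptions, not from the talk.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "MyVpc": {
            "Type": "AWS::EC2::VPC",
            "Properties": {
                "CidrBlock": "10.0.0.0/16",
                "EnableDnsSupport": True,
            },
        },
        "PublicSubnet": {
            "Type": "AWS::EC2::Subnet",
            "Properties": {
                # "Ref" points at another resource in the same template.
                "VpcId": {"Ref": "MyVpc"},
                "CidrBlock": "10.0.0.0/24",
            },
        },
    },
}


def render(tmpl: dict) -> str:
    """Serialize the template to the JSON text you would hand to the service."""
    return json.dumps(tmpl, indent=2)


if __name__ == "__main__":
    print(render(template))
```

Every property is written out by hand, which is exactly why these files grow long.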
So as an example, if you're creating a virtual private cloud
or a VPC, so you say, I want to create a VPC,
these are my subnets, these are my NAT gateways, these are my CIDR
blocks, these are my availability zones. Don't worry if you don't know what any of
that stuff is. I barely do. But essentially you're defining
every single thing in that virtual private cloud. A lot of
organizations use this because like I said, it's very different from
that level zero. So this is reproducible,
it's not as prone to error, it's not time consuming,
and people can work on it at the same time. So this is infrastructure
as code, essentially, at its very core. So you're coding in
YAML or JSON, two ways to represent data, and you can
maybe store this file in, say, an S3 bucket,
and you create infrastructure stacks using a
service like AWS CloudFormation. So you're saying, okay, create a virtual private
cloud, create a database, create a Lambda function,
and all of this is being created using CloudFormation. And your company's infrastructure
is ready. Now, this is all well and good, but a couple
of years ago AWS introduced something which changed it a
little and called it infrastructure is code, where you could do
all of this using your favorite programming language, which in
this case I assume is actually Python. So today we're going
to tell you how you can build cloud infrastructure using Python.
And honestly, I like giving this talk because I know a lot of people who
code in Node.js, Python, et cetera, but they don't know that you can
actually write code to create cloud infrastructure.
And it usually blows their mind. They're like oh man, that's so cool. Because honestly
I find writing cloud infrastructure using YAML or JSON
fairly tedious. It's a lot of lines of code, but I know Python or
JavaScript programming and I can get started straight away. So that
onboarding to creating cloud infrastructure is so much easier.
There is no context switch. You can use the same IDE that you're
using, you can use the same CI/CD pipelines that your company
uses. And of course it's completely customizable, it's shareable,
and you can use constructs from programming like for loops and
if/else statements and conditions to actually build cloud
infrastructure. How cool is that? So let's take a look at CDK,
and infrastructure is code. So how it works in CDK is
you would write a simple program. So let's call it app.py, and that
app.py interacts with the Cloud Development
Kit, which then creates your infrastructure stacks using
CloudFormation for provisioning. That's another layer of abstraction
over writing verbose CloudFormation code.
Now you might think, hey, is only Python supported?
I'm a person who writes in multiple languages. No, CDK is supported in
TypeScript, in Java, in C#, in what else?
JavaScript, Python of course, and Go
support, which is coming soon, or if I'm not mistaken, is in preview
mode right now. Do check it out. So what really is CDK?
Well, it's an open source, multi-language software development
framework for modeling cloud infrastructure.
And like I said, you're basically creating cloud infrastructure
using your favorite programming language. The great thing about this is it's
completely reusable. So say you enter a new company and you're like
hey, no one's actually done all of this. So you
write a program in Python to create a database,
and like a serverless function, you can actually give
that same construct to someone else to create the same thing
and reproduce it with all the defaults that you have actually built
in. So I think that's pretty cool. The three main components of
CDK are its core framework, which you see on the left. These are basically
the resources that you can create. Now a bunch of resources together
creates an infrastructure stack and a bunch of stacks
together create an app that you will be building. There is also
an extensive construct library. So as you would know,
AWS has so many different services, 200 plus, and each
of these has its, or most of these have their own construct
libraries that either you can create or have already been created
so that you can reuse. And lastly of course there is the AWS CDK
CLI or command line interface, which is the interface you
use to create all of this. Just to give you an example of how it
makes your life easier, if I have to create a virtual private cloud with
all those things that I told you about earlier, I'd have to write 270
lines of CloudFormation YAML code.
Honestly, that's quite a bit. I have to specify NAT gateway and subnet
and availability zone and CIDRs and all of that with CloudFormation.
With CDK, take a wild guess as to how many lines of code you'll have to write.
One, literally just one line of code. So I can just
write something like vpc = ec2.Vpc
and just give it a name and it actually creates a virtual private
cloud for me with all of those settings enabled. Literally I
can just do it with one line of code. If for instance I say,
hey, maybe there's something in the defaults that I actually want
to change up. I can just add that as a parameter. And then with two
lines of code, I've created the same thing. So that's 270
lines of YAML code that I typically don't have to write.
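Here is a hedged sketch of what that one line can look like inside a small app. It assumes CDK v2, where constructs ship in the single `aws-cdk-lib` package and are imported from `aws_cdk`; the demo in this talk installs a per-service package instead, which is the older v1 layout. The names `CdkTestStack` and `MyVpc` are placeholders, and `build_app` is just a wrapper for this example.

```python
def build_app(vpc_name: str = "MyVpc"):
    """Return a CDK app with one stack that contains a single VPC.

    Sketch only: assumes CDK v2 (the aws-cdk-lib package) is installed.
    """
    # Imported inside the function so this file can still be loaded and read
    # on a machine where the CDK is not installed yet.
    from aws_cdk import App, Stack, aws_ec2 as ec2

    app = App()
    stack = Stack(app, "CdkTestStack")

    # The "one line": a higher-level construct that fills in subnets,
    # route tables, NAT gateways and so on with default values.
    ec2.Vpc(stack, vpc_name)

    # Overriding a default is one more keyword argument, for example:
    #   ec2.Vpc(stack, vpc_name, max_azs=2)
    return app


if __name__ == "__main__":
    # Synthesizing writes the generated CloudFormation template to cdk.out/.
    build_app().synth()
```

In a project generated by the CLI the stack usually lives in its own class and file, so treat this flat layout as a simplification.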
So let's take a look at how the deployment workflow actually
is. So this is it. You typically start in the CLI with
something like a cdk init. And you should know what this does: it initializes
a new project in CDK. You can also start with
some sample apps, which
have a project structure and the defaults.
So it's just easier to do that. Once you've done that, you can do a
pip install if there are any dependencies for your project. This uses
the CDK CLI. Of course, once you've written
your code to build cloud infrastructure, there are just three
more steps to actually push that out to the cloud. The first one is cdk synth,
which creates these templates and assets. Essentially,
cdk synth is converting your code into CloudFormation code.
You can do a cdk diff if you want to actually see
what has changed. This is similar to a diff in, say, Git,
where it actually says okay, you've added these new pieces of infrastructure,
or you've removed these pieces of infrastructure, and finally you just
do a deploy. And what this does is it just
pushes all of the changes to the cloud. So say for instance,
you create a simple serverless function to store data in a
database. When you do synth, it'll create all of this. And then when you
do a diff, it says okay, created database, created serverless function.
And when you do deploy, it's actually pushed to your cloud provider
so you can actually see what's happening. There we go. Now,
I've spoken a lot about how this works in theory, et cetera, but I know
you want to see a demo to see if I'm actually telling the truth.
Is it really that easy? Well, it is. So let me quickly
show you a demo. The prerequisites for this are fairly simple. Of course.
You need Python, you need the AWS CLI, and an AWS
account and user. If you're not a customer of AWS
right now, there is a very, very generous free tier. Granted you have to
enter a credit card, but you won't be charged. It's just for verification purposes.
And the free tier limits are actually very extensive. So anything you
want to do for testing or for just a simple MVP, or just to
play around with it, please go ahead and do so. But of course do check
the billing page if you want clarity on what to do. Installing the AWS
CDK toolkit is simple. Just do an npm install or a
pip install. And just to make sure it's working, just run
cdk --version in your CLI. So we're going to do something very simple
and create a simple virtual private cloud.
The same example that I was talking about so far. I've prerecorded
this demo because creating the CloudFormation stack takes maybe four
to five minutes. So I didn't want to waste your time looking at CloudFormation scripts
running. So let's go ahead and get
started now. Hopefully you can see my screen. Yeah, there you go.
So I'm just going to create a simple folder called CDK test and
get into it. That's done.
All right, so like I said earlier, we can start with cdk
init. And there are a bunch of sample apps that you can
start with, right?
So I said cdk init and I gave it sample-app.
So it gives you something with the project structure that you can start with.
And you can specify the language you want to get
started with. In this case, Python. So that's what's happening right now.
It's creating a virtual environment.
And all done. Now, because
I'm using a Mac, or if you're using Linux, you want to
get into the virtual environment. So that's what I'm just doing right now,
just activating it so that now I can actually run this
python code. Okay, now that's done. Now this
is the folder structure of what's created. As you can see,
there is an app.py and there is also a requirements.txt
which has the dependencies of this project.
So I'm going to do a pip install and first install the
dependencies of this project.
It's just looking at requirements.txt, installing a whole bunch
of packages and voila,
it's done. So let me just open this. I'm opening it up in Visual Studio
Code. And there you go. This is the sample
app. Again, if you're not getting what's happening here, it's completely fine. You can
just delete all of it because it's a sample app, but you can take a
look at the project structure. So this is
app.py, right? Let me just open that. app.py
is the starting point of your app when you're using CDK.
So let's see here. And you can see something that says
app.synth() and you see a region and a specification as
well. You'll also see
some boilerplate code, all of which we're going to delete because we're
starting from scratch right now. So like I said, we're building
a very simple virtual private cloud. If you're using Visual
Studio Code, it does autocomplete. So that's another option.
For each module that you want to install, or each
construct that you want to install within your CDK
app, you need to install it into your project first.
This just makes sure that your project folder doesn't have all the AWS services
installed. So if I have to create a virtual private cloud, I need
the EC2 construct. Or if I have to create an S3
bucket, I need the S3 construct. I can do that very easily
by just going and doing a pip install followed
by aws-cdk.aws-ec2.
So this is the construct for
EC2 specifically. So now that I've installed that
construct, I can actually import it into my app.py.
Sorry, into my cdk_test_stack.py. I've literally just
written one line of code that you see here. I've imported this module
as ec2, so now I can reference ec2
in my project. Let's go. Like I
said, creating a VPC is literally one line of code. So I'm just saying
vpc = ec2.Vpc. As you can see,
there's autocomplete with Visual Studio Code as well.
And you can literally give it any name. So I'm just calling it my VPC
and done. Like I said, if there are other attributes
you want to specify, you can do that too. Now,
if I do a cdk diff, you will actually see the difference before and after
I added that one line of code. There you go.
Yeah, you can see VPC is added. There's a subnet right there,
there's a route table, there's a NAT gateway,
whole bunch of other things. So all of this has been added with that one
line of code. And now when I do cdk synth,
it actually converts that one line of code into a CloudFormation
template, which you can push. As you can see, I've fast forwarded
a little because this takes a couple of minutes, but that would have been your
CloudFormation code that you would have had to write if it weren't
for CDK and if it weren't for that one line of code.
Isn't that pretty cool? I think it's pretty cool.
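The idea behind the diff step can be approximated in a few lines of plain Python. This is only a toy: the real command compares the synthesized template against the deployed stack and reports far more detail, and `diff_templates` is a hypothetical helper, not something the CDK provides.

```python
def diff_templates(before: dict, after: dict) -> dict:
    """Report which resource IDs were added, removed or changed.

    Toy illustration of the diff idea; not the CDK's own algorithm.
    """
    old = before.get("Resources", {})
    new = after.get("Resources", {})
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


if __name__ == "__main__":
    empty_stack = {"Resources": {}}
    with_vpc = {
        "Resources": {
            "MyVpc": {"Type": "AWS::EC2::VPC"},
            "MyVpcSubnet": {"Type": "AWS::EC2::Subnet"},
        }
    }
    print(diff_templates(empty_stack, with_vpc))
```

Because both sides are just data, a single new line in the program can show up as many added resources in the report.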
Okay, now I'm just going to do
something called a bootstrap here. I essentially want to take
this CDK app that I've built and push it out
to the cloud. It typically has to read from an S3 bucket.
So that's how CDK works. Remember I mentioned that before? You can do it two
ways. You can push this to S3, or you can use the cdk bootstrap
command, which actually bootstraps this line of
code into a temporary S3 bucket. So it's creating
a bootstrap environment. And now I can do a
cdk deploy. So I just do a cdk deploy
and all of the resources that I have mentioned in that app will be
created. You can see it says creating CloudFormation change set
right here. This takes a minute or two because CloudFormation
has to create all these resources. And yeah, it's creating, you can see
it's creating the VPC, it's creating some metadata. If you
go to your AWS console, once it's done and just open
up CloudFormation, you will actually see this in progress. You can actually see
each of the events that are being created. So as you can see, it says
create in progress. You can toggle the view nested button and
see all of the events that are being created with that one line of code.
Again, pretty cool. So as you can see, it's done.
If you want to see for yourself if it's actually done, you can open.
It says create complete. So you can open VPC on AWS
and you will see that a virtual private cloud has been created just
for you with that one line of code.
Yay. Yeah, there it is.
I mean, some of this stuff still blows my mind that you can do all
of this with just that one line of Python code. Now of course,
this was a very hello world type demo. Like I mentioned, this was a session
for people who are new to this topic. But imagine entire
applications, and the infrastructure for each of these
applications, in lines of Python code. Now that means you can run your
CI/CD pipelines, you can share it with your team. It's very reproducible.
With creating applications, you do not have to reinvent
the wheel. You can use something called CDK constructs.
So let's take a look. Like I mentioned earlier, there is
a library of constructs in AWS.
For instance, for serverless there's Lambda, API
Gateway, DynamoDB. If you want containers, there's ECS,
Fargate. All of these pieces of infrastructure can be represented
in code using CDK. And the good news is there are already
constructs readily available with best practices built in so
you don't have to create them from scratch. There are three types
of construct levels. The base one is called an L1
construct, which is a CloudFormation resource, essentially. So this is
a CloudFormation resource that's automatically generated.
Essentially, it's a straight-up one-to-one mapping
class between CDK and an
AWS resource. So that's that. The level above
that is what is called an AWS construct, or L2.
Now these are slightly higher level service constructs like
the one I just showed you in the demo. So if you want to create
an S3 bucket, you say new S3 bucket and it's created.
So there is a layer of abstraction above. These constructs are
much simpler than a CloudFormation resource and they require
very little input, as you just saw. And finally,
there's L3+, which is purpose-built constructs. Now like I
said, these are very opinionated abstractions. So for instance,
if I was building like a fairly complex load balancer,
maybe I don't have to start from scratch. And maybe AWS tells you
that these are the best practices that you want. So then you
can use an L3 construct to say okay, I'm just going to take
these best practices and just create one. So let me give you an example.
This is typically what an L1 construct looks like where you're saying
CfnBucket. So a CloudFormation bucket, you're giving the name, the bucket
name and all of that. And these are generated mappings from
CDK to CloudFormation. So if you can see in this demo,
the type of bucket is here and you're mentioning that
here as well. Also you're calling the name my bucket which is
right here. So each of these things are one to one mappings of a
CDK construct to a CloudFormation template.
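The difference between the first two levels can be modeled with two small classes. This is a toy model with made-up names: `RawBucket` and `Bucket` are hypothetical stand-ins, not the CDK's own classes. It only shows the shape of the idea, where the lower level mirrors template properties one to one and the higher level supplies defaults and a friendlier interface.

```python
from dataclasses import dataclass


@dataclass
class RawBucket:
    """L1-style: each field maps one-to-one onto a template property."""

    logical_id: str
    bucket_name: str
    versioning_status: str  # "Enabled" or "Suspended", exactly as in the template

    def to_template(self) -> dict:
        return {
            self.logical_id: {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    "BucketName": self.bucket_name,
                    "VersioningConfiguration": {"Status": self.versioning_status},
                },
            }
        }


@dataclass
class Bucket:
    """L2-style: the same resource, with defaults and a simpler boolean flag."""

    logical_id: str
    bucket_name: str = "my-bucket"
    versioned: bool = False

    def to_template(self) -> dict:
        # Translate the friendly flag into the raw property and delegate.
        status = "Enabled" if self.versioned else "Suspended"
        return RawBucket(self.logical_id, self.bucket_name, status).to_template()


if __name__ == "__main__":
    print(Bucket("MyBucket").to_template())
```

The higher-level class needs one argument where the lower-level one needs three, yet both end up producing the same kind of template entry.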
Typically though, we will use L2 constructs, which is what we just used.
In this case we said new ec2.Vpc.
It gives you all of these things ready
to use. All those IPs are split, with default values.
Good to go. This is fairly commonly used, especially when you're starting out
as well. So do check it out. L2 constructs can
also be in this sort of case where you're building a slightly more complicated
app. In this case you're building an app, something that gets an object
from S3 using Lambda and then stores it into
DynamoDB. Now instead of creating each of these by hand or
using CloudFormation, you can actually write this code here
to do all of it. So you can see there's a new table being created
here. There's a new Lambda function being created here and you're granting read
and write permissions. And of course you're able to read from
an S3 bucket right here. Again, just with, what, like 10 to 15
lines of code, I was able to create a fairly intermediate level of complexity
app using CDK and using an L2 construct.
And finally, an L3 construct. Like I mentioned, this is purpose
built. So in this example we are actually building a
VPC with all of this: subnets, NAT gateways,
route tables, all of that. This comes with a built-in load balancer,
so that if there is too much load, it's automatically balanced. And it comes
with a Fargate service for serverless containerization,
as well as an ECS task definition. Now you might think if
I have to build this on CloudFormation, it's going to take me 829
lines of YAML code. Quite tedious, can be prone to error,
but with an existing L3 pattern, I'm actually able to
create this using just four lines of code. So Amazon's launched
a few patterns, and the one you see here, you can just reference it from
a particular registry and it creates all of it for you.
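The slide itself is not part of this transcript, so the following is only a guess at the kind of pattern being described: a load-balanced Fargate service from the ECS patterns module. It assumes CDK v2 (`aws-cdk-lib`); the stack name, the construct ID and the sample container image are placeholders, and `build_service_app` is a wrapper invented for this sketch.

```python
def build_service_app():
    """Return a CDK app with a load-balanced Fargate service.

    Sketch only: assumes CDK v2 (aws-cdk-lib) is installed. The pattern is an
    assumption about what the talk showed, not a confirmed reproduction.
    """
    # Imported inside the function so this file loads without the CDK present.
    from aws_cdk import App, Stack, aws_ecs as ecs, aws_ecs_patterns as ecs_patterns

    app = App()
    stack = Stack(app, "FargateDemoStack")

    # With no VPC or cluster passed in, the pattern creates them itself,
    # along with the load balancer, task definition and service.
    ecs_patterns.ApplicationLoadBalancedFargateService(
        stack,
        "WebService",
        task_image_options=ecs_patterns.ApplicationLoadBalancedTaskImageOptions(
            image=ecs.ContainerImage.from_registry("amazon/amazon-ecs-sample"),
        ),
    )
    return app


if __name__ == "__main__":
    build_service_app().synth()
```

The single construct call is where the opinionated defaults live, which is what distinguishes this level from the one below it.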
Again, I think that is so cool, if I have to say so myself.
So that is what an L3 construct really is.
The great thing about CDK, in my opinion, is that it's
a very vibrant ecosystem and you can also create your
own constructs or reuse constructs from people who
are experts and have already created them. So there is CDK Patterns,
for example, that has pretty much every example or use case you
can think of. So check it out. In fact, CDK Day is a yearly
or maybe bi-yearly conference. It just happened I think a
couple of weeks ago. Very vibrant community, so try and
be a part of it as well and I'm sure you'll learn a lot.
I've spoken a lot about CDK, but just to reiterate, there are a lot
of benefits to using CDK. One, I feel not too
many people know about it and they're very, very interested once I tell them
about it. Especially the fact that you can use logic like if
statements, for loops, conditionals when defining infrastructure.
And I think that's this great new paradigm that I think will
be very common in the near future. You can also use object oriented
techniques to model how your infrastructure should look. And of course
you can reuse your current infrastructure as a library.
Most importantly, you can reuse your existing code review workflow.
Typically that involves like a team lead, like some branching, some CI/CD
pipelines, and you can incorporate your cloud infrastructure
into this workflow as well. And of course you can use it
in the IDE that you are already using with code completion. Anyway, I hope
you learned something new today. If you are an intermediate user
of CDK and you knew all of this stuff, then check out the next steps.
There is a CDK workshop. There are a lot of samples you can check out.
You can also contribute to the community. We'd love to hear or we'd
love to have contributions from you. So yeah, do check it out.
I think it's fascinating and I'd love to hear what you're building.
If you have any questions, if you have any doubts or you just want to
share what you've built, hit me up on Twitter, LinkedIn, Twitch.
I also run a YouTube show called the Emerging Tech Show where I try
to talk about emerging tech topics in a very non-serious
manner. I just did season one, so do check it out and I'm recording season
two. Anyway, hope you learned something new today and hope
you're having fun at this conference. I'm having a great time so far.
Enjoy the rest of the conference. Thanks for listening and see you soon.