Transcript
Today we're going to talk about pragmatic security automation
in the cloud with Python. So we have a
very exciting talk in this session,
and I'm going to share my experience and techniques when
dealing with these types of requirements.
So before I start, let me introduce myself.
I am Joshua Arvin Lat. People call me Arvs.
I am the chief technology officer of Nuworks Interactive
Labs. I'm also an AWS Machine Learning Hero,
and I'm also the author of the book Machine Learning
with Amazon SageMaker Cookbook. So there
you'll see about 80 recipes which
would help data scientists and developers and machine learning engineers
perform ML experiments and deployments.
But for today, the topic will be on security and
security automation. So what will we
talk about in this session? So we will talk about
the cybersecurity attack chain. We're going to talk about security
automation with a purpose. Basically, the goal there is for
us to understand the different ways we can
make use of automated scripts.
We'll also talk about some tips and techniques when writing
security automation scripts. I'm going to
share also some techniques when it comes to creating an
automated data integrity layer. And we're
going to talk about automated vulnerability management
as well. And finally, we're going to talk about secure
infrastructure as code techniques. So, are you ready?
Yeah, let's go. So let's talk about cybersecurity
attack chain first. So if you were to ask yourself,
what's this all about? What's the cybersecurity attack chain? So
here it's important for us to know that if we are planning to secure
an environment, it's critical that we know how
hackers would attack an existing system
or environment. Sometimes you will
be working on and building systems, and
you may read in some sort of blog
post or comment section of a Stack Overflow link
that the best practice is this. And in
order to secure your implementation, this is the best practice.
But if you were to take a step back and ask yourself, is this
important? The only way for us to know that
is to reverse the process and
be able to perform the attack.
So in the next couple of slides we will share a couple of examples and
see how an attack changes and
how the set of sequences would change as well, depending on
how the environment has been implemented.
So here, as we can see on the screen, we have
a virtual private cloud. So we have a network there with
public and private subnets.
So generally most of the web application servers are
in the public subnet and in most cases the
database servers are in the private subnet. So the
reason why the servers are in the private subnet is that they are not
directly accessible from everywhere else.
So if you were to try to connect directly to the database
servers or some of the servers there, you won't be able to connect.
So in this case we have a Kubernetes setup
where some of the servers are deployed in
the private subnet. However, there are some servers in the
public subnet such as the Cloud9 control instance,
which we should consider as high risk, especially if it's
accessible in the cloud. So if an external
attacker would attack the system, of course
one of the primary ways to do it is to attack
these ones in the public subnet. So being aware
of which areas are exposed is the first step.
And at the same time it's important for us to know that
it's not just the malicious actors
from the outside which could cause the attack. It's also
possible for internal employees of a company to
perform the attack. So it's important for us to perform
and have a set of something like defense in depth.
Even if you have secured a certain layer, it's important for
different layers to be secured and for the entire implementation
to be audited regularly. So here,
as you can see, that setup is high risk.
And if we were to add some security automation
work, one of the first things to review
would be the security of the ones in the public subnet.
At the same time, if you were the attacker, what would you do?
The goal is not to simply attack the system.
The goal is to steal something or to perform some tasks
that your system is not supposed to do. So for example,
if some of the servers have already been compromised,
the databases would probably have been dumped
and stolen and downloaded from your setup. That's one.
The second would be stealing the passwords.
Once the passwords have been stolen, the passwords
may be used against other systems.
And the other thing that hackers would do is they
can probably use your server to attack other websites
and systems. So there are a lot of things an attacker can do.
And if you think about the word hack, hack means you're
able to perform something using a certain tool
or system, which is not generally the common use case
for the tool. So your website may be used to do something else,
like attacking another system. So that
said, the attack vectors do change and
the ways an attacker would attack your systems would change depending
on how your system is implemented. So if you have servers
and your servers have vulnerabilities, and the vulnerabilities have been exploited
by the attackers, then they'll be able to use
the tools and the attacks which are appropriate
for that type of environment. If you have a serverless application,
even if you think that your application is secure,
there are still different ways to attack that kind of application.
So for example, if you have a serverless environment and
serverless setup, you don't manage the infrastructure.
However, the code that you've written inside, let's say the Lambda
functions, can be vulnerable to attacks. So if
you are unable to write secure code,
meaning your code can be exploited,
then of course hackers can directly inject something into
your Python code and then perform some operations
which your code shouldn't be doing.
So again, in a serverless setup, there are no
servers to manage, but still there's code that's written
from your end. So how do you secure that? The first step
there is to make use of existing
features of serverless services, for example,
Amazon API Gateway. There are some security
features there which you may check first before creating
your own script. And once you have identified that you need to write your own
script, then that's the time you try to add additional
security layers. So you may add a WAF or a
web application firewall, or you can create a custom script which
prevents access to certain directories
or files. So in this diagram, what we can
see here is that we've made use of Lambda@Edge,
which is basically a Lambda function where you can write
custom Python code, and it
gets invoked when a user accesses
your site. So before the custom logic here at
the bottom is executed,
the Lambda@Edge function, at the edge, close to the user, is executed
first. So you can add those security checks way
before a user gets to the API
Gateway. So that's one of the things that
you can do so that you'll be able to separate the security logic from
the custom code inside the Lambda functions.
So at least you're able to protect things from that layer itself.
At the same time, do not forget to also add security
code and custom logic near your
Lambda functions as well. As I mentioned earlier, you need to have
multiple layers of security there.
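
As a rough sketch of the kind of check that could run at the edge, here is a minimal Lambda@Edge viewer-request handler in Python. The list of blocked path prefixes is purely an assumption for illustration; the talk does not say which directories or files were blocked.

```python
# Hypothetical list of path prefixes that should never be served publicly.
BLOCKED_PREFIXES = ("/admin", "/.git", "/.env")


def handler(event, context):
    """Lambda@Edge viewer-request handler: reject blocked paths early."""
    # CloudFront passes the incoming request inside Records[0].cf.request.
    request = event["Records"][0]["cf"]["request"]
    uri = request["uri"].lower()

    if uri.startswith(BLOCKED_PREFIXES):
        # Returning a response object short-circuits the request at the edge.
        return {
            "status": "403",
            "statusDescription": "Forbidden",
            "body": "Forbidden",
        }

    # Returning the request lets CloudFront continue to the origin.
    return request
```

Keeping this check separate from the business logic means the Lambda functions behind the API can still apply their own validation as a second layer.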
All right, so now let's talk about security automation
with a purpose. What are the different ways we
can practice security properly? So here
we're going to share some common techniques on how to improve the
security posture of your systems and of course the
security awareness of your entire company. So here in these first
slide here, we can see that, oh, we can use Python
to do some automation work and integrate,
let's say, Nmap,
a network assessment tool with another tool, or maybe
another set of tools. If you are aware, and if you have used Nmap
before, you can technically use it by itself.
So even without additional automation work, you'll be
able to get it to work. And yeah, you just have to know the
different options, you just have to know how
to use it and also know other features that other people
do not know about. For example, people just think of Nmap as
a network assessment tool.
However, you can make use of existing scripts there by
making use of the, I think it's the --script argument.
So if you're able to use that option, then you
can also perform some additional security checks in your site.
So people think of this tool as a purely network assessment tool, but it
can do so much more. One of the reasons why
you might probably do this is if, let's say that you want some sort
of custom functionality that
makes use of Nmap and then you want to make use of another tool
and try to connect them. You can use Python to
bridge those two things. There are different ways to do this,
but yeah, if you're trying to build a system that's already
making use of Python and you want to integrate Nmap with it, then yeah,
you can do something like that. In the past,
I have done something like this and I wasn't aware of the fact that
there's already different ways to automate it. So one of the
things that I did in the past is I literally just used
Python code which runs bash commands inside
the Python code in order to trigger the Nmap command.
However, if you are already aware,
then you can technically install a library or a
package so that you just need to import Nmap
and then things are much cleaner from an implementation end.
So even if you're using Python, there are still different ways to make
use of Nmap in your script. And the recommended
way here is to look for a library or a package where you can literally
just import Nmap and there's an API that you
can use.
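
To make the bridging idea concrete, here is a minimal sketch that uses only the standard library: it builds an Nmap command for a quick top-ports check, asks for XML output, and parses the open ports so another tool can consume them. The function names are made up for this example. A wrapper package that you simply import, as recommended above, would replace the subprocess part.

```python
import shutil
import subprocess
import xml.etree.ElementTree as ET


def build_nmap_command(target, top_ports=10):
    """Build an Nmap command that scans the most common ports, XML to stdout."""
    return ["nmap", "--top-ports", str(top_ports), "-oX", "-", target]


def parse_open_ports(xml_text):
    """Return (port, service) pairs for every open port in Nmap XML output."""
    root = ET.fromstring(xml_text)
    found = []
    for port in root.iter("port"):
        state = port.find("state")
        if state is not None and state.get("state") == "open":
            service = port.find("service")
            name = service.get("name") if service is not None else "unknown"
            found.append((int(port.get("portid")), name))
    return found


def quick_scan(target):
    """Run the quick scan; only scan hosts you are authorized to test."""
    if shutil.which("nmap") is None:
        raise RuntimeError("nmap is not installed or not on PATH")
    result = subprocess.run(
        build_nmap_command(target), capture_output=True, text=True, check=True
    )
    return parse_open_ports(result.stdout)
```

The list returned by `quick_scan` is plain Python data, so it can be handed to a report generator or any other tool in the pipeline.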
So if you run Nmap and you want to,
let's say, list the top ports, if you want to do
this sort of check first before running an extensive scan,
then that's one of the advantages of using this type of setup. And if
you want to automatically generate a really amazing, really clean
report, then the tool in this box here can be replaced
with some sort of report generation
script or tool or service.
The other way to utilize your Python scripting skills
is to do a demo. So generally
when performing and when implementing a vulnerability management
program in your company,
a major factor would involve people and processes.
If people do not understand the importance of security,
if they don't understand the security concepts and the implications of what they're doing,
then it would be hard for you to enforce certain processes.
So one of the recommended ways to do this is through demos, and you
can technically have this sort of example where you have a password protected
zip file and you can create a custom Python script, maybe 20
lines of code, which basically just runs a brute force attack
on the password protected zip file. So this would help you perform
a quick demo on password strength. So people think,
okay, my password will be password. So maybe in 1
second, or maybe less than a second, an attacker would be able
to crack the password of your zip file.
So the harder it is and the more time it takes
to crack the password, the better. So with your Python
scripting skills, you can write a 20 line or 30
line script and you'll be able to perform this demo.
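
A minimal sketch of such a demo script could look like the following. It is not the script from the slides; the function names and the wordlist approach are assumptions. Python's zipfile module can only decrypt the legacy ZIP encryption scheme, so the demo archive would need to be created with that scheme, and the demo should only be run on files you own.

```python
import time
import zipfile
import zlib


def zip_password_works(zip_path, password):
    """Return True if `password` decrypts every member of the archive."""
    try:
        with zipfile.ZipFile(zip_path) as archive:
            archive.setpassword(password.encode("utf-8"))
            return archive.testzip() is None
    except (RuntimeError, zipfile.BadZipFile, zlib.error):
        # zipfile raises RuntimeError on a bad password; a false-positive
        # header check surfaces later as a CRC or decompression error.
        return False


def brute_force(check, candidates):
    """Try each candidate; return (password, attempts, seconds) or None."""
    started = time.perf_counter()
    for attempts, candidate in enumerate(candidates, start=1):
        if check(candidate):
            return candidate, attempts, time.perf_counter() - started
    return None
```

For the demo, you could call `brute_force(lambda p: zip_password_works("demo.zip", p), wordlist)` with a list of common passwords and show the audience the number of attempts and the elapsed time.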
And this is one way to not just focus on tech,
but also on people as well, because once they get
to understand the implications of their actions, then you'll be
able to introduce more concepts to your company, and have your
initiatives approved.
Next would be security automation tips. So this is a very exciting
part and I'll be sharing some of the tips
and techniques I'm using myself.
So the first one here is to make use of
Python context managers.
So generally when you're using context managers,
you're probably going to use this in the context of
opening a file, writing to a file, and then closing it.
What I've done in the past is in order for me to clean my
code a bit and to improve the logging. When I'm writing really
quick and small scripts, I make use of these context managers.
So as you can see on the screen, I created a custom function
where if you use the with
block statement, then before a certain set
of statements are executed, it would print the
label and then the word start, and then when the set
of statements have finished execution,
it will print the same label
with end. And the advantage with this very basic
approach is in addition to automatically making
your program or script easier
to understand, it's much easier to debug,
because if there's something wrong with your code in some sort of function,
then you'll easily know where the error happened
because when you're writing scripts, it's not going to work 100% right away.
It might break while you're writing it. And the faster you're able
to debug your scripts, the better. So it will definitely
be quite noisy in terms of when you're running things.
But yeah, as long as you're able to debug
your script really fast, then that would definitely help you save
so much time compared to having a noisy set of logs.
And of course you can modify these functions and disable logging
when your script is already working. So there are different
variations of this one. The other thing that I've done in the past is
also colorizing the logs. So if you're writing
automated scripts using Python and you're writing and building
security tools, it's more preferable to make
use of the color coding solutions where you
just add some characters
before and after certain string elements.
And then when it's rendered in the terminal, there will be
a color red or a color blue, and it's much easier
to read compared to a single colored set of logs.
So check that out.
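
Here is a minimal sketch of both ideas together: a context manager that prints a label with START and END around a block, with ANSI color codes wrapped around the text. The name `step`, the `VERBOSE` switch, and the FAILED line printed on errors are assumptions for this example, not a copy of the code on the slide.

```python
from contextlib import contextmanager

# ANSI escape codes; most terminals render these as colors.
BLUE, RED, RESET = "\033[94m", "\033[91m", "\033[0m"
VERBOSE = True  # flip to False once the script is stable


@contextmanager
def step(label):
    """Print colored START/END markers around a block of statements."""
    if VERBOSE:
        print(f"{BLUE}[{label}] START{RESET}")
    try:
        yield
    except Exception:
        # Always report failures, so the broken step is obvious in the log.
        print(f"{RED}[{label}] FAILED{RESET}")
        raise
    if VERBOSE:
        print(f"{BLUE}[{label}] END{RESET}")


with step("load targets"):
    targets = ["10.0.0.5", "10.0.0.6"]
```

When a step raises an exception, the log shows a START line and a FAILED line with no matching END, which points straight at the block that broke.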
All right, so here we are seeing a sample class, and I
generally use this when trying to create a script
which makes use of different configuration variables.
So when writing custom scripts which, let's say perform
a certain security task, one of the things that I do
in Python is I create utility classes. And these utility
classes would of course depend on the preference of the
person writing the script. So what does this
simple class do? So what
it does is if you have, let's say, a
nested dictionary where you have mail and SG
file uploader as the keys, and then you have other dictionaries inside.
Sometimes it's quite tricky,
and sometimes it's also confusing in your code
when you have to deal with configuration stored in a dictionary as well,
because of course when you're writing a script, it's better to make your script
as stateless as possible. So again, in this sample
script, the configuration in the dictionary
would of course be stored in some sort of .env
file or some sort of JSON file outside of the actual script.
And once you need to load it inside your code, you need
to be able to manage the
chain of access
variables. So that, let's say that if you need to access the
username of, let's say, mail, then instead
of having a really long line of code,
maybe you can just use node.mail.username as seen in line
15 and so on. So that allows
you to separate configuration from the actual code logic.
And there are different ways to do this. So again, this is just a quick
example, and you may try to look for other techniques
when you're trying to manage custom configuration.
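
As a minimal sketch of this kind of utility class, the following wraps a nested dictionary so that keys can be read as attributes. The class body and the sample keys are assumptions for illustration; the class on the slide may be implemented differently.

```python
class Node:
    """Wrap a nested dictionary so keys can be read as attributes."""

    def __init__(self, data):
        self._data = data

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            value = self._data[name]
        except KeyError:
            raise AttributeError(f"missing configuration key: {name}") from None
        # Wrap nested dictionaries so attribute access can be chained.
        return Node(value) if isinstance(value, dict) else value


config = {
    "mail": {"username": "alerts@example.com", "smtp": {"port": 587}},
}
node = Node(config)
print(node.mail.username)   # same value as config["mail"]["username"]
print(node.mail.smtp.port)  # same value as config["mail"]["smtp"]["port"]
```

A missing key raises an AttributeError that names the key, which is usually easier to track down than a KeyError buried in a long chain of lookups.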
So once you have your script ready, you will have different environments.
Let's say you have a custom scanner script.
So your custom scanner script, when in staging, would make use of a different
set of configuration files. Once it needs to be used in
a production environment, then you can just change the configuration file
and then run the same script without changing anything inside, and you will be able
to scan the production environment and then finally generate a report.
So those are some techniques, because when you're writing a script, the last thing you
want to do is hard code credentials inside that script.
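
One way to switch configuration files per environment without touching the script is sketched below. The `SCANNER_ENV` variable name and the `config/<environment>.json` layout are assumptions made up for this example.

```python
import json
import os
from pathlib import Path


def load_config(config_dir="config"):
    """Load config/<environment>.json, where the environment comes from
    the (hypothetical) SCANNER_ENV variable and defaults to staging."""
    environment = os.environ.get("SCANNER_ENV", "staging")
    path = Path(config_dir) / f"{environment}.json"
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

Running the same script with `SCANNER_ENV=production` would then pick up the production targets and credentials from a file that lives outside the code.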
All right, so those are just some of the techniques that I've
used, but that should do for now as we'll be talking about other things
also in this talk.
Next, we're going to talk about automated data integrity layer.
So if you were to ask yourself, what's this
and why do we need to talk about data integrity
of a system? So when an attacker attacks
a certain system, let's say a banking application,
after an attacker is able to, let's say, steal the credentials of the
users, of course, the attacker may try
to modify the values inside
the databases, because once an attacker, let's say,
has a bind shell or reverse shell or whatever access the attacker
is able to have, the next step is to perform
something and change something and then steal something. And sometimes you cannot
steal everything inside. What the attacker can
do is steal the passwords and then use the passwords to
access certain web portals
or web admin pages. But then even there,
even if you have web admin access, sometimes being able
to directly modify what's inside the database is the easier
approach to take. So once the records in the database
have been updated,
then if you add, let's say, 1 million to a certain account,
then it would be very hard to track how that
account had that amount, especially if the system did not
have proper logging in place when there are transactions.
So being able to detect any sort of
data integrity issues inside your application would
help you identify if your system has been attacked, because in most cases,
people have no idea that their systems have already been attacked.
So here we can see that one plus one equals two.
So one of the ways to detect security attacks
and attacks which involve data integrity,
which involve the data integrity of your systems, is that if the numbers
do not add up, then maybe it might have been modified not
via your application code, but from somewhere else.
And sometimes people get confused when the numbers have
changed a bit because it may be due to developers' mistakes,
but it can also be due to an external person attacking
your system. So one of the techniques
that you can do is when your application is storing something in the
database instead of you trying to write auditing
scripts, which may be a bit reactive,
some attacks may be detected much faster
when the script is inside your web application.
So this may seem counterintuitive to some of us,
but if you're able to write a custom data integrity checker
script which does not significantly add
to the transaction time. So let's say that
the transaction time is 0.5 seconds and your custom data integrity
checker script would make it 0.6 seconds. Then you'd
be able to perform some checks before and
after a transaction happens.
Let's say that the write operation was able to write
one plus one equals two and then the total would be two,
right? But then, a hacker was able to
change the database content, change the database values,
and the number suddenly became three.
So what you can do in your custom data integrity checker script,
using, let's say, Python and SQLAlchemy,
you'll be able to detect that the values
do not seem to be correct because there are some formulas
which are used to check the values
and compare them with the logs in real time. So if
the set of records are loaded in real time and there seems
to be something which does not seem
to add properly, then you can already tag that for
review. Then you can perform the review manually.
So that's one of the ways to do things because in addition to
being able to prevent the issues from
being deployed to production because of a developer's mistake,
you'll be able to detect if your system is also attacked,
especially on the database side of things.
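
To illustrate the "numbers do not add up" check, here is a minimal sketch. It uses the standard library sqlite3 module instead of SQLAlchemy so that it runs without extra packages, and the accounts and transactions tables are hypothetical. The rule it checks is that a stored balance must equal the sum of the logged transactions for that account.

```python
import sqlite3


def find_mismatched_accounts(conn):
    """Return (account_id, stored, expected) where balance != sum of entries."""
    query = """
        SELECT a.id, a.balance, COALESCE(SUM(t.amount), 0) AS expected
        FROM accounts a
        LEFT JOIN transactions t ON t.account_id = a.id
        GROUP BY a.id, a.balance
        HAVING a.balance <> COALESCE(SUM(t.amount), 0)
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER);
    CREATE TABLE transactions (account_id INTEGER, amount INTEGER);
    INSERT INTO accounts VALUES (1, 2), (2, 5);
    INSERT INTO transactions VALUES (1, 1), (1, 1), (2, 5);
""")
print(find_mismatched_accounts(conn))  # nothing to flag yet

# Simulate a direct database edit that bypasses the application.
conn.execute("UPDATE accounts SET balance = 3 WHERE id = 1")
print(find_mismatched_accounts(conn))  # account 1 is now flagged for review
```

In a real application the check would be scoped to the accounts touched by the current transaction, so that it adds only a small amount to the transaction time.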
The next topic would be on automated vulnerability
management. So here
we're going to make use of vulnerability assessment tools.
But it should be critical for security
teams to know the different ways to secure
an environment and make use of assessment tools.
Sometimes people think that oh, I have this amazing security tool,
I'll be able to secure the
entire system by just running the tool and then patching the
vulnerabilities. However, the timing is also
important because when you're running a manual tool,
it will require human hours.
And if you are going to run that tool, let's say, once every month,
then there's that one month gap where
your system might have been compromised already because
of a vulnerability which has been deployed accidentally
by your implementation team. So one of
the ways to do this is to look for tools, let's say Amazon Inspector,
which automatically runs when something in a system changes.
So one of the cool things with this tool is
if you push a new container, then Amazon Inspector
automatically runs an assessment on the container. And then when
something changes in your instance, let's say you install
something there, then the assessment tool also runs automatically and
it generates a report which you can read
and review. So this means that instead of
having a security team run
scanners manually regularly, you can
depend on tools which run automatically when a set of
changes happen. So this allows things to
be easier when it comes to your implementation team's collaboration with
the security team. So that said, one of the techniques you
can do there is, given that you're using Amazon Inspector, you can make use of
Boto3 and Python
to automatically set up this entire security
setup. So that when you have a new project, let's say you have a virtual
private cloud, instead of you manually setting up Amazon Inspector
and all the prerequisites required to install Amazon Inspector,
you can just have a Lambda function with Boto3 and Python
and some custom scripting stuff where, yeah,
everything is set up automatically whenever you have a new environment.
So there, the advantage there is choose the
right tools and then use Python to do
some additional custom scripting work to
make it easy for your entire company.
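
A minimal sketch of such a Lambda function is shown below. It assumes the current Amazon Inspector API exposed by Boto3 as the `inspector2` client, and it only covers the step of switching scanning on; the function names and the choice of resource types are assumptions for this example.

```python
def enable_inspector(client, resource_types=("EC2", "ECR")):
    """Turn on Amazon Inspector scanning for the given resource types."""
    response = client.enable(resourceTypes=list(resource_types))
    # Each entry reports the per-account activation status.
    return [account["accountId"] for account in response.get("accounts", [])]


def handler(event, context):
    """Lambda entry point, meant to run when a new environment is created."""
    import boto3  # bundled with the AWS Lambda Python runtime

    return enable_inspector(boto3.client("inspector2"))
```

Passing the client in as a parameter keeps the logic easy to test with a stub, and the Lambda role would need permission to call the Inspector enable action.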
The last one would be on secure infrastructure as
code. So when you're trying to secure your system,
sometimes people think of security as having
a vulnerability assessment tool and
patching the systems and making sure that there are no vulnerabilities
which can be exploited. However,
things are trickier than
what they think. For one thing, not all vulnerabilities
can be patched right away. For example, you have a new version
of a library which would solve any
vulnerabilities existing in your system.
Then if upgrading that library version would
make your system unstable, then sometimes the
patches are delayed. So there are different ways to
manage the vulnerabilities of your system.
And one way to do it is through secure infrastructure
as code. And with this approach you'll be able to
easily secure environments
in a layered fashion.
So let's say that you have this sort of environment and you want
to convert this to code. So once you're able to convert this to
code, you'll be able to create
new environments and be able to
run different sort of security scanners and tools in
those temporary environments. So once your application
has been converted to a certain template
or a certain set of scripts, then creating
new environments that's a clone of this original environment would be easier.
And if you were to integrate this with some sort of DevSecOps pipeline,
then that's possible as well. So in your pipeline you push your code,
an environment is automatically generated, a set of security
scripts and scanners are executed on your temporary
environment. And once the scan has finished,
you can now take down and delete that environment and then push
your changes to production. So in that way you'll
be able to make sure that your production environment is not affected by
the scanners, because sometimes the vulnerability scanners may be harmful and
sometimes noisy, and you want to do that in an isolated environment
which is very similar or maybe a clone of your production environment.
So one of the techniques you can do there is to make use of Boto3,
Python and CloudFormation to perform this
kind of implementation and technique.
So with this one you can make this very scalable and then you can
reuse this also in other projects.
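
The create, scan, and delete flow could be sketched roughly as follows, assuming a Boto3 CloudFormation client is passed in as `cfn`. The function name and the `run_scans` callback are hypothetical; the callback stands for whatever scanners your pipeline runs against the temporary environment.

```python
def scan_in_temporary_stack(cfn, stack_name, template_body, run_scans):
    """Create a stack, run `run_scans(stack_name)`, then always delete it."""
    cfn.create_stack(StackName=stack_name, TemplateBody=template_body)
    try:
        cfn.get_waiter("stack_create_complete").wait(StackName=stack_name)
        return run_scans(stack_name)
    finally:
        # Tear the environment down even if creation or a scan fails.
        cfn.delete_stack(StackName=stack_name)
        cfn.get_waiter("stack_delete_complete").wait(StackName=stack_name)
```

In a pipeline this would be called with `boto3.client("cloudformation")`, a unique stack name per build, and the same template that is later used for production.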
At the same time, when you're using CloudFormation, instead of having one
single template for all the resources and properties
of your entire environment, you can do it in a layered approach.
So what you can do is you can have CloudFormation
templates, let's say JSON files or
YAML files which only focus on the network
level, or maybe the IAM level where you list
the permissions there. The advantage there is you'll be able to
automate the rollout of configuration of your
environment from staging to prod using
code. So instead of you doing things manually,
you'll be able to test things first in staging,
maybe close the Redis port using CloudFormation
templates and when you have tested that it is working and there's
no impact on your application, then you'll be able to use the
same template in your production environment. So yeah, so this one is
very practical to use and it's easier to identify
also from a security audit standpoint when the security
configuration changes have been performed, because it's versioned
and you'll know when these changes have been implemented.
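
As a small sketch of a network-level layer, the following builds a CloudFormation template as a Python dictionary and checks it before rollout. The security group, its single HTTPS rule, and the helper name are assumptions for illustration; 6379 is the default Redis port.

```python
import json

# Hypothetical network-layer template: one security group, HTTPS only.
NETWORK_TEMPLATE = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "AppSecurityGroup": {
            "Type": "AWS::EC2::SecurityGroup",
            "Properties": {
                "GroupDescription": "Application tier",
                "SecurityGroupIngress": [
                    {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
                     "CidrIp": "0.0.0.0/0"},
                ],
            },
        },
    },
}


def publicly_open(template, port):
    """Return names of security groups that expose `port` to 0.0.0.0/0."""
    exposed = []
    for name, resource in template["Resources"].items():
        if resource["Type"] != "AWS::EC2::SecurityGroup":
            continue
        for rule in resource["Properties"].get("SecurityGroupIngress", []):
            in_range = rule["FromPort"] <= port <= rule["ToPort"]
            if in_range and rule.get("CidrIp") == "0.0.0.0/0":
                exposed.append(name)
    return exposed


assert publicly_open(NETWORK_TEMPLATE, 6379) == []  # Redis port stays closed
template_body = json.dumps(NETWORK_TEMPLATE)  # ready to pass to CloudFormation
```

Because the template lives in version control, a change that opens a port shows up in the history, and a check like this can run in the pipeline before the same template is applied to production.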
And finally, once you have this approach, as mentioned earlier,
you'll be able to create multiple environments. So what you can
do is once you're able to easily roll out and
create a complete environment in
let's say ten minutes to 15 minutes, then you can use one
environment for manual testing. Because sometimes if people want to perform
manual penetration testing, then yeah, you can use that environment there.
At the same time you can have another environment which is a clone of the
other environment and use that for automated security
scans and tests. Then finally, since they're
literally all clones of each other, you can now create
your own production environment or maybe update your existing
production environment, which is a clone of this environment.
And you may not need to test that since they're literally
100% clones of each other. And then once you
no longer need the other environments, then you can delete those environments.
So these are some of the techniques that you can use,
and as long as you're not trying to cheat your way when trying to update
the configuration files, then you'll do fine, especially when
implementing infrastructure as code.
So there, that's pretty much it. We were
able to talk about a lot of topics. By being able to
identify what you want to secure, being able to identify
how you're going to approach security, and being able
to use Python properly when writing automated scripts,
you'll be able to perform security automation in the cloud properly
and you'll learn as you go along,
especially when dealing with security requirements.
So that's pretty much it. Hope you learned something from my
talk and have a great day ahead.