Abstract
Cloud platforms face all kinds of security issues that are frequently treated as a matter for security engineers, not for developers. As a result, security is treated as separate from development. Although sponsors have promoted the integration of security practices into all stages of software development, many developers think security is a topic for other engineering fields. As a result, despite having tools such as Snyk and Black Duck, developers are missing the benefits they could get from their cloud platforms.
This talk will show the benefits of practicing Security Chaos Engineering (SCE) by empowering developers to leverage the power of security topics directly. SCE offers many advantages, including reduced remediation costs, less disruption to end users, and greater confidence in production systems. In this talk, we are going to show how this practice has helped us to develop a culture based on security among software developers.
Methodology:
* Present the foundation of the software development life cycle.
* Explore the integration of SDLC, resilience, and security using tools such as Snyk and Black Duck.
* Analyze why developers do not include security topics in their activities.
* Present a novel practice titled Security Chaos Engineering.
* Show how democratizing security among software developers has shown us the benefits of the distributed, immutable and ephemeral (DIE) model.
* Show some of the experiments that we are trying at ADL to promote a culture based on security using SCE.
Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hi everybody, we are really excited to be here.
Thank you very much for attending. The title of our talk is
Securing the Cloud: Empowering Developers to Practice
Security Chaos Engineering. Nice to meet you.
We are Yury Nio and Jonathan Hill. We work as
site reliability engineers for ADL Digital Labs,
a company in Colombia that provides technology and innovation services.
We are chaos engineering advocates. We are promoting the adoption
of this discipline in our country. Cloud platforms are facing
security issues that are frequently a matter for security engineers,
not for developers. As a result, security is treated as
separate from development. Today we are talking about Security Chaos
Engineering, a novel discipline that offers a methodology
to empower developers to leverage the power of security
in their roles. These are the topics for today.
Jonathan will provide a foundation about cloud and security.
He is going to show some key points from the Well-Architected
Framework. I am going to explore the integration between
software development, reliability and security.
At this point I am going to analyze why developers don't
include security topics in their activities. With this context,
I am going to present a novel practice named Security Chaos Engineering.
Finally, Jonathan is going to show us how democratizing
security among software developers works. He is going to show
us the benefits of the distributed, immutable and
ephemeral framework based on security chaos engineering.
So go ahead Jonathan. Thank you Yury for this
big summary. Hi everyone, my name is Jonathan.
I will talk about the cloud and security and
some scenarios. We start by talking about
the Well-Architected Framework. This one
is from AWS; you may know this
slide, which talks about the
architecture pillars: operational excellence, security,
reliability,
performance efficiency and cost optimization. These are the
big pillars that define every cloud.
You need to know how to work
security into everything that you adopt,
right? Then we talk about the bigger
cloud providers,
starting with AWS.
AWS talks about implementing a strong
identity foundation: implement the
principle of least privilege and enforce separation
of duties with appropriate authorization for each interaction
with AWS resources. Centralize identity
management and aim to eliminate reliance
on long-term static credentials, right? The next is
enable traceability: monitor everything
that you do in your cloud.
It is very necessary to
define this in your infrastructure, right? And the next is
apply security at all layers. In AWS
you have VPCs, subnets,
EC2 instances, S3 buckets.
You can define security at all these layers:
security in your IAM groups,
security in your security groups, which define who
is granted access to these resources.
The next one talks about automating security best practices.
The best practice for security
is to try to automate all
the things that are related to security; if you do
this, you will
be better in your security, right?
Protect data in transit and at rest. Be very careful
if you manage
very sensitive data, correct?
And the next one is prepare for security events: train
your team, train your people
so they know what security is, how to do security,
what the business is and how security
supports the business, right? Because with
security you have a very big problem
if your reputation is involved in some
security incident, so practice
that. The next one is Azure
security. It talks about
defense in depth: protect your information from
the beginning, from installing the firewalls to who accesses
this data. It's very important to define who
accesses these objects in the Azure
cloud. The next is identity management: take the benefits
of single sign-on, then manage
all your access with controls,
with AD, with your Active
Directory, with your company directory.
That's a very good practice for accessing
this cloud. The next talks
about infrastructure protection, defined the same way that
AWS defines it: in depth, how to protect
all the things that exist in this ecosystem.
That's a very nice practice. Encryption: encrypt
your data at rest and in transit;
that is very nice to have in
your infrastructure. Network security: define who
is granted access to your data,
who accesses your application and how
to apply security controls;
it's very nice to include that in every part of your software.
Application security: define it from requirements gathering,
training your people to generate these requirements
for the application that you build, so that it's safe
from the beginning. And the last
one talks about GCP security. It talks about
these principles: implement
least privilege with identity and authorization controls,
which is the same, in other words,
as what is defined by the other cloud infrastructure
providers. Build a layered security approach: implement
security at each level in your application and infrastructure, applying
a defense-in-depth approach. Use the features in each
product to limit access and use encryption,
right? Automate deployment of sensitive tasks:
if you have tasks on your data, such as
generating reports, you may need to
create automated tasks so that people
don't execute these scripts by hand, because
if people have access to these scripts, they have access
to the data, and that is very difficult to control if
you don't have very granular
controls. If you have
automated deployment, you have
these tasks automated. It's very
nice to have because your security is
better. After automating deployment of sensitive tasks,
they talk about implementing security monitoring.
In all cloud providers, define
how you deploy things for the
infrastructure and for your application. If you
define, for example, pipelines to generate these
resources for my application, for my infrastructure,
for my business objectives, you can define
some tasks that protect
these steps in everything related to
security. That's right. And with that you
apply best practices to the whole model that
you define for your infrastructure and your business.
Thank you Jonathan. The cloud computing models presented by Jonathan
are dynamic and complex, which makes it difficult to detect threats
and consequently to predict cyber attacks. As a result,
different systems are designed to respond to failures
in quite different ways. In the absence of an adversary,
systems often fail safe. Fail-safe behavior
can lead to obvious security vulnerabilities. To
defend against an adversary who might exploit a power
failure, we could design a door to fail secure and
remain closed when not powered. The primary reliability
risks are not malicious in nature, for example a bad
software update or a physical device failure.
On the other side, security risks come from adversaries
who are actively trying to exploit system vulnerabilities.
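The difference between failing safe and failing secure can be sketched in a few lines of Python. This is only an illustrative model: the `Door` class, the `FailureMode` enum and the badge check are hypothetical names invented for this sketch, not part of any real access-control product.

```python
from dataclasses import dataclass
from enum import Enum


class FailureMode(Enum):
    FAIL_SAFE = "fail_safe"      # opens on failure: favors availability
    FAIL_SECURE = "fail_secure"  # stays closed on failure: favors security


@dataclass
class Door:
    """Hypothetical electronic door lock, used only for illustration."""
    mode: FailureMode
    powered: bool = True
    locked: bool = True

    def power_failure(self) -> None:
        # The lock loses power; its failure mode decides the resulting state.
        self.powered = False
        self.locked = self.mode is FailureMode.FAIL_SECURE

    def can_enter(self, has_badge: bool) -> bool:
        if self.powered:
            return has_badge      # normal operation: the badge decides
        return not self.locked    # no power: the failure mode decides


safe_door = Door(FailureMode.FAIL_SAFE)
secure_door = Door(FailureMode.FAIL_SECURE)
safe_door.power_failure()
secure_door.power_failure()
```

After the simulated power failure, the fail-safe door lets anyone in, which is exactly what an adversary who can cut the power would want, while the fail-secure door keeps everyone out, including legitimate users. Which trade-off is right depends on the system.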
When designing for reliability, we assume that
some things will go wrong at some point. When designing
for security, I think it's different because we must
assume that an adversary could be trying to make things go
wrong at any point. Both security and reliability
are concerned with the confidentiality, integrity and
availability of our systems, but they view these
properties through different lenses. They have traditionally been
considered fundamental attributes of
secure systems. The key difference between the
two viewpoints is the presence or lack of a malicious
adversary. A reliable system must not breach confidentiality
accidentally, while a secure system must prevent an
active adversary from accessing, tampering with or
destroying confidential data. Confidentiality, integrity and
availability are related to
these two concepts, reliability and security.
According to Google, reliability is the most important feature
of our systems. Considering this, to reach
it they must be secure. Probably you are wondering
where to begin integrating security and reliability principles
into your systems. The first and most important
step in addressing security and reliability
issues is to educate developers. However, even the best
trained engineers can make mistakes: security experts
can write insecure code and SREs can miss
reliability issues. Considering that it's difficult to keep
in mind the many considerations and trade-offs involved in building
a culture based on secure and reliable systems,
we started by evaluating the situation in our
company. So we ran a survey among developers with
the aim of knowing how much they know about security
and what their perception is of the importance of
this topic. We interviewed 130 engineers at
ADL, of which 25% were
software architects, 16% said they were front-end engineers,
60% were backend engineers and
just 2% and 3% said
they were full-stack and quality engineers respectively.
Although we were expecting that a percentage of them
wouldn't show interest in security topics, to the first question,
"Do you have interest in security topics?", almost 15% said they didn't
have interest in those topics. That is an important percentage if
we consider that the group is mostly composed
of backend engineers. OWASP is an online
community that produces freely available articles,
methodologies, documentation, tools and technologies
in the field of web application security. It is a great reference that
every engineer building a digital solution should
know, so we considered it could make sense to ask about
its practice. To our surprise, the percentage of
people who didn't practice OWASP exceeded one third of
the respondents. About static analysis: static analysis
is about analyzing and understanding computer
programs by inspecting their source code without executing
or running them. Static analyzers parse the
source code and build an internal representation of the program
that is suitable for automated analysis. This approach
can discover potential bugs in source code, but
it is also a great tool to discover software vulnerabilities, preferably before
the code is checked in or deployed in production. To the
question "Do you run static analysis?",
23% of developers answered that they don't have security steps
enabled in their pipelines. This group includes
five software architects, 14 backend engineers and only one
front-end engineer. 100 people have security integrated
in the continuous integration and continuous deployment process.
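To make the static analysis idea described above concrete, here is a minimal sketch using Python's standard `ast` module. It parses source code into a syntax tree, never runs it, and reports direct calls to `eval` and `exec`. The rule set and the function name `find_risky_calls` are assumptions made for this illustration; real analyzers such as the ones named in the survey apply far richer rules.

```python
from __future__ import annotations

import ast

# Illustrative rule set only, not a complete security policy.
RISKY_CALLS = {"eval", "exec"}


def find_risky_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, function name) for each direct call to a risky builtin."""
    tree = ast.parse(source)  # builds the internal representation; nothing is executed
    findings = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in RISKY_CALLS
        ):
            findings.append((node.lineno, node.func.id))
    return sorted(findings)


sample = "x = input()\nresult = eval(x)\nprint(result)\n"
findings = find_risky_calls(sample)
```

For the three-line sample, `findings` holds a single entry pointing at the `eval` call on line 2. A check like this can run as a pipeline step and fail the build when the list is not empty.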
Finally, we asked them about the tools
that they had integrated into their development environment.
At ADL, our Jenkins pipelines run steps for measuring
the quality of the code using Sonar. That is the reason for
the first value: 49 of them use Sonar. But when we
asked about tools or plugins for identifying vulnerabilities
in the dependencies of the code, the values were low. Just three
people use Black Duck and two use
Snyk. Finally, just one person
uses Fortify and Veracode,
two tools for building secure software fast by finding security
issues early and fixing them. This is our conclusion from the survey:
14% of engineers don't show interest in security issues.
That is really, really important because it
poses a challenge for us: to motivate and create
a security culture among software developers. This group of
people is mostly formed by backend engineers. That is an important thing
to think about. We need to motivate them
and we need to generate motivation strategies for them.
So let me move to another section of this presentation.
It is clear that we have a problem in development which can
be extrapolated to the cloud. Considering that there is
a group of people who don't show interest in security topics,
most probably an exploited vulnerability is only a
matter of time. Software development is a dynamic profession.
Source code changes daily, and once in a while
so does the way the local development environments
need to be set up. There are many benefits of integrating
security in the development process. We have to
rethink how developers see the cloud in
terms of software development and adoption. In the early days,
we performed the software development lifecycle stages
offline and on premises. You may remember when developers
used their computers as terminals to access early versions
of the World Wide Web, helping them find answers to
problems. All right, so, thanks to the Internet, software
as a service solutions quickly brought significant
security vulnerabilities. Nowadays, digital reliance, trust and
open business have served to show how important a secure
software development lifecycle is for business, customers and
society. A common security bug can lead to catastrophic
breaches if undetected. A 2019 study
found that, out of 32 web applications,
82% of vulnerabilities were located in the application
code itself. Hackers can attack users
in nine out of ten web applications.
Attacks include redirecting users to a hacker-controlled
resource, stealing credentials
in phishing attacks and infecting computers with malware.
Unauthorized access to applications is possible on
30% of sites. In 2019, full control
of the system could be obtained on 16% of web
applications. On 8% of systems,
full control of the web application server allowed
attacking the local network. On average, each system contained
22 vulnerabilities, of which four were of
high severity. It is a fact: we need to secure
and guarantee reliability. Code to date
has led to the growth of chaos engineering, where
resilience is built into code by design and methodology.
Security should be front of mind for both security
engineers and developers. That is a fact. That is our conclusion of
this first part. Organizations must offer training
and build culture internally. What can we do if
there are developers who don't like security, as I
mentioned in the survey? We have a proposal here:
use Security Chaos Engineering. Security Chaos Engineering
is the identification of security control failures
through proactive experimentation to build confidence in
the system's ability to defend against malicious conditions
in production. This definition was promoted in this book, Security Chaos
Engineering, published in April of last year.
I have highlighted six words that
are valuable in the definition provided by Aaron: security,
failures, experiments, which is super important
because this discipline is based on the scientific method,
confidence and defense, because it is about achieving
resilience, and lastly production, because the theory
says that we should run experiments in the production environment,
although it does not necessarily have to be so. We can't expect
traditional teaching methods such as classroom-based
learning to change our developers' mindset
on secure coding. Gamified developer programs are
a great way to engage developers and actively test
their secure coding skills. Chaos game days are based on game days,
and now I am going to provide some definitions
related to that. A definition from AWS says
that game days are interactive team-based
learning exercises designed to give players a
chance to put their skills to the test in a real-world,
gamified, risk-free environment. Most importantly, they are
an extremely fun way to learn more about the potential
of a technology. As a form of game day, a chaos game day
is a practice event that can take a whole day,
although it usually requires only a few hours.
The goal of a game day is to practice how you,
your team or your supporting systems deal with
real-world turbulent conditions. That is the objective of this practice.
So this is a framework provided by Russ
Miles. The framework has three phases: before,
during and after. Before, we pick
a hypothesis, pick a style, and decide who,
where and when the event will run.
So after that,
during the during phase, detecting the situation is the
objective. Other activities include taking
a deep breath, communicating, visiting dashboards in the
observability tools, analyzing data, proposing solutions,
applying them and solving the incident. And finally the
last phase is for writing a post-mortem. In this phase
we analyze what happened, what the impact of
the incident was, what the duration was, what the resolution
time was and what the action items are.
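The before, during and after phases can be sketched as a small experiment runner. Everything here is an assumption made for illustration: the `Experiment` and `PostMortem` classes, the dictionary standing in for a cloud account, and the choice of an unencrypted storage bucket as the injected condition. A real game day would act on real (ideally non-critical) resources and real detection tooling.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Experiment:
    hypothesis: str
    inject: Callable[[dict], None]   # introduces the failure condition
    detect: Callable[[dict], bool]   # the security control under test


@dataclass
class PostMortem:
    hypothesis: str
    detected: bool
    action_items: list = field(default_factory=list)


def run_game_day(experiment: Experiment, environment: dict) -> PostMortem:
    # During: inject the condition, then observe whether the control reacts.
    experiment.inject(environment)
    detected = experiment.detect(environment)
    # After: record the outcome so it can feed the post-mortem.
    report = PostMortem(experiment.hypothesis, detected)
    if not detected:
        report.action_items.append("Fix or add the control that missed the condition")
    return report


# Before: pick a hypothesis and define the experiment.
def create_unencrypted_bucket(env: dict) -> None:
    env["buckets"]["chaos-test"] = {"encrypted": False}


def encryption_audit(env: dict) -> bool:
    # Simulated control: flags any bucket that is not encrypted.
    return any(not bucket["encrypted"] for bucket in env["buckets"].values())


env = {"buckets": {"reports": {"encrypted": True}}}
experiment = Experiment(
    hypothesis="An unencrypted bucket is flagged by the encryption audit",
    inject=create_unencrypted_bucket,
    detect=encryption_audit,
)
report = run_game_day(experiment, env)
```

If the control had failed to notice the injected bucket, the report would carry an action item instead of a clean result, which is the kind of finding the post-mortem phase is meant to capture.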
So let me talk about
some examples for practicing security chaos engineering in
a game day. One experiment
is to introduce latency on security controls. Others: drop a
folder like a script would do in production;
software secret clear-text disclosure; disable
service event logging; permission collisions, that is, provoke
permission collisions, for example in AWS; API gateway
shutdown; create an
unencrypted S3 bucket; or finally
disable multifactor authentication. Impact of security chaos in
general: in the previous slides we talked about chaos and how
these practices can generate more value in
your teams and generate some practices in our teams.
And now we're talking about how our teams
can generate this value, these objectives,
using chaos, and what can
generate this impact in our teams.
Right. One big
problem with this part is
that you need to take a
large amount of data to summarize and
correlate, and generate some patterns from this data.
You could use BigQuery or some other service
from your cloud, and have the cloud generate
these patterns. For me that's not easy to
use, but you can use it and define what is
useful and what is not useful. My
cloud provider has a lot of resources
that you can define, and all
parts of the cloud generate this for me, so
I don't need to build very big tools.
I can use some AI to process
this part of my data that I have
in my storage,
then correlate these logs and
make some part of my job
easier. Right. And in the next slide we're
talking about the impact. What is the impact on my
teams? What happens with my teams? First,
what matters most is
your requirements gathering and architecture design
for security. This is one of the focus points of
your software lifecycle, because you
can prepare your people and make plans for
this part. But if your architect
or the person who accompanies your
business doesn't generate this value for the
company, it's very difficult
to have that view across the whole team. Every
part of the team is very important in
every part of the software, and you need to
generate this value for your team,
right? Then you need to make plans
to train your people so that these people can
handle topics like
these and generate these requirements
the correct way. Right. Then we'll
be talking about continuous testing, about the testing
team, which is very important. The first focus
obviously is the customers
and the new features that generate value for the
customers. But the second focus
for this team is security, how security plays
a big, important part
in this SCE. Because that team
will generate more requirements for my
development team than my business does.
If my QA
team can create these plans,
some tests that may surface
some security issues,
my development team is stronger when it
builds more software for my customer, so
it's a very big part of my team.
Every part of
the team is important, obviously. But if you
put focus on this part, the other
parts of your team follow, because that
carries over to all the teams,
including developers, right?
An opportunity to involve the business. An example:
asking whether to allow
logging in from more than one browser.
That could be one of the
ways that an attacker could find
a way into this part.
And in that part you need to involve your
customer in these requirements, right?
The high importance of secure dependencies at the time of software
design and implementation. If you decide
in your team, in your software team, to use
open source libraries, be very careful because you
need to provide security for that.
It could be that developers use an
open source library that opens some
holes in
my software, and it's very difficult to identify that when your
software is in production. But if you define
how to
secure these dependencies and how to run
scans on these libraries, you generate more value for your
customers, and all the value that you generate with your developer team,
right? That is a very nice way
to generate this impact.
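A dependency scan of the kind described above can be sketched as follows. This is a toy under stated assumptions: the package names and the `KNOWN_VULNERABLE` table are invented for the example, and the parser only understands simple `name==version` lines. Tools such as Snyk or Black Duck, mentioned earlier in the talk, use real advisory data and resolve full dependency trees.

```python
from __future__ import annotations

# Hypothetical advisory data: package name -> versions with known issues.
# In practice this would come from a scanner or an advisory feed.
KNOWN_VULNERABLE = {
    "examplelib": {"1.0.0", "1.0.1"},
}


def parse_requirements(text: str) -> dict[str, str | None]:
    """Map each package name to its pinned version, or None when it is not pinned."""
    pins: dict[str, str | None] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
        else:
            pins[line.lower()] = None
    return pins


def audit(pins: dict[str, str | None]) -> list[str]:
    """Report unpinned packages and pinned versions found in the advisory table."""
    findings = []
    for name, version in sorted(pins.items()):
        if version is None:
            findings.append(f"{name}: not pinned")
        elif version in KNOWN_VULNERABLE.get(name, set()):
            findings.append(f"{name}=={version}: known vulnerable")
    return findings


requirements = """
examplelib==1.0.1   # pinned, but listed in the hypothetical advisory table
otherlib            # unpinned
safelib==2.3.0
"""
findings = audit(parse_requirements(requirements))
```

Running the audit at design and build time, rather than after release, is the point: the findings are cheap to act on before the software reaches production.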
Now some recommendations that
at ADL we created with Yury
and all the security teams, which are very important at
this moment, right? Use algorithms to protect
sensitive data. Use algorithms that use larger
encryption keys, not 256-bit
keys; it could be in the thousands.
Use a popular algorithm that
is easy to decrypt with
for your team. If your
security team is very mature, it could create
algorithms to protect your
data. If you have this in your
teams, very nice, because it's part of the software
too. Don't leave clear data in logs.
Some developers use clear data to debug the
application, but in production it's a very different
matter and it's very painful, because if
you leave clear data in logs
and these logs are stolen, you have
a big problem, right? So
include as little data as possible
in the logs from your developer team.
If you need to debug in production, there are
other ways to do this debugging and to see how your application
behaves in production for
your developers,
right? Use MFA for
critical application actions. Multifactor authentication is
very useful for your actions,
not only for login. You can use this strategy
to create a more secure approach in your
software, right? Use short-lived, one-time
links for documents to be delivered. If you generate
some PDF for your customers, you could set a
lifetime for these documents, or an
OTP: one access to these documents
and that's it, then remove it from your storage. That is a
good practice because you don't have to keep
everything of this PDF. Another strategy is to generate a
hash. When the hash is received from
your customer, the system automatically generates a PDF.
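The short-lived, single-use link idea can be sketched with a signed token. This is a minimal illustration, not the implementation used at ADL: the token layout, the in-memory set of used tokens and the names `make_token` and `redeem` are all assumptions. The signature uses HMAC-SHA256 from Python's standard library.

```python
import hashlib
import hmac
import secrets

# Assumption: the key stays on the server and is never sent to clients.
SECRET_KEY = secrets.token_bytes(32)
# In-memory for the sketch; a real service would need shared, persistent storage.
_used_tokens = set()


def _sign(payload: str) -> str:
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def make_token(document_id: str, expires_at: int) -> str:
    """Build a token of the form '<document_id>:<expiry>:<signature>'."""
    payload = f"{document_id}:{expires_at}"
    return f"{payload}:{_sign(payload)}"


def redeem(token: str, now: int):
    """Return the document id if the token is genuine, unexpired and unused, else None."""
    try:
        document_id, expires_raw, signature = token.rsplit(":", 2)
        expires_at = int(expires_raw)
    except ValueError:
        return None  # malformed token
    expected = _sign(f"{document_id}:{expires_raw}")
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None  # tampered or forged
    if now > expires_at or token in _used_tokens:
        return None  # expired or already used once
    _used_tokens.add(token)
    return document_id


token = make_token("invoice-42", expires_at=1_000)
first_use = redeem(token, now=900)
second_use = redeem(token, now=900)
```

Here `first_use` is the document id and `second_use` is `None`, because the token may be redeemed only once. Times are plain integers so the sketch stays deterministic; a real service would use the current clock.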
The customer downloads it, you remove it from your storage
and that's it. You don't store this
data because it's very sensitive and it's
very hard to store and maintain
this data, right? Use session
management for front ends. In this part,
separate your application more: build a
back end and build a front end. If you can, put some
more layers in your application between
your front end and your back end, because you
create difficult patterns for hackers, which makes it
very hard to mount attacks
on your application. If you use this
strategy, you secure your application more,
right? Do not use cookies and browser storage.
If you do use storage and
cookies, do not put
sensitive data in there. You could put just the session
ID, a list of products or
the list of permissions for my role,
but no sensitive data, because this data can be
taken from the customer's computer
and replaced, right? And if
an attacker takes these cookies or storage
from the browser, it's very difficult to undo
that, and it is very painful for the customers because
the customer has to create a new user, a new password,
and set up some controls, which can
be disheartening. It's very painful for them.
Make all software activities
auditable. If you have software with all
actions audited, it's very nice because you can identify
what the customer is doing in your software.
In this way you can determine
what happened, which customer accessed an unexpected
link and who tried to access this link,
to review it later, right?
Then you need to use this audit trail
to generate some alerts, identify some patterns
of your customers and define
what is not a pattern of your customer, right?
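An audit trail that also raises alerts can be sketched like this. The `AuditTrail` class, the event fields and the idea of a fixed set of expected paths are assumptions made for the illustration; a real system would write to durable, append-only storage and learn patterns from actual customer behavior.

```python
from datetime import datetime, timezone


class AuditTrail:
    """In-memory audit log used only for illustration."""

    def __init__(self, expected_paths):
        # Hypothetical description of the "normal" customer pattern.
        self.expected_paths = set(expected_paths)
        self.events = []

    def record(self, user, action, path):
        event = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "action": action,
            "path": path,
            # Flag anything outside the expected pattern for later review.
            "alert": path not in self.expected_paths,
        }
        self.events.append(event)
        return event

    def alerts(self):
        return [event for event in self.events if event["alert"]]


trail = AuditTrail(expected_paths={"/orders", "/profile"})
trail.record("customer-1", "GET", "/orders")
trail.record("customer-1", "GET", "/admin/export")
```

Every action is kept, and only the request to the unexpected path shows up in `trail.alerts()`, which is the part a security team would review.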
Perform vulnerability scans of the software. Scanning
the dependencies is a good way to do this
part. Another part:
enable security steps in CI and CD. If at this moment you have
continuous delivery to your customers for
your application, you need to enable
these security steps. That's very nice
for your application because you
can define some policies
about the software, about the roles,
about how to deploy, about how to secure
my artifacts and my application, and how
to make this very secure in
these steps. That comes before releasing
software to production, right? Use hash
validation of software elements. You could use
hashing in your artifact repository
when you generate your artifact:
put it in your artifact repository with the hash, and when
you download it, check whether the hash has changed.
That's a very easy control
for your repository and for the artifacts that you generate too,
right? Generate container
images in a secure way. You could use a
least-access
strategy, minimum privilege, for these containers.
It's the longer way
for your containers, but it's the good way for
your customers and your application, because you don't
have to build these containers with
root access; they don't need it. The application
really doesn't need root in its environment.
If an application needs that, you should
redefine how it accesses resources in
the container or your other resources, right?
Separate environments for applications and separate
databases too. This is very useful
if you have a production environment,
a development environment and a QA environment. It's a very
easy step: separate
and isolate these environments from each other.
That's a very nice and very good
way to achieve this security, right? Or
separate at least production
from the other environments. That is fine too,
because you can start with this strategy.
If you don't have a lot of money, it's a
good way to start.
Use security blocking of users after unsuccessful attempts.
If you detect that your customer's
login has three, four, five failed
attempts, you can block it. It's
a very good practice and your customers will be very
grateful to you, because you can
send an email advising that
the user has been blocked after some
attempts that they may not have made. With this alert,
the customer is very grateful
because it's a nice alert, and
that's it. Right. Thank you.
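The last recommendation, blocking a user after repeated failed logins and alerting them, can be sketched as follows. The `LoginGuard` class and its threshold are assumptions for the example (the talk mentions three to five attempts), and `notify` is a stand-in for whatever sends the email.

```python
from collections import defaultdict

MAX_FAILED_ATTEMPTS = 5  # assumption; the talk mentions three, four or five


class LoginGuard:
    """Counts consecutive failed logins per user and blocks at a threshold."""

    def __init__(self, max_failed=MAX_FAILED_ATTEMPTS, notify=print):
        self.max_failed = max_failed
        self.notify = notify  # stand-in for sending an email to the customer
        self.failed = defaultdict(int)
        self.blocked = set()

    def attempt(self, user, password_ok):
        if user in self.blocked:
            return "blocked"
        if password_ok:
            self.failed[user] = 0  # a successful login resets the counter
            return "ok"
        self.failed[user] += 1
        if self.failed[user] >= self.max_failed:
            self.blocked.add(user)
            self.notify(
                f"Account {user} was blocked after {self.failed[user]} failed attempts"
            )
            return "blocked"
        return "failed"


messages = []
guard = LoginGuard(max_failed=3, notify=messages.append)
results = [guard.attempt("ana", password_ok=False) for _ in range(3)]
```

With a threshold of three, the third failure blocks the account and produces exactly one notification; later attempts stay blocked even with the right password until someone unblocks the account, which this sketch deliberately leaves out.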