Transcript
This transcript was autogenerated.
Hi and welcome to my
talk about how to architect for continuous
delivery. First, about myself. My name
is Romano Roth and I am the Chief of DevOps
and a partner at Zühlke. I have worked
for Zühlke for 21 years now. I joined
Zühlke directly after university,
became a junior .NET
engineer, then an expert software engineer,
then a lead software architect, and finally I became
a consultant. And one thing that was very close to my
heart is how can we continuously deliver value,
how can we ensure that the quality
is good of what we are going to deliver and how can we
automate things? And when the whole DevOps movement started,
I jumped right on top of that, became one of the organizers
of the DevOps Meetup Zurich, which is a monthly meetup we are
doing. And I'm also one of the organizers
of the DevOps Days Zurich. The DevOps Days is a
two-day conference, and all over the world, in all
of the big cities, there are DevOps Days conferences.
And I am the president of the DevOps Days Zurich.
You can see that DevOps is something that is very close
to my heart. That's also why I have my own YouTube channel
with over 100 videos all around DevOps
architecture and leadership. And I also post and
tweet a lot on the social media channels.
If you want to learn more about DevOps
then please follow me on these channels.
In my projects I work for different clients
in different industries and I'm doing DevOps
transformations. And when I'm
in such a consultant mode or in
such a DevOps transformation at the client side,
then I usually see exactly this picture.
What we can see here, the business together with the clients,
they have very, very bright ideas.
They are putting these bright ideas into Word documents
and into Jira tickets and then they throw
it over the wall of confusion to the development
team. The development team looks at
these requirements and they say, hey, if you want
to have that, we can implement that. And they are going to implement it.
And then they throw it over the wall of confusion to
the testing team. And the testing team looks at that,
looks at the requirements. It's not the same, but they
test something, it's green. And they throw it over the wall of confusion
to the operation team. The operation team looks at that
and says, wow, this will never work in production. But somehow
they manage to get it to work and they throw it again over
the wall of confusion to their
client and to the business. And the business is, or the client
is like, what is that? This does not solve our problem.
This is not what we wanted. And you can see in this
picture, this blue line there. This is the
value stream. And the value stream is broken by these
walls of confusion. And they are coming from
the silo organization. We have the business, development,
QA and operation silos.
Usually you also have quite a lot of legacy systems and technologies,
and this all leads to inflexible and slow processes.
There is no alignment in here. Every team is
working on their own goals and usually in such an
environment, security and also quality are something that
is done when things are in production.
Also, what we usually have is quite a lot of cultural resistance,
and regulation and compliance also play
a huge role. Now, where do these
challenges come from? These challenges
are coming from the projects which we are doing.
In the past, we have done projects in the waterfall mode.
We have planned, designed, developed, deployed,
reviewed, tested and launched in large
batches. In our projects there, the scope,
the budget and the time were fixed. Then around
the year 2000, some bright people said,
no, this is not how we can do
software development. We need to go agile.
And we went agile. Now, time and budget are
still fixed in our projects and the scope is
variable and we are delivering in smaller increments,
but we are still doing projects.
But our clients, they want to have
products, they don't want to have a project.
Now to understand that, we need to
dive a little bit deeper into that. So a project always
has a start and an end. It has a
fixed set of features that are going to be delivered
and also resources which are applied to
a project. So I want to have these
ten features in half a year for,
let's say €200,000.
This is a project. So a project is
focused on output, maximizing the
stuff that gets delivered, features, user stories,
tasks and code. On the other side,
we have the product. In the product, we focus
on the outcome. We want to solve
the problem of the customer. We want
to change the behavior of the
customer. So it is very outcome
focused and we want to understand the problem of the customer.
So we want to deliver that one feature that solves
the problem of the customer. So we
want to do products. And DevOps is
very good when it comes to delivering products,
because DevOps is a mindset and a culture and a set
of technical practices which allows us to organize ourselves
across the value stream, bringing all the
people together who work on a product so
that we can continuously deliver this
product. When we talk about DevOps,
we also need to talk about the people and the term
DevOps. As I already mentioned,
DevOps aligns us across the value stream and
brings together all the people that work on a
product, on a value stream. The term DevOps implies
that it's development and operation.
And of course there are some smart people out there who say: no,
nowadays security is also very important.
So we need to fix that term. Let's call it
DevSecOps: development, security and
operation, which is also how this conference is called.
And then there is another group of people who say:
no, no, the business is very important. The business,
they are giving us the money and that's why we need
to call it BizDevOps, which is business, development and operation.
And you can see this discussion leads nowhere.
We would need a term like DevSecBizArchCompQAOps,
and I'm pretty sure I have
forgotten someone. We can also call it Dev*Ops
or DevStarOps, or we just call it DevOps, because DevOps
is about bringing all the people, process and technology
together to continuously deliver value. This is
what DevOps is. So why should
you care about DevOps? Now in the year 2000,
some startups with funny names like Google,
Facebook, Amazon, Netflix, they started to
use DevOps practices. Nowadays they are dominating
the market. At the moment, you can clearly
see that Tesla, SpaceX and Waymo are
also using these techniques to create
cyber-physical systems, so systems which have hardware,
software and electronics in there. And I'm
pretty sure in five years we will see that these companies
are dominating the market. Do you want
an example? Here you get an example of
Elon Musk, who is tweeting about the
self driving release of his
software. So he's
tweeting on 7 October 2021
that FSD Beta 10.2 will be rolled
out to roughly 1000 owners with a perfect
safety score. What does that mean for us?
It means that he has the
software modularized in the cars, so there
is a module for self driving and he can update that
over the air. It also means that
he is constantly monitoring how people are
driving and he can distinguish between people
who drive very well and people who drive
not so well. And he is able to target
a certain group of people based on that data.
This is a so-called canary release.
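To make the idea of targeting a release group from monitoring data more concrete, here is a minimal Python sketch. It is not how Tesla does it; the `Owner` type, the 0 to 100 `safety_score` field and the `select_canary_cohort` function are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Owner:
    owner_id: str
    safety_score: int  # hypothetical 0-100 score derived from driving telemetry


def select_canary_cohort(owners, min_score=100, max_size=1000):
    """Return up to max_size owner IDs whose score meets the threshold."""
    eligible = [o for o in owners if o.safety_score >= min_score]
    # Sorted for a deterministic result; a real rollout might sample randomly.
    eligible.sort(key=lambda o: o.owner_id)
    return [o.owner_id for o in eligible[:max_size]]


# Made-up fleet data for the example.
fleet = [
    Owner("car-001", 100),
    Owner("car-002", 87),
    Owner("car-003", 100),
    Owner("car-004", 99),
]

cohort = select_canary_cohort(fleet, min_score=100, max_size=1000)
print(cohort)  # ['car-001', 'car-003']
```

Widening the rollout to a larger group then only means lowering `min_score` or raising `max_size`, without shipping new software.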
Now, on 15 October he then
tweets: everything was okay with the 10.2
release and we are now releasing 10.3
to a larger group of people.
Again, this means that he was able to monitor
the system, or Tesla was able to monitor these cars.
Then on 24 October he says:
ooh, there were some problems with the
10.3 release. We are going to roll back
to 10.2. Now here we
need to stop for a moment. Here we talk about
cars on the road, which are regulated, and
he is able to do a rollback of
the software. Many companies are not in a position to
do that, but he does that with hardware and software.
Not even 24 hours later he says:
oh, everything is okay. We are going to roll out 10.3.1
with the bug fix and we do a fix
forward. So it is really amazing what Tesla
is able to do there. So to be able
to do such things you need
DevOps. And DevOps has been analyzed
scientifically in the book Accelerate: The Science
of Lean Software and DevOps, which I can highly recommend
you read. In this book they have identified
24 key capabilities which
drive improvements in software delivery performance.
They have categorized these 24 key capabilities
in five categories. There are the continuous delivery
capabilities, where it says you need to have everything
under version control. You need to have deployment automation,
continuous integration, so integrating the code,
and trunk-based development. Hey, science has
found out that trunk-based development is the way to go,
just saying. Then test automation,
test data management, shift left on security and the
continuous delivery capability itself. So this
is very important. Then, based on that, you need to have
some architecture capability like loosely coupled architecture
and empowered teams. I always say you should create
a loosely coupled architecture around modules with clear inputs
and clear outputs with teams of maximum
five people. Then they are also empowered
and they can deliver continuously. Then we have the product
and process capabilities with customer feedback, value stream
mapping, so organizing ourselves across the value stream,
working in small batches and team experimentation.
Then the lean management and monitoring capabilities.
And here, very important: the scientists
have found out that change approval boards
are in the best case good for nothing. In the worst case
they will slow you down very, very
much. So there is no benefit in
having change approval boards.
Monitoring is very important, also for the proactive
notification, so that you find the bugs before
the clients do. Then work in process limits
and visualizing the work. These are the Kanban boards and
also the Scrum boards. Then the cultural capabilities are very
interesting. There you have the Westrum organizational culture, where
we distinguish between three different types
of companies. There is the pathological
company, there is the bureaucratic company and there is the generative
company. In the pathological company,
when an error or a failure happens,
the person who reports the failure or has caused the failure will get
shot. In the bureaucratic company,
the person who has caused the failure or has reported
the failure will go through
bureaucratic, long-lasting processes.
And in the generative company, when someone reports a failure
or has caused a failure, they will get
celebrated because everybody has learned something.
And of course all of the companies want to be
generative companies, but usually companies
are bureaucratic. This leads us
to supporting learning. You want to continuously learn,
collaborate across the teams, and you want to have job satisfaction.
And for all of that you need to have transformational
leadership. So these 24 key capabilities,
they are very, very important when it comes to architecting
for continuous delivery. I will pick some of these
key capabilities and dive a little
bit deeper in my presentation today.
Already identifying these 24 key capabilities
was quite amazing. But they also drew the
picture which you can see here, and they also found
out the relationships between these key
capabilities. I usually say this is one of
the most important pictures when you are going to do a
DevOps transformation or when you architect for continuous delivery,
because it shows you, on the right side,
what you want to achieve or
what you can achieve when you do the things
on the left side. And this is a
very important picture when you are going to architect for continuous delivery.
So with that, what are the
benefits of DevOps? Again, here science has
looked at that. There is the State of DevOps Report,
which is done on a yearly basis.
And here we are going to compare the
low performers versus the high performers. And you can
see the numbers are massive on deployment
frequency, lead time and so on. This all boils
down to the benefits which all of the CIOs
nowadays want. They want to have a faster time to market,
more value for money, higher quality, higher customer satisfaction
and top qualified employees. And these are the
benefits you get when you implement DevOps.
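The two metrics named above, deployment frequency and lead time, can be computed from very simple records. The following Python sketch assumes you have, per production deployment, the commit time and the deployment time. The data and the function names are made up for illustration and are not taken from the State of DevOps Report.

```python
from datetime import datetime
from statistics import median

# Hypothetical records: (commit time, production deployment time).
deployments = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 15, 0)),   # 6 hours
    (datetime(2024, 1, 2, 10, 0), datetime(2024, 1, 3, 10, 0)),  # 24 hours
    (datetime(2024, 1, 4, 8, 0), datetime(2024, 1, 4, 10, 0)),   # 2 hours
]


def deployment_frequency(records, period_days):
    """Average number of production deployments per day in the period."""
    return len(records) / period_days


def median_lead_time_hours(records):
    """Median time from commit to running in production, in hours."""
    return median(
        (deployed - committed).total_seconds() / 3600
        for committed, deployed in records
    )


frequency = deployment_frequency(deployments, period_days=7)
lead_time = median_lead_time_hours(deployments)
print(f"{frequency:.2f} deployments per day, median lead time {lead_time} h")
```

Tracking these numbers over time is one way to see whether a transformation is actually moving a team towards the high performer group.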
So now when you are going to implement DevOps,
this is nothing else than modern software development across
the value stream. For that we have here this infinity
symbol, this DevOps symbol where you can see that
in the planning step you need to have agile requirements
engineering and backlog management. In the coding, you need to have
agile software engineering practices which you should apply.
Everything and really everything needs to be under
version control so that you can fix forward or roll
back things. And then you
have in the build step the continuous integration, so that you're continuously integrating,
build and test automation and also the application security.
In the testing, it's very important to have test automation and
test data management, so that you have synthetic test data and do not
use the production data. In the deploy, we need to have a staging
environment, a production-near environment and of course deployment automation,
which leads us to release. In DevOps, we are
distinguishing between deployment and release.
Deployment is bringing the code into production with the feature toggle
off, and releasing means we are going to switch on the
feature toggle. This is exactly what Elon Musk did
with Tesla. He switched on the feature toggle for a
certain group, and this is called a canary release. And you need to have
feature toggles for that. And it enables you to do dark launches,
because he of course delivered that software to
certain cars, but only a few of them were
able to use that feature. This brings us to the
operate step, where we want to have cross-team collaboration, so that
we really work together across the teams in
resolving problems. And we need to have proactive detection,
which means we want to find the problems before our customers
do. And for that we need to have full stack
telemetry with visual displays
and federated monitoring so that we can make sense out
of the data that we get. And this goes
back into the agile planning and
into the agile requirements engineering, because we should always
base our decisions on data, numbers
and facts that we get from our monitoring system.
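The distinction between deployment and release described above can be sketched in a few lines of Python. This is a simplified in-memory illustration, not a real feature toggle product. The `FeatureToggles` class, the `new-checkout` feature name and the user IDs are hypothetical.

```python
class FeatureToggles:
    """In-memory toggle store; "*" means the feature is on for everyone."""

    def __init__(self):
        self._enabled_for = {}  # feature name -> set of user IDs

    def release_to(self, feature, user_ids):
        """Canary release: switch the feature on for a selected group."""
        self._enabled_for.setdefault(feature, set()).update(user_ids)

    def release_to_all(self, feature):
        self._enabled_for[feature] = {"*"}

    def roll_back(self, feature):
        """Switch the feature off again without redeploying anything."""
        self._enabled_for.pop(feature, None)

    def is_enabled(self, feature, user_id):
        users = self._enabled_for.get(feature, set())
        return "*" in users or user_id in users


def checkout(toggles, user_id):
    # The new code path is deployed, but it only runs once it is released.
    if toggles.is_enabled("new-checkout", user_id):
        return "new checkout"
    return "old checkout"


toggles = FeatureToggles()
dark_launch = checkout(toggles, "alice")        # deployed, toggle off

toggles.release_to("new-checkout", ["alice"])   # canary release
canary_user = checkout(toggles, "alice")
other_user = checkout(toggles, "bob")

toggles.roll_back("new-checkout")               # rollback is a toggle change
after_rollback = checkout(toggles, "alice")

print(dark_launch, canary_user, other_user, after_rollback, sep=" | ")
```

The point of the design is that releasing and rolling back become data changes rather than deployments, which is what makes small, targeted rollouts cheap.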
So the first thing I want to dive a little bit deeper into
is built-in quality, because that is
essential for architecting for continuous delivery.
When you look at the Musk companies like SpaceX,
Tesla, Neuralink or The Boring Company, there
Elon Musk is investing into
new products, and 50% of that invested
money goes into automated testing, because only with
automated testing can you deliver
innovation that fast to the market.
And this is quite a high number. I'm pretty sure
that the numbers in your company are much lower
than these 50%.
So now we need to have a look at the
testing. Usually we test in
a V-model, which you can see there, where we have delayed
feedback. Usually it happens that way that a
requirements engineer is writing a feature,
then another requirements engineer is writing a
story, and finally a developer writes
the code. If we are very lucky,
then the developer also writes some tests for that
code. Usually we are not that lucky in
that model, and no test is written on code level.
And then the story goes to a tester who tests
the story, and then another tester will test the
feature. Usually, between writing
a feature and testing the feature, it can easily
be three months, six months or even a year.
So it's delayed feedback. When we go into
agile testing, which I highly recommend in a DevOps
approach, then we are going to define the feature
already with behavior-driven development,
where we define the acceptance criteria of a feature
in a given-when-then form: given the following
state, when the following action occurs,
then I want to have that result. So when
we have the acceptance criteria in that form, also for the stories,
then it's very easy for us developers to create
out of this specification,
automated tests, always a good
tool for that. And what we are doing
there is called test-driven development.
So we write the test first. Even when we don't have
the acceptance criteria in a BDD form,
we should always write the test first,
then the implementation,
and then do the refactoring. This is what TDD is
all about. You can see that in there. So you
should always write the test, then the code. And this means we
are shifting left the whole
testing of a story and also of
a feature. And this is why we usually talk about the
shift left in testing.
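As a small illustration of the given-when-then form and of writing the test first, here is a sketch using Python's standard `unittest` module. The `Cart` example and its acceptance criteria are invented for this sketch. In a real project the test class would be written and seen to fail before `Cart` exists.

```python
import unittest


class CartAcceptanceTest(unittest.TestCase):
    """Acceptance criteria written first, in given-when-then form."""

    def test_total_is_sum_of_added_items(self):
        # Given an empty cart
        cart = Cart()
        # When two items are added
        cart.add(5)
        cart.add(7)
        # Then the total is the sum of both prices
        self.assertEqual(cart.total(), 12)

    def test_negative_price_is_rejected(self):
        # Given an empty cart
        cart = Cart()
        # When an item with a negative price is added
        # Then the cart refuses it
        with self.assertRaises(ValueError):
            cart.add(-1)


class Cart:
    """Minimal implementation, written only after the tests above."""

    def __init__(self):
        self._prices = []

    def add(self, price):
        if price < 0:
            raise ValueError("price must not be negative")
        self._prices.append(price)

    def total(self):
        return sum(self._prices)


def run_acceptance_tests():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CartAcceptanceTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_acceptance_tests()
```

Because each criterion maps to one small, fast test, the specification itself becomes part of the automated test suite.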
For that we also need to have a look at the
testing pyramid. Usually in traditional testing,
the test pyramid looks like the one on the left side.
You have slow and expensive end-to-end
tests which are done manually.
You have only some integration tests and
very few unit tests, which are fast and
cheap. When you apply agile testing,
and you could see that, when you have the specification of the acceptance
criteria written in a BDD form and when
you apply test-driven development, so always
first write a test before the implementation, then you get a massive
amount of unit tests, because everything is
tested, some integration tests and only a few end-to-end
tests. And this brings us to the following: when you
look at the traditional test pyramid,
the focus is on finding every bug,
whereas in the agile testing pyramid
the focus is on preventing bugs. So it's a
risk-based approach which we are applying.
Again, when we go into our
infinity symbol of DevOps, this means that in
the planning we are doing behavior-driven development. So we specify
our acceptance criteria already in a form
which is testable. Of course, the definition of ready is
very important in that case. And then when we go to
code, we apply test-driven development. We always
write the test first and then the
code for that. This brings us to the build.
There we are using unit testing of course,
and also the application security. So we do container
scanning, license scanning and all of that. And then in
the testing we of course have the definition of done,
which is important, and the whole test automation,
including also the test data management with the synthetic
test data. When we then deploy, production testing
is a very important step that we have. Even
if we don't release, we should always apply a subset
of the tests also in production. When we are going to release,
then cross-team collaboration is again important.
And of course in operate we want to detect the
problems before our clients do, which leads us to
the continuous monitoring of what we have done so that we
have the data which is needed for the planning step.
So built-in quality is
one cornerstone which is important
for architecting for continuous delivery. What is also
important is built-in security. This is
really an important thing. And here we are at Conf42
about DevSecOps.
So when we look at the continuous
delivery pipeline, then usually we talk
about the CI/CD pipeline, which is continuous integration
and continuous deployment. I usually say it's
far more. It's your continuous delivery pipeline, because
in the beginning you have the continuous exploration with ideation
and backlog management and requirements engineering, and at the
end you have the release on demand, where you are going to switch on
the feature toggle. So this is the continuous delivery
pipeline. And you can see in such a continuous delivery pipeline you have quite
a lot of tools which need to be integrated with each
other. Of course there are vendors out there which promise you
to cover the whole continuous
delivery pipeline. They are coming with their DevOps
platform. Yeah, well, when you take a close
look, and I have done that in a video series with
25 videos, which I have done
together with Patrick Stego, I analyzed
how we can implement a DevSecOps pipeline
with GitHub and also with GitLab. And you can
see what such a pipeline can look like.
You have of course merge requests. You use
software composition analysis: you want to know what kind
of libraries are there and what vulnerabilities you
have in the libraries which you are using. License compliance is
very important. What kind of licenses do you use in
all of these libraries? And also
the static application security testing. We are
going to statically analyze our code. The container scanning
is important and we need to scan our code for
secrets. Then when we have deployed, of
course we apply dynamic application security testing, where
we try to break our application
security-wise. And of course when we are in
production, we need to apply a scheduled pipeline,
where we execute the pipeline again and again, because in
the meantime new vulnerabilities could already have been found
in that version which is in production.
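Why a scheduled pipeline matters can be shown with a tiny software composition analysis sketch in Python. The dependency names, versions and advisory IDs below are invented, and real pipelines use dedicated scanners and vulnerability databases rather than a hand-written dictionary.

```python
def find_vulnerable(dependencies, advisories):
    """Return sorted (name, version, advisory_id) findings for pinned deps."""
    findings = []
    for name, version in dependencies.items():
        for advisory_id in advisories.get((name, version), []):
            findings.append((name, version, advisory_id))
    return sorted(findings)


# Hypothetical dependencies of the version that is running in production.
deployed = {"examplelib": "1.4.2", "otherlib": "2.0.0"}

# Hypothetical advisory data at release time and at a later scheduled run.
advisories_at_release = {}
advisories_one_month_later = {("examplelib", "1.4.2"): ["ADV-0001"]}

scan_at_release = find_vulnerable(deployed, advisories_at_release)
scheduled_scan = find_vulnerable(deployed, advisories_one_month_later)

print(scan_at_release)  # []
print(scheduled_scan)   # [('examplelib', '1.4.2', 'ADV-0001')]
```

The code in production did not change between the two scans. Only the knowledge about it changed, which is exactly what the scheduled run is there to catch.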
As said, with Patrick Stego I created
25 videos all around how to create
such a pipeline with GitLab and with GitHub.
And we are also comparing those
two pipelines with each other.
So this brings us again to our infinity
symbol. When we are doing continuous security, then it's
important in the planning that we do the threat modeling, so that we
analyze what are the attack vectors, what are the
threats which we have and how can we mitigate
that. When we are going to code, we need to do merge
requests or pull requests.
Then when it comes to the build step,
we apply application security. Everything is
in there, like security testing, software license scanning,
container scanning, secret detection, all of that.
And then we apply dynamic application security
testing. Of course in production we will do
penetration testing and again we do cross team collaboration
to find the problems. It is essential that we do
proactive detection when we operate the
product. We want to find security vulnerabilities
before our customers do. And for that we need
to continuously monitor our system
security-wise. But not
only security is important. Operability
is also a very important thing, because in DevOps
we want to do "you build it, you run it",
and this means, as I already mentioned,
proactive detection. This means that
our monitoring systems need to alert
us about conditions
based on the tolerance thresholds which we
have defined. And for that we need to have a
good monitoring system. Also disaster
recovery is a very important thing. Usually disaster
recovery procedures are not really in place
and also not rehearsed on a regular basis.
And of course based on that we need to have a notification
strategy. I also mentioned cross-team collaboration.
What we don't want to have is that only operations
takes care of production and owns the incident process.
We want the team to own
production, so that we can work across the value stream
and that we respond together to
production failures and also hold incident
post-mortems.
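The proactive detection mentioned above, alerting based on tolerance thresholds, can be sketched as follows in Python. The metric names and threshold values are assumptions made for this example, not recommendations, and a real setup would live in a monitoring system rather than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str
    warn_above: float
    critical_above: float


def evaluate(samples, thresholds):
    """Compare the latest samples against thresholds and return alerts."""
    alerts = []
    for t in thresholds:
        value = samples.get(t.metric)
        if value is None:
            # Missing data is itself a signal worth alerting on.
            alerts.append((t.metric, "missing"))
        elif value > t.critical_above:
            alerts.append((t.metric, "critical"))
        elif value > t.warn_above:
            alerts.append((t.metric, "warning"))
    return alerts


# Hypothetical thresholds and latest measurements.
thresholds = [
    Threshold("error_rate_percent", warn_above=1.0, critical_above=5.0),
    Threshold("p95_latency_ms", warn_above=250.0, critical_above=500.0),
    Threshold("disk_used_percent", warn_above=80.0, critical_above=95.0),
]
samples = {"error_rate_percent": 2.5, "p95_latency_ms": 180.0}

alerts = evaluate(samples, thresholds)
print(alerts)  # [('error_rate_percent', 'warning'), ('disk_used_percent', 'missing')]
```

The warning level is what gives the team time to react before customers notice, and the notification strategy then decides who gets which level.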
Now the monitoring. I already mentioned that
this is a very important thing, and I want to quickly
show you how this has evolved and why
everyone is now talking about observability,
because that's a cornerstone when you want to architect
for operability. In the past we had two-tier
systems. You had the UI and the database, and there
you used monitoring. You had the metrics,
like database
size or file system size
or even CPU metrics, and you
had your log statements about the health of
the application. These things we
still have, but today we have of course three-tier
applications, where you have a UI and a mobile
application. Then you have an application server and behind that you have
a database. Here you need to do application monitoring
with traces, because you need to know where the
request was coming from, UI or mobile, and of course you
need to do infrastructure monitoring based
on that because now you have not only a database, you also have an application
server. And then nowadays we have distributed,
service-oriented applications, where you have UI,
mobile, third-party APIs and so
on. Usually you have an
application gateway, then you have different services
or even microservices, and behind that
you have databases. And this poses
quite a challenge, because here you
can have some weird behaviors in such
a distributed system, and here we are talking about
observability. Here we usually also use
AIOps to make sense out of the
massive amount of data that we are getting out of
that.
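One building block of tracing in such a distributed system is a trace ID that is created once at the edge and written into every log line along the way. The Python sketch below only illustrates that idea with the standard library. The service names, fields and functions are hypothetical, and a real system would use a tracing library instead.

```python
import json
import uuid


def log_event(service, trace_id, message, **fields):
    """Emit one structured log line that carries the shared trace ID."""
    record = {"service": service, "trace_id": trace_id, "message": message, **fields}
    print(json.dumps(record, sort_keys=True))
    return record


def order_service(trace_id, order_id):
    # A downstream service reuses the trace ID it received from the caller.
    return log_event("order-service", trace_id, "order stored", order_id=order_id)


def gateway(order_id, source):
    # The trace ID is created once, where the request enters the system.
    trace_id = uuid.uuid4().hex
    first = log_event("gateway", trace_id, "request received", source=source)
    second = order_service(trace_id, order_id)
    return [first, second]


records = gateway(order_id=42, source="mobile")
```

With the same `trace_id` on both lines, a log system can stitch the path of one request back together and tell whether it came from the UI or from mobile.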
So we need to architect for operability.
This leads us to this infinity symbol where you can see
when we are planning, we already need to architect
for operability. Whenever we are architecting a system or
a subsystem or a microservice, we need to ask ourselves how
that is going to be operated in
production, and based on that we define the architecture.
Then when we are going to code we need to build in application telemetry.
Also the developer needs to think about that.
How do I get the data out of that into
the logging systems? We need to
apply infrastructure as code. In the end everything
should be code. This brings us to the
build step, where we again have the continuous integration, build and
test automation and also application security.
Test automation is applied and test data management.
This brings us to deploy, where we need to have a production-near
environment, also to test
the operation part, including also test
automation and deployment automation, and
of course we need to have canary releases with feature toggles and
dark launches. In operate, we need cross-team
collaboration and proactive detection. And it is absolutely
essential that in the monitor step we have that full stack telemetry
with the visual displays and federated monitoring.
So now we saw that we
need to continuously test, do continuous security
and architect for operability. This is quite a
lot to do, and therefore we
are now going into the chapter about building a platform.
Now as I said, modern software development is
a continuous process across that value stream that you see there.
And you remember that I said DevOps is
a mindset and a culture and a set
of technical practices. And these are
the technical practices which we need to apply
in order to continuously deliver value and in order to
architect for continuous delivery. And yes,
this is quite a lot which we need to
do. That's wow.
Now it gets even harder, because
you remember we said "you build it, you run it", which means
you take care of the infrastructure, cloud or on-prem.
You need to take care of the runtime: Docker,
Kubernetes, VMs. You need to take care
of the CI/CD pipeline: GitLab, GitHub, CircleCI
or whatever you want to use. Monitoring:
Prometheus, Dynatrace, Datadog or whatever you
want to use. Security with Snyk, SonarQube and
so on. And you also need to have tools like Jira,
Confluence, Miro and so on. And of
course you need to apply cost management across
that, especially when you are in the cloud.
And maintenance of all of these tools
is also very important. And you need to provide
access and security to all of these tools for
your developer or your team members.
And yeah, I just forgot: you also
want to implement an application. And there it
is, your application where you want to implement some
features. And yeah, this is quite
a lot for implementing just a feature.
And the thing is,
every team has such a stack.
Usually in bigger companies when you scale that up,
when you scale up DevOps, this is quite a lot
which you see there. And you can see
this leads to some inconsistencies and redundancies
when it comes to these platforms.
Every team reinvents the wheel. There is also a lack of
operational experience, no synergies are
used, and it's not easy to move people
across these teams. And the complexity of these tools
is very high. And this is why everybody speaks
about the cognitive load which is very high
on these people. So now we are
here, are we doomed? Can we do DevOps
or should we go back to the silo organization?
No, let me quickly explain how we do that
nowadays. So nowadays we are talking about
platform engineering. Platform engineering is here to
enable DevOps or devsecops in
the product teams. What you have is you have a platform
team which builds a product. This is called the
platform. So there you see it, it's an example.
This platform provides the application runtime, so the
environments where things are run, for example a Kubernetes
cluster, and it
also has devsecops built in. So there are
already devsecops things implemented
like license scanning, like container scanning
in the platform. So a lot of standardization is given
by the platform. You also have access and identity in
there, centralized security with web
application firewalls, gateways and so on. And of
course the whole monitoring and observability stack is delivered
by the platform. And of course with such a platform you can
also implement funny things like AI and large
language models. So this enables quite
a lot. As I said, this is a product,
or in a sense also a service, that is given
to the product teams. The product teams,
they are building and running and maintaining
their products on top of this platform.
This does not mean that the platform team creates the CI
CD pipeline or monitors the application. No,
absolutely not. The teams are doing that.
The platform team just gives them the platform
on which they can build, a standardized
platform which generates value for these teams.
And this means that the product team
can do DevOps, the cognitive load is lower, so they
can generate value for their customers.
And the platform team generates value for the teams.
Now some of you might ask: is this not a silo?
No, it is not, because the platform team
creates a product, so a service for
the other teams. The teams, they need
to operate their own product, so they have
the operation directly in there. The platform team
just creates the platform so that
the teams are enabled to deliver
their products. This is a very important thing.
If you don't follow this rule, then of course you
will just introduce another silo and another wall
of confusion. When you look at
the market and also at Gartner,
BCG, McKinsey, all of them are
saying that platform engineering or building such a platform
will be very important in the next
five years. They,
for example, say that a lot of companies
will build up these platforms. So a clear
recommendation into this direction.
This brings us to the summary how to
architect for continuous delivery.
First of all, you need to move away from
the project mindset to a product mindset.
We want to put the customer into the center and
we want to solve the problem of the customer.
We don't want to do projects for our customers.
Then it is very important to apply DevOps or
devsecops. This is about bringing all the people,
process and technology together to continuously deliver value.
And for that we need to apply continuous testing
and built-in quality. We need to build the quality right
in from the beginning. We need to shift left the whole
testing so that already when we
are writing our specification, we write it
in a form that is testable. And this
brings us to continuous security. Again here we are also
shifting left the whole security. It is built in
directly. When we architect, we do
threat modeling and then when we are going
to code there, we are going to test quite
a lot of things on security problems.
And the fourth thing is, you can see this
is quite a lot of things that teams need to do.
This means we need to standardize all of that.
And here we are going to apply platform engineering
and create a platform for our
product teams. As you can see,
we are entering the age of industrialization
of software development. Platform engineering is
the key to build this platform which enables
the teams to do devsecops. And this
is the way how you can architect for
continuous delivery. Thank you very much.