Transcript
This transcript was autogenerated. To make changes, submit a PR.
Thank you so much for attending this talk.
My name is Adrian Kosmaczewski. I work in developer relations at VSHN,
the DevOps company. And thank you so much to the organizers of the
Conf42 DevOps 2023 event
for organizing this talk and for giving me the opportunity to talk about VSHN
and Appuio Cloud. Through this talk, we would like to give you an idea of
how we built this new product using DevOps as a guiding philosophy.
Today I'm going to talk about how we built Appuio Cloud at VSHN.
For that, I will first start by explaining a little bit of our
culture, what Appuio Cloud is, and how we used DevOps
to create it. And then I will give you some technical details about how
we created this new product. But first of all,
I would like to introduce VSHN to those who have never heard about it
before. That's how you pronounce the name VSHN:
just like "vision". That's why there's an I in
the logo. We're a company of 50 people located in Zurich,
and we are the DevOps company. The slogan of VSHN is
actually "The DevOps Company". We embrace the DevOps philosophy completely,
and as you'll see today, we use its principles and ideas in
everything we do. What does VSHN do? We provide
various services and products. We offer DevOps as a
service, with a full team of Kubernetes and
OpenShift experts ready to monitor your clusters
and your applications 24/7 in the cloud.
We help companies become self-sufficient and
cloud enabled. We help software engineers build bridges between Dev and Ops
with CI/CD pipelines on various platforms like GitLab,
OpenShift or Argo CD. And we are Kubernetes
and OpenShift specialists, to the point that our strategy is 100%
oriented towards Kubernetes. Everything we do
runs on Kubernetes. Aside from our technological
choices, the important thing to know about VSHN is that we have chosen
to drive the growth of our company in various ways that are completely
non-standard. First of all, we are the DevOps company,
and as such, we embrace the DevOps mantra completely.
Everything we do is as automated as possible, and that frees our
brains to think. We have decided as
a company not to grow through venture capital.
Instead, we rely on the good old method of organic growth.
We have been self-funded since day one in 2014, and we've
been consistently profitable and growing since 2017.
We use sociocracy as a management and growth framework.
This means that all decisions, and I mean all of them,
happen through consensus among all VSHNeers. And yes,
that's what we call ourselves, by the way. And we have created
a handbook, freely available online, which in printed form takes
573 pages, and which explains
everything we are and what we do at VSHN with quite an incredible
level of detail, I must say. I invite you to check it out at
handbook.vshn.ch and you will learn everything there is to learn
about us. Sociocracy is our evolution
framework, and we have a small team of people at VSHN whose only job
is to help us evolve within this framework continuously. In particular,
any VSHNeer is able to raise issues and problems and to ask
for help to change procedures or situations that are hurting their happiness.
At VSHN, VSHNeers are able to create VSHN Improvement Proposals,
or VIPs as we call them. They are simply tickets in our Jira that
explain a current situation or decision, its drawbacks and negative impact,
and propose a solution to be discussed by everyone involved or
interested in the issue. This simple mechanism
has completely transformed the way we work in the past three years,
and as a result, we all feel part of the structure we built,
and we feel responsible for it at all times. All of
these choices have shaped our culture in ways that are really not common at all
and have interesting consequences in our day to day operations. For example,
our hiring policies are different to those of most IT companies.
Not only do we pay attention to the IT skills of those who want to
join our team, but we also pay very close attention
to the human factor. We want people to feel great at VSHN,
and one of the primary factors we evaluate during the hiring process is the likability
of the person. That is, how much would we like to work with them
every day? As Steve Jobs once said, we don't hire
smart people to tell them what to do. We hire smart people so
that they can tell us what to do.
Another interesting consequence of how VSHN works is that we had embraced
remote and asynchronous working well before the pandemic.
When the Swiss government mandated everyone to work from home
in March 2020, we simply stayed at home and continued working
as if nothing had happened. The important word here is
"asynchronous": not so much that we work remotely,
but that we work in a non-synchronous way. And this particular mindset
has shaped our company greatly. But let's
go back in time a little bit and see how Appuio Cloud came
to be. In 2016, VSHN and Puzzle ITC, a well-known
Swiss IT and software consulting company, launched a joint venture called
Appuio. This is a word in Esperanto that means support.
Appuio consists of a series of products built around
Red Hat OpenShift. For those who haven't heard of
it yet, Red Hat OpenShift is the most widely used Kubernetes-based
platform in the enterprise world. It is quite popular with big
companies, and it incorporates a hardened and highly available Kubernetes
cluster surrounded by lots of relevant software, for example
a container registry, a management console, and CI/CD
pipelines, with a very nice and professional GUI
on top. We decided we wanted to be a part of the
OpenShift market, but we also realized that installing and
operating OpenShift is a huge endeavor, and many companies could not use OpenShift
because of a lack of staff or budget. So we decided to join forces
with Puzzle ITC. Appuio is a response to
the complexity of Red Hat OpenShift. With Appuio, customers get a ready
to use cluster together with the know-how of these two companies.
We at VSHN specialize in the setup and maintenance of
OpenShift clusters, and we've been operating OpenShift clusters,
by the way, since version 3. Puzzle, on the other hand,
are specialists in the creation of software solutions and cloud native
applications for OpenShift, which is something we don't do.
Together, the Appuio team can help companies make the most out
of their OpenShift investment.
Appuio has historically been available in various forms.
First of all, there was Appuio Public, based on OpenShift 3. It was
the first Swiss-based shared OpenShift cluster available
to customers. It was a shared platform where customers
could run their projects without having to care about management or anything else.
There were Appuio Public clusters running on
various cloud providers, like cloudscale in Switzerland and AWS in Germany.
Then there was Appuio Managed, which is the next step. With Appuio Managed,
organizations get their own non-shared
OpenShift clusters for their exclusive use, and Puzzle ITC and
VSHN take care of the operation of the cluster transparently for their
users. Finally, Appuio Self-Managed is the final
step in the evolution of organizations. With it, organizations not
only get an OpenShift cluster, keys in hand and
ready to run, but we also teach their IT teams how
to manage and maintain the cluster by themselves. We gradually
fade into the background and provide help until at some point we become
completely invisible and they become completely independent.
Appuio Cloud is the latest offering in the Appuio family,
and it started as a product in September 2021. What is Appuio Cloud?
Simply put, Appuio Cloud is to OpenShift 4 what Appuio
Public was to OpenShift 3. Because, you see, given the major architectural
changes between OpenShift 3 and 4, instead of migrating
our Appuio Public infrastructure to OpenShift 4, we decided to
create a new project from scratch, and we gave it a different name and
even a different visual identity. We notified our Appuio
Public customers of the upcoming phasing out of the service,
with an offer to help them migrate their workloads to Appuio Cloud.
And in just one year, Appuio Public was fully decommissioned.
That happened in September 2022, only one year after Appuio
Cloud started operations. As said previously,
Appuio Cloud is based exclusively on OpenShift 4, and at the moment,
we have two Appuio Cloud zones available to our customers: one in Lupfig,
in the canton of Aargau, in central Switzerland, and another
one in Geneva, running on Exoscale premises.
We plan to open more regions in the future as required,
following the demand of our customers. We started working on Appuio
Cloud in spring 2021, and we released it to the public in autumn
that year. We reused a lot of code and infrastructure we had created
for our previous work. First of all, we reused K8up.
K8up is a Kubernetes backup operator that has been accepted
by the Cloud Native Computing Foundation as a sandbox project. You can
find it at k8up.io. We also reused Project Syn,
which is a suite of tools that allows developers to remotely manage
lots of Kubernetes clusters
from a central location using GitOps. Check it out.
It's at syn.tools. Who is Appuio Cloud
for? Appuio Cloud, just like its predecessor, Appuio Public, is meant
to be an entry level product catering to the long tail of OpenShift
customers who might be interested in getting access to a working
OpenShift cluster without the hassle of installing and operating it.
As such, we identified a few target groups. First of
all, startups: they might be interested in having a namespace on
Appuio Cloud so that they can launch their MVP and get more venture
capital. DevOps teams, with CI/CD pipelines to deploy their
projects. Mobile app backends for iOS and Android. Education:
for example, if you want to learn about OpenShift, Appuio Cloud would
be a great place to start. Technology trials, for companies who
want to hedge the risk of getting into the OpenShift world.
And of course, resellers who would like to offer OpenShift
services to third parties.
What is included in Appuio Cloud? First of all, you get instant on:
you sign up for Appuio Cloud, you get an OpenShift namespace, and
you are ready to deploy your applications. You only pay for
the resources you actually use, and you can define users and
groups in your organization so that everyone can
work on that project. You get K8up preinstalled,
so you can back up all of your work at any time.
And there are a few more preinstalled operators. Among them we
have, as I mentioned, K8up, and we also have cert-manager for you
to create and manage your X.509 certificates.
And finally, we've got community support. If you need help, you can
check out the Appuio Cloud forums and the chat.
And for those needing more help, there are support packages available at
extra cost. Now, Appuio Cloud is a public platform,
so it comes with some gotchas. First of all, the maintenance windows
are mandatory and predefined, so you know
when these are going to happen, and there might be some interruption to your
work. We communicate status information live on
status.appuio.cloud. Resource availability is not
guaranteed. You get what you get. We cannot guarantee much
more than that. That means that the SLA is best effort
and there's a fair use policy. All Appuio Cloud users
should behave sensibly, because otherwise they could impact
or degrade the service level for other users.
There are no privileged containers running on Appuio
Cloud. It's a very secure platform. Log retention
is only three days. After that we clear the logs. You can
download them, but we will clear them. It is not possible for the moment
to have other operators than the ones that are defined and
preinstalled in Appuio Cloud, but we evaluate new ones regularly
depending on the demands of our customers.
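To make the "no privileged containers" rule a little more concrete, here is a minimal Python sketch of such a check over a pod manifest. The talk does not show the actual policy used on the platform, so this is only an illustration: the function name and the sample pod are made up, and the only real detail it relies on is the Kubernetes `securityContext.privileged` field.

```python
from typing import Any


def privileged_containers(pod: dict[str, Any]) -> list[str]:
    """Return the names of containers in a pod manifest that ask for privileged mode."""
    spec = pod.get("spec") or {}
    offenders: list[str] = []
    # Regular, init and ephemeral containers can all carry a securityContext.
    for key in ("containers", "initContainers", "ephemeralContainers"):
        for container in spec.get(key) or []:
            security = container.get("securityContext") or {}
            if security.get("privileged") is True:
                offenders.append(container.get("name", "<unnamed>"))
    return offenders


# Hypothetical manifest, written as a plain dict instead of YAML.
sample_pod = {
    "metadata": {"name": "demo"},
    "spec": {
        "containers": [
            {"name": "web", "image": "example/web:1.0"},
            {
                "name": "debug",
                "image": "example/debug:1.0",
                "securityContext": {"privileged": True},
            },
        ]
    },
}

rejected = privileged_containers(sample_pod)
```

A shared platform would reject the pod whenever the returned list is not empty, which is the behavior the gotcha describes from the user's point of view.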
Now, this talk is about DevOps, so what do we
mean by the DevOps way of working? Let's see what we
mean by DevOps first, because it's one of those words that can mean anything
and everything, depending on who you ask. So for example,
this is a famous cartoon about DevOps,
and this is clearly not what we mean by DevOps, although to be honest,
there's a lot of YAML involved in what we do. Usually when
people talk about DevOps, they think about this physical division between
developers and operations teams and how they don't communicate anymore and how
much better it would be if they did. Then DevOps comes along with continuous
integration, automation, cloud, containers,
collaboration and infrastructure as code, and then somehow
all barriers are destroyed and we can once again collaborate and
work better together. For us at VSHN, this is
a limited view of what DevOps is and can bring. It is an important
part, but not all of it. Instead, we prefer to think about DevOps as a
set of three principles, following what some authors have written about it.
In particular, we think the best person to talk about DevOps is the co-author
of the DevOps Handbook and the Phoenix Project, Gene Kim himself.
The latter is actually a modern rewriting and reinterpretation
of a classic management book from the 80s called The Goal, by Eliyahu
Goldratt, and it's quite faithful in spirit as well.
In those books, DevOps is usually defined by the three ways:
the principles of flow, the principles of feedback,
and the principles of continual learning and experimentation.
So let's see how these three principles have shaped
Appuio Cloud. Let's start with the principles of flow and
see what it means for product development. The first thing we had to decide
was where to start. That is, what was the value stream
we wanted to provide first. We wanted to have actual results as
early as possible, because seeing things happen and appear is
one of the best ways to keep a team active, motivated and
delivering. That first piece of work was the
product documentation. That's right. The first thing we created through discussion
was written documentation of what we wanted to offer. Why written?
Because we work asynchronously. That means that some of us
work better at night, while some work better in the morning. Having everything
written down helped everyone to write drafts of the documents until
there was agreement. Agreement from whom? From everyone.
From the product owners to the DevOps engineers, who, at the
end, have to maintain the solution. This way,
operations teams know exactly what's going to happen. There are no
surprises down the road, and they feel empowered and listened to.
All the features of Appuio Cloud are, simply put, possible to
release, either now or later. But they are
possible. The important thing here is that we started
by applying Conway's law. That is, we first structured
the team that would work on Appuio Cloud, and then we
got to create the system. The end result of this process is
that the architecture of Appuio Cloud, following Conway's law,
strictly mirrors the structure of our team. We do not fight
Conway's law. We embrace it.
The result of this work of architecture and documentation can be summarized in
three different documentation websites for Appuio Cloud. You've heard right.
We have created three different sets of documentation,
and we keep them updated every day. There's product owner documentation.
There is system engineer documentation, and end user
documentation. We've made all of the documentation publicly
available and viewable, actually even editable.
Because transparency is one of our values at VSHN, we want all
of our customers to know exactly why we are doing things the way we do.
In turn, this generates trust in our existing customers and it
shows our know-how to prospective ones. These three documentation sites
are, simply put, great marketing tools.
The principle of flow requires teams to make work
visible, to reduce batch sizes and intervals of work,
and to build quality in. We limited work in progress to
the strict minimum and we automated as much as possible in the
process. Talking about automation, this automation
involves removing the human factor from the maintenance of those clusters
as much as possible. One of the key factors for doing this
was Project Syn, a suite of tools we started building in 2019 that
allows our small team to manage hundreds of clusters from a central location.
We created Project Syn as a way to be able to operate our customers' assets
with a reduced human footprint, but it turned out to be a great way to
handle Appuio Cloud as well. Thanks to Project Syn, DevOps
engineers can specify and deploy changes to lots of clusters from
a central location using GitOps. Just commit
your changes as infrastructure as code to a Git repo, wait a few
seconds, and all of the clusters have those changes.
We use Project Syn to deploy Kyverno security policies, for example,
to our Appuio cloud clusters so that all regions conform to the
same rulebook. We also configured each of the Appuio cloud
zones with the mandatory differences between the cloud providers
we use, because exoscale and cloudscale do not offer exactly
the same features, and being able to see those differences
written down allows us to manage those systems in
the best possible way and to take the best possible decisions.
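One way to picture "the same rulebook everywhere, plus written-down differences per zone" is a shared base configuration that each zone overrides. The Python sketch below only illustrates that idea; it is not how Project Syn is implemented, and every key, value and zone name in it is hypothetical.

```python
from copy import deepcopy
from typing import Any


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict in which values from `override` win over `base`, recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Settings every zone shares (hypothetical keys and values).
COMMON = {
    "policies": {"disallow_privileged": True},
    "logging": {"retention_days": 3},
    "storage": {"class": "default"},
}

# Only the differences are written down per zone (again, made-up values).
ZONE_OVERRIDES = {
    "zone-a": {"provider": "provider-a", "storage": {"class": "ssd"}},
    "zone-b": {"provider": "provider-b", "load_balancer": {"kind": "managed"}},
}

# The effective configuration of each zone: common rulebook plus its own overrides.
rendered = {zone: deep_merge(COMMON, diff) for zone, diff in ZONE_OVERRIDES.items()}
```

Because the overrides hold nothing but the differences, reading `ZONE_OVERRIDES` in a Git repo shows at a glance where the zones diverge, while `COMMON` stays the single place where shared policy changes.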
So we know that Appuio Cloud is a complex system. It's built
out of complex systems itself, and they are all prone to failure
at any given time. So this is not a matter of
if, but rather a matter of when things
are going to go wrong. So we need observability, and
we have built observability and management tools right from the
start. In our work on Appuio Cloud, we have reused
the management infrastructure provided by OpenShift, the same one we
were using for our private customers. But we have built on top of that Appuio
Cloud specific tools so that we have
complete observability of the cluster at all times.
Using everything as code as a basis for our work means that every
time we fix an issue on the platform, we have to change a configuration file
somewhere. This information is then stored in a Git
repo as part of the project history. Not only that,
but we also update the required documentation files, both internal and external,
so that everyone knows asynchronously and at their
own rhythm what happened, when and, most importantly,
why. And when we say everything as code,
we mean it. Security policies, build configurations,
general configuration, infrastructure, and documentation.
All of this is described in corresponding files
in Git repos. We use GitLab, and its integrated CI/CD pipelines
are configured to automatically build, test, and eventually
deploy changes as they happen, thanks to Project Syn.
All of the feedback we bring back to the system is automatically deployed whenever
possible, which reduces the amount of brain work required to
keep things running. Even our documentation is
automated. We use the Antora documentation generator
tool, which can automatically extract and integrate documentation from various
sources into a single website, and we use GitLab pipelines
for that as well. With this process, engineers only have to update these documentation
sources using AsciiDoc, which is very similar to Markdown,
and git push. These changes are immediately
picked up and verified. We actually have style
and syntax checks built into our pipelines,
and all of this is deployed automatically.
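The talk does not say which style and syntax checks run in those pipelines, so the following Python sketch is only a guess at the general shape of such a step: a small function that a pipeline job could run over each AsciiDoc source before publishing. The three rules (a `= Title` first line, no trailing whitespace, no tabs) are assumptions chosen for illustration.

```python
import re


def lint_asciidoc(text: str) -> list[str]:
    """Return a list of human-readable problems found in an AsciiDoc source."""
    problems: list[str] = []
    lines = text.splitlines()
    # AsciiDoc document titles are written as "= Title" on the first line.
    if not lines or not re.match(r"^= \S", lines[0]):
        problems.append("line 1: document should start with a '= Title' line")
    for number, line in enumerate(lines, start=1):
        if line != line.rstrip():
            problems.append(f"line {number}: trailing whitespace")
        if "\t" in line:
            problems.append(f"line {number}: tab character")
    return problems


good_page = "= Backups\n\nYour data is backed up regularly.\n"
bad_page = "Backups\n\nSome text \n"

good_report = lint_asciidoc(good_page)
bad_report = lint_asciidoc(bad_page)
```

A pipeline job would fail the build when the report is not empty, so a broken page never reaches the published website.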
Now, regarding the principles of continual learning and experimentation,
I'm going to share with you an anecdote. Appuio Cloud is not and will never
be finished, right? It is a product that changes continuously, sometimes in
small ways and sometimes in bigger ones. This is
a screenshot of the Appuio Cloud console a few months ago,
around the month of May. Can you see the red banner on
top? That red banner on top indicates the result
of us learning something interesting about Appuio Cloud,
something we did not know, a change we had to bring to the
platform. Here is the text of the red banner in the previous
screenshot: issue with CPU requests resolved;
the resolution includes a slight change to
the pricing model. This, as you can imagine, is the result
of a learning process. We realized that in our preparation we had
not designed our CPU request pricing properly. As a result,
as soon as the first users started using the platform last year,
we realized that some of them were consuming disproportionate amounts of CPU,
and this was a huge problem because they were not aware of that.
And we as a company would have had to cover
a lot of extra costs at first.
So that, of course, from a business point of view, was a disaster.
But we modified the policies in our clusters,
we made this clearly visible, we communicated it
to all of our customers, and we updated our documentation. As shown
on the slide, the solution is exactly what you see
right there. It's an extract of the documentation.
This was an unexpected and unplanned learning, right? A local discovery
that brought a global improvement to Appuio Cloud for all users, and also
for us in terms of business. We did cover some of the costs,
but we rectified our policies openly and communicated clearly
with our customers. The result is that not only did all of them
acknowledge and understand the changes, but we didn't lose a single customer
because of it. This level of cooperation with our customers is one of the
things we're most proud of.
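The talk does not spell out the new pricing rule, so the Python sketch below only illustrates the general idea of spotting "disproportionate" CPU requests: compare what a namespace requests in CPU with what its memory request would cover at some fair-use ratio. The ratio, the namespaces and all the numbers are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical fair-use ratio: MiB of requested memory per requested millicore.
FAIR_USE_MIB_PER_MILLICORE = 4.0


@dataclass(frozen=True)
class Usage:
    namespace: str
    cpu_request_millicores: int
    memory_request_mib: int


def excess_cpu_millicores(
    usage: Usage, ratio: float = FAIR_USE_MIB_PER_MILLICORE
) -> float:
    """CPU requested beyond what the memory request covers at the given ratio."""
    covered = usage.memory_request_mib / ratio
    return max(0.0, usage.cpu_request_millicores - covered)


usages = [
    Usage("shop-frontend", cpu_request_millicores=250, memory_request_mib=1024),
    Usage("batch-cruncher", cpu_request_millicores=4000, memory_request_mib=2048),
]

# Namespace -> millicores requested above the fair-use ratio.
report = {u.namespace: excess_cpu_millicores(u) for u in usages}
```

Making the excess visible per namespace is the point of the sketch: users who are not aware of their consumption can see it, and the provider can decide how to price or limit it.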
So let me give you now some details about the work we did, including team
sizes, tech stack used, and many other details. Transparency is one
of our values, so we're very happy to tell you everything about it.
The team that built Appuio Cloud is called Aldebaran.
At VSHN, this was the team that was mostly in charge of the design,
deployment and operation of Appuio Cloud. They also received
help from people from other teams, in particular those with very good
experience in the deployment and operation of OpenShift clusters,
of course, and also from marketing and sales, to
coordinate the communication and the marketing campaigns to get new users
onto the platform. The project manager and main product manager
of Appuio Cloud is Tobias Brunner, one of the founders of VSHN and the
current CTO, who provided a very strong vision,
no pun intended, of how Appuio Cloud should behave and
look. Let's go into some details about how Appuio Cloud is
built. This slide contains all the major components we've chosen for it.
We've got Red Hat OpenShift 4.11, of course.
For security policies, we use Kyverno. For identity
management, we use Keycloak. We store secrets in Vault,
we use Rook as the storage plugin, and we use Isovalent
Cilium Enterprise as the networking plugin.
For backup, as I mentioned, we use K8up, and for all the GitOps
operations we need, we use Project Syn. For the documentation
websites, we use Antora, which I strongly recommend you check out.
It's a wonderful tool. For our day-to-day
asynchronous communication and collaboration,
we use the usual tools that you need to
keep in touch with your peers, for example Zoom, Rocket.Chat,
Jira, Confluence, and so on. Here is a short timeline
of events that led to the release of Appuio Cloud. We started talking about Appuio
Public 2.0 around two years ago. By July 2021,
we had chosen the product name and we registered the domain.
Things accelerated during the summer of 2021, and we made
the public announcement of Appuio Cloud in September, and by
October, users started migrating their apps from Appuio
Public to Appuio Cloud. In December, we announced a partnership
with Isovalent to use their CNI plugin on Appuio
Cloud. Last year, we opened up a new region in
Geneva and we released the Appuio Cloud portal so that our
users can manage their projects, users and groups autonomously.
And finally, we released our new product, AppCat,
which allows Appuio Cloud users to specify dependencies such as
S3 buckets, databases, message queues, and
other systems directly in YAML from their OpenShift projects.
We also enabled vertical pod autoscaling and workload monitoring
for all of our users. So can other organizations
use a similar process to create a product? We believe yes, it is possible.
However, there are a few caveats. To be honest, we know that
some companies would have to work on these items first
in order to have a successful DevOps journey. First of
all, writing skills are fundamental. We need
DevOps engineers to be writers and to put everything down,
not only as everything as code (security,
infrastructure, business rules, et cetera), but also as documentation,
making sure that both the engineers and users are able to refer to
a written document that explains the reasons why things happen,
and keeping that written documentation updated. This part
of the work is not a chore, it's a bonus. It's part of the deliverables.
It must be updated, reviewed and proofread. Second,
cloud native technology has been designed to work faster than ever.
Containers, Kubernetes, CI/CD pipelines, open source: all
of the ecosystem of cloud native technology is the great enabler
of our modern world. The technological context constitutes
a fantastic giant's shoulder that we can stand on
to go faster, to go better. We definitely could
never have done this work without the ecosystem of open source cloud
native technologies available today. Third,
trust is paramount. You have to trust your teams.
We actually think that trust is more important than flat hierarchies,
even though these have helped us. Without trust, there's no way we could have
created Appuio Cloud in such a short amount of time.
Trust allows teams to work independently, moving fast and
without the inherent fear typical of a blame culture.
And trust is the key ingredient for an asynchronous work culture.
You cannot really go full async if you do not trust your teams.
We stress this point because this factor is a deal breaker for many
teams in many places in the world. These are,
we think, the three important pillars of our DevOps culture:
writing, technology, and trust. Those
have helped us shape Appuio Cloud into the product that it is right
now, steadily growing and changing continuously.
Is it easy to work like this in a DevOps mode? Of course not.
There's a lot of things that can go wrong. But is it worth it?
Let's put it this way. After all this time, we have internalized this way
of working so much we couldn't do things any other way. We think
it's totally worth it and as a result, we just do things like this all
the time. With Appuio Cloud, VSHN demonstrates that we can deliver
a world class product in a short amount of time with a small team of experts
and with fast cycles of feedback and experimentation baked into
the process. We regularly publish blog posts telling the
story of the product and sharing news about future features in development.
So please check it out at vshn.ch/blog. For example,
we have blog posts about what happens behind the scenes, about our API,
how billing works, and all of that. So please check it out, and
if you would like to try Appuio Cloud for 30 days, please go to
appuio.cloud/register, use the voucher code conf
fourty two and get a thirty day openshift
project for your team to use and to test.
Thank you so much for your attention. It's been great explaining
to you all of this story. I hope it's been useful to you and
I hope you have questions. I will be around in the chat to answer
some of them. Thank you so much.