Transcript
This transcript was autogenerated. To make changes, submit a PR.
Welcome to our session about cloud native configuration. First of all, we would like to thank the conference organizers for letting us introduce you to this topic, which Joachim and I think is often overlooked, and also to thank you for being with us right now. We know it's always very difficult to choose a session amongst so many topics.
So, as we said, in a cloud native context, when we consider applications that are massively distributed, that rely on services that are constantly updated, and on infrastructure that may fail, we need to embrace this uncertainty and apply patterns to tackle it. Configuration is no exception. However, we see that topic overlooked.
When we talk about cloud native patterns, we focus more on services, the relationships between those services and even the data, but not so much on the configuration. That's why we would like to introduce a specific pattern that will help you think about your configuration in a cloud native context. First of all, we will define configuration, not only in the cloud native world but also on premise, because both share the same definition but not the same approach, and we will see why a cloud native context calls for a different approach. We usually meet two classical approaches, but they fall short when dealing with the specific challenges of the cloud; we will see what those limits are and how to overcome them with a specific pattern, which is the central point of this session. Aside from the theory, we will show you practical cases we saw in the field, and also a demo, so that you can think about configuration within your own business model.
I am Ismail, a cloud developer at Wescale. And I am Joachim, a cloud native developer and architect at Wescale. Wescale is a community of 50 cloud experts.
More than experts, I would say passionates: passionate about what we do, that is to say, helping clients think about their business in the cloud. From day one, when we architect the business in the cloud, to day two, when we help clients already in the cloud enhance their existing projects to be more secure, more reliable, more available and so on. We are not bound to any cloud provider nor any tool, and we think there is always a context to reason about. Usually we discuss with the clients so that we can advise the best platform and the best tool that fits their needs. To help this process, we also provide training on those different tools and platforms. Let's go back to our main topic:
configuration. What is configuration?
Configuration can be simply defined as everything that is likely to change between deploys, and we distinguish two kinds. The first one, under our control, is called application configuration and tells how the application behaves: third-party service locations, algorithm parameters, feature toggles. The second one is more about how the application is reached: IP address, port. This time we are not the ones who choose it; usually it's up to the cloud platform to provide it. Why do we consider it? Because a cloud native application in fact deals with hundreds if not thousands of instances. Still, we want this application to behave as a single logical entity, meaning we want idempotency: it does not matter whether I call the first instance or the 42nd one; with the same input, I should get the same result. We see two obstacles to that. The first one is the history of your requests, and we know that a stateless application solves this issue.
The second one is: what if my instance 42 contains a different configuration than instance one? I won't get the same result, for sure. So the challenge is to find an automatic way to guarantee that each spawned instance of a given application version starts with the right configuration version. Semi-automatic ways are not an option, because unlike the good old days when an operator connected to a virtual machine to upload the right configuration file, we are now dealing with ephemeral instances, and even instances that we cannot connect to. So usually we think of two classical approaches. The first one, the simplest, is to embed the configuration file with its values inside the artifact we deploy. That way we are sure that each instance spawned from this artifact gets the right configuration. From the developer's point of view it's easy to think that way, because the values are hard-coded in the artifact, and because we keep a central model of the configuration: it gives us a means to know what the application needs to start. But from the operator's point of view, we now have an artifact that is tied to an environment: the artifact meant for the dev environment cannot be deployed on the staging environment. We break any kind of traceability between deployments, and we also break the trust between developers and operators, because we fall into the "it works on my machine" problem. We need to reconcile those two worlds, and a single artifact would help.
The twelve-factor app, with its third factor, tells you to store the configuration in the environment. More precisely, the way we see it implemented is to store the configuration values in environment variables so that the application code can consume them. The problem is that, although we now have a single artifact, we consume the environment variables directly, the code is littered with scattered getenv calls, and the developer no longer has a single logical view of the configuration.
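To make the contrast concrete, here is a minimal sketch (hypothetical Python, with made-up variable names, not code from the talk) of direct environment consumption versus a single configuration view:

```python
import os

# Anti-pattern: every module reads the environment directly, so no single
# place describes what the application needs to start.
def scattered_consumer():
    return os.environ["KAFKA_HOST"]  # fails with KeyError at use time, deep in the code

# Configuration-layer style: one object gathers every expected variable up
# front, giving developers a single logical view and failing fast.
class AppConfig:
    REQUIRED = ("APP_NAME", "KAFKA_HOST")

    def __init__(self, env=None):
        if env is None:
            env = os.environ
        missing = [k for k in self.REQUIRED if k not in env]
        if missing:
            raise RuntimeError(f"missing configuration: {missing}")
        self.app_name = env["APP_NAME"]
        self.kafka_host = env["KAFKA_HOST"]
```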
We need a third path, and the configuration layer pattern is this path. In this model, a single artifact embeds a configuration model without its values, and the values are stored in the environment. It is up to the configuration layer to, one, expose the values to the application and, two, before exposing them, fetch those values from the environment. This is what we describe here: we have the templates that represent our configuration model, and whether it is deployed in the first environment or in a second one, the layer will inject and render different values.
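That render step can be sketched like this (a minimal illustration, not a real configuration library; the template keys are invented):

```python
import string

# The artifact ships the configuration model as a template, without values.
# (db.host / db.port are illustrative keys, not from the talk's demo.)
TEMPLATE = "db.host=${DB_HOST}\ndb.port=${DB_PORT}\n"

def render(template: str, env: dict) -> str:
    # substitute() raises KeyError on any missing value, so an instance
    # deployed with an incomplete environment fails instead of starting
    # with a partial configuration.
    return string.Template(template).substitute(env)
```

The same template, deployed in a first environment, renders one set of values; in a second environment, another.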
So now we have an application that is protected from its environment: we do not consume it directly. The configuration layer acts as a single point of truth for the configuration values, and it is also in charge of fetching the data that constitutes those values. Keep in mind that the configuration layer is conceptual: it's not a technology, it's more an association of libraries and patterns. If we talk about libraries, we would have different kinds of libraries depending on the ecosystem we consider, and therefore different levels of work to implement a configuration layer. So we saw the theory; now let's see the practice, in particular how to expose values to the configuration layer.
We can expose values, as always, with environment variables, but we can also provide them through web servers that the configuration layer will be in charge of requesting. In particular, we have Spring Cloud Config, which exposes a web application and, under the hood, can fetch data from different data stores, whether a git repository, a blob store or a Vault, for instance. Note that the application foo here, represented by this container or Kubernetes pod, does not communicate directly with the environment nor with the web server. Only the configuration layer serves as the single point of truth for the configuration values, and we can see it as a configuration gateway. In that sense, the previous example does not show you how to fetch those values; it's more about how to store them.
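For reference, Spring Cloud Config Server serves configuration over plain HTTP at `/{application}/{profile}`, so a configuration layer only has to build that URL and merge the returned property sources. A minimal sketch (the base URL and names below are placeholders):

```python
import json
from urllib.request import urlopen

def config_url(base: str, application: str, profile: str) -> str:
    # Spring Cloud Config Server convention: GET /{application}/{profile}
    return f"{base.rstrip('/')}/{application}/{profile}"

def fetch_properties(base: str, application: str, profile: str) -> dict:
    # The JSON answer lists propertySources highest-priority first, so we
    # merge from last to first and let earlier sources win.
    with urlopen(config_url(base, application, profile)) as resp:
        doc = json.load(resp)
    merged = {}
    for source in reversed(doc.get("propertySources", [])):
        merged.update(source.get("source", {}))
    return merged
```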
With a client, we implemented this pattern in a PHP ecosystem, with the following initial situation. We had an application repository that was deployed through a sync operation on an on-premise environment, on a given server. The configuration was not embedded in this repo; it lived in a second repo that contained every configuration of all the client's applications. The deployment process deployed both, first the application code, then the configuration repo, and created the symbolic links to the right files. One obvious problem was that we were exposing too much information. And second, as developers we had no way to know the model of the configuration unless we had access to this configuration repo. So our first step was to implement a CI/CD pipeline to replace the in-house CLI tool that was in charge of the synchronization with the server and the creation of symbolic links to the configuration files. What we can notice is that inside a pipeline environment we can do everything we want; in particular we can implement this concept of a configuration layer. The first step was to put the template of the configuration inside the source code. This way, inside the pipeline environment, we can render it with the environment values we provide. How? Thanks to a file of values fetched from the Spring Cloud Config web server. A rendering process then set the values inside the template, and finally we shipped these configuration files with the PHP files inside the final artifact that we deployed on the environment.
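That rendering step of the pipeline can be sketched as follows (illustrative code with a made-up `{{KEY}}` placeholder syntax; the real project took its values from Spring Cloud Config):

```python
import re

def placeholders(template: str) -> set:
    # collect every {{KEY}} marker declared in the template
    return set(re.findall(r"\{\{(\w+)\}\}", template))

def render_config(template: str, values: dict) -> str:
    missing = sorted(placeholders(template) - set(values))
    if missing:
        # fail the pipeline rather than ship a half-configured artifact
        raise ValueError(f"unresolved configuration keys: {missing}")
    out = template
    for key, value in values.items():
        out = out.replace("{{" + key + "}}", str(value))
    return out
```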
So we still have the drawback of producing packages that are tied to a given environment. But we saw that we now have the configuration layer concept inside the pipeline environment, and we think we can move this logic into the target platform, which in our case was fortunately Kubernetes. There we defined an init container implementing the logic of fetching the values from the environment, again from the same Spring Cloud Config server, in order to render the template exposed inside the configuration volume. Once this staging was done, the application was allowed to start with the right configuration. So we can see that the configuration layer is more about choosing the right tool and associating it with the right platform, and in theory it's accessible to every kind of ecosystem. But you may say that this is too much manual work, and that maybe some libraries exist that do the work anyway. With Joachim, we think that Lightbend Config is this kind of library: it was not thought of at the beginning as a configuration layer technology, but it is very well adapted to the cases we just described. So now we will see an example with Lightbend Config of how to implement our three layers of configuration. Lightbend Config is a Java library that you can use from Java, Scala, Kotlin and all JVM
languages. It is also a new configuration format at the same time: it describes the HOCON format, which is close to JSON and compatible with it. If you have valid JSON, you have valid HOCON. But it comes with a lot of interesting features, like definition order that matters, duration units to make temporal values explicit, like "30 seconds", and real typing, so it's a real object tree like a JSON configuration.
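For instance, a few of these features in a small HOCON fragment (the keys are illustrative):

```hocon
demo {
  # typed values, and duration units for explicit temporal values
  request-timeout = 30 seconds
  max-retries = 3

  # the order of definitions matters: the later one wins
  mode = "draft"
  mode = "final"    # effective value
}
```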
You may also have references, and a central entry point for all your configuration: once you ask Lightbend Config to load your configuration, all configuration files, JVM system properties and environment variables are visible in the same place, in the config object. You can still split your configuration files to keep separation of concerns and have only one entry point with includes. We'll see how the standard orchestration of loading and merging allows us to build three layers
of configuration. For our first layer, we will use the standard behavior of Lightbend Config, that is, the automatic loading of all files named reference.conf at the root of the classpath. They are concatenated and merged, so it's very important to have your own namespace to avoid conflicts. This is very useful because, for a library, you will use this file to validate your config structure, since it contains the full configuration with default values. For an application, it won't hold all the structure, but only default values; mandatory values without a default won't be here, and the application will still access the configuration through the configuration layer, which here is the Lightbend Config library. Then our second layer will be another file, application-fix.conf, which will be loaded programmatically as a fallback configuration for the third layer that we will see later. Because reference.conf is always low priority, since it always holds default values, application-fix.conf will contain values that override some defaults and values that complete what is missing in reference.conf. Typically, in this file you will put algorithm parameters, business parameters, everything tied to the behavior of one version of your artifact. And this file, very importantly, like the reference, will be inside your artifact.
So it will be shipped with your application, in the Docker image if you want, but preferably inside the jar itself, because it is tied to a version of your application. Nothing inside it varies depending on the environment, but for more flexibility you can already include references to environment variables. Then our third layer will be the application-runtime.conf file, this one.
For this one, we will benefit from another standard behavior of Lightbend Config: you can specify a path or a URL for this file as a system property. That is, when you launch your JVM, you can specify -D flags, which are system properties, and one of them will be the path of the configuration file. That allows us to write code without a specific reference to this file, and this is very important, because this file comes from the environment. In our example it will be on an external disk, mounted as a volume inside our container, but you could also put it on a config server and specify it as a URL. Basically, we will put inside it everything that depends on the environment: hostnames, ports, sizing parameters, technical parameters depending on the environment, or arguments that you would pass to your application to specialize the instance. Now let's see how to implement it.
So I will show you a Scala application. What do we have here? Here we have the reference.conf file. Because it's an application, I put only default values. And here you can see what you can do with Lightbend Config, especially the HOCON format. It's like JSON, but you can use a dot syntax to avoid a cascade of curly braces: a dotted namespace is equivalent, in JSON format, to a chain of nested objects, but you can shorten it.
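As a sketch (with illustrative names), the dotted form and the nested form below are equivalent in HOCON:

```hocon
# dotted syntax
config.demo.kafka.enabled = true

# equivalent cascade of objects, as in JSON
config {
  demo {
    kafka {
      enabled = true
    }
  }
}
```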
You have an application name that is fixed, a default value; that's okay. You have an instance id that is defined only if the HOSTNAME variable is defined, because of the question mark: if the HOSTNAME variable is not defined, the instance id won't be defined. For this example, we imagine that our application consumes and produces messages on a Kafka cluster, so we want to structure our Kafka configuration. We put defaults: a consumer configuration object with a default session timeout, and some default producer parameters. Let's say these default parameters make the application work nearly everywhere, except if you want to override them for a specific use case. So these are default values. Then, in the same jar, we will have application-fix.conf.
So here, the example is very simple. I don't have any real business values, but let's imagine that all those variables are tied to the application version. We want to redefine the application name: this line overrides the one in the reference.conf file, but only if the APP_NAME environment variable is defined. We add another environment variable for the instance id: if HOSTNAME isn't there, it will try INSTANCE_ID, because the definitions are read in order, so the last one overrides the first one. Then we add the group id for the consumer group, and a client id for message production.
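A hedged reconstruction of what such an application-fix.conf can look like (attribute and variable names are illustrative, not the demo's exact ones):

```hocon
app-name = ${?APP_NAME}          # overrides the reference.conf default only if APP_NAME is set

instance-id = ${?HOSTNAME}       # first candidate
instance-id = ${?INSTANCE_ID}    # definitions are read in order: the last defined wins

kafka {
  consumer.group-id = ${app-name}                      # reference: one group per application
  producer.client-id = ${app-name}"-"${instance-id}    # one producer id per instance
}
```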
We decide that, for example, the consumer group is linked to the application name; here it's a reference to the app-name attribute, because all instances of the application will be in the same consumer group. And because each instance of the application will have its own producer, I create the client id from the app name and the instance id. You may have noticed that the instance id is not always there, because for now we define it only if some environment variables are present. That means that if we start our application with no environment variables, this reference won't resolve, Lightbend Config will throw an exception, and the application won't start. That's exactly what we want: if the configuration is not consistent, we don't start. Another file is
outside our application: the configuration file we pass at start time, application-runtime.conf. Here you will find attributes that will be merged into the Kafka consumer structure: the host names and the port number. For this example we decided to keep the default port, but in a real example you may want to put a reference to another environment variable, or a fixed number.
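Such an application-runtime.conf, mounted from the environment, might look like this (host names and ports are placeholders):

```hocon
kafka.consumer.bootstrap-servers = [
  { host = "kafka-1.internal", port = 9092 },
  { host = "kafka-2.internal", port = 9092 }
]

# same cluster for producing: copy the array by reference
kafka.producer.bootstrap-servers = ${kafka.consumer.bootstrap-servers}
```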
And here I copy the bootstrap servers array from the consumer to the producer's bootstrap-servers attribute, because I consume and produce on the same cluster. So here we see the main features of Lightbend Config, and now let's see how it is loaded in our application. It is very simple: it loads the configuration, displays it, and that's all. The first interesting line is this one: with this line you will load reference.conf, application-runtime.conf and application-fix.conf, and everything will be overridden by JVM system properties, because this is the convention in Lightbend Config.
The load method always loads reference.conf; that's mandatory. And because we don't specify a path for the config file, it uses the system property that specifies the path or the URL of the config file, which for us is a path pointing to a mounted volume and a file with a specific name, application-runtime.conf. And, as we said in the slide about layer two, we define application-fix.conf, which is inside the application jar, as a fallback. Because reference.conf is always low priority, even with the fallback, reference.conf is consulted last.
So here application-runtime.conf has high priority. If we don't find our variable inside this application-runtime.conf file, we look into application-fix.conf, and if we don't find the variable inside that file, we look inside reference.conf. And because JVM system properties override everything, they are very interesting for overriding some configuration values at launch time. So here we display the configuration, and then we map it to a configuration model.
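The merge order just described can be sketched generically (this is plain Python for illustration, not the Lightbend API):

```python
def effective_config(reference, application_fix, application_runtime, system_props):
    # lowest priority first; each successive layer overrides the previous one
    merged = {}
    for layer in (reference, application_fix, application_runtime, system_props):
        merged.update(layer)
    return merged
```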
Lightbend advises us to always have an object model for our configuration and map the file to it. Why? Because the code will then use simple objects that are safe and without side effects. Once the configuration is loaded and validated, your application can start without any risk of inconsistency in the configuration. That's very important. Here, because we are in Scala, we use the PureConfig library, which maps the file structure to my model automatically. So let's see quickly how the model is done.
Here you can see that it mirrors the file, with the app name, the instance id and the Kafka configuration at the first level. Inside the Kafka configuration we have the consumer configuration and the producer configuration. Then we have the bootstrap servers inside an array, and each occurrence of the array is a host entry, with a host and a port. My model is tightly linked to the file structure, and that's very important for validation and for our application. So not only does the application load the configuration from a single entry point, it is mapped to a model, so that the configuration becomes totally transparent, without any convoluted or inconsistent access pattern: calls to get, checking whether the variable is defined, checking the type at runtime at the moment the variable is used.
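Outside the JVM, the same fail-fast mapping idea can be sketched like this (hypothetical field names; in the talk's demo, PureConfig does this automatically):

```python
from dataclasses import dataclass

# Map the raw configuration into a typed model once, at startup, so the
# rest of the code never checks presence or types again.
@dataclass(frozen=True)
class KafkaConfig:
    bootstrap_servers: list
    group_id: str

@dataclass(frozen=True)
class AppModel:
    app_name: str
    kafka: KafkaConfig

def load_model(raw: dict) -> AppModel:
    try:
        return AppModel(
            app_name=raw["app-name"],
            kafka=KafkaConfig(
                bootstrap_servers=list(raw["kafka"]["bootstrap-servers"]),
                group_id=raw["kafka"]["group-id"],
            ),
        )
    except KeyError as missing:
        # fail fast: an inconsistent configuration prevents startup
        raise RuntimeError(f"missing configuration key: {missing}")
```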
All those things have been checked just before. So I will show you a run of the application. I should be in the docker directory; let me show you the Dockerfile quickly to understand how it is done. Here you see that I use the system property to specify the path: the config path points to a config directory, which is the volume that we will mount, and a config file named application-runtime.conf.
Then I run my application with a script that launches Docker. On this line you see that I use a host directory to mount a configuration volume inside my container, so I can put everything that depends on the environment in the host directory. And what we get is this. We have a lot of things here because I activated a kind of debug display, so that we can understand how the values are merged.
For example, we have here our original namespace. Then we have comments from the Lightbend Config library telling us that the application name comes from line two of the reference.conf file: no APP_NAME environment variable was there, so the default app name has been kept. The instance id has been defined, so we can deduce that another variable defined it; in this case it's the HOSTNAME variable, set to the container id. Then the Kafka structure has been merged from several files, you may notice: application-runtime.conf, application-fix.conf and reference.conf. And if we go deeper into this structure, we can see that each variable has an origin, some coming from application-runtime.conf, some from reference.conf. So everything is exactly what we expected.
We have only one big configuration from all the files, even if the application configuration in our artifact was not complete, and even if we specified the last part of the configuration at the last minute. And the result is here: a display of our configuration model. I hope it was clear for you. I wrote an article about Lightbend Config on our blog, and Ismail wrote about configuration in the cloud on the same blog. And I can't let you go without saying that we are hiring. So if you are interested in what we are doing, if you are passionate, you can join us. The news is that we are creating a new remote agency: if you are not near Paris or near Nantes, you can still join us from all around France in this new remote agency. That's all folks, thank you for your attention.
Thank you for everything. We hope that this session will inspire you to apply this pattern, whether new to you or already known, in your own business context. We also hope to see you on the Discord channel so that we can discuss these topics further. See you!