Transcript
This transcript was autogenerated. To make changes, submit a PR.
Okay, welcome again to my talk on Nebulous, a meta operating system for brokering
hyperdistributed applications on cloud computing continuums.
The talk that I wish to deliver to you today is
titled "Have your head in the clouds while keeping your feet firmly
on the ground". And I'm Radosław Piliszek from 7bulls.com,
and this is my email address at the bottom. The project
that I'm about to describe is funded by the European Union. As for
the title, while the only connection between the title and the
project is the word cloud, you will see from the
presentation that this title is actually pretty relevant to the
concepts that I will be showing you as both having
your head in the clouds and keeping your feet firmly on the ground
makes sense in the various contexts that
will be discussed quite soon.
Okay, so without further ado, let me show you the agenda.
So this is the agenda of today's presentation. First we'll
discuss the concepts, the concept of the cloud continuum.
Why do we need cloud continuum? How does it
relate to other concepts that are found, commonly found in
discussions like that? Then we'll discuss what a meta operating
system is. I will introduce the Nebulous project and give
you various details on this project. We'll also have a
sneak peek into the Nebulous architecture that we are planning.
And finally we will have a summary
of the contents of this talk.
So, regarding the cloud continuum per
se, this is very important. This is actually fundamental
to this project. This actually
has been defined now. So the cloud continuum,
this term has been used in various ways,
obviously. For the purpose of a common ground,
I'm introducing it now in this form, following
the paper by Moreschini from 2022, actually Moreschini
et al. It was published in IEEE Access, and the
authors provide the following definition of cloud continuum:
it is an extension of the traditional cloud towards multiple
entities (the examples given are edge, fog, and IoT)
that provide analysis, processing,
storage, and data generation capabilities.
So basically,
if we actually translate it, transform it into other form,
the continuum of something is an extension of something, right?
And in this case it is a cloud continuum. It's an extension of the
cloud towards some other entities that are
relevant in this context. So in this context of having computation
somewhere else, these are edge, fog, and IoT, and we'll also
see it in more detail on another slide.
What I wanted to underline, and actually did on the slide, is the
data, and this is actually very important.
Data is what is driving the computation
and actually all the kinds of efforts that
are being made, in that there is actually
no computation without the data. All right, so we'll see
that in a moment as well. But first of all,
why cloud continuum? Why do we even study such a
thing? Why is cloud not enough?
So if you are practicing cloud,
you probably already know that cloud is not
a solution that will cater to all and every need
that your application, that your infrastructure, that your solution service
might have, right? First of all, we know that the
applications themselves know no boundaries. They don't
actually care, unless they were written in such a way
that they have to run in the cloud. For
example, we see this lift-and-shift approach,
where people are just moving applications naively, let's say,
into the cloud, and probably also bearing
the cost of doing it this way. But nonetheless,
applications know no boundaries; they don't care.
They prefer to have scalability.
It's good for operations to have the scaling capabilities.
So cloud is actually good for that. So considering
the continuum from the perspective of having some other resources
and joining those to the cloud, you enable yourself to
utilize the semi-infinite,
let's say almost limitless, resources
of the clouds, public clouds.
So usually when I just say cloud, I will mean the
public cloud, which, as we see it, offers us infinite
amounts of
processing power. So the
cloud might not cater to all the requirements that apps may have.
So you might have some technical constraints and non-technical constraints
that are hard to satisfy with a cloud solution, with a cloud-only
solution. For example, regarding the
technical constraints, your latency and bandwidth requirements might
be such that, given the various ways that you need
to transfer your data, it might not
be sufficient to have all your computing
put into the cloud. You might also be using some
specialized hardware, I don't know, maybe FPGA. You are doing
some scary new algorithms that are optimized for some
FPGA, and you actually need to
utilize that software to stay competitive
with other competitors. And similarly,
you have the IoT.
Generally you cannot directly connect the IoT to
the cloud. You do it via some other means, via some other
intermediate provider. Of course you can connect
it directly to the Internet and then reach out
almost directly to the cloud, but this is usually not the case
because of the various ways that you would like to process
your data. And this is actually very
relevant to the latency and bandwidth that I discussed
a moment ago. Similarly, on the side of
the non-technical constraints, you are considering compliance
with various policies like the European GDPR,
the California CCPA, and the other, global
ones like PCI DSS, and HIPAA, which is,
I think, mostly United States. And you might also
have some internal business compliance. In
cases like this, cloud providers usually provide
some way to help you with ensuring
compliance with the policies that
they are at least aware of. But in some cases,
it might actually be that you need to keep your
data off the cloud: it shouldn't be contacting the cloud
in the first place, at least not unless it is in some anonymized,
cleaned-up form. Right?
And in these cases, the cloud only solution
might not be the best one.
And considering the actual current landscape
of how companies are behaving towards the clouds,
we know there are companies that have still not migrated to the cloud,
that they
run off their data centers, they use collocation services, or otherwise
avoid the cloud at the moment still.
But some of them are looking for a solution
that will help them to utilize the cloud and do it properly,
not just going with the lift-and-shift approach, about which most companies
have already learned the hard way that it is going
to introduce a lot of cost for not
that much gain out of the cloud.
And they would also like to keep their current investments.
So they already put some money in
the existing infrastructure, and it doesn't make sense to say
that, well, this is all obsolete now, let's throw it away.
Well, losing bucks is no good for a company.
And some other companies that, like I mentioned already,
have learned the hard way that the cloud is
costly, they are actually looking for an optimized way to utilize
the resources in the clouds, as well as looking
for some insights, whether some
of the workflows that they have would better be made
running on some other kind of infrastructure,
rather than using the clouds,
the public clouds. And the
underlying aspect to all of that is that
we could answer the "why cloud continuum" with: because of data.
Now we hear the terms
big data, the Internet of Things, the IoT, the Industrial
Internet of Things, et cetera, all of which produce
a lot of data.
And considering computation in general, when you
think about that, even if you consider a game,
it's always processing some data. So in case of a game,
of course, this is like processing the interactions,
the human interactions, against some
world state. And basically, this is your data. So the
world state and the interactions that happen to it constitute a constant stream
of changes to that world, be it
a multiplayer game or a single player game. Of course, for a single player game,
it would be on a local machine only because it doesn't make sense
in general to centralize it.
But also in all other cases, you can
actually define what data is like. For example, if the user has
a form, then the user is inputting some data into the application,
into the services, and you can also get
the data from other sources, like the devices, the sensors.
So you either have a human or you have a non human source
of your data, which creates some data provenance,
of course. And further,
this data that you gather, that you generate, that you obtain,
needs to be stored somewhere. It's either stored transiently
while it's being worked on, so this is in the
working memory usually, or it can be
stored persistently. This is most often the
storage that we think about, the persistent storage that
doesn't get cleaned
up after something fails or gets disrupted,
et cetera. Then we have the state of the data
that is being transferred. So it is in transit between two
computing nodes that need to communicate this data to carry
out computation. Or maybe the other node is going to actually
store the data in some persistent way. And finally,
of course, the thing that
takes the most of the computing power is the actual computation.
So we know that of course the various chips
need to be working for the generation,
for the storage, for the transfer, of course. But the
actual processing is the thing that is
consuming the computing power. And similarly
to the generation and obtainment of data,
in this case of processing and analyzing data, it can be either
human-driven or automation-driven. So either
it's a human doing some computations, like running commands,
various commands, and describing some results
in Excel, for example, or in Excel-like
software, so spreadsheets, or it is
automation-driven, so it's a workload, a service that has been
programmed to do something this way. And this is the computation.
And by adopting the cloud continuum,
what we are allowed to do is optimize
the flow, the data flow, the way that we act
upon the data. And through
this we can achieve the best utilization of available resources.
Like the slide says, in that if we disregard
the fact that we might have some better
environments, some better place to carry out these operations,
we are actually disregarding some optimal solution
that is impossible when using just
the cloud. Like I mentioned already with the IoT case,
it might make sense for various reasons to process
the data closer to the IoT device, rather than
trying to stream all the data directly to the cloud.
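To make the latency and bandwidth argument a bit more concrete, here is a minimal sketch in Python. It is not part of the talk or of Nebulous; the Site class, the field names, and all the numbers are assumptions chosen purely for illustration. It compares the end-to-end time of shipping a batch of data to a site and processing it there.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """A place where a batch of data could be processed (hypothetical model)."""
    name: str
    uplink_mbps: float    # bandwidth from the data source to this site
    round_trip_ms: float  # network latency from the data source to this site
    compute_ms: float     # time to process one batch at this site


def batch_time_ms(site: Site, batch_megabits: float) -> float:
    """Total time to ship one batch to the site and process it there."""
    transfer_ms = batch_megabits / site.uplink_mbps * 1000.0
    return site.round_trip_ms + transfer_ms + site.compute_ms


def best_site(sites: list[Site], batch_megabits: float) -> Site:
    """Pick the site with the lowest end-to-end time for the batch."""
    return min(sites, key=lambda s: batch_time_ms(s, batch_megabits))


# Illustrative numbers only: a slow but nearby edge gateway
# versus a fast but distant public cloud region.
edge = Site("edge-gateway", uplink_mbps=1000.0, round_trip_ms=2.0, compute_ms=400.0)
cloud = Site("public-cloud", uplink_mbps=50.0, round_trip_ms=60.0, compute_ms=50.0)

print(best_site([edge, cloud], batch_megabits=800.0).name)
print(best_site([edge, cloud], batch_megabits=8.0).name)
```

With these made-up numbers the large batch is better processed at the edge, while the small batch is better sent to the cloud, which is the kind of trade-off a cloud-only setup cannot exploit.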
And where might this continuum live? So like mentioned before,
we have the public clouds. And this
is fundamental for the cloud continuum,
because in the name you have the cloud. And basically this is
usually considered a public cloud. Of course there are
private clouds, but in comparison to data
centers they really differ only in the
way that you manage your resources. This
doesn't usually meet the definition of the cloud as it is for
the public cloud, with the resources being almost
limitless, and so on and so forth.
In this case, this is often called the fog.
So the layer between the edge,
the IoT devices and the public cloud is the fog.
So whatever we put there, our data centers either managed
in a more traditional way or a private cloud way, like with
Openstack or some other cloud solution from another
provider. And then
we have, apart from the cloud, from the fog, we have
the edge. And in case of edge
also, there are multiple ways to look at it. Either you
consider the edge sites, which look like the data centers
but are actually usually smaller than
the fog-like data centers. These are also
often called micro data centers. We have
also a similar concept with multi-access edge computing,
which is what the telco providers provide,
offer to the users,
in that it's similar to the
public cloud, but running on the telco devices. Usually
the public cloud providers don't run the telco network,
so they don't operate at the level of the
end-user network, but in the case of telcos,
they can do that. They actually do that.
And this offering of multi-access edge computing is
also what is happening now.
And similarly, with each step, it's closer
to the potential sources of your data,
with the final destination,
the actual source of the current big
data, let's say, being the IoT devices. So the devices
that are usually not equipped with so much computing power,
maybe they have some computing power, but generally not as much
as a typical server that is being deployed in data
center, but nonetheless, they are there and they're important
sources of the data.
And regarding the other concepts I
mentioned in the agenda, which are often discussed
together with cloud continuum and have been mentioned
many more times than cloud continuum so
far, because they are older terms: they are
also often mentioned now by the
public cloud providers, as they offer solutions for
the multicloud and hybrid cloud approaches
to cloud. So in case of multicloud, what they usually
mean is that you will have two or more public clouds,
right? So pick two major cloud
providers and they will probably have a service to make it easier for you
to use their services to connect their networks, et cetera, et cetera.
Regarding the hybrid cloud, it's similar, but it says that
you would have this public cloud and not a public cloud. So this is
what makes it hybrid, because it's not really one type
of cloud, like in the case of multicloud, where we have all of them
public. In this case you have one public, the other one is not
public. And like I mentioned previously, it could be the case
that the other hybrid part is not actually
a typical cloud. It's not even managed like a cloud.
It's just your data center. It can act in a
similar role, and it usually falls under this term.
And finally, this is like a generalization of this
concept. So you have the cloud continuum when you just go
mix and match as you wish, as it suits your solution, as it
suits your product that you're trying to deliver.
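The distinction between multicloud, hybrid cloud, and cloud continuum can be sketched as a small classification function. This is only an illustration under stated assumptions: the Kind enum, the classify function, and the rule that any edge or IoT resource makes a deployment a continuum are hypothetical simplifications, not definitions from the talk or the literature.

```python
from enum import Enum


class Kind(Enum):
    PUBLIC_CLOUD = "public cloud"
    PRIVATE = "private cloud or data center"
    EDGE = "edge site"
    IOT = "IoT device"


def classify(resources: list[tuple[str, Kind]]) -> str:
    """Label a deployment given (owner, kind) pairs. Simplified, assumed rules."""
    public = {owner for owner, kind in resources if kind is Kind.PUBLIC_CLOUD}
    other = {kind for _, kind in resources if kind is not Kind.PUBLIC_CLOUD}
    if other - {Kind.PRIVATE}:
        return "cloud continuum"      # edge or IoT resources are mixed in
    if public and other:
        return "hybrid cloud"         # public cloud plus something non-public
    if len(public) >= 2:
        return "multicloud"           # two or more public clouds
    if len(public) == 1:
        return "single public cloud"
    return "no public cloud"


print(classify([("provider-a", Kind.PUBLIC_CLOUD), ("provider-b", Kind.PUBLIC_CLOUD)]))
print(classify([("provider-a", Kind.PUBLIC_CLOUD), ("own-dc", Kind.PRIVATE)]))
print(classify([("provider-a", Kind.PUBLIC_CLOUD), ("factory", Kind.EDGE)]))
```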
All right, so finishing with
the cloud continuum now, a little slide on the meta operating system.
So since you are here, you probably know
the operating system definition, at least at the very,
let's say, shallow level, but you generally know what an operating system
is. And Wikipedia gives
this definition that matches very well what the operating system
does: it manages the computer hardware and
software resources and provides common services for computer programs.
And if you go and try to create the meta operating
system definition by studying what the meta means: in this case
the meta, from the Greek meta, meaning
after or beyond, is a prefix meaning more comprehensive or transcending.
And then, if you build the definition of a meta operating system,
which is at the moment not that well defined in general in
the literature, you get this definition
of a meta operating system. This is system software also,
but this manages sets of computer resources.
So you have more than one computer,
more than one source of computer resources,
let's say. And still this system software
provides common services for computer programs running on them.
So you pluralize this, you layer it on
top of the existing operating systems and this gives it
the meta flavor of an operating system.
And this is basically the goal with the project that I'm about to describe.
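The "pluralized" definition above can be sketched in a few lines of Python. This is a toy model under assumptions, not the Nebulous design: the Node and MetaOS classes, their methods, and the first-fit placement rule are all hypothetical and exist only to show one layer of system software managing a set of computers and offering a common service on top of them.

```python
class Node:
    """One computer with its own operating system (hypothetical model)."""

    def __init__(self, name: str, free_cores: int) -> None:
        self.name = name
        self.free_cores = free_cores
        self.programs: list[str] = []

    def run(self, program: str, cores: int) -> None:
        # The node's own operating system manages its local resources.
        self.free_cores -= cores
        self.programs.append(program)


class MetaOS:
    """Manages a set of nodes and offers one common 'run' service on top."""

    def __init__(self, nodes: list[Node]) -> None:
        self.nodes = nodes

    def free_cores(self) -> int:
        # Resources are pooled across all managed computers.
        return sum(node.free_cores for node in self.nodes)

    def run(self, program: str, cores: int) -> str:
        # Delegate to the first node that can fit the program (first fit).
        for node in self.nodes:
            if node.free_cores >= cores:
                node.run(program, cores)
                return node.name
        raise RuntimeError(f"no node can fit {program!r}")


meta = MetaOS([Node("edge-box", 2), Node("cloud-vm", 8)])
print(meta.run("analytics", cores=4))
print(meta.free_cores())
```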
And here it is. Here is the intro to Nebulous.
Nebulous, as it has been shown on the very first slide, on the title
slide is actually a name for a meta operating system for
brokering hyperdistributed applications on cloud computing
continuums. So we have discussed the meta operating system, we have discussed the cloud
computing continuums and what we haven't discussed is brokering
hyperdistributed applications. But of course it's been
all the time in the background when we discuss that we have the
data. The data is being operated on by the services, by the applications,
in other words by the programs that are running there.
They're distributed because they can run on more than one computer.
This is where it comes from, so we have actually already mentioned that. And
that they're hyperdistributed just means that, well, this is going
hyperscale, right? So not like two
computers in one room, but going over distances, over many
components. And brokering means suggesting
the optimal solution, finding the
proper way, the optimal way to actually utilize those
various resources. So we have also mentioned that. And
Nebulous is a consortium project, so it's
funded by the European Union, and its consortium actually consists
of 16 organizations. We have both academia and business,
we have both tool providers and consumers, which provide the use cases.
The schedule is as follows. We have three years for this project
and it's quite new. The link
is nebulouscloud.eu. I invite you to visit
this site for more details, but I'll try to provide some more
for you already. So these are the logos of the various
partners in the consortium. You might recognize
some of them, you might not, but for the details on them please visit
the site. And this
is one of the slides that tries to describe how nebulous
is being imagined. So what
we already discussed is cloud continuum.
So you have resources that are in various
places, you have the public clouds, your private clouds or otherwise
your major data centers, let's say you have your edge devices
that are those micro data centers, at least you consider
them to be on the edge of your solution.
So closer to the actual sources of the data you have at the
bottom, your IoT devices, so they can
also count as edge devices, but they don't have that much computing power.
And what you want to do is that for your particular solution,
you want to have an ad
hoc infrastructure solution, right? So for your software solution,
for your product solution, you want an ad hoc infrastructure
solution: your ad hoc cloud computing continuum,
designed specifically, brokered specifically for your
use case, for your application.
So mix and match whatever you want. And also, as I
mentioned previously, the telco providers with their multi-access
edge,
their own private data centers that are very close to
the end-user connections, are also
to be selectable by such a project,
right? Okay, so regarding
the various details on the Nebulous project, regarding development and
resources, Nebulous will be developed in the open
and it will be offered as open source software, and the license will be
Mozilla Public License version 2.0. So this is a very liberal
license, you might know it already.
It's similar to the Apache license, but it differs in
a few bits. So for those interested, you might look it up on
the Wikipedia. The development platform is
yet to be discussed again because,
well, this is, sorry. So it's yet
to be discussed because you are previewing this
from the recording. I will probably know by the time
you are watching this. So you can reach me on the discord or
via my mail or otherwise via the project website.
But at the time as I am recording this,
this is not yet fully established,
so bear with me for that.
But also I would like to mention that Nebulous is not only about
the software that we are producing. So there will be a software deliverable,
there will be something to plug and play, to configure,
run your software, et cetera. But we are also producing other
resources, other deliverables, as we call
what we do. We have the open access research,
we have the cloud continuum culture, dissemination, this being a part of
it. We have training that we offer, and we
also have a relevant open call that I will
describe in a moment.
And regarding the directions of work that we take in Nebulous,
these are the directions of work. These were also included in
the description of
the talk that I'm giving now. So first
of all, there is a way to model the
actual applications that considers the full cloud continuum,
cloud computing continuum, including the fog and edge
aspects as I described them before. This will
be researched, and a solution to do that properly
will be described. Again, this will consider both the infrastructure
that you can offer, that you
have, that you offer to your application by virtue of having it,
and that which you would like to use from the public cloud providers and
telco providers. But also, apart from the infrastructure, you give
also the application requirements and the details
on the data streams that you will be using
with your application. So the data will be driving that.
And similarly you
define this. We give you a way to define that so
that the platform can assure the quality of service that you desire
and also optimize it. So the next
bullet point is regarding the optimization, so that it actually selects
the resources that work best for your
particular use case. Third bullet point
considers that
it's important to not forget the security aspects
of doing that. So whatever you encode
in the application specification should
be transferred, should be materialized
in this cloud computing environment,
in this cloud continuum environment, in such a way that it
still considers the security aspects, that the final solution is
actually secure by all means. And the final
bullet point is something novel that we are
also trying out. This is
not per se,
let's say, specific to cloud continuums, but generally
for the services that are being offered, it is
the monitoring and conducting of smart-contract-based
service level agreements. So the
service level agreement is not only realized by its
textual form that is being analyzed and interpreted
by humans, but it's also automatically monitored,
recorded, and its violations are detected in
a smart contract way. Right. So this is also
some relevant aspect that we are pursuing in this
project. And the open call that I mentioned
earlier is that we have this
amount of euro in the fund for nine
grants and this open call will open in 2024,
actually. And what the
open call gives you if you decide to join it and if
you actually win this call is that you will
be given the opportunity to discover the software, to roll
out your software, your solution on the Nebulous
platform that we'll be delivering, and you will help us.
So you will join this in the middle of the project and by joining
it you will also help us build a better product
by the end of the Nebulous project.
Right. This is one of the newest additions,
let's say, to the European projects. And
finally, going a little more
into the architecture, into the technical
side of the Nebulous platform, regarding
the architecture assumptions: what Nebulous tries to do
is obviously to support the scope of the project.
This is like paramount that we support what the
scope of the project is in
a very coarse-grained way, at the very high level,
let's say. I have already given you a description of the scope of the project
by all the slides, by all the previous slides,
and this is the general direction, so that it's going
to cover the various ways to cater for
those requirements in the scope.
Then the other assumption is that
we support the use cases of Nebulous. I'm not introducing those
because I don't really have time in this relatively
short talk, let's say.
Well, maybe not relatively short, but not long enough
to really discuss every little detail that is relevant to
the project that I'm describing. So for details on
the use cases, and there are various, please visit
the website that I mentioned previously.
And the other bullet points speak
about the technical, actual selection of the components
that would be used in the open source software of
Nebulous. So we are reusing, as much as possible, the open source software
that we have already been contributing
to in various ways or using in various
ways. So first of all, the open source components of multicloud
platforms that exist already from other European projects,
for example Melodic and Functionizer, which have already been completed, as
well as Morphemic, which is nearing its completion
now but is still a work in progress. And
on the side of orchestration, consider
whatever orchestration you can do with Kubernetes. And I said on
the slide "and more", because with Kubernetes you can actually do more than
just orchestrate. And the last bullet point is actually all about
that. What we want to offer to the end users
is that they have this user-friendly, declarative,
reproducible, and GitOps-friendly as well, app modeling
in YAML, so that it's not
scary. It's something that the entire
industry is adopting at the moment, right? So having
it the GitOps way, having it reproducible,
having it declarative,
basically making it really
manageable for the end user.
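As an illustration of what a declarative model with infrastructure, application requirements, and data stream details might drive, here is a minimal Python sketch. The actual Nebulous modeling format was not shown in the talk, so every name here (Resource, DataStream, AppModel, select_resource) and every number is a hypothetical stand-in; in practice such a model would be loaded from YAML rather than written in code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resource:
    """One piece of available infrastructure (hypothetical fields)."""
    name: str
    region: str
    latency_ms: float
    hourly_cost: float


@dataclass(frozen=True)
class DataStream:
    name: str
    allowed_regions: frozenset  # e.g. a data-residency policy


@dataclass(frozen=True)
class AppModel:
    name: str
    max_latency_ms: float  # quality-of-service requirement
    stream: DataStream


def select_resource(app: AppModel, resources: list[Resource]) -> Optional[Resource]:
    """Keep resources that satisfy the constraints, then pick the cheapest."""
    candidates = [
        r for r in resources
        if r.latency_ms <= app.max_latency_ms
        and r.region in app.stream.allowed_regions
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.hourly_cost)


resources = [
    Resource("cloud-us", region="us", latency_ms=80.0, hourly_cost=0.10),
    Resource("cloud-eu", region="eu", latency_ms=40.0, hourly_cost=0.20),
    Resource("edge-eu", region="eu", latency_ms=5.0, hourly_cost=0.50),
]
stream = DataStream("sensor-readings", allowed_regions=frozenset({"eu"}))
app = AppModel("dashboard", max_latency_ms=50.0, stream=stream)

print(select_resource(app, resources).name)
```

The point of the declarative style is that the user states constraints (latency, where data may live) and the platform does the selection, rather than the user hard-coding a target.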
And our approach is very much inspired and based on
Kubernetes custom resource definitions with custom controllers.
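The custom controller idea rests on a reconcile step: compare the declared, desired state with the observed state and derive the actions that close the gap. The sketch below shows that pattern in plain Python with no Kubernetes client involved; the reconcile function, the dictionary shape, and the action strings are assumptions made for illustration and are not the Kubernetes API or the Nebulous controllers.

```python
def reconcile(desired: dict[str, int], observed: dict[str, int]) -> list[str]:
    """Return the actions needed to move observed replica counts to desired ones."""
    actions = []
    for name, want in sorted(desired.items()):
        have = observed.get(name, 0)
        if have < want:
            actions.append(f"scale up {name}: {have} -> {want}")
        elif have > want:
            actions.append(f"scale down {name}: {have} -> {want}")
    # Anything running that is no longer declared gets removed.
    for name in sorted(observed.keys() - desired.keys()):
        actions.append(f"delete {name}")
    return actions


desired_state = {"web": 3, "worker": 1}
observed_state = {"web": 1, "cache": 2}
for action in reconcile(desired_state, observed_state):
    print(action)
```

A real controller would run this comparison repeatedly, triggered by changes to the custom resource or to the cluster, so that the system converges on the declared state.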
So we have been inspired by the existing CRDs
and we are basing our solution on CRDs and
the custom controllers that we will provide, that we will
develop in this very project. And
now the sneak peek into the Nebulous architecture. So like I mentioned,
this is still early and we don't have
many details to show you. Sorry for
that, but I'm really open to discussing that with you
via mail or via Discord while I'm there. By
the way, I will try to be there, but my
time zone might actually prevent me from being active at all times
during the actual event. But feel welcome to leave
me a message or just leave me a mail and
I can certainly talk about it
a bit more. And regarding the
sneak peek into the architecture: in
the center, we have this core that is Kubernetes-driven
and provides the various services for the
actual implementation, for the actual realization
of an application on this meta operating system.
At the top we have the users who provide, as I mentioned already,
the available infrastructure description, the application
description, the data streams description, and the goal
of the meta operating system is to realize the application in
the most optimal way using the resources that are available and considering
the various constraints that have been given in the various descriptions
from the user. All right,
so without really going into the details of the various points,
you can see that on all
the sides, on the side of the public cloud, the private cloud
or otherwise some data center, and also the edge and fog nodes,
wherever these are,
we consider that there will be another, local Kubernetes
cluster that will be somehow federated with
the main cluster, and via this federation,
as we call it now, we will ensure that the workloads
that are meant to be running in the target environment
are actually running there. So it's like a two-layer orchestration
for the Nebulous meta OS, where the core layer
decides on the first level of the orchestration and
then the second layer decides on the others.
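The two-layer orchestration can be sketched as follows. This is a toy model under assumptions, since the talk says the architecture details are still being worked out: the SiteCluster and Core classes, the "most free capacity" rule, and the connected flag are hypothetical. The core layer only chooses a site; the site's local layer chooses the node, and can keep placing work locally even when the core cannot reach it.

```python
from typing import Optional


class SiteCluster:
    """Second layer: a local cluster that places workloads on its own nodes."""

    def __init__(self, name: str, node_free_cores: dict[str, int]) -> None:
        self.name = name
        self.node_free_cores = dict(node_free_cores)
        self.connected = True  # whether the core can currently reach this site

    def capacity(self) -> int:
        # Largest workload this site could place on a single node.
        return max(self.node_free_cores.values(), default=0)

    def place(self, cores: int) -> Optional[str]:
        fitting = {n: f for n, f in self.node_free_cores.items() if f >= cores}
        if not fitting:
            return None
        node = max(fitting, key=fitting.get)
        self.node_free_cores[node] -= cores
        return node


class Core:
    """First layer: picks a site, then delegates node selection to it."""

    def __init__(self, sites: list[SiteCluster]) -> None:
        self.sites = sites

    def place(self, cores: int) -> Optional[tuple[str, str]]:
        reachable = [s for s in self.sites if s.connected and s.capacity() >= cores]
        if not reachable:
            return None
        site = max(reachable, key=lambda s: s.capacity())
        return site.name, site.place(cores)


edge_site = SiteCluster("edge", {"e1": 4, "e2": 2})
cloud_site = SiteCluster("cloud", {"c1": 16, "c2": 8})
core = Core([edge_site, cloud_site])

print(core.place(4))        # core picks the site, the site picks the node
cloud_site.connected = False
print(core.place(4))        # core falls back to a reachable site
print(cloud_site.place(2))  # a disconnected site can still place work locally
```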
Furthermore, we are discussing and considering various elements
of fault tolerance, like if one of the
sites is actually disconnected from the core,
how should it be handled so that the site would still
keep on running and optimize locally for
the resources that are available to the site, for the resources that
can still run to some extent, in a degraded mode or
otherwise, for this local
site, whatever the use case is. All right,
so things like that are what is in the scope of this
particular project. The first step is, of course,
giving a definition of that: what are the
exact details, how do we want to approach that? And this is still in
the talks, so that's what we are doing very
much at the moment; we are actually spending lots of time on that.
Well, that's basically it, what I had to
tell you about the Nebulous project and its state at the moment.
And here I have a summary for you from this presentation.
So I won't be reading it for you because
you can pause the video, it's pre recorded if you want the
summary. Actually,
the summary is a bulletization of all
the most important points from the entire presentation
that I would like you to somehow remember and keep
from this particular talk. And with
that said, thank you very much for
having me and for hearing me out to the end.
Radosław Piliszek, 7bulls.com. Ciao.