Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hi everyone, welcome to Conf42: Cloud Native. I am Adrienne Tacke, and I'm a senior developer advocate for MongoDB. Today I'd like to talk to you about multicloud magic.
So let's start on this journey. We're first going to get on the same page and define what multicloud is. You may have heard it before, you may have seen the buzzword thrown around the Internet, and it's really just a simple definition, but it's important to be on the same page so we know where we're going in the context of this presentation. After that, we'll move on to the next likely question you might have, which is: do we really need a multicloud option? Then we'll get into the bulk of my presentation, which is the different ways that you can use multicloud clusters in real world scenarios. And then finally, we're going to see how to actually set up our own cluster and see how easy it is to do in MongoDB Atlas. So we'll start at the beginning: what is multicloud? And if you have an idea or a different definition, I'm curious to hear it, so feel free to let me know either in the chat or after the presentation.
But for the context of this presentation, multicloud is probably the first instinct that you had when you thought about this question: any single architecture that uses two or more cloud providers. For the context of this talk, we're going to be focusing on the big three, so that's AWS, GCP, and Azure. As part of this multicloud architecture, it can also be a mixture of public and private clouds. So as long as you're using two or more of the public providers, or two or more of the big three, I should say, then it can be classified as a multicloud architecture. Two or more cloud providers within the same architecture is what we designate as a multicloud kind of solution. So next, do we really need a multicloud option? I want you to think: when was the last time that you had to deal with an outage of some sort of managed service? Maybe there were a few in the last few months, maybe there were some really large ones that caused some production issues for you. Think about it for a second.
Now, we know that outages are not exactly uncommon, but we tend to think they don't happen as often as they actually do. Unfortunately, they happen more often than we think. If we go back two years, we'll see a timeline of how many of these outages have actually occurred and how much impact they had. In June of 2019, there was a really large Google Cloud outage. It was so large that it was called the Catch-22 that broke the Internet. Now, what happened here was that Google routinely runs these configuration changes. They do these maintenance events often, and it's not something abnormal.
In this particular case, they had one of those usual maintenance events, and they intended for it to be applied to only a few servers in a specific geographic region. But when this particular outage occurred, there was a combination of misconfigurations and some bugs in the software that led to the automation piece of the software descheduling network control jobs in multiple locations. Wired had a really good analogy for what that amounted to: if you think of all the traffic running through Google and the Internet as data flowing through numerous tunnels, what happened was that those tunnels effectively got blocked. Only two tunnels, out of maybe six that they had running, still allowed data to flow through, and effectively this resulted in an Internet-wide gridlock. But it wasn't just Google. AWS also suffered their own outages.
Reading from the retrospective that they gave about this outage, they say this event was caused by a failure of our data center control system, which is used to control and optimize the various cooling systems used in our data centers. So what happened here is the control system had some third party code that dealt with the interactions between the devices within the data center, the things that communicated with the fans, the chillers, and the temperature sensors. There was a bug in this particular piece of third party software, and that bug caused an exchange of many, many interactions, in the millions, which effectively caused the control system to become unresponsive. And when that happened, the rest of the fiasco unraveled, right? Racks of servers started overheating, and that bug caused multiple redundant cooling systems to fail in many parts of the affected availability zones. And then we see Google again, right? In March of 2020, they had a significant router failure in a data center in Atlanta. And Azure is not immune to
this either. Specifically, when the pandemic hit, we really started to see the limits of what our cloud providers can offer us. As we all started to work from home, many of us started to use the Internet a lot more: conducting business meetings and team meetings, doing everything online, and using more of these services. We really saw how much of a toll that took on a provider like Azure. As an example, they shared some stats for some of their services. Right at the height of the pandemic, around April of 2020, they showed that Microsoft Teams usage really spiked. Microsoft Teams is their video conferencing software, and just as an example, they had about 44 million daily users, which generated over 900 million meeting and calling minutes on Teams in a single week. That is an incredible amount of new traffic at a scale they did not expect. And we saw that with the outages and shortages they were experiencing; they really felt it in Europe, and on and on. Azure also had another one in terms of bottlenecks in the APAC region in June of 2020. And the most recent one that was fairly large was the AWS outage that took down a big chunk of the Internet; that was in November of 2020 with the Kinesis Data Streams outage. So in between all
of these larger outages that you may have seen or experienced, and if you were thinking about my question before this timeline, you may have even more examples of scenarios where an outage really caused you some issues. What we want to see here is that, again, outages are not impossible; it's not that they never happen. We know that they occur. What is more important to see in this timeline, and in the examples and events that you may have experienced yourself, is that no cloud is spared from outages. And this is a very important point, because for a lot of companies, those who are feeling the pain of vendor lock-in, for example, or those who are slowly migrating to the cloud and are only considering a single cloud, if they need to be global or need to be highly available, this is something that is near and dear to their heart. It can heavily impact their users' perception of their application, and it is also something they worry about in terms of uptime. So the fact that no cloud is spared from outages means that, yes, we can confidently say this is an issue if we are on only a single cloud provider, and we have run into issues before where an outage on one has caused problems.
So that's more on the consumer side, of us experiencing the outages ourselves. But also, in June of 2020, we took a look at a report from a CIO think tank. What happened here was they got around 30 IT leaders at a variety of tech companies, and they started to talk about multicloud. What does that look like? How do we actually implement it? Is it worth it to consider a multicloud architecture for the various architectures and different industries that they have? For the most part, these CIOs across the board acknowledged that it's not a matter of if, but a matter of when, they're going to use multicloud, and it comes down to a variety of reasons why they are not there yet. But there are a few quotes here that really set the tone and capture the essence of why I think multicloud is becoming more prevalent and a more approachable solution for many of these companies as they grow. The first one is from Gregory Sherman, VP of business platform technology at Fiserv, an American multinational Fortune 500 company that provides financial services.
What he says is: the main driver is what our clients are asking for. We have banks who have an Azure preference, we have banks who have an AWS preference, Google Cloud, and on and on; we don't really get to choose. And that's incredibly key for multicloud, because there are many companies like Fiserv who may have clients like this. In their case, their clients include banks, credit unions, securities broker-dealers, and leasing and finance companies. Especially in very regulated industries like that, sometimes Fiserv does not have the option; they need to give their clients the option of where to hold their data or where to host their applications. And so that's why multicloud is not just a nice to have, but almost a necessity for this particular company. Next is Mohan Pucha, who's the vice president of architecture and digital strategy at Aon, another multinational company that offers financial risk mitigation products. And what he says is: we have to be native AWS
because of their advanced capabilities in analytics, and we have to be in Azure because, frankly, developers love that ecosystem, and productive developers are probably the best thing. I really like this quote, because it speaks to the fact that multicloud is not only a decision that comes from the top down, but is also coming from the bottom up, from developers who have more autonomy in their decisions. And as developers, we want to use the best tools for the job. So this is incredibly exciting, because most of the reasons multicloud has previously been a difficult thing to implement are being made easier with something like a multicloud cluster, and we'll see why. In a lot of the scenarios I will show very soon, almost on the next slide, the biggest bottleneck is data portability for their applications, and so we'll see how these are actually used. Now, in the real world, we will be seeing how different multicloud solutions are used on the basis of multicloud clusters. We'll start with some very common use cases where multicloud is a very easy thing for clients to choose and migrate towards, and that's data sovereignty and data residency issues. So in
this particular case, Canada has this Direction for Electronic Data Residency. Now, the Government of Canada, every few years or so, writes out a strategic IT plan that is to be dispersed throughout all of the government entities. What they put in this plan are best practices and directives for how they are to implement different IT policies, and in this particular case, one of them is called the Direction for Electronic Data Residency. So what does that mean? It stated that all sensitive electronic data that's under government control needed to be located within the geographic boundaries of Canada. Now, there are a few exceptions; there could be some locations abroad that have been pre-approved by the Government of Canada, like diplomatic or consular missions, but for the most part, those are very rare. Ultimately, all of the data, at least the data that they have classified as sensitive, needs to remain within the geographic boundaries of Canada. A couple of provinces have actually gone a little bit further than that. British Columbia and Nova Scotia pretty much align with the initial strategic plan, but they follow it to a T and apply it to all government data: public schools, healthcare services, and governmental public data all have to remain within Canada, whereas Ontario only applies this to health records and health data. And that brings us to
this to health records and health data. And that brings us to
the first kind of scenario that I want to share with you, which is the
story of an emergency services application and
how they used multicloud to give them higher availability
while still complying with this directive.
So for the most part, most of these companies, including our emergency services application client, were hosting their data in AWS Montreal. And that's not too far-fetched to think of, because last time I checked, the market share pretty much belonged to AWS, for now, at 34%, and most companies are on AWS. In this particular case, they all had to be in AWS Montreal, because that's the only Canadian region AWS offers. So in this scenario,
that's not a great thing, because let's
say this happens, let's say AWS Montreal goes out for one
reason or another, even though it's rare. We have seen before
in the last timeline that it's not as rare as we think.
And when this occurs, and assuming that your application was
not built to fail over properly or did not have the means to,
then your application will also fail. It will also be
out. And for an application like an emergency services
application, outages are not only annoying,
they're almost unacceptable in this case. And we
know that this is a really negative outlook for any company,
right? Any type of application downtime where your users can't
access your product or your services, that always, almost always
translates to either financial loss or some sort of reputation loss,
because nobody likes to deal with an application that's not working.
And what really drove it home for this emergency services application
is that the recent AWS outage of November of 2020 did
not help. It did not make them feel confident at all that they could
rely on this single region on AWS.
So what did they do? Well, they took advantage of another cloud provider that was still within Montreal; the only other cloud provider offering a region there was GCP. So they took advantage of that and set up a failover strategy that way. And because they wanted to be extra fault tolerant, or wanted to have that built in, they also took advantage of two other regions that Azure provided: the Azure Canada Central region, which is based in Toronto, and the Azure Canada East region, which is based in Quebec. And so now, in this type of architecture,
they're feeling very confident that no matter what kind of outages may
occur, they would be okay. They will not have an
outage. They will still be able to access their application.
So again, if AWS Montreal goes out, no problem; GCP can step up and fill the gap there. And in the even more rare but still plausible scenario that the entire region of Montreal goes out, that's when the Azure regions can step up and fill the gap. So that's the Canadian example of staying compliant with this data residency requirement while still taking advantage of the other providers. For this client specifically, because of the type of application they had, they almost had to; they did not want to have outages. And so that was a great use case for multicloud clusters.
Another very similar one is Australia. Australia passed some similar legislation, called the My Health Records Act of 2012, which pretty much states that health records may not be held or taken outside of Australia. So if we take a look at the cloud providers and regions in the Australian landscape, you'll find that most of them are in Sydney: there's a GCP region in Sydney and there's an AWS region in Sydney. But as you'll see, this is still prone to the same problems as before. If you are on either GCP or AWS and you had a regional outage, well, you're out of luck, because those are the only two regions available there in Sydney. To fight against that, Azure, with their Melbourne region, in this case the Australia Southeast region, gave these companies an opportunity to spread their availability across multiple regions, not just across cloud providers. In the particular case of Australia, what Azure actually found was that there was another kind of policy that required some in-country disaster recovery options. And so with that, they added two additional regions in Canberra, Australia Central and Australia Central 2, built specifically for this kind of compliance issue.
And if you were wondering, they also do have one in Sydney, in case
you wanted to collect them all. But again, the point of this is to show
that Australia is a fairly large continent, and if
you wanted to work in any of the other territories and needed to service those
other territories, well, you're going to need to start making use of the
other cloud providers. So those are kind of the low hanging
fruit scenarios of these multicloud
clusters in play. And now I want to talk
about some actual situations where they
use multicloud clusters and where we're seeing some different
use cases of how multi cloud has solved different problems.
This next one is what I call the recommendation feature. In this particular case, we had a client who had some workloads running on AWS. Now, to be transparent, they were already using MongoDB Atlas to host their database and their data, and they were hosted in an AWS region. This application was an internal help desk type of software. As they started to expand and grow, they wanted to add a recommendation feature. And when they spoke to their developers, who were about to implement this, they took some time, did some research, and found: hey, we want to use AutoML, which is a Google service, basically a tool that uses machine learning to reveal the structure and meaning of text. They wanted to use this to tap into not only their production data, the live data and patterns, but also the knowledge base that they had, so that they could recommend potentially relevant knowledge base articles to the help desk technicians using the software. And so now, with this kind of scenario, the big glaring question is: well, how do we get that data over there? We need to get the data over there somehow in order to be able to use this tool. Or at least that was the goal, because that would make it much easier to integrate and use the AutoML tool for this proposed analytics application that was to serve as their recommendation feature. So what do you think were the potential options for this?
Well, to start, one of the common ways this problem has been solved is to write custom code. Custom code works, and it will always work, but it's not necessarily the best option.
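As a rough illustration of what that kind of custom code tends to look like, here's a minimal sketch, assuming a hypothetical setup, that uses MongoDB change streams in Python to copy new documents from a source cluster into another one. The connection strings, database, and collection names are placeholders I made up, and a real script would also need resume tokens, update and delete handling, and monitoring, which is exactly the maintenance burden described next.

```python
from pymongo import MongoClient

# Hypothetical connection strings; real ones would come from config or secrets.
source = MongoClient("mongodb+srv://user:pass@source-cluster.example.net")
target = MongoClient("mongodb+srv://user:pass@analytics-cluster.example.net")

src_coll = source["helpdesk"]["tickets"]
dst_coll = target["helpdesk"]["tickets"]

# Tail the source collection and mirror inserts into the target cluster.
# A production version would persist resume tokens and handle updates,
# deletes, network errors, and restarts -- all extra code to maintain.
with src_coll.watch([{"$match": {"operationType": "insert"}}]) as stream:
    for change in stream:
        dst_coll.insert_one(change["fullDocument"])
```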
As most of us know, if you've ever done any kind of custom scripting like this before, these kinds of things are always unique. They're purpose-built, written in a specific way solely to solve this one problem. And that is a big reason why, yes, it will work, but it's not necessarily the best choice: it's a lot of maintenance to put into something that is super custom, and it's one more thing to maintain. As devs, we know this pain, right? It's something we don't want to do and try to avoid at all costs. And even if we try to automate that pain away by, say, using something like Kafka, which can stream updates from one source to another, in this particular client's case it was just another piece that they did not want to maintain, an additional service that they just did not want to set up. So this was not the option they chose, nor was it the best one for their scenario. The next option, then, is something called backup and restore.
So in this scenario, we would be taking snapshots of their live data on AWS and restoring them over to GCP, or wherever they had the other database hosted at the time, to be analyzed. But again, this was another costly ETL process that they did not want to maintain; it's another separate piece, and too many pieces were something that they just did not want. The bigger issue with this option is that working with the data and the snapshots was usually done in batches, which meant the analysts trying to use this data for the analytics application were always waiting for new data to be uploaded. They were always working with somewhat stale data, and it just was not working for them. So in this particular case, something like a multicloud cluster really was the best option for this scenario.
Now, again, they were already using MongoDB Atlas. This is
part of why this option was much more appealing than the other two.
And by having the same underlying cloud database support
all the different kinds of workloads that they had, it made it a very
easy decision to say, yeah, multicloud cluster makes sense in
our case and what we want to do here. So how did they achieve
this? Well, they did spin up an analytics node
to be able to work with their analytics application,
and they spun that up on GCP.
So what this means is they used a specific node type that is meant for analytics, for complex, long-running operations. And the better part is that this was separate from their production workloads, their operational workloads, the ones running on AWS. So while they can use the GCP node to their heart's content for any kind of analysis and to fuel the AutoML tool behind their analytics application, their AWS workloads remain untouched. They're not competing for resources, and they can stay focused on servicing the production workloads for their application. So this was a very good fit for multicloud clusters in their very specific scenario.
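To make that separation concrete, here's a hedged sketch of how an analytics job might target only the analytics node from Python. The connection string is a placeholder, and the pre-defined replica set tags (nodeType:ANALYTICS, provider:GCP) are how I understand Atlas labels its nodes; double-check the tag names for your cluster before relying on them.

```python
from pymongo import MongoClient

# Placeholder SRV string for the multicloud cluster.
client = MongoClient(
    "mongodb+srv://user:pass@helpdesk-cluster.example.mongodb.net",
    readPreference="secondary",
    # Assumed Atlas pre-defined tags: only match the GCP analytics node.
    readPreferenceTags="nodeType:ANALYTICS,provider:GCP",
)

tickets = client["helpdesk"]["tickets"]

# Long-running aggregation for the recommendation feature runs on the
# GCP analytics node, leaving the AWS operational nodes untouched.
pipeline = [
    {"$group": {"_id": "$category", "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
]
for doc in tickets.aggregate(pipeline):
    print(doc)
```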
Next is another scenario that is, I will say, experimental, and still being worked on. But it is an interesting idea that, if they're able to work out the kinks, would make multicloud a really, really awesome option. So this client had some
workloads, and what they wanted to do was be
able to burst those workloads to whichever cloud
provider had the best pricing at the time. So they wanted to take
advantage of any pricing fluctuations and send
their workloads that way. So, for example, if Azure had the best
pricing for them at that point, they would love to be able to
move that workload over there or send some that way.
And likewise, as things fluctuated, let's say AWS met whatever pricing threshold they had chosen, they could also send some workloads that way, and onward, right? If for whatever reason GCP's pricing also dropped, send some there. So this is kind of the holy grail for this particular client, but there's still a long way to go, and the reason for that is that even though we
try to generalize these workloads, these tasks that we
tried to do, there are still very slight differences
between the cloud providers that still make it fairly difficult
to kind of do this as seamlessly as they
want to. Still, it's already a very large step that the data is already available; that was a big bottleneck for them, and it was removed by using something like a multicloud cluster. Having the data readily available on all three providers makes this a step closer to being a real possibility. But they are still working out the kinks with the slight configuration differences between the providers, and I'll definitely update you if they get this to work.
And finally, there's a last, almost extreme, scenario of cost optimization. This client was a major auto manufacturing company, and they had most of their applications and workloads running primarily on AWS. But they had some conversations, some negotiations in tandem, with AWS and also with Azure. And what resulted from those conversations was that, let's say they got a better pricing deal from Azure; well, they wanted to be able to switch
just like that. Literally just like that. They wanted to
be able to let their engineers know and say, hey,
we were able to negotiate this better pricing deal for the amount
of workloads that we're about to put onto this cloud provider.
And so being able to migrate over that quickly and that
seamlessly was something that was very appealing and something that
multicloud could offer. How would it do this? So again, if they were based on a multicloud cluster, which was something to consider, they could have their original cluster hosted in AWS, and the moment they needed to switch over, they would just have to change the provider; that is, change the highest priority cloud provider in the cluster's configuration settings. You can do this via the Atlas UI, or you can do it from the command line. This would gracefully roll over to the destination cloud provider they wanted, and that would allow them to make these kinds of very quick changes.
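For a sense of what that change looks like outside the UI, here's a hedged Python sketch against the Atlas Admin API. The endpoint path, field names, and priority values follow how Atlas's advanced cluster schema is commonly described (replicationSpecs, regionConfigs, providerName, priority), but treat the exact shapes, region names, and API version as assumptions and check the current Atlas Admin API docs; the project ID, cluster name, and keys are placeholders.

```python
import requests
from requests.auth import HTTPDigestAuth

# Placeholders: real values come from your Atlas project and API keys.
GROUP_ID = "<project-id>"
CLUSTER = "mainCluster"
BASE = "https://cloud.mongodb.com/api/atlas/v2"
auth = HTTPDigestAuth("<public-key>", "<private-key>")

# Assumed shape: Azure takes the highest priority (7) and AWS drops to 6,
# so Atlas gracefully rolls the primary over to Azure.
payload = {
    "replicationSpecs": [{
        "regionConfigs": [
            {"providerName": "AZURE", "regionName": "US_EAST_2", "priority": 7,
             "electableSpecs": {"instanceSize": "M30", "nodeCount": 2}},
            {"providerName": "AWS", "regionName": "US_EAST_1", "priority": 6,
             "electableSpecs": {"instanceSize": "M30", "nodeCount": 1}},
        ]
    }]
}

resp = requests.patch(
    f"{BASE}/groups/{GROUP_ID}/clusters/{CLUSTER}",
    json=payload,
    auth=auth,
    # Pin to a documented API version; this media type is illustrative only.
    headers={"Accept": "application/vnd.atlas.2023-02-01+json",
             "Content-Type": "application/json"},
)
resp.raise_for_status()
print(resp.json().get("stateName"))  # e.g. "UPDATING" while nodes migrate
```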
Now, of course, a caveat: depending on the size of the data they had, and depending on whether their applications were architected in a way that could take advantage of this, where it was truly just the data that needed to be migrated across cloud providers, then yes, this is a possibility. So I just want to preface that and not generalize and say, yes, do a multicloud cluster and all your problems are solved. That's not the case. Your applications need to be architected in a way that can take advantage of it, and if they are, then multicloud clusters can be a very nice supplement to the kind of architecture you're looking for.
So we've spoken about these different client scenarios, and now I want to talk about how this affects developers. As devs, we again want to use the best tools for the job. If you ask anyone in your network, if you have looked around the Internet, if you have read anything about any part of cloud development, you'll know that there are preferences. You might have your own preferences about what to use and from where. Ask devs and they'll say, yes, I only use AWS Lambda because it's the best tool for the job. Or if you want to do anything with machine learning or artificial intelligence, most developers will say, yeah, GCP is the place to go, although Azure is catching up quite quickly. But for the most part GCP has had a stronghold there, and when you think ML and AI, you think GCP's platform. And then there is a very large segment of developers who are on the Azure ecosystem. A lot of European companies are full Azure ecosystems, and in that scenario they want to use Azure DevOps, they want to use Azure Active Directory.
But as we start to evolve and start to use more of
these cloud services, as we start to expand,
we find that sometimes we do want to use some of the other
tools that may be on other cloud providers, especially if
we are on a single cloud provider now or
have been feeling the pains of some vendor lock in.
Maybe the initial choice that was made when you moved to the
cloud was not the best one, and you find that another one would actually
serve you better. There are all kinds of reasons why we would want that flexibility
to be able to use the best tool for the job.
And so something like a multicloud cluster can
help you get there. And the reason we say this is
because for the most part, the biggest bottleneck has
been the data. It's always been how do we move the relevant
data that we need to use the services that we need on
these other cloud providers? And now there's a plausible solution for
that through multicloud clusters. The next thing as developers
that we need to consider is that we're now pretty much
responsible for even higher availability and even lower latency.
So with all the outages that have occurred and as
we expand and have to cover global markets,
there's more of an ask for us to be sure that we
can provide the same experience to all of our customers,
to anyone that's using our application. And so there
are a lot of reasons why we may need to take advantage of other cloud providers' regions, either because there's a region that is only covered by one cloud provider, which is very common, or because you need absolute availability for very specific applications like the emergency services application, where outages are not acceptable and should never occur. And so in this
kind of architecture, having a multicloud cluster
gives us a much wider range and a much larger set
of regions to work with to help us solve these problems
and make sure we're able to deliver with these kinds of
asks for availability. And lastly, as we become more comfortable with this, start to use multicloud clusters, and make it easier for our applications to take advantage of all of the cloud providers if it warrants it, I think the next thing that we as devs will have to worry about is making sure that we're as cloud agnostic as possible. So wherever we happen to put our data, or wherever our clients ask us to put our data, if we're in an industry like Fiserv's, where we don't get to choose, having the ability to serve customers where they are is our ultimate goal.
So now I'll quickly go through the last kind of scenario we've seen where multicloud has also been a great fit, and that's future-proofing, specifically with mergers and acquisitions. Another very common use case is that we have European companies who acquire other companies, and the European companies are mostly based on Azure; they're a full Azure ecosystem. And when they acquire these other companies, most of these acquisitions are on AWS. So what needs to happen now is a cross-cloud migration. How does this work? Well, again, custom scripting is always an option, but we already know about its flaws, so I'll just move on
to the next one. Now, in this particular scenario, in this acquisition, the client was already on MongoDB Atlas, and with that there was the option to use live migration, which is something offered through MongoDB Atlas. What happens here is that they could set up a destination Atlas cluster in Azure, and Atlas would then live migrate from the AWS cluster, the AWS nodes, over to Azure. But the problem, even with this option already in Atlas, is that again, it's a separate process, and it's also a hassle, because the connection string needs to be properly cut over. You would need to bounce the application to make sure that traffic is now going to the correct place. And even this little bit of manual intervention was just not a good thing for them. They did not want this; they wanted it to be more seamless than what was already possible.
So again, in this particular case, a multicloud cluster made sense, because of what they already had: a MongoDB Atlas foundation. All they needed to do was change the cluster settings to not only be a multicloud cluster, but to also change the highest priority region and cloud provider from AWS to Azure. So how does this work? The whole reason they chose this is that they wanted a graceful rollover, and they wanted to make sure that everything properly cut over. Well, by default, MongoDB always has a three-node replica set; that's the minimum, and it's so that we can ensure a reliable election with the bare minimum of three nodes. That means there's one primary and two secondaries. Now, when we begin
this kind of cross-cloud migration through a multicloud cluster, we would first start migrating over the two secondaries, so we'd migrate the ones on AWS over to Azure, and then we would elect a new primary. The previous primary was on AWS, but now we would ensure that it was on the new cloud provider, Azure, and then finally we would migrate over the remaining secondary. What was really great about this was that, of course, it was all automated and taken care of by MongoDB Atlas. But the best part was that you did not have to change the connection string, which was the biggest point of contention in this scenario. So this allowed them to do the cross-cloud migration much more gracefully, and also ensured that once the migration was finished, traffic would properly move over the right way without any downtime.
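On the application side, this is what that looks like; a minimal sketch, assuming a placeholder SRV connection string, showing that the client code never changes. Retryable writes (on by default in recent drivers) are what let a write ride out the brief election while the primary moves from AWS to Azure.

```python
from pymongo import MongoClient

# Placeholder SRV string; it stays exactly the same before, during, and
# after the cross-cloud migration, because the DNS seed list and replica
# set topology are managed by Atlas.
client = MongoClient(
    "mongodb+srv://user:pass@acquired-app.example.mongodb.net/"
    "?retryWrites=true&w=majority"
)

orders = client["store"]["orders"]

# If an election happens mid-migration (the primary moving to Azure),
# the driver retries the write once against the new primary instead of
# surfacing an error to the application.
orders.insert_one({"sku": "ABC-123", "qty": 2})
```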
And that's kind of the buffet of scenarios
where multicloud is being used and is benefiting
specific situations. And so now I will quickly
go through how to create one. But before I do that, the last thing I'll leave you with is another quote from that think tank I mentioned earlier, from Brad Lewis, VP and global lead at Dell Technologies. He basically says: if you want to start to have true portability
of applications, obviously the data has to go
with the application. And so this is why multicloud
clusters have become a step in the right direction when it
comes to even thinking about a multicloud solution,
because the data has been the bottleneck for many of these scenarios
and a lot of these customers. But by using something like a
multicloud cluster, they're able to move closer to
that multicloud strategy and take advantage of it.
And so now I'm going to quickly roll
over and show you how easy it is to set up a cluster.
So if you've ever gone through the MongoDB Atlas UI
and created a cluster, you'll see that this is what you are faced
with, right? You decide which cloud provider and region you
want to use, and that's where your initial cluster would be hosted.
But now, if you wanted to do a multicloud cluster like
I've been talking about, then you would just turn this on.
And what this does is show you a couple more options. You now have the ability to choose electable nodes, which are the only ones that participate in elections. That means they're the ones that can be elected primary for your production or operational workloads. And you also have read-only nodes to choose from.
These are great for, let's say, markets that are really far away from where you are based, where you need to make sure reads for those markets are just as lightning fast as the ones in your local area. Well, you can spin up additional read-only nodes in those markets, have the applications there point to those nodes, and make reads just as fast across any of your service areas.
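As a rough illustration, here's how an app server deployed in one of those distant markets might read locally; a minimal pymongo sketch with a placeholder connection string, using the nearest read preference so reads land on the lowest-latency member (such as a nearby read-only node) while writes still go to the primary.

```python
from pymongo import MongoClient

# Placeholder SRV string; the same cluster serves every market.
client = MongoClient(
    "mongodb+srv://user:pass@global-app.example.mongodb.net",
    readPreference="nearest",  # read from the lowest-latency member,
)                              # e.g. a read-only node in this market

catalog = client["shop"]["catalog"]

# Low-latency local read; writes are unaffected and still route to the
# primary in the highest-priority region.
print(catalog.find_one({"sku": "ABC-123"}))
```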
And finally, there is a third type of node, the analytics node. This is the one the AutoML recommendation feature took advantage of, and the one they chose when they set up an analytics node in GCP. But for now, I'll just show you how to set up a cluster really quick.
So I'm going to choose GCP as my highest priority provider and region, because I am based in Las Vegas and it technically would be the closest one to me. And in this case, I'm going to set up what's called a 2-2-1 node distribution. You'll see here, we always, always want to make sure that we have an odd number of nodes, and that's to ensure reliable elections. If we had an even number, it's possible for an election to be split down the middle, and we would not be able to elect a proper primary. So this is why we ask for an odd number of nodes. I'll add a couple more here; we'll choose the next closest regions to me on AWS and Azure. Remember, we want an odd number, so I'll do a 2-2-1 here. And this kind of node distribution is about the bare minimum of nodes you would need to provide equivalent read and write availability guarantees for your cluster. So this would be the multicloud cluster I'd have.
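If you prefer to see that same 2-2-1 layout as configuration rather than clicks, here's a hedged sketch of the region configs. The field names follow the advanced cluster schema mentioned earlier, and the region names and instance size are placeholders I picked for illustration; the point is the odd total of electable nodes and the priority order.

```python
# Hypothetical 2-2-1 multicloud layout: 2 electable nodes on GCP (highest
# priority), 2 on AWS, 1 on Azure -- five in total, so elections can't tie.
region_configs = [
    {"providerName": "GCP", "regionName": "US_WEST_4", "priority": 7,     # Las Vegas (assumed name)
     "electableSpecs": {"instanceSize": "M10", "nodeCount": 2}},
    {"providerName": "AWS", "regionName": "US_WEST_1", "priority": 6,     # placeholder region
     "electableSpecs": {"instanceSize": "M10", "nodeCount": 2}},
    {"providerName": "AZURE", "regionName": "US_WEST", "priority": 5,     # placeholder region
     "electableSpecs": {"instanceSize": "M10", "nodeCount": 1}},
]

cluster_spec = {
    "name": "multicloud-demo",
    "clusterType": "REPLICASET",
    "replicationSpecs": [{"regionConfigs": region_configs}],
}

# Sanity-check the odd-number rule before sending this to Atlas.
total = sum(rc["electableSpecs"]["nodeCount"] for rc in region_configs)
assert total % 2 == 1, "electable nodes should be an odd number"
```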
I'd have the highest priority in Las Vegas on GCP, with AWS and Azure as my secondaries. And because I don't want to make you wait and watch this cluster being generated, I've already done that, so this is what it would look like. You'll see that you have a preferred region in GCP, just as I asked, with some secondaries in the AWS north region and the Azure California region. And if
you remember, in a couple of the scenarios I said all they had to do was change the provider; well, it really is just like that. You can either do this through the CLI, or, if you need to change the cloud provider, you would just go over here and set the highest row to be your new highest priority cloud provider. So if I wanted to do what I just did here, which is set Azure to be my highest priority cloud provider and migrate some nodes over to Azure, this is all I would have to change. And let's go back here.
And that's really how quick it is to set up. Obviously it'll take a little more time for the nodes to be deployed and all of that, but in terms of setting it up, that is all you would have to configure if you wanted to set up a multicloud cluster. This is also available if you already have an existing MongoDB cluster that doesn't yet have a multicloud designation; you would change it in the same fashion, and it is eligible to be changed into a multicloud cluster in that way.
And that's it. So, salamat; that means thank you in Tagalog. Thank you for taking the time to listen to me talk about multicloud clusters. If you have any questions, please feel free to find me either on Twitter or in the chat. I will be here and I'll do my best to answer your questions. Thanks so much, and I hope you enjoy the rest of Conf42.