Transcript
This transcript was autogenerated. To make changes, submit a PR.
Welcome to From SQL to NoSQL: A Gentle Introduction
for New Developers. My name is Joe Carlson,
and let's get started.
So did you know
that the two main bottlenecks for
any web application, generally speaking,
tend to be either network speed or database
execution speed? But there's only one of
these things that we can actually control, with an asterisk.
Network speed, obviously, is limited by
the speed of light, and for some reason human beings have not figured out
how to break the speed of light yet. Something about Einstein and c,
right? We can't do it. We can move our servers
around to make that distance shorter, so the light has
less distance to travel. But the easiest thing that we can control
is actually our database performance, which brings
us to our subject today. So, the more
you know: database and network speed are very important.
My name is Joe Carlson. I'm a developer advocate working
for a small little company called MongoDB. Did I say I'm a developer
advocate and software engineer? Well, I am, and here are my
socials. Important note before we get started:
if I say anything controversial here today, just know it's my opinion, not my
employer's. And second of all, if you want to follow along
with any of the resources, including code snippets,
links,
video of this talk, the slide deck, any of that stuff, you can either scan
that QR code in the upper right-hand corner, or you
can use that bit.ly link there to get all the information.
So let's get started here.
The first thing I want to do is start from a SQL background,
right? I think a lot of people have
SQL experience and are looking to evaluate
NoSQL as a potential database. And the hard
part about that, in particular with MongoDB, is mapping the
terms and concepts. A lot of them are actually more similar than you'd think,
but we have different names for them, and there are reasons why we
do that, which we'll get into. The second thing I want to talk about today
is the four major reasons why you'd
want to select a NoSQL database like MongoDB over
a traditional RDBMS, SQL-type database.
Let's just jump in, huh? So first things first:
MongoDB saves data in documents. Now,
if you're totally new to programming, you might think that "documents" refers
to something like a Word document, which it doesn't. In this case,
we're actually talking about JSON-like documents. It's actually its own thing,
BSON. It's "binary" instead of "JavaScript",
and it has a couple of differences. This BSON notation
might look a lot like an object, a dictionary, or a
hashmap you've seen in other programming languages. And that's intentional, right?
It's a data structure that we as programmers are used to using.
So just like an object, dictionary, or hashmap,
you access data in a document via key-value pairs:
you use a key to go get that data.
You can save things like strings. And this is where the binary part comes in, right?
With JSON, every value is saved as
text, and it's up to whatever language is
parsing it to figure out what the thing actually is. BSON is more specific.
With BSON, we can save specific data types,
like strings here, or integers,
or this array of tuples. It's an
array of numbers, and it's actually GeoJSON data, so you can save
latitude and longitude data as its own specific BSON
data type. You can save arrays of data,
nested arrays, or objects in any structure you want.
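As a rough sketch of what a document like that looks like from a programmer's point of view, here is a plain Python dictionary standing in for a user document. The field names and values are made up for illustration and are not taken from the slides.

```python
import json

# Hypothetical user document; every field name and value here is illustrative.
user = {
    "first_name": "Ada",
    "surname": "Example",
    "cell": "555-0100",                      # string
    "age": 37,                               # integer
    "location": [45.123, 47.232],            # array of numbers (coordinates)
    "professions": ["engineer", "teacher"],  # array of strings
    "cars": [                                # array of nested objects
        {"model": "Bentley", "year": 1973},
        {"model": "Rolls Royce", "year": 1965},
    ],
}

# Values are reached by key, as with any dictionary or hashmap.
first_car_model = user["cars"][0]["model"]
profession_count = len(user["professions"])

# Round-tripping through JSON text keeps the nested shape intact.
restored = json.loads(json.dumps(user))
print(first_car_model, profession_count, restored == user)
```

The point is only that the stored shape and the in-memory shape are the same nested structure, so no translation layer is needed to read a field.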
So you can be very expressive about how you want to actually model that data
in a document. So that's
what a document looks like. But let's look at a traditional SQL database.
How would we model the same data using a SQL database,
traditionally, when we're using some normalization, like third normal form?
So the first thing you'd want to do, obviously: this looks like it's some
sort of user data, so you'd probably want to start with a
users table. But you'll see we actually can't
save all the data in there. And the reason for that is that,
traditionally, with a SQL database you want to normalize, so you're not repeating data.
So for something like the professions, you'd want
to split them off into a separate table and link that data via foreign
keys, as you can see here, because you can have any number
of additional professions that you want to store. So we want to link those
off. What about that cars table? Right.
We'd probably want to do the same thing. The only difference here is we're saving
more complex data, so our structure in SQL is going to
have more columns. I always have to think about rows and columns.
I get them confused in my head, but we're going to have those. So that's
how we'd traditionally do this, right? We'd normalize this data,
splitting it up into separate tables and joining them
via foreign keys.
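The normalized layout described above can be sketched with Python's built-in sqlite3 module. The table and column names are assumptions chosen to mirror the users, professions, and cars split from the talk, and SQLite is only a convenient stand-in for any relational database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

# One table per kind of thing, with child tables pointing back at users.
conn.executescript("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        first_name TEXT NOT NULL,
        surname TEXT NOT NULL
    );
    CREATE TABLE professions (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        profession TEXT NOT NULL
    );
    CREATE TABLE cars (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        model TEXT NOT NULL,
        year INTEGER NOT NULL
    );
""")

conn.execute("INSERT INTO users VALUES (1, 'Ada', 'Example')")
conn.executemany(
    "INSERT INTO professions (user_id, profession) VALUES (?, ?)",
    [(1, "engineer"), (1, "teacher")],
)
conn.executemany(
    "INSERT INTO cars (user_id, model, year) VALUES (?, ?, ?)",
    [(1, "Bentley", 1973), (1, "Rolls Royce", 1965)],
)

# Reassembling the user's professions requires a join across the foreign key.
rows = conn.execute(
    """
    SELECT u.first_name, p.profession
    FROM users AS u
    JOIN professions AS p ON p.user_id = u.id
    ORDER BY p.id
    """
).fetchall()
car_count = conn.execute("SELECT COUNT(*) FROM cars WHERE user_id = 1").fetchone()[0]
print(rows, car_count)
```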
Nothing too fancy. Okay, so the first little
bit of vocab here. You know what?
Let me skip over this right now.
Did you know that, well,
I think a common misconception about MongoDB is that it's a schemaless database.
This is in fact not true. MongoDB is actually a
flexible-schema database.
Whoops, hit the B button there. So what
this means, basically, is that at the database level you can actually enforce
schemas for your data, right?
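To make database-level schema enforcement a bit more concrete, here is a sketch of a validator document written as a Python dictionary. The `$jsonSchema` operator and the `bsonType` keyword are MongoDB's; the collection name and fields are hypothetical, and nothing here talks to a server.

```python
# Hypothetical validator for a "users" collection; field names are illustrative.
user_validator = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["first_name", "surname"],  # only these two are mandatory
        "properties": {
            "first_name": {"bsonType": "string"},
            "surname": {"bsonType": "string"},
            "age": {"bsonType": "int", "minimum": 0},
            "professions": {
                "bsonType": "array",
                "items": {"bsonType": "string"},
            },
        },
        # Fields not listed here are still accepted unless explicitly forbidden,
        # which is where the "flexible" part comes from.
    }
}

# With PyMongo and a running server (both assumed, not shown here), a validator
# like this is typically supplied when the collection is created:
#   db.create_collection("users", validator=user_validator)
required_fields = user_validator["$jsonSchema"]["required"]
print(required_fields)
```

How strict to be is a design choice: listing only a couple of required fields keeps most of the flexibility, while describing every field gets closer to a relational table definition.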
You can be as flexible or inflexible as you possibly
want. With a lot of traditional RDBMSs,
you have to use schema validation for every column.
And you can totally do something similar by defining a schema for every
key-value pair, but you can
be as flexible with it as you want. Okay, let's go through some more
vocab here.
So a document in MongoDB would be analogous
to a row, or rows of data joined by foreign keys, in a traditional RDBMS.
Fields would be similar to columns, right? A collection is
the same as a table. And, oh yeah,
in MongoDB there's no concept
of a table. So stop me if you've heard this one before. It's an old
programmer's joke.
A DBA walks into a NoSQL
bar, but immediately leaves because he couldn't find a table.
Pause for virtual groans. Thank
you very much. Thank you so much. You guys are all great.
Okay, so we have a lot of different terms. Let's get to the one that's actually
the same: databases. Databases are the same for both NoSQL
and SQL databases. Did you know that NoSQL
databases like MongoDB also support indexing?
They use the same B-tree structure for indexing data. You can
do it, and you can actually do more complex things with nested data structures.
With MongoDB, you can embed
data, and that'd be analogous to joining, right? When we split up that data in
our SQL database, we had to put it in
a separate table linked by a foreign key. Here, we can just keep all the data we need together.
There's no reason to have to go somewhere else to go get it, right?
Because we keep it right there, which, as we're going to talk about, has some massive
performance gains. Also,
did you know that MongoDB actually supports multi-document
ACID transactions? Who knew? It's true.
Yes, MongoDB actually supports multi-document ACID transactions.
So you can do an update, a delete, a get, a query,
another update, another delete, and make all of that a single ACID transaction.
If any of those fail, roll it all back, just like you would with
a traditional RDBMS or legacy system,
right? It's all there, that power.
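A multi-document transaction of that kind might look roughly like the sketch below. It assumes `client` is a PyMongo `MongoClient` connected to a replica set or sharded cluster (transactions are not available on a standalone server); the database name, collection name, and the `move_profession` helper are hypothetical.

```python
def move_profession(client, from_id, to_id, profession):
    """Move one profession between two user documents, all or nothing.

    `client` is assumed to be a pymongo.MongoClient; "app" and "users" are
    made-up database and collection names.
    """
    users = client["app"]["users"]
    with client.start_session() as session:
        # Leaving this block normally commits; an exception aborts both writes.
        with session.start_transaction():
            users.update_one(
                {"_id": from_id},
                {"$pull": {"professions": profession}},
                session=session,
            )
            users.update_one(
                {"_id": to_id},
                {"$push": {"professions": profession}},
                session=session,
            )
```

Passing `session=session` to each operation is what ties the two writes to the same transaction; an operation called without it would run outside the transaction.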
The difference, though, is you can be as flexible as you want with it.
You can enforce that or not, as much as you want,
which we'll talk about at the end. But let's wrap up here.
We're halfway through, so we want to get to the four huge reasons why
you might want to consider a NoSQL database like MongoDB
for your next project. Those reasons are: it allows you to program faster,
it allows you to pivot more easily, it allows you to query your data faster,
and it allows you to scale more cheaply. So let's dig into that first one,
programming faster. Since MongoDB
uses a data structure analogous to objects,
dictionaries, or hashmaps, it maps easily onto most major
programming languages. So let's say, hypothetically, you have a
project manager or boss who says, "Hey, we need you to go ahead and write
CRUD operations for our profile page. We need to update,
maybe delete, the user. We need you to do all that." Right? So let's
go ahead and do that. We're going to model our data like
we had before. It's just going to be an object with that data in MongoDB,
and on the right side
(I get those confused too) we're going to have our users and professions tables joined
by a foreign key. So let's see what it looks like to actually update this
data. The first thing we need to do, and we have to do
this with both of these, is actually connect to our database.
Our database is actually a service running off on
a separate server that we have to connect to and authenticate with. It's
totally the same, right? No problem. Here's where we
start seeing some differences. The first thing we want to do is actually
find that user in our database. So what we're going to do is
just do a query. We tell MongoDB which database we're in and
which collection we're in, and we tell it what to find. So we have
our user ID and say "find user number XYZ",
and boom, we're done, right? That document comes back as an
object in memory that we can start using. Okay, what about
in SQL? In SQL,
we need to actually go make the SQL query. And when we
get that data back, though, the difference is this thing called object-relational impedance
mismatch. Basically, what that means is there's no built-in
data type called "rows and columns" in any programming language, so we have to
convert that data structure into something we can actually use in memory.
So we have to map that into an object that we can use,
which we don't have to do with MongoDB because it's already a document. We're done.
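The row-to-object mapping on the SQL side can be sketched like this, using Python's built-in sqlite3 as a stand-in for a relational database. The schema, the sample data, and the `load_user` helper are assumptions for illustration; the second query for professions is the step the talk turns to next.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, surname TEXT);
    CREATE TABLE professions (user_id INTEGER REFERENCES users(id), profession TEXT);
    INSERT INTO users VALUES (1, 'Ada', 'Example');
    INSERT INTO professions VALUES (1, 'engineer'), (1, 'teacher');
""")

def load_user(conn, user_id):
    """Run two queries and hand-map the rows into one nested object."""
    row = conn.execute(
        "SELECT id, first_name, surname FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    if row is None:
        return None
    # Rows are flat tuples, so each column is copied into the object by hand.
    user = {"id": row[0], "first_name": row[1], "surname": row[2]}
    profession_rows = conn.execute(
        "SELECT profession FROM professions WHERE user_id = ? ORDER BY rowid",
        (user_id,),
    ).fetchall()
    user["professions"] = [p[0] for p in profession_rows]
    return user

user = load_user(conn, 1)
print(user)

# For comparison, with PyMongo (assumed, and needing a running server) the
# stored document already has this shape, so the lookup is roughly:
#   user = client["app"]["users"].find_one({"_id": 1})
```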
Next thing we have to do, since the data is in two separate tables,
is go make another query, pull that in, and append those results to
our object. Oh, right,
and that's just querying the data. So what if we want to make an update
to it? With MongoDB, we have our object. Since we're
using a document, we're probably going to be using an object anyway, because that makes
sense for holding this data structure in memory.
And we do the same thing with our SQL.
Then we want to make an update on that user, and all you do
is pass that object in memory that we created to the update, and boom,
we're done. That's it.
Because we're using that object in memory,
MongoDB maps it directly for us. We can save the data the way
we are thinking about it and using it as developers. But we
have to translate that data when we're using SQL,
right, because of that object-relational impedance mismatch. So we
have to actually convert that data into a
data structure that can be understood by the database.
And we have to do that for both the users table and the
professions table, and we have to make those updates to each table.
That's a ton of work.
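Here is a small sketch of that extra work on the SQL side: one in-memory object has to be split back into rows for two tables. It uses sqlite3 as a stand-in, and the schema, data, and `save_user` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, surname TEXT);
    CREATE TABLE professions (user_id INTEGER REFERENCES users(id), profession TEXT);
    INSERT INTO users VALUES (1, 'Ada', 'Example');
    INSERT INTO professions VALUES (1, 'engineer'), (1, 'teacher');
""")

def save_user(conn, user):
    """Split one in-memory object back into rows for two tables."""
    with conn:  # commit all statements together, or roll them back on error
        conn.execute(
            "UPDATE users SET first_name = ?, surname = ? WHERE id = ?",
            (user["first_name"], user["surname"], user["id"]),
        )
        # Simplest way to sync the child rows: delete them and re-insert.
        conn.execute("DELETE FROM professions WHERE user_id = ?", (user["id"],))
        conn.executemany(
            "INSERT INTO professions (user_id, profession) VALUES (?, ?)",
            [(user["id"], p) for p in user["professions"]],
        )

edited = {
    "id": 1,
    "first_name": "Ada",
    "surname": "Lovelace",
    "professions": ["engineer", "author"],
}
save_user(conn, edited)

saved_surname = conn.execute("SELECT surname FROM users WHERE id = 1").fetchone()[0]
saved_professions = [
    r[0]
    for r in conn.execute(
        "SELECT profession FROM professions WHERE user_id = 1 ORDER BY rowid"
    )
]
print(saved_surname, saved_professions)

# With PyMongo (assumed, needs a running server), the whole object can be
# written back in one call, roughly:
#   client["app"]["users"].replace_one({"_id": 1}, edited_document)
```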
So how much faster does that make us
go? In MongoDB, querying and updating
a user takes 30 lines of code; with SQL it takes 73
lines of code to do the exact same thing.
All right, pivoting more easily. Remember what I said about schema validation, right?
Flexible schemas. This is really helpful, especially for prototyping
or as your app changes. If you want to make changes to the schema
of a document, it's super easy. No problem. With SQL,
because you're enforcing data types for each column, it becomes
more challenging. You have to make sure you're updating all the old data along with
it. And again, you can totally do this with MongoDB. So for
example, I just built an IoT device, and I'm saving some time series data
from the IoT sensors. And I added a new sensor to this device.
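A flexible schema in a case like this IoT one can be sketched with plain Python dictionaries standing in for documents. The field names, timestamps, and readings are invented for illustration.

```python
# Hypothetical sensor readings; every field name and value is illustrative.
readings = [
    {"ts": "2021-03-01T10:00:00Z", "temperature_c": 21.5},
    {"ts": "2021-03-01T10:05:00Z", "temperature_c": 21.7},
    # Once a new sensor exists, newer documents simply carry an extra field.
    {"ts": "2021-03-01T10:10:00Z", "temperature_c": 21.6, "humidity_pct": 40.2},
]

def humidity_series(docs):
    """Return (timestamp, humidity) pairs, skipping documents without the field."""
    return [(d["ts"], d["humidity_pct"]) for d in docs if "humidity_pct" in d]

print(humidity_series(readings))
```

Older documents are left untouched; the reading code just has to tolerate the field being absent.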
I was able to start adding that new sensor data to my time series
data instantly, without a problem. I didn't care about the old data anyway, because I
was only showing a week's worth of data on my charts. So this was a
perfect use case for having a flexible schema in
my program. And
performance, right? That's what we're all here for. You may not
know this, but doing joins is extremely expensive, in both
time and space, for your database. What's happening is it's
making a query to all those tables, pulling them all to a central point in
memory, and then doing a query on that data in memory in your
database. This is blocking, it's slow, and it
does not scale, right? Especially if you're doing these same things over and over again.
It's going to have to keep pulling this in, keep querying this data, keep saving
it. It becomes a massive blocker for applications at
scale. With MongoDB, though, if you can keep all that
data in a single document, it takes a single query to go get all the
data you need and send it back. Nothing has to be pulled into memory,
queried, and returned, right? All the data we need is in
one place. That's massively performant for applications.
And
scalability, right? So as your server
grows, it can grow horizontally with MongoDB, which I think calls for
a little bit of explanation, right? So, vertical
versus horizontal scaling. Let's say you have a SQL database
and you run out of space on your server. What do you do?
Well, you have to go ahead and buy a bigger hard drive,
you have to pause sending data to that old server, and you need to start
sending all that new data to the new server. You could have downtime.
It takes a while. It's dangerous. And you probably know this
just from experience, but buying humongous hard drives
quickly becomes more expensive, right? It gets exponentially more expensive.
It starts cheap and then explodes. So you're vertically scaling
because you have to buy a bigger server and transfer all that
data over to it. MongoDB scales horizontally.
That means if you run out of space on a MongoDB server, all you have to
do is buy another server, and MongoDB will
automatically start (it's called sharding) sending data to the other servers
right away. There's no downtime, right? You can do some cool
stuff with geolocation data and putting data close to users. There are some really
interesting sharding strategies you can use. I will also make a note, too:
a lot of document databases do not scale horizontally.
If you're building a huge application, make sure your database
can actually do horizontal scaling. All right,
that's all four, I think. One, two, three, four, yes. Awesome.
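To give a feel for the sharding idea from the scaling section, here is a deliberately simplified toy model of hashed distribution. It is not MongoDB's actual algorithm, which splits a shard-key range into chunks and rebalances them across shards; the `shard_for` helper and the key format are made up.

```python
import hashlib

def shard_for(key: str, shard_count: int) -> int:
    """Map a shard-key value to a shard index using a stable hash (toy model)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

# Spread 1000 hypothetical user IDs over three servers.
shards = [[] for _ in range(3)]
for user_id in (f"user-{n}" for n in range(1000)):
    shards[shard_for(user_id, len(shards))].append(user_id)

sizes = [len(s) for s in shards]
print(sizes)
```

A plain modulo scheme like this would move most keys whenever a server is added, which is one reason real systems manage movable chunks instead of hashing straight to a server.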
So what is next for you? If I've inspired
you at all, this is just a tiny taste of the stuff
here. I wish I could have talked about the four biggest mistakes
I see developers make when they come from a SQL
background to NoSQL, but we don't have time today.
But if I've inspired you at all and you want to learn more about
NoSQL databases, I highly recommend checking
out university.mongodb.com. It's a totally free and open training
resource developed by MongoDB, and you should totally check out the M001
course. It's incredible and an amazing place to start if you're a brand new developer
or a brand new MongoDB developer, and you can go as far as you want with
those courses. They're amazing. The other thing I'd recommend
is to just get out there and do it, right?
If you have a home project and you're building something,
just swap it out and use MongoDB as the database
layer, and then Google and try to figure it out. Come to our community at
community.mongodb.com, or our DevHub, and try to figure out how to build the application.
For me, I don't expect you to have learned something from having come to
this talk. I expect you to learn something after doing it on your own,
and I encourage you to go home and actually try it out.
And in fact, if you want to get $100 in free Atlas credits
(Atlas is our hosted option in the cloud, which is so
easy, and there's a free tier forever),
you should use code Joe K 100 and
sign up. It's incredible. Seriously, I haven't set up a server-side
MongoDB instance in forever because Atlas is so easy and convenient
to use. You should totally check it out, and here are some additional resources
for you. But lastly, I just want to say thank you so much for having
me here. I really appreciate it. You are the best. I feel humbled and lucky
to be a part of this. And yeah,
I guess I hope I see you soon. If you enjoyed this at all,
I would love for you to hang out with me in the future.
Here are my socials. You should totally come hang out with me. The best place
to get a hold of me is on Twitter. But that's
it. Again, thank you so much. You're the best.