Transcript
This transcript was autogenerated. To make changes, submit a PR.
Are you an SRE,
a developer,
or a quality engineer who wants to tackle the challenge of improving
reliability in your DevOps? You can enable your DevOps
for reliability with ChaosNative.
Create your free account at ChaosNative Litmus Cloud.
Hello and welcome to today's session here at Conf42.
My name is Allen Vailliencourt. I'm a sales engineer with Teleport.
Today we're going to talk about keys or certificates for SSH
access. And why should I care, especially if you're coming from the
SRE world? So let's jump right into it.
So this is probably what you're used to seeing if you're
doing any kind of Linux systems administration work
along that line. This is probably a very familiar screen to you.
You fire up your local terminal and you see
a bunch of public and private keys, or maybe even this when
you're accessing remote resources, right? You're using SSH,
what's your identity file, what user? And then
you're logging in and just kind of moving on. This is how we've
been accessing resources for years, and it's still very
popular. It's not going away anytime soon. So continuing
on this, what about your servers? How long
is your authorized_keys list? Have you taken time to go look
at one of your production servers and do a cat of that authorized_keys
list? And you might be surprised and say, whoa,
there's a bunch of entries on here. So the
big question is, do you even know which of those keys
are valid? So let's look at public key authentication.
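As a rough sketch of that kind of audit, the commands below build a stand-in authorized_keys file and then list and count its entries. The temporary directory, the two throwaway keys, and their comments (alice@laptop, bob@old-job) are assumptions made up for illustration; on a real server you would point ssh-keygen -l at the actual authorized_keys file instead.

```shell
set -eu
workdir="$(mktemp -d)"

# Stand-in for a server's ~/.ssh/authorized_keys: two throwaway keys
# with hypothetical comments.
ssh-keygen -q -t ed25519 -N "" -C "alice@laptop" -f "$workdir/alice"
ssh-keygen -q -t ed25519 -N "" -C "bob@old-job" -f "$workdir/bob"
cat "$workdir/alice.pub" "$workdir/bob.pub" > "$workdir/authorized_keys"

# One line per key: size, fingerprint, comment, type.
ssh-keygen -l -f "$workdir/authorized_keys"

# Count the entries so growth is easy to spot.
key_count="$(ssh-keygen -l -f "$workdir/authorized_keys" | wc -l | tr -d ' ')"
echo "authorized_keys entries: $key_count"
```

The comment column is often the only hint about who owns a key, which is part of why it is hard to tell which entries are still valid.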
Let's look at a few of the pros around it. Why do
we have it today? We're not going to go into a deep history of it,
but just a high level. For years we've been using PKA for accessing
resources. It's not going away anytime soon. I mean
today to gain access to AWS, GCP, probably your routers,
your switches, your Linux servers, wherever they might live,
you're leveraging, more likely than not, something along the lines
of public key infrastructure with OpenSSH.
The original intent behind it was better security.
So with a good PKA system in place,
users are not having to worry about Post-it notes and passwords.
They're not having to worry about long, complicated passwords.
Back in the day, for us old salts, you probably
remember something like rhosts, right? And accessing
remote resources that way. It's come a long way. Having
passwords is one way, and then when public and private keys started
coming out, it just simplified things, especially from a systems administration
standpoint. And with PKA, you can even automate
your processes today. If you're using CI/CD,
Jenkins, Bitbucket,
Bamboo, Ansible, CircleCI, whatever,
Terraform, GitHub, you can leverage public
and private keys and that kind of thought process to
access those resources. In fact,
most of today's modern services like GitHub or Bitbucket
or GitLab do not necessarily recommend using passwords
to authenticate in order to push your code up. They'd rather you generate a
public/private key pair in order to do that, and they have full support
for that. So keys are super easy to create and deploy.
We've been doing it for years: ssh-keygen
-t and then ed25519.
That's the cryptography standard that I use.
And you might use RSA or one of the others, and that's fine.
This is not the webinar for talking about that, but generating
a key is really easy to do. So that
brings us in line, right? It's difficult to find
a system that does not support public key authentication today.
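For reference, key generation along the lines described might look like the sketch below. The temporary directory, the file name, and the comment are assumptions for illustration, and the empty passphrase (-N "") is only there so the example runs unattended.

```shell
set -eu
workdir="$(mktemp -d)"

# -t picks the key type, -f the output file, -C a comment,
# -N "" an empty passphrase (fine for a demo, not for real keys).
ssh-keygen -q -t ed25519 -f "$workdir/id_ed25519" -C "demo@example.com" -N ""

# Two files result: the private key and the .pub public key.
ls -l "$workdir"
key_type="$(cut -d ' ' -f 1 "$workdir/id_ed25519.pub")"
echo "public key type: $key_type"
```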
So the question is, right, keys are superior. Change my
mind. That's kind of what we're thinking many times, right? Why change
from keys to certificates for authentication? What's the reasoning behind this?
If the system isn't broke, why change
it up? Well, we're going to talk a little bit about that and then we'll
jump into a high-level demo of how
you can get started today. So a few cons
that exist around PKA come
along this line, right? What happens when your user moves on
to a new role? Maybe they move on to a new job, change departments.
Maybe they were part of your SRE team, sysadmin team,
security team. But then they went to development or product marketing,
and they had access to all these resources through public/private keys,
but now they're no longer there. So what did you do?
Or have you done anything? What about when a device is
potentially compromised or stolen? Laptops, for example. With
many of us working remotely and distributed, laptops
are powerful enough that most developers, most people in this world,
carry one, because we're so remote and traveling, things like that.
And if it doesn't have local encryption turned
on and it gets compromised or stolen, what then?
So this interesting article from ssh.com talks a
little bit about the spread and the growth of
keys out there. Just in this piece,
talking about one customer in the financial sector: 3 million
keys, 750,000 distinct key pairs from
15,000 servers. And for large environments, that's probably
on par for the norm. Maybe in your environment you could start calculating it out.
You're probably like, you know, we've got quite a few. And as an example
you think, well, maybe we're just small. There's only 10, 20, 30 of us.
We only have like 50 servers and there's not that much. Well,
do the math and you realize you have 50 servers
times 30 developers, that's what, 1,500 key entries that are
now out there that you're having to manage and rotate or
not rotate for accessing your resources. So continue
on. So what about when someone accidentally commits a private
key to their public repo, right? Within minutes this can
be utilized to log into a service and cause chaos.
Story time, ladies and gentlemen.
Years ago at a place I worked at,
we had a gentleman who was one of our new DevOps engineers,
SREs. And this person, coming from a traditional world,
wasn't used to using Git and GitHub and committing
stuff out there. So they had their private key for AWS
and they accidentally committed it to their repo. And that repo
wasn't set to private. It was a public repo out there for testing. Well,
guess what happened? As you can imagine already bots were able to scan
that within minutes. And within hours we had over 200
servers being spun up on AWS data centers all
over the world. So we got the email alerts from AWS.
So immediately a bunch of us jumped on. We started shutting down, deleting all
these servers. We nuked that key so that it could not
be reused again. And we had a good post mortem about having a good
.gitignore file and, on top of that, not committing your private keys.
But this happens. And today in some of these services they actually
have scanning tools. If you try it today, I think within
GitHub and other places you'll probably get an email really quickly
because their system will scan it and say, hey, it looks like you have a
secret or a key or something public. So you might want to check on that.
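A very small local version of that kind of scan can be sketched as follows. The directory layout and file names are hypothetical, and the pattern only catches PEM and OpenSSH private key headers; it would not find something like a cloud provider access key, which needs a dedicated secret scanner.

```shell
set -eu
workdir="$(mktemp -d)"
mkdir -p "$workdir/repo/deploy"

# Simulate the mistake: a private key sitting inside the project tree.
ssh-keygen -q -t ed25519 -N "" -C "demo" -f "$workdir/repo/deploy/id_ed25519"

# List every file that contains a private key header.
found="$(grep -rlE -e '-----BEGIN [A-Z ]*PRIVATE KEY-----' "$workdir/repo" || true)"
echo "files containing private keys:"
echo "$found"

# Ignore patterns that keep the usual key file names out of commits.
cat > "$workdir/repo/.gitignore" <<'EOF'
id_rsa
id_ed25519
*.pem
EOF
```

Running a check like this before a commit is cheap, but it complements the hosted scanning mentioned above rather than replacing it.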
But still, the onus is on you.
The responsibility sits on you as an SRE to
handle this and manage this. The other big thing, right? Scaling out deployments
can be a wee bit challenging, right?
Businesses are using homegrown methods to rotate keys or
maybe commercial or open source vaults to manage this. I talk
to a lot of customers on a weekly basis who have
grown, and it was fine when they were small, but now they're hiring, as
a lot of sectors are hiring a lot, and they're rapidly
scaling out their virtual infrastructure. And on
top of that they're having issues like, what do we do now,
right, how do we manage this? Maybe we had a cron job or
a bash script or an Ansible playbook. And then it gets really
complicated. At a place I used to work, another job,
we only had eight or nine developers and I had like 80 or 90 servers.
So I wrote an Ansible playbook. I would get all their public keys,
I'd message them on Slack or email, they'd send them to me, and then I'd
run my playbook and update all my servers. Whether or not they needed
access to it, they were devs and it was just easier to
put their public key on all the servers and go from there.
But when that person moved on or that project moved on,
I, with everything else going on, did not necessarily have a lot of
time to go back and clean up those keys, and that definitely posed some
challenges. You know another big thing, right? Keys don't expire.
Unlike other things out there, keys typically are not
going to expire. So that brings in the trouble
that if a key is out there, it could be out there a couple of years later
and still be leveraged. It's still valid and it can still work.
So Hackerman over there, give me some keys, he's going
to have a good time with it. Just one of those cons to think
about. So let's segue a little bit to certificates.
Did you know that OpenSSH, which is pretty
much the de facto standard for SSH certificates,
or SSH, period, added support
back in version 5.4? And this is actually
from the release notes, which talk about SSH certificates,
what they're made of, and how you would generate them. So here's the
big kicker, I think. Look there at the bottom. It was released in 2010,
eleven years ago. So let's let that sink in. We've
had the ability to use SSH certificates instead
of public/private keys, or in lieu of them, or in conjunction with them, for
eleven years. So now you're thinking, I'm intrigued.
Or maybe you're like, why am I just now hearing about
certificate authentication, with all the problems? And I have that
in quotes because they're not necessarily problems; they're things that I think in the SRE
world we deal with a lot and it's just kind of par for
the course and we just move on. So certificates,
we do use them all the time. You've been using them for years,
thanks to Google and Let's Encrypt and other
companies that have made HTTPS a web standard today.
And guess what? Five, six, seven years ago, a majority of the
sites you hit out there probably were not HTTPS enabled. It was
only those e-commerce sites on that final cart checkout when you
put in your credit card information. But now, if you hit most
websites across the web, you're going to get HTTPS.
In fact, most of the time, if you hit a site that does
not have that, you'll get a warning from Chrome or one of the other modern
browsers. So the industry has kind of migrated to
using HTTPS as a de facto standard, which is
all certificate-based authentication or authorization
and verification on the web. So from
an SSH perspective, we're not using HTTPS certificates, not using SSL
certificates; we're using SSH certificates. So there is a little bit of a learning curve,
as you'll see as we kind of dive into this. But I
believe once you get over that, you'll realize that
it does pay off in the long run. Companies today are already using certificate
authentication. Netflix open sourced
BLESS, which
uses certificates. Lyft has their fork of it, and there's a number
of them out there. And these companies have the developer staff, right?
We're thankful for Netflix and Lyft, which open sourced some of their big projects
that become standardized across the board, but they
have large engineering teams, large development teams, and they're software
companies. Whereas your organization might not have that
expertise in house to build and run or develop something
like that. There's also a general lack of understanding and knowledge
around certificate authentication. And traditionally there's
also been a lack of good tooling around provisioning,
storing, auditing, and rotating certificates.
So when you wrap all those together, you're like, that's too
complicated. I'm just going to stick with what I know, which is keys,
and just kind of move on from there. So let's look at a few of
what I would call pros of using something like SSH
certificates. We have a usability improvement,
and part of that is this message that we get
when we're SSHing into a system that we're not familiar with.
Right? We log in and you
get this warning and we're like, do I want to
connect? Yes. No. And guess what? We just
continue on and we just kind of ignore it and go from there.
So there's an operability improvement.
When you leverage something like SSH certificates along
that line, you get host key verification, you get key distribution,
things that help from a certificate standpoint.
Then there's a security improvement as well with certificates.
One, you don't have to worry about permanent keys out there. And on
top of that, you get the ability, as you'll see, to have things like
some metadata, as well as having
expiration dates, things like that, to help you
with those certificates. So let's look at it in
an image. Certificates, SSH certificates in an image.
So what we have here is valid
principals, we've got keys, we've got a signature
piece of the puzzle there, and we have things
that make it encrypted. And what that does is it
makes it so that when a key is being
leveraged, I mean, a certificate is being leveraged, you have
the stuff there. So, a signature: if you have a signature
there and someone tampers with a key, guess what? That signature
becomes invalid if that key gets tampered with, and it gets broken. We have a
valid-after and a valid-before date, so you can set dates on
certificates so that they only operate within a certain amount of time.
And then what kind of certificate, whether it's a user certificate,
so me as a user logging into a system, or
maybe a host certificate, which would be your web server,
your application server, whatever server you're trying to gain access to.
And then some extensions, and then valid principals, and then a few
others. So I have a link there to a blog where we talk
about this in a little bit more detail so you can see some of that.
So let's dig a little bit deeper on this. So this part
we're going to start kind of peeling the layers back and walk you through
how we're building this out. Then we're going to jump into a quick demo
and show how you can even do this today. So, certificates require a
certificate authority to own the public/private key pairs used to generate
those certificates. So you need to have a CA, and
you can roll your own, as we'll do here in today's demo. From a
cryptography standpoint, we're really not changing anything.
We're not adding anything different. We're just validating and
signing those keys across the board. If a
certificate is tampered with, it breaks that signature and invalidates
that certificate. So that signature gets broken,
that cert is invalid. And guess what? Now your connectivity to that system
is denied. And as I mentioned once before, and you'll
probably hear me mention a few more times,
certificates can be set to expire. This is probably
one of my favorite features about using SSH certificates:
the fact that when one is issued, you know that there's only a certain time
to live for that, and once it's done, you have to issue
a new one in order to continue accessing your systems.
And of course from a security, maybe even an SRE standpoint,
using a shorter time to live on a cert hopefully equals your security
team sleeping a little bit better at night, not worrying about these keys
that are out there. We have host certs, which are used to identify
hosts, that they are who they say they are, and then we have user
certificates, which are used to identify the user, that the user is
who they say they are as well. So let's continue to break this down
and start showing you some code and how it
works. So what you'd need to
do first is generate that host and user certificate
authority. So, what type: as I mentioned, I'm using ed25519.
Then the file name, so I'm going to write this as host_ca,
user_ca, whatever it is, and a comment, so you can have a little
bit more, hey, this is a host CA, user CA. Now we're going to generate
a host key and then sign it. Then we're also going to generate a user
key and sign it. So I generate my host key again using
ed25519 for the type, my file name,
what it's going to be called, and then a passphrase, which is optional if you want
to put in a passphrase. And then we're going to create and
sign the host certificate based off that key.
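The key generation steps just described could be sketched like this. The file names (host_ca, user_ca, ssh_host_ed25519_key) and the temporary directory are assumptions, not necessarily what the slides or the demo repository use, and the empty passphrases are only for an unattended demo; a real CA private key deserves a passphrase and careful storage.

```shell
set -eu
workdir="$(mktemp -d)"

# Two separate CAs: one signs host keys, the other signs user keys.
ssh-keygen -q -t ed25519 -f "$workdir/host_ca" -C "host_ca" -N ""
ssh-keygen -q -t ed25519 -f "$workdir/user_ca" -C "user_ca" -N ""

# The host's own key pair, which the host CA will sign afterwards.
ssh-keygen -q -t ed25519 -f "$workdir/ssh_host_ed25519_key" -N ""

ls "$workdir"
```

Keeping the host CA and the user CA separate means a problem with one signing key does not automatically affect the other kind of certificate.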
So again, still using ssh-keygen. So what I'm
doing is -s, the file name of that CA private key. I generated that
CA private key, so I'm going to use that to basically
sign it. -I is my cert's identity. So this
is more for logs and things like that, so you know what
the cert identifies as. The -h is for
a host certificate. The -n is our comma-separated
list of principals, which would be, from a host side,
maybe your fully qualified domain name. So you can see an example.
I've got app example, localhost, app app node.
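Put together, signing a host key with those flags might look like the sketch below. The identity string, the principal names (app.example.com, localhost), and the file names are placeholders chosen for illustration, and the block regenerates its own throwaway CA and host key so that it runs on its own.

```shell
set -eu
workdir="$(mktemp -d)"
ssh-keygen -q -t ed25519 -f "$workdir/host_ca" -C "host_ca" -N ""
ssh-keygen -q -t ed25519 -f "$workdir/ssh_host_ed25519_key" -N ""

# -s CA private key, -I identity for logs, -h host certificate,
# -n comma-separated principals, -V validity (+2h = two hours from now).
ssh-keygen -q -s "$workdir/host_ca" \
  -I "app-node-host" \
  -h \
  -n "app.example.com,localhost" \
  -V +2h \
  "$workdir/ssh_host_ed25519_key.pub"

# The certificate is written next to the key as <name>-cert.pub.
ssh-keygen -L -f "$workdir/ssh_host_ed25519_key-cert.pub"
```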
The -V is a time to live, and we'll talk about that here in a
little bit more detail. For this demo we've got like a plus 2
hours for a host certificate. So 2 hours after its creation
this certificate will expire. And then we're tying in what
that public key was, and it's going to export out that
certificate. So let's go ahead and flip it and we're going to do the same
thing with a user certificate. So now we've got one created for our
host. We're going to do the same thing for a user. It looks pretty
much the same except it doesn't have that -h because
we're not doing a host certificate. And my time to live
is a little bit shorter because I want user certificates
to be a little bit shorter lived. But other than that I'm using my user CA
to sign it. My identity, hey, this is an
app or whatever, my identity, username, email, whatever. My
-n, which is my principals, would be something like a Linux login
name. So if you're SSHing into a system,
if it's like ubuntu or ec2-user using AWS, or
whatever your name is, that would be your list of login names that
you're allowed to SSH in as. The -V is your time to
live. And then of course, going back, here is what
you're signing. So let's talk a little bit about that -V part. So you
can actually set a certificate to be valid from, like, two weeks
ago up until two weeks from now. In the
documentation of OpenSSH, or if you go someplace like explainshell and
you look at that -V and read the man
page notes on it, there's a whole host of options that you can
have. You can do a plus 30 minutes, or a -30m:+30m,
so it's valid from 30 minutes before until 30 minutes after.
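As an illustration of a user certificate with that kind of validity window, here is a minimal sketch. The identity (demo-user@example.com), the login names, and the file names are assumptions; the -30m:+30m interval follows the relative-time format documented for ssh-keygen -V.

```shell
set -eu
workdir="$(mktemp -d)"
ssh-keygen -q -t ed25519 -f "$workdir/user_ca" -C "user_ca" -N ""
ssh-keygen -q -t ed25519 -f "$workdir/id_ed25519" -C "demo-user" -N ""

# No -h this time, so the result is a user certificate.
# -n lists the login names the holder may use on the server.
# -V -30m:+30m means valid from 30 minutes ago until 30 minutes from now.
ssh-keygen -q -s "$workdir/user_ca" \
  -I "demo-user@example.com" \
  -n "ubuntu,ec2-user" \
  -V -30m:+30m \
  "$workdir/id_ed25519.pub"

cert_info="$(ssh-keygen -L -f "$workdir/id_ed25519-cert.pub")"
echo "$cert_info"
```

Starting the window slightly in the past is a common way to tolerate small clock differences between the machine that signs and the server that verifies.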
You can put specific dates. It can get really complicated really
fast. But as you look at your environment, take a look and
architect, design, and plan it how you need to have
it work with your systems. Let's continue breaking this down. So now let's
go view those certificates. So you think, what happened now?
Well, you look on your system, you're going to have a bunch of files and
you're going to see that the .pub file
name gets a -cert added to it. So it's the name of the file.
But if you look at that cert file with the ssh-keygen
-L option, you're going to see a
certificate. It's going to display in your
terminal. You're going to see everything we kind of talked about. You've got a host
certificate there, you've got the key ID,
you get the principals, and then you have a valid expiration
time along that line, which is really kind of cool, being able to say,
oh, that's pretty neat. You can just view that and see how long it
is valid. All right, so let's jump quickly into the demo piece
of the puzzle here and show you what
it's going to look like and how you can even get started today
as well. So let me switch over here to my
terminal. All righty, so for
this, I've got a GitHub project that
is up and running, and the links to this repository
are at the end of this demo, so you can
pull it down from GitHub. So if I look, I've got a couple of files. I've got a Dockerfile.
So what we're going to do is run this out of
Docker, and you're going to see me standing
up two Docker containers. One is called an app
node that we're going to SSH to. The other is a bastion. So over here
we can see, I have no images at all.
So I'm going to go ahead and build this using Docker compose.
So let's go ahead and build this out and give me
a minute or two here doing that. So when you pull this down, I've got
a README out there. There are two branches on this repository, the main branch,
and then I've got a 30 minutes branch, which is the demo
I'm using for this. So feel free to switch to that, but you can dig
a little bit more into what those Dockerfiles are.
There is some SSH configuration that you would also need to
run within your environment; the Dockerfiles set
the sshd configuration for me on
the containers. All right, so we've got a build. So let's go look at it.
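The talk does not show the sshd configuration itself, so the following is only a guess at its general shape, using standard sshd_config directives. The paths are hypothetical and the demo repository may lay its files out differently.

```shell
set -eu
workdir="$(mktemp -d)"

# Hypothetical paths; adjust to wherever the keys and certs actually live.
cat > "$workdir/sshd_config.snippet" <<'EOF'
# Present this host's key together with its CA-signed certificate.
HostKey /etc/ssh/ssh_host_ed25519_key
HostCertificate /etc/ssh/ssh_host_ed25519_key-cert.pub

# Accept any user certificate signed by this CA's public key.
TrustedUserCAKeys /etc/ssh/user_ca.pub
EOF

cat "$workdir/sshd_config.snippet"
```

With TrustedUserCAKeys in place, the server no longer needs a per-user authorized_keys entry for anyone who presents a valid certificate from that CA.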
We've got an app node and we have a
bastion node. And so our goal is to ssh
to the application node from the bastion node
or from my local MacBook here without using
a standard username and password or a public/private key. So now that's
up and running. So I'm going to start these containers.
I'm going to do docker compose up, and I'm going to detach it.
So now we're going to see here my systems
are running. We can see that one is listening on
port 22. One is listening on port
223. So we're going to do a docker logs.
I'm going to follow both of these systems here so
we can see what's happening in real time as we
go. So what we have here is the
application node, which is our end node that we want to get to and
what we're going to eventually SSH into. We're going to do it through a jump host,
through a bastion host. So now that I have my system up,
I've got a script here called
copy keys. And what this script is going to do is
copy down my
certificate authority information, put it in my known_hosts, copy
my certs, and then build out the SSH config that I'm going
to need to access these systems. So I'm going to
go ahead and run this. And again, this is
a demo repo; this is not something you want to
run in production. So let me just caveat that I use this just
for learning and experimenting. So please.
Security is probably not the best on this, but it's really designed to help
you as a user understand how SSH certificates work in a
little bit more detail. So now everything's been copied
locally, and if I actually look in my
known_hosts file and actually
cat my known_hosts file here, let me do that real quick.
Oh,
.ssh/known_hosts, and we're going to grep.
You're going to
see that I've got the host CA. So what I've done is
added that host certificate authority into my known_hosts,
and this is part of using SSH certificates. If you read
the documentation on it, it talks about having this so that systems
know, hey, this is a known host, this is valid. We are good to
go. So let's go ahead and cd over to
my temp SSH files folder that was created here.
And you can see I've got my public/private keys and my certificates.
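For the known_hosts step mentioned a moment ago, a host CA entry uses the @cert-authority marker. The sketch below writes one into a scratch file rather than a real known_hosts; the host patterns and the freshly generated CA are placeholders, not the demo's actual values.

```shell
set -eu
workdir="$(mktemp -d)"
ssh-keygen -q -t ed25519 -f "$workdir/host_ca" -C "host_ca" -N ""

# Format: @cert-authority <host patterns> <CA public key>
# The patterns here are placeholders for whatever names your hosts use.
printf '@cert-authority %s %s\n' "*.example.com,localhost" \
  "$(cat "$workdir/host_ca.pub")" > "$workdir/known_hosts"

grep '^@cert-authority' "$workdir/known_hosts"
```

One line like this replaces a separate known_hosts entry for every host whose certificate was signed by that CA and whose name matches the patterns.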
Some of this I don't necessarily need, but for the demo
let's not worry about it. In fact, if I want to view what
that certificate looks like, you can see
that right here. This certificate is valid
for about five minutes. So in this demo I changed
it. So in about five minutes my certificate is going to expire.
This lets me know as a user I can log in as
these principals and what SSH extensions
I'm allowed to use. So if I look at my configuration file,
you can see I've got a host and a
bastion here. And what I'm going to do is I'm going to ProxyJump from
one into the other. So let's do that and show you
how that works. I'm going to do an ssh -F and
call my config file, and I'm going to log into
my application node. So immediately, what happened is
you saw some information come across these other screens, and you see
it says accepted certificate, right?
So this accepted and validated that my
certificate was legitimate, it wasn't
expired, and I am now within this server.
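A client configuration along the lines described, with a bastion entry and a ProxyJump into the application node, might look like the sketch below. Every host name, port, user, and path in it is a placeholder; the file that the demo's copy keys script generates may differ.

```shell
set -eu
workdir="$(mktemp -d)"

# Host names, ports, user, and file paths are placeholders for illustration.
cat > "$workdir/ssh_config" <<'EOF'
Host bastion
    HostName localhost
    Port 2222
    User demo
    IdentityFile ~/.ssh/demo/id_ed25519
    CertificateFile ~/.ssh/demo/id_ed25519-cert.pub

Host app-node
    HostName app-node
    User demo
    IdentityFile ~/.ssh/demo/id_ed25519
    CertificateFile ~/.ssh/demo/id_ed25519-cert.pub
    # Reach app-node by hopping through the bastion entry above.
    ProxyJump bastion
EOF

cat "$workdir/ssh_config"
```

With real hosts behind those names, `ssh -F "$workdir/ssh_config" app-node` would connect through the bastion, presenting the certificate at each hop.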
So now I can run and do the work. So I spun all these up
just to run a dad joke, right? What did the beaver say to the tree?
It's been nice gnawing you. So that's
an example showing, hey, how we SSH in. You notice I did
not get the warning, hey, do you recognize this
host? Should we SSH in or not? I'm able to do
that right off the top. So let's exit
out. So you're thinking, what about it? Did it really
leverage that certificate? So I can run this command,
and what it is, I'm going to SSH in and I'm going to add some
verbose debugging information.
And you'll see from my local SSH that it's using
certificates. So it logs in here and it sees, hey, this certificate
is valid for a certain amount of time, and this
certificate is valid, and how we're accessing, and how my host
matches this host certificate, and how my app node
also matches that host certificate. So it's
leveraging those, handshaking with them on the back end, and saying, hey,
we're connected, we're authenticated, we're able to access those
systems and we're good to go on that.
So let's look at the bastion certificate as well and see
what kind of time to live we had on that one. This one is
a 15-minute time to live. So that certificate will
still be valid for another 15 minutes before it expires,
whereas my client certificate should
be expiring fairly shortly. So at the time I'm
recording this video, it's 4:03.
So let us take a look and see if it will deny
me or let me into the system once
the certificate expires. And so,
boom, here's what happened. I just tried. So, you know, a few minutes ago I
was able to log in, but now that certificate has expired,
so guess what happens? I'm not able to log into
that system anymore, which is super awesome
because I don't have to worry about a public key that's sitting out
there. That certificate expired. So when that happens, it talks back
to the system, which says, sorry Allen, you're not allowed in
because your certificate has expired; it's no longer valid.
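To check from the client side whether a certificate has already run out, one option is to read its validity window with ssh-keygen -L and compare it to the current time. This sketch signs a throwaway certificate with dates deliberately in the past; the names and dates are assumptions, and it only inspects the certificate locally. The actual refusal in the demo comes from the server checking the same window.

```shell
set -eu
workdir="$(mktemp -d)"
ssh-keygen -q -t ed25519 -f "$workdir/user_ca" -C "user_ca" -N ""
ssh-keygen -q -t ed25519 -f "$workdir/id_ed25519" -C "demo-user" -N ""

# Absolute dates (YYYYMMDD) well in the past, so this cert is already expired.
ssh-keygen -q -s "$workdir/user_ca" -I "demo-user" -n "demo" \
  -V 20200101:20200102 "$workdir/id_ed25519.pub"

# "Valid: from <start> to <end>" is printed in local time, ISO 8601 style.
valid_to="$(ssh-keygen -L -f "$workdir/id_ed25519-cert.pub" | awk '/Valid:/ { print $5 }')"
now="$(date +%Y-%m-%dT%H:%M:%S)"

# ISO 8601 timestamps sort correctly as plain strings.
if awk -v now="$now" -v limit="$valid_to" 'BEGIN { exit !(now > limit) }'; then
  status="expired"
else
  status="still valid"
fi
echo "certificate valid until $valid_to: $status"
```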
You have to issue a new one in order to gain access
into your systems. And to me,
that's the power of leveraging SSH certificates
over keys is being able to control and
gate some of that access across the board. So as we
kind of finish up here and highlight
some of that, how do you get started? You're probably wondering.
That was awesome, hopefully. If not, that's okay too.
There's a link to the GitHub repo where you can pull this down, and
you can add that -vv for verbosity. It uses Docker,
but you can use Podman or something else and just modify it.
It uses Docker and Docker Compose to stand up these containers,
and this is just a simple high-level overview of
how this works. If you want something that is
way more complicated and way super cool, go check out
Teleport. It's a fully open source access plane project,
with almost 10,000 GitHub stars, used in
production by companies all over the world. It is the company I work for,
so a little bit more on that. But we take this concept
and we extend it and expand it so that users can
leverage short-lived certificates to access resources
wherever they might live. And of course there are some great resources there.
In fact, for this presentation, a lot of my content was written by
some really smart folks, and I just repurposed it. So there's
two links on how to SSH properly using SSH certificates,
as well as a little bit more of a dive into what they are.
As for where you can find me: shoot me an email, I'd love to
connect. Find me on LinkedIn as well, or even on
Twitter, where I tweet about food and/or technology
every so often. And of course, as I mentioned, our GitHub for
Teleport. We also have an open source Slack
community. We'd love to have you join in on that and ask questions as
you're playing around and learning how this works, if you
want to just kind of chat about things like that. On top of that,
we are hiring. So take this moment. If you're interested in
working for a Series B funded, fast-growing startup, we'd love to have
you apply with the link there.
100 employees, we're fully distributed, working on open source,
so if you've got a passion for open source and security,
give us a shout. Anyways, that's all I've got
today. I just want to thank you for attending today's session here at Conf42,
and have a wonderful day.