Transcript
This transcript was autogenerated. To make changes, submit a PR.
This post is from Changpeng Zhao, the CEO of the world's largest cryptocurrency trading platform. What he's referring to is one of the largest data breaches in terms of the number of people affected.
In the summer of this year, a national police database of a
certain country was leaked. This database contained
terabytes of information on a billion people.
The hacker offered to sell the lot for ten bitcoins, roughly $200,000 at the time. CZ, or Changpeng Zhao's, post came four days after the hacker's post. A reporter from the Wall Street Journal verified that the leaked information was valid when they started calling people from that leaked database. As software professionals, we might be more interested in learning how this leak happened and how to prevent it.
Apparently, a developer wrote a blog post that included valid Elasticsearch credentials. Those credentials were valid for a year, and the service could be accessed publicly. If that breach doesn't scare you, and you think that you and your organization are immune to data breaches, I recommend checking out this website that visualizes the largest data breaches across the globe.
Welcome to DevSecOps 2022 in the security track. I'm Dewan Ahmed with the session "Who's managing the credentials for your data infrastructure?" I'm a developer advocate at Aiven, a company that builds and manages your data infrastructure based on open source technologies. I'm from beautiful New Brunswick, on the east coast of Canada. For the last ten years I have been focusing on application and data infrastructure. In my free time, I'm a pro bono career coach. That means I help students and new grads start or transition into a career in tech.
Now, when we talk about data infrastructure security, that's a very wide field. There could be issues of physical access: someone might run away with the disks, and then there is nothing you can do. Your data services are running on some physical or virtual machines; if the host machine is corrupted or compromised, then the services running on it will be impacted as well. There could be SQL injection or insertion attacks from a web application. And any of these three might mean data loss or backup-related issues.
Today, I'm not talking about any of these. I'm talking about database access, and that is because 80% of data breaches are the result of poor or reused passwords. Speaking of poor or reused passwords, in the agenda we have the problem: database access. We'll talk about how dynamic credentials can be a solution. Since we don't have any limit on the number of choices available, I'll talk about a strategy for choosing the right tool. And finally, the best way to understand something is with a demo. So let's talk about secret sprawl.
What is secret sprawl? You might have a CI server running somewhere with your database credentials. Some unhappy employee left the company six months ago, and they might still have access to your data infrastructure. You just don't know where you have your passwords. This issue is called secret sprawl.
Since we all love to roll our own, you might be inclined to create encryption as a service, although that might not be your business expertise. So this is another issue, where companies create their own custom solutions, including custom cryptography, whereas there are tons of useful solutions and products out there. Here's my favorite: using the same database password since forever. And then there's the password under the table, where everyone sort of knows the password, which is passed around the office. And if you have passed any credential to an application, that's a bad idea. Applications are not very good at keeping secrets. The moment they have one, they're going to leak it to some audit logs or some sort of output, and that creates a disaster all throughout.
So with that in mind, let's talk about the protection: the AAA model, which is authentication, authorization, and auditing. It's easier to apply that AAA model to a product or service that we might be aware of, so let's take Apache Kafka as an example. How does the first A, authentication, work with that? The idea is that we have to be able to properly authenticate Kafka clients to the brokers, and for that there are two mechanisms. One is SSL, or Secure Sockets Layer, and the second one is SASL, the Simple Authentication and Security Layer.
With SSL, the idea is to issue certificates to your clients, signed by a certificate authority, or CA. This is the default and most common setup if you're talking to a managed Kafka cluster. The second one is SASL. There's the word "simple" in the name, but don't be deceived; it's not that simple. The idea is that the authentication mechanism is separated from the Kafka protocol. It's popular in big data systems, and most likely, if you have a Hadoop setup, you are already leveraging this.
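As a rough, hypothetical sketch (none of these file names or passwords are from the talk), a Kafka client authenticating over SSL typically carries properties like these in its client configuration:

    security.protocol=SSL
    ssl.truststore.location=/etc/kafka/client.truststore.jks
    ssl.truststore.password=changeit
    ssl.keystore.location=/etc/kafka/client.keystore.jks
    ssl.keystore.password=changeit
    ssl.key.password=changeit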
Once your Kafka client is authenticated, the brokers need to decide what it can or cannot do. This is where authorization comes in, in the form of ACLs, or access control lists. The idea is pretty simple: user A can, or cannot, do operation B on resource C from host D, as in the sketch below.
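A hypothetical example of such a rule using the kafka-acls tool that ships with Apache Kafka (the principal, topic, and host are placeholders):

    kafka-acls.sh --bootstrap-server broker:9092 \
      --command-config admin.properties \
      --add --allow-principal User:analytics-app \
      --operation Read \
      --topic weekly-metrics \
      --allow-host 10.0.0.12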
The last A is auditing, and you can think of Kafka audit logs. The value of audit logs is that they provide data you can use to assess security risks in your Kafka clusters. You can also use a number of sink connectors to move your audit log data to, say, an S3 bucket so that you can analyze it. However, there are some challenges to address with the ways in which enterprises handle tasks related to Apache Kafka configuration. That's why open source projects like Klaw can help add an audit layer on top of Apache Kafka. Klaw, as a data governance toolkit, can manage topics, ACLs, and schemas. With Klaw, it's easy to check later who requested what and when the change went live, thanks to the audit logging feature. If you'd like to know more, check out klaw-project.io.
So now that we've talked about the problem, let's talk about the solution. What are dynamic credentials, and how can they be a solution? As the name suggests, dynamic credentials are generated on demand; they did not exist before. This provides time-bound access. Let's think of a scenario. You have an engineering team where the engineers need access to your database for eight hours, the period they work every day. So in the morning they start their work, and they generate a dynamic credential which is good for the day. Now imagine you have some applications that also talk to your database, an average call takes a few seconds, and the applications generate credentials right before they make the call. Does it make sense to give those applications eight-hour access as well? Probably not. Probably you want those applications to have a few minutes of TTL, or time to live, for their credentials, whereas your human users might have eight or nine hours of access.
Dynamic credentials also mean that the calls your human and machine users are making can be audited as well. One thing we're certain of: there is no shortage of tools. That's why it's important to talk about the factors to consider when choosing the right tool, the first one being flexibility. Your developers love flexibility; they might be talking to the tool using a UI, a CLI, or an API call. Your engineering team might have a number of services from cloud providers, plus some other managed services, and you don't want a different secret management tool for each, so ideally you'd like a lot of integrations for the tool you choose. If encryption or cryptography is not your core business strength, you might want the secret management tool to handle the encryption as well. You'd also want the automatic expiry of tokens or secrets to be handled, so that you don't have to create business logic for that. And in the event of a breach, you would expect the passwords to be revocable as well.
With that, I propose HashiCorp Vault, an open source secret management tool that started at HashiCorp as an open source project. With Vault, you can interact with the tool using a CLI, a UI, or an HTTP API call. There are a number of providers, including different cloud providers and databases, for which you can generate dynamic credentials. Besides dynamic credentials, you can store long-lived credentials within Vault as well. Vault provides encryption as a service, so that your data can be encrypted both at rest and in transit. You can manage or revoke leases on secrets. If you have experience handling X.509 certificates yourself, you know how painful that process is. Vault can take that burden away from you: it can act as a root certificate authority or an intermediate certificate authority, depending on how you set it up. Finally, a number of customers have used HashiCorp Vault in production, and still do. It's safe to say that Vault has been battle tested, and that's one more factor to consider when you choose your secret management tool.
Now let's take a look at a setup where there is no secret management tool and you're using a static password to communicate with the database directly. How would that interaction look? You tell the database, "Here's my static database password, give me some data," and the database says, "Yep, that looks okay, here's your data." Now let's take a look at a setup where you have something like Vault in the middle. At first you'd have to authenticate to Vault. Vault would talk to the provider, in this example the database, and verify that your user credential is valid based on the ACL, or access control list, that is set up. Vault would generate a dynamic credential for you that would be time-bound, and you, a human or an application, would be talking to the database using that dynamic credential.
Let's take a look at Vault's architecture so that we can better understand the demo. There's a clear separation of components here. Vault's encryption layer, referred to as the barrier, is responsible for encrypting and decrypting Vault data. The HTTP API and storage backend are outside of the barrier, and that's why they're untrusted; data that is written to the backend is encrypted. In order to make any request to Vault, the vault has to be unsealed through an unsealing process. When the vault is unsealed, a user can make requests to Vault, and each request has to have a valid token, which is handled by the auth method. The token store is responsible for actually generating the tokens, and you have the policy store to configure and ensure the proper authorization is in place.
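As a small, hypothetical illustration of the policy store (the policy name and path are my own, not from the talk), a policy might grant read access to a single credentials endpoint:

    vault policy write metrics-read - <<EOF
    path "database/creds/metrics-read-write" {
      capabilities = ["read"]
    }
    EOF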
You have different secrets engines for cloud providers or databases to generate these dynamic credentials, and you can configure one or more audit devices for auditing. But we can only understand so much by looking at an architecture diagram, so let's dive into the demo. Here we have the Aiven console, where we can create PostgreSQL or any other data-related services using the Aiven console, the Aiven CLI, or the Terraform provider.
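For reference, a roughly equivalent call with the Aiven CLI might look like this (the service name, plan, and cloud region are placeholders):

    avn service create demo-pg \
      --service-type pg \
      --cloud google-northamerica-northeast1 \
      --plan startup-4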
Now I'm choosing the Google Cloud North America Northeast 1 region, and I'm using a specific startup plan, but for your business case you might choose a higher plan. I'm going into the service to watch the service creation. You can see that under connection information I have the URI, the hostname, the port, and other connection details. The blinking blue indicates that a VM is being provisioned.
For the sake of time, I'll fast-forward the service creation. A solid green indicates that my service is up and running, and now I can use a PostgreSQL client to connect to the service. I'm using the PostgreSQL admin credentials here, and by the time you're watching this video, these credentials are already invalid.
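A sketch of that connection using the service URI from the console (the host, port, and password are placeholders; avnadmin and defaultdb are Aiven's defaults):

    psql "postgres://avnadmin:ADMIN_PASSWORD@demo-pg-myproject.aivencloud.com:12691/defaultdb?sslmode=require"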
All right, so now that we are connected to our PostgreSQL service, let's start by creating two tables. The first table is a weekly metrics reporting table. This has information like product downloads, GitHub stars, and Twitter followers, so all public information. The second table is an employee salary table. This has sensitive data on our employees. So imagine an application that is supposed to be able to read the weekly metrics reporting table, but should not have read access to the employee salary table. Let's now add some sample data to both of these tables.
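A minimal sketch of what those two tables and a row of sample data might look like; the column names and values are made up for illustration:

    CREATE TABLE weekly_metrics_reporting (
        week_start         date,
        product_downloads  integer,
        github_stars       integer,
        twitter_followers  integer
    );

    CREATE TABLE employee_salary (
        employee_id    integer,
        employee_name  text,
        salary         numeric
    );

    INSERT INTO weekly_metrics_reporting VALUES ('2022-11-07', 120000, 5400, 9800);
    INSERT INTO employee_salary VALUES (1, 'Jane Doe', 95000);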
After that, we'll configure HashiCorp Vault to generate dynamic credentials so that we can only read the weekly metrics reporting table and not the employee salary table. So let's start Vault with vault server -dev. The -dev flag indicates that the Vault server started in development mode. This means that the server is unsealed, and it gives us the root token, which is like the master password. This is for demo purposes; do not use this in production.
All right, so now that we have the Vault server up and running, let's export VAULT_ADDR, the address of the Vault server, on our CLI, just so that our CLI knows where the Vault server is running. I'm running the Vault server on the same machine, so it's localhost, and it's running on port 8200.
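For reference, those first two steps look roughly like this, with the dev server in one terminal and the export in another:

    vault server -dev

    export VAULT_ADDR='http://127.0.0.1:8200'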
Once I have that, I need to enter my PostgreSQL service password a couple of times, so rather than typing it again and again, I'm adding it as an environment variable as well. Once I do that, I enable the database secrets engine. You can have different secrets engines enabled for different providers, but in this demo we are creating database secrets, so I'm enabling the database secrets engine.
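A sketch of those two commands; the environment variable name is my own choice, not necessarily the one used in the demo:

    export PGPASSWORD='ADMIN_PASSWORD'   # Aiven PostgreSQL admin password, so we don't retype it

    vault secrets enable database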
The second step is configuring Vault with the proper plugin and connection information. The plugin for this database is the postgresql-database-plugin. I'm calling the role metrics-read-write, and for the username and password, rather than hard-coding the values within the connection URL, I'm using a template. That's a best practice.
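A sketch of that configuration step; the config name, host, and port are placeholders, while the templated {{username}} and {{password}} fields are the part the demo highlights:

    vault write database/config/aiven-postgres \
        plugin_name=postgresql-database-plugin \
        allowed_roles="metrics-read-write" \
        connection_url="postgresql://{{username}}:{{password}}@demo-pg-myproject.aivencloud.com:12691/defaultdb?sslmode=require" \
        username="avnadmin" \
        password="$PGPASSWORD"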
Once I do that, the third step is configuring a role that maps a name in Vault to a SQL statement that is executed to create the database credential. Here I'm saying that for this specific metrics-read-write role, the creation statement grants ALL on the weekly metrics reporting table, so it can perform all operations on it, and the credential is valid for one hour. That's the default TTL, or time to live. So these were the three steps: enable the database secrets engine, configure Vault with the plugin and connection information, and then configure a role.
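A sketch of the role, with the creation statement reconstructed from the description above; db_name must match the config name from the previous step, and the role name is a hyphenated rendering of the spoken "metrics read and write":

    vault write database/roles/metrics-read-write \
        db_name=aiven-postgres \
        creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; GRANT ALL ON weekly_metrics_reporting TO \"{{name}}\";" \
        default_ttl="1h"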
Once I do that, I can use the Vault CLI, specifically the vault read command, to generate a dynamic credential like this. Or we can also make an HTTP API call to generate the credentials. You can use the Vault UI as well, but I'm not showing that in this demo. In order to make the API call, we need a token, so I'm using the root token here. You can see that we generated two sets of PostgreSQL credentials and both are valid; they will be valid for the next hour or so.
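For reference, the two ways of generating a credential look roughly like this; the X-Vault-Token header carries the root token mentioned above:

    vault read database/creds/metrics-read-write

    curl --header "X-Vault-Token: <root-token>" \
         "$VAULT_ADDR/v1/database/creds/metrics-read-write"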
Once I have that, I'll go back to my PostgreSQL service, connect, and then start my testing. My testing will have two steps. Step number one: I'll connect using the admin credentials and try to read these two tables, and I expect that I should be able to read them, because those are admin credentials. Step number two: connect using one of these dynamically generated credentials and see if I can still see those two tables. So first I'm connected using the admin credentials, and I'll try to read the weekly metrics reporting table. I can see it; that's expected. Then I'll try to read the employee salary table. I can see that too, because this is an admin credential.
Cool. So now let's disconnect and reconnect using a dynamically generated credential. We generated two credentials; we can pick either one, whether the one generated using the CLI or the one using the HTTP API. It doesn't matter, both are the same. All right, so let's pick the one we generated using the CLI. This is the username, and defaultdb is the name of the database. And let's copy the password for this username.
All right, since the authentication works, we are able to actually connect to the database using this dynamically generated password. Now let's check the authorization part. We can read the weekly metrics reporting table, because that's how we set up the role. Now, the moment of truth: we are denied permission on the employee salary table, because we didn't grant any permission other than on the weekly metrics reporting table. So it seems that our authentication and authorization worked.
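For reference, the checks from the psql prompt look roughly like this; under the dynamic credential, the second SELECT is the one that should fail:

    SELECT * FROM weekly_metrics_reporting;  -- readable by both admin and the dynamic user
    SELECT * FROM employee_salary;           -- readable by admin; 'permission denied' for the dynamic user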
Now it's time to test the auditing feature of HashiCorp Vault. By default, auditing is not enabled, so step one is enabling the audit option. You can enable the audit device at the default path or at a custom path; in this case, I'm enabling it under a vault_audit_1 file.
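A sketch of enabling a file audit device at that custom path; the log file location is a placeholder:

    vault audit enable -path=vault_audit_1 file file_path=/tmp/vault_audit_1.log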
Once I read the audit file, I don't see any information, because after enabling auditing, I haven't interacted with Vault yet. So let's interact with Vault: let's generate a dynamic credential and then look at the audit file. This time we can actually see some data. We can see that a secret was generated, it has a lease ID, and under the data we can see that the secret is not in plain text, which is expected; you don't want your credentials to be in plain text in an audit file. In this case, the data is hashed with a salt using HMAC-SHA256. If you are a system administrator, you might be wondering how you can use this information to tell who accessed your resources.
In order to do so, you can generate a hash using the same audit hash function. So in this case, let's copy this username. Let's say you have a suspicion about this user, and you pass their username into, let's say, a payload file. The idea is to use this payload file to make an API call to the same audit hash function; you generate the audit hash, and the hash will match the hash in the log file.
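A sketch of that call to the sys/audit-hash endpoint for the audit device enabled above; the username in the payload is a placeholder:

    echo '{"input": "v-token-metrics-read-write-abc123"}' > payload.json

    curl --header "X-Vault-Token: <root-token>" \
         --request POST \
         --data @payload.json \
         "$VAULT_ADDR/v1/sys/audit-hash/vault_audit_1"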
So now let's make the call. This goes under the sys path, to the vault_audit_1 audit hash function. You can see that the hash that's generated matches: it ends with the same characters as the hashed username in your audit log. You can run one or more usernames, or any other data, through this audit hash function and generate the hash yourself. This lets you take care of the auditing as well. So with HashiCorp Vault, we checked the authentication, and that worked. Authorization also worked: we were not able to read the table we were not allowed to read. And now you've seen how you can use auditing as well.
So you did everything you were told, you followed all the best practices, but someone still figured out how to crack your infrastructure. What do you do? Do you have a break-glass procedure? That means: do you have a bypass mechanism so that, in case something catastrophic happens, you can still access your services? I'm not going to go into too much detail on break-glass procedures for data infrastructure security, but you can read more on that, and make sure that your organization has a break-glass procedure besides having the regular procedures in place. I know this is a ton of information for 30 minutes, and you might want to do the demo yourself. If you'd like to do that, there's a blog link in this slide, and there's a QR code that will take you to the same blog. You can follow along with the blog and create a free account on Aiven. Alternatively, you can use another PostgreSQL service for the demo if you have one running. I'd love to hear your feedback. You can reach out using the LinkedIn or Twitter handles, or shoot me an email if you have any questions. I look forward to networking with you throughout the event, and thank you so much for joining my talk.