Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hey, everyone.
My name is Dewan Ahmed, and I'm a Principal Developer Advocate at Harness.
I'm thrilled to be here at Conf42 DevSecOps 2024.
Today, we're diving into securing Git repositories and preventing
accidental secret commits with Gitleaks and Harness Open Source.
Now, accidentally committing secrets into Git repositories is a
common challenge for developers.
But it doesn't have to be costly.
I'll show you how tools like Gitleaks and Harness Open Source
can prevent this disaster and protect your Git repositories.
Let's dive in.
Let's start with the first question.
Why are Git repositories vulnerable?
There are a few reasons.
The first one is the very collaborative nature of these resources.
Multiple developers are working on this together, and there are higher chances of
someone accidentally committing secrets.
The second one is the ease of accidental commits.
You are checking in multiple files that you are working on locally.
Maybe there is a .env file you are trying to test.
Maybe that has a database credential.
Accidentally committing secrets is actually more common than you might think.
Then, of course, there is the growing number of repositories and codebases that
your admin is managing. There are public repositories, private repositories,
and open source contributions. And with containerized applications and microservices,
there are so many repositories these days that, without proper access control
and procedures in place, it's much easier for some secrets to just seep
into your organization's repositories.
Let's look at what we mean by secrets in code.
So here we have main, the protected branch, or the branch that is supposed
to be protected, and then a developer is working on a feature branch.
They accidentally commit an API key, and then someone
catches it and adds a commit to remove the API key.
The security fix is the green branch, which is catching all those issues.
Then there's a DB update, another feature branch, the blue branch, and then someone
accidentally adds a database credential, an SSH key, and an access token.
And then again, the green security fix reverts the sensitive
information using a commit.
But the interesting thing here is that all of these are still
staying in your history,
and it's incredibly difficult to erase history.
Once sensitive information or secrets go into Git, it becomes
highly likely that they will be used by bad actors.
So what do we do?
We shift left on security.
What do we mean by shifting left?
It means catching issues at an early stage in the development life cycle.
Why is that better?
The benefits are that it minimizes the exposure risk: the shorter the time a
secret is exposed, the easier it is to remediate.
It also reduces the time and effort you need to spend fixing the issues later.
So we'll present two open source tools.
The first one is Gitleaks, an open source secret scanner for Git repositories, but
it can also scan files and directories.
The key features are predefined and customizable secret patterns.
We'll show you a predefined secret pattern in the next slide.
It's lightweight and pretty fast.
You can use it locally.
You can also integrate it into CI/CD.
Here's an example of a Gitleaks rule.
So this is a piece of code I proposed to Gitleaks to add the Harness Personal
Access Token and Service Account Token.
Kudos to the Gitleaks team.
Their contribution process is very straightforward and
the pull request was merged.
So now the Harness Personal Access Token and Service Account Token
can be recognized by Gitleaks.
So let's go over this.
The ID is, of course, the unique name, Harness API key.
The description tells the reader what this is.
Regex: let's explain the regex.
The token starts with PAT or SAT, for personal access token or service
account token, followed by a dot, followed by 22 alphanumeric characters.
This part can also include a dash or underscore. It is followed by a dot,
then 24 alphanumeric characters, followed by a dot and
20 alphanumeric characters.
Most of us will probably ask ChatGPT to generate the regex for us,
and you can also have some services explain it.
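To make that spoken description concrete, here is a minimal Python sketch of the pattern. It is rebuilt only from the description above, so treat it as an assumption: the rule that actually ships with Gitleaks may differ in detail, and the names HARNESS_TOKEN, fake_token, and looks_like_harness_token are made up for this illustration.

```python
import re

# Assumed pattern, rebuilt from the spoken description in this talk;
# the rule shipped with Gitleaks may differ in detail.
HARNESS_TOKEN = re.compile(
    r"(?:pat|sat)"            # personal access token or service account token
    r"\.[a-zA-Z0-9_-]{22}"    # dot, then 22 characters (dash and underscore allowed)
    r"\.[a-zA-Z0-9]{24}"      # dot, then 24 alphanumeric characters
    r"\.[a-zA-Z0-9]{20}"      # dot, then 20 alphanumeric characters
)

# Synthetic, invalid token that only has the right shape.
fake_token = "pat." + "A" * 22 + "." + "b" * 24 + "." + "7" * 20


def looks_like_harness_token(text: str) -> bool:
    """Return True if the text contains something shaped like a Harness token."""
    return HARNESS_TOKEN.search(text) is not None


if __name__ == "__main__":
    print(looks_like_harness_token(f"api_key: {fake_token}"))  # True
    print(looks_like_harness_token("api_key: pat.too-short"))  # False
```

The segment lengths are the whole point of the pattern: a string that starts with pat. but has the wrong lengths does not match, which keeps false positives down.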
Let's see how Gitleaks works.
Just like on the previous screen, the detection process
uses pattern-based scanning.
It looks at the pattern.
If there is a secret that matches that pattern, then it is detected.
You can configure rules, and then there are some modes.
You can run Gitleaks for local scanning, as a pre-commit hook,
and of course in CI/CD pipelines.
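As a rough illustration of pattern-based scanning, the sketch below checks every line of every file against a small set of rules and reports the rule, file, and line number of each match. This is a toy model in Python, not how Gitleaks is implemented: the two rules, the Finding class, and the scan function are all hypothetical and exist only for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class Finding:
    rule_id: str   # which rule matched
    file: str      # file that contains the match
    line: int      # 1-based line number of the match


# Hypothetical rules for illustration; this is not the Gitleaks rule set.
RULES = {
    "harness-style-token": re.compile(
        r"(?:pat|sat)\.[a-zA-Z0-9_-]{22}\.[a-zA-Z0-9]{24}\.[a-zA-Z0-9]{20}"
    ),
    "private-key-header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def scan(files: dict[str, str]) -> list[Finding]:
    """Check every line of every file against every rule."""
    findings = []
    for name, content in files.items():
        for number, line in enumerate(content.splitlines(), start=1):
            for rule_id, pattern in RULES.items():
                if pattern.search(line):
                    findings.append(Finding(rule_id, name, number))
    return findings


if __name__ == "__main__":
    # Synthetic token with the right shape; it is not a valid secret.
    fake = "pat." + "A" * 22 + "." + "b" * 24 + "." + "7" * 20
    files = {
        "config.yaml": "name: demo\ntoken: " + fake + "\n",
        "README.md": "no secrets here\n",
    }
    for finding in scan(files):
        print(finding)
```

The same scan function could be called from any of the modes mentioned above; only the set of files handed to it changes.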
Now, Gitleaks is one part of the puzzle.
Let's introduce the second tool, Harness Open Source.
Harness Open Source comes as a lightweight Go binary and you can install it locally.
You can run it on a $4 DigitalOcean droplet or an AWS EC2 VM.
A Harness Open Source instance comes with user management as
well as a built-in secret manager.
You can relate projects within Harness Open Source to GitHub orgs.
Each project can have one or more repositories.
And the CI/CD pipelines are tied to each repository.
You can have one or more pipelines within each repository.
You can also have a built-in artifact registry for pushing and pulling
artifacts, for example, Docker images, as well as Gitspaces.
Gitspaces are preconfigured dev environments.
If you have heard of something called a dev container, it's based on that.
Think of this as GitHub Codespaces.
We'll learn more about these during the demo.
How is Harness Open Source different from the Harness platform?
Harness Open Source is ideal for individual developers or small
teams, and comes with these five modules: code repository, CI, CD, artifact registry,
and preconfigured dev environments.
We are calling them Gitspaces.
And then, of course, there's an easy upgrade to the platform,
with more modules and with robust
governance and scalability.
All right, so enough talking.
Now time to go into the demo on how we can prevent accidental secret
commits to the Git repositories.
Let's get started with Harness open source.
I'll provide the link to this documentation in the video description.
So here, all you need is Docker installed on your machine and you
can get started with this command.
So I've already run this command and if you go to localhost port
3000, you'll see the Harness open source instance up and running.
All you need to do is create an account and then you'll see a screen like this.
A project in Harness Open Source is like a collection of
repositories, something like
GitHub orgs.
So let's actually create a project.
I'll call this Conf42 and create the project.
Once you're within the project, you can either import repositories
or create a new repository.
Let me import a repository.
I'll choose GitHub as the provider.
I'll use Harness Community as the organization.
And I'll use a popular cloud native project called PodInfo.
It's a public repository, so I don't require authorization.
I could import pipelines, for example, GitHub Actions from GitHub, or if
it's GitLab, the pipelines from there.
But for this one, I'll create the pipelines myself,
so I'll skip importing the pipelines.
Let me import the repository. It's almost instant, and then I see the repository is
imported here. Once you have the repository, it's just like a Git provider: you can
do all the operations that you typically do. For example, you can make a clone. You
can contribute by adding a new file.
You have commits, branches, and pull requests. So let's clone this repository locally.
So what I'll do is I'll hit clone.
I'll copy the clone URL.
Once you have copied the clone URL, you can choose your favorite code editor
and you can clone the repository.
For example, here I'm using VS Code.
I'll click on Clone Repository, paste in the link, and then I'll select a location
on my local machine and hit clone.
Alright, you can see the repository is cloned locally.
Once the repository is cloned, I'll try to add a secret and
try to commit that secret.
You can follow this GitHub repo, harness community, harness open source lab.
I'll add the link in the video description.
And here you can see an example.
If you scroll down, you'll see an example of secret detection.
So let's follow this.
We have an example for a harness personal access token.
It follows the same pattern, but it's not actually a valid secret.
So let me copy this and then go back to my VS code here.
I'll add a new file.
You can name it anything.
I'll call it config.yaml and paste in what I just copied.
You can see that
this is PAT, followed by the same pattern that we showed you for
the Harness API key in the Gitleaks rule.
So let me save this file before I actually try to push this to the repository.
I'll go back and within the repository I'll click on manage repository.
Under Security, there's a toggle for secret scanning.
It blocks commits containing secrets like passwords, API keys, and tokens.
So this is based on Gitleaks.
So let me toggle it to enable secret scanning and hit save.
So once we have enabled secret scanning, we'll go back to VS
Code, add a meaningful message, "whoops", and then hit commit.
I'll try to sync changes, and then it'll ask me for my Git credentials.
So for the username, you can go here, in Files, and you'll see there's a button
called Clone. Click the Clone button and then generate clone credentials. You
can see my username here; for you,
it will be different. And then there is the password or API token as well.
So let me copy this and go back to VS Code. For the username, I'll enter my
username. For the password, I already have it copied in my clipboard.
I'll paste it, and then you'll see that I was not able to push the commit. Let me
open the Git log to see what the issue is.
So I can see that it says it failed to push because the push contains secrets.
So it found a generic API key.
It even tells me where the file is,
config.yaml, along with the secret, the commit ID, and a description:
it detected a generic API key, potentially exposing access to various services.
All right.
This shows that when I have secret scanning enabled in my Harness Open
Source repository, which is based on Gitleaks, it looks for a pattern.
So this pattern matches one of the rules in Gitleaks, that is, Harness API key.
And if it finds that pattern, it understands that this is actually a secret,
so it won't let it go into the Git repository.
This is much better than finding a leaked credential in your Git
repository and then trying to remediate and minimize the exposure.
There is one heads up though.
So the heads up is that this method prevents new API keys and secrets
from going into the Git repository.
So after you have enabled it, it prevents new secrets, but if you already have
existing secrets in the repository, it doesn't scan for and detect those.
So that's something to keep in mind.
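Here is a small Python sketch of that caveat. It is a toy model, not how Harness Open Source or Gitleaks work internally: commits are just pairs of an ID and the text they added, and the names history, new_push, commits_with_secrets, and push_allowed are hypothetical. A gate that looks only at newly pushed commits lets the push through, while a separate pass over the full history is what surfaces the older leak.

```python
import re

# Assumed token shape, rebuilt from the description earlier in this talk.
SECRET = re.compile(
    r"(?:pat|sat)\.[a-zA-Z0-9_-]{22}\.[a-zA-Z0-9]{24}\.[a-zA-Z0-9]{20}"
)

# Synthetic token with the right shape; it is not a valid secret.
fake = "pat." + "x" * 22 + "." + "y" * 24 + "." + "z" * 20

# Toy history: each commit is (commit_id, text added by that commit).
history = [
    ("c1", "initial import"),
    ("c2", f"token: {fake}"),          # committed before scanning was enabled
    ("c3", "remove token from config"),
]
new_push = [("c4", "add feature flag")]


def commits_with_secrets(commits):
    """Return the IDs of commits whose added text matches the pattern."""
    return [commit_id for commit_id, added in commits if SECRET.search(added)]


def push_allowed(pushed_commits):
    """Push-time gate: it inspects only the commits being pushed."""
    return not commits_with_secrets(pushed_commits)


if __name__ == "__main__":
    print(push_allowed(new_push))                    # True
    print(commits_with_secrets(history + new_push))  # ['c2']
```

So if a repository existed before scanning was turned on, a one-time scan of the whole history is a sensible complement to the push-time check.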
Let's sum up.
So in this video,
we understood why Git repositories are vulnerable due to their collaborative
nature, the ease of accidental commits, and the growing number of
repositories and codebases these days.
We highlighted the right approach, which is shifting left
in your development workflow.
Then we talked about two popular open source tools,
Gitleaks and Harness Open Source, and how they can come together
to prevent accidental secrets in your Git repositories.
So what are the secrets?
The secrets could be API keys, personal access tokens, database
credentials, or any sensitive information you don't want to go public.
Then we watched a demo on how easy it is to get started with Harness Open Source.
With its light footprint, you can get started within a few seconds, whether you
install it locally or host it yourself on a cloud VM. We saw how we can import a
repository and enable secret scanning, and once it's enabled, it will block any secret
that is defined within the Gitleaks rules.
And also, if your organization's secret is not already in
the Gitleaks rules, the contribution process is very straightforward,
so you can define your own Gitleaks rules.
I really hope you give Harness Open Source and Gitleaks a try.
I'll add all the required links for this presentation in the video description,
and you can also find me on LinkedIn.
Thank you for listening.
Thank you for joining this talk.