Transcript
            
            
              This transcript was autogenerated. To make changes, submit a PR.
            
            
            
            
              Hello and welcome to this session on GitHub, where we're focusing
            
            
            
              on fortifying your code base with GitHub specifically. There are a lot of
            
            
            
              great features in GitHub. We just got through GitHub Universe and there's some
            
            
            
              amazing Copilot and AI innovations there and so many features
            
            
            
              that I think a year from now you'll wish you had started today to implement
            
            
            
              some of these things. And I love that quote by Karen Lamb showing here as
            
            
            
              we get started in today's session. My name is Travis and I work
            
            
            
              as a distinguished software engineer for a company called SPS Commerce.
            
            
            
              And you may not have heard of SPS Commerce. That's because we're a business to
            
            
            
              business organization that's focusing on connecting suppliers and retailers together
            
            
            
              into a massive retail network, the world's largest retail network,
            
            
            
              in fact. And I focus specifically there on developer experience.
            
            
            
              And you might be asking yourself, developer experience, what does
            
            
            
              that mean exactly? That can mean so many different things to different people.
            
            
            
              And over the last few years, this is one of my favorite definitions that I've
            
            
            
              seen pop up and land on. And that's the developer
            
            
            
              experience is the activity of studying, improving and optimizing
            
            
            
              how developers get their work done. So we're not interested in how a developer is
            
            
            
              going to communicate with the HR department to change their address.
            
            
            
              Instead, we're focusing on how we can engage with the developer and have
            
            
            
              their user experience and their developer principles line up to form this frictionless
            
            
            
              experience that they can use day to day to deliver code to
            
            
            
              production, to deliver features to production. And that's
            
            
            
              so important, especially as you think about the history of your organization, the number
            
            
            
              of existing tools that you have that are kind of forming this different experience of
            
            
            
              this CI/CD tool, this source control tool, this observability tool.
            
            
            
              And we need to all bring them together to form this nice cohesive ecosystem that
            
            
            
              allows you to have the best quality of life possible. And one of my favorite
            
            
            
              quotes kind of describing this problem is that developers work in
            
            
            
              rainforests, not planned gardens. This idea of
            
            
            
              a rainforest or a jungle, that these tools have really
            
            
            
              popped out in your organization over the last 20 years when
            
            
            
              you had a particular need, but nobody has curated or planned
            
            
            
              what that ecosystem looks like. And so as we think about how we can
            
            
            
              more effectively create planned gardens for our developer experience,
            
            
            
              the reality is that there's a lot of work to do,
            
            
            
              especially when we think about just coding alone. As an engineer
            
            
            
              specifically or developer who's writing code to deliver to production,
            
            
            
              your job is far more than just delivering code. In fact,
            
            
            
              you are expected to deal with infrastructure as code, CI/CD



              pipelines, dev environments, configuration.
            
            
            
              You also, which is really important for today's discussion,
            
            
            
              have to deal with a plethora of supply chain, SAST, DAST, and remediation issues,
            
            
            
              all related to security. And on top of that, or I should say
            
            
            
              on bottom of that, you have to deal with code quality, tech debt, feature flags,
            
            
            
              testing and of course just the overhead of the day to day
            
            
            
              operation within an organization, whether it be meetings or management or
            
            
            
              just other stuff. And when we examine this and we pull the
            
            
            
              stats from Software.com, we find that developers code
            
            
            
              on average 52 minutes a day. That's not very much. And so
            
            
            
              we need to make that 52 minutes longer and better and
            
            
            
              better quality, better quality of life so you can accomplish more during that from a



              productivity perspective. This quote is from Software.com



              CTO Mason McLead, who says code time is often undervalued,
            
            
            
              continually interrupted and almost wholly unmeasurable. And I definitely agree
            
            
            
              with that, especially in my coding experience. So we need to
            
            
            
              work to improve daily work. We need to fix bottlenecks,
            
            
            
              we need to include more automation, we need to reduce feedback cycle
            
            
            
              durations. Codified best practices is one of my favorites. I don't want
            
            
            
              to have to read a whole bunch of documentation. I want it to be part
            
            
            
              of the process that I'm working in and the tool set that I'm working with.
            
            
            
              Effective documentation is so important. So many times we don't
            
            
            
              even think about documentation and how important it is to be not just present,
            
            
            
              but also accurate and of course streamlining collaboration.
            
            
            
              And one of the key toolsets that we find in developer experience
            
            
            
              that can impact many of these areas is GitHub. And GitHub
            
            
            
              has had a long, interesting journey from when it started way
            
            
            
              back as early as 2008. Right? That's when we first saw GitHub
            
            
            
              and they were really focused on the idea of "Git repository hosting.



              No longer a pain in the ass. Finally, a code repository that works



              as well as you do," which is incredible. At the time we were just happy
            
            
            
              to get managed source control that worked so excellently. Quickly,



              they realized what they were onto. And in 2011 we see their mission and their
            
            
            
              focus move towards this: lowering the barriers of collaboration
            
            
            
              by building powerful features into our products that make it easier to contribute,
            
            
            
              which is true. We see them moving just beyond Git hosting and saying we're going to
            
            
            
              allow you to collaborate better. And of course, moving on



              to the acquisition by Microsoft in 2018, we see the complete developer



              platform: build, scale and deliver secure software. And



              if you've been paying attention, especially to GitHub Universe, there's lots of
            
            
            
              new, exciting features that were launched even this particular month.
            
            
            
              And so now GitHub has transitioned, as of November
            
            
            
              2023, to the world's leading AI-powered
            
            
            
              developer platform. And that's an exciting place to be in.
            
            
            
              But at the same time, recognize that staying up to date with GitHub features is
            
            
            
              almost a full-time job, it would seem. If you track the



              releases per month (I'm only going back as far as 2018), you can
            
            
            
              see that we're getting as many as 60 or 70 releases,
            
            
            
              feature releases of GitHub per month. And that's just such an



              explosion of capabilities that is both exciting and has



              you worrying about what do I focus on? What don't I focus on? So I
            
            
            
              found a lot of our teams are looking for the hints at where to explore,
            
            
            
              where do I go next? So as we dive in today on fortifying your code
            
            
            
              base, we're zoning in on GitHub and how we can maximize your developer productivity,
            
            
            
              specifically with two GitHub tools.
            
            
            
              This is important. If we look at the Gartner 2020 report,
            
            
            
              it says that 29% of organizations were shifting towards
            
            
            
              consolidating security vendors due to operational inefficiencies.
            
            
            
              And we see that growing. That grew to 75%
            
            
            
              on the same report in 2022. And I imagine in 2024 it's going
            
            
            
              to be even more interesting on top of that. And so what is
            
            
            
              that all about, consolidating security vendors due to operational inefficiencies?
            
            
            
              Well, we find some answers deeper inside the Dynatrace report, focusing on
            
            
            
              application security, where it talks about tool sprawl. And if you're
            
            
            
              in developer experience, you know tool sprawl is a big problem. We have so many
            
            
            
              tools all over the place, and this comes back to that curated
            
            
            
              garden that we want to build. It's very difficult when you have so much
            
            
            
              individual or independent tooling and incumbents that are there. And so as
            
            
            
              we look to this and we gauge we're already in source control, GitHub does
            
            
            
              so much of what we need already. What if it could do more? What can
            
            
            
              it do for us from a security perspective, to rein in that tool sprawl
            
            
            
              and allow us to focus on what we do best in code?
            
            
            
              And GitHub really is in some cases that Swiss Army knife of
            
            
            
              tooling. But at the same time, some of the tooling that it has, a lot
            
            
            
              of the tooling it has, does an incredibly great job of integrating with the ecosystem.
            
            
            
              And so today we want to look at Dependabot, which is all about transparency
            
            
            
              and automation to keep your supply chain dependencies up to date.
            
            
            
              And it's going to be super effective. If you haven't seen Dependabot yet,
            
            
            
              it's going to feel like a breath of fresh air. And of course, GitHub Advanced



              Security, which we've seen recently take a large presence
            
            
            
              on GitHub and it's all about the centralization and the transparency
            
            
            
              of code security, really focusing on static code analysis and how
            
            
            
              it can support that. And so with that, let's dive in. Let's take a
            
            
            
              look at GitHub Dependabot. And this is all about
            
            
            
              supply chain security. And in this particular feature,
            
            
            
              GitHub defines it as monitor vulnerabilities in dependencies
            
            
            
              used in your project and keep your dependencies up to date with Dependabot.
            
            
            
              What does that actually mean? Don't worry, we're going to explore it. But this idea
            
            
            
              that in all of your repositories, whether it be PyPI



              packages in a requirements.txt, whether it be a NuGet config



              for .NET, or whether it be a Maven



              settings.xml, whatever you have, whatever ecosystem you're
            
            
            
              in, you have a number of dependencies. You rely on abstractions that
            
            
            
              are really important, but keeping them up to date can feel like a
            
            
            
              nightmare, right? But if we look at the Mend.io 2021



              report, it says that over 90% of CVEs aren't present in most recent
            
            
            
              dependency versions. That's incredible. That means that the single best
            
            
            
              security practice that you can do in terms of consuming external supply
            
            
            
              chain security is to just keep your packages up to date all the time.
            
            
            
              Just use the latest and you're going to save yourself a lot of pain.
            
            
            
              And I like to think about this as Mend.io describes it, which is kind
            
            
            
              of like going to the dentist. If you only update your dependencies every five
            
            
            
              years, it's going to be painful, right? It's really going to hurt. But if you're
            
            
            
              doing it every month or continually every week, it becomes second nature.
            
            
            
              It's a simple best practice, right? Just as we think about CI/CD



              and doing that more often. And so we'll dive into three components of Dependabot:



              alerts, security updates and version updates.
            
            
            
              All right, so first, a bit of an overview. If you go into
            
            
            
              your GitHub, you're going to need admin access to your repository and you'll be able
            
            
            
              to find this security section that we'll be exploring today, which is code
            
            
            
              security and analysis. And it's got a dependency graph
            
            
            
              present. And dependency graph has been around a long time in GitHub and basically maps
            
            
            
              all of these supply chain dependencies. So that way you can generate a pretty clear
            
            
            
              software bill of materials or an SBOM. And turning that on
            
            
            
              is free and cheap and easy and there's no reason you shouldn't use your dependency
            
            
            
              graph. And once you have that data set enabled, then you
            
            
            
              can begin to take advantage of the Dependabot features that we just introduced
            
            
            
              and there you'll be able to then drill in. You can
            
            
            
              see your dependency graph where you can actually take a look at all the packages
            
            
            
              in your repo or better yet, see what dependencies are used across your entire organization
            
            
            
              as a part of that SBOM. And when you drill into it,



              then you'll be able to look at your Dependabot alerts. And so by enabling



              the Dependabot alerts, we can very quickly say: well, here's my dependency graph,
            
            
            
              but highlight for me the things that are critical or high concerns
            
            
            
              related to CVEs that are out there. And you get that as a part of
            
            
            
              your security tab that you can see here. And on that security tab you can
            
            
            
              drill in and check out the individual details of each and every one of these.
            
            
            
              And there's no other infrastructure you have to turn on for this, you just simply
            
            
            
              have to enable the feature. Once it's enabled,
            
            
            
              you'll be able to drill in. And from here you can do a couple of
            
            
            
              things. The first, which is pretty neat, is that you can actually create a security update
            
            
            
              immediately from this particular issue, and it's going to create
            
            
            
              a pull request on your repository for you. If you decide that
            
            
            
              this isn't a fix that you need to make, or perhaps the surface area of
            
            
            
              this particular CVE doesn't affect the way that you're using it, well, you can easily
            
            
            
              dismiss it. And there's plenty of workflow options that allow you to track and
            
            
            
              see why certain things were dismissed over time. And so you also
            
            
            
              have the option in your organizational settings to turn
            
            
            
              on this capability across the entire organization. You can enable and disable
            
            
            
              all of it as an administrator and an org owner.
            
            
            
              However, a word of warning, as you begin to turn on and play with these
            
            
            
              features, especially the ones that actually create pull requests,
            
            
            
              that's the security updates. Alerts, just remember, tell me about



              a problem. Security updates actually submit pull requests when there's a security



              concern. When disabling, or I should say enabling, security



              updates for everyone, keep in mind that if you have 3000 repos in your
            
            
            
              organization, you're about to turn that on across the board and
            
            
            
              each one of those may submit a pull request, which in turn will submit
            
            
            
              a status check related to your build provider, and all of a
            
            
            
              sudden you're about to kick off a plethora of builds that's really going to jam
            
            
            
              up that queue, I think. So just be careful as you think about organizational
            
            
            
              rollout, but it does seem pretty trivial and easy



              to do so here. You can also find views at that level
            
            
            
              about who has it enabled, who has alerts enabled, versus security updates,
            
            
            
              and how many of your repos are protected. Version updates



              take us to the next level then. They say, I don't just want security updates,
            
            
            
              actually give me updates for all packages that are out there, any package that I
            
            
            
              have in my ecosystem, and I'm a big fan of using version updates across the
            
            
            
              board. And GitHub defines version updates as automated pull
            
            
            
              requests that keep your dependencies updated even when they don't have
            
            
            
              any vulnerabilities. And so you can see here an example of
            
            
            
              a pull request that's been created that clearly outlines
            
            
            
              an update that I'm making for this particular package, and has release notes and
            
            
            
              commit information available to you, as well as labels that are there. And the
            
            
            
              supported ecosystem is pretty substantial here. I think you'll find that
            
            
            
              a lot of the core languages that you work with will be supported, whether it
            
            
            
              be Go, Maven, Gradle, npm, NuGet,



              pip, Elm. Even some interesting ones that you might not



              have thought of would be Docker, for example, or Terraform modules, or even



              Git submodules or GitHub Actions can all be updated.
            
            
            
              If you're specifying a Dockerfile and it uses semantic versioning,



              you can automatically have that FROM statement updated as a part of Dependabot.
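
As a rough sketch of how those less obvious ecosystems could be switched on in the dependabot.yml file covered later in this talk (the directories and weekly interval here are assumptions for illustration, not values from the slides):

```yaml
# .github/dependabot.yml (illustrative sketch)
version: 2
updates:
  # Proposes bumps to the image tag in the FROM statement of a Dockerfile
  # located at the repository root (assumed location).
  - package-ecosystem: "docker"
    directory: "/"
    schedule:
      interval: "weekly"
  # Proposes bumps to action versions referenced in workflow files;
  # "/" tells Dependabot to look in the default workflows location.
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"
```

Each entry under `updates` is handled independently, so a repository with several Dockerfiles in different folders would need one entry per directory.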
            
            
            
              And a little bit on my wish list is that Helm charts could be part of
            
            
            
              that too, but maybe we'll see that in the future. It does support
            
            
            
              private feeds as well, so you likely have internal packages
            
            
            
              that are part of your organization, and you can include those here as a
            
            
            
              part of it too, and organizationally configure secrets that



              would allow private access to a JFrog feed, for example.



              You can specify an update schedule, which is important because you don't always just want
            
            
            
              to update in real time. Sometimes you want that to happen on a regular cadence.
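
A schedule along those lines might look like the following sketch. The ecosystem, day, time and timezone are placeholder choices to show the available knobs, not recommendations from the talk:

```yaml
# .github/dependabot.yml (illustrative sketch)
version: 2
updates:
  - package-ecosystem: "npm"      # assumed ecosystem for the example
    directory: "/"
    schedule:
      interval: "weekly"          # "daily" and "monthly" are also accepted
      day: "monday"               # only meaningful for weekly schedules
      time: "06:00"               # 24-hour clock
      timezone: "America/Chicago" # IANA timezone name
```

Picking a quiet window like early Monday morning keeps the resulting pull requests and builds from competing with day-to-day work.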
            
            
            
              You also have metadata configuration, and we'll talk about the metadata configuration options
            
            
            
              in a second. And we have behavioral configuration, and we'll see that
            
            
            
              too. So as we begin to explore, you'll find that that dependency graph now is
            
            
            
              going to be populated. And as a part of that, here's where you can generate
            
            
            
              that SBOM that we talked about. And 83% of security teams don't



              have access to a fully accurate SBOM in real time, which is crazy that
            
            
            
              you can have that for free here. You can automatically hit the
            
            
            
              check for updates and you can look for updates anytime that you need to and
            
            
            
              process through that. All right, so moving on to configuration. Now, version updates
            
            
            
              are not configured through the UI like the rest of the Dependabot capabilities
            
            
            
              were. Version updates are actually going to move into source
            
            
            
              control, configured in the way that you'd expect with a YAML file.



              So you're going to create a YAML file called dependabot.yml,



              and you're going to place that under the .github metadata folder that exists



              in your repository here. Then we're going to specify version two because Dependabot comes
            
            
            
              from a previous preview that had a different schema. So we're just specifying the version
            
            
            
              of schema we want to use, followed then by a series of registries.
            
            
            
              These could be private registries inside your organization that you want to make use of.
            
            
            
              In this case, I'm going to use a private NuGet feed that's attached to Azure
            
            
            
              DevOps. And you can see here that I can tokenize and use secrets
            
            
            
              that are pulled from the organizational level, which is great. It means I can use
            
            
            
              this configuration across many repositories.
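
The slide itself is not reproduced in this transcript, so here is a minimal sketch of what such a file could look like. The registry name, organization, feed URL, username and secret name are all hypothetical placeholders:

```yaml
# .github/dependabot.yml (illustrative sketch)
version: 2                        # schema version, as described above

registries:
  nuget-azure-devops:             # hypothetical name, referenced below
    type: nuget-feed
    # Placeholder URL for a private feed hosted in Azure DevOps
    url: https://pkgs.dev.azure.com/example-org/_packaging/example-feed/nuget/v3/index.json
    username: build@example.com   # placeholder account
    # Resolved from a Dependabot secret, which can be defined at the
    # organization level and shared across repositories
    password: ${{secrets.AZURE_DEVOPS_PAT}}

updates:
  - package-ecosystem: "nuget"
    directory: "/"
    registries:
      - nuget-azure-devops        # grant this update job access to the feed
    schedule:
      interval: "weekly"
```

Because the credential is only referenced by secret name, the same file can be copied between repositories without exposing a token in source control.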
            
            
            
              And now I'm going to indicate the ecosystems I want to update and the directories
            
            
            
              for those. So if you have a monorepo, you can specify multiple ecosystems
            
            
            
              in a single file, or specify just one if you need. And you can set



              that schedule here with the interval of how often you want to update. You can
            
            
            
              also have several other options around open pull request limits. In this case,
            
            
            
              I'm going to say I don't want any more than ten pull requests ever at
            
            
            
              a time. You can also include additional metadata around custom labels,
            
            
            
              assignees, reviewers, commit messages, lots of information
            
            
            
              you can explore for how you want to customize and piece together your workflow for
            
            
            
              how it creates pull requests. What's neat though, is that you
            
            
            
              have the ability to ignore certain dependencies. In many cases you
            
            
            
              have some of your capabilities, or I should say some of your
            
            
            
              packages are updated in like a nightly build, and you might retrieve those far
            
            
            
              more often than you want. An example of this that I've seen is like AWS
            
            
            
              SDK seems to have almost a build every single day for
            
            
            
              some of them. And while I want that build, and I want to get updated,



              boy, I don't necessarily want to worry about it every single day,
            
            
            
              maybe once a week or whatever that cadence is. You can ignore certain types
            
            
            
              of updates, and you can also ignore in some cases, if you're not ready
            
            
            
              to make a major upgrade to your system, ignore major version numbers or patch version
            
            
            
              numbers, depending on what you want. One of the largest additions
            
            
            
              that makes Dependabot even so much better now than it was a few
            
            
            
              months ago is the ability to handle grouped
            
            
            
              pull requests. And by that I mean we want to actually group several changes
            
            
            
              or several package updates into a single PR.
            
            
            
              And that's essential because it causes a lot of problems, a lot of noise,
            
            
            
              by generating ten pull requests. In some cases, the granularity is
            
            
            
              so small that updating one package causes another one to break, and you'll never
            
            
            
              get both of those to pass your status checks as it creates those pull requests
            
            
            
              in GitHub for you, requiring some manual intervention and moving between branches
            
            
            
              in order to figure it out. And so this is why grouped updates allow
            
            
            
              us to say, hey, take all of those test dependencies and squash
            
            
            
              them together into one pull request. Take those core dependencies and those
            
            
            
              packages that rely on each other. Make sure they're together in one pull request.
            
            
            
              Take all of those AWS updates and make sure they're in one pull request
            
            
            
              together, not individual ones. And this is pretty essential,
            
            
            
              I think, for the effectiveness and the productivity
            
            
            
              of Dependabot. And so if you've come from Dependabot years ago and you
            
            
            
              thought it's too noisy for me, try it again, because this is a big difference
            
            
            
              that's enabled now and available. So custom
            
            
            
              groups are awesome. I can add those. I can add exclude patterns per group
            
            
            
              so I can say include all these, except these. You can also
            
            
            
              do a catch all where you could actually say I want all my dependencies
            
            
            
              in one easy pull request. And that makes it nice and easy to validate
            
            
            
              and merge when it's successful. But what about when it's not successful?
            
            
            
              Then you have to try and filter through and understand exactly which update failed
            
            
            
              what. So there can be some good and some bad with that. It also supports dependency
            
            
            
              types as well. So you can say, hey, I want all of my production dependencies
            
            
            
              or development dependencies if your ecosystem supports
            
            
            
              that. And of course you can do other update
            
            
            
              types to say, I actually only want to update minor or patch versions,
            
            
            
              don't give me major version updates. Those are something that I need to plan for.
            
            
            
              I can't just have PRs being opened for them. And so the usage of
            
            
            
              Dependabot with grouped updates and updates in general is critical. I know,
            
            
            
              at SPS commerce, one of the key use cases that we have as well is
            
            
            
              inner source distribution, really focusing on velocity.
            
            
            
              And so internally when you're setting up a new library and you're distributing it and
            
            
            
              your applications are consuming it, typically the only reason
            
            
            
              these applications are going to update a version number without something
            
            
            
              like Dependabot is because they did an initial install, they're doing
            
            
            
              a major upgrade, or they need a feature that's actually a part of that
            
            
            
              and they've been following it. Otherwise the only way you're going to get an upgrade is
            
            
            
              through Dependabot. And so if you're interested in that at all, feel free to
            
            
            
              check out another session I've given at other conferences
            
            
            
              called "Compelling Code Reuse in the Enterprise." You can feel free to Google
            
            
            
              that and find it online as well. But this is essential to enabling
            
            
            
              inner source distribution and velocity. And you can filter your
            
            
            
              updates in Dependabot by using the allow tag and
            
            
            
              saying I actually only want this individual dependency to be updated. And so if you're
            
            
            
              not going to use it for the rest, at least use it for your internal
            
            
            
              organizational velocity.
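
To make the grouping idea a bit more concrete, here is a rough sketch of how pattern-based groups with exclusions could decide which pull request a package update lands in. The group names and package names are hypothetical, the dictionary only loosely mirrors the groups, patterns, and exclude-patterns settings of a Dependabot configuration, and the matching logic is an illustration rather than Dependabot's actual implementation.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Optional

# Hypothetical group definitions (assumption): each group has wildcard
# patterns to include and wildcard patterns to exclude.
GROUPS = {
    "aws": {"patterns": ["aws-sdk*", "@aws-sdk/*"], "exclude": []},
    "test-dependencies": {"patterns": ["jest*", "*-mock"], "exclude": ["jest-legacy"]},
}


def assign_group(package: str, groups: Dict[str, dict] = GROUPS) -> Optional[str]:
    """Return the first group whose patterns match and whose excludes do not."""
    for name, rule in groups.items():
        included = any(fnmatchcase(package, p) for p in rule["patterns"])
        excluded = any(fnmatchcase(package, p) for p in rule["exclude"])
        if included and not excluded:
            return name
    return None


def plan_pull_requests(packages: List[str]) -> Dict[str, List[str]]:
    """Map each planned pull request to the package updates it would contain."""
    plan: Dict[str, List[str]] = {}
    for package in packages:
        group = assign_group(package)
        # Ungrouped packages each get their own pull request, keyed by name.
        key = group if group is not None else package
        plan.setdefault(key, []).append(package)
    return plan


updates = ["aws-sdk", "@aws-sdk/client-s3", "jest", "jest-legacy", "sinon-mock", "express"]
for pull_request, contents in plan_pull_requests(updates).items():
    print(pull_request, "->", contents)
```

With these six updates the sketch plans four pull requests instead of six: one for the AWS packages, one for the test dependencies, and one each for the excluded and ungrouped packages.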
            
            
            
              And so with that, a couple of thoughts. Some pitfalls. If you're not
            
            
            
              using grouped updates, you need to be, because that is a big difference here that
            
            
            
              makes it go ten times further. There's no auto merge capability.
            
            
            
              So assuming your checks pass and everything's good, there's no ability
            
            
            
              to merge it in without some additional extensions or using GitHub actions in order
            
            
            
              to accomplish that. And I would love a feature here that allowed us
            
            
            
              to look at the package maturity or the package age and say,
            
            
            
              I only want to include updates for packages that are x number of days old.
            
            
            
              I want someone else to go through the process of finding those particular bugs and
            
            
            
              kind of have a pre baked period for that.
            
            
            
              There are alternatives. If you're not in the GitHub ecosystem and you're really
            
            
            
              liking this, one alternative out there, though it's kind of deprecated now, is NuKeeper.
            
            
            
              It was kind of NuGet specific. But it had just a ton of features
            
            
            
              and was really before its time. And a more popular one
            
            
            
              then would be Renovate, which you can make use of. Renovate is cross-platform
            
            
            
              and provides a lot of the same functionality, if not even more capabilities
            
            
            
              in some cases. Merge queues if you're using
            
            
            
              merge queues, which is a brand new GitHub feature as well, we don't have time
            
            
            
              to cover that today. But you can actually integrate and use merge queues along with
            
            
            
              Dependabot to try and get some of that grouped update effect in there and kind
            
            
            
              of throttle some of those deploys a little bit. So that way you can group
            
            
            
              a number of merged Dependabot updates all at the same time. And
            
            
            
              custom dependencies. So looking at this,
            
            
            
              trying to understand your dependency chain, what's proprietary,
            
            
            
              what's internal, can be helpful, but can also be really
            
            
            
              problematic as well. And of course, from a security governance
            
            
            
              perspective, enable those defaults, get your dependency graphs on, get your
            
            
            
              alerts on, and have access to that SBOM,
            
            
            
              and begin to assess what your organizational kind of perspective
            
            
            
              looks like from security. And you'll be able to actually see
            
            
            
              who's using some of the packages you maybe thought are a little bit funny.
            
            
            
              So with that, I want to move on to GitHub Advanced Security.
            
            
            
              And while Dependabot was all about the supply chain, kind of
            
            
            
              scanning other people's code and consuming other people's code, GitHub Advanced
            
            
            
              Security is a feature that is all about thinking about the practices around your
            
            
            
              own code security. So now the code that we actually write, and so that's why
            
            
            
              it pairs very well. And going back to our introduction,
            
            
            
              you'll recall that we talked a lot about this tool sprawl and
            
            
            
              team silos. And Dependabot is great,
            
            
            
              but it doesn't necessarily allow you to hook in with other tools. What we're going
            
            
            
              to find is that GitHub Advanced Security provides a centralization,
            
            
            
              a mechanism for visibility of not just information that
            
            
            
              we're seeing related to GitHub itself that is generated, but how we
            
            
            
              can integrate other tools into the same interface as well,
            
            
            
              which is a massive advantage compared to what we're seeing elsewhere.
            
            
            
              And so we want to do a little bit of an overview. We want to
            
            
            
              check out code scanning, and we want to then separately check out CodeQL,
            
            
            
              which is going to interact with code scanning to provide some static analysis as a
            
            
            
              part of that centralization. And as we get started, we'll see
            
            
            
              a couple other components here with GitHub Advanced Security as well.
            
            
            
              First is you're going to be in the same section of security that we were
            
            
            
              before for Dependabot, but you're going to scroll down the page a little more
            
            
            
              in your settings, and you're going to find GitHub Advanced Security in there. It's got
            
            
            
              these two sections that you can enable here,
            
            
            
              and enabling them then gives you access to code scanning
            
            
            
              and secret scanning. And so code scanning
            
            
            
              basically is what we're going to focus more on in a minute. But to give
            
            
            
              you a preview of secret scanning, we'll see that too.
            
            
            
              And that's where we can receive alerts or even block commits to your
            
            
            
              repository that it thinks contain secrets.
            
            
            
              For GitHub Advanced Security, it's important you know that this is a paid portion
            
            
            
              of the ecosystem. And so depending on if you're a public repo or
            
            
            
              you're an enterprise or what your implementation of on premise
            
            
            
              is, you'll have to look at the licensing for this. And the licensing
            
            
            
              is a bit odd, mind you. It's actually one license per user for every
            
            
            
              active committer, meaning anyone who has committed in the last 90 days on your particular repository.
            
            
            
              And once you're licensed in that organization, then you don't take up a license in
            
            
            
              another repository that's there. So just be mindful of that.
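
As a rough illustration of that licensing rule as described here, the sketch below counts each committer once across an organization if they have committed within the window. The commit records, names, and the function are hypothetical, and real billing may differ in its details, so treat this as a simplified model only.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple

# Each commit record is (repository, author, commit time); all values are made up.
Commit = Tuple[str, str, datetime]


def count_licenses(commits: Iterable[Commit], now: datetime, window_days: int = 90) -> int:
    """Count unique authors with at least one commit inside the window.

    An author who is active in several repositories is only counted once.
    """
    cutoff = now - timedelta(days=window_days)
    active_authors = {author for (_repo, author, when) in commits if when >= cutoff}
    return len(active_authors)


now = datetime(2024, 1, 1)
commits = [
    ("orders-api", "ana", now - timedelta(days=5)),
    ("billing-api", "ana", now - timedelta(days=20)),   # same person, second repository
    ("orders-api", "raj", now - timedelta(days=89)),
    ("legacy-tool", "kim", now - timedelta(days=200)),  # outside the 90-day window
]
print(count_licenses(commits, now))
```

In this made-up data, two licenses are needed: one author is active in two repositories but counted once, and one author falls outside the window.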
            
            
            
              But as we dive into secret scanning, I think you'll find that it's interesting to
            
            
            
              see that push protection, when it went
            
            
            
              generally available for public repos, blocked over 17,000
            
            
            
              credentials in one year, which is incredible. And so enabling
            
            
            
              secret scanning is a no brainer. If you have the license, you're going to want
            
            
            
              to turn that on and you can verify then if a secret is valid or
            
            
            
              not as well. So as it detects a secret inside your
            
            
            
              repository or the code that you're committing, it can actually go and verify that
            
            
            
              with providers. So think about AWS and taking those
            
            
            
              particular credentials and seeing that not only did I find credentials that match a
            
            
            
              pattern, but I've actually validated these credentials are real and they work.
            
            
            
              That's obviously going to raise a much larger security risk
            
            
            
              than invalid credentials or credentials that don't match a particular pattern.
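
The difference between a pattern match and a validated credential can be sketched in a few lines. This assumes the well-known shape of an AWS access key ID (the prefix AKIA followed by 16 uppercase letters or digits); the is_active callback is a hypothetical stand-in for a check with the provider, and none of this is GitHub's actual detection logic.

```python
import re
from typing import Callable, List, Tuple

# Assumed pattern: AWS access key IDs commonly look like "AKIA" + 16 characters.
AWS_ACCESS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def classify_secrets(text: str, is_active: Callable[[str], bool]) -> List[Tuple[str, str]]:
    """Find candidate key IDs and label each by whether the provider confirms it.

    `is_active` is a hypothetical callback; a real check would ask the provider.
    """
    findings = []
    for candidate in AWS_ACCESS_KEY_ID.findall(text):
        status = "validated" if is_active(candidate) else "pattern match only"
        findings.append((candidate, status))
    return findings


sample = 'aws_access_key_id = "AKIAIOSFODNN7EXAMPLE"'
# The documentation-style example key is not a live credential, so the
# stand-in validator reports it as inactive.
print(classify_secrets(sample, is_active=lambda key: False))
```

A finding that comes back as validated is the one to treat as urgent, while a pattern-only match may be an expired or example key.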
            
            
            
              And so as we take a look at this and we're thinking about the
            
            
            
              number of blocked credentials in a year, think about the impact this can have to
            
            
            
              your organization. I'm sure your security team would love that. And in
            
            
            
              addition, you can also add custom patterns that you can see there in the background.
            
            
            
              You can turn on push protection. So as someone commits, don't even let them
            
            
            
              commit, they're going to see this message here instead that says, hey, I see a
            
            
            
              secret in your code based on this custom
            
            
            
              pattern or based on our standardized patterns that we see. You might
            
            
            
              internally, for example, have your own implementation of a token and you can codify those
            
            
            
              patterns across the organization and include them. But better yet,
            
            
            
              if you're following GitHub Universe, we saw that GitHub Copilot, which is
            
            
            
              basically finding its integration to everything we do in GitHub,
            
            
            
              has the ability to auto detect passwords based on the context and
            
            
            
              information around it. So that's exciting to see that being even more effective
            
            
            
              for detecting credentials even without custom patterns in place. So that's
            
            
            
              great, but let's dive into code scanning. Secret scanning is a no brainer.
            
            
            
              Turn that on. If you have a license, there's no reason not to. But code
            
            
            
              scanning has a lot more interesting architecture and details
            
            
            
              that we need to think about. First of all, recognize that with code scanning it
            
            
            
              allows me to include a number of tools. And so you can see here,
            
            
            
              first thing it says is, well, what tools would you like to turn on that
            
            
            
              can contribute to the code scanning of detecting anomalies
            
            
            
              and coding errors? So first is the first class citizen of CodeQL.
            
            
            
              CodeQL was a purchased product, or I should say an acquisition by GitHub.
            
            
            
              Originally the product came from Semmle, and now they've integrated that capability,
            
            
            
              first class, with an integrated CLI that can upload
            
            
            
              directly to the code scanning capability here. So you
            
            
            
              can go ahead and hit the setup option. And this setup option here is going
            
            
            
              to create a GitHub action for you essentially, that has this ready to go
            
            
            
              that can execute on your repository. And of course you can explore other workflows
            
            
            
              and pull those up. And we'll just shelve the idea of CodeQL here
            
            
            
              for a second now, and we'll talk about the interface that code scanning provides
            
            
            
              that any tool can contribute to. First, here is the
            
            
            
              interface. It looks a lot like Dependabot. In fact, you'll see when I go
            
            
            
              to the security tab and I scroll down to the Dependabot section for
            
            
            
              vulnerability alerts, right below that is code scanning. And you also
            
            
            
              see there's a secret scanning section. So it's all very nicely outlined on where
            
            
            
              you find your alerts on different components. And here under code scanning,
            
            
            
              then you get the same classic view that GitHub provides.
            
            
            
              Here's a list of the different warnings or critical items or even notes
            
            
            
              that we've detected related to your code specifically.
            
            
            
              Drilling into one of those then gives you the nice view that you can see
            
            
            
              exactly what happened. In this case, it's calling out a generic catch clause,
            
            
            
              indicating that you probably should be more specific in your exceptions and not just
            
            
            
              grab that. And of course you still have your workflow on the right. You can
            
            
            
              see there where you can dismiss a particular code scanning item and say, I'm not
            
            
            
              going to fix this, or this is actually just used in tests,
            
            
            
              it's not production code, so I'm not going to worry about it. And that information
            
            
            
              again is just part of the workflow that tracks. So you can see who and
            
            
            
              the reasoning why they might dismiss something with a bit of a description.
            
            
            
              And what makes code scanning so great? Not just the centralization
            
            
            
              of it, but the fact that it executes on your pull requests.
            
            
            
              And so when you're configuring code scanning in the security section, you're going to have
            
            
            
              this option to say, what's your pull request check failure? Do I
            
            
            
              want to fail pull requests if code scanning detects an error? Probably,
            
            
            
              I think so. The best thing that we can do is to bring this left
            
            
            
              as far as we can, meaning for engineers and developers, the best experience is
            
            
            
              I'm submitting a pull request. I'm going to have other people look at and make
            
            
            
              comments on the pull request. Why not have code scanning automatically do that
            
            
            
              as well, and reject or fail the status check?
            
            
            
              That's exactly what I'm doing. That's the zone I'm working in. And so we
            
            
            
              can configure the level of failure that we want. We can also configure
            
            
            
              a status check here to actually bubble up as a first class citizen.
            
            
            
              So you can see that check and see whether it's passing or failing.
            
            
            
              But the best part about code scanning on pull requests
            
            
            
              is that it actually creates an annotation on your code as well. So just like
            
            
            
              any other reviewer, you get that right on your code, only for
            
            
            
              the code you changed. You're not actually going to see this for all errors in
            
            
            
              your system; that wouldn't make it easy for you to get a pull request in.
            
            
            
              You need kind of a baseline to start from. But code scanning by default will
            
            
            
              only block you if you're introducing a
            
            
            
              net new item in the code that you've changed.
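
One way to picture that "only what you changed" behavior is a filter over alerts by the lines a pull request touched. The alert fields, rule names, and file paths below are hypothetical, and this is only a sketch of the idea, not how GitHub computes it.

```python
from typing import Dict, List, Set

# Hypothetical alert shape: a rule name, a file path, and a line number.
Alert = Dict[str, object]


def blocking_alerts(alerts: List[Alert], changed_lines: Dict[str, Set[int]]) -> List[Alert]:
    """Keep only alerts that sit on a line changed by the pull request."""
    return [
        alert
        for alert in alerts
        if alert["line"] in changed_lines.get(alert["path"], set())
    ]


alerts = [
    {"rule": "useless-local-variable", "path": "src/orders.py", "line": 42},
    {"rule": "generic-catch-clause", "path": "src/orders.py", "line": 310},  # older code
    {"rule": "hardcoded-credential", "path": "src/legacy.py", "line": 7},    # untouched file
]
# Lines touched by the pull request, per file (made-up values).
changed = {"src/orders.py": {40, 41, 42, 43}}

for alert in blocking_alerts(alerts, changed):
    print(alert["rule"], alert["path"], alert["line"])
```

Only the first alert is on a changed line, so it is the only one that would show up as an annotation in this sketch; the two pre-existing ones stay in the baseline.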
            
            
            
              And so in this case, here's a warning saying I have a useless local variable
            
            
            
              and I've also configured it to give me code warnings. I don't just care
            
            
            
              about security related information, give me some obvious things like unused
            
            
            
              variables, because I can just clean up my code too.
            
            
            
              Once you've worked with it in a pull request like this, it's so nice that
            
            
            
              this takes away some of that manual effort that maybe an
            
            
            
              individual contributor would have come in and reviewed this and called out some of those
            
            
            
              things. I can have all those obvious things fixed and all the
            
            
            
              security problems fixed before a reviewer even gets to my code.
            
            
            
              And so in my mind, I love what Mike Lyman from Synopsys says.
            
            
            
              He says it makes no more sense to write code without code scanning tools
            
            
            
              than it does to write a paper without spell check. Just like we're all using
            
            
            
              AI now to help us as well. The difference with something like AI
            
            
            
              and Copilot is, it still has the
            
            
            
              potential to write security problems too, because it's trained based on our code
            
            
            
              bases. So you're going to want to continue to scan all of
            
            
            
              your code, no matter where it was generated or who created it.
            
            
            
              And so for me, this is fantastic. Correlating alerts from different
            
            
            
              tools is labor intensive with many false positives. But now
            
            
            
              if I can shift this left as far as possible to the pull request workflow,
            
            
            
              this is a huge key in ensuring that these things are fixed before they
            
            
            
              even get introduced. And on top of that, with GitHub Copilot
            
            
            
              and where it's going to take us, they've introduced the ability to auto fix,
            
            
            
              meaning that right on the pull request now, when I have something like a useless
            
            
            
              assignment to a variable, I can just hit the auto fix button and have it just clean
            
            
            
              that up for me and just make me one step faster to some
            
            
            
              of those tedious things that are maybe obvious. But as we
            
            
            
              dive in more to this idea of what is code scanning and what is CodeQL,
            
            
            
              it might not be entirely separated for you yet. And so I
            
            
            
              want to just discuss the differences and where those barriers are a little
            
            
            
              bit. Code scanning is the framework, right? It sits on GitHub. It acts
            
            
            
              as a user interface that we can interact with that provides alerts and capabilities
            
            
            
              that are tracking across the GitHub ecosystem. And you as an engineer,
            
            
            
              a developer, an operator, interact with those, whether at a specific repo or
            
            
            
              at an aggregated level in your organization. But CodeQL
            
            
            
              and the rest of these tools sit outside of that. We choose when
            
            
            
              we want to run CodeQL, formerly from Semmle,
            
            
            
              or any of these other great tools that are out there, whether you're using Sonatype
            
            
            
              or 42Crunch or Checkmarx, all of them can also contribute
            
            
            
              and upload information to code scanning, meaning that now I can begin
            
            
            
              to pick and choose and use CodeQL for code scanning,
            
            
            
              but I can use 42Crunch to also submit security analysis on
            
            
            
              an OpenAPI design. Or I can use another one of these providers
            
            
            
              to submit information to code scanning about
            
            
            
              infrastructure as code related concerns. So you can explore just
            
            
            
              a ton of those other options. When I took this screenshot, there were 67.
            
            
            
              I'm sure there's a lot more now, but essentially we get code
            
            
            
              security analysis, and that's given to us from CodeQL. That's free.
            
            
            
              We get code quality analysis, meaning I've enabled queries that are not just for
            
            
            
              security, but also those unused local variables and the other gotchas that I want
            
            
            
              to call out. It is database driven. So CodeQL
            
            
            
              is specifically going to create a database and index all your code locally, and then
            
            
            
              you'll fire queries against it. That's how it operates. But the
            
            
            
              queries that it runs are also open source queries that you can find on GitHub
            
            
            
              today. You can take a look at and understand completely what kind of things
            
            
            
              it's searching for in the code, and you're going to find that CodeQL is pretty
            
            
            
              well adopted across a ton of languages in the GitHub ecosystem, and these are definitely
            
            
            
              all the core languages that we use at SPS commerce, since that makes a lot
            
            
            
              of sense. But the key is that whatever tool you're
            
            
            
              using, each of them is going to kind of execute differently. And you'll have to
            
            
            
              investigate and explore that and figure out how you're going to then upload information
            
            
            
              into the code scanning framework. Here's how it works for CodeQL specifically.
            
            
            
              You'll have a GitHub repository, you'll have a
            
            
            
              database create option. So you're going to call codeql database create.
            
            
            
              You can say, here's my language, here is the database that
            
            
            
              I want to create, and it's going to go and index against a
            
            
            
              repository that you give it. And you can specify other
            
            
            
              custom build commands that you want or many other overrides here.
            
            
            
              But it's going to look at the CodeQL query packs and CodeQL queries that exist
            
            
            
              on GitHub today. CodeQL, it's going to create that database
            
            
            
              and then we're going to specify the query packs that we want to use here.
            
            
            
              It's a .qls query suite file and the database that was created, and
            
            
            
              we're going to say create a SARIF file from this. So now it's basically taking
            
            
            
              the database, taking the queries and executing all those commands.
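The two CLI steps described above can be sketched roughly as follows. This is a minimal illustration, not the exact commands from the slides: the database path, the build command, and the suite name `java-security-and-quality.qls` are assumed values, and the helper function names are hypothetical. The sketch only builds and prints the command lines; running them requires the CodeQL CLI on your PATH.

```python
import shlex
from typing import List, Optional


def database_create_cmd(db_path: str, language: str, source_root: str,
                        build_command: Optional[str] = None) -> List[str]:
    """Build the `codeql database create` command line (not executed here)."""
    cmd = ["codeql", "database", "create", db_path,
           f"--language={language}", f"--source-root={source_root}"]
    if build_command:
        # Compiled languages may need an explicit build command override.
        cmd.append(f"--command={build_command}")
    return cmd


def database_analyze_cmd(db_path: str, query_suite: str, sarif_path: str) -> List[str]:
    """Build the `codeql database analyze` command line that writes a SARIF file."""
    return ["codeql", "database", "analyze", db_path, query_suite,
            "--format=sarif-latest", f"--output={sarif_path}"]


if __name__ == "__main__":
    # All values below are assumptions chosen for illustration.
    create = database_create_cmd("codeql-db", "java", ".", "mvn clean package")
    analyze = database_analyze_cmd("codeql-db", "java-security-and-quality.qls",
                                   "results.sarif")
    for cmd in (create, analyze):
        print(shlex.join(cmd))
    # To actually run a step: subprocess.run(cmd, check=True)
```

Keeping the commands as lists makes it easy to hand them to `subprocess.run` in a build script without shell quoting problems.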
            
            
            
              The output of that then is a SARIF file. And a
            
            
            
              SARIF file, if you're not familiar with that, uses the Static Analysis Results Interchange
            
            
            
              format. It streamlines how static analysis tools share results. So it's
            
            
            
              a generic JSON schema essentially. And so you can
            
            
            
              follow that schema by creating your own tools and uploading to code scanning, or using
            
            
            
              many of the existing tools that can follow that format and upload to it.
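To make the "generic JSON schema" point concrete, here is a minimal sketch of what a custom tool could emit. It assumes SARIF version 2.1.0 and uses a hypothetical tool name, rule ID, and file path; a real tool would usually include more metadata per rule.

```python
import json
from typing import Any, Dict


def make_sarif(tool_name: str, rule_id: str, message: str,
               file_uri: str, start_line: int) -> Dict[str, Any]:
    """Return a minimal SARIF 2.1.0 log containing a single result."""
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{
            # The tool that produced the results, plus the rules it knows about.
            "tool": {"driver": {"name": tool_name, "rules": [{"id": rule_id}]}},
            "results": [{
                "ruleId": rule_id,
                "level": "warning",
                "message": {"text": message},
                "locations": [{
                    "physicalLocation": {
                        "artifactLocation": {"uri": file_uri},
                        "region": {"startLine": start_line},
                    }
                }],
            }],
        }],
    }


if __name__ == "__main__":
    # Hypothetical tool, rule, and file, for illustration only.
    log = make_sarif("my-linter", "custom/empty-if", "Empty if statement.",
                     "src/App.java", 42)
    print(json.dumps(log, indent=2))
```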
            
            
            
              Now, of course, with the tight integration that we see between CodeQL
            
            
            
              and GitHub, the CodeQL CLI comes built in with a codeql
            
            
            
              github upload-results command, which is hitting an API endpoint that I can
            
            
            
              pass the SARIF file to on that particular repository and that's it.
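If you are not using the CodeQL CLI, the same upload can be done against the REST API directly. The sketch below assumes the endpoint in question is GitHub's code scanning SARIF upload endpoint (POST /repos/{owner}/{repo}/code-scanning/sarifs), which expects the SARIF content gzip-compressed and then Base64-encoded. It only prepares the request body; the function name and the sample values are hypothetical, and authentication and the HTTP call itself are left out.

```python
import base64
import gzip
import json
from typing import Dict


def build_upload_payload(sarif_text: str, commit_sha: str, ref: str) -> Dict[str, str]:
    """Prepare the JSON body for a code scanning SARIF upload (no network call)."""
    # The SARIF content is gzip-compressed, then Base64-encoded into a string.
    compressed = gzip.compress(sarif_text.encode("utf-8"))
    return {
        "commit_sha": commit_sha,
        "ref": ref,
        "sarif": base64.b64encode(compressed).decode("ascii"),
    }


if __name__ == "__main__":
    # Placeholder SARIF content, commit SHA, and ref, for illustration only.
    sarif_text = json.dumps({"version": "2.1.0", "runs": []})
    body = build_upload_payload(sarif_text, "0" * 40, "refs/heads/main")
    print(sorted(body.keys()))
```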
            
            
            
              It's submitted to code scanning pretty easily, and you can submit multiple
            
            
            
              configurations to that. So different subdirectories, different tools,
            
            
            
              they can all contribute and create this suite of capabilities that you're now
            
            
            
              analyzing against your code base. You might be asking, what is a CodeQL
            
            
            
              query exactly? I'm no expert on CodeQL queries. I'm still
            
            
            
              learning as well. But think of it as a standard kind of SQL-like
            
            
            
              query language that you can kind of drive, where you're importing libraries,
            
            
            
              you're using a from statement, a where statement, and a select statement. Here's an
            
            
            
              example of how you can really simply find an empty if statement
            
            
            
              and then go ahead and write that as a custom
            
            
            
              query. And so there's lots of tutorials you can find online about that for
            
            
            
              writing custom queries. You can also define custom query packs, meaning I can just
            
            
            
              configure the exact number of queries that I want to use in a YAML file
            
            
            
              and then provide that to the CodeQL CLI as well to really fine tune
            
            
            
              it. And they also come in query suites too. And you can create your
            
            
            
              own suites internally for your organization, whatever makes sense for you, and kind of pull
            
            
            
              those together. There is a VS Code extension that can make that easy,
            
            
            
              but you can see that generally speaking,
            
            
            
              the CodeQL repository itself, where the open source maintained queries
            
            
            
              are, is fairly popular, fairly regular, and is in
            
            
            
              my opinion maintained by lots of great experts. And so I'm
            
            
            
              glad to be able to pull in what they're doing, but also augment it with
            
            
            
              some of the small minute things I might want to add.
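The empty if statement query mentioned earlier could look something like the sketch below. The transcript does not say which language the slide targeted, so this assumes Java and follows the import, from, where, select shape described in the talk; the query ID, the file name, and the helper function are hypothetical. The Python wrapper only writes the query text to a .ql file so it can be dropped into a custom query pack.

```python
from pathlib import Path

# CodeQL query text (QL, assumed to target Java); the metadata values are illustrative.
EMPTY_IF_QUERY = '''\
/**
 * @name Empty if statement
 * @description Finds if statements whose then-branch is an empty block.
 * @kind problem
 * @problem.severity warning
 * @id custom/java/empty-if
 */

import java

from IfStmt ifstmt, BlockStmt block
where
  ifstmt.getThen() = block and
  block.getNumStmt() = 0
select ifstmt, "This if statement has an empty then-branch."
'''


def write_query(directory: str, filename: str = "EmptyIf.ql") -> Path:
    """Write the query text to <directory>/<filename> and return the path."""
    target = Path(directory) / filename
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(EMPTY_IF_QUERY, encoding="utf-8")
    return target


if __name__ == "__main__":
    print(write_query("custom-queries"))
```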
            
            
            
              So advanced security provides a ton of stuff, but there can be a
            
            
            
              high setup cost and that depends. Are you using GitHub Actions? It can be easy
            
            
            
              to set up, but do you have specific dependencies?
            
            
            
              Do you have specific requirements in order to build it that don't need
            
            
            
              to be integrated with it? It will take a little bit of architectural understanding
            
            
            
              in order to put that together, but in some cases it's as simple as running
            
            
            
              the CLI tool, understanding your build command and away you go.
            
            
            
              Dynatrace says 62% of organizations use four or more solutions.
            
            
            
              Well, I'm really glad that this is a simple integrated experience. This is one
            
            
            
              final solution that we can put a lot of backing behind and
            
            
            
              see it in one central pane of glass. It is remote only
            
            
            
              and that's something to consider. A lot of our teams have asked about that: well,
            
            
            
              I want some of that analysis done in my IDE locally,
            
            
            
              and you can see that information in your IDE when
            
            
            
              it pulls it from GitHub and you can see it locally and highlighted in your
            
            
            
              code, but it's not generated locally. It has to be done on the server or
            
            
            
              you have to do it as part of the CodeQL CLI commands.
            
            
            
              And that can take three, four, five minutes. So this is not something
            
            
            
              that is comparable to linting in real time where you'll get those results. It's there,
            
            
            
              and GitHub has indicated that's not their intention either. So you might want
            
            
            
              to look elsewhere for some of those easier linting problems that you're solving.
            
            
            
              And of course, the VS Code extension helps you pull down that information and see
            
            
            
              it. And the pull request workflow is fantastic. We all use that workflow
            
            
            
              at our organization. And if you do too, this is a great place where you can
            
            
            
              put it in organizationally from a governance perspective and begin to rally
            
            
            
              around it. Depending on where you're at and what
            
            
            
              your investment in GitHub is, the cost can be significant, but we've
            
            
            
              found it to be actually significantly lower than some of the other comparables
            
            
            
              and some of the other tools out there that would do something similar. So there's
            
            
            
              a really nice blend of capabilities and getting code scanning and then
            
            
            
              using CodeQL as part of that for free.
            
            
            
              As we said, you can write custom queries, you can bring your own. I'm really
            
            
            
              looking for custom queries that we can write on YAML and JSON and basically
            
            
            
              non-supported languages even, so I can detect other things and other linting warnings
            
            
            
              and other kind of organizational problems in our code bases
            
            
            
              that we're seeing. But the complexity of writing custom queries does take
            
            
            
              a little bit of onboarding experience and knowledge to get started with. So it's not
            
            
            
              the simplest. And in terms of interoperability, it's a
            
            
            
              huge win here. The ecosystem of tools and the standard SARIF format to
            
            
            
              even build your own integration is the win that you're
            
            
            
              looking for, I believe, and this is what we're looking for in terms of building
            
            
            
              our ecosystem of security tools together. So that's all
            
            
            
              the time that we have for today. Thanks for checking out this talk on fortifying
            
            
            
              your codebase with GitHub. I hope these two tools are something you're able
            
            
            
              to take advantage of, especially Dependabot. That one's really easy to get started with.
            
            
            
              Code security is a little bit more involved, but not that
            
            
            
              difficult either, especially if you're already on GitHub Actions.
            
            
            
              And at the end of the day, this comes down to this quote that we
            
            
            
              started with, which is that developers work in rainforests, not planned gardens. And so if
            
            
            
              we can bring the GitHub ecosystem a little bit more to being that planned garden
            
            
            
              for engineers, let's give them that quality of life and let's continue to
            
            
            
              work towards this centralized ecosystem and this single pane of glass.
            
            
            
              So, thanks all, and we'll catch you at another talk.