Transcript
Hi, I'm Munawar Hafiz, chief technologist of OpenRefactory.
As you can see from my backdrop, I'm in a hotel room right now.
I'm traveling, and I'm attending the Open Source Summit North America
in Seattle. At this point, in between the conference talks,
I'm taking some time off to record this video. My talk
explores the static analysis tool landscape for the Go language:
specifically, what kinds of bugs these tools find, what kinds of bugs they miss,
and how OpenRefactory's Intelligent Code Repair,
or ICR, is bridging the security tool gap
that is left by the other tools.
We are all aware of the challenges of keeping applications secure.
The average cost of a security failure now is about $10
million, yet over 150
security vulnerabilities are reported every week
in the National Vulnerability Database. Current static analysis
and dynamic analysis tools fall short of detecting
these bugs in time so that these breaches
can be avoided. In fact, research data
has shown that existing static analysis tools
detect only about 14% of
all reported bugs; the remaining 86%
of the bugs are found by pure luck.
This is pretty staggering. At the same time, the static
analysis tools operate at about 50% to 90% false
positive rate. Imagine a 90% false positive
rate: of ten bugs reported by a static analysis
tool, nine are not bugs at
all, and somebody on your team needs
to painstakingly go through each report to
triage it and identify which bugs your team actually needs
to fix in the first place. Finally, static analysis
tools only detect bugs; they fall short
of fixing them. Bug fixing remains
a manual process, and oftentimes it
is very error prone. There have been several incidents
where a bug fix was not able to fully address
the security problem, and in some cases it
also introduced other bugs.
The most glaring example in this space is that of the Log4Shell vulnerability.
When the Log4Shell vulnerability was reported, there were four different
fixes released in a span of four days,
and each of the earlier ones was unable to
fully address the problem. This,
unfortunately, is very common in the industry, and because of
this, teams miss deadlines,
organizations have budget overruns, and companies
have recurring security breaches.
OpenRefactory's Intelligent Code Repair,
or ICR, addresses these challenges head on.
ICR finds bugs that other tools miss.
It does that with dramatically low false positives, and it
is the first tool in this space that can automatically
synthesize fixes for about half of the bugs that
it detects. Because of this,
developers get their life back and companies can deliver
high quality software on time and within budget.
ICR is available for Java, Python, and Go
programmers. An instance of ICR harbors
many different fixers. Think of ICR as
a band of superheroes, where each fixer is a separate
superhero with its own power:
it can detect, and occasionally fix, a different kind
of security, reliability, or compliance bug.
ICR fixers are built upon three
key technologies. We use deep static analysis
based on path analysis to detect the
bugs with high accuracy. We use machine learning
and also generative AI for different
cases to improve the results.
Specifically, generative AI is good for summarization
and generation. We use the summarization
capability to understand the programming context better to
reduce the false positives, and we
also use the generation capability to further improve upon
the bug fix generation capability of ICR.
Finally, ICR is based on code refactoring technologies.
I've been part of the code rewriting community for over 20 years,
and I've built code rewriting tools in over ten different languages.
OpenRefactory brings together that expertise in
code rewriting to deliver the fixes for you.
I've already mentioned that the results from ICR are highly precise.
On benchmarks, we have shown as low as 3%
false positives in ICR results, whereas typical
bug detection tools operate at a greater than 80% false
positive rate. Therefore, ICR allows
a development team to operate at premium release velocity without
compromising the quality. It improves the productivity of
the developers by about 11% or more,
and the developers are now freed from the mundane task
of bug fixing and can focus on the more
creative work of building features.
Now I'll share a video made by Charlie Bedard, who is our chief operating
officer, and he will explain the ways you can use ICR
and the basic workflow of using it. ICR is available
in Java, Python, and Go, so the video is generic,
but it gives you the idea of how to use it.
Now let me give you a brief overview of how ICR
works. ICR is a suite of Docker containers that
control access to your code repositories,
analyze your projects, and allow you to review and
accept the results. The navigator is accessed via
a web browser and it authenticates each user and provides
a view of the user's repositories.
Repositories are accessible through many popular version control systems
such as GitHub, GitLab, and Bitbucket.
The analysis engines do the heavy lifting of scanning your projects
and detecting the bugs. There is one analysis engine for each language that
ICR supports. The reviewer is used to examine
the output of the analysis engines. You can examine
the bugs that ICR has detected and accept or reject them.
If a fix is offered and accepted, the reviewer can also apply those
fixes directly to the source code in your project. To analyze
your code, you first log into the navigator and
select the project you wish to analyze. The navigator
then fetches the code from your VCS. You then
select the branch that you want to analyze and start the analysis.
The navigator scans the project and presents you with the language options.
After selecting the language, the appropriate analysis engine is
invoked. When analysis completes,
the results are placed into a database for later review.
Let's look at that next. To review the results of an analysis,
the user connects again to the navigator. For this step,
the analysis engines are not used. Their work is complete.
However, we do need to start up a reviewer session to go over the
detected issues. Once the reviewer is connected, the user
can browse the detected bugs and choose which ones to
accept or reject. Once a bug is accepted and if there's
a synthesized fix for it, ICR can push the
fix to a temporary branch for further review by the development team.
Using ICR is a simple and effective way to find critical security vulnerabilities
in your projects. Let's see these same steps from the
user's point of view. After logging into
ICR, the navigator presents the user with the repositories
available through the chosen VCS. In our example
here, we are using GitHub.
We scroll through the projects and choose the one we want to analyze.
I have chosen an open source Python project named Manim.
We first clone the project to pull the source code into ICR.
Then we choose the branch that we want to examine. In this
example, we'll analyze the main branch. Click on the analyze
button to begin the analysis. ICR detects
that this is a Python project and asks us to choose the version
of the Python library to be used for this analysis.
We do that and continue. Analysis begins.
As the analysis proceeds, you can log out of ICR and
return later to see the results, or you can monitor the progress
from the monitor page. Some analyses may take quite
a while, but this one is quite quick so we can see the analysis proceeding.
When it completes, we can look at how the reviewer is used to
go over the results. With the analysis complete,
we return to the navigator to begin the process of reviewing the
results. Again, we scroll down to the Manim project and
select its main branch.
We now see that the review button is available to us.
Clicking on that brings us to the analysis summary page
where we can see the history of all the analyses previously
performed on this branch. In this example, we have the
analysis that was just completed. Clicking on the
show results button brings us into the reviewer. At the
top of the page we see a summary of the results.
The bugs themselves are presented below that and are prioritized
with the highest severity issues shown first. At the left of
the window are some filters. We use the category
filter to narrow down the particular class of bugs we want to review.
For example, we can look at just the bugs in the
improper method call category. There is an issue titled
IMC-M-1. It is a
medium severity issue. The description gives details
about the issue, including links to external sources
that describe the importance of the issue.
By clicking on show diff, we can see the specific issue
highlighted in red with the recommended fix shown in green.
If we want to accept the fix, all we need to do is click on
the accept button. In this fashion,
you can review all of the identified bugs and choose which ones to act
upon. To read more about how to
use ICR for your projects, you may want to read our
online ICR user guide, or feel free to contact us
at info@openrefactory.com. Now,
let us go through a demo of running a
few static analysis tools on a
single application and identifying what kinds of bugs are detected by
the different static analysis tools. For this, I will
be using three static analysis tools. One is
an open source tool called
gosec. The second is the ubiquitous SonarCloud,
and the third one is ICR, or Intelligent Code
Repair. I'll be running all of these on
an application that has been built by Red Hat.
It's called Service Provider Integration Operator.
The application is not very big. It is about 20,000 lines of Go
code, plus code in other languages. For this talk,
I'll be concentrating only on what these tools find
in the Go code. There's one other thing.
In order to make things interesting, I'm not running the
tools on the current version of Service Provider Integration
Operator. Instead, I went back in time
and identified a version which had two
critical problems, one a cross-site
scripting issue and the other an open
redirect issue, both of which were present and fixed later.
I'm going back to a version that has both
of these vulnerabilities. So the challenge for the
tools is to identify these two bugs,
if possible. That particular vulnerable
version of the Service Provider Integration Operator is
stored in a separate public repository called service provider
vulnerable. I'll be running the tools on that application.
First, let's run gosec. gosec is run from the command
line and is pretty straightforward. You just specify the output
location where you want to store the results and run it.
Note that here we are running on the current directory,
but you can instead choose any subdirectory to
run on. You can also pick and choose the set
of rules that you want to run. Here we're running
with all the rules that gosec provides.
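As a rough sketch, a run along these lines looks something like the following; the flags and rule IDs are illustrative examples of gosec's standard options, not the exact command used in the demo:

```sh
# Scan the current directory and everything below it with all rules enabled
gosec ./...

# Write the findings to a JSON report instead of printing them to the console
gosec -fmt=json -out=results.json ./...

# Or restrict the scan to a chosen subset of rules (example rule IDs)
gosec -include=G107,G203 ./...
```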
When we look into the results, we see that gosec does not find anything
at all. This is okay in terms of false positives, because
at least we do not have to triage through tons of
false positives thrown by gosec.
However, we are not getting any improvement in terms of security.
Specifically, we know that there are two critical security problems
here. So the fact that gosec does not
find anything means that your risk remains unmitigated.
Now let's look into SonarCloud. We are running
SonarCloud on the same application.
I've already run the analysis and am now showing the results.
SonarCloud identifies 33 security bugs,
six reliability bugs, and 257
maintainability bugs in the code.
That's quite a handful. Let's look into them.
But before we do that, let's first filter the SonarCloud
results so that we concentrate only on the results
identified in the Go code. If we
do that, our number of bugs
comes down to 206 issues.
But, as we can see, we no longer find any security
issues.
We are only finding maintainability-related issues,
some of them high severity, some of them medium severity, and some
of them low severity.
We see that the results reported by SonarCloud were
of three kinds. Some were about
duplicated string literals, some were about
the cyclomatic complexity of some functions being too high,
and some flagged empty functions. None of them appear
to be very serious at all.
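For context, these three categories typically cover patterns like the ones in this small illustrative Go sketch, which is not code from the analyzed project:

```go
package example

// Duplicated string literal: the same literal appears in several places
// instead of being pulled into a named constant.
func queryActive() string  { return "SELECT id FROM tokens" }
func queryExpired() string { return "SELECT id FROM tokens" }

// High cyclomatic complexity: one function with many independent branches.
func classify(n int) string {
	switch {
	case n < 0:
		return "negative"
	case n == 0:
		return "zero"
	case n < 10:
		return "small"
	default:
		return "large"
	}
}

// Empty function: a body that silently does nothing.
func reset() {}
```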
In fact, we are still searching for a tool that
can identify the critical security problems that
are lurking in the source code we are analyzing.
Now let's look at ICR. In this
case, you access ICR through the web portal.
You select GitHub as your version control system,
and once you're connected with GitHub you'll see all of your repositories.
Here, I ran on the vulnerable service provider
application beforehand, so let me open
the application. You see that there are several branches
on which we can run separately, but in this case
we already ran on the master branch and we
have the results available. Once we open the
results, we see that ICR identifies only two
bugs, and voila, it has identified
both of the security issues with zero other
unnecessary reports, which means that there are
zero false positives. If we
look at the output of the first bug, the open redirect,
we see that ICR provides a trace of
where the taint entered the application and
where it is exercised so that the open redirect
can happen. In this case, ICR was not able to synthesize a fix.
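To make that bug class concrete, here is an illustrative sketch of what an open redirect typically looks like in a Go HTTP handler; this is a generic example, not the actual code from the Red Hat project:

```go
package main

import "net/http"

// redirectHandler demonstrates an open redirect: the target URL comes directly
// from an attacker-controllable query parameter and is passed to http.Redirect
// without being validated against an allow-list of trusted destinations.
func redirectHandler(w http.ResponseWriter, r *http.Request) {
	target := r.URL.Query().Get("redirect_to")    // tainted input enters here
	http.Redirect(w, r, target, http.StatusFound) // taint is exercised here
}

func main() {
	http.HandleFunc("/login/callback", redirectHandler)
	_ = http.ListenAndServe(":8080", nil)
}
```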
Now, if we look at the cross-site
scripting issue, we see that the trace is longer because the
code flows through multiple paths.
ICR provides the detailed trace.
What is interesting is that ICR was also able to
synthesize a fix. In this case, the fix
sanitizes the input coming from an untrusted user
with a well-known library called bluemonday.
ICR not only inserted the library calls, but it
also imported the library module.
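Sketched roughly, a fix of that shape looks like the following; this is an illustrative example of sanitizing with bluemonday, not the exact patch ICR generated:

```go
package main

import (
	"fmt"
	"net/http"

	"github.com/microcosm-cc/bluemonday"
)

// commentHandler echoes user input back into an HTML response. Writing the raw
// value would be a cross-site scripting bug; sanitizing it with bluemonday
// strips scripts and other dangerous markup first.
func commentHandler(w http.ResponseWriter, r *http.Request) {
	comment := r.URL.Query().Get("comment") // untrusted user input

	policy := bluemonday.UGCPolicy() // policy suited to user-generated content
	safe := policy.Sanitize(comment) // remove unsafe HTML before output

	fmt.Fprintf(w, "<p>%s</p>", safe)
}
```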
So we have seen that ICR identified only
two bugs, both of them the ones we were looking for
in this case, and it was able to synthesize a fix for one
of them.
Okay, as you can see, with the magic of video
editing I have changed my clothes and come back home
from the conference. So let's summarize what we have
covered in this talk. I've looked into the problems that
developers face when they're looking for static analysis tool
support. I've also done a test drive
of multiple static analysis tools on a
known application created by a reputable vendor,
and I've shown that only ICR, or Intelligent Code Repair,
was able to deliver on the promises made
in the first few slides.
It was able to find the hard-to-find bugs, it did
that with no false positives in this case, and it was
able to synthesize a fix for one of the high severity cases.
Consider using ICR as opposed to another tool
that was reporting over 200 issues, all of
them false positives. If your developers take
on average about ten minutes to look into each one of
the reports and decide whether it is a true positive
or not, then it's going to take them about 2,000
minutes (206 issues at ten minutes each) to triage the bugs. Now, you could
tell me that the problems that were identified were
pretty benign and it doesn't take ten minutes to triage them.
Well, in this case that was true, but on another application it
may not be. The bottom line is that the other
tools were not able to find any critical bugs,
and instead they were just generating
a lot of false positives. And there you have
it. If you have any questions, feel free to reach us at info@openrefactory.com.
We'd be happy to let you do a test drive yourself
and see the benefits that ICR provides.
Thanks a lot for your time, and I'm looking forward to chatting with you
afterwards.