Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hi, thank you for being here. A journey of the thousand binaries.
I'm super happy to be here with you.
My name is Ixchel Ruiz, I'm from Mexico,
I live in Switzerland. I'm a Java Champion. I work
for JFrog. I collaborate in some organizations like the Open
Source Security Foundation, the CNCF, the CDF,
and I'm an open source enthusiast. I have also
collaborated on some books, the latest one being
DevOps Tools for Java Developers. You can get it there.
And the reason why I like this particular session is that
we all know the world runs on software, and
there is a key piece when I
say that. And that key piece is
dependencies. For example, according to the 2022
Open Source Security and Risk Analysis report by Synopsys,
we learned that 78% of the code in
code bases was open source. And not only that,
85% of them contained open source that was more than
four years out of date. So this is
important. Our code, the artifacts that
we're publishing out there, are a combination
of several other kinds of software,
other dependencies. So the first question
we should be asking is: what type of dependencies,
what kind of dependencies, are we using, and for what
reason? And "dependencies" is a broad
name, a broad label; it can mean so many things.
So for example, I'm going to read to you two
statements and you will tell me if they are true or
false. Dependencies are collections containing
high-quality, tested code that provides functionality,
and that functionality requires significant
expertise to develop. This is true.
Dependency managers like npm have made it possible
for almost trivial functionality to be packaged
and published. That is also true.
So we have noticed that there is a spectrum, from
really complex libraries, applications,
or frameworks to trivial functionality
out there. So what are the different types
of dependencies? For example,
we have frameworks, libraries, packages, modules, and resources.
Resources can be a collection of files,
for example templates, media (audio, video, or images),
plain text files, or blobs that need to
be included by the application for it to execute correctly.
And, for example, modules are
sets of methods or functions that provide
self-contained functionality. A module usually has an interface
that explicitly and abstractly specifies both the
functionality it provides and the functionality it depends
on. And usually when we find an interface,
there has to be at least one implementation of
that functionality. So sometimes for
us it's kind of a black box.
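To make the idea of a module interface concrete, here is a minimal Python sketch of a module that states both what it provides and what it depends on, with one implementation behind the interface. All names here (`Storage`, `InMemoryStorage`, `VisitCounter`) are hypothetical and chosen only for illustration; they do not come from the talk.

```python
from typing import Protocol


class Storage(Protocol):
    """What the module depends on: anything that can save and load a number."""

    def save(self, key: str, value: int) -> None: ...

    def load(self, key: str) -> int: ...


class InMemoryStorage:
    """One concrete implementation of the Storage interface."""

    def __init__(self) -> None:
        self._data: dict[str, int] = {}

    def save(self, key: str, value: int) -> None:
        self._data[key] = value

    def load(self, key: str) -> int:
        # Unknown keys count as zero.
        return self._data.get(key, 0)


class VisitCounter:
    """What the module provides: callers only ever see `record`."""

    def __init__(self, storage: Storage) -> None:
        self._storage = storage

    def record(self, page: str) -> int:
        count = self._storage.load(page) + 1
        self._storage.save(page, count)
        return count


counter = VisitCounter(InMemoryStorage())
counter.record("home")
print(counter.record("home"))
```

From the caller's side, `VisitCounter` is the black box: how the count is stored can change without the caller noticing, as long as the implementation still satisfies `Storage`.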
Next, packages. Packages are
a collection of modules that hold, in general,
the same type of functionality. For example,
if you are thinking about JavaScript, a package is usually a directory, and it can contain
a file that describes the metadata about the package.
And by aggregating this, the authors make it
easy to add or remove a specific set of functionality.
And finally, a library. A library is a
collection of related functionality
defined in several packages. And how
a library usually behaves is that you call
a specific function. It will execute whatever
functionality it says it's providing and return
control back
to you, to your application. On the other hand,
a framework has
a more abstract design and has
even more behavior built in. So it
provides its own workflow. And what
usually happens is that
these frameworks have hooks, specific places
where you can add your own code. So they are
executing a specific workflow and at some point they call
your logic, and then they complete the
functionality and continue whatever workflow
they are supposed to be doing. So why am I talking about
these two things? Why am I making
a distinction, for example, between libraries and frameworks?
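The difference in control flow just described can be sketched in a few lines of Python. This is only an illustration under assumed names: `slugify` stands in for a library function and `TinyFramework` for a framework with a hook; neither refers to a real package.

```python
from typing import Callable

Handler = Callable[[str], str]


# Library style: your code calls a function and gets control back.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())


# Framework style: the framework owns the workflow and calls your hook.
class TinyFramework:
    def __init__(self) -> None:
        self._handlers: dict[str, Handler] = {}

    def route(self, path: str) -> Callable[[Handler], Handler]:
        """Hook point: register user code for a given path."""

        def register(handler: Handler) -> Handler:
            self._handlers[path] = handler
            return handler

        return register

    def handle(self, path: str, body: str) -> str:
        """The framework's fixed workflow: look up, call user logic, wrap up."""
        handler = self._handlers.get(path)
        if handler is None:
            return "404 not found"
        result = handler(body)  # the framework calls your logic here
        return f"200 {result}"  # then it continues its own workflow


app = TinyFramework()


@app.route("/slug")
def make_slug(body: str) -> str:
    # Our hook happens to use the library function.
    return slugify(body)


print(slugify("Hello Dependency World"))
print(app.handle("/slug", "Hello Dependency World"))
print(app.handle("/missing", ""))
```

With the library, our code decides when `slugify` runs. With the framework, we only register `make_slug`, and `handle` decides when, and whether, it is called.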
Well, frameworks usually have a
lot of lines of code, they are usually opinionated,
there is a roadmap, they have versioning,
they have a license specified,
and most likely, since this is
the industry standard, they have a healthy
number of tests that are continually checking that the
behavior is still what it is supposed to be. And the
differences are important not only
to understand what type of dependencies we're adding
to our projects. For example, if you are in the JavaScript world,
you have probably read about or used Angular and React.
React defines itself as a JavaScript library
for building user interfaces, while Angular
describes itself as the modern web development platform.
So you can see the vision of these two
development teams when working on these specific projects.
So it actually matters because it sets some
expectations. Remember when I was telling
you about that statement, that it
is true that we are able to package very
interesting functionality, even if it's very
specific? Well, this is a collection.
It's a GitHub repository that lists the
smallest npm packages,
and there they are.
And you can see there
are projects for checking whether a number is even, and
a lot of people use them.
So why am I telling you
that we should care about the different types of dependencies?
Because any dependency that you add to
your project is providing some functionality,
otherwise you wouldn't have added it in the first place. But every
single one of them will have an associated
cost. What do I mean by that? We need to update
them at some point, or we need to migrate them, or
we need to remove them. So it is important
that we know what we are adding, what
we have there, because in a way our functionality,
our final artifacts, or whatever we are developing or providing
is dependent on all
of this. So consider that if something
goes wrong with one of your dependencies or your code,
you have to make a decision at the end of the day:
should I patch it? Should I change it? Should we
refactor the entire code and use a different one?
These are decisions that will affect
the quality, the performance, and also the budget
of your team, because every single one of these
operations requires effort, and
sometimes things go wrong.
So we have to be very cautious.
For example, when we are adding a dependency,
we are outsourcing the development of
that code (designing, writing, testing, debugging,
maintaining) to someone else, and it's usually
an unknown programmer. And I'm not
talking down open source, because
problems in software can appear
in both closed source and open source. But it is
important that we understand, according to several reports out
there, that more than 70% of the
code, well, of
the applications that we are releasing,
where we are working and writing our code, is not
only our code; they have a huge
amount of open source software included
in there. So I'm going to talk about open source,
because this is where I want us
developers to have more ideas,
a better understanding of
it, and in a way a mechanism
to give back. So if
you are thinking about your dependencies, and you should be thinking about dependencies,
I recommend that you read this particular paper,
Surviving Software Dependencies.
In this paper the author, Russ Cox,
is actually talking about the cost of adopting a
bad dependency. For example, if something goes wrong
with a dependency, well, as I said,
we will have a problem because we need to
fix it, either by directly patching the
dependency or by refactoring the code.
And that's only talking about when we
need to fix it. I'm not talking about what happened
during the outage of our service or our
product, or the lives we affected, or the quality of life
that we decreased, when software doesn't behave
as it should. So the expected
cost of adopting a
bad dependency is the sum,
over all possible bad outcomes, of
the cost of each bad outcome multiplied by
its probability of happening. That is the risk.
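As a small worked example of that expected-cost sum, here is a Python sketch. The outcomes, costs (in engineer-hours), and probabilities below are made-up numbers for illustration only; they are not taken from the paper or from any report.

```python
from dataclasses import dataclass


@dataclass
class BadOutcome:
    description: str
    cost: float         # hypothetical cost, e.g. in engineer-hours
    probability: float  # assumed chance of it happening in the period considered


def expected_cost(outcomes: list[BadOutcome]) -> float:
    """Sum, over all bad outcomes, of cost multiplied by probability."""
    return sum(o.cost * o.probability for o in outcomes)


# Illustrative, invented figures for one candidate dependency.
outcomes = [
    BadOutcome("breaking API change forces a refactor", cost=40.0, probability=0.25),
    BadOutcome("vulnerability needs an urgent patch", cost=16.0, probability=0.5),
    BadOutcome("project abandoned, must be replaced", cost=120.0, probability=0.125),
]

print(expected_cost(outcomes))
```

Here the terms are 40 × 0.25 = 10, 16 × 0.5 = 8, and 120 × 0.125 = 15, so the expected cost is 33 engineer-hours. The hard part in practice is estimating the probabilities, which is what the questions about a dependency's quality are for.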
So before adding any dependency, we should ask
some questions. For example,
in terms of design: is the documentation
clear? Does the API have a clear design? If the authors
can explain the package's API and its design well in
the documentation, that increases the likelihood that
they have explained the implementation well to the computer,
in the source code. Code quality:
is the code well written? Read some of it.
Does it look like the authors have been careful, conscientious, and
consistent? Does it look like code you
would want to debug on a Friday
night before a release, just because
there is some issue with this particular dependency that
happens to be critical for your application?
That's a good question.
Testing: does the code have tests?
Does this library, this dependency, have tests?
Can you run them? Do they pass?
Do they establish that the code's basic functionality is
correct?
And actually, how important
are the tests? I cannot stress
that enough, because they are not only documenting
what is happening, or what should be happening, and preventing
us from introducing errors if we
decide to refactor. They are also telling our consumers
that we are serious about keeping
the functionality correct and documented:
the functionality of the library,
of the artifact, is correct.
Bug fixing: do they have an issue tracker?
Are there many open bug reports?
How long have they been open? Are there many fixed bugs?
Have any bugs been fixed recently?
You want to know all of this if it's a critical dependency.
Again, these questions are good for all dependencies,
but they are more important for your critical dependencies.
Maintenance: how long has the code been actively maintained?
Look at the package's commit
history. How long has it been?
Packages that have been actively maintained for an extended amount
of time are more likely to continue to be maintained.
And how many people are working on that package?
I'm not going to tell you that a single
developer is a bad idea, because that's not true.
I have met very committed
open source developers who are so passionate that they
are the driving force, the life
force of their projects, and they will continue to be
that for years to come. And they are more productive and more responsive
than a whole team. So that can happen. But at least
it will give you an idea of how frequently
the project is worked on. And another word of caution:
maybe the functionality is so well
defined, the scope is so well set,
that there is no need to add more functionality.
Maybe the only activity is upgrading dependencies,
or verifying correctness, or checking that there are no security
vulnerabilities. But it's important that you
see that the authors at least care about
those things.
Usage: how many people are using it?
Companies? Single users? Sometimes the
users are behind closed doors, so you don't
know. But at least look into the different forums,
the mailing lists, Stack Overflow,
or other sites like that where different users
interact. A larger community gives you more possibilities
to ask questions and hopefully receive
answers. And then licensing
and security. First, security: do they
check? Does the code seem
robust against malicious input? Do you know
whether they are signing the packages or in any way complying
with the different security responsibilities of open
source maintainers? For example, at this point many should be
using two-factor authentication, should be running some
tools for checking their dependencies' versions, et cetera,
et cetera. So these questions should also
be asked by the authors of the software that
you are thinking about depending on.
Licensing: do they have a defined license?
Is it acceptable for your project or for your product?
It is shocking to see how many GitHub projects
have no clear license.
And as I said,
any library, except the very core ones
that don't depend on anything, may have dependencies
of its own. So we have to be
very careful, for example, with our transitive dependencies,
or at least check that the authors of the applications
and the libraries that we are using have these
concerns in mind.
So now let's talk about tools, the
ones for security and
infrastructure, like checking the best practices of your code.
One of them is going to be, obviously for me,
Xray. It is fully automated binary analysis, it
supports all major package types, and it sees into all
layers of the dependencies, for example packages, container images,
and files. There is another
project that you should, or you could, have a look at.
It comes from OWASP. This is
Dependency-Track, and this particular foundation
is a nonprofit that works to improve the security of software.
Another one that I can totally recommend: I'm part of
the OpenSSF, and we
have just released the Concise Guide for Evaluating
Open Source Software. This is a one-page document, and we're very proud
of that. It is a one-page document that you
can read, and it
has some of the considerations that we already talked about and some
others that are more generic. It works as
a kind of checklist to know whether the dependency
you are intending to use is
actually covering some
of these concerns. And if you are an
open source author, I urge you,
I ask you, to use, for example, Scorecards.
This is an automated tool that assesses a number of important
heuristics, or checks, associated with software security
and assigns a score between zero and ten.
You can use these scores to understand specific areas to
improve in order to strengthen the security posture
of your project. And when you are running
it, it also gathers all this information and sends it back to the foundation,
the Open Source Security Foundation, and provides us
with information that is meant to
give us a clearer understanding of
what is happening out there, and to help open
source projects improve.
Again, more than 70% of the software out there
has open source, so the importance of it
is huge. Another one,
this is the Concise Guide for Developing More Secure Software.
Again, it is a one-page checklist that at least
will get you going with good practices if you
are developing software. At this point,
knowing the types of dependencies you have,
how much you depend on them, and the quality
of those dependencies, you should have at least a
clearer idea of where you are, what
the map looks like, what your coordinates
are, and what the risk is of changing,
updating, or removing a dependency.
And let's talk again about another kind of
tool. One of the tools that is
always on my mind is
Artifactory in combination with
Xray. And why is that? Because, for example,
well,
Artifactory sits at the heart of every
DevOps workflow. Not
only do we support several different package types, so we match
whatever you are consuming or publishing,
but also, with Xray as an integration, we can
verify the dependencies, in some cases, for example with
Maven, down to the binary level. Or with Docker images,
we can analyze the different parts, like the base image,
the related packages, the ZIP files, et cetera, et cetera.
I cannot stress enough the number of super
intelligent people that we have at JFrog in the security team,
working to predict attacks, to verify
attacks, and to verify whether a specific CVE
actually applies to you. Because if
we are only looking at the CVEs, or only going by the risk level,
well, I will never
say don't do that. But it is important to also know
that not all CVEs are exploitable, and
not every single person is at risk from using
a specific tool or dependency. This is also super
important: you may be using a
library that doesn't have any
CVE reported, so it looks secure,
but if your configuration is
not correct or not complete, you may be
using it incorrectly. So security is
not only about CVEs. Security is about the information that we
generate. For example, is this new version
of the library actually running correctly
in our context? Or performance: after
upgrading this library, are we still delivering on our
software requirements specifications or our
service-level specifications?
And all this kind of evaluation can reside in
a single source of trust. So that
for me is the amazing part.
And the three tools, well, yeah, the three tools
that I ask you to please start using today, because you
don't have to do a lot. You can go right
now to JFrog and start for free.
Get your free-tier instance and start using
Frogbot, for example. Frogbot is a Git bot
that will advise you. It will create reports
whenever you add a new dependency or change existing dependencies
in your code base. It will trigger a
security scan and it will tell you "you're perfect,
go ahead" or "oh my goodness,
there is a problem." And the best part of "there is
a problem" is that you can define watches.
What does that mean? You can have different filters
for when you are asking about the security vulnerabilities
or security issues of a specific dependency,
and it will only tell you what is important
according to a specific policy that
you define. I mean, it is important to
react to notifications that there
is something problematic here, that's for
sure. But imagine if you receive 1000 of
these notifications. When everything is a priority,
nothing is a priority. So we need to
step back and set real priorities
based on policies, based on your specific
needs. So that's why, for me, Frogbot is
so amazing. And if you're already using Docker containers,
I totally recommend that you have a look at the Docker
Desktop extension, which is going to do almost the same thing that I
explained with Frogbot. You
select a Docker image and it will generate a security
report, where it also tells you about the different
vulnerabilities and shows which package is affected.
If we have a specific report that adds to the
ones from the different public databases,
we will show it. This is shift left to
the maximum. And this is our IDE
extension. In this case I'm showing IDEA because that's the one that
I use. As soon as you
are typing your dependencies in your POM file
(this is a Maven project, a very basic Maven
project), as soon as I'm modifying that POM file and
adding a specific library with a version that has
a vulnerability, in that moment I
will get that notification. The same thing again:
it tells me which version of the package is affected
and whether there is a fixed version. And if there is more information
about the specific vulnerability, it will also let me
know. There are other open source tools
by JFrog, for example npm-secure-installer,
package-checker, and npm_issues_statistic.
I hope you enjoyed it, and happy coding. Thank you for being
here. I'm Ixchel Ruiz. Please let me know if you have any comments.
Bye.