A journey of the thousand binaries

Abstract

Dependencies, trust, security, and how to invest our development resources have become critical in recent years, with several huge security and trust scandals in the development industry. Every dependency that we add is potentially a source of headaches. How do we face this challenge?

Summary

Dependencies are collections containing high-quality, tested code that provides functionality. For example, we have frameworks, libraries, packages, modules, and resources. Dependency managers like npm have made it possible for almost trivial functionality to be packaged and published.
Before adding any dependency, we should ask some questions. For example, in terms of design, is the documentation clear? How long has the code been actively maintained? And how many people are working on that package?
OWASP is a nonprofit that works to improve the security of software. I'm part of the OpenSSF, and we have just released the Concise Guide for Evaluating Open Source Software. More than 70% of the software out there contains open source, so its importance is huge.
Artifactory sits at the heart of every DevOps workflow. With Xray as an integration, we can verify the dependencies in some cases. Security is not only about CVEs; security is about the information that we generate. There are three tools to start using today.

Transcript

This transcript was autogenerated. To make changes, submit a PR.

Hi, thank you for being here. This is "A journey of the thousand binaries", and I'm super happy to be here with you. My name is Ixchel Ruiz. I'm from Mexico and I live in Switzerland. I'm a Java Champion and I work for JFrog. I collaborate with some organizations like the Open Source Security Foundation, the CNCF, and the CDF, and I'm an open source enthusiast. I have also collaborated on some books, the latest one being DevOps Tools for Java Developers. You can get it there.
The reason why I like this particular session is that we all know the world runs on software, and there is a key piece when I say that. That key piece is dependencies. For example, from the 2022 Open Source Security and Risk Analysis report by Synopsys, we learned that 78% of the code in codebases was open source. And not only that: 85% of codebases contained open source that was more than four years out of date. So this is important. Our code, the artifacts that we're publishing out there, are a combination of several other kinds of software, other dependencies. So the first question we should be asking is: what kind of dependencies are we using, and for what reason?
"Dependencies" is a broad name, a broad label; it can mean so many things. So, for example, I'm going to read you two statements and you will tell me if they are true or false. "Dependencies are collections containing high-quality, tested code that provides functionality, and that functionality requires significant expertise to develop." This is true. "Dependency managers like npm have made it possible for almost trivial functionality to be packaged and published." That is also true. So we have noticed that there is a spectrum, from really complex libraries, applications, or frameworks to trivial functionality. So what are the different types of dependencies? For example, we have frameworks, libraries, packages, modules, and resources.
Resources can be a collection of files, for example templates, media (audio, video, or images), plain text files, or blobs that need to be included for the application to execute correctly. Modules are sets of methods or functions that provide self-contained functionality. A module usually has an interface that explicitly and abstractly specifies both the functionality it provides and the functionality it depends on. And usually, when we find an interface, there has to be at least one implementation of that functionality, so sometimes for us it's kind of a black box. Next, packages. Packages are collections of modules that in general hold the same type of functionality. For example, if you are thinking about JavaScript, a package is usually a directory, and it can contain a file that describes the metadata about the package. By aggregating modules this way, the authors make it easy to add or remove a specific set of functionality.
And finally, a library. A library is a collection of related functionality defined in several packages. How a library usually behaves is that you call a specific function; it executes whatever functionality it says it provides and returns control (not the code, the control) back to you, to your application. A framework, on the other hand, has a more abstract design and has even more behavior built in, so it provides its own workflow. What usually happens is that these frameworks have hooks, specific places where you can add your own code. They are executing a specific workflow, at some point they call your logic, and then they complete the functionality and continue whatever workflow they are supposed to be doing. So why am I talking about these two things? Why am I making a distinction, for example, between libraries and frameworks?
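The library-versus-framework distinction described above comes down to who holds control. The following is only a minimal sketch to make that concrete: `slugify`, `MiniFramework`, and `on_request` are hypothetical names invented for this illustration and do not come from the talk or from any real library.

```python
# Library style: the application is in charge and calls the dependency.
def slugify(title: str) -> str:
    """Hypothetical library function: does one job and returns control."""
    return "-".join(title.lower().split())


# Framework style: the dependency is in charge and calls the application.
class MiniFramework:
    """Hypothetical framework that owns the workflow and exposes one hook."""

    def __init__(self) -> None:
        self._handlers = []

    def on_request(self, handler):
        """Hook: register application code that the workflow will call."""
        self._handlers.append(handler)
        return handler

    def run(self, request: str) -> list:
        # The framework decides the order of the steps, not the application.
        log = ["framework: parse request"]
        for handler in self._handlers:
            log.append("app: " + handler(request))
        log.append("framework: send response")
        return log


app = MiniFramework()


@app.on_request
def handle(request: str) -> str:
    # Application logic plugged into the hook; it calls the library itself.
    return slugify(request)


if __name__ == "__main__":
    print(slugify("Hello Dependency World"))
    for line in app.run("Hello Dependency World"):
        print(line)
```

With the library, the application decides when `slugify` runs. With the framework, the application only registers `handle` and the framework decides when to call it, which is why adopting a framework usually shapes more of a project than adopting a library does.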
Well, frameworks usually have a lot of lines of code, they are usually opinionated, there is a roadmap, they have versioning, they have a license specified, and most likely (this is the industry standard) they have a healthy number of tests that continually check that the behavior is still what it is supposed to be. And the differences are important not only for understanding what type of dependencies we're adding to our projects. For example, if you are in the JavaScript world, you have probably read about or used Angular and React. React defines itself as a JavaScript library for building user interfaces, while Angular describes itself as the modern web development platform. So you can see the vision of these two development teams for their projects. It actually matters, because it sets some expectations.
Remember when I was telling you about that statement, that it is actually true that we are able to package very interesting functionality, even if it's very specific? Well, this is a collection: a GitHub repository that lists the smallest npm packages. And you can see there are projects like is-even, and a lot of people use them. So why am I telling you that we should care about the different types of dependencies? Because any dependency that you add to your project is providing some functionality; otherwise you wouldn't have added it in the first place. But every single one of them has an associated cost. What do I mean by that? At some point we need to update them, or migrate them, or remove them. So it is important that we know what we are adding and what we have there, because in a way our functionality, our final artifacts, whatever we are developing or providing, depends on all of this.
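To give a sense of how small "almost trivial functionality" can be, here is a sketch of everything an "is even" style dependency has to do. It is written for this page as an illustration and is not the source code of any published package.

```python
def is_even(n: int) -> bool:
    """Return True when n is divisible by two (illustrative stand-in only)."""
    return n % 2 == 0


if __name__ == "__main__":
    print([n for n in range(6) if is_even(n)])
```

Whether one line like this is worth an external dependency, with the update, migration, and removal costs discussed above, is exactly the trade-off the talk asks you to weigh.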
So think: if something goes wrong with one of your dependencies or your code, you have to make a decision at the end of the day. Should I patch it? Should I change it? Should we refactor the entire code and use a different one? These are decisions that will affect the quality, the performance, and also the budget of your team, because every single one of these operations requires effort, and sometimes things go wrong. So we have to be very cautious. When we add a dependency, we are outsourcing the development of that code (designing, writing, testing, debugging, maintaining) to someone else, and it's usually an unknown programmer. And I'm not talking down open source, because problems in software can appear in both closed source and open source. But it is important that we understand that, according to several reports out there, more than 70% of the code in the applications that we are releasing is not our own code; they include a huge amount of open source software. So I'm going to talk about open source, because this is where I want us developers to have more ideas, a better understanding of it, and in a way a mechanism to give back.
So if you are thinking about your dependencies, and you should be thinking about dependencies, I recommend that you read this particular paper, "Surviving Software Dependencies". In this paper the author, Russ Cox, talks about the cost of adopting a bad dependency. For example, if something goes wrong with a dependency, as I said, we will have a problem, either because we need to fix it directly by patching the dependency or because we need to refactor the code. And that's only talking about when we need to fix it.
I'm not talking about what happens during the outage of our service or our product, or the lives we affect, or the quality of life that we decrease when software doesn't behave as it should. So the expected cost of adopting a bad dependency is the sum, over all possible bad outcomes, of the cost of each bad outcome multiplied by its probability of happening. That is the risk.
So before adding any dependency, we should ask some questions. For example, in terms of design: is the documentation clear? Does the API have a clear design? If the authors can explain the package's API and its design well in the documentation, that increases the likelihood that they have also explained the implementation well to the computer, in the source code. Code quality: is the code well written? Read some of it. Does it look like the authors have been careful, conscientious, and consistent? Does it look like code you would want to debug on a Friday night before a release, just because there is some issue with this particular dependency that happens to be critical for your application? That's a good question.
Testing: does the code have tests? Can you run them? Do they pass? Do they establish that the code's basic functionality is correct? And how important are the tests? I cannot stress that enough, because they are not only documenting what is happening, or what should be happening, and preventing us from introducing errors if we decide to refactor; they are also telling our consumers that we are serious about keeping the functionality correct and documented. Bug fixing: do they have an issue tracker? Are there many open bug reports? How long have they been open? Are there many fixed bugs? Have any bugs been fixed recently? You want to know all of this if it's a critical dependency.
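The expected-cost definition given earlier in this passage (the sum, over all bad outcomes, of cost multiplied by probability) can be sketched in a few lines. The outcome names, costs, probabilities, and the choice of engineer-days as the unit are all made up for illustration; they are not figures from the talk or from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BadOutcome:
    name: str
    cost: float         # assumed unit: engineer-days
    probability: float  # chance of the outcome over the period considered


def expected_cost(outcomes):
    """Sum over all bad outcomes of cost multiplied by probability."""
    for outcome in outcomes:
        if not 0.0 <= outcome.probability <= 1.0:
            raise ValueError(f"probability out of range for {outcome.name!r}")
    return sum(o.cost * o.probability for o in outcomes)


# Illustrative numbers only.
outcomes = [
    BadOutcome("security patch needed", cost=2, probability=0.5),
    BadOutcome("abandoned, must migrate", cost=20, probability=0.1),
    BadOutcome("breaking upgrade", cost=4, probability=0.25),
]

if __name__ == "__main__":
    print(expected_cost(outcomes))
```

The questions about design, code quality, testing, and bug fixing are a way of estimating the probabilities; how critical the dependency is to your application largely determines the costs.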
Again, these questions are good for all dependencies, but they are more important for your critical dependencies. Maintenance: how long has the code been actively maintained? Look at the package's commit history. Packages that have been actively maintained for an extended amount of time are more likely to continue to be maintained. And how many people are working on that package? I'm not going to tell you that a single developer is a bad idea, because that's not true. I have met very committed open source developers who are so passionate that they are the driving force, the life force of their projects, and they will continue to be that for years to come. And they are more productive and more responsive than a whole team. So that can happen. But at least it will give you an idea of how active the project is. And another word of caution: maybe the functionality is so well defined, the scope so well set, that there is no need to add more functionality. Maybe the only work is upgrading dependencies, or verifying correctness, or making sure there are no security vulnerabilities. But it's important that you see that the authors at least care about those two things.
Usage: how many people are using it? Companies? Single users? Sometimes the users are behind closed doors, so you don't know. But if you look at the different forums, mailing lists, Stack Overflow, or other sites like that where users interact, a larger community gives you more possibilities to ask questions and hopefully receive answers. Then licensing and security. First, security: do they seem to be robust against malicious input? Do you know whether they are signing the packages or in any other way complying with the security responsibilities of open source maintainers?
For example, at this point many should be using two-factor authentication, should be running some tools for checking dependency versions, et cetera. So these questions should also be asked by the authors of the software that you are thinking about depending on. Licensing: do they have a defined license? Is it acceptable for your project or your product? It is shocking to see how many GitHub projects have no clear license. And, as I said, any library, except the very core ones that don't depend on anything, may have dependencies of its own. So we have to be very careful, for example, about our transitive dependencies, and at least keep in mind whether the authors of the libraries that we are using have these concerns in mind.
So now let's talk about tools for security and infrastructure, like checking the best practices of your code. One of them is, obviously for me, Xray. It is fully automated binary analysis that supports all major package types and sees into all layers of the dependencies, for example packages, container images, and files. There is another project that you could have a look at, from OWASP: Dependency-Track. OWASP is a nonprofit foundation that works to improve the security of software. The other one that I can totally recommend: I'm part of the OpenSSF, and we have just released the Concise Guide for Evaluating Open Source Software. It is a one-page document, and we're very proud of that. You can read it, and it has some of the considerations that we already talked about and some other, more generic ones, as a kind of checklist to know whether the dependency that you are intending to use actually covers some of these concerns. And if you are an open source author, I urge you, I ask you, to use, for example, Scorecards.
This is an automated tool that assesses a number of important heuristics, or checks, associated with software security and assigns a score between zero and ten. You can use these scores to understand specific areas to improve in order to strengthen the security posture of your project. And when you run it, it also gathers all this information and sends it back to the foundation, which gives us a clearer understanding of what is happening out there so that we can try to help open source projects improve. Again, more than 70% of the software out there contains open source, so its importance is huge. Another one: this is the Concise Guide for Developing More Secure Software. Again, it's a one-page checklist that at least gets you started with good practices if you are developing software.
At this point, with the type of dependencies, how much you depend on them, and the quality of those dependencies, you should have at least a better idea of where you are: what the map looks like, what your coordinates are, and what the risk is of changing, updating, or removing them. Now let's talk again about other kinds of tools. One of the tools that is always on my mind is Artifactory in combination with Xray. Why is that? Because Artifactory sits at the heart of every DevOps workflow. Not only do we support many different package types, so we match whatever you are consuming or publishing, but with Xray as an integration we can verify the dependencies, in some cases (for example, with Maven) down to the binary level. Or, with Docker images, we can analyze the different parts, like the base image, the related packages, the ZIP files, et cetera. And we will provide.
I cannot stress enough the number of super intelligent people that we have at JFrog on the security team working to predict attacks, to verify attacks, and to verify whether a specific CVE actually applies to you. Because if we are looking only at the CVEs, or going only by the risk level... well, I will never say don't do that. But it is important to also know that not all CVEs are exploitable, and not every single person is at risk from using a specific tool or dependency. This is also super important: you may be using a library that doesn't have any CVE reported, that is secure, but if your configuration is not correct or not complete, you may be using it incorrectly. So security is not only about CVEs; security is about the information that we generate. Is this new version of the library actually running correctly in our context? How does it perform? After upgrading this library, are we still delivering on our software requirements specifications or our service-level specifications? And all this kind of evaluation can reside in a single source of truth. So that, for me, is the amazing part.
And the three tools that I ask you to please start using today, because you don't have to do a lot: you can go right now to JFrog and start for free. Get your free-tier instance and start using Frogbot, for example. Frogbot is a Git bot that will advise you. It creates reports whenever you add a new dependency or change existing dependencies in your codebase. It triggers a security scan and tells you either "you're perfect, go ahead" or "oh my goodness, there is a problem". And the best part of "there is a problem" is that you can define watches. What does that mean? You can have different filters when you are asking about the security vulnerabilities or security issues of a specific dependency, and it will only tell you what is important according to a specific policy that you define.
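The idea of filtering findings through a policy, so that only what matters to your project is reported, can be sketched independently of any product. This is not the Xray or Frogbot API: `Finding`, `Policy`, `apply_policy`, the severity scale, and the `EXAMPLE-...` identifiers are all hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass

# Assumed four-level severity scale for the sketch.
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3, "critical": 4}


@dataclass(frozen=True)
class Finding:
    package: str
    vulnerability_id: str
    severity: str
    fix_available: bool


@dataclass(frozen=True)
class Policy:
    min_severity: str = "high"
    only_fixable: bool = False
    ignored_packages: frozenset = frozenset()


def apply_policy(findings, policy):
    """Return only the findings that the policy says deserve attention."""
    threshold = SEVERITY_RANK[policy.min_severity]
    selected = []
    for finding in findings:
        if finding.package in policy.ignored_packages:
            continue
        if SEVERITY_RANK[finding.severity] < threshold:
            continue
        if policy.only_fixable and not finding.fix_available:
            continue
        selected.append(finding)
    return selected


# Placeholder data, not real vulnerability records.
findings = [
    Finding("lib-a", "EXAMPLE-0001", "low", True),
    Finding("lib-b", "EXAMPLE-0002", "critical", True),
    Finding("lib-c", "EXAMPLE-0003", "high", False),
    Finding("lib-d", "EXAMPLE-0004", "critical", True),
]

if __name__ == "__main__":
    policy = Policy(only_fixable=True, ignored_packages=frozenset({"lib-d"}))
    for finding in apply_policy(findings, policy):
        print(finding.package, finding.vulnerability_id, finding.severity)
```

A stricter or looser `Policy` changes how many notifications reach the team without changing the scan itself, which is the point about priorities made in the talk.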
I mean, it is important to react to notifications that there is something problematic here, that's for sure. But imagine if you receive a thousand of these notifications. When everything is a priority, nothing is a priority. So we need to step back and set real priorities based on policies, based on your specific needs. That's why Frogbot, for me, is so amazing. And if you're already using Docker containers, I totally recommend that you have a look at the Docker Desktop extension, which does almost the same thing that I explained with Frogbot. You select a Docker image and it generates a security report that tells you about the different vulnerabilities and shows which package is affected. If we have a specific report, or even the different ones from different public databases, we will show it.
This is shift left to the maximum: this is our IDE extension. In this case I'm showing IntelliJ IDEA, because that's the one that I use. This is a Maven project, a very basic Maven project, and as soon as you are typing your dependencies in your POM file, as soon as I modify that POM file and add a specific library with a version that has a vulnerability, in that moment I get the notification. It also tells me which version of the package is affected and whether there is a fixed version. And if there is more information about the specific vulnerability, it will also let me know. There are other open source tools by JFrog, for example npm-secure-installer, package_checker, and npm_issues_statistic. I hope you enjoyed it, and happy coding. Thank you for being here. I'm Ixchel Ruiz. Please let me know if you have any comments. Bye.

Conf42 DevSecOps 2022 - Online

December 01 2022

Ixchel Ruiz

DA/DX @ JFrog
