Transcript
This transcript was autogenerated. To make changes, submit a PR.
At DevOpsDays Eindhoven this year, I suggested an open space about
how DevSecOps is just a band-aid for a bullet wound,
right after a talk about supply chain security tools. I know,
a risky move, and now I'm telling you about it at this DevSecOps conference.
Anyway, I opened with: while we should certainly
scan our code for vulnerabilities, and ideally have those checks
be automated, we should also invest in mitigating some
of the root causes of vulnerabilities creeping into our code bases
through open source use in the first place,
shifting DevSecOps further left,
if you will. And I don't mean SecDevOps.
So one of the people in the room mentioned
just having a mirror of all of the components in use as
a solution, which, congratulations, you're now the
maintainer of a bunch of mirrors. But also,
sometimes vulnerabilities are in our code bases for
many weeks, months or even years before anyone notices.
So new releases bring fixes as well.
In this modern world, we rely on a lot of components to make our stuff
work and make it continue to work. I know you know this
to be true, but I will also bring you some stats. The
2022 OSSRA, the Open Source Security and Risk Analysis report
produced by the Synopsys Cybersecurity Research Center,
examines the results of over 2,400 commercial
code bases, and the audit came back that 97%
of those contained open source software.
Four of the 17 industries that were represented
in this report, computer hardware and semiconductors,
cybersecurity, energy and clean tech, and IoT,
contained open source in 100% of their audited code bases.
The remaining verticals had open source in 93%
to 99% of their code bases.
Large enterprises rely on libraries that are maintained by a single individual who
is in over their head. Sometimes projects are handed over to
other maintainers who don't always have the best of intentions.
Individuals or organizations may restrict the use of their technology
or end-of-life versions of their software, posing real challenges
for organizations that rely on that software. So how
can we contribute to the viability and sustainability of open source?
Hi, my name is Floor, I'm based in the Netherlands. I'm a
staff developer advocate at Aiven.io. We manage your favorite
open source data tools without exploiting the projects or their
maintainers. Previously I worked in developer
relations roles at Grafana Labs and at Microsoft. I'm a
DevOpsDays core member and I organize the DevOpsDays Amsterdam
and DevOpsDays Eindhoven city chapters.
I am a Microsoft MVP for developer technologies and I
organize a bunch of meetups, including but not limited to
Contributing Today, DevRel Salon Amsterdam and
the Amsterdam Ruby meetup.
So what are some of the issues that we see in open source? One of
the issues is relicensing. Projects relicense
in order to avoid free riding, to make sure
that bad people can't use our code to do even more bad, or
to alleviate responsibility. Another issue is
the projects that are maintained by the proverbial
single individual in Nebraska. That's a shout-out to the xkcd
comic that you see on the slide. While curl
is successfully maintained by Daniel Stenberg, mostly on his lonesome, for every
curl there is a Log4j. And with every npm
library that you bring in, you bring in a whole host of npm libraries
and their transitive licenses and
possible vulnerabilities too. Lack of resources
prevents maintainers from spending the time that a project warrants,
given how businesses depend on it globally,
and maintainers can make rash decisions. They're much like other humans in
that way. We've seen maintainers
pull their code to keep it from being used by the likes of ICE,
the US Immigration and Customs Enforcement, or more recently,
to protest Russia's attack on
Ukraine. Are these
the only issues that are plaguing open source?
No, I don't think so. I would love to see open source be
a more inclusive and equitable space,
but for the next 30 minutes or so, let's look at
license changes, maintainer drain, and the rise in supply chain
attacks. In recent years,
we've seen an increase in kinda-open-source licenses.
Let's have a look at some of those licenses. The Commons
Clause aims to restrict commercial free riding on open source code,
especially by cloud service providers who don't give back to the FOSS community.
The Commons Clause conflicts with the FSD, the Free Software Definition, which includes the right to use software
for any purpose, and the OSD, the Open Source Definition,
in that the license shall not restrict any party from selling or giving
away the software. There is a bunch of ambiguous
wording in the Commons Clause, like "value
derived entirely or substantially", because what is
considered substantial? Mongo used the Commons
Clause for a while, as did Redis Labs, which combined it with the
Apache license. So it was a dual license, which is
tricky business anyway, and both moved to non-standard source-available
or cloud-restricted licenses afterwards.
So in Mongo's case, MongoDB moved to SSPL in
2018, which is kind of like GPL but with
restrictions, and it's not approved by the Open Source Initiative,
who are the stewards of the Open Source Definition. SSPL
forces wide copyleft impact on the cloud infrastructure.
Its justification, again, is
this notion that large cloud vendors capture all the value but contribute
nothing back to the community. In this case, it was directed at Amazon
Web Services in particular. Then there is the Redis
Source Available License for certain Redis modules created
by Redis, while core Redis remains under the three-clause
BSD license. The TL;DR is that Redis
Source Available is a license to do all the usual actions,
so use, modify, distribute, copy and sublicense, except
when your application is distributed or made available
as a database product. So that would allow the
community to develop their own applications, but not to distribute them or make
them available for use
as a database product. Because, you guessed it: cloud providers.
With the Elastic License 2.0, then, again you'll find clauses to prevent hosted
or managed service providers from using the project.
It is copyleft like SSPL, but with straightforward
prohibitions. So it prevents using Elastic as part
of a hosted or managed service.
It prevents third parties from obscuring trademarks or
branding, and it can embed license keys to prevent
circumvention, which is very much not an open source thing.
Its impact was that Elasticsearch and Kibana all
got removed from hosted service infrastructures like Azure
and AWS. Then there are some others
like the Timescale TSL,
which basically says no Timescale as a service and no forking,
or the Confluent Community License, with which
you can use, modify and distribute unless that competes
with Confluent's business, which could potentially be
a moving target.
There are also ethical licenses, like the Hippocratic License,
which prohibits the use of software in the violation of internationally
recognized human rights, or the Mi
five, which makes an explicit connection between the license
and a code of conduct. The Ethical
Source Working Group says that over the past 20 years, the open source community
has come to thrive, enjoying wild success and
permanently changing the technology landscape.
But the world has also changed in the past two decades,
and they think it's time for open source to evolve to meet the
magnitude and complexity of today's social,
political and technological challenges.
Open source developers don't seem to have any recourse,
no way to prevent their work from being used by
people to harm others. And that's where that working group is
determined to make a change.
This tweet by a former colleague at Microsoft,
Tierney, hits right in the feels for me.
Tierney is a staff developer advocate at Twilio and
works on code Electron,
OpenJSF and npm. The currently
accepted community understanding of open source as a concept is
fundamentally at odds with the Open Source Definition provided by
the Open Source Initiative, is what Tierney says.
And they go on to say that more specifically, the accepted community understanding
of open source usually includes some level of
humanity: users, community, maintainers.
And it's simply missing from the definition.
If this hits you in the feels too, and you want to learn more about
ethical source in particular, I suggest you check out ethicalsource.dev,
because I won't go into it much, but I think it's wildly interesting.
I know what you're thinking. Open source is not really about licenses,
it's about community sharing, openness,
freedom. Licensing was supposed to be just the instrument,
right? A way to formalize the relationship.
I think the discussion around the impact of cloud-restricted
licenses was an important one to have with
the open source community. But I think we can all
agree that cloud-restricted licenses are not a way to save open
source, because they're taking the code and the project private.
Maybe that's okay for those projects.
Are these licenses really required for the economic sustainability
of a project? Mongo and Elastic argue that yes,
they felt used by cloud infrastructure service providers.
But GNU/Linux is also used commercially by
everyone, and they still have a great community, perhaps because of and not
despite that. MongoDB and Elastic
themselves were large companies in their own right before the license
change. And even taking enforceability out of the picture,
bringing and winning cases of copyright or patent infringement is
actually really hard. Changing to a more restrictive license might
cause companies and community members to walk away,
which could be what is actually detrimental to a project
and the ecosystem. They do prevent free
riding, right? Like, cloud providers have stopped using these services,
but they also push the open source community to create alternatives or to
move to open tools.
So some argue that these projects, Mongo and Elastic,
were never really open source to begin with, but I don't think I agree with
that. I think they brought tremendous value to the community but then
confused open source for their business model and couldn't reconcile
with others making money over their businesses.
Let's look at some more examples.
Lightbend changed Akka's license from Apache 2.0
to the BSL, version 1.1 if you're interested,
which is the Business Source License, and it would
start with Akka 2.7, which was delivered
last October. And with any such change there is talk of
a fork. I've seen people advocating for forks with an aggressive copyleft
license, so that the now proprietary-licensed original can't
make use of bug fixes to the fork.
The question remains how effective this would be, and
whether hurting our fellow developers is anything but really misdirected
anger. Akka can't be replaced.
There are a lot of projects that build on top
of Akka. A disclaimer before we move forward.
I work for a company that is very invested and
involved in driving OpenSearch forward as the open
source alternative to Elasticsearch. When Elastic
released the publication informing about the license change,
a shockwave went through the community. Several players
eventually decided to collaborate and fork Elasticsearch,
including AWS.
Apache Kafka development (Kafka is also a project
in Aiven's portfolio), or rather the decision of what makes it into
that project, is primarily in Confluent's hands.
The single-vendor issue is rather prevalent in open source.
Databricks has a stronghold on Spark. Google and Beam
is a very similar story as well.
Grafana, Loki and Tempo relicensed from Apache 2.0
to AGPL, which is an infectious copyleft license.
Google warns against using AGPL,
saying that the risks heavily outweigh the benefits. The Cloud
Native Computing Foundation, so the CNCF, in response to the
license change of third-party dependencies to AGPL,
encourages projects to either switch to an alternative component,
to freeze the component at the version prior to
the license change, or to seek an exception from the governing board.
Needless to say, they're not big fans.
If you install Electron, you have to add 87
packages, and that means 87 license dependencies.
Every single package is likely to have its own dependencies
as well, and therefore another license that you have to comply
with. As you can imagine, license management
can be done manually, and when done incorrectly,
can result in technical debt. There are
over 300 open source software licenses,
and that list is only growing. However, the good news is
that around 20 licenses account for 80% of all
the commonly used open source in enterprises.
So a deny and allow list of those licenses, together with a
scanning tool already provides a very good starting point in managing
them. What you can use to help track licenses
inside your code is the license auditor tool,
which sends a notification after spotting a potential problem.
There's also a link to a little cheat sheet on this
slide, where you can find out more about what
kinds of different licenses there are.
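The deny and allow list idea can be sketched in a few lines. This is a minimal illustration, not the license auditor tool from the slide: the package names, the contents of both lists and the `check_licenses` helper are made up for the example, and the license names are SPDX-style short identifiers.

```python
# Hypothetical policy: which licenses are fine and which are blocked.
ALLOWED = {"MIT", "Apache-2.0", "BSD-3-Clause", "ISC"}
DENIED = {"AGPL-3.0-only", "SSPL-1.0"}


def check_licenses(dependencies):
    """Sort dependencies into allowed, denied and needs-review buckets.

    `dependencies` maps a package name to its declared license identifier.
    """
    report = {"allowed": [], "denied": [], "review": []}
    for name, license_id in sorted(dependencies.items()):
        if license_id in DENIED:
            report["denied"].append(name)
        elif license_id in ALLOWED:
            report["allowed"].append(name)
        else:
            # On neither list: a human needs to look at it.
            report["review"].append(name)
    return report


# Made-up scan result, as a scanning tool might hand it over.
deps = {
    "tiny-padder": "MIT",
    "example-database-driver": "SSPL-1.0",
    "obscure-helper": "WTFPL",
}
report = check_licenses(deps)
print(report)
```

Anything that lands in the review bucket is where the long tail of the 300-plus licenses shows up, so that is the list a person would need to go through by hand.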
License litigation may end up forcing you to release code under the same license
as the package dependency that you've used. Other potential
problems include being sued for financial
liability by the creator of the component,
getting penalties and restrictions on selling your software until
the compliance is met, or losing reputation and
getting negative press coverage, certainly in more sensitive
industries.
I want to switch gears a little. In 2021,
a Tidelift survey of 400 open source maintainers
found that 46% of maintainers are not paid at
all, and only 26% receive
as much as $1,000 per year for maintenance work.
Over half, 59%, have quit
or considered quitting maintaining a project, and almost half
of the respondents listed lack of financial compensation as
one of their top reasons for disliking being a maintainer.
Open source libraries enable you to move faster,
but if they're poorly maintained, if they're not healthy, they become a
single point of failure. The 2016
example was left-pad. All that left-pad did was
pad out the left-hand side of strings with zeros or spaces.
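To give a feel for how small that functionality is, here is a rough Python equivalent. It is a sketch of the behavior described, not the original JavaScript source, and the `left_pad` name and signature are assumptions for the example; it expects a single-character fill.

```python
def left_pad(text, length, fill=" "):
    """Pad `text` on the left with `fill` until it is `length` characters long."""
    text = str(text)
    missing = length - len(text)
    if missing <= 0:
        # Already long enough: return it unchanged.
        return text
    return fill * missing + text


print(left_pad("42", 5, "0"))
print(left_pad("abc", 2))
```

Python ships the same thing as `str.rjust`, which is part of why a dependency this small can feel invisible until it disappears.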
Still, thousands of projects, including Node
and Babel, relied on it. With left-pad removed
from npm by the maintainer out of spite, these applications and
widely used bits of open source infrastructure were unable to obtain
the dependency and thus fell over during development
and deployment. Left-pad's maintainer felt pushed
into a corner by messaging app Kik's lawyers
over another one of his npm libraries, also called kik.
The lawyers went to the npm admins, claiming brand infringement.
When npm took kik away from the developer, he was furious
and then unpublished all of his npm-managed dependencies.
The maintainer later said that the situation made him realize that npm
is someone's private land, where corporate is more powerful than the
people. What happened to fix
the Internet? Which is really not hyperbole. Laurie Voss, who is the CTO
and co-founder of npm, took the unprecedented step
of restoring the unpublished library.
npm forcibly resurrected that particular
version to make sure that everyone's stuff kept running.
Maybe if the left-pad maintainer had had access to representation,
maybe by a foundation, the left-pad incident could have been prevented.
This maintainer had over 200 libraries
to his name. We need to give individuals
incentives for staying in open source and maintaining the software we've come to
rely on. For better or worse.
Seth Vargo, after discovering a contract between software automation
company Chef and ICE, deleted his code, and in
doing so, more or less discontinued Chef's services.
It's a temporary thing, for sure. The nature of open source means that
we can just roll back to an archived previous version,
and legally there is nothing that Seth can do. He licensed
his code as open source. So Seth
claims that his code lived in a personal repository
on GitHub and under a personal namespace on RubyGems,
but they were actually created in a time when Seth
was still an employee of Chef. But then again,
no OSI license or employment agreement
requires Seth to continue to maintain code
on his personal accounts. They were conflating
code ownership with code stewardship, is what Seth
said, and he added to that that
he has some very specific instructions
in his will on how to deal with the code that he
owns when he dies. So he basically said
that if he had died that day, the same thing would have happened.
That kind of makes you think, doesn't it?
Another example, then. The GitHub project colors.js,
simply known as colors on the npm repository,
has scored over 3.3 billion downloads throughout its
lifetime, and has over 19,000 projects that depend on it.
Similarly, faker.js exists on npm as
faker and has been retrieved 272
million times from the npm repository,
and has over 2,500 dependents.
Both projects are developed and maintained by the same author.
The immense download rate of these two components
can be attributed to the basic but essential functionality
that they provide to JavaScript developers. Colors lets
you print colorful text messages on the console, whereas Faker
helps developers generate fake data for their applications for
testing and staging purposes.
The hijacked colors version trapped applications in
an infinite loop, printing "liberty, liberty,
liberty", followed by some gibberish.
The developer himself introduced that infinite loop in colors,
thereby sabotaging its functionality, and purged functional
code from the faker package in version 6.6.6,
which, I mean, really, the version number should have given it away.
It's likely that this stunt relates back to November
2020, when the developer explicitly expressed an
intention of no longer wanting to support big companies
with his free work, and said that businesses
should pay him a fee in
the six-figure area.
Then, mid-March this year, the developer behind the
popular npm package node-ipc released sabotaged versions of
the library in protest of the ongoing war in Ukraine.
Newer versions of the node-ipc package began deleting
all data and overwriting files on developers' machines,
in addition to creating new text files with peace messages.
With over a million weekly downloads, node-ipc is a
prominent package used by major libraries like the Vue.js CLI.
The package appears to have been originally created by the developer
as a means of peaceful protest,
as it mainly added that message of peace on the desktop
of a user installing the package. But then chaos unfolded when
select versions of the node-ipc library
were seen launching a destructive payload to delete all data
by overwriting files of users installing the package,
for users in Russia and Belarus only. This
has been called protestware and is one of the newest versions
of supply chain attacks.
Open source is part of our infrastructure,
products and tooling, and for this reason we need to care
about them like they were our own projects. No company
will leave crucial parts of their in house developed tech stack unmaintained,
so why are we willing to do so for the ones that are open
source? I want you to ask yourself the
following questions. What are the departments or roles
in your company responsible for identifying and mitigating
the impact of license changes? What projects
in your stack do you think may be at risk of posing
a similar challenge as Elasticsearch did?
Who is looking at the health of the software that you rely on?
Who leads research and due diligence of alternatives, so
that when you need to change, it won't be a knee-jerk response?
I'd be remiss if I did not talk about the Log4j or
Log4Shell flaw today: the remote
code execution vulnerability that scored ten out of ten on
the CVSS, which is the Common Vulnerability Scoring System.
The impact of Log4j was and is huge.
Even if you scanned your code base and you thought that you could relax after
confirming that you don't use Log4j anywhere,
you were not in the clear yet, right? Like, you could be depending on a
library that in turn uses Log4j and still be exposed.
Security firm Snyk actually found that 60% of Java
applications rely on the library indirectly,
versus the 40% that rely on it directly.
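The difference between direct and indirect exposure is easy to see on a toy dependency graph. The sketch below is an assumption-laden illustration: the graph, the package names other than the target, and the `find_paths` helper are invented for the example, and a real project would get this graph from its build tool rather than a hand-written dictionary.

```python
def find_paths(graph, root, target):
    """Return every dependency path that leads from `root` to `target`."""
    paths = []
    stack = [[root]]
    while stack:
        path = stack.pop()
        for dep in graph.get(path[-1], []):
            if dep in path:
                continue  # skip dependency cycles
            new_path = path + [dep]
            if dep == target:
                paths.append(new_path)
            else:
                stack.append(new_path)
    return paths


# Made-up graph: each package maps to the packages it depends on directly.
graph = {
    "my-app": ["web-framework", "metrics-client"],
    "web-framework": ["logging-facade"],
    "logging-facade": ["log4j-core"],
    "metrics-client": [],
}

direct = "log4j-core" in graph["my-app"]
paths = find_paths(graph, "my-app", "log4j-core")
print("direct dependency:", direct)
for path in paths:
    print(" -> ".join(path))
```

In this toy graph a check of the direct dependencies comes back clean, while walking the whole graph still finds a path to the vulnerable library.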
Log4j has been developed by the Apache Software
Foundation, and that certainly signals health, right? And yet
this happened. We sometimes talk
about open source being inherently secure.
The code is out in the open. If something is broken,
people will see it and they will fix it. But then how
do you explain Log4j or Heartbleed or
the Struts vulnerability? The many-eyes argument
is very shaky. It needs the right people to look in the
right places, and security is hard. I find that
developers are looking at open source for solutions, not problems.
Installing an npm package introduces implicit
trust in 79 third-party packages and
39 maintainers, which creates a very large attack
surface. Take 150 dependencies,
which is kind of typical for a Java project.
Those dependencies maybe release a new version ten
times a year, which is an average amount per year. That makes
1,500 updates for you to
consider. A software bill of materials, or an
SBOM, is a list of all of the open source and third-party components
present in a code base. An SBOM also lists the
licenses that govern those components, the versions
of the components that are in use, and their patch status,
and that allows security teams to quickly identify any
associated security or license risk.
The concept of a bill of materials derives from
manufacturing, where a bill of materials is an inventory
detailing all items that are included in a product.
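As a rough picture of what such an inventory holds, here is a minimal sketch. It is not the SPDX format or the output of any real tool: the `Component` fields, the helper functions and the example packages are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Component:
    name: str
    version: str
    license_id: str
    # Version that fixes a known vulnerability, or None if none is known.
    fixed_in: Optional[str] = None


def needs_patch(sbom):
    """Names of components with a known fix that is not the version in use."""
    return [c.name for c in sbom if c.fixed_in and c.fixed_in != c.version]


def licenses_in_use(sbom):
    """Distinct license identifiers across the whole inventory."""
    return sorted({c.license_id for c in sbom})


# Made-up inventory for a small code base.
sbom = [
    Component("example-logger", "2.14.1", "Apache-2.0", fixed_in="2.17.1"),
    Component("example-parser", "1.3.0", "MIT"),
    Component("example-orm", "5.0.2", "LGPL-2.1-only", fixed_in="5.0.2"),
]

print(needs_patch(sbom))
print(licenses_in_use(sbom))
```

With the inventory in one place, the security question (what needs patching) and the license question (what are we bound by) both become simple lookups.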
Sounds like a bunch of work. It's a good thing, then, that there
are software composition analysis tools, or SCA
tools, that can help you do the job, like the one by
Thomas Steenbergen, the OSS Review Toolkit, ORT,
which includes Software Package Data Exchange, SPDX,
which is an open standard for software bills of materials.
SPDX allows the expression of components, licenses,
copyrights, security references, and other metadata that is related to
your software.
It is the perception that open source is risky,
but actually 98% of projects have safe versions
available. Most vulnerabilities
are patched before they're even disclosed.
Log4j was patched in 15 days, and
the patch was made available by the time that the CVE went
public. We're just not really good at managing
open source. When asked,
68% of IT leaders are confident that they're not using
vulnerable versions. But that same number,
the same 68% of applications use
a component with a known vulnerability. So what's
the deal?
Produced in partnership with the Harvard Laboratory for
Innovation Science and the Open Source
Security Foundation, or the OpenSSF, Census II is the
second investigation into the widespread use of free
and open source software, and aggregates data from
over half a million observations of FOSS
libraries used in production applications
at thousands of companies. It aims to shed light on
the most commonly used FOSS packages
at the application library level. Such insights will
help identify critical open source packages
to allow for resource prioritization and address security
issues in widely used software.
And one of the outcomes that just
emphasizes what I've just said is that much of the widely used FOSS is developed
by only a handful of contributors, and the OpenSSF sees this
as a problem as well.
A related project is the Open Source Project Criticality
Score, maintained by the members of the OpenSSF
Securing Critical Projects Working Group.
The goals are to generate a criticality score for
every open source project, to create a list of critical projects that the open
source community leans on, and to use this
data to proactively improve the security posture of these
critical projects. A project's criticality
score defines the influence and the importance of the project, and it's
a number between zero, least critical, and one,
most critical, based on an algorithm by Rob Pike.
Using the parameters on this slide, the tool derives the criticality score
for any open source project.
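The shape of that calculation can be sketched as a weighted, log-scaled average: each signal S with weight a and threshold T contributes a * log(1 + S) / log(1 + max(S, T)), and the sum is divided by the total weight. The slide's parameters are not in this transcript, so the signals, weights and thresholds below are invented placeholders, and the sketch assumes positive weights and thresholds.

```python
import math


def criticality_score(signals):
    """Weighted, log-scaled average of project signals, between 0 and 1.

    `signals` is a list of (value, weight, threshold) tuples. A signal at or
    above its threshold contributes its full weight.
    """
    total_weight = sum(weight for _, weight, _ in signals)
    weighted = sum(
        weight * math.log(1 + value) / math.log(1 + max(value, threshold))
        for value, weight, threshold in signals
    )
    return weighted / total_weight


# Hypothetical signals: (value, weight, threshold).
example = [
    (50, 2, 5000),    # e.g. contributor count, weighted double
    (1200, 1, 1000),  # e.g. dependents, already past its threshold
]
score = criticality_score(example)
print(round(score, 3))
```

The logarithm keeps one huge number from dominating, and the threshold caps how much any single signal can add.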
So what can we do? We can contribute.
We can contribute with our time, with code, with code reviews,
documentation, but only when it's appropriate. Sometimes there
is no lack of community contributions, but the maintainer just
lacks the time to look at those. Maybe,
probably, and definitely unfortunately, they can't work on the project full
time at their place of work. So be an excellent open
source citizen. Written communication
is very hard, which is why it's extra important to invest in
your communication skills. Be patient and
be graceful, even when your otherwise excellent PR
doesn't make it into the project. And advocate on behalf
of FOSS inside your company. Even with
open source being the standard and so widely used,
many people don't know about its abundance or the rules of
engagement. Your organization could maybe
sponsor a project. Many companies run a
FOSS fund of some sort, and they regularly award sums of
money to projects. A caveat here is
that some projects are neither ready nor equipped to deal with
large sums of money and don't know how to distribute it between
core contributors. And also, money coming
in one month and then nothing the next can be as bad as
no money coming in at all. When your
company uses open source software (spoiler: the answer is
yes, we discussed this), here is how you can help it stay
secure. Manage your third-party licensing exposure
just like your security exposure. Be careful
with automation and third-party library or package updates
during CI/CD. If you're extending OSS
functionality, maybe prefer plugins over downstream modifications.
And when you're vetting a new component, here's maybe what to
consider: which license are they using, and
who is behind the project? What is their governance
policy? And sometimes
vendor distributions or software-as-a-service
solutions can really shield you.
I recognize the irony in saying that after we discussed
how cloud providers aren't always seen as the good guys.
But sometimes, if a managed service
is considered a well-meaning citizen in the open source world,
that is the sweet spot where you want to be. Look into tools like
the criticality score, the OpenSSF Scorecard, maybe Sonatype's
tooling, which is open source as well, so you can help improve
those tools as well. And
navigate participation in open source well by
abiding by the principles of authentic participation,
which were derived at the Sustain Summit 2020
event in Brussels, Belgium. There, Duane O'Brien
and others facilitated discussion groups loosely focused on corporate
accountability in the context of open source.
The principles are these. It starts
early. This came out of the discussions about organizations showing
up with mature, fully baked contributions over which
the community had no input. It puts the
community first. This reflects the
consensus that when an organization and the community want different things,
the community prevails. It starts
with listening. This is a reflection of comments
about companies showing up to a project
that they had no context on whatsoever and telling them all the things that
they did wrong. It has transparent motivations.
Without a shared understanding of the motivations, it's impossible
to resolve differences. So there
should be no hidden motives. It enforces respectful
behavior. Participants agree
to adhere to community-installed codes of conduct,
and organizations commit to holding participants on their
side accountable for their behavior.
And it ends gracefully. That means that there's no sudden
withdrawal of resources without notification and
a contingency or an exit plan. There should be clear
documentation that will allow the community to pick up projects
when a company decides to withdraw their support.
Further, as a company, when you're hiring maintainers,
make sure that they're empowered to balance internal and external feature
needs and please don't change scope and
strategy on them with every new fiscal year.
Foundations like the Apache Foundation, the Linux Foundation and
the CNCF, maybe the Eclipse Foundation, act as
stewards for the open source projects in their care and in their incubation
pipeline. Supporting these organizations to do
more great work is definitely a way to leave open
source a better place than you found it.
As a foundation supporting member, you gain a seat,
or maybe multiple seats at the table. So make sure that you make that seat
count and send the right people there.
Discussions around the sustainability of open source are hard,
but they're necessary. Free and open source software is ubiquitous.
It's omnipresent. Yet we're still struggling to live with
open source in a healthy, safe and productive way.
We need better support systems to avoid maintainer burnout and
to keep regressions from creeping into our supply chain.
We need to spend less time firefighting and more
time nurturing our open source software supply
chain. Thank you.