Conf42 Internet of Things (IoT) 2023 - Online

Your cloud emits CO2

Abstract

I have all my infra in the cloud, so it does not emit CO2. Yes, it does. And you will need to take care of it. I’ll show you why and how you can measure it, and how you can make your infra carbon-aware.

Summary

  • Carbonifer is a platform for managing, measuring and reducing the carbon emissions of your cloud. The cloud is emitting more CO2 than flights, and these numbers can be around 8% by 2038. Looking at the carbon footprint of a company is also a great way of attracting ESG funds.
  • To know what a company is responsible for, there are three scopes. Scope three is basically everything else: other indirect emissions, value chain emissions. Companies are responsible for defining the perimeter of their scope three.
  • The Green Software Foundation came up with the SCI formula. SCI stands for Software Carbon Intensity. The idea is to calculate how many grams of CO2 per hour your software is going to emit. The easiest way is to use some probes, but if you are running on the public cloud, you don't have access to the hardware.
  • All major hyperscalers, public clouds like AWS, GCP and Microsoft Azure, are claiming to be net zero or even carbon negative by 2030. So basically you would like to use some FinOps good practices to be GreenOps.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello everyone, welcome to my session. First, I want to thank Conf42 for this amazing conference. Now let's talk about the carbon emissions of your cloud infrastructure. I'm Olivier Bierlaire. I'm usually a software engineer at Elastic in the reducing team, but I'm taking a break to work on Carbonifer. Carbonifer is a platform for managing, measuring and reducing the carbon emissions of your cloud.
Here is a sentence I hear quite often: "I have all my infrastructure in the cloud, so I do not emit CO2." Yes, we all know that whatever you put in the cloud disappears. It's just magic. It doesn't have a physical existence. But still, we know that it's running on servers in data centers, and those are powered by electricity. But electricity is clean, right? No, not really.
So let's talk about electricity. First, let's look at some numbers. We consider that data centers are responsible for 20% to 25% of the electricity used by the digital sector, which is 1.3% of the total electricity used worldwide. And if we look at some projections, we consider that by 2030 the digital sector is going to be responsible for 20% of the worldwide electricity demand, and among that, the data centers are going to be responsible for a third. If we talk in terms of greenhouse gas emissions, we consider that the cloud is emitting more than 300 megatons of CO2 yearly, which is around the same carbon footprint as countries like France or the UK. We have recently overtaken civil aviation: basically, the cloud is emitting more CO2 than flights, and these numbers can be around 8% by 2038.
That's huge, and that's concerning as a human being, but it's also concerning as a company. Why? Because there are some regulations and laws that are coming. Looking at the carbon footprint of a company is also a great way of attracting ESG funds. Those funds invest money in companies that show some sustainability, and obviously, as a company, you want to attract them. It's also a great tool for recruitment and staff retention.
You are more keen to join a company if it's showing some ethics and sustainability. And there are some studies showing that it's also great for customer retention, for the same reason. And that could also lead to some cost reductions.
So, in order to know what the company is responsible for, there are three scopes. Scope one is your direct emissions: basically, if you are burning fuel, or if you have a diesel generator, for example. Scope two is your indirect emissions related to the purchase of energy, so basically that's your electricity bill. Scope three is basically everything else: other indirect emissions, value chain emissions. So it's basically your business travel, the purchase of servers, and even the purchase of cloud services or your subscription to a cloud provider. So basically, if you are running your infrastructure on a private cloud, the electricity used by this private cloud is going to be accounted in your scope two, but the purchase of those servers is going to be accounted in your scope three. If you are running only on a public cloud, everything is going to be accounted as scope three.
So what should we put in scope three? For example, Mozilla had a very wide definition of scope three in their 2019 sustainability report. They counted the use of their products, and they discovered that 98% of their carbon footprint comes from the usage of their products: basically, people using Firefox. And there are even some people who are thinking of counting things like Spotify in scope three. If you need music streaming while you are working, while you are coding, should we account that in the scope three of your company? Because you need it, so you have emitted CO2 for your work at your company. So where should we stop? Basically, companies are responsible for defining the perimeter of their scope three.
In terms of regulations, there are some directives and laws coming in the next years.
For example, in Europe, you have the CSRD. CSRD stands for Corporate Sustainability Reporting Directive. It's going to be enforced in 2024 for large companies and in two years for listed SMEs, and it's going to cover scope one and two and even scope three. The SFDR is for financial companies, so they will need to account, as their scope three, the carbon footprint of the companies they are financing. And in the US you have the SEC climate disclosure rule, which encompasses scope one and two, and even scope three as optional. And it's going to be enforced pretty soon.
So here are some terms you have probably heard. If we are talking about CO2 offsetting, that means we are doing some compensation or removal. Removal is basically direct air capture, removing CO2 from the atmosphere. Compensation is giving money to some project which is helping offset CO2, basically planting trees. If we are talking about elimination, that obviously means we are not emitting CO2: we are eliminating the source of CO2.
So if a company is talking about being carbon neutral, that means they are not necessarily looking at their emissions but just focusing on offsetting. So they are emitting the same, and they are just compensating or neutralizing all their emissions by, again, planting trees. If we are talking about being net zero, that means we are focusing on elimination: we are reducing the CO2 that we emit and offsetting the rest.
If a company is talking about being 100% renewable, that means they are using only renewable electricity. But is it "powered by" or "matched by"? "Powered by" means you are sure that the electricity coming into the data centers comes from renewable energy sources. "Matched by" means, for example, that they are producing twice their need with solar panels, but during the night they still need to buy electricity on the market. So in that case it's "matched by", which is a little bit less interesting.
So now, how can we measure the carbon footprint of your cloud infrastructure? Here is how it works. Electricity is produced in different ways: for example, coal or gas plants, or nuclear plants. Side note: in this presentation, nuclear is considered clean energy because it doesn't emit any CO2. It can also be produced by renewable energy: solar panels, windmills and hydroelectricity. All of these generators of electricity are mixed in what we call an energy mix, and this electricity is transiting over your regional grids. When it enters the data center, a little bit of this electricity is taken for powering the administrative buildings, the air conditioning, the heating, et cetera. But the large majority of the electricity is for the servers. So if you have a cloud infrastructure on a public cloud provider, you will have VMs, containers and applications running on those servers, and you can take a ratio of the electricity powering this data center for your own components, for your needs.
In order to calculate that, the Green Software Foundation came up with this SCI formula. SCI stands for Software Carbon Intensity. The idea is to calculate how many grams of CO2 per hour your software is going to emit. So let's look at the end of the formula. What is the M here? It's called embodied emissions. Embodied emissions are the emissions generated during the manufacture, the delivery and the disposal of your hardware. So for example, if you bought a server and this server required 4000 kilograms of CO2 to be manufactured, then, spread over a four-year lifespan, virtually every year you will emit 1000 kilograms of CO2. So you can do the math to know the ratio per month, per hour, per process, et cetera. The usage emissions are the emissions during the use of your software.
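
The amortization of embodied emissions described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular tool: the function names are hypothetical, the four-year lifespan is the one implied by the 4000 kg and 1000 kg per year figures, and the linear spread over time is an assumption.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def yearly_embodied_kg(total_embodied_kg: float, lifespan_years: float) -> float:
    """Spread manufacturing emissions evenly over the hardware lifespan."""
    if lifespan_years <= 0:
        raise ValueError("lifespan_years must be positive")
    return total_embodied_kg / lifespan_years


def hourly_embodied_g(total_embodied_kg: float, lifespan_years: float,
                      share_of_server: float = 1.0) -> float:
    """Embodied emissions in grams of CO2 per hour.

    share_of_server is the fraction of the machine attributed to your
    workload (1.0 means the whole server); it is an illustrative parameter.
    """
    per_year_g = yearly_embodied_kg(total_embodied_kg, lifespan_years) * 1000
    return per_year_g / HOURS_PER_YEAR * share_of_server


if __name__ == "__main__":
    print(yearly_embodied_kg(4000, 4))           # kg of CO2 per year
    print(round(hourly_embodied_g(4000, 4), 1))  # g of CO2 per hour, whole server
```

The `share_of_server` parameter is one simple way to express "the ratio per process": a workload using 1% of the machine would be charged 1% of the hourly figure.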
So for example, if you are running an AWS m4.large machine with two CPUs and 8 GB of RAM, with 50% used by your process, there are some ways to calculate that it's going to use ten watts during an hour. There are different ways to calculate that. The easiest way is to use some probes, but if you are running on the public cloud, you don't have access to the hardware, so in that case you need other ways to calculate it. So let's assume it's ten watts during an hour.
We multiply that by the power usage effectiveness. The power usage effectiveness, the PUE, is how many kilowatt-hours I need to inject into the data center in order to have 1 kilowatt-hour usable by the servers. In AWS Ireland, we consider the PUE is 1.2, which means we need 1.2 kilowatt-hours in order to have 1 usable by the servers. We multiply that by the carbon intensity of the grid. For example, in Ireland we consider that a kilowatt-hour generates 300 grams of CO2. That means that the energy mix, the different ways to generate electricity in Ireland, on average during a year, is going to emit 300 grams of CO2 per kilowatt-hour. So if you multiply all of that, we have the greenhouse gas emissions in grams of CO2 equivalent per hour.
We are talking in grams of CO2 equivalent because not only CO2 is emitted when we are generating electricity; that could also be methane, CH4. Methane is 84 times more potent as a greenhouse gas than CO2. So in order to ease the calculation, we are talking of grams of CO2 equivalent.
So if we circle back to the SCI formula, if we multiply E by I, which is the emissions of the use, we have 3.8 grams of CO2. Then we take the ratio of the embodied emissions, basically the manufacture of this hardware, and in this calculation it is 1.2 grams of CO2, so the SCI is now 5 grams of CO2 per hour.
So where should we find this data? All those average carbon intensities of regional grids are really well known. It's pretty easy to find.
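
The calculation walked through above (energy, times PUE, times grid carbon intensity, plus the embodied share) can be written as a short Python sketch. The function names are hypothetical, and the inputs are the rounded values quoted in the talk (10 watts for one hour, a PUE of 1.2, 300 g of CO2 equivalent per kWh). The embodied share passed in the example is an illustrative value, not a measured one. With these rounded inputs the usage part comes out at about 3.6 g, in the same range as the figure quoted above.

```python
def usage_emissions_g(energy_kwh: float, pue: float,
                      grid_intensity_g_per_kwh: float) -> float:
    """Operational emissions: server energy, scaled by PUE, times grid intensity."""
    return energy_kwh * pue * grid_intensity_g_per_kwh


def sci_g_per_hour(energy_kwh: float, pue: float,
                   grid_intensity_g_per_kwh: float,
                   embodied_g_per_hour: float) -> float:
    """SCI for one hour of running: usage emissions plus embodied share (E * I + M)."""
    return (usage_emissions_g(energy_kwh, pue, grid_intensity_g_per_kwh)
            + embodied_g_per_hour)


if __name__ == "__main__":
    energy_kwh = 10 / 1000  # 10 watts during one hour = 0.010 kWh
    usage = usage_emissions_g(energy_kwh, pue=1.2, grid_intensity_g_per_kwh=300)
    total = sci_g_per_hour(energy_kwh, 1.2, 300, embodied_g_per_hour=1.2)
    print(f"usage: {usage:.1f} gCO2e/h, SCI: {total:.1f} gCO2e/h")
```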
And you even have this application named electricitymaps.com. With this application you can browse a map where you can click on some countries and have the live carbon intensity of the regional grids. In this example, in March 22 in France, it was emitting 157 grams of CO2, while in the US, in the New York region, it was emitting 100 grams of CO2 more than France. PUE is also pretty easily available data: for example, we have the PUE of all major cloud providers per data center.
There are data that are less available: for example, the energy consumed by the CPU, the memory and the storage. If you are running on the public cloud, it's really difficult to have access to the name of the CPU, the brand of the GPU, the type of memory, et cetera. And you need that in order to calculate the power needed by those, and you need their power profiles, et cetera. So in order to ease that, we are using estimations and coefficients. With estimations, it's sometimes sufficient to have rough numbers and to do some relative calculations. But you need to be careful. For example, if you are using Graviton instances on AWS with ARM CPUs, ARM CPUs are really efficient in terms of energy. It can be lower than 1, while legacy CPUs could climb up to eight. So you need to be careful with those estimations.
The embodied emissions of the hardware, again, are the emissions generated during the manufacture of this hardware. To know that, you need to know the brand of the hardware, where it has been produced and when. And this data is pretty much impossible to get if you look at your AWS or GCP console. But fortunately we have some datasets around, and some benchmarks.
And the energy mix is pretty difficult to predict. Obviously, knowing what the electricity mix was in the past, and having the carbon intensity of some regions in the past, is pretty easy. But in the future, in one, two, three weeks, one month, it's pretty much impossible to guess.
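
To make the "estimations and coefficients" idea above concrete, here is a minimal Python sketch of one common approach: interpolating linearly between an idle and a full-load power coefficient according to CPU utilization (the Cloud Carbon Footprint methodology documents a formula of this shape). The function name and the coefficient values are hypothetical, and treating the coefficients as watts per vCPU is an assumption; real values depend on the CPU family.

```python
def estimate_watts(vcpus: int, cpu_utilization: float,
                   min_watts_per_vcpu: float, max_watts_per_vcpu: float) -> float:
    """Estimate average power draw from utilization and two coefficients.

    cpu_utilization is a fraction between 0.0 (idle) and 1.0 (full load).
    """
    if not 0.0 <= cpu_utilization <= 1.0:
        raise ValueError("cpu_utilization must be between 0.0 and 1.0")
    per_vcpu = (min_watts_per_vcpu
                + cpu_utilization * (max_watts_per_vcpu - min_watts_per_vcpu))
    return vcpus * per_vcpu


if __name__ == "__main__":
    # Hypothetical coefficients: 1 W per vCPU when idle, 4 W at full load.
    watts = estimate_watts(vcpus=2, cpu_utilization=0.5,
                           min_watts_per_vcpu=1.0, max_watts_per_vcpu=4.0)
    print(f"estimated draw: {watts} W")
```

Because the result scales directly with the coefficients, a wrong coefficient for an efficient ARM CPU or a legacy CPU shifts the whole estimate, which is why such numbers are better suited to relative comparisons than to absolute reporting.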
In order to ease that, fortunately, we have some tools. Major cloud providers like AWS, GCP or Microsoft Azure have released a carbon footprint tool, dashboard or calculator. With that, in one click, you can have the carbon footprint of your infrastructure per month and per account. If you want to dive in a little bit more, you have tools like Cloud Carbon Footprint. It's an open source tool that basically reads the bill of your cloud provider and gets how many instances, what type of instances, for how long, and what kind of serverless services you use, like Lambda or anything else. And it gives you those CO2 emissions per day. And you can even do some nice comparisons, for example knowing how many flights from New York to London it is equivalent to. And yeah, you have a lot of nice things in this dashboard.
If you want to be more accurate, you can use tools like Scaphandre. Scaphandre is monitoring CPU power, so it's using some probes to get the actual electricity used by your CPU. So obviously you will not be able to use that on the cloud. But if you have access to your physical machines, that's pretty interesting. It's not directly telling you the carbon emissions, but you have the most complicated part, which is knowing the electricity used by your software.
You have some interesting APIs like Climatiq. Climatiq is a commercial API, not only for the cloud but also for different sectors like freight or agriculture, you name it. And among those, they have cloud computing. So basically you can query this API, asking: "Hey, I have an instance running on AWS for this time", and it's going to give you the carbon emissions of that. You can also use tools made by Boavizta. Those are free open source tools, and you have the Boavizta API. Same thing, you can query it.
"I have an m4.large machine running on AWS during 1 hour", and it will give you the CO2 emitted during the use of this instance, and it will give you the manufacturing emissions, the embodied emissions, for the entire lifespan.
And finally, my little tool named Carbonifer. Carbonifer is able to read Terraform projects. If you don't know Terraform, Terraform is an infrastructure-as-code tool: as code, you describe the infrastructure you want to have deployed, for example on GCP. In this example you have an e2-standard-2 instance. You want to deploy it in europe-west9, and if you run terraform plan and terraform apply, it will deploy it on GCP in the region you picked, and that's what Terraform is. Carbonifer is pretty similar, but instead of deploying, it will read your Terraform project and try to estimate the carbon emissions. Basically, it's reading your Terraform infrastructure-as-code files and trying to make some estimations. Obviously those estimations will not be accurate; it's just a rough estimation. But it's already a good way to make some comparisons and take some decisions. There are some tools in this awesome list powered by the Green Software Foundation; I strongly encourage you to check those out.
So now we know how to measure your cloud infrastructure; how can we reduce it? Obviously, if you are running poorly designed, unoptimized code on your infrastructure, the only way you will get the performance you want is by increasing the size of the machine. So obviously you need to optimize your code and use some eco-design good practices for your software. Once you have done that, a very good practice is to choose the right instance type and the right generation of CPU. For example, you can use Graviton instances on AWS and choose exactly the size you want. You don't necessarily need to overcommit. And obviously a nice way to lower your carbon emissions is to move to a hyperscaler.
All major hyperscalers, public clouds like AWS, GCP and Microsoft Azure, are claiming to be net zero or even carbon negative by 2030. They are making a lot of improvements on their PUE and on the lifespan of their hardware, and buying more and more renewable energy. One thing which could also be interesting is looking at alternative cloud providers. For example, Denv-R in France is deploying some floating data centers, so basically they're going to be naturally cooled down by the stream. Those are really interesting to check.
If you have all your servers on premises, you tend to have numerous underused servers, just in case you have a surge of traffic. If you are in the cloud, you can adjust to the bare minimum and have fewer, highly used servers, which leads to using fewer resources and then emitting less CO2. So basically you would like to use some FinOps good practices in order to be GreenOps. FinOps is a way of better controlling the cost of your infrastructure. For example, if you bring in some automation to shut down instances during the night, or adjust the number of instances to your traffic, which is called an auto scaling group, you will pay only for what you need and not more. And in that case you're using fewer resources, and again you are emitting less CO2. Be careful with reserved instances. With reserved instances you're going to save a lot of money, but it's still going to be the same number of instances, so in that case you will still emit a lot of CO2.
Obviously you need to pick the region which has the best carbon intensity. This example is from the Google Cloud Platform website. You have the different regions in Europe, and as you can see, that could be as low as Paris with 70 grams of CO2, up to more than 700 in europe-central2. So you need to pick the region very carefully. And then, if you want to be more fancy, you can move to carbon aware.
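
Picking the region with the lowest carbon intensity, as described above, comes down to a lookup over a table of grid intensities. The following Python sketch assumes such a table is available as a dictionary. The Paris (europe-west9) and europe-central2 values echo the figures quoted in the talk, while `example-region` and its value are purely hypothetical, as is the function name.

```python
from typing import Dict, Iterable, Optional


def greenest_region(intensity_g_per_kwh: Dict[str, float],
                    allowed: Optional[Iterable[str]] = None) -> str:
    """Return the region with the lowest grid carbon intensity.

    allowed optionally restricts the choice, for example to regions that
    satisfy data residency or latency constraints.
    """
    candidates = set(intensity_g_per_kwh)
    if allowed is not None:
        candidates &= set(allowed)
    if not candidates:
        raise ValueError("no candidate region available")
    return min(candidates, key=lambda region: intensity_g_per_kwh[region])


if __name__ == "__main__":
    intensities = {
        "europe-west9": 70,       # Paris, figure quoted in the talk
        "europe-central2": 700,   # the talk says "more than 700"
        "example-region": 300,    # hypothetical value for illustration
    }
    print(greenest_region(intensities))
    print(greenest_region(intensities, allowed=["europe-central2", "example-region"]))
```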
Carbon aware is the ability of a software or an infrastructure to know its own emissions and to adjust itself according to them. The first thing you can do is temporal shifting. Temporal shifting is scheduling asynchronous tasks when the electricity is going to be less carbon intensive: for example, video processing, AI training, machine learning, et cetera. If you want to train your AI model, you don't necessarily need to do it right now. You can wait for the electricity to be less carbon intensive. That's exactly what they are doing at Google for their own needs: they are waiting for the right time frame to launch some asynchronous tasks, and doing that, they lowered their carbon footprint a lot.
Another thing you can do is demand shaping. Demand shaping is doing more when electricity is emitting less: for example, lowering the quality of the video when the electricity is very carbon intensive, and going back to high definition when the electricity is less carbon intensive. Another good example is a CI server: adding more or fewer workers to the CI server depending on the carbon intensity of the electricity. So during a peak of CO2, your developers are going to wait a little bit more to have their PR built, and it's going to go back to normal when the electricity is greener.
Another good way is spatial shifting. Spatial shifting is moving your infrastructure along with the carbon emissions of the local grid, or, if you have two different data centers, picking the one with the greenest energy mix. For example, at the HotCarbon 23 conference, some people showed that they are doing that with two different data centers in two different regions.
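
As an illustration of the temporal shifting idea described above, the following Python sketch picks the start hour for a deferrable job so that the average grid carbon intensity over its run is as low as possible. It assumes a short-term hourly forecast is available as a list. The forecast values and the function name are hypothetical, and this is not a description of how Google or any scheduler actually implements it.

```python
from typing import List, Tuple


def best_start_hour(forecast_g_per_kwh: List[float],
                    duration_hours: int) -> Tuple[int, float]:
    """Return (start index, average intensity) of the greenest time window."""
    if duration_hours <= 0 or duration_hours > len(forecast_g_per_kwh):
        raise ValueError("duration_hours must fit inside the forecast")
    best_start, best_avg = 0, float("inf")
    # Slide a window of the job's duration over the forecast.
    for start in range(len(forecast_g_per_kwh) - duration_hours + 1):
        window = forecast_g_per_kwh[start:start + duration_hours]
        average = sum(window) / duration_hours
        if average < best_avg:
            best_start, best_avg = start, average
    return best_start, best_avg


if __name__ == "__main__":
    # Hypothetical hourly forecast in gCO2e per kWh, starting now.
    forecast = [300, 280, 150, 120, 130, 310]
    start, average = best_start_hour(forecast, duration_hours=2)
    print(f"start in {start} hours, average intensity {average} gCO2e/kWh")
```

The same forecast could drive demand shaping, for example by mapping the current intensity to a number of CI workers instead of a start time.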
So they have a load balancer directing the requests from the user to the data center which is the closest to the user, and then they changed it to also take into account the carbon intensity of the regions where those data centers are. And by doing so, they managed to reduce their carbon footprint by 21%, which is already great, without any compromise on the latency; and with an acceptable latency, which means that the user will barely notice it, they managed to reduce their carbon footprint by 51%. So this is super interesting.
So, as a conclusion, if you want to be less carbon intensive and to have a greener cloud infrastructure, the first thing you need to do is collect metrics. You cannot reduce what you cannot measure, so first measure it. Then, once you have your metrics for your CPU, RAM, et cetera, you can estimate the power used by those. And then you can even estimate the carbon emissions in terms of grams of CO2 per hour. Once you have that, you can use some tools, some dashboards, some analytics to do some thinking and correlation and come up with actions: reductions, resizing and scheduling according to the intensity of the grid. And then, if you can, go carbon aware.
That's all for me. Thank you very much for listening. If you want to know more, you can send me messages, but you can also check the Green Software Foundation website. They have very good material on this topic. You can listen to their podcast, and I strongly encourage you to also check boavizta.org for a lot of good documentation and tools to help you on that journey. Thank you very much.

Olivier Bierlaire

Founder @ Carbonifer
