Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hi, I'm Oliver, welcome to my talk about DevOps automation
with Go. So I've been a software engineer for more than 20 years
and I discovered Go back in 2017 and I
immediately fell in love with it. It's a great language to write, but especially
to read when you have a large code base to get into. I've been
the lead developer at Restorepoint since 2019.
Restorepoint is the name of our company, but also of our
main product, which is a network automation device backup
and restore solution. It's all written as a Go monolith,
so we have a single binary which is highly
concurrent. We have our own scheduler, HTTP server,
FTP server, TFTP server, a Lua environment,
et cetera, et cetera. And all this runs inside
a Linux environment which we tightly control.
So most of our customers run it on premise or in their own
cloud, and it's updated either
manually or automatically by an update server.
So we currently have around 120,000 lines of Go,
not counting comments, plus roughly
2.8 million from external libraries. And we use GitLab
for our whole development lifecycle. So what does our
DevOps look like? So we have three different release
versions. We have two target environments. We do
weekly production releases. We could actually release every day if we wanted to,
but most of our customers prefer a weekly release.
So we release in the middle of the week. But we do our development
releases internally. They are released whenever there's a change.
So that's continuous. And we have multiple internal
tools that make our lives easier. And as you can see down
here in that image, that is how our pipeline looks
at the moment. So one of our internal tools is
the release API, which avoids us having to
copy the build artifacts from our build server
to the update server. So it's a tightly controlled
solution and it's used by multiple of our products,
and it's a single binary service as well. And so it
has two sides. So the build server sends a call
via a POST, of course, and it sends
the final build artifacts as a TGZ, it MD5-sums
the TGZ and then sends additional metadata.
So down here I've copied the call that
we actually send to our server. As you can see, there's a
lot of additional metadata; it doesn't apply to all
products, but to most of them. And then there's a shared secret between the build
server and the release API, so that the release API
will only react to calls that contain that shared secret.
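The call shown on the slide is not part of this transcript. As a rough sketch of what the sending side could look like if it were written in Go, assuming a multipart form upload: the field names (`md5`, `artifact`), the `X-Release-Secret` header, the URL and the metadata keys below are all hypothetical and not Restorepoint's actual API.

```go
package main

import (
	"bytes"
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"mime/multipart"
	"net/http"
)

// buildReleaseRequest assembles a multipart POST that carries the TGZ,
// its MD5 sum and the product metadata. Field and header names are
// assumptions made for this sketch.
func buildReleaseRequest(url, secret string, tgz []byte, meta map[string]string) (*http.Request, error) {
	var body bytes.Buffer
	w := multipart.NewWriter(&body)

	sum := md5.Sum(tgz)
	if err := w.WriteField("md5", hex.EncodeToString(sum[:])); err != nil {
		return nil, err
	}
	for key, value := range meta {
		if err := w.WriteField(key, value); err != nil {
			return nil, err
		}
	}
	part, err := w.CreateFormFile("artifact", "release.tgz")
	if err != nil {
		return nil, err
	}
	if _, err := part.Write(tgz); err != nil {
		return nil, err
	}
	// Close writes the trailing multipart boundary.
	if err := w.Close(); err != nil {
		return nil, err
	}

	req, err := http.NewRequest(http.MethodPost, url, &body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", w.FormDataContentType())
	req.Header.Set("X-Release-Secret", secret) // shared secret, hypothetical header
	return req, nil
}

func main() {
	req, err := buildReleaseRequest(
		"https://updates.example.invalid/release", "s3cret",
		[]byte("hello"), map[string]string{"product": "demo", "version": "1.2.3"},
	)
	if err != nil {
		panic(err)
	}
	// A real build job would now pass req to http.DefaultClient.Do.
	fmt.Println(req.Method, req.URL.Host, req.ContentLength)
}
```

Sending the checksum alongside the file lets the receiver detect a truncated or corrupted upload before anything is released.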
And then on the receiving side, the release API
receives that POST request that I mentioned, checks that all
required metadata fields for a product have been passed, checks the
shared secret, of course. And then it writes the
file that's been passed and calculates the MD5 sum
at the same time, which is quite a nice trick
you can do in Go by using a TeeReader. And if the
calculated MD5 sum is not the same as the one that has been sent
in the request, then the release is also
aborted. And once all the checks are done,
then the metadata is written to an env file as well as the
TGZ, and then it's passed to an individual release
script based on the product. And this is a single
binary service, as I mentioned, and it's
maybe 100 lines of code, and it's really nice. It's
one of the powers of Go, in my opinion, that you can actually
write a web server in very few lines.
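A minimal sketch of the write-and-checksum trick mentioned above, using `io.TeeReader` from the standard library. The function names `storeAndVerify` and `verifyString` are made up for this example, and the in-memory buffer stands in for the file the real service would write.

```go
package main

import (
	"bytes"
	"crypto/md5"
	"encoding/hex"
	"errors"
	"fmt"
	"io"
	"strings"
)

// errChecksum is returned when the computed sum differs from the one sent.
var errChecksum = errors.New("md5 mismatch")

// storeAndVerify copies src to dst and hashes the same bytes in one pass.
// io.TeeReader writes everything that is read from src into the hash.
func storeAndVerify(dst io.Writer, src io.Reader, wantHex string) error {
	h := md5.New()
	if _, err := io.Copy(dst, io.TeeReader(src, h)); err != nil {
		return err
	}
	got := hex.EncodeToString(h.Sum(nil))
	if !strings.EqualFold(got, wantHex) {
		return fmt.Errorf("%w: got %s, want %s", errChecksum, got, wantHex)
	}
	return nil
}

// verifyString runs storeAndVerify on an in-memory payload. A real
// handler would pass the uploaded file and an *os.File instead.
func verifyString(payload, wantHex string) error {
	var stored bytes.Buffer
	return storeAndVerify(&stored, strings.NewReader(payload), wantHex)
}

func main() {
	// The first sum matches the payload, the second does not.
	fmt.Println(verifyString("hello", "5d41402abc4b2a76b9719d911017c592"))
	fmt.Println(verifyString("hello", "00000000000000000000000000000000"))
}
```

Because the hash is fed while the data is being copied, the upload is only read once and never has to be held in memory as a whole.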
Another tool that we have is the Freshdesk GitLab bridge.
So for our first line support we use Freshdesk.
And as developers we only deal with issues in GitLab
and our support engineers decide when to escalate issues
to us as developers. And we've written
a Tampermonkey script around that, which injects a button into the Freshdesk
UI. So it's quite easy to trigger that escalation
process. And it will copy all comments from Freshdesk
and all attachments into an issue in GitLab.
And it avoids creating duplicates as well. And it also makes sure that
both sides have a link so you know which ones have been
escalated and which ones are not. I can show that real quick.
So this is a video that I took just, you can see that button
over here. This is the injected button, and it will ask you
if you really want to do this. And then it will copy the files from
Freshdesk and will create a GitLab issue
out of the Freshdesk issue. And that's quite
a neat way for us to deal with customer support
without having to expose the whole team to
all customer issues. Not all of them are related to development.
And also this is a single binary service as
well. And then we have another tool which we call
the automatic version check. Because we have
more than one production release (we have three, actually), it warns
us if we are trying to merge mismatched versions. So
say, as you can see here in the screenshot, we have
a 5.3.1 version and a 5.4 version. When trying to
merge that, I get this warning as
a comment, and the way it works with merge requests internally,
you cannot merge a merge request unless you have resolved all issues,
like all discussions on a merge request. So this will keep the merge request
from being accidentally merged. This works
by a webhook. So this is also a service that's running on a server.
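A minimal sketch of the comparison step of such a check. The talk does not say how a mismatch is defined, so this assumes that two versions mismatch when their major.minor parts differ; `releaseLine` and `mismatchWarning` are hypothetical names, and the returned text stands in for the comment posted on the merge request.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// releaseLine returns the major and minor part of a dotted version
// such as "5.3.1" or "5.4". A missing minor part counts as 0.
func releaseLine(version string) (major, minor int, err error) {
	parts := strings.Split(strings.TrimPrefix(version, "v"), ".")
	if major, err = strconv.Atoi(parts[0]); err != nil {
		return 0, 0, fmt.Errorf("bad version %q: %w", version, err)
	}
	if len(parts) > 1 {
		if minor, err = strconv.Atoi(parts[1]); err != nil {
			return 0, 0, fmt.Errorf("bad version %q: %w", version, err)
		}
	}
	return major, minor, nil
}

// mismatchWarning returns the comment text for a merge request, or ""
// when source and target belong to the same release line.
func mismatchWarning(source, target string) (string, error) {
	sMajor, sMinor, err := releaseLine(source)
	if err != nil {
		return "", err
	}
	tMajor, tMinor, err := releaseLine(target)
	if err != nil {
		return "", err
	}
	if sMajor == tMajor && sMinor == tMinor {
		return "", nil
	}
	return fmt.Sprintf("Version mismatch: source is %s but target is %s.", source, target), nil
}

func main() {
	for _, pair := range [][2]string{{"5.3.1", "5.4"}, {"5.3.1", "5.3.7"}} {
		msg, err := mismatchWarning(pair[0], pair[1])
		fmt.Printf("%s into %s: %q (err: %v)\n", pair[0], pair[1], msg, err)
	}
}
```

Posting the warning as a discussion comment is what makes it blocking, since the merge request cannot be merged until every discussion is resolved.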
And GitLab basically sends all merge requests,
or like signals, all merge requests via webhook
to this endpoint. And then we use the GitLab API to
check the version of the source and target branch. And then
we have an additional thing for automating our
development workflow. So GitLab has
these things called boards, and you can use different statuses,
which are labels in GitLab. And these labels,
we use them for everything, for the area of
the product it applies to, if it's a UI
or an API issue, if it's a Freshdesk ticket for example, but also for
process. So our GitLab issues always
go through these stages: from open, to to-do,
to in development, to in review, to test,
and then eventually they get closed. And we just make sure
that we automatically transition issues when
a merge request is opened. So the only thing a developer has to do is
to actually mention the number of the GitLab issue in
their merge request, and then the ticket will automatically be
set to be in review. And when the merge request is merged, then it's changed
to test. And this really reduces the
amount of manual updates that we have to do, because as developers
we tend to always forget these things. But it's nice to have
our issues in the right state so it's clear where we
are, what the progress is, et cetera. And then another
thing: because we have a highly concurrent piece of software with
a lot of lines of code, we from time to time have data
races, and Go has this nice way
of allowing you to detect race conditions, so it will see if a variable
is read and written to at the same time. And therefore all
of our internal development builds have race condition detection enabled,
which has a performance
impact. So I think it increases CPU
usage by, I can't remember, but it definitely
takes more CPU cycles, but especially memory, I think it doubles the memory usage.
So we only do this for development builds internally.
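To illustrate what the race detector catches, here is a small self-contained sketch (not Restorepoint code): `racyCount` updates a shared counter from many goroutines without synchronisation, and `safeCount` guards the same counter with a mutex.

```go
package main

import (
	"fmt"
	"sync"
)

// racyCount increments a shared counter from n goroutines without any
// synchronisation. The unsynchronised read-modify-write is a data race.
func racyCount(n int) int {
	var wg sync.WaitGroup
	count := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			count++ // racing read and write
		}()
	}
	wg.Wait()
	return count
}

// safeCount does the same work but protects the counter with a mutex.
func safeCount(n int) int {
	var (
		wg sync.WaitGroup
		mu sync.Mutex
	)
	count := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			count++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return count
}

func main() {
	fmt.Println("racy:", racyCount(1000)) // result may vary between runs
	fmt.Println("safe:", safeCount(1000))
}
```

Run with the detector enabled (`go run -race .`, or `go build -race` for a binary), the racy call is reported as a `WARNING: DATA RACE` block on standard error, while a normal build stays silent.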
And the reason why we have to do this is because most of our
race conditions, they happen whenever a certain code path is
hit. And we have of course fixed
all the low hanging fruit, but there's always something left
somewhere and also sometimes it's library code. So we have discovered
quite a lot of race conditions in external libraries and then reported
that as well. And so we have a lot of internal boxes
that replicate all the common usage scenarios that we have,
and they run 24/7, and then they write race condition
error messages into their log files. And then we run this
race condition check tool once every day on
these individual machines. And then if
a race condition is found in logs, then it will automatically
create a GitLab issue for each entry. And if
an entry already exists, then it will add a comment
instead to keep the issue fresh. So I copied here an
example of what that looks like in a log. So it starts with
"WARNING: DATA RACE", that's the start
marker. And then it usually goes like "Write at", blah blah,
memory address, and goroutine number something, and then the
code, the function where this occurs, this is what we
use as the title, then everything below. So between
the start and the end marker we put into the issue, and this
ends up looking like this. So I had to blur, of course, the details for
obvious reasons, but it will basically show this,
it shows where it occurred, where the write was,
where the previous write was, where a read was, and it
will automatically label it with the race conditions tag,
which is important. So we can actually see
that this was an actual race condition problem. Yeah,
and that's a really nice solution for that.
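A minimal sketch of the log-scanning step, assuming the log contains the race detector's report unmodified: `WARNING: DATA RACE` opens a report, a line of equals signs closes it, and the first indented line (the function of the racing access) becomes the title. The names `raceReport`, `parseRaces` and `action` are hypothetical, and the actual GitLab calls are left out.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	startMarker = "WARNING: DATA RACE"
	endMarker   = "=================="
)

// raceReport is one entry found in a log file.
type raceReport struct {
	Title string // function where the racing access happened
	Body  string // everything between the start and end marker
}

// parseRaces extracts all race reports from the text of a log file.
func parseRaces(log string) []raceReport {
	var (
		reports []raceReport
		body    []string
		title   string
		inside  bool
	)
	for _, line := range strings.Split(log, "\n") {
		switch {
		case strings.Contains(line, startMarker):
			inside, title, body = true, "", nil
		case inside && strings.HasPrefix(line, endMarker):
			reports = append(reports, raceReport{Title: title, Body: strings.Join(body, "\n")})
			inside = false
		case inside:
			body = append(body, line)
			// The first indented line names the function of the racing access.
			if title == "" && strings.HasPrefix(line, "  ") {
				title = strings.TrimSpace(line)
			}
		}
	}
	return reports
}

// action decides whether a report needs a new issue or only a comment,
// given the titles of issues that already exist.
func action(r raceReport, existing map[string]bool) string {
	if existing[r.Title] {
		return "comment"
	}
	return "create"
}

func main() {
	log := "service started\n" +
		"==================\n" +
		"WARNING: DATA RACE\n" +
		"Write at 0x00c000012345 by goroutine 7:\n" +
		"  main.update()\n" +
		"      /src/main.go:10 +0x44\n" +
		"\n" +
		"Previous read at 0x00c000012345 by goroutine 6:\n" +
		"  main.report()\n" +
		"      /src/main.go:20 +0x30\n" +
		"==================\n"
	existing := map[string]bool{}
	for _, r := range parseRaces(log) {
		fmt.Println(action(r, existing), r.Title)
		existing[r.Title] = true
	}
}
```

Using the function name as the title gives a stable key for deciding between a new issue and a comment on an existing one.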
And then we have another tool which is for
automatic library versioning. So we have roughly 20
internal libraries that are being used by different products,
and these are consumed via Go modules, of course,
and Go's semantic version tags. So when
you do a go get, you say the name of the library or
the URL of the library, and then "@" and then the version
tag. And we built a tool around that,
which is a job that's run on the individual library's
CI/CD pipeline. It's a tag job,
and it will basically, whenever the master branch of
the library is updated, it will tag the library automatically
using the last commit message as the description
of the tag, and increase the patch level of the
previous tag, and therefore create a new
version which then can be used in the
product that is using the library. And it will make sure that it will
either increment any existing tags, or if
no tags exist, then it will just create a new one. Yeah, and this is
it. So this is how we automate our own DevOps
at Restorepoint. And I have
to do a shameless plug at the end, of course. So we are hiring in
either remote UK or EU, and our pitch
is of course, if you're tired of the same old Go microservice on
Kubernetes pitch, then maybe have a chat with us.
As I explained, we ship an on-premise Go monolith wrapped
in a Linux box every week and our customers love it.
And yeah, we're looking for driven and analytical
software engineers, ideally with Go experience. But we
can also consider you if you are really experienced
in another language and you want to cross train because Go
is relatively easy to pick up. Yeah. So please come
and either talk to me or hit our careers page.
Thank you very much.