Transcript
What if you could work with some of the world's most innovative companies,
all from the comfort of a remote workplace? Andela has
matched thousands of technologists across the globe to their next career
adventure. We're empowering new talent worldwide,
from Sao Paulo to Egypt and Lagos to Warsaw.
Now the future of work is yours to create.
Anytime, anywhere. The world is at your fingertips.
This is Andela.
Hello everyone. Thanks for having us at this year's
Conf42 Golang.
You know who Andy and I are from our introductions, but let us introduce Form3 as well. Form3 are a payments technology provider who work with some great customers and partners. As you can see on this slide, we have a fully Go codebase and work with some great cloud native technologies. We currently have around 260 employees, of which about 130 are engineers.
We are a fully remote company and we're hiring.
Today we will be looking into the world of load
testing. I will very quickly cover performance
testing fundamentals and some common tools that we can use
to write our tests. Then Andy will take over and tell you about our open source testing tool, f1, and give you a live demo of how to use it.
Everyone loves a bit of live coding in a good demo,
right? Okay, without any further ado,
let's dive into the world of performance testing.
Performance testing is the general name for tests
that check how the system behaves and performs.
Hence the name, performance testing.
Performance testing examines stability,
scalability and reliability of your software
and infrastructure. Before performance
testing, it's important to determine your system's business needs,
so you can tell if your system behaves satisfactorily or not
according to your customer needs.
Often, performance testing can be done on anticipated future
load to see a system's growth runway. This is important for a high-volume payments platform like ours. Under the umbrella of performance testing, we can look at three test subtypes: load, spike, and soak tests. Load testing
tells us how many concurrent users or transactions your
system can actually handle. Different load scenarios
require different resources, so it's important to write multiple
tests. Load tests should be performed all the time in order to ensure that your system is always on point, which is why they should be integrated into your continuous integration cycles.
Now, on the other hand, a stress test is a type of
performance test that checks the upper limits of your system
by testing it under extreme loads.
Stress tests also look for memory leaks,
slowdowns, security issues, and even data corruption.
There are two types of stress tests: spike and soak. If your stress test includes a sudden, high ramp-up in the number of virtual users, it's called a spike test. If your stress test runs over a long period of time with a slow ramp-up, to check the system's sustainability over time, that is called a soak test. You should run stress tests before major events like Black Friday, for example, if you're a commercial retailer.
Before you run your tests, it's important to have monitoring
in place and agree what your failure threshold should be.
You can see some common things to monitor on this slide,
such as average response time, error rate, or CPU usage, which are important indicators that can show you whether your system is healthy. These important metrics should therefore be monitored and alerted on before you write your tests.
At Form3, a lot of our systems use asynchronous processing and queues. Today we'll be
looking at this simple example application.
We create a service which exposes a single endpoint
payments. This service receives
requests, does background processing on them,
and then outputs a message to an SQS queue
once processing is complete.
Now we need a way of connecting requests to their corresponding
result from the SQS queue. If we rely only on the 202 Accepted response, which you can see on this slide, it will make it seem like the request completes immediately, when actually background processing is still happening. f1 is the open source solution we will be talking about today, which can help you do just that.
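To make that concrete, here is a minimal sketch, not taken from the talk, of why timing only the 202 is misleading: the client measures how quickly the API accepts the work, not how long the background processing and SQS publication actually take. The endpoint and payload are purely illustrative.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
	"time"
)

func main() {
	start := time.Now()

	// Hypothetical local endpoint used for illustration only.
	resp, err := http.Post("http://localhost:8080/payments", "application/json",
		strings.NewReader(`{"amount": 100}`))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// The 202 comes back almost immediately...
	fmt.Println("status:", resp.StatusCode, "after", time.Since(start))

	// ...but the payment is still being processed in the background, and the
	// real completion signal only appears later as a message on the SQS queue.
}
```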
So now that we have established the basics of performance testing,
let's have a look at two common tools that we can use.
The first tool we'll be talking about is JMeter, which is a widely used open source Java testing tool. It allows us to
configure tests using a recording GUI and some predefined
templates. For asynchronous testing support,
we can use long polling or another request to check
whether an operation has completed. We can then
configure a number of threads and a ramp up period in
seconds for load specification. JMeter also offers a plugin for step and spike ramp-up, even though this is not supported natively. Next up,
another common tool used for performance testing is k6. It is an open source Go project run by Grafana. Tests are configured using a scripting language similar to JavaScript. k6 does not provide support for promises
or asynchronous execution, but we can
achieve asynchronous testing support using virtual
users. We can then configure the
load for the test using an options object
which states how many requests to run for each stage
of the test and how long each stage of the test
is. This allows us to configure linear
and step ramp-up. Now, at Form3, we invest a lot of engineering time into performance testing our platform. As we have already seen, these are very important tests that should be run on your platform all the time. We initially used k6 to develop and run these tests, but it did not fully fit our ideal load testing tool, which you can see described on
this slide. Our ideal tool should
allow us to easily write asynchronous tests which
integrate with our queues and services. This was not
always easy to do in JavaScript,
especially because our platform is fully written
in Go. It should also allow our engineers to
write tests in Go, which is what they're most comfortable in
anyway. And it should also integrate well with our CI pipelines, as we want to performance test our platform often. Writing our tests in Go would be a huge game changer for our engineers, as it would allow us to make use of goroutines and channels for test configuration, and these are really important features of the Go language that we would like to leverage. Finally, as our
platform operates under huge amounts of load, the tool should
allow us to run different modes of load as well, not just
linear or step ramp up. Andy will
tell you more about the different modes of load that we need
later. And as you can see, the existing
solutions did not provide us with any of these features.
So this is why we decided to write our own solution
and then open source it for the community to use.
I'll now hand over to Andy who will tell you all
about f1. Take it away,
Andy. Okay, so I'm going to take you through what f1 is and why we decided to write it. So what is f1? Well, f1 is our own
internal load testing tool.
We wrote it initially to support our own use cases,
but then we realized actually it was a pretty general purpose tool,
so we decided to open source it.
And it's written in Go, so it natively supports writing test scenarios in Go. And that means that you can use all of the sort of concurrent and asynchronous primitives that Go offers when writing your test scenarios. So testing these kinds of asynchronous systems is pretty straightforward in Go, much more straightforward than it was, for example, using JavaScript-based tests in k6. One of the other things f1
supports is a variety of different modes for
injecting load. One of the problems we had with k6 was that it basically only supports a single mode of operation, which is using a pool of virtual users to apply load. That wasn't aggressive enough for some of our use cases, so we really
needed to be able to apply load more aggressively.
And so when we wrote f1, we built that in from the beginning. So f1 supports this idea of using a pool of virtual users, but it does also support a variety of other modes of injecting load, which makes it much more suitable to
our use cases. So what
I'm going to do now is basically take you through a demo for 15
minutes or so. We're going to set up a
system to test that looks sort of similar to this asynchronous
system that Adelina mentioned earlier. And then we're
going to write a simple load test that's
going to sort of exercise that system.
So I'm just starting with sort of a blank
folder here, and we're going to start from the beginning.
So first of all, I'm going to set up an environment
that we're going to use to run load against using docker compose.
So what I've done here is created a Docker Compose file with two containers. The first container, goaws, is a local SQS mock. So it's a mock of the AWS SQS service, and we're going to use that sort of to mock out an AWS-based message queue. And this Docker Compose file also contains a dummy service, which we're going to write in a minute. And you might notice here that goaws requires some configuration. So let's create
one of those config files. So this config
file basically just contains a single queue that we're going
to use called test queue.
Let's also create a Dockerfile for our service. So our Docker Compose file is using a local Dockerfile. So here's a Dockerfile. What we're going to do is just build an app that's in this cmd/service/main.go file. So let's stub out that service. What I'm going to do here is make a new directory, cmd/service, put in a minimal file, and initialize a Go module. So now I've got a go.mod file, and if I have a look in this directory, I've got an empty main function. So what
we're going to do now is we're going to implement a sort of simple application
in that file that we're going to use when we're writing our load tests.
We're going to inject load against that.
So let's edit that file.
Okay, so first of all, let's delete this empty function, and what we'll do is put in a main function which listens for HTTP requests on the relative URL /payments, and an HTTP handler for that which is just going to return an accepted status code. So there's a sort of simple application.
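A minimal sketch of what that stubbed-out service might look like at this point; the :8080 port is an assumption for illustration.

```go
package main

import (
	"log"
	"net/http"
)

func main() {
	// A single payments endpoint that, for now, just accepts the request.
	http.HandleFunc("/payments", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusAccepted)
	})

	log.Println("listening on :8080")
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```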
Now, what we want to do next is publish SQS messages when our web requests are made. And that's going to simulate this sort of asynchronous feedback, where we're injecting load synchronously via HTTP and then asynchronously consuming feedback via SQS.
So let's set up some global variables
to store an SQS
client, and then let's initialize that SQS
client at the start of our main function.
Okay, so what are we doing here? We are setting up an AWS client to use goaws, our local SQS mock, using some dummy credentials, creating a new SQS client, and getting the queue URL of our test queue.
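A sketch of that setup, extending the previous sketch and assuming the v1 AWS SDK for Go; the region, the goaws endpoint (goaws listens on port 4100 by default), and the test-queue name are assumptions for illustration. initSQS would be called at the start of main.

```go
import (
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/credentials"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/sqs"
)

var (
	sqsClient *sqs.SQS
	queueURL  string
)

func initSQS() {
	// Point the AWS SDK at the local goaws mock with dummy credentials.
	sess := session.Must(session.NewSession(&aws.Config{
		Region:      aws.String("eu-west-1"),
		Endpoint:    aws.String("http://goaws:4100"),
		Credentials: credentials.NewStaticCredentials("dummy", "dummy", ""),
	}))

	sqsClient = sqs.New(sess)

	// Look up the URL of the queue defined in the goaws config file.
	out, err := sqsClient.GetQueueUrl(&sqs.GetQueueUrlInput{
		QueueName: aws.String("test-queue"),
	})
	if err != nil {
		log.Fatalf("getting queue url: %v", err)
	}
	queueURL = aws.StringValue(out.QueueUrl)
}
```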
Okay, I just need to replace one of these imports; it picked up the wrong ones. Okay, there
we go. We've got an application set up there with an SQS
client. Okay, so what do we want
to do next? Right, well let's go to the HTTP
handler. Rather than just returning an
accepted status code, let's add some functionality to
that handler. Let's just delete this entirely.
Okay, so what are we doing here? So we're saying, okay, we only want to handle POST requests. If we don't get a POST request, we're going to return a 405. Then we're going to construct
an SQS message and send it to
that queue. So that's our sort of asynchronous feedback. And this demonstrates
our system doing some work asynchronously.
Oh yeah, I shouldn't have deleted that. Let's leave the status accepted on the end.
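A sketch of what that fuller handler might look like, reusing the sqsClient and queueURL globals from the sketch above; the message body is illustrative.

```go
func paymentsHandler(w http.ResponseWriter, r *http.Request) {
	// Only POST requests are handled; anything else gets a 405.
	if r.Method != http.MethodPost {
		w.WriteHeader(http.StatusMethodNotAllowed)
		return
	}

	// Publish a message to the queue to simulate the asynchronous work the
	// real system would do in the background.
	_, err := sqsClient.SendMessage(&sqs.SendMessageInput{
		QueueUrl:    aws.String(queueURL),
		MessageBody: aws.String(`{"status":"processed"}`),
	})
	if err != nil {
		w.WriteHeader(http.StatusInternalServerError)
		return
	}

	// The caller still just gets a 202 Accepted.
	w.WriteHeader(http.StatusAccepted)
}
```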
Okay, so with any luck that
will run.
So let's just download the dependencies
for that app,
and then we should be able to run it using docker compose up. Just wait
for these dependencies to download.
Now, when we run it locally,
we should be able to make web requests to that endpoint
and get a 202 back. And we should also be able
to see SQS messages being published as a result.
So let's see if we
can run that app.
Okay, so if I have a look at what's running in Docker.
Sorry, I'll just kill
all my running containers and
then start again.
Okay, so what have I got running here? I've got goaws running, that's my SQS mock, and I've got my test service.
So if I try making a web request,
great, I get a 202, so I can make some web requests. And then if I just configure the AWS CLI locally... oh yeah, that's all set up. So I should be able to list my queues. There's my test queue, and if I get the queue attributes, I've got three messages there. So if I make another web request, I've got four messages there. So sending my web requests is publishing SQS messages, which demonstrates our app.
Okay, so now let's write a load test. So what I'm going to do
is write another command line entry point. So I've now got this cmd/f1/main.go. So let's
edit that. First thing
we're going to do is add an import for f1. So we'll just import this package... oops, goimports is getting carried away. And in order to use f1, all I need to do in an application entry point is new up this F1 type and call Execute. And this will give me a fully fledged f1 command line interface.
So if I download that dependency and
then run that application, I will get a command line interface pre-configured with all of the bells and whistles that f1 comes with. So this is how you use f1. It's not like a separate binary that you download and add load tests to or something. You just use this package directly and build your own binary. So if I run this entry point, and just put help on the end, we'll compile the app and then we should get some help out. Here we go. So this is the f1 command line interface that I've got just from importing that package, basically.
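That entry point really is as small as it sounds; a sketch (the import path shown is for the current f1 module and may differ depending on the version you use):

```go
package main

import "github.com/form3tech-oss/f1/v2/pkg/f1"

func main() {
	// Creating an f1 instance and calling Execute is all that's needed to get
	// the fully fledged f1 command line interface in your own binary.
	f1.New().Execute()
}
```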
Okay, so let's go back to our file. What we
need to do now is we need to start registering
new test scenarios. So... I didn't add that correctly. Here we go. So this allows me to register a new test scenario, and that test scenario will be available from our command line interface.
So let's just add a sort of dummy implementation here.
So what does this dummy implementation do?
So basically this function runs some code and
then returns a run function. And this code
that it runs at the beginning is where you would put any setup code that
you need to run one time at the beginning of your test scenario. And then
this function that you return is executed every
time you run a test iteration in your load test.
So if you're running 100 iterations a second,
this run function gets executed 100 times per second.
And this testing.T has a sort of similar API to the Go testing.T, but you'll notice it's actually an f1 type.
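Putting the registration and the dummy implementation together, the file might look something like this sketch; the scenario name is illustrative and the exact signatures vary between f1 versions, so check the f1 README for the current API. The shape follows what's described above: a setup function that runs once and returns the function executed on every iteration.

```go
package main

import (
	"github.com/form3tech-oss/f1/v2/pkg/f1"
	"github.com/form3tech-oss/f1/v2/pkg/f1/testing"
)

func main() {
	// Register a scenario under a name the f1 CLI can run.
	f1.New().Add("testScenario", setupTestScenario).Execute()
}

// setupTestScenario runs once at the start of the scenario; the function it
// returns runs once per test iteration.
func setupTestScenario(t *testing.T) testing.RunFn {
	// One-time setup code (e.g. creating clients) goes here.

	return func(t *testing.T) {
		// Dummy iteration body; every iteration of the load test executes this.
	}
}
```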
Okay, and now if I go run this, adding another command on the end,
hopefully we should see that our test scenario has been registered.
Great. So this is now a test scenario that we can execute from
the command line. Okay, so let's do
something similar to our HTTP handler.
Let's configure an SQS
client here. Oops.
Okay, so what we're doing here is we're configuring an SQS client to again use goaws locally with some dummy credentials, and getting the queue URL for our test queue.
And so I've got an SQS client here available within
this function, which I can use in all of my test iterations
to receive messages.
And this is where we start to stumble upon the power
of using Go to write these load test scenarios. Because what I'm going to want to do is consume messages from my SQS queue in the background and then check what
messages are arriving from my test iterations.
So let's do that here. Let's run a goroutine in the background to do that. So what have I got here? Okay, so having initialized my SQS client, I've now created an in-memory channel that I'm going to use for receiving messages. And I've started a goroutine, which is basically polling the SQS queue and sending messages into that channel. So in the background
of my load test, I've got this channel which is sort of buffering inbound
messages from SQS.
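A sketch of that background consumer, again assuming the v1 AWS SDK for Go; startConsumer is a hypothetical helper name, and it would be called from the scenario setup with an SQS client and queue URL configured much like in the service sketch earlier (but pointing at localhost rather than the Docker network hostname).

```go
// startConsumer polls the SQS queue in a background goroutine and funnels
// received messages into an in-memory channel that test iterations read from.
func startConsumer(sqsClient *sqs.SQS, queueURL string) <-chan *sqs.Message {
	messages := make(chan *sqs.Message, 100)

	go func() {
		for {
			out, err := sqsClient.ReceiveMessage(&sqs.ReceiveMessageInput{
				QueueUrl:            aws.String(queueURL),
				MaxNumberOfMessages: aws.Int64(10),
				WaitTimeSeconds:     aws.Int64(1), // use long polling
			})
			if err != nil {
				continue
			}
			for _, m := range out.Messages {
				// Delete each message so it isn't delivered again, then buffer it.
				_, _ = sqsClient.DeleteMessage(&sqs.DeleteMessageInput{
					QueueUrl:      aws.String(queueURL),
					ReceiptHandle: m.ReceiptHandle,
				})
				messages <- m
			}
		}
	}()

	return messages
}
```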
And that means that the actual implementation of my run function
is pretty straightforward. So let's
do that. Let's put an implementation in,
and what we want to do here is basically what we were doing from
the terminal earlier. So we're going to make a
post request. So we're going to make a web request to our local web
service. We're going to check that we got a 202.
Then we're going to wait for up to 10 seconds for a
message to be received from that channel. Now, in real life, you'd probably want to execute some logic here to sort of stitch together the SQS message that was received with the web request that you sent, maybe by ID or something like that. And in reality, that's what we do: we'll send some kind of HTTP request, and there'll be some attributes about that HTTP request that identify that unit of work. Then there's a whole load of background asynchronous processing, the output of which is an SQS message that contains some IDs that let us stitch it back together. So our test iteration function here will be sending that web request and then waiting for inbound SQS messages that we can stitch back together.
And if you don't receive one within 10 seconds,
that test iteration fails.
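A sketch of that run function, continuing the setup above; it assumes f1's testing.T exposes failure methods similar to Go's testing.T (the exact API isn't shown in the talk), and that messages is the channel returned by the hypothetical startConsumer helper.

```go
return func(t *testing.T) {
	// The synchronous part: POST to the service and expect a 202.
	resp, err := http.Post("http://localhost:8080/payments", "application/json", nil)
	if err != nil {
		t.FailNow() // assumed to abort the iteration, like Go's testing.T
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusAccepted {
		t.FailNow()
	}

	// The asynchronous part: wait up to 10 seconds for feedback from SQS.
	// In a real test you would correlate the message with this particular
	// request, for example by an ID carried through the processing.
	select {
	case <-messages:
		// Feedback received: the iteration passes.
	case <-time.After(10 * time.Second):
		// No feedback within 10 seconds: the iteration fails.
		t.FailNow()
	}
}
```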
So that's it. We've written our load test scenario,
so we should be able to run it. So we've still got our app
running in the background. Let's compile
our application into a handy binary called f1. So I should be able to run f1 help and f1 scenarios list.
Okay, so let's see now. I should be able to run f1 run constant. So this is one of the modes, constant mode. I'm going to run the test scenario at a rate of 1 per second for, let's say, 10 seconds. Okay, so we've got one test iteration per second being executed. They're taking about 1.5 milliseconds and they all passed.
I can also do this in a similar mode to k6, so I can use a pool of virtual users with a concurrent pool of, let's say, I don't know, one user, and you'll see
now I get something quite different. So now I'm running, let's say
close to 1000 requests per second,
taking up to three or four milliseconds. And that's because my
pool of one virtual user is making requests one
after another as quickly as it can. So these two
modes allow you to sort of inject load
however you like. That constant mode allows you to really control the rate at which load is being injected. And that becomes particularly useful when your pool of virtual users would become saturated. So if you've got ten users and
each request or each sort of asynchronous process
takes 2 seconds, your pool of ten virtual users
can only make five requests per second because they're each making
their requests consecutively, one after the other, and they have to wait
2 seconds between requests. And that's problematic if you really,
really want to apply ten requests per second load to your
system. The constant mode doesn't care about virtual users,
and it will aggressively apply ten requests per second to
your application. And that's one of the reasons we developed f1 separately.
So that's it. I hope you found that demo useful.
So I guess just to sum up, f1 is a sort of battle-tested load testing tool that we use now.
We use it every day. We use it to apply synthetic load
in a number of our pre production environments.
It sends hundreds or thousands of sort of payments
per second into our environments, and it's really a first-class citizen in our software development lifecycle. So I guess this statement at the top is important: I think for large systems which will be processing volume at scale in production, load testing needs to be a first-class citizen.
It shouldn't be an afterthought and it should be part of
the way you're developing your applications.
And certainly for us, we found it really useful at spotting
performance bottlenecks or scalability problems well before
we would have encountered those problems in production, which gives us plenty of time
to fix those problems.
So this has been us, Adelina and I, thanks for listening.
And if you're interested in learning more, check out f1 on our open source organization, or
look us up online at the addresses shown here.
So, yeah, thanks a lot.