Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello Conf 42. Thank you so much for joining
my talk today. My name is Joe Skeen and this
is "Falling in Love with Unit Testing."
Buckle up. This is going to be a really
intense talk, but I hope that you walk away with
a lot more appreciation for unit testing, why it
is that we're doing it, how to do it without making it
a drudgery, and why we really shouldn't
let AI write our unit tests for us.
I've presented this talk in a few different flavors
at different conferences using different programming languages.
Since this is the DevOps conference, I'm going to be
using PowerShell and Pester today
in my code examples. But everything that I talk about is
going to be universal principles that you can
adapt to any programming language or unit testing
framework that you happen to be using. So let's jump into it.
If you'd like to follow along with the slides today,
I have them posted out here. You can go ahead and follow that link
and follow along with me. All right,
so before we jump in too deep, we need to answer
the question. What is unit testing?
A unit test is an automated test
that validates your logic at the lowest level possible.
So when you think about the
spectrum of testing,
so to speak, your unit tests are
going to be the ones that are testing the smallest
parts. Then you have integration testing or API
testing that will combine different parts
of your program and validate that. Then you have
your end to end tests above that which
have basically all of the parts of the program together.
And then on top of that, you have manual testing.
The thing that you need to remember about this is
that the higher you go
up on this pyramid, the more tests
you have to write in order to validate all
use cases. Think about that a second.
If you have a system that is very complex,
you will need to have an end to end
test that covers every
possible permutation and combination of
use cases of every single feature of
your app in order to have full test
coverage. To do that with
unit tests, however, because you're able to
isolate each component into its own area
and mock out dependencies and just focus on
the logic, you can write many more tests
which take a lot less code,
but give you a
higher amount of assurance that
you have actually covered all the use cases and that there aren't
any edge cases out there waiting to come
back to bite you in the middle of the night. Been there, done that.
Not fun. So unit tests
should be the basis of your
automated testing strategy.
That doesn't mean that you shouldn't have integration tests or end
to end tests. Having some of those is
often a good thing. However,
if you are using end to end tests to
make sure that you don't have to write unit tests,
that's going to be a problem. Now why is that
a problem? Well,
for that I'm going to go to my personal definition of unit
testing, and that is that unit
testing is the process of aligning requirements
with reality. Let that sink in a second.
The process of aligning requirements
with reality.
In this talk, I'm going to be showing you my
style of unit testing, which for lack
of a better term, I'm calling the intuitive unit testing pattern.
And I'm going to show you exactly
how requirements are going to
be tying directly to each and every one of your unit tests,
and how you can take your unit test suite
to a nontechnical stakeholder and be able
to show them, hey, these are the requirements
for the application as I understand them, am I
correct? And you can sit down and talk
through all of the edge cases that you have
been able to test for. This will provide greater
confidence between the stakeholder and the developer
that everything has been coded correctly and
that you are ready to ship whatever
feature or product that you are building.
Let's think a little bit further,
and this ties into the reason why we don't
want to have AI write our tests for us.
If you think about it, most of the time,
if you are able to give an
AI, like ChatGPT, for example, or GitHub Copilot,
a description of the requirements for a function,
it can spew out the code for you in
any language you want.
But in order to get really good
code, you have to be really good at prompt engineering.
In other words, being able to be very specific and
telling the AI exactly what
you expect as the outcome of
the code. Well, if you
take all of the time to do that, you have already done about
90% of unit testing.
If you can define the requirements in
a very plain, easy to understand, and
not easily confused way,
you have basically written an outline
for your unit test suite, for the component or class or
whatever function you are testing.
We'll get into that in just a moment here.
But I want you to think about that. We as developers
are paid not because we can
write code in whatever language, not because
we can write the business logic. The AIs can do
that for us now, but we are paid for
our genuinely human focused thought.
And to turn the process
of writing the tests over to an AI
would be to lose out
on a valuable opportunity
that you as a developer can take to make sure
that all of the requirements are well understood by you
and that you can agree with the stakeholders
what those requirements are. That is
super crucial. And I think that unit
testing, as a result, is one of the most important skills,
if not the most important skill,
for a serious software engineer
to have: to be able to have that mindset,
being able to take requirements and break them
down and be able to prove through
automated tests that your code
works the way that you think it does.
Well, enough talk. Let's actually get into
a practical example here. We'll do this thought exercise
and I like to decouple the
ideas that I'm teaching about unit testing
and the thought process that you go through. I like to decouple
that from software in
general because this is really about reasoning with requirements.
So we're going to talk about an object
that hopefully everyone has experience with and can
relate to, and that is going to be a door.
Now, because this is a unit test that we are trying
to achieve here, we don't want to try
to tackle the entire door system at once.
Instead we want to find the smallest
piece that can be interacted with and
test that. So for that,
I'm going to go with the privacy doorknob.
This is the kind of doorknob that you may be able to find
on a bedroom door or a bathroom door. Pretty commonly,
some of them will have a turning mechanism
for the lock or a button to push.
For this one, we're just going to use this example
that I grabbed off of Amazon of
this door that has a knob
on the outside, a knob on the inside, a push button,
and a latch bolt. And we will just focus on that
and start talking about what kinds of requirements
you would expect for this
knob to be working properly.
Okay, so we're going to start
just by writing sentences that
first define the thing, the "what" that we are testing,
and then explain that when we
do something to it, we want
this result to happen. So our first sentence that
we can use to describe the behavior is: a
privacy doorknob, when the push button is not
pressed and the user turns the inside knob,
should also turn the outside knob.
Okay. Some of you may have
seen sentences like this before
using Gherkin or Cucumber syntax,
and they will use
given/when/then statements to
be able to describe this.
You are more than welcome to use that style.
I think it does help you to kind of think through the
parts of that structured sentence. But I'm just going to
use kind of a more natural language
sentence for my examples here.
So let's add another use case here.
A privacy doorknob, when the push button is not
pressed and the user turns the inside knob, should
retract the latch bolt. So not only is
the outside knob turning, but also the latch bolt is being
retracted when you turn
the knob. Okay, but what
about if you turn it one way versus the other way?
You need to always be thinking, am I being specific
enough? In this case, there are two different ways that you can turn
the knob either clockwise or counterclockwise.
And specifically, you expect the
opposite side of the knob to
be rotated the opposite direction with
respect to an observer from
that side of the door than the direction that you are turning
the other side. So when the inside knob is
turned clockwise, the outside knob should be turned counterclockwise
and vice versa.
Okay, so we are going
to just keep on piling on some more sentences
here. So I
believe we have both of those.
A privacy doorknob, when the push button is not
pressed, when the user turns the outside knob, should retract the latch bolt.
Yes, we had that one already, and now we have both
the clockwise and counterclockwise versions of
those. And you can see at this point we
are getting to have quite a few test cases
and we haven't even really scratched the surface. There's a lot of other things
we need to account for. And so this is
where I like to take a step back. When you
start getting overwhelmed, you start getting a little bit anxious about
how many test cases you are authoring here.
Take a step back. And we
are going to apply some
of our code quality principles
that we already know and love, like DRY, to these
test cases. We don't want everything to
be so overly wordy that we can't
really scan through and understand what
is happening and what the actual requirements are.
So let's restructure it to reduce the
duplication. We have a privacy doorknob,
when the push button is not pressed, when the user turns the
inside knob clockwise. And then we have two different
outcomes that we are expecting: should also turn the outside knob counterclockwise,
and should retract the latch bolt. This format is
way more palatable already and allows you
to kind of group things
that are set up the same way.
Now, you will notice that some of the results are still
duplicated, but that's okay because in
this case, we need to be very specific about
how we want those outcomes to be. And so it's
better to remove the duplication at the beginning of
the requirements and then have multiple
expectations that may be similar,
but should be really easy to reason with.
And we'll see that in just a moment.
With this new structure, we can now expand. We can
look into other use cases, such as when
you have the lock button pressed,
well, you shouldn't be able to turn the outside knob and
you should be able to turn the inside knob, but when you
do, it should pop the
lock button back out. And so we're
just continuing to iterate and to think about, okay,
what are some other things that we could add to
our behavioral specification
for this doorknob component?
Okay, so let's think about this also.
What if the door is locked, but it's open,
and someone tries to close the door,
thus pushing against that latch bolt?
Should the latch bolt stay stationary or
should it retract and allow you to close the door,
even though the
knob is in the locked state?
So you need to think about that as well.
There's also some exceptional use cases we should at least
consider. What if the user exerts
way too much force on the knob?
Should it break? Should it give
in at a certain amount of stress?
These are things that you should at least be
aware of. Even if there isn't a
graceful solution for each one of these exceptional
use cases, it may be okay to say that it breaks,
right? But you need to at least be
aware, okay, these are the ways that it can break.
And I'm okay with that.
Now, I said earlier that once you have all of your requirements
together, you have 90% of your unit tests done.
So I'm going to try to make good on that promise as
we transition from this bulleted list of
requirements to actual working unit tests.
So we're going to be using this pattern
called AAA, or arrange, act, assert, to
implement our unit tests. And we'll touch
a little bit on that in a second.
But basically, arrange is all about
the preconditions that need
to be fulfilled in order to be able
to run your test case, so all
of those "when" statements in our requirements.
Then there's the action, which is actually the last
"when" statement. So when the knob is turned,
this is the thing that we're not sure whether it's
going to work properly or not, and so we need to test it.
And then the assert is just about observing
the side effects or the return value or whatever
result comes from that action that
you just tested. So we're going to apply this
to a doorknob that I wrote in
PowerShell. You don't have to look too carefully
at this because it's
not really the implementation here that's important.
It's about whether your unit tests can verify
that the implementation is correct. And I
personally don't write classes in PowerShell very often because
there are a lot of gotchas there.
And so this will just kind
of be a sample.
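
The transcript does not include the doorknob code shown on the slide. As a
rough idea of what such a sample could look like, here is a minimal sketch
built only from the requirements listed so far. The names PrivacyDoorknob,
KnobInteractionResult, PressButton, Turn, and IsLocked, and the string values
used for directions and bolt states, are assumptions made for illustration,
not the speaker's actual implementation.

```powershell
# Hypothetical sketch of a privacy doorknob; not the speaker's actual class.

# Describes what happened to each part during one interaction.
# A property left as $null means that part did not move.
class KnobInteractionResult {
    $InsideKnob    # 'Clockwise', 'Counterclockwise', or $null
    $OutsideKnob   # 'Clockwise', 'Counterclockwise', or $null
    $LatchBolt     # 'Retracted' or 'Extended'
}

class PrivacyDoorknob {
    [bool] $IsLocked = $false

    # Pushing the button locks the knob.
    [void] PressButton() {
        $this.IsLocked = $true
    }

    # $knob is 'Inside' or 'Outside'; $direction is 'Clockwise' or 'Counterclockwise'.
    [KnobInteractionResult] Turn([string] $knob, [string] $direction) {
        $result = [KnobInteractionResult]::new()

        if ($this.IsLocked -and $knob -eq 'Outside') {
            # Locked from the outside: nothing rotates and the bolt stays out.
            $result.LatchBolt = 'Extended'
            return $result
        }

        # Turning the inside knob pops the lock button back out.
        $this.IsLocked = $false

        # Seen from its own side of the door, the other knob turns the opposite way.
        $opposite = if ($direction -eq 'Clockwise') { 'Counterclockwise' } else { 'Clockwise' }
        if ($knob -eq 'Inside') {
            $result.InsideKnob = $direction
            $result.OutsideKnob = $opposite
        }
        else {
            $result.OutsideKnob = $direction
            $result.InsideKnob = $opposite
        }
        $result.LatchBolt = 'Retracted'
        return $result
    }
}
```

Returning a result object that lists what moved keeps every requirement
observable from a single call, which is what makes one-line assertions
possible later on.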
Okay, so in Pester,
which is the unit testing framework of
choice for PowerShell, you have
this really great Describe, Context, It
structure that you can use. This is called behavior-driven
development, where you're able
to basically use these given/when/then type
statements and turn them into tests.
So we're going to fill them out.
First of all, we'll just paste our outline there.
And this is actually admittedly
a part of the implementation that you could use
AI to help with. You just don't want it touching any
of these sentences, you know.
Now we're down to code slinging and AI is pretty good
at that. But wait, because there are
some patterns and gotchas that we're going to cover,
and you don't want to just let
the AI go crazy unchecked.
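
To make the step of pasting the outline concrete, here is a sketch of how the
requirement sentences could map onto Pester's Describe, Context, and It
blocks before any test logic exists. It assumes Pester 5. The second and
third It names are paraphrased from the requirements rather than quoted from
the slides.

```powershell
# Doorknob.Tests.ps1 (hypothetical file name): outline only, no test bodies yet.
Describe 'A privacy doorknob' {
    Context 'when the push button is pressed' {
        Context 'when the user tries to turn the outside knob clockwise' {
            # -Skip reports each test as skipped until its body is written.
            It 'should not turn the outside knob at all' -Skip { }
            It 'should not turn the inside knob at all' -Skip { }
            It 'should leave the latch bolt extended' -Skip { }
        }
    }
}
```

Each nesting level carries one clause of the sentence, so reading from the
outer Describe down to an It reproduces the full requirement.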
Okay, so we took that first bulleted
list. There we have our outer describe.
We've got a context block now for the
when. And then the cool thing is you can nest these contexts
as deep as you want to. So if you have when this
happens and this happens and this happens, they can all have their
own context. Then you have your three it statements
here and those are going to be kind of the leaves of
your requirements tree. Okay, so we're
just going to zoom into this test
case here and implement that using arrange,
act, assert. So to arrange, there are
a few things we need to do. First of all, we need to bring the
code that we are trying to test into context. In
PowerShell, we do that by sourcing that file into
this file. Next we're going to
create an instance of that class that we've
created and then we are going to set the
state that we want to test
against. So this is
a test where we're seeing if it's locked.
Someone tries to use the outside knob,
it shouldn't turn. Okay, so the action is the
person trying to do the thing. Right.
So we've got knob turn, outside knob, and we're turning it
clockwise. For the result that comes out,
we're checking the outside knob property,
and it should be null. And just to kind
of, let's see, back
on this one here we've
got this
knob interaction result, which is kind of the data structure I put together
for the output of this so there may be an
inside knob, an outside knob, or a latch bolt property
on there describing what happened to them, whether the
inside knob got rotated, the outside knob got rotated, the latch
bolt was
extended or retracted. So basically we
come back here and this
is a pretty good arrange act assert implementation
for our unit test. And it's pretty clear,
if you take a look at it, what the different steps
are. So we can take this and do it
with the next two test cases. This is the
one that we did already. So the setup
is going to be the same for this. So we source
that in, we create the instance,
we press the button, we turn the outside knob.
This time, instead of looking at the outside knob, we're looking at
the inside knob and
that one should also not be turned.
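
The first two test cases described above can be sketched as follows, with
arrange, act, and assert written out in full inside each It. This assumes
Pester 5. So that the file runs on its own, it defines a small stand-in
function, New-PrivacyDoorknob, which is purely hypothetical: the talk instead
sources the file containing the doorknob class and creates an instance of it.

```powershell
BeforeAll {
    # Hypothetical stand-in for the code under test so this file runs on its own.
    # Only the locked, outside-knob case is modeled.
    function New-PrivacyDoorknob {
        $knob = [pscustomobject]@{ IsLocked = $false }
        $knob | Add-Member -MemberType ScriptMethod -Name PressButton -Value { $this.IsLocked = $true }
        $knob | Add-Member -MemberType ScriptMethod -Name Turn -Value {
            param([string] $Which, [string] $Direction)
            if (-not ($this.IsLocked -and $Which -eq 'Outside')) {
                throw 'Only the locked outside-knob case is modeled in this stand-in.'
            }
            # Locked: neither knob rotates and the bolt stays extended.
            [pscustomobject]@{ InsideKnob = $null; OutsideKnob = $null; LatchBolt = 'Extended' }
        }
        $knob
    }
}

Describe 'A privacy doorknob' {
    Context 'when the push button is pressed' {
        Context 'when the user tries to turn the outside knob clockwise' {
            It 'should not turn the outside knob at all' {
                # Arrange
                $knob = New-PrivacyDoorknob
                $knob.PressButton()

                # Act
                $result = $knob.Turn('Outside', 'Clockwise')

                # Assert
                $result.OutsideKnob | Should -BeNullOrEmpty
            }

            It 'should not turn the inside knob at all' {
                # Arrange
                $knob = New-PrivacyDoorknob
                $knob.PressButton()

                # Act
                $result = $knob.Turn('Outside', 'Clockwise')

                # Assert
                $result.InsideKnob | Should -BeNullOrEmpty
            }
        }
    }
}
```

Only the final assertion differs between the two tests; everything else is
repeated setup.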
And then for the latch bolt, we do all
the same things again. But instead we're
going to be looking at the latch bolt property. It should remain extended,
it should not retract when someone tries to open the door
when it's locked. Okay,
now this is where things start getting really
cool. If you run Pester,
and I'm not using the default configuration,
I'm telling it to use the detailed verbosity,
which I really wish was the
default, quite frankly, look at
what you get coming out of this: "Describing a privacy doorknob.
Context when the push button is pressed. Context when the user tries to
turn the outside knob clockwise. Should not turn the outside knob at
all." This is our outline. It's come back
out on the other side of the tests. And now we
can say, yes, this particular requirement
down to the t is fulfilled by our code.
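
The detailed output described above comes from raising Pester's output
verbosity. A minimal invocation, assuming Pester 5 and a test file named
Doorknob.Tests.ps1 (a hypothetical name) in the current directory, might look
like this:

```powershell
# Assumes Pester 5 is installed; the file name is hypothetical.
Import-Module Pester

# -Output Detailed prints every Describe, Context, and It name as it runs,
# so the requirement outline comes back out in the test report.
Invoke-Pester -Path ./Doorknob.Tests.ps1 -Output Detailed
```

The same setting is available as Output.Verbosity on a Pester configuration
object if you prefer to keep it in a shared configuration.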
And this is the really exciting thing is when you
can start seeing a bunch of these stacking up and
being able to see all of your requirements coming
to life. Okay,
but it's not all sunshine and
roses, because if we look here,
we have all this duplicated code here.
And if we start getting to tens
and hundreds of unit tests,
having all this extra noise here that's duplicated makes
it way harder to maintain these tests,
especially since we've
already kind of established that all
the tests within this context
here should have a doorknob
and it should have the button pressed already. And so
how do we deduplicate this? We're going to apply some
DRY principles here. And this is where you get to use
BeforeEach blocks, or before each statements
or functions.
And this is really where we go from normal
unit tests to what I like to call intuitive unit testing.
So we're going to create a BeforeEach
block in every single one of these blocks
here, not inside the It, because there's only one of
those. You can't have anything nested inside of an It.
But inside these BeforeEach blocks, we want to define anything
or do any of the setup that
is required to fulfill what is right above it
here. So, "a privacy doorknob": in order to have this
fulfilled, we actually need an instance of a privacy doorknob.
So that means we need these two lines in
our BeforeEach.
So we got that. And now we
should be able to get rid of all of that duplication
in the other test cases.
Okay.
And we got some extra code there that I
needed to clean up there. Okay, so we've
got our arrange.
We still have knob press button. And you'll notice that
is defined here. So we should do that in
the before each right here. So we're going to move that up
here. And now look at
that. Our arrange block is completely gone there.
That's weird.
Okay,
let's go ahead and clean all that up.
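
At this point in the refactoring, the setup for each clause of the sentence
lives in a BeforeEach directly under that clause, and only the action and the
assertion remain in each It. A sketch of that intermediate state, assuming
Pester 5, where a variable set in BeforeEach is visible to nested BeforeEach
blocks and to the It. New-PrivacyDoorknob is again a hypothetical stand-in so
the file runs on its own.

```powershell
BeforeAll {
    # Hypothetical stand-in for the code under test so this file runs on its own.
    # Only the locked, outside-knob case is modeled.
    function New-PrivacyDoorknob {
        $knob = [pscustomobject]@{ IsLocked = $false }
        $knob | Add-Member -MemberType ScriptMethod -Name PressButton -Value { $this.IsLocked = $true }
        $knob | Add-Member -MemberType ScriptMethod -Name Turn -Value {
            param([string] $Which, [string] $Direction)
            if (-not ($this.IsLocked -and $Which -eq 'Outside')) {
                throw 'Only the locked outside-knob case is modeled in this stand-in.'
            }
            [pscustomobject]@{ InsideKnob = $null; OutsideKnob = $null; LatchBolt = 'Extended' }
        }
        $knob
    }
}

Describe 'A privacy doorknob' {
    BeforeEach {
        # Arrange for "a privacy doorknob": every test below needs an instance.
        $knob = New-PrivacyDoorknob
    }

    Context 'when the push button is pressed' {
        BeforeEach {
            # Arrange for "the push button is pressed".
            $knob.PressButton()
        }

        Context 'when the user tries to turn the outside knob clockwise' {
            It 'should not turn the outside knob at all' {
                $result = $knob.Turn('Outside', 'Clockwise')
                $result.OutsideKnob | Should -BeNullOrEmpty
            }

            It 'should leave the latch bolt extended' {
                $result = $knob.Turn('Outside', 'Clockwise')
                $result.LatchBolt | Should -Be 'Extended'
            }
        }
    }
}
```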
Okay. And now the next thing that's duplicated
is this, when the user tries to turn
the outside knob clockwise. So we could put that
in here. Now, the one
thing that I don't like about this
is we're kind of just treating this as if
it's just another before each, which is kind of
like our arrange part of our unit test.
And I'd like to be a little bit more specific
and actually point to the fact that this is the action
and you can throw a comment on that. Yeah, I guess that makes sense.
You could do act here, right. But then you'd
have to go like arrange
and arrange. Right.
Another way I like to do it is to just
define the action as
a function, or in Powershell's case, a script block
that you can call whenever you are ready
for that. And so we
have different actions that we can do.
Maybe we do the outside knob here, but inside this context,
we could also do the inside knob. But this
is where we're actually defining our action.
And sometimes that might be at the very top, depending on how your
test is structured. If you're testing a class versus
just a function or a script file.
But since this is where we are basically saying what
the action is, this is
where the before each is. And I'm just going to put action
equals, and I'm going to put
a script block here. And so I'm
going to paste that into there.
And then we
can say Invoke-Expression,
or sorry, it's Invoke-Command, script
block, action.
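
For readers new to script blocks, here is a small standalone sketch of
defining one and running it later. The variable names and the string it
returns are placeholders chosen for illustration.

```powershell
# A script block stores code in a variable without running it.
$action = { 'turning the outside knob clockwise' }

# Two equivalent ways to run it once you are ready:
$viaCommand  = Invoke-Command -ScriptBlock $action
$viaOperator = & $action
```

The call operator (&) is the shorter form and does the same thing here.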
And that's how you run a script block. So here
it doesn't make quite as much sense because we just define
it here and then we immediately call it. But if you had this
action defined up here because
you were only testing a single function, but then you had a bunch of
different setup steps, it would be nice to
say this is my action, it's already out of the way,
and then I don't have to worry about it until I'm ready to basically execute
that action. Okay,
so I'm going to put that back like this since that's not the case for
this particular test suite.
And another thing that we probably will want
to do, and I can go through and clean these up.
You can see that we've basically gone from,
what was it like eight to twelve lines per test case. We're down
to just one in each it. And it's just a single assert
that ties into this
statement here should not turn the outside knob at all. And you're
just basically translating that statement into code
as an assertion. Okay.
One of the other things that I'll generally do,
and this is very specific to pester, is that
sometimes the scope between
these different script blocks can
get kind of weird. In earlier versions of Pester,
you could define things outside here like
myVar equals 2.
I promise I can type. But in the latest
versions of Pester, this will basically be completely
invisible to your tests. All they will see are things that
are inside your BeforeEach and your
It. And that's it.
So because of PSScriptAnalyzer
saying, oh no, things aren't being used, and,
I can't remember, maybe sometimes it actually doesn't carry
over the scope the way that I expect it
to, I like to turn these variables into script-scoped variables,
which means that these variables
will live during the entire lifetime of
this script file. And so you don't have to worry if you're
in a script block or in some
other weird situation. You can just put
the script: prefix on there and then you don't have to worry
about the scoping of those variables.
Okay. And then I'll bring script
onto here as well so that we don't have to worry about the scoping
there.
And paste. There we go.
Okay, so this is good. To generalize it
just a little bit further, I like to
use some terms. So we've already got action here
to kind of identify that
this is the action part of our arrange, act, assert.
But you also can do some other things, like instead
of calling this script:knob, you can be a
little bit more descriptive in the testing vernacular
and say that this is our subject
under test, SUT.
And so that is saying that in this
test case, this is the test subject.
It is the thing that we are going to be acting on
in our action. And so that's kind of a
good way of checking over things and making sure that things are
right. Does your action function or
script block call something on the subject under test?
If not, you might be accidentally testing your own test
or testing your own mocks, which will lead
to a lot of problems there.
And then, yeah, this is just a nice way
of assigning a variable outside of that, which you
can assert against.
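
Putting the pieces together, the finished shape of the suite might look like
the sketch below: script-scoped variables named sut, action, and result, a
BeforeEach under every Describe and Context, and a one-line assertion in each
It. It assumes Pester 5. New-PrivacyDoorknob is a hypothetical stand-in so the
file runs on its own, and the variable name result is an assumption, since the
transcript does not name it.

```powershell
BeforeAll {
    # Hypothetical stand-in for the code under test so this file runs on its own.
    # Only the locked, outside-knob case is modeled.
    function New-PrivacyDoorknob {
        $knob = [pscustomobject]@{ IsLocked = $false }
        $knob | Add-Member -MemberType ScriptMethod -Name PressButton -Value { $this.IsLocked = $true }
        $knob | Add-Member -MemberType ScriptMethod -Name Turn -Value {
            param([string] $Which, [string] $Direction)
            if (-not ($this.IsLocked -and $Which -eq 'Outside')) {
                throw 'Only the locked outside-knob case is modeled in this stand-in.'
            }
            [pscustomobject]@{ InsideKnob = $null; OutsideKnob = $null; LatchBolt = 'Extended' }
        }
        $knob
    }
}

Describe 'A privacy doorknob' {
    BeforeEach {
        # Arrange: the subject under test.
        $script:sut = New-PrivacyDoorknob
    }

    Context 'when the push button is pressed' {
        BeforeEach {
            # Arrange: the precondition named by this Context.
            $script:sut.PressButton()
        }

        Context 'when the user tries to turn the outside knob clockwise' {
            BeforeEach {
                # Act: define the action, then run it now that arranging is done.
                $script:action = { $script:sut.Turn('Outside', 'Clockwise') }
                $script:result = & $script:action
            }

            # Assert: one expectation per It.
            It 'should not turn the outside knob at all' {
                $script:result.OutsideKnob | Should -BeNullOrEmpty
            }

            It 'should not turn the inside knob at all' {
                $script:result.InsideKnob | Should -BeNullOrEmpty
            }

            It 'should leave the latch bolt extended' {
                $script:result.LatchBolt | Should -Be 'Extended'
            }
        }
    }
}
```

A quick check on a suite shaped like this is to confirm that the action
script block calls something on the sut variable and nothing else.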
So this is pretty cool. And with
this in place, we really could
continue to implement the rest of these really easily
as long as we keep putting in those before each
blocks where they make sense and things
just kind of become very intuitive and very easy to implement
as far as the testing is concerned.
Things are not cluttered, things are not duplicated.
You can come through here, if you were to get a test failure,
you would know exactly where to go and what
was failing and why without having
to interpret lots of error messages. So to
recap, the intuitive testing pattern as it
applies to PowerShell and Pester is that you
always want to have a BeforeEach block
right inside of each Describe and
inside of each Context, and
this is where you're going to do all of your arrange
steps and define your action.
And once you've gotten to the point where you are
to the innermost context,
you can invoke that action.
The Its should be one-liners if at all possible.
Even though it's possible to have lots of different assertions
in the same test,
it's not great. It would be better to make an it that has
a good description as the name and then have a one liner expectation.
It does take a little bit of getting used to, to think about
writing tests this way. But I found that,
whether I'm writing in PowerShell or JavaScript or C#
or whatever language I happen to
be working in that day, this pattern
works really, really well and will help
you to be very successful when testing.
So I hope you now understand why
it is that we don't want to just hand over the task
of unit testing to an AI. If I were to go to
ChatGPT right now and tell it to write me a
unit test suite for a doorknob,
it is unlikely that we would get the same level
of maintainability in the output, because a lot of the training
data is based on unit
tests that were not written this way. And unfortunately,
one of the big barriers to unit testing
is the precedent of a lot of
duplicate code, a lot of test names
that don't actually tie to requirements, but are
something like "test one." Right?
I know you've seen that in codebases before. I've seen it
in so many places. That is the
input for the AIs. And until the AIs
are trained in how to write
tests in an intuitive way like this,
I wouldn't trust the AI to write your tests for you.
But again, the requirements gathering
step, regardless of how well the AI can
churn out the code in the right format, you want
to be in charge of your product's requirements.
That is why you are a
human and not a computer writing software.
Before we wrap up, let's talk about one more important thing,
which is how to make sure that you don't over test,
making superfluous test cases that don't
actually add any actual value to your
confidence level. Let's consider this
PowerShell function called Remove-Vowel. It takes
in a string that is the input,
and it'll output a string that is
the same as the input string, but having removed
any vowel characters.
Okay, so the things that you
should be testing: an empty string, as
kind of a boundary test case. Yeah.
What if somebody passes in null or an empty value or
just a value with nothing but
white space in it? Right.
You should test a small string with some vowels, a small string
with only vowels, a small string with no vowels,
a very large string. Make sure that performance wise,
it doesn't have problems. And then you can also
throw in a string with complex Unicode characters
like emoji or other languages
that you wouldn't necessarily expect.
But you should know what's going to happen if
you throw something like that at it. The world is getting
a lot more international every day, and you should be
thinking about more than just your current locale.
But that being said, you don't see
in this list that we should test it with every
possible combination of letters.
Right. That's not really feasible. It doesn't make sense to
do that. Instead,
you're looking at classes of input and
making sure that you have a test case that
will cover each of those classes of input.
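
One way to express "one test case per class of input" in Pester is with
-TestCases, sketched below. This assumes Pester 5. The implementation of
Remove-Vowel, its parameter name InputString, and the choice to treat only
a, e, i, o, and u as vowels are all assumptions, since the talk describes the
function's behavior but the transcript does not show its code.

```powershell
BeforeAll {
    # Hypothetical implementation; the talk only describes the behavior.
    function Remove-Vowel {
        param([string] $InputString)
        # -replace is case-insensitive by default, so upper and lower case vowels both go.
        $InputString -replace '[aeiou]', ''
    }
}

Describe 'Remove-Vowel' {
    # One row per class of input, not one row per possible string.
    It 'given <Label>, should return "<Expected>"' -TestCases @(
        @{ Label = 'null';                      Text = $null;      Expected = '' }
        @{ Label = 'an empty string';           Text = '';         Expected = '' }
        @{ Label = 'only white space';          Text = '   ';      Expected = '   ' }
        @{ Label = 'a string with some vowels'; Text = 'doorknob'; Expected = 'drknb' }
        @{ Label = 'a string with only vowels'; Text = 'aeiou';    Expected = '' }
        @{ Label = 'a string with no vowels';   Text = 'rhythm';   Expected = 'rhythm' }
    ) {
        Remove-Vowel -InputString $Text | Should -Be $Expected
    }

    It 'given a very large string, should still remove every vowel' {
        $large = 'ab' * 100000
        Remove-Vowel -InputString $large | Should -Be ('b' * 100000)
    }

    It 'given an emoji, should leave it untouched' {
        # Built from its code point so the file does not depend on its own encoding.
        $door = [char]::ConvertFromUtf32(0x1F6AA)
        Remove-Vowel -InputString "open $door" | Should -Be "pn $door"
    }
}
```

Adding a new class of input later, such as accented vowels, means adding one
row or one It, not another combination of letters.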
Okay, so with that,
I know that there is so much more that we
could cover. We've just barely scratched the surface.
But what I really wanted to do today was
help you to understand how unit testing can
actually be an exciting and fun thing to do,
rather than feeling like it's a chore.
But yeah, we could have easily gone over
how to implement this in continuous integration and
continuous deployment pipelines, as
this is a DevOps conference, but I felt like this probably
will benefit you guys more in the long
term if you get this idea that
unit testing is important and it's fun and
you're not going to be perfect at it right away.
Try going through the same process that we've gone through today with
the doorknob and going from distilling your
requirements down to arrange, act, assert, and finally
into the intuitive testing patterns that we've demonstrated
here. Just keep on practicing and you'll get better
and your code will get better. You'll start thinking about
those edge cases a lot more than you did previously,
and your code is going to be a whole lot more robust.
So thank you so much for watching. Again,
feel free to contact me.
I basically do this kind of thing full time,
helping out other developers learn how to up their skills
and up their game, and so feel free to reach out.
And I'd be happy to repeat
some of this and tailor it to
your specific team's needs if
you would like something like that. Again, thank you to Conf
42 for letting me speak here,
and I hope you guys have a great time at the
rest of the conference, and good luck unit testing.
See ya.