Transcript
This transcript was autogenerated. To make changes, submit a PR.
Welcome everyone.
I'm Ludovico.
And today we're going to explore how prompt engineering can enhance test
automation by improving the accuracy, speed, and overall reliability
of AI-driven quality assurance.
But before starting, let me introduce myself.
I'm Ludovico, nice to meet you.
I'm a senior test engineer at NearForm, a SheTech ambassador, and a Grafana Champion. I also co-founded several startups in the tech field. I like to contribute to open source, mainly in the educational field, and I have some passions like cars, animals, and video games.
But let me introduce my company, NearForm, in a few quick words. NearForm is a software company from Ireland. We really love open source and contribute to it a lot, and right here in this slide you can find the numbers about our contributions in the open source world. If you're interested in seeing more, you can check the website, and by the way, we are hiring, so check that out too. Now, let's start with the talk.
This session will address critical challenges in test automation. We will introduce some practical techniques and show how prompt engineering can help you, let's say, build up your skills for future-proofing QA processes with AI.
So let me say why I bring you this talk: because we want to solve real challenges that we have in the QA world, gain some practical, hands-on skills to check whether prompt engineering can solve our problems, and also stay ahead of the AI-driven QA trends that are going on right now. So what is prompt engineering?
I created a simple, quick definition of it. For me, it is designing precise, contextual inputs to guide AI models towards specific outputs. In other words, we can say it is the art and science of crafting the right inputs so that the model's responses align closely with your goals. But what is the connection between prompt engineering and QA? In the context of quality assurance, prompt engineering is particularly valuable because it allows us to create test cases, simulate real-world user actions, and handle various edge cases.
It also lets us validate the outputs produced by the model itself. This is essential to ensuring that AI-driven products are reliable and perform as expected in different scenarios.
And before starting, let's see what we need to clarify, because I think that success with prompt engineering looks something like this. First, we need to define, as you can see, measurable outcomes that align with your use case. Ask yourself what a successful response from the AI model should look like in your specific context. This can be accuracy, relevance, clarity, or any kind of output format that is essential for you and for your case.
Second, the evaluation method: with the success criteria in place, you can set up evaluation methods to measure how well your prompts are performing. For example, you can start by writing down, in plain text, the outputs you want to have for a test case. Say I want to create test cases for a button, and I want the output of my prompt to be those particular test cases: I can write that expected outcome down, and this enables testing the prompt, because you have an outcome written down and you can check whether the prompts you are using give you the right output that you defined beforehand. It's important to track performance over time, because this directly improves the expectations and the performance of the prompt itself.
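To make that concrete, here is a minimal Python sketch of testing a prompt against written-down expected outputs. The generate() stub, the prompt template, and the expected keywords are all invented for the illustration; in practice generate() would call a real model API.

```python
def generate(prompt: str) -> str:
    # Placeholder for a real model call (OpenAI, Anthropic, etc.).
    return "TC-01: Click the 'Buy' button and verify the cart count increases."

# (input for the prompt, keywords the output must contain to count as a pass)
test_cases = [
    ("Create test cases for the 'Buy' button", ["button", "cart"]),
    ("Create test cases for the login form", ["login", "error"]),
]

def evaluate(prompt_template: str) -> float:
    """Return the fraction of cases whose output contains all expected keywords."""
    passed = 0
    for case_input, keywords in test_cases:
        output = generate(prompt_template.format(task=case_input)).lower()
        if all(keyword in output for keyword in keywords):
            passed += 1
    return passed / len(test_cases)

# Re-run this after each prompt revision to track performance over time.
print(f"pass rate: {evaluate('You are a senior test engineer. {task}'):.0%}")
```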
The last point I've written down that you need to do to reach the goal is to start simple, with a first draft prompt, and refine it through many iterations. Your initial prompt doesn't have to be perfect; you can think of it as a baseline that you can improve upon over time. You need to begin with some kind of idea, and maybe you want to know whether that idea is something the AI you are using can accomplish. Then, as you go, you can use it to, let's say, reach your goal.
You can also use many different tools if you have problems structuring the prompt, like the Anthropic Console that you can find online. I know that Anthropic, if you look into it, does have a prompt generator that you can use. So if you have some struggles with prompt iteration or improving performance, you can use different tools like this one that may help you generate and improve prompts that align with your needs.
So moving on, we have this prompt engineering workflow, which is a screenshot from the Anthropic documentation. As you can see, it is a step-by-step workflow for effective prompt engineering. We start with "develop test cases", because, as I said before, we need to begin by defining test cases that cover a variety of scenarios and typical cases.
Then we move to "engineer preliminary prompt": with the test cases in mind, we can create an initial version of our prompts, and this will guide us towards the desired outputs we want to have. Next is the test step, where we actually run the prompt against our test cases to see how well it performs, evaluate the model's responses for each of the scenarios we defined, and note where improvements are needed.
So after we have executed our prompts, we can write down every single output, see where we can improve those outputs, and move to "refine prompt". Based on those results, we can refine our prompt to address any issues found during testing. This can be something like rephrasing, adding more context, or just adjusting instructions to make the prompt more effective.
And the last point here is "share polished prompt", because once the prompt consistently produces the desired outcome, so we are good to go across all the test cases we thought of, it's ready to be shared. We are confident that our prompt is performing well, so we can share it with the other team members, and they can say, "Hey, okay, this prompt is working fine," and they have the thing they need. The final polished prompt is the version that we have in production, and this is the goal of the process.
What we can also say about this is that this cycle of testing and refining is also called "evals", and it's crucial, for me and for Anthropic too, to achieve a high level of accuracy and reliability in the AI responses. Because if we talk with the AI and every time we need to create a prompt from zero, from scratch, the result will not be quality work: every time we execute a brand-new prompt, we have, let's say, no trust in the outcomes. And this can be a real pain over time. So really keep this process in mind and start working like that.
So now I'm giving you some examples that I've created to show you what a good prompt should look like. This is a simple prompt that I created, and as you can read, I give the context: the AI will start acting like a test engineer with 10 years of experience. After that, I say that I will give some screens as input, and once the screens are given to the prompt, it will output all the test cases in the given format, both as an Excel file and in the chat. So you will see the result in a bit.
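The exact prompt lives on the slide, but a prompt along those lines might read something like this (illustrative wording, not the verbatim prompt):

```
You are a test engineer with 10 years of experience.
I will provide screenshots of a web application as input.
For each screen, write all the test cases you can derive, as a table with
the columns: ID, Title, Preconditions, Steps, Expected Result, Priority.
Return the table in the chat and also as a downloadable Excel file.
```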
This is the input that I inserted into the prompt: just a simple website, an e-commerce one. And you can now see the output: a table with plenty of test cases formatted in the way I defined, and on the right, the Excel file, along with the link to download it. This is just a quick example, so we can move to the challenges in the traditional test automation field.
Over time, the test automation world has encountered different problems. Traditional test automation faces multiple challenges in real life, from adaptability issues to the high manual effort needed to maintain stability in CI/CD pipelines. I think prompt engineering can help address these challenges by enabling more adaptive test cases and also increasing the speed of testing, and of the fixes as well.
And here we are with some key techniques that we can use in prompt engineering. Each of these techniques, from zero-shot to few-shot to chain-of-thought prompting, is essential for creating precise and adaptable responses in different testing scenarios. Moving on to "beyond manual prompt engineering": while manual prompt engineering relies on trial and error, frameworks like DSPy allow us to, let's say, structure prompts like code. So instead of writing the prompt in simple plain text, we can actually use the DSPy framework to code the prompt, achieve more data-driven results, and get outputs that are much better than a single hand-written prompt. So a manual prompt can be a starting point, but maybe in the future, give the DSPy framework a try.
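As a rough illustration of structuring prompts as code, a DSPy program for test-case generation might look like this sketch. The signature fields and the model name are assumptions for the example, not something shown in the talk, and the API shape reflects recent DSPy versions.

```python
import dspy

# Configure a language model; the model name here is just an example.
lm = dspy.LM("openai/gpt-4o-mini")
dspy.configure(lm=lm)

class GenerateTestCases(dspy.Signature):
    """Generate test cases for a described UI feature."""
    feature_description = dspy.InputField(desc="plain-text description of the feature")
    test_cases = dspy.OutputField(desc="numbered test cases with steps and expected results")

# ChainOfThought makes the model reason step by step before answering.
generate = dspy.ChainOfThought(GenerateTestCases)
result = generate(feature_description="Login form with email and password fields")
print(result.test_cases)
```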
How can we efficiently manage and track the performance of different prompts in a dynamic QA environment? This is a question that different people have asked me, and I answer it with PromptLayer. PromptLayer is a tool that allows you to easily create and manage different prompt versions in one platform. You can track different versions of a prompt, and you can log and analyze how each prompt is actually doing, whether it's performing well or not. So you can track effectiveness to see which of the different prompts performs better over time. And this is, I think, a really good way to iteratively upgrade and improve the prompts with a platform like this. For QA, for example, we can use it to identify which prompts are performing well and quickly apply updates for the cases we wrote down before, using the flow I presented to you earlier.
Here are some screens of PromptLayer. As you can see, it's a simple platform. You have different features, like creating prompt templates, logging requests, and so on, as well as using the templates. And this is the editor that you can use to actually write the prompt and then check whether the prompt is performing well.
Right now I'm giving you some advice that I use myself. On my side, I actually build a custom GPT for each of the projects I'm working on, because I think that right now a custom GPT is, let's say, the easiest way to create something that can really help you out with your work. Let me unpack that for you: what is a custom GPT? ChatGPT itself, if you pay for the plan, allows you to create these custom GPTs, which are chatbots that you can customize by inserting the instructions that the chatbot will execute for you. Along with the instructions, you can also give data to the chatbot, and it will use that data to generate better outputs. Taking the user story, the technical details, and the screens of a project, and building a custom GPT with them, will increase the chance of getting better outputs in the context of your project. So please give it a try.
And I think in this case, as you can read in the second line, each of the prompts we create, maybe using PromptLayer or other tools, will be more project-specific for you, so you will have better outputs as well.
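As an illustration only (the project name, conventions, and table columns are invented, not the actual instructions from my projects), the instruction block of such a project custom GPT could look like this:

```
You are the QA assistant for the "Acme Shop" project.
Knowledge: the attached user stories, technical details, and screen designs.
When asked for test cases, always:
- reference the relevant user story ID,
- use the project's naming conventions for screens and components,
- output a table with ID, Title, Steps, Expected Result, Priority.
If the knowledge files don't cover a question, say so instead of guessing.
```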
Another thing that I really find valuable is creating documentation chatbots with custom GPTs. For each tool, like Maestro, Playwright, or anything else, what I do is create a chatbot that takes the markdown pages of each tool's documentation. After that, let's say, I have a powerful chatbot that knows everything about that single tool and can answer my questions based on the latest versions of the tool. Because sometimes when you ask ChatGPT some questions, it will give you outdated code and versions for the tools you are using and asking about. So I think using this approach improves the code that actually gets generated by ChatGPT in this case.
And this is an example. I just took Maestro for you, and you can see that I'm giving some instructions, for example to answer questions about Maestro and generate code based on the official documentation that I'm providing to ChatGPT with this upload. So I took every single page from the official Maestro documentation, put it in this chatbot, and it actually works. The outputs that I receive from this chatbot are much better than the outputs I would get by prompting in a plain, fresh ChatGPT page. So this is something you can try, to check if it works for you.
Another topic is agents. Where prompts give you a, let's say, static result, with agents you get much more dynamic interaction, because agents can pull real data from the outside, interact with APIs, and adapt to different contexts, making the responses more precise. We don't have the time to look at specific agents, but I know there are agents out there that are starting to target the software testing field as well. So please check this out, because it's something that can also improve your outputs if you need something like that. A toy sketch of the pattern follows.
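This is a toy ReAct-style loop in Python, just to show the shape of the idea: the model can request a tool call, the tool's real output is fed back as an observation, and the final answer adapts to it. The tool, the ACTION format, and call_model() (a stub for a real model API) are all invented for this illustration.

```python
def call_model(prompt: str) -> str:
    # Stub for a real chat-completion call; here it asks for the tool first,
    # then answers once an observation is present in the prompt.
    if "Observation:" in prompt:
        return "Order A-42 has shipped, so the delivery-tracking test can run."
    return 'ACTION: get_order_status("A-42")'

def get_order_status(order_id: str) -> str:
    # A real agent would hit a live API here instead of returning a fixture.
    return f"order {order_id}: shipped"

def run_agent(task: str) -> str:
    prompt = f'Task: {task}\nAnswer, or reply ACTION: get_order_status("<id>").'
    reply = call_model(prompt)
    if reply.startswith("ACTION: get_order_status"):
        order_id = reply.split('"')[1]
        observation = get_order_status(order_id)  # real data pulled from outside
        reply = call_model(f"{prompt}\nObservation: {observation}\nFinal answer:")
    return reply

print(run_agent("Check whether order A-42 is ready for the delivery-tracking test."))
```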
Also, let's move to the knowledge memory topic. Knowledge memory, as we first saw in ChatGPT with custom GPTs when we passed in the markdown files, enables you to let the AI remember past interactions: previous prompts, previous outcomes, as well as previous data that you have passed into the chat. That is really beneficial for the tests you have in mind, especially those involving sequences of actions or complex workflows for your AI. So knowledge memory is really important, because, as we said before, it can iteratively improve the outcome of the prompts.
And we also define the term "prompt tuning", which you may hear in the future. That is something that optimizes the prompt itself, not the model, okay? Just the prompt. Because we can't improve the model by ourselves: the model has, let's say, rules and formulas behind it. So we can't improve the model, but we can actually tune the prompt. This technique is really useful when we want to expand our detailed test cases, or when we want to evolve the test cases we are using to check whether the prompts are actually working fine.
And yes, these are for me the best prompting techniques you can use: few-shot, chain-of-thought, and ReAct. Few-shot, as you can read, uses examples for nuanced tasks; chain-of-thought helps the AI expand its reasoning; and ReAct gives dynamic responses based on observations. A few-shot example is sketched below.
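Here the model gets two worked examples before being asked for a new one; the examples and the output format are invented for this sketch.

```
Write a test case in the style of these examples.

Input: "Add to cart" button on the product page
Output: TC-01 | Tap "Add to cart" | Cart badge count increases by 1

Input: Newsletter signup field with an invalid email
Output: TC-02 | Submit "foo@" as email | Inline validation error is shown

Input: "Apply coupon" field with an expired code
Output:
```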
Right now we are moving to these four phases that I've prepared for you. For me, it starts with design: when we in the QA world begin writing down the test cases, start thinking through all the scenarios, and start checking whether the things we wanted to do during user story creation are actually missing something. With all of these phases, I'm showing you possible ways to use prompt engineering to help you out as a test engineer in real-world scenarios. Starting from the design phase, prompt engineering can be used to score the automatability of the test cases that we want to implement.
So if you have a list of possible test cases that you want to cover, maybe you can ask with a prompt which are the main test cases you need to start automating. And this can be done even better if you use a custom GPT like the one I showed you before, the one that also has the context of the project. This can really help you out if you are in trouble and can't easily tell which test cases should, let's say, be automated before the others; a prompt along these lines is sketched below.
Another one is to generate diverse user personas, because, keep in mind, a possible user of your website may not be a "standard" one: maybe they are not using both hands, but just one hand. So I think having different user personas can be useful when you need to create more inclusive and comprehensive test cases for your test plan as well.
So let's move to the implementation phase, when we actually want to implement our code, our tests. We can say that during implementation, prompt engineering aids in identifying UI elements and generating diverse test data dynamically. As you can see, we have the XPath name suggester, something I have tried: you can pass in some images with some rules for each part, and it will tell you the name to use for each of the possible components on the page.
There is also a data-driven test generator that can help you out with generating diverse test data dynamically. So we are not talking about the test itself, but the test data: the possible configurations of the users, the possible ways a user can use your software, things like this; see the sketch below.
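The fields and constraints here are invented for the illustration:

```
You are a QA test data generator.
Produce 5 diverse user records as JSON for testing a checkout form.
Vary locale, name length and characters (including accents), email format,
and payment method, and include at least one edge case per field.
Return only a JSON array with the keys: name, email, locale, payment_method.
```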
And also automated code generation: some tools, like Auto Playwright, can actually generate Playwright code based on prompts. If you use this keyword-driven approach, it will start, let's say, replacing the hand-written code itself. Combined with tools like ChatGPT that we saw before, this lets you implement test automation faster, instead of doing it with code alone. Then there is the reporting phase, which is a really important one, because in the reporting phase prompts help you detect, maybe, a recurring failure. If you can pass in the logs and the reports produced by the frameworks you use, like Playwright on Node.js, you can prioritize the issues you have and ask the AI what to fix before anything else. This can make it easier for you to focus on the critical areas for remediation instead of checking everything manually.
Also, as you can see, you can create some kind of strategy around these failures: fixing practices based on risk, based on the issues that are going on. A prompt along these lines is sketched below.
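The steps and the priority scheme here are illustrative:

```
Below are the failure logs and reports from the last 3 CI runs.
1. Group the failures by root cause and flag any recurring failure.
2. Rank the groups by risk to the release (High/Medium/Low), with a
   one-line reason each.
3. Suggest which fix to tackle first and which tests to quarantine meanwhile.

<paste the framework logs and report excerpts here>
```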
And the last one is the reading reports phase: in this case, I can translate the test results into accessible summaries for those who are not technical. And this is, I think, a really great feature, because you can take something that is really technical, really techy, and translate it in seconds, so that others on the team, like the stakeholders, can read what you have done, what the reports say, and the data that might be hard to understand from the initial report. This can really help you assess the possible impact the tests are making in your team, and it can also guide the QA focus of the team. Such a translation prompt can be as simple as the sketch that follows.
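The word limit and structure here are assumptions:

```
Summarize the attached test report for non-technical stakeholders.
Use plain language, with no tool names or stack traces. Cover: what was
tested, what passed and failed, the business impact of the failures, and
what the QA team will focus on next. Keep it under 150 words.
```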
The last thing is to suggest the testing focus for new features: if we have all of this data stored in one place and can recall it with the memory we saw before, it can also guide us in improving the strategy behind creating a new feature. Because once you have collected all of this information and want to, let's say, start and create another feature, you can ask the AI, with all of this data, whether it's a good one or not, since we have the data to judge whether that feature might address some of the issues we have found.
And yes, three takeaways from this talk. I think prompt engineering transforms QA, as you can see, and I hope you get the chance to create something like the things I've shown you before, maybe a custom GPT, maybe trying out some prompts or PromptLayer, because in my opinion prompt engineering really can help you do your job, even as a QA. Effective tools and techniques already exist, like the Anthropic ones we saw before and also PromptLayer.
But really, out on the web, many tools exist. Maybe you can start on your own using ChatGPT prompts, and after that improve, maybe using some auto-prompting strategy, which is, let's say, a way of feeding prompts into the AI to iterate automatically until we get the output we need. Or maybe use an agent, which is, let's say, the next step; if you don't have many skills in this field yet, you can for sure start with the ChatGPT tools or the PromptLayer ones. And the last thing that I want you to remember is that these things bring real cost efficiency to QA processes. Think about it: there are a lot of people in the world writing test cases every day. With these kinds of technologies we can, let's say, reduce the number of people tied up purely in writing test cases, and focus ourselves more on automating things and checking whether the software is good or not, in a manual or automated way.
I think these are the three main points that I want you to remember: experiment with prompt engineering in your testing workflow, and start small and iterate, because if you try to do everything at once you will fail; it's not something you can accomplish in one easy step.
Thank you.
I hope you are really happy now, having discovered these things. And these are my contacts: you can reach out, write me a message, and I'll be really happy to hear from you.
Bye!