Transcript
This transcript was autogenerated. To make changes, submit a PR.
This is Srinivasa Bhitla.
Before I proceed, I would like to make a quick disclaimer.
All views expressed here are my own and do not reflect the opinions
of any affiliated organization.
Today's topic is going to be on revolutionizing software
testing with AI and ML, driving scalability and accuracy in QA.
So the agenda for today's talk: first I'm going to introduce software testing.
Then I'm going to explain the challenges in traditional testing
and introduce AI and ML testing,
and compare traditional testing with AI and ML testing.
Then I'm going to explain how AI and ML enhance testing,
where I'll cover some case studies and demos.
Then I'm going to talk about tools and technologies in AI and ML testing,
and the challenges in adopting AI and ML in testing.
Then, finally, I'm going to cover the future of QA and key takeaways.
Introduction to software testing.
Traditionally, how have we been doing software testing?
For any software or application under test, we carry out testing
to ensure that the quality standards are met and that the software
consistently works according to expectations.
We generally follow two types of testing: manual
testing and automated testing.
In manual testing, we traditionally capture the test cases and test steps
either in a test management tool or in a typical Excel spreadsheet.
We interact with the application under test, capture the results, and
mark each test as pass or fail.
And we traditionally execute step by step and ensure that each step
actually meets the expectations.
That is all about manual testing.
In automation testing, the same step-by-step interactions
are written as a script.
The script holds the expected results,
and that's what you're going to verify.
You write the verifications as a script, feed those scripts
to the automation tool, execute them, and see whether the
automated test passes or fails.
We generally introduce automation for repetitive tasks.
For example,
if you are running the tests on cross browsers, or if
you want to run on different versions of the software, or if you want to
do frequent releases, that's where automation testing is scalable.
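As a rough illustration of the scripted verification just described, here is a minimal sketch of an automated check, assuming Selenium WebDriver and ChromeDriver are on the classpath; the URL and expected title are placeholders, not from the talk.

```java
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

public class SmokeTest {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        try {
            // Application under test (placeholder URL)
            driver.get("https://example-shop.test/");
            // The expected result lives in the script itself, as described above
            String title = driver.getTitle();
            if (title.contains("Shop")) {
                System.out.println("PASS: title check");
            } else {
                System.out.println("FAIL: unexpected title -> " + title);
            }
        } finally {
            driver.quit();
        }
    }
}
```

The same script can then be repeated across browsers or software versions, which is where automation pays off for repetitive runs.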
So now let's get on to the challenges in traditional testing.
Manual testing is definitely time consuming, because a software
test engineer needs to sit
and interact with the system manually,
and this is definitely a time consuming thing.
And human errors are a typical problem:
if you miss any step, or if any test case is
missed, that is error prone.
And it is hard to scale.
When it comes to automation testing,
it is always high maintenance.
The reason is that if the software is undergoing frequent changes,
you have to update the scripts.
So that is where the challenges are.
And also when it comes to complex applications, you have to do
comprehensive testing, otherwise you can't release, especially for
financial, healthcare, or mission-critical products.
So definitely you have to do comprehensive testing.
That's where you have to spend quite a bit of time checking for quality.
So the question for you is, can current testing methods keep up
with the rapid software releases?
Let's see.
So that's where we introduce AI and ML testing.
So here we are going to talk about AI.
AI is nothing but artificial intelligence.
If I want to define it in a single sentence: it mimics human
intelligence for problem solving, right?
Yeah.
So you can always read more about AI elsewhere;
I don't want to go into too much detail on it.
Machine learning
learns patterns from existing data to make predictions.
So if you want to predict something, you need to give it some kind of feeder data,
and a large volume of data is always better for accurate predictions.
So this AI and ML, we are going to introduce to the testing groups:
how you can bring human-like intelligence into a tool, how you can
actually give data to the system, and how you can predict what the test results
are going to be in the future and how the system is going to behave in the future.
So that's what we are going to predict.
So now the question for you is: how could AI and ML reduce testing
bottlenecks in your projects?
Let's dig deeper and see what's coming up.
So this is where we are comparing traditional testing
versus AI-driven testing.
Here I divided the comparison into five features.
One is test case creation.
How is it actually done in traditional testing?
In a manual or scripted way.
In AI-driven testing, test cases are automatically generated
using artificial intelligence.
Then script maintenance: in traditional
testing it takes very high effort.
In AI-driven testing you get self-healing scripts,
meaning the tool itself will correct some mistakes on the fly.
Then defect detection: in traditional testing it is always reactive, right?
Once something happens, you go and act on it.
In AI-driven testing, you forecast and predict what can go wrong,
and then you come up with the remedies.
And scalability in traditional testing is very limited.
In AI-driven testing, it is highly scalable because of the lower effort
you are going to spend on maintenance. Accuracy: in traditional testing,
human errors are possible; in AI-driven testing,
it's high precision.
So the question for you is: which feature do you think makes AI testing superior?
You have five features, and that's food for thought;
you can think about what is appropriate based on your project.
How AI and ML enhance testing.
Here we have so many test artifacts we generate or create; in
traditional testing phases you manually produce, you
manually write, a lot of things.
In AI/ML-enhanced testing, AI will help you generate the test artifacts
based on the historical information that your organization already has.
So once you feed all the historical data to your specific model,
you should be able to generate a new document or a new test strategy;
the artifacts could be a test process, manual test cases, or automated
BDD tests generated from existing historical data.
To do that, of course, you need to have your machine learning
models deployed, you may need to feed in all the historical data
and train on it, and then you may be able to generate all of these.
So that's the process we are going to learn later.
But that's how it works.
Then self-healing scripts: if something changes while you are maintaining
your tests or running your tests, the scripts also get a chance to heal themselves.
And then you can also do predictive analysis, where you can forecast
high-risk areas in applications.
So what can go wrong, based on the historical defect trend and
the historical changes that happened in the application under test?
You're going to learn how to predict that.
So then, dynamic test strategy.
How do you generate a test strategy?
Here I am using the LLM, ChatGPT 4.0.
So I'm going to give a prompt to ChatGPT.
And throughout my presentation, I'm taking an e-commerce application as an example.
So here, the prompt that I'm giving to ChatGPT is: I work as
a software QA engineering manager.
Our organization is developing an e-commerce application.
Can you generate a test strategy for the e-commerce application?
This is the prompt I'm giving.
So now, what do I get as output from ChatGPT?
Here is the comprehensive test strategy generated by ChatGPT, where
it described the high-level objectives and also defined the
scope of the testing.
And then it also generates the testing approach you may need to follow:
what are the test levels and test types
that you may need to follow to carry out the testing for the e-commerce application;
which tools and technologies it recommends
based on your historical information; what test environments
you may need to set up; and the test deliverables, what is expected out of the
testing, like test plan, test cases, test execution reports, defect reports, right?
And you also get the roles and responsibilities, like what kind of
roles and responsibilities are needed for carrying out these tasks, like
QA engineering manager, QA engineers, developers, product owners, DevOps
team, and what the entry criteria and exit criteria are, right?
And what are the risk and mitigation techniques that you may need to follow?
And what are the test metrics you may need to produce?
And how do you want to continuously improve your testing?
ChatGPT generated pretty much everything from just one prompt.
And I don't recommend you take this as is, but if your organization already
has some kind of documentation, feed it to your own privately
deployed models, LLMs, get them trained, and then use this prompt.
And I don't recommend you upload your documents to the public ChatGPT
or any LLM that may leak your private or copyrighted
data; sharing that on public domains is not advisable.
I strictly don't recommend you put it on any public ChatGPT or other LLMs.
Okay?
For anything internal you can do that:
train and then use these prompts. But if you want to generate a generic test
strategy or generic information, I would suggest you use this, but never
put your private information out there.
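For orientation, here is a minimal sketch of how the same prompt could be sent to a privately deployed, OpenAI-compatible model from Java; the endpoint URL, model name, and response handling are hypothetical placeholders, adjust them to whatever your internal deployment actually exposes.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class TestStrategyPrompt {
    public static void main(String[] args) throws Exception {
        // Hypothetical local, OpenAI-compatible chat endpoint (e.g. a privately hosted model)
        String endpoint = "http://localhost:8000/v1/chat/completions";
        String body = """
            {
              "model": "internal-qa-model",
              "messages": [
                {"role": "user",
                 "content": "I work as a software QA engineering manager. Our organization is developing an e-commerce application. Can you generate a test strategy for the e-commerce application?"}
              ]
            }
            """;
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(endpoint))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // Raw JSON; the generated strategy text is inside choices[0].message.content
        System.out.println(response.body());
    }
}
```

Because the model is hosted internally, the prompt and any historical documents you feed it stay inside your own environment, which is the whole point of the warning above.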
And then this is the dynamic strategy, generated with one single prompt.
You got so much information; you don't need to search anywhere else, right?
The test strategy document is generated.
Now you can use this as part of your testing.
And another prompt is for test process generation.
The prompt is: I work as a software testing professional;
outline the process I would follow if I were conducting a manual test.
So here is the process that you are supposed to follow:
understand the requirements, do the test planning, develop the
test case design, set up the test environment, execute the test cases,
and log the defects if there are any failures.
Then retest and regression test: you retest whenever the issues are fixed,
and you do regression testing whenever new features are coming up.
And then finally, if the application meets the standards, do a test closure,
and do continuous improvement as the product evolves, along with whatever
best practices you want to apply to your test process. Then, the
dynamic manual test case generation.
So now again, I'm going to use another prompt to generate the test
cases for my application under test.
The prompt is: I work as a software testing professional, our organization
is developing an e-commerce application,
can you generate manual test cases for the search feature?
Here, I'm specifically saying: generate the manual test
cases for the search feature.
So now let's see what ChatGPT does for us.
Functional test cases: it listed about five test cases, each with a test case
description, test steps, expected results, and preconditions.
Then usability test cases, right?
Again, you get the same structure, then boundary and negative test cases,
performance test cases, edge test cases.
So with a single prompt, you get so much data in a tabular format, and
you may not use it as is; make the changes according to your requirements,
and then you can carry out those tests.
And if you have a historical test case document fed to your
AI/ML model, you may regenerate it automatically without sweating.
Then dynamic BDD automation test case generation.
This is completely automated.
BDD meaning behavior-driven development testing.
So here I'm giving a prompt,
again to ChatGPT, on the e-commerce application:
generate BDD scenarios and Java behavioral tests for searching
products in an e-commerce application; add the files such as the e-commerce
application methods, step definitions, feature files, runner file, and pom.xml,
with an explanation of each file.
This is the prompt I'm giving to the ChatGPT model.
Okay.
Now see what it generates.
It generated a BDD feature file with three scenarios listed for the search
feature, and then I got a step definition Java file, which has the
imports and the step definitions generated with Given, When, and Then,
and whatever methods are required by those BDD-specific test steps
associated with the step definitions.
I didn't list everything because of the space constraint.
And then it generated a test runner class, which is associated with the feature
file and also the glue, which tells Cucumber which folder the step
definition files are supposed to go in.
And this is the way the test runner is defined.
Then the e-commerce application methods.
This is like a typical implementation of your application,
how it's been implemented,
and these are the methods that will be called from the step definitions.
And then the project structure:
how a typical project is structured, how the files are supposed to be kept as
part of the application folder or the test folder that you are going to create.
It could be in IntelliJ or Eclipse or wherever you want to keep your files.
And then the Maven configuration, pom.xml.
Here it actually describes all the dependent libraries
that are needed for your BDD tests.
That may include your Cucumber jars and Selenium,
if you are using Selenium-related dependent jars, and if you are
using any other dependencies, they're going to be generated here.
And then you have the explanation of each file,
which file is intended for what, and it also gave you the commands for
how you want to run your tests.
So with a single prompt, you got all the information that
is needed, including the code.
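To give a feel for the shape of the generated code, here is a heavily trimmed, illustrative sketch of a Cucumber step-definition class and runner of the kind described above; the package, class, and method names (including the ECommerceApp helper and searchProducts) are placeholders, not the exact output from ChatGPT.

```java
package com.ecommerce.stepdefinitions;

import io.cucumber.java.en.Given;
import io.cucumber.java.en.When;
import io.cucumber.java.en.Then;
import static org.junit.Assert.assertTrue;

public class SearchStepDefinitions {

    // Application-specific methods live in a separate class, as in the generated project
    private final ECommerceApp app = new ECommerceApp();
    private int resultCount;

    @Given("the user is on the home page")
    public void userIsOnHomePage() {
        app.loadHomePage();
    }

    @When("the user searches for {string}")
    public void userSearchesFor(String keyword) {
        resultCount = app.searchProducts(keyword);
    }

    @Then("matching products are displayed")
    public void matchingProductsAreDisplayed() {
        assertTrue("Expected at least one search result", resultCount > 0);
    }
}
```

```java
package com.ecommerce;

import org.junit.runner.RunWith;
import io.cucumber.junit.Cucumber;
import io.cucumber.junit.CucumberOptions;

// The runner points Cucumber at the feature files and the step-definition "glue" package
@RunWith(Cucumber.class)
@CucumberOptions(
        features = "src/test/resources/features",
        glue = "com.ecommerce.stepdefinitions")
public class TestRunner { }
```

The generated project wires these together with a pom.xml that pulls in the Cucumber and Selenium dependencies, which is what the next part of the walkthrough covers.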
I also generated the same code and created a Git repository,
if you want to explore and learn how it is generated.
There are some runtime errors you may face when you set it
up, because the packages and so on may need to be configured properly.
If you want to play around, you can go there, download it, and run it from there.
So I can give you a quick demo of the same test that was created
from the ChatGPT prompt.
So I'm opening IntelliJ,
and then I'm going to play it from there.
So here is my IntelliJ.
Here is my feature file.
Here you see "search for products by package".
I want to enter this step.
Yeah, I want to enter this step.
So it went to the step definition file.
This is the step definition file.
And this is the e-commerce app.
In the e-commerce app, I see loadHomePage.
This is the load homepage method.
This is where all my application-specific methods are defined.
And this is my runner class.
Here you see the feature files.
Here: test, com, ecommerce, yes, test, com, ecommerce,
right, features.
So this is where my feature file is.
And this is my runner class, and these are the step definitions, right?
And glue: you're gluing to the step definitions folder,
com.ecommerce.stepdefinitions. And this is where your pom.xml is; this
is the full-fledged pom.xml. And sometimes,
because it is ChatGPT, you may get some hallucinated plugins it
adds, and sometimes it may not work.
So I recommend you download it and then make it run from there;
some of the plugins you may need to add, and some of them you may need to delete,
because this is all about the LLMs, which may not have the right amount of data.
If you want to run it from here, run it, execute, and capture the results.
So this is all about how you can actually use ChatGPT to generate
whole behavior-driven test cases,
and then you can use them as part of testing.
Even though these BDD tests may not be directly useful to you as is,
if somebody wants to generate BDD-driven tests and doesn't know how
to do it, you can still go to ChatGPT, get the whole structure, and customize
these automated tests according to your application-under-test needs.
At least you'll get the framework structure, and then you can
start building on top of it.
Even if you're not using it for an e-commerce application, you can get the
basic structure and play around with that.
So this is all about BDD; let's move on to the next slides.
So, ML models for BDD and automated test case generation.
Here you see several: the OpenAI models, the
DeepMind Alpha models, and MLflow.
These models are going to be useful when you feed in the existing files
that you have, anything related to BDD or CI/CD.
All of this information you can feed to your locally deployed
LLMs, and then you can use them for generating your own personalized,
company-specific code. These are the capabilities, and you may also have
challenges when you are deploying all of these things.
If you don't have enough data, these LLMs also hallucinate, and that's
when you may not get accurate results.
So you need to be very careful with that.
Something similar to Codex: you also have other models you
can leverage, and you can also see some AI/ML tools for scalability
and accuracy in testing, right?
Some of the tools you can see here can be used
for test case generation and code generation, and you can also see
which frameworks are supported and whether a tool is AI-powered or not.
So here, OpenAI Codex: can it generate the test cases?
Yes.
OpenAI Codex, can it generate the code?
Yes.
Which are the supported frameworks?
Cucumber, Behave, SpecFlow, and many others.
And similarly, you can also see ChatGPT.
Again, these are all yes.
It also supports Behave and Cucumber, right?
And Cucumber itself has some AI-related features.
But yeah, so it supports Java, JavaScript, and Python.
Okay.
So now let's get on to how AI and ML enhance the testing.
For example, you are using Selenium and you are using a locator
to identify an object, by name "search" or by id "searchButton", and in the next
release somebody changes the search identifier to "find" and a find button.
In a typical test execution, when the locator is not found, the test will
fail and it will not proceed further.
But if you implement the same thing with Healenium: here you can see the Healenium
self-healing WebDriver, which works on top of Selenium WebDriver. If you use this dependency
and run the tests, right,
then even if the identifier is changed from search to find, Healenium will
scan all the identifiers and see which is closest to the original name.
It'll identify the locator and it'll make the test execute successfully.
And then it'll also give you a provision to update that specific
identifier back in your test script, so you don't need to maintain it manually.
It automatically gives you an option to update the locator from
search to find in the script,
and the next time it is going to run without any issues.
Even in the current test, it will give you a warning saying that this
is the object which changed.
Do you want to update your automated scripts?
So that's how it's going to work.
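As a rough sketch of how this wiring looks in practice, assuming the healenium-web dependency is on the classpath and the Healenium backend service is running; the URL and locator values are placeholders, not from the talk.

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import com.epam.healenium.SelfHealingDriver;

public class SelfHealingSearchTest {
    public static void main(String[] args) {
        // Wrap a normal Selenium driver with Healenium's self-healing driver
        WebDriver delegate = new ChromeDriver();
        SelfHealingDriver driver = SelfHealingDriver.create(delegate);
        try {
            driver.get("https://example-shop.test/");
            // If the "search" locator is later renamed to "find", Healenium compares the
            // previously successful DOM state with the current page, picks the closest match,
            // runs the step anyway, and reports the healed locator so the script can be updated.
            driver.findElement(By.name("search")).sendKeys("laptop");
            driver.findElement(By.id("searchButton")).click();
        } finally {
            driver.quit();
        }
    }
}
```

The healing report is what gives you the option mentioned above to push the new locator back into the script.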
So you can see the Healenium demo here,
and you can also play around with it.
One of my ex-colleagues has developed an example
here; you can download it and experiment with the example.
So as I mentioned, Healenium will detect the failures automatically and update
the DOM object for a broken locator.
So with this self-healing, Healenium will update all the broken
locators, and if there is any kind of broken flow, all of those things
will be automatically updated.
Okay.
So AI detection will optimize your execution time and also
save a lot of time.
As a result, you can see something like a 30 percent reduction in testing time
and a 45 percent improvement in bug detection.
So real bugs can be detected, rather than time going into just updating
field names or whatever it is.
So the question for you is: how would faster bug detection impact your
product? Definitely positively, right?
So, predictive analysis.
What is prediction?
How is it going to be in the future?
That's what prediction is all about.
So how do you predict?
Unless you give it some kind of data, it is very hard to do predictions.
So whatever metrics you have, you need to feed them into the system,
plug in the model, and then predict.
So I'm going to explain a
step-by-step model here.
So if you want to do any prediction, first you need to do data collection.
For any testing, what are the data items you have?
They could be the defect reports, code metrics, test metrics, release
data, or developer activity.
Then data pre-processing:
based on the data that you collected, you need to pre-process
everything, meaning format it and turn unstructured data
into structured data.
Then exploratory data analysis: do the correlations across whatever data
elements you have, and format the data so that the elements are relatable.
Then feature selection: some things you may really need to feed
in, like code complexity metrics.
They are not test data as such, but they are input data which will make
the model predict much better.
Change metrics for what is happening in the code.
Then test coverage, the coverage of tests with respect to the
code that is produced, and developer activity, like how frequent the
check-ins are and how frequent the failures are, based on the code, right?
So those are the metrics you may need to use for feature selection, right?
Then you need to select the model.
This model choice is specific to AI/ML.
You may need to find out what kind of models are
effective for this kind of data.
So is it a classification model you want to choose, a regression model,
or do you want to use any deep learning models, right?
And then model training and evaluation.
Once you get all of this data, you need to feed it to the model.
When you are feeding this data to the model, you need to
split the data into two categories: training data and test data.
Generally, for training data people will take about 70
to 80 percent, and for test data they'll take 20 to 30 percent.
Then you check whether the model is predicting properly or not:
you find out the F1 score, and there are different types of scores
to see whether the model is really predicting according to the test data
or not. Then start doing the predictions.
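As a toy illustration of the split-and-score step just described, here is a self-contained sketch in plain Java; the Example record, the loadHistoricalData source, and the predictDefective stand-in are hypothetical placeholders for whatever model and data you actually use.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ModelEvaluationSketch {

    // One labeled historical record: feature values plus whether a defect was found
    record Example(double[] features, boolean defective) { }

    public static void main(String[] args) {
        List<Example> data = loadHistoricalData();   // defect reports, code/test metrics, etc.
        Collections.shuffle(data);

        int cut = (int) (data.size() * 0.8);         // roughly 80% training, 20% test
        List<Example> train = data.subList(0, cut);
        List<Example> test  = data.subList(cut, data.size());

        // ... train a model on `train` here (omitted in this sketch) ...

        // Score the held-out test data with precision, recall, and F1
        int tp = 0, fp = 0, fn = 0;
        for (Example e : test) {
            boolean predicted = predictDefective(e);
            if (predicted && e.defective()) tp++;
            else if (predicted && !e.defective()) fp++;
            else if (!predicted && e.defective()) fn++;
        }
        double precision = (tp + fp) == 0 ? 0 : (double) tp / (tp + fp);
        double recall    = (tp + fn) == 0 ? 0 : (double) tp / (tp + fn);
        double f1 = (precision + recall) == 0 ? 0
                  : 2 * precision * recall / (precision + recall);
        System.out.printf("precision=%.2f recall=%.2f F1=%.2f%n", precision, recall, f1);
    }

    static List<Example> loadHistoricalData() { return new ArrayList<>(); }  // placeholder data source
    static boolean predictDefective(Example e) { return false; }             // placeholder for the trained model
}
```

In a real setup you would use an ML library for the model itself; the point here is only the 70/30 or 80/20 split and the scoring that the talk describes.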
So once you have the trained model, once it is giving accurate
results, start predicting what can go wrong based on the historical
information, and then keep collecting the data.
And then
improve over the period of time based on the data that is produced, whether
the predictions are coming out right or wrong.
If you need to tune the model and all that, you may need to do it.
So all of this is related to training the model based on test artifacts.
So, ML in mobile app testing.
So far I've been talking about software applications in general;
I didn't specify any device, but this part is
specific to mobile applications.
The prompt is: I'm working as a software testing professional, our organization
is developing an e-commerce application,
and we have tested the application for two years; based on that data
(again, this is an e-commerce-related mobile app),
how do you actually predict the defects, right?
So here you have an approach to defect prediction.
First, you do the data collection:
what are the factors influencing the defects?
Then predictive models:
what models are you actually using, right?
What defect prediction trends are coming up?
So all of these things you are pre-processing, and some of
them you are feeding in.
And then the visualization of the predictions: how you want to visualize
them, whether you want to generate a bar graph or stacked bar
graph or pie chart or whatever it is.
Once you have everything:
what tools do you want to use for the defect prediction?
Okay.
So now, once you capture everything, you may need to compare the historical
information versus what is predicted, right?
So here I'm just describing it with some metrics.
For example, for an e-commerce application you have
Search, Checkout, Payment Gateway, Recommendations, and User Profile.
These are the modules.
And the past defects are: for Search you have 120, Checkout 200, Payment
Gateway 150, Recommendations 80, and User Profile 50. Then, due to recent code changes:
what is the level of code change?
High, medium, high, low, and medium respectively.
So based on the past historical information and based on the
recent code changes, you see the predicted number
of defects here, right?
This may be six months' data, one year's data, or five years' data, you don't
know, but the past defects are definitely much lower,
and the recent changes could be the very latest data;
that's why the prediction may come back with this many defects.
So this is the prediction it is making.
The more data you have, the more accurate the prediction you make.
Okay.
But again, it is based only on historical data: what you train on is what you get.
Okay.
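Purely as a toy illustration of combining past defect counts with the size of recent code changes (this is not the ML model from the talk, and the multipliers are made up), a naive risk ranking could look like this:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DefectRiskSketch {
    public static void main(String[] args) {
        // Historical defect counts per module, as in the example above
        Map<String, Integer> pastDefects = new LinkedHashMap<>();
        pastDefects.put("Search", 120);
        pastDefects.put("Checkout", 200);
        pastDefects.put("Payment Gateway", 150);
        pastDefects.put("Recommendations", 80);
        pastDefects.put("User Profile", 50);

        // Made-up multipliers for the level of recent code change (high/medium/low)
        Map<String, Double> changeWeight = Map.of(
                "Search", 1.5,
                "Checkout", 1.2,
                "Payment Gateway", 1.5,
                "Recommendations", 0.8,
                "User Profile", 1.2);

        pastDefects.forEach((module, count) -> {
            double risk = count * changeWeight.get(module);
            System.out.printf("%-16s risk score: %.0f%n", module, risk);
        });
    }
}
```

A trained model replaces these hand-picked multipliers with weights learned from the historical data, which is exactly why the amount and quality of that data matter so much.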
So this is all about predictive analysis.
And let's see how AI and ML enhance the testing here;
here I'm also talking about some of the tools.
Some tools like these may be specific to particular kinds of predictions.
And when you're choosing your tool, also check whether it
can be integrated with CI/CD.
And check whether the tool has any data analysis capability or not.
And see whether it can support AI/ML; if it does, whether that support is
strong or medium, and what kind of use cases it can handle.
So these are the things you may need to understand before choosing your tool.
So here you've got tools like Azure Machine Learning, IBM SPSS
Modeler, SonarQube, and Jenkins.
You can see which tools cater to which purpose or which kind
of use cases, and you can choose accordingly and
then implement your predictions.
So here, when you are talking about prediction, you need to capture
the data, feed it, analyze, and predict: analyze and feed the data to the model,
and then finally you can do the predictions.
Okay, and then before defects actually occur in the
application under test, you may predict them and take course corrections.
So again, coming back to mobile app testing: let's say you're dealing with
device fragmentation, and it causes inconsistencies in the user
experience. How do you really solve it?
Device fragmentation is the biggest challenge.
For example, if you are doing mobile testing, you've got
various sizes of devices.
If you go to Android, to Samsung alone,
it's almost unimaginable.
If you are testing across that kind of device fragmentation, for every device,
whenever a locator or its position changes, it's very
hard, even if you do automation.
So that's where you use the AI/ML device compatibility features, where the AI/ML
can automatically detect the object based on certain predefined conditions,
and then it will optimize the solution.
So in these kinds of scenarios, AI/ML definitely scales well.
If you really look at it, the results have shown more than a 50 percent
reduction in app crashes and a 60 percent increase in device compatibility.
Okay, so this is the kind of scaling you can see with AI/ML,
especially in the mobile app world.
The question for you is: do you face challenges testing
across multiple devices?
Mobile testers can answer that question.
So, the metrics that matter: speed, coverage, and accuracy. AI reduces test
execution time by up to 70 percent, coverage expands by 60 percent,
and accuracy improves by 40 percent.
This is the kind of data that was pulled from ChatGPT,
so you can always go check it; whether you
see this kind of accuracy when you start implementing
purely depends on what kind of data you provided to your
AI and ML, and how you trained it is also very important, right?
Tools and technologies in AI and ML testing.
So the popular tools are Selenium with Healenium, Test.ai,
Applitools, and Mabl.
These are some UI-specific tools.
And the technologies are natural language processing (NLP),
predictive analytics, and computer vision.
These are technologies you can implement and see how they are useful
in your regular day-to-day testing.
The question for you is: which AI tool would you like to explore further?
If you are going with open source, I would recommend Selenium; a couple of
other commercial tools are also listed here.
Implementing AI or ML in QA.
I would divide implementing AI or ML into four steps.
Step one: identify repetitive testing tasks.
What you generally do in automation, you can do the same thing with AI/ML.
Step two: choose a suitable AI/ML testing tool, right?
Step three: start with a pilot project.
Don't do everything in one go.
Step four: scale AI integration across workflows,
and see where you can actually start expanding your AI integration.
The question for you is: what's the first step your team can take toward AI testing?
It is still early days, but see where and how you want to
take your first step in AI testing.
Challenges in adopting AI/ML for testing.
AI/ML is not 100 percent there yet; it is still evolving.
The first thing is the lack of skilled resources,
especially in testing with AI/ML.
A person needs to know a little bit of data science, a little bit of machine
learning algorithms, and AI model development.
And along with all of this, the person needs to have software testing knowledge as well.
So there is a lack of skills at this moment, but a lot of people are picking
it up, and eventually they'll be there.
Again, AI/ML is not something anybody can just go and implement.
It also requires initial costs.
For example, you may need to get hardware where the
machine learning algorithms can run, which needs a lot of
GPUs, and they are really expensive.
And sometimes you may need to pay for the models if you are
using any paid models, right?
And you also have data dependency.
Do you have any historical data to really feed it?
You need to get quality data, and then you need to massage the data to fit
the AI model. Then change management:
are you ready?
Is your team ready to change?
That needs a learning curve and all that.
So that's another problem. And then explainability and trust in AI.
If you don't give the right amount of data, as I said,
it will hallucinate and may produce the wrong results.
And then you don't know how a specific hallucination
came into the documentation or a test result, right?
So that's where you have to be extremely careful, right?
So the question for you is: what challenges could your team face when
adopting AI in testing? If some of you have already implemented it, I would
like to hear from you. Next, the future of QA with AI/ML: autonomous testing,
where AI will handle testing end to end,
without you sitting there, without you correcting the test cases, without
you correcting any kind of changes in
the links or buttons or properties or whatever, right?
So how can you achieve autonomous testing? And real-time defect prediction:
as the test is running, what can it foresee? Hey, this
feature has completely changed;
I may foresee another 20 tests failing before execution even reaches the 20th test,
because of the dependency and its learning algorithms. Then continuous
learning: AI adapts to new technology.
So it also learns from the current data, and then it'll try to train
itself and become more robust.
So the question for you is: how do you envision the future of software testing?
Key takeaways: AI/ML boosts testing speed, accuracy, and scalability,
if you do it in the right way; that is the kind of prerequisite.
Real results: proven case studies show measurable improvements.
Again, the feeder data and the AI/ML model are very important.
Yes,
proven case studies show there are definitely improvements.
Adoption:
don't do everything at step one; start small
and slowly scale with confidence.
The question for you is:
what key insights would you apply to your QA process?
When I prepared this material, I also went through some case
studies and some papers.
So if you have any questions, or if you want to learn further, feel
free to use these acknowledgements and the referenced documents or white papers.
And this is almost the closing of the session.
If you have any questions,
feel free to reach me via my email or LinkedIn profile, and this is my website.
The question for you is: what would you like to explore more in AI-driven QA?
Thank you, guys.