Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello all, and welcome to Conf42 Python.
Thank you for attending this session. I would like to go over some of the
use cases of Python in deep learning and how it can help us build
deep learning, data engineering, and other transformative applications.
Let me first give you a short introduction about myself.
I have more than eight years of experience in machine learning, data
engineering, and cyber security.
I graduated from Northeastern University in information systems
with a specialization in machine learning and data analytics.
In this presentation, we will dive deep into how machine learning and
deep learning have enabled transformation. All these applications are
built on Python using packages like TensorFlow, PyTorch, and the other
deep learning packages.
Let's go through this slide, which illustrates reinforcement learning.
As we know, machine learning has a lot of capabilities, such as
supervised learning and unsupervised learning.
Those are fairly basic algorithms that are widely used in industry,
but right now reinforcement learning is making a lot of inroads because
many good algorithms are being developed on top of it.
These algorithms are shaping the future, and they are enabling
a lot of superhuman capabilities.
Once reinforcement learning models are trained properly, with fewer errors
and higher accuracy, they can lead to superhuman, deep-brain kinds of
capabilities.
If you look at ChatGPT and other LLMs, they are generative and they
are able to do a kind of thinking and provide us with an answer.
These capabilities have evolved with the help of reinforcement learning.
For example, DeepMind's AlphaGo and recent breakthroughs in theory
have a lot of reinforcement learning involved, and they are paving
the way for the future, especially in robotics and various other fields.
Next we look at the generative models.
Here I have mentioned generative adversarial networks (GANs) and
variational autoencoders (VAEs).
So what is the difference between the two?
GANs are generative models used mostly for image generation and similar
tasks, and they currently tend to give more accurate output than VAEs.
VAEs are powerful probabilistic models with capabilities such as anomaly
detection and threat detection on data sets in the cyber security realm.
So in cyber security, VAEs are very powerful and are already being used,
and in the future we might see a trend of using such algorithms more.
Both of these are neural-network-based algorithms. They have evolved,
they will probably make systems more capable, and a lot of
applications will be developed on them in the future.
Now we see the rise of generative models in design and art.
Currently, machine learning and deep learning technologies are being
used in fields such as healthcare and cyber security, but in the future
we will see a rise of these models being used in architecture,
art, and similar industries, because they have the ability to
understand patterns and generate from them.
If somebody has a certain design in mind and is able to convey it to the
model, the model will in turn generate the design or art for them.
So it will become that kind of powerful system in the future.
AI is very useful in cyber security, not because it can
do all the basic tasks, but because these generative models might be
able to do a lot of threat detection, vulnerability assessment, fraud
prevention, and similar activities.
The algorithms have evolved to such a high level that, instead of relying
only on SIEM tools, we can use them to detect threats in the organization.
For example, let me give you a scenario.
Suppose there are a lot of cyber criminals using AI-based tools
like ChatGPT and various other tools to carry out phishing attacks, DDoS
attacks, and so on.
If we leverage deep learning models for threat detection
and train them using Python, they will help us detect and mitigate
such threats.
They will also enable vulnerability assessment, fraud prevention, and
similar activities, which are not that clear or that powerful with the
existing tools. The existing cyber security tools are good enough,
but I would say there is a 10 percent chance that such tools fail.
In that situation, AI algorithms like recurrent neural networks (RNNs)
can help us detect certain threats, predict vulnerabilities, and
prevent fraud.
So instead of relying only on such cyber security tools, we can leverage
deep learning neural networks trained using Python, and this can give us
a better security posture within the organization.
There are many such neural network algorithms. For example, Microsoft as
well as Amazon have threat detection algorithms that they are training
and building into their AI tools to improve cyber security and
operations within the organization.
So in the future, we might see vulnerability assessment and fraud
prevention activities done by AI, with better accuracy scores.
Now we come to graph neural networks (GNNs).
GNNs are also a powerful tool within cyber security. They can help us
identify anomalies, predict threats, and do complex transformations and
relationship mapping, and they may detect certain kinds of anomalies
in vulnerability data that SIEM tools like Splunk, AlienVault,
and similar tools cannot detect.
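
To make the relationship-mapping idea concrete, here is a minimal NumPy sketch of the neighbor aggregation step that graph neural networks build on. It is not a trained GNN and not the speaker's code: the four-host network, the two feature columns, and the deviation score are all assumptions made up for illustration.

```python
import numpy as np


def neighborhood_mean(adjacency, features):
    """Average each node's features with those of its direct neighbors."""
    a_hat = adjacency + np.eye(adjacency.shape[0])   # add self-loops
    degree = a_hat.sum(axis=1, keepdims=True)        # neighborhood size per node
    return (a_hat @ features) / degree


# Hypothetical network: hosts 0, 1, 2 talk to each other; host 3 only talks to host 0.
adjacency = np.array([
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
], dtype=float)

# Hypothetical per-host features: [failed logins per hour, outbound MB per hour].
features = np.array([
    [2.0, 10.0],
    [1.0, 12.0],
    [3.0, 9.0],
    [40.0, 900.0],
])

aggregated = neighborhood_mean(adjacency, features)

# Simple, untrained score: how far each host sits from its neighborhood average.
deviation = np.linalg.norm(features - aggregated, axis=1)
most_unusual_host = int(np.argmax(deviation))
print(most_unusual_host)
```

A real GNN would stack several such aggregation steps with learned weight matrices and nonlinearities. The sketch only shows why graph structure helps: a host is judged relative to the hosts it is connected to.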
If these algorithms are trained and are able to
provide a good picture of the anomalies and the security posture, they
can become a very useful helping hand in uncovering certain kinds of
threats.
What I am explaining here is not just using these tools on their own,
but also integrating them with the existing cyber security tools.
That way an organization can have a better outcome
in detecting DDoS attacks or certain kinds of threat actors
and in identifying cyber criminals.
So it's quite beneficial, and these kinds of models should be trained
so that security posture and threat detection capabilities can be increased.
Now we come to the future of AI in cyber security.
There are multiple aspects where AI can help.
Right now that is threat detection, where it can obviously provide some
immediate inroads.
In the future, it can also help us identify the attackers.
If, let's say, the attacker is using AI-based tools,
that can be detected by AI tools rather than by
the existing cyber security tools.
Organizations can also improve their security reporting, and there won't
be a need for as many people to look into the basic aspects.
Basic security aspects and reporting can be automated using AI,
while security engineers and other folks look into more advanced aspects.
So the work can be distributed, and cyber security operations
can be improved in that way.
And now we come to ethical considerations in AI.
AI can be a good thing in cyber security and other operations,
but there are a lot of issues with AI, like privacy and security issues,
bias and discrimination, and job displacement.
These are some ethical considerations of AI that we need to look
into as we fine-tune and deploy the models in production.
We need to look into the compliance aspects as well:
the compliance and governance processes that determine whether the AI
would have any privacy or security implications.
So depending on what kind of data you are feeding to the AI algorithm,
these things need to be discussed with the security and compliance team,
and also with governance and other teams.
All these things need to be discussed so that the AI algorithms are
capable enough and do not compromise any of our ethical considerations.
Now we come to recurrent neural networks.
So what is a recurrent neural network (RNN)?
It is an advanced neural network designed for sequential data, which
makes it useful for detecting threats, finding anomalies in a data set,
predicting anomalies, and that kind of thing.
It behaves very differently from a convolutional neural network, because
a convolutional neural network is used more in autonomous
driving, detecting and classifying images, and that kind of thing,
whereas an RNN is oriented more toward detecting anomalies in a data set
and improving the security posture within the organization.
Those are the use cases of RNNs, and RNNs are currently used very
extensively in this kind of anomaly detection.
Before implementing an RNN, I would say we should look into some basic
algorithms. XGBoost is one of them.
Then we have Isolation Forest, which is also a very good algorithm and
which I have trained on the cyber security data set.
We'll go through that in the next few slides.
This slide shows the architecture of an RNN.
It has an input layer, a hidden layer, an activation function, and an
output layer.
Compared with a plain neural network, the features are more advanced:
the hidden layer feeds its own state back in at every step of the
sequence, so the input layer and hidden layer are trained in a different
way, and the activation function is also applied in a different way
compared with a traditional neural network.
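
The four parts named on the slide (input layer, hidden layer, activation function, output layer) can be sketched as a forward pass in plain NumPy. This is a minimal illustration of a vanilla RNN, not the speaker's model: the layer sizes are arbitrary, the weights are random and untrained, and all variable names are hypothetical.

```python
import numpy as np


def rnn_forward(sequence, w_xh, w_hh, w_hy, b_h, b_y):
    """Forward pass of a vanilla RNN; returns per-step outputs and the final hidden state."""
    hidden = np.zeros(w_hh.shape[0])        # hidden state starts at zero
    outputs = []
    for x_t in sequence:                    # input layer: one feature vector per time step
        # Hidden layer with tanh activation; w_hh @ hidden is the recurrent
        # connection that carries information from earlier time steps.
        hidden = np.tanh(w_xh @ x_t + w_hh @ hidden + b_h)
        outputs.append(w_hy @ hidden + b_y)  # output layer (linear)
    return np.stack(outputs), hidden


rng = np.random.default_rng(0)
n_steps, n_inputs, n_hidden, n_outputs = 4, 3, 5, 1

# Random, untrained weights: for illustration only.
w_xh = rng.normal(scale=0.1, size=(n_hidden, n_inputs))
w_hh = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
w_hy = rng.normal(scale=0.1, size=(n_outputs, n_hidden))
b_h = np.zeros(n_hidden)
b_y = np.zeros(n_outputs)

sequence = rng.normal(size=(n_steps, n_inputs))
outputs, final_hidden = rnn_forward(sequence, w_xh, w_hh, w_hy, b_h, b_y)
print(outputs.shape, final_hidden.shape)
```

In practice you would use a framework layer such as the ones in PyTorch or TensorFlow and train the weights. The loop is written out here only to show how the same hidden state is reused at every step.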
Some RNN use cases are language modeling, text generation,
speech recognition,
machine translation, image recognition,
and so on.
There are also a lot of RNN use cases in cyber security today, where it
can detect threats and then provide details of the threats, vulnerability
detection, and that kind of thing.
In my view, the best algorithm currently available for time series is the
transformer-based TimeGPT.
TimeGPT is a model developed by Nixtla.
This company relies heavily on transformer-based models for time series
analysis and prediction, and it is very powerful.
The use cases are in finance, web traffic analysis, Internet of Things,
and weather forecasting, and they involve anomaly detection:
finding anomalies in a data set,
classifying the anomalies, and that kind of thing.
So TimeGPT is a very important algorithm, and since it is transformer
based, it has evolved quite drastically compared with traditional neural
networks.
I think we will see a lot of applications
of transformer-based algorithms in the near future, because the current
trend is that a lot of AI applications, like ChatGPT, are being
developed on transformer-based models.
Even the algorithms used by companies like DeepSeek are
transformer-based models, which are trained in a very different way
from normal neural-network-based algorithms.
So I believe transformer-based models will have a big
impact in the next few years.
Now we come to a very good use case: Python with Spark integration.
I have done some work on Spark earlier,
so I am providing some details about how Spark
can help backend and machine learning engineers. It is a distributed
framework for big data processing.
It is very easy to use, and it integrates with REST
APIs and various databases.
Spark can currently integrate with backend
data stores like Cassandra, MongoDB, Elasticsearch,
and other fast NoSQL databases.
Spark also integrates very quickly with Kafka.
So if your distributed system has Kafka and similar tools, you
should also consider Spark integration.
For Spark integration, what you need to do is use PySpark:
you write code that queries the API using PySpark
and then stores the results in tables.
You can also store them in any kind of database, and it is very powerful.
It works with distributed systems.
You can do transformations using Spark and then dump the data
into any kind of data store.
It is very fast and very scalable.
The real-world applications of Spark are big data processing, developing
ETL pipelines, performing transformations, querying and writing to APIs,
and ingesting data from APIs.
The machine learning capabilities in Spark are also very good, but they
are not as good as the existing deep learning capabilities.
Spark streaming applications are mainly used to take data from
Kafka and put it into any kind of data store after transformation.
Next we will take a deep dive into the code.
This slide shows the PySpark code.
This code ingests data from a REST API:
it takes the data from the API and puts it into a Spark DataFrame.
A Spark DataFrame is very powerful because it can handle tons of data,
and after ingesting the data from the web API we can write it to a
data store.
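
The slide's code is not reproduced in this transcript, so the following is only a sketch of the pattern described: fetch JSON from a REST endpoint, load it into a Spark DataFrame, and write it to a data store. The function names, the assumption that the API returns a flat JSON array of objects, and the Parquet output are all hypothetical choices, not the speaker's implementation.

```python
import json
from urllib.request import urlopen


def parse_records(raw):
    """Parse a JSON payload and check that it is an array of objects."""
    payload = json.loads(raw)
    if not isinstance(payload, list) or not all(isinstance(r, dict) for r in payload):
        raise ValueError("expected the API to return a JSON array of objects")
    return payload


def fetch_records(url):
    """Download and parse records from a REST endpoint (assumed to return JSON)."""
    with urlopen(url, timeout=30) as response:
        return parse_records(response.read())


def ingest(url, output_path, app_name="rest-ingest"):
    """Load API records into a Spark DataFrame and append them to Parquet files."""
    # Imported here so the helpers above can be used without Spark installed.
    from pyspark.sql import Row, SparkSession

    spark = SparkSession.builder.appName(app_name).getOrCreate()
    records = fetch_records(url)
    # Assumes a non-empty list of flat records so Spark can infer the schema.
    df = spark.createDataFrame([Row(**record) for record in records])
    df.write.mode("append").parquet(output_path)
    return df
```

Calling `ingest("https://example.com/api/vulnerabilities", "/tmp/vulnerabilities")` (both placeholder values) would run the whole flow. Fetching on the driver like this suits small or paginated APIs; for very large sources you would distribute the requests or land the data in Kafka first.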
Now we look at this.
This is the vulnerability data set I was talking about, where deep
learning and machine learning capabilities are good enough to detect
threats.
It is vulnerability data with multiple
columns related to vulnerabilities.
We want to identify these vulnerabilities not by
using a tool, but by using an AI model.
So here I have developed an AI model in which I
clean the vulnerability data.
We have certain criteria here on the clean data:
we are trying to find rows with high complexity, medium severity,
and a CVSS score greater than six.
First of all, I performed cleaning.
After cleaning the data set, I performed transformations
and selected the features we require, such as CVSS score,
encoded severity, and encoded complexity. Then I trained the model,
an Isolation Forest.
After model training, I also predicted
the anomalies.
The anomalies in the clean data consist of all those rows that have
high complexity, medium severity, and a CVSS score greater than six.
Finally, the model is able to provide the anomalies. This
can obviously be automated using other highly capable models as well,
but I have tried using Isolation Forest.
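
The steps described above (cleaning, encoding, selecting features, training an Isolation Forest, and filtering on the stated criteria) can be sketched with pandas and scikit-learn. This is an illustration under assumptions, not the speaker's code: the sample rows, the column names, the label-to-integer mappings, and the `contamination` value are all made up.

```python
import pandas as pd
from sklearn.ensemble import IsolationForest

# Hypothetical sample data; the real data set and its column names are not shown in the talk.
raw = pd.DataFrame({
    "vuln_id": [f"VULN-{i:03d}" for i in range(1, 11)],
    "cvss_score": [2.1, 3.4, 4.0, 6.5, 7.2, None, 5.0, 9.8, 3.9, 6.8],
    "severity": ["low", "low", "medium", "medium", "Medium",
                 "high", "medium", "critical", "low", "high"],
    "complexity": ["low", "low", "low", "high", " High ",
                   "medium", "medium", "low", "low", "high"],
})

# Cleaning: drop incomplete rows and normalize the text labels.
clean = raw.dropna().copy()
clean["severity"] = clean["severity"].str.strip().str.lower()
clean["complexity"] = clean["complexity"].str.strip().str.lower()

# Transformation: encode the ordered labels as integers (assumed orderings).
severity_levels = {"low": 0, "medium": 1, "high": 2, "critical": 3}
complexity_levels = {"low": 0, "medium": 1, "high": 2}
clean["severity_encoded"] = clean["severity"].map(severity_levels)
clean["complexity_encoded"] = clean["complexity"].map(complexity_levels)

# Model training and prediction: -1 marks an anomaly, 1 marks a normal row.
features = clean[["cvss_score", "severity_encoded", "complexity_encoded"]]
model = IsolationForest(n_estimators=100, contamination=0.2, random_state=42)
clean["anomaly"] = model.fit_predict(features)
anomalies = clean[clean["anomaly"] == -1]

# The rule-based criteria from the talk, for comparison with the model's output.
flagged = clean[
    (clean["complexity"] == "high")
    & (clean["severity"] == "medium")
    & (clean["cvss_score"] > 6)
]
print(anomalies[["vuln_id", "cvss_score"]])
print(flagged[["vuln_id", "cvss_score"]])
```

The rows the Isolation Forest marks as anomalous will not necessarily match the rule-based filter, since the model flags whatever is statistically unusual in the feature space. Comparing the two sets is a quick way to check whether the model is picking up the pattern you care about.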
I also tried this kind of anomaly detection using XGBoost, and
I am trying to leverage RNNs and other transformer-based models to predict
certain anomalies in the data set.
So in a few months I will be using transformer-based models to predict
certain anomalies within the data set.
Now we come to the convergence of AI and security.
AI and security are very different things, but AI can be used in
cyber security to improve the security function.
It cannot, in general, replace all the existing security tools and
features, but it can help us provide better mitigation,
threat detection, fraud detection, and similar capabilities.
So it is likely to help the team build a better
security posture within the organization.
On transforming industries:
the impact of AI advancement is helping to build
autonomous systems, drug discovery, and personalized medicine.
These are use cases that can be realized in the future, but
right now the capabilities of AI are in improving cyber security as well
as improving processes within the tech industry.
And what are the key takeaways?
In this slide, I have mentioned that deep learning has
a lot of AI models, like reinforcement learning,
generative models, and so on, but it is critical to understand
which use cases they can help with.
So we have to understand the use case well
and then go to the specific models that can help.
The basic approach is to start with simpler models, automate them
and improve on them, and then look into more advanced modeling
capabilities.
In that way we can automate certain processes in cyber security,
data engineering, and analytics, and it can also help
in building analytics pipelines.
So that's it from my side.
I hope you liked this session. Let me know if you would like
additional information.
Feel free to connect with me on.