Transcript
Hi, my name is Omer Farooq, and today I'll talk about how to secure applications that are built around Gen AI, or that integrate Gen AI capabilities into existing applications. But before we get started, I'd like to introduce myself and tell you a little bit about my background.
I am a founder and security engineer at Auxin Security, based in Maryland.
We've been around since 2018, and we have been able to help over 50 different customers build out their digital transformation journey from on-prem to hybrid to multi-cloud.
We've been able to help them integrate DevSecOps, including DAST, SAST, interactive testing, security validation of their networks, and phishing exercises as part of their large-scale security initiatives.
We are experts, and we take a lot of pride in presenting data with engineering rigor and sound engineering practices.
We make sure that our data is transparently available not only to our customers but, when possible, to our larger audience. That's why we take the time to present and give back to the community.
So today, in that spirit, we'll talk about how to secure Gen AI applications, or existing applications that are integrating Gen AI capabilities: what that means, and how to do it rapidly and in a way that scales up.
So we will start off with a short introduction to LLM applications.
We'll go into how LLM applications are built and what the challenges are, including application frameworks and an application development methodology.
Then we'll dive deeper into how to automate LLM application deployment, either as part of your existing tool set or by building new tool sets.
And finally, we'll cover LLM security: how to make sure your LLM applications are strategically built to be secure, and how we tactically make sure they are actually built securely by engineers.
One of the key underlying themes we'd like to follow throughout is that LLM applications are no different from existing applications; the trick is to make sure we do not increase the OpEx, the operating costs, as we expand the application set or the functionality.
We should be able to integrate this exciting new technology into our existing applications without having to redesign or re-engineer our delivery, our integration, and our security.
So let's get started and talk about LLM applications. The talk title is LLM SecOps, and I like to put that together in the Venn diagram we have on the screen.
LLM applications usually take the form of chatbots that do anything from summarization to translation. Basically, they are things used by either machines or users, based on the principle of a question or query, and a response.
But there's much more than that.
LLM applications usually have built-in semantic search. They have built-in, complicated data rights management platforms. They have a lot of data life cycles. We'll talk about that as part of our LLM Ops.
So when we are integrating LLM capability and building out applications with it, we need to understand how to extend the data life cycle policies and the data ownership and classification policies from our current, existing applications into the LLM applications.
And this overlaps with security. As organizations shift left, we need to make sure LLM security shifts left at the same time. Thus, we'll talk about how to secure the models, how to secure data, how to ensure data privacy, how to build the data rights management, and how to ensure the data is only granularly available to the right set of people when they're asking a question over large data sets.
So once again, LLM SecOps is the combination of LLM applications, LLM CI/CD automation, and LLM security.
I'd like to start out with a high-level picture. This is one view of the world, and it should not be considered the only reference model. But one common reference model for LLM applications is the one built around a chatbot, or a machine-to-machine interface, that essentially takes a query and gets back a response.
Essentially, a query is translated into an embedding. Those embeddings are looked up in a vector database to find the closest related data set. Sometimes AI is also used to build the right level of query, through something called a semantic index. The retrieved data is injected into the initial prompt, and the prompt is optimized, either by building a bigger context or by using the LLM API to ensure the prompt itself is functional and optimized. Then the prompt is routed to the LLM through the API, sometimes with an LLM cache in between. The response is content filtered, and the content-filtered response is provided back to the user or the machine.
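To make that flow concrete, here is a minimal sketch of the query-to-response path. The embed_model, vector_db, llm, and content_filter objects are hypothetical stand-ins for whichever providers you actually use, not a specific vendor's API.

```python
# Minimal sketch of the query -> embedding -> retrieval -> prompt -> LLM flow
# described above. embed_model, vector_db, llm, and content_filter are
# hypothetical stand-ins, not a specific vendor's API.

def answer_query(user_query: str, embed_model, vector_db, llm, content_filter) -> str:
    # 1. Translate the query into an embedding vector.
    query_vector = embed_model.embed(user_query)

    # 2. Look up the closest related documents in the vector database.
    documents = vector_db.search(query_vector, top_k=5)

    # 3. Inject the retrieved data into the initial prompt (bigger context).
    context = "\n".join(doc.text for doc in documents)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {user_query}"

    # 4. Route the prompt to the LLM through its API (a cache could sit here).
    raw_response = llm.complete(prompt)

    # 5. Content-filter the response before it goes back to the user or machine.
    return content_filter.apply(raw_response)
```

Each numbered step is a separate seam where caching, prompt optimization, or filtering can be inserted without disturbing the rest of the pipeline.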
This high-level architecture is not super difficult to understand. But one of the things I'd like to point out here is that it is important to build a reference application, or to build LLM applications, in two parts: keep the front end, the UI or the machine interface, separate from the back end.
This can help organizations in multiple different ways. From an OpEx point of view, building a single, large, multi-tenant LLM backend gives you one backend, with detailed rights management for data and assets, that can be fronted by multiple different application types: bots, translation services, data summarization services, middleware that uses the LLM API, or even other AI services.
This decoupling of the frontend from the backend also helps with security and with development and deployment. We'll show how decoupling the application into backend and frontend is critical in making sure security scales up at the same time.
Not all frontend applications are going to be equally complex, but they will each have their own set of requirements. It makes sense to secure the frontend applications using current, existing, non-LLM security models, and then ensure that all the new LLM security provisions only have to be implemented once, in the one large, singular backend that can sit behind multiple different frontends.
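As an illustration of that decoupling, here is a minimal sketch of the one backend endpoint all frontends would share, assuming FastAPI; validate_token and check_entitlements are hypothetical stubs standing in for your identity and data rights management layers.

```python
# Sketch of a single multi-tenant LLM backend fronted by many UIs,
# assuming FastAPI. validate_token, check_entitlements, and the response
# are hypothetical stubs, not a real rights management implementation.
from fastapi import FastAPI, Header, HTTPException

app = FastAPI()

def validate_token(authorization: str) -> str:
    # Hypothetical stub: resolve the bearer token to a user identity.
    return authorization.removeprefix("Bearer ")

def check_entitlements(user: str, collection: str | None) -> bool:
    # Hypothetical stub: consult your data rights management store.
    return collection == "public-docs"

@app.post("/v1/query")
def query(body: dict, authorization: str = Header(...)) -> dict:
    user = validate_token(authorization)
    if not check_entitlements(user, body.get("collection")):
        raise HTTPException(status_code=403, detail="Not entitled to this data")
    # Every frontend (bot, summarizer, middleware) calls this one endpoint,
    # so new LLM security provisions are implemented once, here.
    return {"answer": f"stub answer for {user}"}
```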
So what are some of the high-level challenges that are usually part of LLM application development?
One, there are a lot of integrations. LLM applications are data routing applications, where a certain amount of data is pulled and a certain amount of data is acquired, sometimes from near-real-time systems. It could be coming from OT equipment. It could come from IoT. It could come from Bing or other search engines, or from specific catalogs. Data can be egressed and ingressed from all types of data sources.
That's what makes LLM applications very powerful: being able to build chains of events, or chains of data interactions, that make the query and the response very powerful.
Data integrations present a few critical security issues. All integrations require you to have a secure network, and all require you to ensure you have delegated access. Furthermore, you must ensure that the user behind the initial request has entitlement to the data that is going to be used for their semantic search, their AI search, or their AI query.
Furthermore, as the data is embedded and pulled into the embedding databases of the LLM application, that data needs to be equally part of your data lifecycle process: the retention policies, the classification policies, and the data catalog with its tags and labels. This becomes a difficult challenge when you're dealing with a lot of data, at scale, across different frontend systems.
We'll talk about how to use specific tools, technologies, and concepts to build that out. That's part of our LLM Ops.
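Here is a minimal sketch of what that can look like in code: each embedded chunk carries its classification, catalog labels, and retention deadline, and retrieval filters on the requesting user's entitlements. The record layout and entitlement model are illustrative assumptions; real vector databases expose metadata filtering natively.

```python
# Sketch: carry classification, catalog labels, and retention metadata
# alongside each embedding, and filter retrieval by the requesting user's
# entitlements. The record layout here is an illustrative assumption.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class EmbeddedChunk:
    vector: list[float]
    text: str
    classification: str              # e.g. "public", "internal", "restricted"
    labels: set[str] = field(default_factory=set)  # data catalog tags
    expires_at: datetime = datetime.max.replace(tzinfo=timezone.utc)

def similarity(a: list[float], b: list[float]) -> float:
    # Plain cosine similarity; real vector databases do this natively.
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return dot / norm if norm else 0.0

def retrievable(chunk: EmbeddedChunk, user_clearances: set[str]) -> bool:
    # Enforce both the retention policy and the user's entitlement before
    # a chunk is allowed into the semantic search results.
    not_expired = chunk.expires_at > datetime.now(timezone.utc)
    return not_expired and chunk.classification in user_clearances

def search(chunks: list[EmbeddedChunk], query_vec: list[float],
           user_clearances: set[str], top_k: int = 5) -> list[EmbeddedChunk]:
    candidates = [c for c in chunks if retrievable(c, user_clearances)]
    return sorted(candidates, key=lambda c: similarity(c.vector, query_vec),
                  reverse=True)[:top_k]
```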
And finally, if not equally important, there are the AI services. An LLM application backend will potentially interact with multiple models, one or more.
We need to be very careful about how those model integrations and model APIs are protected, not just for cost, but also for availability: ensuring those metered services are logged, and making sure those model communications over the public Internet are protected with at least some level of encryption in transit.
The data itself, from the user's inquiry all the way to the model, can potentially include a lot of proprietary, if not sensitive, data, including personal information and things that could be considered privileged. Thus, AI model API security is equally important, and needs to be considered as part of the LLM application security.
We'll cover how that can be secured as part of our LLM SecOps, in the latter part of this conversation.
So before we go further into what LLM applications look like, let's put a developer hat on. How do you build an LLM application, or actually add LLM capability to an existing application?
You have two major components: the actual logic and data integration, and the model.
Understanding which is the right model to use is very important, because not all models offer the same type of security and responsible, ethical AI capability as others. Pulling an open-source model from the public internet versus using a model that is managed and curated has a different set of side effects and different consequences. Be careful to choose the right model.
Also, as we mentioned before, keep the front end of the application, the bot, the user interface, or the machine-to-machine interfaces, separate, and try not to consider those part of the LLM application, because they can be factored out and developed using existing application interfaces or low-code/no-code solutions. Keep the focus on building, one time, an LLM backend that can be secured with data rights management and other features and does not have to be redeveloped.
Another feature of LLM applications, and this is very important to cover because it dictates how security and development work, is the fact that LLM applications are cloud-native artifacts. They are built in the cloud; most of the development happens in the cloud. That means a lot of the components in the rapid-prototype, fast-results methodology for LLM applications rely on open-source data router frameworks like LangChain, Semantic Kernel, and LlamaIndex.
You will have a lot of open-source components; if not, the entire model could be open source.
You will be using a lot of low-code/no-code for your front end. You will be leveraging a lot of cloud services, including serverless and managed services for AI and AI-related compute and storage. And most of the time, if not always, you will be utilizing some kind of vector database, sitting in the cloud or possibly on-prem, that needs to be protected just like we traditionally protect databases.
Though the purpose of the vector database is a bit different, and its capability and data context are different, we still need to protect the vector database and its data just as we protect traditional relational and non-relational databases.
Before we go further, I'd like to point out three large principles for application security for LLM applications. One, decouple the architecture. Two, applications should be cloud native whenever possible, using managed services as much as possible. Three, vector databases should be chosen with storage and performance in mind.
Now, these principles are not exclusively for LLM application or security development; they are general principles you should consider. A decoupled architecture will give you a better return on your investment. Cloud-native applications allow rapid prototyping, which is how most LLM applications are built. And vector databases with high-performance storage functionality allow you to implement data rights management at scale, so you can build the decoupled architecture that we want for security.
So, the next concept. Now that we know what LLM applications look like, and we understand the design principles and the LLM application methodology, fast prototypes, build fast, fail fast, how do we do that? How do we do it with existing tools, without learning new tricks, and quickly, in a way that doesn't slow down the prototypes and the development?
Going from a Jupyter Notebook, for example, to an application can be done quite literally in a few hours. But when you do it at scale, from an organizational point of view, how do you bring all those teams together?
You can use your existing pipelines. The good news is that, with the addition of a few stages and steps in your cloud-native or on-prem automation, you should be able to support LLM applications. How do we do that?
Let's start with automation and deployment. The backend applications that are part of LLM applications are no different; they are still cloud native. The minute differences are in the configuration of the model: how to integrate the keys and how to integrate the service accounts. All of those should be deployed and automated in the same fashion we use today. Integration of secrets and integration of tokens should be done in a similar fashion as today, except with the caveat that, remember, a lot of the services LLM applications consume are metered.
In that scenario, we do suggest picking up speed and using delegated access. Whenever possible, build the LLM apps within your application to utilize the end user's credentials and identity for the metered service. Push that responsibility out of the model, and manage it at the user and organizational department level rather than at one large, global level.
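A minimal sketch of that delegated-access pattern follows. The model endpoint URL and response shape are illustrative assumptions, not a specific vendor's API; the point is that the end user's token, not a shared service key, travels with the metered call.

```python
# Sketch of delegated access to a metered model API: the backend forwards
# the end user's identity instead of one shared service credential, so
# metering and entitlement follow the user. The URL, header usage, and
# response shape are illustrative assumptions.
import requests

MODEL_ENDPOINT = "https://models.example.internal/v1/complete"  # hypothetical

def call_model_as_user(prompt: str, user_token: str) -> str:
    response = requests.post(
        MODEL_ENDPOINT,
        json={"prompt": prompt},
        headers={
            # The user's own bearer token (or an on-behalf-of token minted
            # from it) travels with the request; the backend holds no
            # long-lived shared key whose spend nobody can attribute.
            "Authorization": f"Bearer {user_token}",
        },
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["completion"]
```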
Build new indexes, new detections, and new threat detections in your SecOps tooling for LLM Ops. They won't be that different from your existing ones, but they will require separate LLM scenarios. For example, for prompt injection and data and prompt poisoning, you will need to build that into your threat detection. If you see a user continuously trying prompt injection, you need to build new threat detection scenarios in your SIEM or threat intelligence planning tool, and you need to pull in threat intelligence feeds that cover LLM application threat types.
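As a sketch of what such an LLM-specific detection could look like: count injection-looking prompts per user and raise an alert after repeated attempts. The patterns, threshold, and the send_siem_alert hook are all illustrative; in practice the alert feeds your SIEM.

```python
# Sketch of an LLM-specific detection: flag prompt-injection-looking inputs
# and alert on repeat offenders. Patterns and threshold are illustrative.
import re
from collections import Counter

INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"reveal .*(system prompt|instructions)", re.I),
]

ALERT_THRESHOLD = 3
attempts: Counter[str] = Counter()

def send_siem_alert(user_id: str, reason: str) -> None:
    # Hypothetical stub: in practice, emit an event your SIEM ingests.
    print(f"ALERT user={user_id} reason={reason}")

def inspect_prompt(user_id: str, prompt: str) -> None:
    # Each matching prompt increments the user's counter; crossing the
    # threshold becomes a threat detection scenario, not a one-off log line.
    if any(p.search(prompt) for p in INJECTION_PATTERNS):
        attempts[user_id] += 1
        if attempts[user_id] >= ALERT_THRESHOLD:
            send_siem_alert(user_id, "repeated prompt injection attempts")
```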
And last, but perhaps most important, is data management. You do need to enable encryption at rest, encryption in transit, and encryption in use. We need to make sure the embeddings, the questions and the answers, and the responses logged by the application, the questions asked by the users or services, are treated as privileged information.
One of the key things we need as part of LLM Ops is column-level encryption to protect the responses and the requests that the application sees. These can be considered as privileged as a conversation between two people.
Because people and users will ask about things that are either very personal or include PII, or include company-private, company-sensitive, or generally sensitive data in their queries.
The queries have to be protected, and have to be available only to certain users. The audit trail and the forensic investigation data should only be available to certain users. Data rights management of the log data is a critical component, and we suggest using column encryption, even double encryption, to make sure the columns with the data requests and data prompts are protected.
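Here is a minimal sketch of that column encryption, using the cryptography library's Fernet; the log row layout is an illustrative assumption. Only holders of the column key, the few users entitled to the audit trail, can read the prompt and response columns.

```python
# Sketch of column-level encryption for logged prompts and responses,
# using the cryptography library's Fernet. The row layout is illustrative.
from cryptography.fernet import Fernet

column_key = Fernet.generate_key()   # in practice, fetched from your KMS
fernet = Fernet(column_key)

def log_interaction(user_id: str, prompt: str, response: str) -> dict:
    # Identity stays searchable; the privileged columns (the user's
    # question and the model's answer) are encrypted at the column level.
    return {
        "user_id": user_id,
        "prompt": fernet.encrypt(prompt.encode()).decode(),
        "response": fernet.encrypt(response.encode()).decode(),
    }

def read_for_forensics(row: dict) -> tuple[str, str]:
    # Decryption is restricted to the few users entitled to the audit trail.
    return (fernet.decrypt(row["prompt"].encode()).decode(),
            fernet.decrypt(row["response"].encode()).decode())
```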
Before we get into LLM SecOps, security ops, we need to cover, at a high level, what we are really protecting and gearing towards with an LLM application, and how that is different from a traditional application.
Traditional applications are equally worried about data leakage, but they are less worried about prompt injection. They're less worried about model poisoning, to some extent, because sometimes there are no models.
Adversarial attacks against LLM applications are different in nature from the attacks on existing, non-LLM applications. For example, an adversarial attack on an LLM application could aim to introduce LLM hallucination.
And finally, ethical concerns are a primary part of LLM SecOps, and we need to manage those.
So let's start with ethical concerns. Ethical concerns are something we should think about when we're building out applications, and they can be tuned using content filtering. Data leakage can be addressed using data rights management and double or column encryption. Adversarial attacks can be detected and protected against using SecOps, threat detection, and threat intelligence.
Model poisoning and prompt injection can be protected against using traditional input validation, metered and throttled responses, content filtering, and built-in cloud resources of that type, making sure the model stays protected and inputs are checked.
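A minimal sketch of those traditional controls applied to an LLM input path, with illustrative limits: length validation plus a simple per-user sliding-window throttle.

```python
# Sketch of traditional input validation and throttling in front of the
# model. The limits and the in-memory window are illustrative assumptions;
# production systems would use a shared store such as a cache cluster.
import time
from collections import defaultdict, deque

MAX_PROMPT_CHARS = 4000
MAX_REQUESTS_PER_MINUTE = 20
_recent: dict[str, deque] = defaultdict(deque)

def admit(user_id: str, prompt: str) -> bool:
    # Input validation: reject empty or oversized prompts before they
    # ever reach the model.
    if not prompt.strip() or len(prompt) > MAX_PROMPT_CHARS:
        return False
    # Metered, throttled responses: a sliding one-minute window per user
    # blunts automated poisoning and injection probing.
    now = time.monotonic()
    window = _recent[user_id]
    while window and now - window[0] > 60:
        window.popleft()
    if len(window) >= MAX_REQUESTS_PER_MINUTE:
        return False
    window.append(now)
    return True
```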
Finally, we get to the point where we can actually do things today to protect our LLM applications. I file these under Tactical LLM SecOps: essentially, a few high-level steps we can take today, using existing tooling, to protect our LLM applications.
Number one, use your existing tooling for DAST, SAST, and SCA to check all the LLM open-source components. Make sure all the code written for your LLM application goes through your secure coding practices just like your other code, even though it might be cloud native, even though it might be more configuration-heavy, even though it might look like a serverless function.
All of this code needs to be protected and reviewed. We suggest using tools as existing pipeline steps to run scans of this code, making sure these particular code snippets and code libraries are able to protect against prompt injection and model poisoning.
There are different tools out there that are able to look for these kinds of patterns within LLM-based code.
do response or input query caching, you can build out and scan this particular
piece of code with these scanners to ensure you are focused and hyper focused
on finding those vulnerabilities.
Ensure that content filtering goes both ways, on the input and the output. Ensure the model is not able to accept PII, or is able to accept PII if needed, but only within certain limitations.
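Here is a minimal sketch of content filtering running in both directions; the PII regexes are deliberately simple illustrations, and production filters are far more thorough.

```python
# Sketch of bidirectional content filtering: the same PII screen runs on
# the user's input and on the model's output. The regexes are simple
# illustrations, not a production-grade PII detector.
import re

PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def redact(text: str) -> str:
    for name, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{name} removed]", text)
    return text

def filtered_roundtrip(user_input: str, llm_call) -> str:
    # Filter on the way in (the model never sees raw PII) and on the
    # way out (the model cannot leak PII back to the user).
    return redact(llm_call(redact(user_input)))
```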
Make sure the data that gets ingressed into the model is classified, labeled, and tagged. Those are the things you can check today at the code level.
You can make sure the rights management in your backend logic is tested, using DAST and other validation tools.
And last, if not least, ensure all your logs, for all your applications, are end to end, able to capture data from the user to the machine and back to the user. In case you need to go back and trace LLM interactions, to detect abuse, detect illegal data usage, or find model hallucination or prompt injection, those logs will be critical.
Thank you for joining me today for this conversation about LLM Security Ops. I hope you enjoyed it and took away the key points: how to keep the front end and the back end separate, and how to build up the automation and the security from a shift-left perspective, from day one, in your LLM applications, or when adding LLM capability to existing applications.
If you have any further questions, or you would like to have a dialogue, or you have feedback, feel free to contact us at www.auxin.io. My email is omarfarouq at auxin.io, and I'm always happy to respond to constructive feedback.
Thank you and have a great day.