Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hi, Tim Spann here.
My talk today is Codeless Generative AI Pipelines: Prompt Automation.
Now I want to apologize.
I was in the hospital this week.
So my talk today may not be as fast or as fun as normal, but
we'll do as well as we can.
And the slides are available to you and all the source code.
So thanks, and bear with me.
If you have any questions, please contact me.
You can get me on LinkedIn, social media, or email.
Thank you.
Now, these slides, again, the link is right there.
You can also hit the QR code and pet my cat on the way to the slides.
Before we get into it: prompt automation is really about making prompts as powerful and as easy as possible, and taking out the drudgery.
Very awesome.
Often there's a lot of extra work that people need to do and
we're going to minimize that.
But first, what is this all about?
Recently, if you haven't heard, and I'm sure you have, with the power of the new generative AI, unstructured data is now extremely useful, whether it's documents, images, spreadsheets, notes, emails, or audio.
Lots of things in your knowledge base, tons of documents.
Using these deep learning models, we can now convert them into vectors, which are just big arrays, a thousand or more dimensions, that can be stored in a vector database.
And this makes your unstructured data available for very fast searching and for whatever else you need to do.
And the power of this comes from these vector embeddings.
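As a rough sketch of what that flow can look like in code, here is a minimal example of embedding a few pieces of text and storing them in Milvus. It assumes pymilvus (with Milvus Lite) and sentence-transformers are installed; the model name and collection name are just illustrative choices, not anything specific from the talk.

```python
# Minimal sketch: embed text with a sentence-transformer and store it in Milvus.
from pymilvus import MilvusClient
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")   # 384-dimensional embeddings
client = MilvusClient("milvus_demo.db")           # Milvus Lite, stored as a local file

client.create_collection(collection_name="docs", dimension=384)

docs = ["unstructured notes", "an email body", "a paragraph from a PDF"]
vectors = model.encode(docs)

client.insert(
    collection_name="docs",
    data=[
        {"id": i, "vector": vectors[i].tolist(), "text": docs[i]}
        for i in range(len(docs))
    ],
)
```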
And, like we mentioned, unstructured data is everywhere.
Because, as you can imagine, it's social media posts, it's logs, it's email, text from anywhere, documents, legal documents, PDFs, whatever you have.
And images, obviously, videos, and a ton more.
Now, we have to be able to find it before we can use it.
Now, once we've gotten this data stored, we want to be able to search and make these things readily available to whoever needs them.
And so we take them, transform them into vectors, we get our embeddings, they're stored, and now someone's going to do a query.
Now, this is where we could automate a lot of things around the prompts that people are putting in, because maybe they just put in the most basic query here to get back whatever makes sense from the vector database, using an approximate nearest neighbor similarity search.
We'll do the math to get whatever is close to what you need, which is nice.
You get just what you want, especially if it's been stored properly. We use these results, send them to the LLM, and everything's good.
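Continuing the earlier sketch, the search-and-prompt step might look roughly like this; `client` and `model` are the MilvusClient and SentenceTransformer from the snippet above, and the question text and commented-out LLM call are placeholders.

```python
# Embed the question, run an approximate nearest neighbor search in Milvus,
# and hand the top hits to the LLM as context.
question = "What cameras are near the bridge?"
q_vec = model.encode([question])[0].tolist()

hits = client.search(
    collection_name="docs",
    data=[q_vec],
    limit=3,                      # top-3 approximate nearest neighbors
    output_fields=["text"],
)

context = "\n".join(hit["entity"]["text"] for hit in hits[0])
prompt = (
    "Answer using only this context:\n"
    f"{context}\n\n"
    f"Question: {question}"
)
# answer = llm.invoke(prompt)     # whichever LLM client you use
```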
Now, Milvus is the open source product that I'm talking about.
It has, at this point, over 30,000 stars, tons of users, tons of downloads.
It's easy to use.
Pip install it in a notebook and use it right there, or run it in Docker, in a big cluster, or in Zilliz Cloud.
So you can write code once in the notebook and it is available wherever you need it to be, which makes it great.
And it's integrated with everything you need it to be, whether it's different models, people hosting models, different libraries, or sources, with support for dense and sparse embeddings, filtering, re-ranking, and all the features you expect.
Very easy to get started.
And why is this?
Why are you automating these prompts?
With the support of RAG, this makes us even more powerful.
We can easily take whatever you use as your question, and we can also cache these: keep the most popular ones in a collection in the vector database and map ones that are the same or nearly the same, and we can do that with scalar filtering as well as vector search, as in the sketch below.
So this makes that pretty easy.
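Here is a hedged sketch of what such a query cache could look like, reusing the client and embedding model from the earlier snippets. The collection name, tenant filter, and 0.9 threshold are illustrative assumptions, and the score comparison assumes a COSINE-style metric where higher means closer.

```python
# Semantic query cache: look for a near-duplicate cached question with a
# vector search plus a scalar filter before ever calling the LLM.
def cached_answer(client, model, question: str, tenant: str = "demo"):
    q_vec = model.encode([question])[0].tolist()
    hits = client.search(
        collection_name="query_cache",
        data=[q_vec],
        limit=1,
        filter=f'tenant == "{tenant}"',      # scalar filtering
        output_fields=["answer"],
    )
    if hits and hits[0] and hits[0][0]["distance"] > 0.9:
        return hits[0][0]["entity"]["answer"]   # cache hit: skip the LLM
    return None                                 # cache miss: fall through to the LLM
```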
This will improve the accuracy, reduce hallucinations, and make it specific to the domain of data that you put in there.
I think this is really important and we could do that against any model.
It is not tied to any platform or model.
Whatever's in your vector database can be used.
This is important.
So: cached queries; finding documents and information that makes the prompts easy; and using libraries like LangChain and LlamaIndex that automate a lot of this RAG and build some of these prompts for you.
And of course you could use prompt templates as well.
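For example, a minimal prompt template with LangChain might look like this; the import path assumes a recent langchain-core release, and the template wording and field values are just examples.

```python
# A reusable prompt template that gets filled in with retrieved context.
from langchain_core.prompts import PromptTemplate

template = PromptTemplate.from_template(
    "You are a helpful assistant for {domain} data.\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer using only the context above."
)

filled = template.format(
    domain="aviation",
    context="...retrieved chunks from the vector database...",
    question="Which flights are near the airport right now?",
)
```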
And again, you own the data.
You don't have to worry where it's going.
And often you may not even need to call the LLM if we've got things cached.
Now, how can I get data in?
How can I automate building the prompts?
Let's take this a level up.
Now, often you just write a Python app, and those can be really good, and there are libraries for automating a lot of the prompts, but what if you don't want to do that?
What if you have real-time workloads?
You could do dataflow pipelines, and these could be built with a number of real-time tools such as Apache NiFi; there are some tools in Flink and some other open source ones in the real-time space.
But the major thing they do is make sure we can get the context externally from wherever it needs to be.
So ingest this data, route it, clean it, enrich it, transform it, parse it, chunk it into pieces, vectorize it, and get everything you need there.
They craft these prompts for you automatically, again either from templates or by applying different logic based on decisions you've made, pre-parsing your initial query or suggestion to build out a proper prompt that will get the results we need, and obviously augmenting that with context we retrieve from external sources, especially a vector database.
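As a plain-Python stand-in for the chunk-and-vectorize steps such a pipeline performs, here is a small sketch; the chunk size, overlap, and file name are arbitrary example values, and `model` is the embedding model from the earlier snippet.

```python
# Split a document into overlapping chunks and embed each chunk.
def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

with open("report.txt") as f:          # placeholder document
    pieces = chunk(f.read())

piece_vectors = model.encode(pieces)   # one vector per chunk, ready to insert
```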
And round trips, so we can talk to things like Discord, Slack, REST interfaces, Kafka, SQL, whatever those are.
So we can make that managed for you and get the results back.
With the latest version of NiFi, 2.0, which is finally in production, we use the latest Java.
It is very fast and it lets us run Python.
So we can take those really powerful Python libraries and apps we had and run them in a manner that makes it easy to automate things, enrich things, and improve our prompts very easily, without having to hand-code and connect all these things, and be able to scale it out, so I can take anything that comes from Kafka and do that.
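As a rough sketch of what that looks like, here is a NiFi 2.x Python processor that assembles a prompt from a flowfile: the retrieved context arrives as the flowfile content and the user's question as an attribute. The attribute name and template wording are assumptions on my part, not taken from the talk's flows.

```python
# NiFi 2.x Python processor sketch: build an LLM prompt from flowfile
# content (retrieved context) plus a flowfile attribute (the question).
from nifiapi.flowfiletransform import FlowFileTransform, FlowFileTransformResult


class BuildPrompt(FlowFileTransform):
    class Java:
        implements = ['org.apache.nifi.python.processor.FlowFileTransform']

    class ProcessorDetails:
        version = '2.0.0'
        description = 'Builds an LLM prompt from flowfile content and attributes.'

    def __init__(self, **kwargs):
        pass

    def transform(self, context, flowFile):
        question = flowFile.getAttribute('user.question') or ''   # assumed attribute name
        retrieved = flowFile.getContentsAsBytes().decode('utf-8')
        prompt = (
            'Answer using only this context:\n'
            f'{retrieved}\n\n'
            f'Question: {question}'
        )
        return FlowFileTransformResult(relationship='success', contents=prompt)
```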
I've got one that gets company names out.
So if I want to parse your prompt before we do some things, I
can get your company names out.
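One hedged way to do that kind of pre-parsing is a small named-entity-recognition helper; spaCy and its small English model are my choice here, since the talk doesn't name the library its flow uses.

```python
# Pull organization names out of an incoming prompt with spaCy NER.
import spacy

nlp = spacy.load("en_core_web_sm")   # requires: python -m spacy download en_core_web_sm

def company_names(prompt: str) -> list[str]:
    doc = nlp(prompt)
    return [ent.text for ent in doc.ents if ent.label_ == "ORG"]

print(company_names("Compare Cloudera and Confluent pricing for Kafka."))
```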
When we have stuff coming through, and this may be toward the end, I want to be able to show captions with my images.
Very easy to do that, and to add additional classification and information there.
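A sketch of that captioning step, using the BLIP model from Hugging Face Transformers as one possible choice (the talk doesn't say which captioning model its flow uses); the image path is a placeholder.

```python
# Generate a caption for an image with a pretrained BLIP model.
from transformers import BlipProcessor, BlipForConditionalGeneration
from PIL import Image

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
captioner = BlipForConditionalGeneration.from_pretrained(
    "Salesforce/blip-image-captioning-base"
)

image = Image.open("street_camera.jpg").convert("RGB")   # placeholder image
inputs = processor(image, return_tensors="pt")
caption = processor.decode(captioner.generate(**inputs)[0], skip_special_tokens=True)
print(caption)
```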
We do not want any problematic images coming through the system, so we'll use a model to detect those, and we could automate that as well in regular Python or in our NiFi automation.
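Here is a hedged sketch of that kind of safety gate with a generic image-classification pipeline; "some-org/nsfw-detector" is a placeholder model id, not a recommendation from the talk, and the label name and 0.5 threshold are assumptions.

```python
# Content-safety gate: reject images a classifier flags as problematic.
from transformers import pipeline

safety = pipeline("image-classification", model="some-org/nsfw-detector")  # placeholder id

def is_safe(image_path: str) -> bool:
    scores = {r["label"].lower(): r["score"] for r in safety(image_path)}
    return scores.get("nsfw", 0.0) < 0.5
```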
Detecting facial emotions could be helpful for me right now, with what's going on with surgery.
And finally, the next part is, once we have it, I need to distribute these workloads, maybe to Flink, to Ray, to Spark, or maybe directly dump them into Milvus. Kafka is the key.
Kafka is an awesome writer, and there's a great museum in Prague that I visited, but it is also a really powerful tool: a central data hub to move your data around, and that becomes really important.
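A minimal sketch of publishing an enriched record to Kafka so those downstream engines can pick it up; the broker address, topic name, and record fields are assumptions, and kafka-python is just one client choice.

```python
# Publish an enriched record to a Kafka topic as JSON.
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send(
    "enriched-prompts",
    {"prompt": "What is in this image?", "caption": "a street corner with traffic"},
)
producer.flush()
```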
If you want to run NiFi, you can do that extremely easily by just running it in Docker.
And you can use 1.28 or 2.0; 2.0 is the one to go with.
That gives you the Python and lets you do all this automation.
And I have an example here where we are streaming street cameras, and we use this so we can chat with it in Slack, which makes it very easy to do.
If you want more information, check out our Discord and give us a star on GitHub.
I have a couple of interesting use cases there you could try out, one of them using text and one of them on a Raspberry Pi, so that one's a cool edge use case.
And every week I have a newsletter out; please check it out.
Let's take a look at some of the systems here.
Now I have a number of different collections of data.
This one is specifically for chatting with airplanes; I'm doing a quick search here, because I capture live airplane data as it's in flight, and we can take a look at what that data is.
It's things like images, latitude, where things are, and when you have this sort of data in here, there's the metadata, there's the vector, and it makes it very easy to do conversational things with your data.
We see here we have to rerun this query; it timed out.
I had this running since last night, which I probably shouldn't do.
Okay, so we can look, there's a lot of them here.
Too much data to show.
And you can see we're getting back all the data we need.
And then we could do that with whatever data.
Now, if you're doing a standard RAG, it's a little simpler.
But this data is useful when you're building up your prompts and automating it, and it shows you the power of what we can do with the visualization.
Now I have an example RAG here.
We will load it, have the embeddings come through, and then, with the power of LangChain, this will build up the proper prompt from my query.
My query is something like "What's in this image?" and then it just passes back the results I need here.
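Here is a hedged reconstruction of roughly what that demo flow does, with Milvus as the LangChain vector store; the package names, embedding model, collection name, and URI are assumptions rather than the exact demo code.

```python
# Retrieve context from Milvus via LangChain and build the RAG prompt.
from langchain_milvus import Milvus
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_core.prompts import PromptTemplate

embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
store = Milvus(
    embedding_function=embeddings,
    collection_name="images",
    connection_args={"uri": "milvus_demo.db"},
)

query = "What is in this image?"
docs = store.similarity_search(query, k=3)
rag_prompt = PromptTemplate.from_template(
    "Context:\n{context}\n\nQuestion: {question}"
).format(
    context="\n".join(d.page_content for d in docs),
    question=query,
)
# answer = llm.invoke(rag_prompt)   # whichever LLM client the flow calls
```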
It is pretty straightforward on your part.
You don't have to do too much hard work.
Thanks for attending my session, and I'll see you again.