Transcript
            
            
            
            
            
            
              Hi, Tim Spann here.
            
            
            
My talk today is on codeless generative AI pipelines and prompt automation.
            
            
            
              Now I want to apologize.
            
            
            
              I was in the hospital this week.
            
            
            
              So my talk today may not be as fast or as fun as normal, but
            
            
            
              we'll do as well as we can.
            
            
            
And the slides and all the source code are available to you.
            
            
            
              So thanks, and bear with me.
            
            
            
              If you have any questions, please contact me.
            
            
            
You can reach me on LinkedIn, social media, or email.
            
            
            
              Thank you.
            
            
            
Now these slides, again, the link is right there.
            
            
            
              You can also hit the QR code and pet my cat on the way to the slides.
            
            
            
Before we get into it, prompt automation is really about making prompts as
            
            
            
powerful and as easy as possible,
            
            
            
              and taking out the drudgery.
            
            
            
              Very awesome.
            
            
            
              Often there's a lot of extra work that people need to do and
            
            
            
              we're going to minimize that.
            
            
            
              But first, what is this all about?
            
            
            
Recently, if you haven't heard, and I'm sure you have, with the power
            
            
            
of the new generative AI, unstructured data is now extremely useful,
            
            
            
whether it's documents, images,
            
            
            
              spreadsheets, notes, emails, audio.
            
            
            
              Lots of things in your knowledge base, tons of documents.
            
            
            
Using deep learning models, we can now convert them into vectors,
            
            
            
which are just big arrays, a thousand or more dimensions,
            
            
            
              that can be stored in a vector database.
            
            
            
              And this makes your unstructured data available for very fast searching and
            
            
            
for whatever you need to do with it.
            
            
            
              And the power of this is because of these vector embeddings.
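
As a rough sketch, not code from the talk, turning a piece of text into one of those vectors might look like this in Python (the embedding model named here is just an illustrative choice):

from sentence_transformers import SentenceTransformer

# Load an off-the-shelf embedding model (illustrative choice, not necessarily the talk's).
model = SentenceTransformer("all-MiniLM-L6-v2")

# Turn a piece of unstructured text into a fixed-length vector.
text = "Quarterly report: revenue grew 12 percent year over year."
vector = model.encode(text)

print(len(vector))  # 384 dimensions for this particular model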
            
            
            
              And, like we mentioned, unstructured data is everywhere.
            
            
            
              Because, as you imagine, it's social media posts, it's logs, it's email,
            
            
            
              text from anywhere, documents, legal documents, PDFs, whatever you have.
            
            
            
And images, obviously, videos, and a ton more.
            
            
            
              Now, we have to be able to find it before we can use it.
            
            
            
              Now, once we've gotten this data stored, we want to be able to
            
            
            
search it and make these things readily available to whoever needs them.
            
            
            
              And so we take them, transform them into vectors, we get our embeddings, they're
            
            
            
stored, and now someone's going to do a query.
            
            
            
              Now this is where we could automate a lot of things around the prompts
            
            
            
              that people are putting in.
            
            
            
Because maybe they just put in the most basic query here to get back
            
            
            
              whatever makes sense from the vector database, using an approximate
            
            
            
              nearest neighbor similarity search.
            
            
            
              We'll do the math to get whatever is close to what you need, which is nice.
            
            
            
You get just what you want, especially if it's been stored properly. We
            
            
            
              use these results, send them to the LLM, and everything's good.
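
A minimal sketch of that similarity search with the pymilvus client (the collection and field names are placeholders, and query_vector is the embedded question from the earlier step):

from pymilvus import MilvusClient

# Connect to a local Milvus Lite file (path is a placeholder).
client = MilvusClient("milvus_demo.db")

# Approximate nearest neighbor search: find the stored vectors closest to the query.
results = client.search(
    collection_name="documents",        # placeholder collection name
    data=[query_vector],                # the embedded user question
    limit=5,                            # top-5 nearest neighbors
    output_fields=["text", "source"],   # metadata to carry into the prompt
)

for hit in results[0]:
    print(hit["distance"], hit["entity"]["text"])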
            
            
            
              Now, Milvus is the open source product that I'm talking about.
            
            
            
It has at this point over 30,000 stars, tons of users, tons of downloads.
            
            
            
              It's easy to use.
            
            
            
Pip install it in a notebook and use it right there, or use it in Docker, or
            
            
            
in a big cluster, or in Zilliz Cloud.
            
            
            
              So you can write code once in the notebook and it is available wherever
            
            
            
              you need it to be, which makes it great.
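
For instance, a minimal notebook start with Milvus Lite might look like this (the file name, collection name, and dimension are placeholders, and it assumes pip install pymilvus has been run):

from pymilvus import MilvusClient

# Milvus Lite: the whole database lives in a local file, no server required.
client = MilvusClient("milvus_demo.db")

# Create a collection sized to match the embedding model's output dimension.
client.create_collection(collection_name="documents", dimension=384)

# The same code can later point at a Docker, cluster, or Zilliz Cloud deployment
# by swapping the local file path for a server URI.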
            
            
            
              And it's integrated with everything you need it to be, whether it's
            
            
            
different models, people hosting models, different libraries,
            
            
            
data sources, and support for dense and sparse embeddings, filtering, re-ranking,
            
            
            
              and all the features you expect.
            
            
            
              Very easy to get started.
            
            
            
              And why is this?
            
            
            
              Why are you automating these prompts?
            
            
            
With the support of RAG, this makes it even more powerful.
            
            
            
We can easily take whatever you use as your question, and we can also cache these.
            
            
            
              Like keep the most popular ones in a collection in the vector database.
            
            
            
              Map ones that are the same or nearly the same, and we can do that with scalar
            
            
            
              filtering, as well as vector search.
            
            
            
              So this makes that pretty easy.
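
A hedged sketch of that kind of query cache, assuming a cache collection that stores each question's text, embedding, and answer, plus a scalar field for filtering (all names here are placeholders):

from pymilvus import MilvusClient

client = MilvusClient("milvus_demo.db")

# Vector search finds questions that are the same or nearly the same semantically,
# while a scalar filter narrows the match to the right slice of the data.
hits = client.search(
    collection_name="prompt_cache",       # placeholder cache collection
    data=[question_vector],               # embedding of the incoming question
    filter='domain == "finance"',         # placeholder scalar filter
    limit=1,
    output_fields=["question", "answer"],
)

# If the best match is close enough (assuming a cosine/IP metric where higher
# scores mean more similar), reuse the cached answer and skip the LLM call.
if hits[0] and hits[0][0]["distance"] > 0.9:
    answer = hits[0][0]["entity"]["answer"]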
            
            
            
              This will improve the accuracy.
            
            
            
Reduce hallucinations and make it specific to your domain of
            
            
            
              data that you put in there.
            
            
            
              I think this is really important and we could do that against any model.
            
            
            
              It is not tied to any platform or model.
            
            
            
              Whatever's in your vector database can be used.
            
            
            
              This is important.
            
            
            
              So cached queries.
            
            
            
Finding documents and information that makes the prompts easy, and using
            
            
            
libraries like LangChain and LlamaIndex that automate a lot of this RAG and
            
            
            
              build some of these prompts for you.
            
            
            
              And of course you could use prompt templates as well.
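
As one small example of such a template, here with LangChain (the template wording and the doc_texts variable are just illustrations):

from langchain_core.prompts import PromptTemplate

# A reusable template: the retrieved context and the user question are slotted in
# automatically, so nobody hand-writes the full prompt each time.
template = PromptTemplate.from_template(
    "Answer the question using only the context below.\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)

prompt = template.format(
    context="\n".join(doc_texts),           # texts returned by the vector search
    question="What is our refund policy?",  # placeholder user question
)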
            
            
            
              And again, you own the data.
            
            
            
              You don't have to worry where it's going.
            
            
            
              And often you may not even need to call the LLM if we've got things cached.
            
            
            
              Now, how can I get data in?
            
            
            
              How can I automate building the prompts?
            
            
            
Let's take this a level up.
            
            
            
              Now often you just write a Python app and those can be really good and
            
            
            
              there are libraries for automating a lot of the prompts, but what
            
            
            
              if you don't want to do that?
            
            
            
              What if you have real time workloads?
            
            
            
              You could do dataflow pipelines and these could be from a number of real
            
            
            
time tools such as Apache NiFi.
            
            
            
There are some tools in Flink and there are some other open source
            
            
            
              ones in the real time space.
            
            
            
              But the major things that they do are make sure we can get the context
            
            
            
externally from wherever it needs to be.
            
            
            
              So ingest this data, route it, clean it, enrich it, transform it, parse
            
            
            
it, chunk it into pieces, vectorize it, and get everything you need there.
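
Whether that happens in a visual flow or in plain Python, the chunk-and-vectorize step itself is small; a rough sketch, reusing the embedding model and Milvus client from earlier (field names assume the quick-setup collection):

# Split a long document into overlapping chunks, then embed each one.
def chunk_text(text, size=500, overlap=50):
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

chunks = chunk_text(document_text)   # document_text comes from the ingest step
vectors = model.encode(chunks)       # same embedding model as before

rows = [
    {"id": i, "text": chunk, "vector": vec.tolist()}
    for i, (chunk, vec) in enumerate(zip(chunks, vectors))
]
client.insert(collection_name="documents", data=rows)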
            
            
            
              Crafting these prompts for you automatically, again from either templates
            
            
            
or applying different logic from what decisions you've made, pre-parsing your
            
            
            
              initial query or suggestion to build out a proper prompt that'll get the results
            
            
            
we need, and obviously augment that with the context we retrieve from external
            
            
            
              sources, especially a vector database.
            
            
            
            
            
            
And round trips, so we can talk to things like Discord, Slack, REST
            
            
            
interfaces, Kafka, SQL, whatever those are.
            
            
            
              So we can make that managed for you, get the results back.
            
            
            
With the latest version of NiFi, 2.0, which is finally in production,



you can use the latest Java.
            
            
            
              It is very fast and it lets us run Python.
            
            
            
              So we can take those really powerful Python libraries and apps we had and
            
            
            
make them into a form that makes it easy to automate things, enrich things,
            
            
            
and improve our prompts very easily, without having to hand-code and
            
            
            
connect all these things, and be able to scale it out, so I can take anything
            
            
            
              that comes from Kafka and do that.
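
To give a feel for the shape of it, a NiFi 2.x Python processor looks roughly like this (a hedged sketch, not one of the actual processors from the talk; the enrichment logic is a placeholder):

from nifiapi.flowfiletransform import FlowFileTransform, FlowFileTransformResult

class EnrichPrompt(FlowFileTransform):
    class Java:
        implements = ['org.apache.nifi.python.processor.FlowFileTransform']

    class ProcessorDetails:
        version = '0.0.1'
        description = 'Illustrative processor that tags an incoming prompt.'

    def __init__(self, **kwargs):
        super().__init__()

    def transform(self, context, flowfile):
        # Read the incoming prompt, enrich it, and pass it downstream.
        prompt = flowfile.getContentsAsBytes().decode('utf-8')
        enriched = "Answer concisely.\n" + prompt   # placeholder enrichment
        return FlowFileTransformResult(
            relationship="success",
            contents=enriched,
            attributes={"enriched": "true"},
        )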
            
            
            
              I've got one that gets company names out.
            
            
            
              So if I want to parse your prompt before we do some things, I
            
            
            
              can get your company names out.
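
Under the hood that is just named-entity extraction; a rough stand-in in plain Python, assuming spaCy and its small English model are installed:

import spacy

# Load a small English model with named-entity recognition.
nlp = spacy.load("en_core_web_sm")

prompt = "Compare the latest earnings from Apple and Nvidia."
doc = nlp(prompt)

# ORG entities are the company names we can pull out before building the real prompt.
companies = [ent.text for ent in doc.ents if ent.label_ == "ORG"]
print(companies)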
            
            
            
When we have stuff coming through, and this may be towards the end,
            
            
            
              I want to be able to show captions with my images.
            
            
            
              Very easy to do that.
            
            
            
              And, add additional classification and information there.
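
As an illustration of what a caption step could look like in Python (the model choice here is an assumption, not necessarily what the flow uses):

from transformers import pipeline

# An off-the-shelf image-captioning model (assumed choice for illustration).
captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

# Returns a list like [{"generated_text": "a cat sitting on a laptop"}].
caption = captioner("street_camera_frame.jpg")[0]["generated_text"]
print(caption)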
            
            
            
We do not want any problematic images coming through the system.
            
            
            
              So we'll use a model to detect those and we could automate that as well in
            
            
            
              regular Python or in our NiFi automation.
            
            
            
Detecting facial emotions could be helpful for me right now with
            
            
            
what's going on with surgery.
            
            
            
And finally, the next part is, once we have it, I need to distribute these workloads, maybe
            
            
            
to Flink, to Ray, to Spark, maybe directly dump them into Milvus. Kafka is the key.
            
            
            
Kafka is an awesome writer; there's a great museum in Prague that I visited,
            
            
            
              but it is also a really powerful tool.
            
            
            
It's the central data hub to move your data around, and that becomes really important.
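
Publishing an enriched prompt onto that hub is a couple of lines in most clients; a sketch with kafka-python (the broker address, topic name, and payload fields are placeholders):

import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",                        # placeholder broker
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Downstream consumers (Flink, Spark, Ray, or a Milvus sink) pick this up from the topic.
producer.send("enriched-prompts", {"prompt": prompt_text, "context_ids": context_ids})
producer.flush()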
            
            
            
              If you want to run NiFi, you can do that extremely easily
            
            
            
by just running it in Docker.
            
            
            
And you can use the 1.28 or 2.0 release.



2.0 is the one to go with.
            
            
            
              That gives you the Python, lets you do all this automation.
            
            
            
              And I have an example here.
            
            
            
where we are streaming street cameras, and we use this so we can chat with
            
            
            
Slack, which makes it very easy to do. If you want more information, check out our
            
            
            
Discord, give us a star on GitHub. I have a couple of interesting use cases
            
            
            
there you could try out, one of them using text, one of them on a Raspberry
            
            
            
Pi. So that one's a cool edge use case.
            
            
            
              And every week I have a newsletter out, please check it out.
            
            
            
Let's take a look at some of the systems here.
            
            
            
              Now I have a number of different collections of data.
            
            
            
This one is specifically for chatting with airplanes; I'm doing a quick search here,



because I capture live airplane data as it's in flight, and we
            
            
            
              could take a look at what that data is.
            
            
            
It's things like images, latitude, where things are, and when you have
            
            
            
this sort of data here, there's the metadata, there's the vector, it makes
            
            
            
it very easy to do conversational things with your data. We see here we
            
            
            
have to rerun this query; it timed out.
            
            
            
              I had this running since last night, probably shouldn't do that.
            
            
            
              Okay, so we could look, there's a lot of them here.
            
            
            
              Too much data to show.
            
            
            
And you can see we're getting back all the data we need.
            
            
            
              And then we could do that with whatever data.
            
            
            
Now if you're doing a standard RAG, it's a little simpler.
            
            
            
              But this data is useful when you're building up your
            
            
            
              prompts and automating it.
            
            
            
But this is just to show you the power of what we can do with the visualization.
            
            
            
Now I have an example RAG here.
            
            
            
We will load it, have the embeddings come through, and then with the power
            
            
            
of LangChain, this'll build up the proper prompt from my query.
            
            
            
My query is something like "what's in this image", and then it just passes back
            
            
            
              the results to what I need here.
            
            
            
              It is pretty straightforward on your part.
            
            
            
              You don't have to do too much hard work.
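
The rough shape of that example, as a hedged sketch with LangChain and Milvus (the collection name, local database file, embedding model, and the llm object are all placeholders, not the exact setup from the demo):

from langchain_community.vectorstores import Milvus
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_core.prompts import PromptTemplate

embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

# Point LangChain at an existing Milvus collection of image captions and metadata.
vectorstore = Milvus(
    embedding_function=embeddings,
    collection_name="images",
    connection_args={"uri": "milvus_demo.db"},
)

# Retrieve the closest context for the query, then build the prompt from it.
question = "What's in this image?"
docs = vectorstore.similarity_search(question, k=3)
context = "\n".join(d.page_content for d in docs)

prompt = PromptTemplate.from_template(
    "Use the context to answer.\nContext:\n{context}\nQuestion: {question}"
).format(context=context, question=question)

answer = llm.invoke(prompt)   # llm is whatever chat model you have configured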
            
            
            
Thanks for attending my session, and I'll see you again.