Transcript
This transcript was autogenerated.
Hi, everyone. My name is Ben. I'm the co-founder of Symante AI. And in
this talk, we're going to dive deep into semantic search and
how to set that up, what the technical difficulties are, and most importantly,
where you would use that in a business sense. So without
further ado, let's just jump straight into it, and we
have to give a little bit of technical detail first. And that is
what the building block of semantic search is, and that is vector databases.
This is actually a technology that's been around for some time. In fact, it dates
back to sometime around 2005, but with the
emergence of large language models, particularly, obviously, ChatGPT,
but many others as well, we've seen a revamp of this technology,
and most importantly, we've seen a huge technological breakthrough. So now
semantic search is technically so much easier to
implement than it was a couple of years ago without large language
models. Actually, Gartner predicts that by 2026
or so, roughly 30% of all businesses will have implemented
vector databases, the technology that powers
semantic search. So what are vector databases? Let's just take a
look at the example on the right, on the bottom right, where you have a
matrix which consists of four words, man, woman,
king and queen. Now, obviously, those are words that are somehow connected.
Man and woman are genders. King and queen are both types of royalty,
but within each subcategory they're polar opposites.
And so if you were to assign them coordinates, king would be
(1, 1), queen (2, 2), man (1, 1.5) and woman (2, 2.5).
You can see on that matrix that
man and king are much closer to each other than, for example, woman and king.
And conversely, woman and queen are very close, but queen
and man are a bit further apart, and that is because they are
semantically linked. They're closer to each other. So what
happened in this example is that these words were stored as vector representations,
then they had to be somehow compared, and that comparison
was done using one of several methods. Just to name
a few: cosine, Euclidean, or Jaccard distance.
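To make the toy example concrete, here's a minimal sketch using the coordinates assumed above, with both a cosine similarity and a brute-force nearest-neighbor lookup by Euclidean distance:

```python
import math

# Toy 2-D "embeddings" from the king/queen/man/woman example.
vectors = {
    "king":  (1.0, 1.0),
    "queen": (2.0, 2.0),
    "man":   (1.0, 1.5),
    "woman": (2.0, 2.5),
}

def cosine_similarity(a, b):
    # Angle-based similarity: 1.0 means the vectors point the same way.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def nearest_neighbor(word):
    # Brute-force nearest neighbor by Euclidean distance.
    others = [w for w in vectors if w != word]
    return min(others, key=lambda w: math.dist(vectors[word], vectors[w]))

print(nearest_neighbor("king"))   # man
print(nearest_neighbor("woman"))  # queen
```

At real scale you wouldn't scan every vector like this; you'd use an approximate nearest-neighbor index, which is exactly what a vector database provides.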
So you basically measure the distance between those coordinates, in a sense,
or the vectors, and then you have to use an algorithm which
is going to find the nearest neighbors when you have a query.
So that's how vector databases power semantic search. As the backbone,
there are multiple solutions out there, and new ones are sprouting just about every day
now. But what we found is that they usually only give you a
part of the process. The most typical part that they give you
is this vector database, which basically says,
okay, you have a bit of data here, it can be documents,
structured data, audio, images. And what
we're going to do for you is create vectors out of that data.
So you put that data in via REST API or whatever
they have at their disposal and we'll spit out some vectors for you.
And you might be sitting there saying, okay, well, that does not seem like the
full search solution, so we need a little bit more. And that
more is now offered by large language models. And the large
language model basically allows you to do two things.
First and foremost, represent a user's query
in a vector state as well,
and also generate an answer once you retrieve some information.
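Those two roles can be sketched in a few lines. Note that `embed` and `generate` below are placeholder stand-ins, not any specific model's API — a real system would call an embedding model and an LLM here:

```python
import math

def embed(text: str) -> list:
    # Placeholder embedding: hashes words into a small normalized vector.
    # A real system calls an embedding model instead.
    vec = [0.0] * 8
    for word in text.lower().split():
        vec[hash(word) % 8] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def retrieve(query_vec, index, top_k=1):
    # Vectors are normalized, so the dot product acts as cosine similarity.
    scored = sorted(index, key=lambda d: -sum(a * b for a, b in zip(query_vec, d["vec"])))
    return [d["text"] for d in scored[:top_k]]

def generate(query, context):
    # Placeholder generation: a real system prompts an LLM with the context.
    return f"Answer to {query!r} using: {'; '.join(context)}"

docs = ["Returns are accepted within 30 days.",
        "Shipping takes 3 to 5 business days."]
index = [{"text": d, "vec": embed(d)} for d in docs]

question = "how many business days does shipping take"
print(generate(question, retrieve(embed(question), index)))
```

The gap the talk describes is everything around these three calls: building the index, keeping it fresh, and wiring it into an application.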
But again, you might be saying, okay, so I have on one
side the vectors from my data, on the other side I
have the vectors from what the user types in. So I still need
to do a little bit of work here. And so what we decided to do
is package up that little bit of work, which is actually
a lot of work as you're going to see in just a second, and just
offer it as a solution. And that's what Symante basically is.
And the key difference here is, well, first and foremost, there's similarity
search already built in. Everything is configured for you
and packaged up, and you have everything
accessible as a set of REST APIs. So to use our technology, you just
configure API endpoints and that's it.
And the next slide is going to dive very deep into all kinds
of problems we ran into. And let me tell you, there are
a lot of pitfalls when you set this stuff up.
The first pitfall, or the kind of first question you have to ask yourself
is which LLM do you want to use? Somebody would typically say, all right,
let's just use GPT-4, it doesn't matter. But there are a lot of considerations
to make. GPT-4 is not necessarily the cheapest model,
and in fact, for some use cases it's not the best model.
There are use cases where a much less powerful model
will actually be the better fit.
Just to name an example, GPT-4 will
translate text to English. So it does work
in a multilingual fashion. But because everything is kind of translated
to English, the results may be different if you ask in
different languages. And so if you want to have a
truly multilingual solution, you're going to have to use a model that is language
agnostic, meaning that it doesn't actually matter which language the data
is in and the query is in, because everything is just represented as
vectors and it's completely irrelevant which language
it ultimately is. The second aspect is obviously
choosing the right similarity algorithm. There are many of them,
and you have to choose the one that fits your use case best. Each of
them has its own pros and cons, so you're going
to have to make a selection here. But the most important aspect
of them all by far is how to handle large datasets.
The first implementation that we did had 1.5 million
records, so that one was really a test for us. That's a very large database.
To give you a benchmark: getting vectors for 1.5 million records using
GPT-4 would have taken several years,
and it would have cost hundreds of thousands of dollars to set everything up.
That's not something we wanted to invest in. So with our solution,
we had to somehow optimize that whole vector
generation and the whole process of creating
vector embeddings and also the cost associated with it.
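One common optimization for that embedding step — shown here only as a generic pattern, not what Symante actually does internally — is batching, so you make one bulk call per batch instead of one call per record. `embed_batch` is a hypothetical stand-in for a bulk embedding endpoint:

```python
calls = 0  # track how many bulk "API calls" we make

def embed_batch(texts):
    # Hypothetical bulk embedding call; returns one toy vector per text.
    global calls
    calls += 1
    return [(len(t), t.count(" ")) for t in texts]

def index_records(records, batch_size=256):
    vectors = []
    for start in range(0, len(records), batch_size):
        # One bulk call per batch instead of one call per record.
        vectors.extend(embed_batch(records[start:start + batch_size]))
    return vectors

records = [f"record number {i}" for i in range(1000)]
vecs = index_records(records)
print(len(vecs), calls)  # 1000 vectors from just 4 bulk calls
```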
And so that's something that we really tweaked and played around
with very significantly. Secondly,
you're going to want to curb the cost. So some things are very
easily handled by much less powerful models than GPT-4.
On the generation side, you might use GPT-4; on the
search side, you might use another language model.
It's all your choice, but you'll have to make that choice if you want to
get something up and running. Now, in the real
world, your database very rarely remains static over long periods
of time, right? Entries change, new entries are
added. So every single time something changes, you're going to
have to re-index your database and go through that whole
process again. We've worked through this re-indexing
and made sure it's not something you have to think about
when you use our solution. And then
finally implementing it in practice. So you still have to integrate it
into the customer's application landscape. And so
this is why we just make sure that everything is accessible as an endpoint.
Essentially it's just an API because we wanted to make sure it's very easily
integrated into your customer's landscape.
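The re-indexing point above can be sketched very simply — as an illustration only, not Symante's actual mechanism: hash each record's content, and on the next run re-embed only records that are new or whose hash changed:

```python
import hashlib

def content_hash(text: str) -> str:
    # Fingerprint of a record's content.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def records_to_reindex(records: dict, previous_hashes: dict) -> list:
    """records maps id -> text; previous_hashes maps id -> last seen hash.
    Returns the ids that need fresh embeddings."""
    return [rec_id for rec_id, text in records.items()
            if previous_hashes.get(rec_id) != content_hash(text)]

old_hashes = {"a": content_hash("red car"), "b": content_hash("blue bike")}
current = {"a": "red car", "b": "blue scooter", "c": "green van"}
print(records_to_reindex(current, old_hashes))  # ['b', 'c']
```

Record "a" is unchanged and keeps its existing vector; only the changed and new records go back through the embedding step.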
Finally, you're going to sometimes run into very complex queries.
A user might ask something like, I want to have a
red house which is no bigger than 400 m²,
but also not smaller than 200. It should have a garage.
Those are very complicated queries. I guarantee you virtually
no large language model other than GPT-4 will actually
understand that query properly. But more importantly,
some parts of that query can actually fall back
to keyword search. Right? You have a bunch of parameters
in there: a red house, no bigger than 400 m², not smaller
than 200, a garage: yes or no.
Those things can actually be kind of done in a
hybrid fashion where some parts are going to be searched
using keywords and other parts are going to be searched using
semantic search. Another example is when a user just
types in "red car". It's probably overkill to
use semantic search for that. So you're gonna have to make a decision on when
to fall back to your regular keyword search.
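The hybrid idea can be sketched like this: pull the hard constraints out of the query and leave the rest for semantic search. The regex rules below are purely illustrative — a production system would more likely use an LLM for this extraction:

```python
import re

def split_query(query: str) -> dict:
    """Extract hard, keyword-searchable constraints from a free-text query.
    The patterns are illustrative only."""
    filters = {}
    if m := re.search(r"no bigger than (\d+)", query):
        filters["max_size_m2"] = int(m.group(1))
    if m := re.search(r"not smaller than (\d+)", query):
        filters["min_size_m2"] = int(m.group(1))
    if "garage" in query.lower():
        filters["has_garage"] = True
    return filters

def needs_semantic_search(query: str) -> bool:
    # Crude fallback rule: short queries like "red car" stay on keyword search.
    return len(query.split()) > 3

q = ("I want a red house which is no bigger than 400 m2, "
     "but also not smaller than 200. It should have a garage.")
print(split_query(q))  # {'max_size_m2': 400, 'min_size_m2': 200, 'has_garage': True}
print(needs_semantic_search("red car"))  # False
```

The leftover descriptive part of the query ("red house") is what actually goes through the embedding model.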
And finally, when you have an e-shop or a product catalog,
there's almost always going to be an underlying SQL database there.
And so you're going to have to create some kind of
mechanism to translate AI search to
SQL search. And this is something that we've really tweaked and
played around with quite a bit. But you're going to have to do this if
you want to use semantic search on your e-shop, for example.
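Here's a minimal sketch of that translation step, turning extracted filters into a parameterized SQL query. The `houses` table and its column names are invented for this example:

```python
import sqlite3

def filters_to_sql(filters: dict):
    # Build a parameterized WHERE clause from extracted constraints.
    clauses, params = [], []
    if "max_size_m2" in filters:
        clauses.append("size_m2 <= ?")
        params.append(filters["max_size_m2"])
    if "min_size_m2" in filters:
        clauses.append("size_m2 >= ?")
        params.append(filters["min_size_m2"])
    if filters.get("has_garage"):
        clauses.append("has_garage = 1")
    where = " AND ".join(clauses) or "1=1"
    return f"SELECT id FROM houses WHERE {where}", params

# Tiny in-memory catalog to run the query against.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE houses (id INTEGER, size_m2 INTEGER, has_garage INTEGER)")
conn.executemany("INSERT INTO houses VALUES (?, ?, ?)",
                 [(1, 350, 1), (2, 500, 1), (3, 250, 0)])

sql, params = filters_to_sql({"max_size_m2": 400, "min_size_m2": 200, "has_garage": True})
print([row[0] for row in conn.execute(sql, params)])  # [1]
```

The semantic part of the query then ranks or filters these SQL candidates by vector similarity.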
I could actually go on and on and on about all the things that we
ran into, but these are just some of the more important technical
intricacies that you have to deal with. Now, on top
of all of that, we've actually done something a little bit extra. This is
obviously not a must have, but we decided to also visualize
data using this neat 3D map where you can see the
semantic proximity of your words or queries, let's say.
And you can browse through this, you can click through it
and it's interactive. So that's just another feature that we decided to
implement into semantic search. So we
have the technology down. And so what's kind of the way that you
would use it in real life? The most obvious one is search
in an e-shop or within your knowledge base. You can just
search using human language and you don't have to use keyword
search. This one is pretty easy to understand and it's one
of the most common ways that semantic search is used.
But there are other ways. For example, you could use categorization:
an entire text can very easily be placed into one category
or another, or you can use a generative model
to suggest categories for that text. That's very useful in,
for example, incident management or customer service, where a request
is immediately routed to the proper department because it deals with a
certain type of request. Another thing you can do is
similar image search. So you can kind of paste an image
into an e-shop and it will spit out things that look similar.
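Categorization and similar-image search both reduce to the same nearest-vector lookup. A toy sketch of the categorization case, with a deliberately crude bag-of-words stand-in for a real embedding model, and invented category descriptions:

```python
import math
from collections import Counter

def embed(text):
    # Crude bag-of-words stand-in for a real embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Hypothetical category descriptions; in practice these could be example
# tickets or LLM-suggested descriptions.
categories = {
    "billing": embed("invoice payment charge refund billing"),
    "technical": embed("error crash bug login technical issue"),
}

def categorize(text):
    # Route the text to whichever category's vector is most similar.
    vec = embed(text)
    return max(categories, key=lambda c: cosine(vec, categories[c]))

print(categorize("I was billed twice on my invoice"))  # billing
print(categorize("the app shows an error at login"))   # technical
```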
Chatbots are another example. You have the
generative part, where the customer is actually talking to the chatbot,
but then you also have the retrieval aspect: okay, what is
the customer actually asking about? You can
look in your knowledge base: is there something that's been answered
before? Yes? Okay, let's generate an answer from that for the customer.
That's a very powerful chatbot right there. And then other things you
can do is recommendations. So for example, rather than using the typical recommendation
engine, you can actually say: okay, if a customer
searches for, let's say, nose drops, you can
also offer them tissues and allergy relief.
So, things that are similar in terms of meaning.
And anomaly detection is an interesting one because we're dealing
with patterns here. So banks are often searching
for things that are kind of out of pattern. And interestingly, semantic search
offers a way of doing this very scalably.
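A toy sketch of that pattern idea: points whose vectors sit far from the centroid of "normal" activity get flagged. The 2-D vectors here are invented; a real system would embed richer transaction features:

```python
import math

def centroid(vectors):
    # Component-wise mean of the vectors.
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))

def anomalies(vectors, threshold):
    # Flag indices whose distance from the centroid exceeds the threshold.
    c = centroid(vectors)
    return [i for i, v in enumerate(vectors) if math.dist(v, c) > threshold]

normal = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.0, 1.2)]
odd = (9.0, 9.0)  # far outside the usual pattern
print(anomalies(normal + [odd], threshold=3.0))  # [4]
```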
Matching engines are another thing. So if you have a long, long text,
one of the things that you can actually do is match it to
other long texts that are similar in terms of meaning. If you
have a use case for that, you can use semantic search.
And this is all nice, but let's see a
real-world application. So because I had to stick to PowerPoint,
I had to create a GIF, but we're going to go through this GIF many
times. So what you have here is an actual demo of a real-world
application. It's a database that has roughly 100 BMWs,
and you can search in it using human language.
So, for example, here: "cheap car suitable for my dog",
and what we're looking for is basically two things.
It has the "cheap" aspect and it has the "suitable for
my dog" aspect. "Cheap" obviously refers to a low price,
which is very easy to handle, so we have to spit out cars that are
cheap. But "suitable for my dog" is a semantic
aspect, which probably means a larger trunk. So you
can see all three of these cars have that kind of extended trunk.
They're Active Tourers. What about extravagant cars for successful managers?
So again, there's two aspects here. One would be that they're
supposed to be probably expensive because they're designed
to be for successful managers. And extravagant means somehow
unusual, right? So you're going to want cars that are kind
of flashy, maybe very expensive, maybe just specced to
the maximum. So when we type this in here, we get
the XM, which is the most expensive SUV they have,
and then the other two are in flashy colors, specced
to the max and very expensive. So that's the idea behind
semantic search when you use it in the context of a product
catalog like this. And that's actually it for this
talk. Guys, I'm very thankful that I was able to be here. Thanks so
much for paying attention. If you have any questions whatsoever, just ping
me a message and I'll be very happy to answer.