Transcript
            
            
              This transcript was autogenerated. To make changes, submit a PR.
            
            
            
            
              Hello everyone, welcome to today's session on unleash the power of
            
            
            
              GenAI: generate growth and innovation with data.
            
            
            
              My name is Akanksha Sharon. I am principal data lead
            
            
            
              at AWS for UK public sector.
            
            
            
              Now generative AI has taken the world by storm.
            
            
            
              I'm sure all of you have heard about applications
            
            
            
              like ChatGPT, and it just shows how powerful
            
            
            
              the latest machine learning models have become.
            
            
            
              The true power of generative AI goes beyond a search
            
            
            
              engine or a chatbot or, you know,
            
            
            
              ChatGPT. It will essentially, you know,
            
            
            
              transform how companies or organizations operate or
            
            
            
              will operate in future. Just to share some
            
            
            
              perspective here, Goldman Sachs
            
            
            
              forecasted a $7 trillion increase in global
            
            
            
              GDP. They also predict that GenAI
            
            
            
              will lift productivity growth by
            
            
            
              one and a half percentage points over a ten-year period. This is
            
            
            
              just a small glimpse of the potential of GenAI.
            
            
            
              Now, I've worked with a lot of customers, enterprise customers,
            
            
            
              public sector customers, and I think it is safe to say
            
            
            
              that everyone acknowledges the power of GenAI and
            
            
            
              they are comfortable in thinking big with GenAI
            
            
            
              or they are actually making plans for how GenAI can be utilized
            
            
            
              in their organization. But what I find is nearly
            
            
            
              everyone I speak with focuses on foundation
            
            
            
              models and LLMs more broadly. So the
            
            
            
              iceberg that you see, the tip of the iceberg is the
            
            
            
              GenAI application, right? There is more
            
            
            
              under the surface, there is more than
            
            
            
              meets the eye at first glance, right? And this is my favorite slide.
            
            
            
              So what enables you to drive the value of
            
            
            
              GenAI? Now, GenAI applications
            
            
            
              are still applications, and like any other application,
            
            
            
              you need a database underneath it. So you eventually
            
            
            
              need an operational database to support your user experience
            
            
            
              and your GenAI applications. So if
            
            
            
              you look at the slide, on the right, you need a storage
            
            
            
              layer, you need a database layer that will have purpose-built
            
            
            
              databases like document DBs, graph DBs,
            
            
            
              vector-enabled databases, right? And then there
            
            
            
              is data integration. You need the source of
            
            
            
              your data, you know, whether the data is going to come in batches
            
            
            
              or via streaming, or, you know, you set up
            
            
            
              your pipelines to keep up with data changes.
            
            
            
              Finally, you also have to consider governance,
            
            
            
              you know, processes to ensure data quality, privacy,
            
            
            
              security, right? So while it is very tempting to
            
            
            
              think about generative AI at a surface level or as a
            
            
            
              tip of the iceberg, really it is the data that you
            
            
            
              need to nail down effectively
            
            
            
              and use a modern data architecture,
            
            
            
              right? And data is, I could say
            
            
            
              the foundation for building your GenAI application.
            
            
            
              And in future slides, you know, in upcoming slides we're going to
            
            
            
              talk more about, you know, why data is so important. Right?
            
            
            
              Next slide. Yeah. Another data point
            
            
            
              that we have is from a McKinsey report. And,
            
            
            
              you know, there's a link to that report that you can see.
            
            
            
              Companies that have not yet found ways
            
            
            
              to effectively harmonize and provide
            
            
            
              access to their data will be unable to
            
            
            
              fine-tune their generative AI, which means that



              eventually, you know, they won't be able to use
            
            
            
              GenAI for their customers or will not be able to unlock the full
            
            
            
              potential of GenAI. Doing this requires
            
            
            
              a very clear data and infrastructure
            
            
            
              strategy. Now, why does data matter so
            
            
            
              much? I've been talking about data for, like, a few slides now, right?
            
            
            
              So let's see why data matters a lot.
            
            
            
              Now, when you want to build GenAI



              applications that are unique



              to your business needs and unique to your customer base,
            
            
            
              right? Your data is your differentiator. You know,
            
            
            
              as the name suggests, data is your key differentiator.
            
            
            
              And let me give you more thoughts on this right now. When you think
            
            
            
              about it, every company has access to some foundation
            
            
            
              models, right? Some of them are easily available in the marketplace.
            
            
            
              Some of them are easily available on GitHub that you can download
            
            
            
              and use, right? But companies will
            
            
            
              be successful if they build a GenAI application with
            
            
            
              real business data or business value data
            
            
            
              that will help them to build an
            
            
            
              amazing GenAI application which caters to their
            
            
            
              customer history, their needs, their utilization patterns
            
            
            
              and whatnot. Right? So data is the difference between
            
            
            
              a generic GenAI application and
            
            
            
              those that know your business and your customers deeply.
            
            
            
              And I've seen it with many customers where, you know, they have taken off-the-
            
            
            
              shelf foundation models and not really taken care
            
            
            
              about data. And eventually they don't see much benefit from those applications
            
            
            
              now, right? Whereas I've seen organizations who
            
            
            
              work backwards from their data to build the
            
            
            
              GenAI application, or even if they take an off-the-shelf GenAI
            
            
            
              application, they actually embed it with their own business data.
            
            
            
              And that really helps them to make it, you know,
            
            
            
              serve their customers in a better way. Right? Now,
            
            
            
              using data for GenAI actually doesn't mean that you have to
            
            
            
              go and build your own model, right? So it doesn't
            
            
            
              mean that, as I said, right? Now, while some companies will build,
            
            
            
              and there are a couple of types of, you know, companies, so there could be
            
            
            
              one organization that will build and train their own
            
            
            
              large language models with vast amounts of data,
            
            
            
              and many will use their organizational data to fine
            
            
            
              tune their foundation models for their unique business
            
            
            
              use case or their unique needs,
            
            
            
              right? But underlying all of this, the key
            
            
            
              message is that data is your differentiator.
            
            
            
              Now, you know, in a recent survey of CDOs,
            
            
            
              we found that 93% of CDOs
            
            
            
              said that the importance of the data strategy
            
            
            
              and its role in making generative AI custom to
            
            
            
              their business is one of the most important things that they
            
            
            
              can do, right. And on the right hand side,
            
            
            
              37% of CDOs agreed that lack of
            
            
            
              the right data foundation or a data strategy was one
            
            
            
              of the top challenges to implementing generative AI.
            
            
            
              So now data foundation matters for generative AI because
            
            
            
              access to high-quality data about your organization
            
            
            
              and your customers improves the accuracy
            
            
            
              and reliability of these GenAI models and their
            
            
            
              responses as well. Now this is another example.
            
            
            
              Now, I would like to share an example of an online travel agency
            
            
            
              here, and they want to generate a personalized travel
            
            
            
              itinerary. So when you want to do this personalized itinerary,
            
            
            
              what you would like to use as an organization is your customer profile data
            
            
            
              in your databases. And based on this data,
            
            
            
              you would like to tailor the recommendations based on
            
            
            
              things like past trip history,
            
            
            
              travel preferences, hotel preferences, preferences of
            
            
            
              family members, age of the family members and things like that,
            
            
            
              right? So what you will do is you will marry that
            
            
            
              data with the other company details like flight
            
            
            
              details, hotel inventory, promotions and things like that.
            
            
            
              So if you look at this, you know, there are a couple of data points
            
            
            
              that you're using. There are two kind of different data sets that you're using.
            
            
            
              So again, it is very important where this data resides
            
            
            
              and how easily you can access it. The more easily
            
            
            
              you can access it and secure it, the more easily your



              application is going to generate the personalized travel
            
            
            
              itinerary. Now there's another example here we
            
            
            
              have. Now, you know, I've been talking about these powerful
            
            
            
              capabilities of GenAI to create content,
            
            
            
              right? But to make this content, you know,
            
            
            
              relevant to your organization, you would definitely like to
            
            
            
              customize it and customize it with things like your
            
            
            
              own brand logo, your own brand guidebook,



              your previous ad content, you know, from your data
            
            
            
              lake as well as, you know, company data, like real
            
            
            
              time inventory from your transactional database and so forth,
            
            
            
              right? So you eventually are going to use GenAI,
            
            
            
              but underneath you're using the data from all your different
            
            
            
              traditional or transactional databases as well.
            
            
            
              Now, to get high-quality data for GenAI,
            
            
            
              you need a strong data foundation,
            
            
            
              right? In fact, like, I'm sure many of you who are listening
            
            
            
              to me would have already spoken about, or would already



              have, a data strategy, you know, in your



              organization. It's a different matter whether you're working towards it
            
            
            
              or you're running into some issues with it, but we can definitely help you
            
            
            
              with that process. Right. But GenAI makes this
            
            
            
              data foundation even more critical than ever because your
            
            
            
              data is your differentiator. Right? I've met so many organizations
            
            
            
              that were not ready to adopt cloud,
            
            
            
              they were not thinking about data strategy. But now with GenAI,
            
            
            
              it is becoming more and more critical for them to
            
            
            
              put this as a priority, right? So your data has
            
            
            
              to be up to date, complete,
            
            
            
              accurate, discoverable and available.
            
            
            
              Right? So those are, like, the key things for your data
            
            
            
              strategy. Now, these are a couple of models
            
            
            
              for GenAI that we have. Obviously we have purpose-built LLMs,
            
            
            
              then we have fine-tuning of LLMs, and then we have RAG,
            
            
            
              right? And for the purpose of this presentation, I'll pick up
            
            
            
              the RAG use case and work towards it.
            
            
            
              Now, with RAG, the external data used to augment
            
            
            
              your prompts can come from multiple data sources.
            
            
            
              It could include documents, different repositories,
            
            
            
              databases, APIs. Right. And RAG
            
            
            
              helps the model to adjust its output with data retrieved
            
            
            
              as and when needed, so that, you know, it can provide you
            
            
            
              with the right information. So this is just a quick overview of
            
            
            
              what RAG is. And this is a very high
            
            
            
              level reference architecture for RAG.
            
            
            
              Right. Now you'll notice two sides to
            
            
            
              the story. On the left hand side you have processes that
            
            
            
              occur in the end user's critical path. That is,
            
            
            
              the end user interacts with the application and is
            
            
            
              waiting for a response. And on the right hand side are
            
            
            
              the processes that happen behind the scenes,
            
            
            
              right, like ingestion from data sources,
            
            
            
              batch and stream processing, data integration
            
            
            
              with pipelines. So you need this for
            
            
            
              populating your vector databases and various enterprise
            
            
            
              databases or data warehouses.
            
            
            
              Now notice the data governance and data warehouse
            
            
            
              and vector data store. These are very critical, right.
            
            
            
              And what I am seeing, and what more and more customers are doing,
            
            
            
              is they are modernizing their entire infrastructure
            
            
            
              by moving it to the cloud. And this includes
            
            
            
              relational databases, non-relational databases,
            
            
            
              right. And let's talk more about the vector
            
            
            
              data store that is there on the screen. Right.
            
            
            
              Before we go there, let's look at this critical
            
            
            
              path for the end user here. Right? Now,



              this is again a use case, a scenario
            
            
            
              that we have. I don't have animation right now,
            
            
            
              but I'll go by the numbers on the screen here. So,
            
            
            
              yeah, the first one is, you know, the end user interacts
            
            
            
              with the GenAI application, typically by posing
            
            
            
              a question. And this is just to give you an example of what
            
            
            
              happens underneath, right? So an end
            
            
            
              user interacting with the GenAI application is number one. The second is
            
            
            
              the application loads the relevant prompt template and,
            
            
            
              you know, you can create your own templates based on different rules
            
            
            
              that you have and things like that. Then there is
            
            
            
              number three, which is the question posed by the user.
            
            
            
              Right. It could be a new question or it could be an ongoing conversation.
            
            
            
              Right. So anyways, in that case, you know, what we have to do is
            
            
            
              we have to look into the history data
            
            
            
              store to allow the user to pick up where they left off.
            
            
            
              Let's say this is in the middle of a conversation. And, you know,
            
            
            
              this is a very good example when you go online and you go for
            
            
            
              a chat option or you go for an online help
            
            
            
              option, right. This is a critical workflow. And for that,



              the application needs to pick up where the customer left off.
            
            
            
              We need to pull that state into the right context. Right.
            
            
            
              What was the context in which we were asking that question?
            
            
            
              Number four is the application needs to query for profile
            
            
            
              or any other situational data, right. And this typically would
            
            
            
              come out of like a data store. For example, if you're returning something,
            
            
            
              right. It would go back to your historical data store and say,
            
            
            
              when did you purchase this? And details of the order and all of that,
            
            
            
              right. Number five is it tokenizes the original question,
            
            
            
              you know, to get a set of embeddings from the LLM.
            
            
            
              And number six, what happens is, with those question embeddings,
            
            
            
              it performs a similarity search in the vector data store.
            
            
            
              This uses some form of algorithm,



              which is basically a nearest neighbor



              search, right. And it searches,
            
            
            
              you know, that along with some context. Right. So it basically uses



              that algorithm to run the search. Number seven is once
            
            
            
              all that data is synthesized into a prompt, it is
            
            
            
              then sent to the LLM to get a response,
            
            
            
              right? And number eight is, you know,
            
            
            
              it updates the conversation state and history according to the new
            
            
            
              interaction. And number nine, finally, is the
            
            
            
              response that you see on the screen. So, you know,
            
            
            
              if you really dive deep, you know, I've used data stores and,
            
            
            
              you know, retrieving the historical information and all of that.
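The nine numbered steps described above can be sketched roughly in Python. This is only an illustration under stated assumptions, not the architecture shown on the slide: the in-memory dictionaries stand in for the history, profile, and vector data stores, `embed` is a toy word-count embedding instead of a real embedding model, and `fake_llm` is a hypothetical placeholder for the LLM call.

```python
import math
from collections import Counter

# Hypothetical in-memory stand-ins for the stores in the walkthrough.
PROMPT_TEMPLATE = (
    "Context:\n{context}\n\nCustomer profile: {profile}\n"
    "Conversation so far:\n{history}\n\nQuestion: {question}"
)
HISTORY = {}  # conversation state, keyed by user id
PROFILES = {"u1": "last order: golf shoes, purchased 2024-01-05"}
DOCUMENTS = [
    "Shoes can be returned within 30 days of purchase.",
    "Standard shipping takes three to five business days.",
]


def embed(text):
    """Toy embedding (word counts); a real system would call an embedding model."""
    cleaned = text.lower().replace(".", "").replace("?", "")
    return Counter(cleaned.split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fake_llm(prompt):
    """Placeholder for the LLM call; it only reports the prompt size."""
    return f"(model response based on a {len(prompt)}-character prompt)"


def answer(user_id, question, top_k=1):
    # Step 1 is the user posing the question (the call to this function).
    template = PROMPT_TEMPLATE                        # step 2: load prompt template
    history = HISTORY.setdefault(user_id, [])         # step 3: conversation history
    profile = PROFILES.get(user_id, "unknown")        # step 4: profile / situational data
    question_vector = embed(question)                 # step 5: embed the question
    ranked = sorted(                                  # step 6: similarity search
        DOCUMENTS,
        key=lambda doc: cosine(question_vector, embed(doc)),
        reverse=True,
    )
    prompt = template.format(                         # step 7: build the prompt...
        context="\n".join(ranked[:top_k]),
        profile=profile,
        history="\n".join(history),
        question=question,
    )
    response = fake_llm(prompt)                       # ...and send it to the LLM
    history.extend([f"user: {question}", f"assistant: {response}"])  # step 8
    return prompt, response                           # step 9: response to the user
```

Calling `answer("u1", "Can the shoes be returned?")` would place the returns-policy document in the prompt context and record the exchange in `HISTORY`, so a follow-up question from the same user picks up where the conversation left off.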
            
            
            
              Right. You know, if you dive deep, you know, into the



              different layers, you see what data services you should



              consider for these architectural components. Right? And one of
            
            
            
              these is vector databases. So let me go to the next
            
            
            
              slide and talk more about what a vector data



              store is. Right, perfect. Now, vector embeddings,
            
            
            
              basically represent words and phrases
            
            
            
              and entities as numerical vectors in a multi
            
            
            
              dimensional space. Now, in this example that you see,
            
            
            
              the words or items with similar meanings are
            
            
            
              mapped closer to each other in this space.
            
            
            
              This kind of representation or this kind of semantic
            
            
            
              relationship actually enables GenAI to
            
            
            
              understand similarities and relationships between words
            
            
            
              and entities, right? So for example, like, if you
            
            
            
              say sandals, high heels,
            
            
            
              color, comfort, fit,
            
            
            
              you know, all of these are related things, and this will help the model to kind
            
            
            
              of do that. Next slide.
            
            
            
              Okay, now, vector embeddings are essentially
            
            
            
              numerical representations of your audio
            
            
            
              or video data, right?
            
            
            
              While humans can understand the meaning of all these words,
            
            
            
              right, machines cannot, and machines will



              only understand numbers. So,



              to make them understand, we have to translate the words into a format that
            
            
            
              is suitable for machine learning or for the GenAI application.
            
            
            
              And this is essentially what is called vector embeddings.
            
            
            
              This is a very good example of vector embedding.
            
            
            
              Now, by assigning numbers to different words,
            
            
            
              you know, we can view vectors in a multi dimensional space, as I said.
            
            
            
              Right? And then you can measure the distance between them.
            
            
            
              For instance, you know, if you look at this graph, cat is
            
            
            
              closer to kitten, whereas dog is closer to puppy.
            
            
            
              So now by comparing these embeddings in this way,
            
            
            
              the model, the GenAI model, will produce more relevant
            
            
            
              and contextual responses for the question
            
            
            
              that was asked or for a matching word, right?
            
            
            
              So this is how, you know, the whole vector,



              you know, assignment works, basically.
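To make the cat/kitten and dog/puppy example concrete, here is a minimal Python sketch that measures how close two embeddings are using cosine similarity. The three-dimensional vectors are hypothetical values picked by hand for illustration; real embedding models produce vectors with hundreds or thousands of dimensions, and the talk does not say which distance measure the slide uses.

```python
import math

# Hypothetical 3-dimensional embeddings, chosen by hand for illustration only.
EMBEDDINGS = {
    "cat":    [0.90, 0.10, 0.30],
    "kitten": [0.85, 0.15, 0.60],
    "dog":    [0.10, 0.90, 0.30],
    "puppy":  [0.15, 0.85, 0.60],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word):
    """Return the other word whose embedding is most similar to the given word."""
    candidates = (w for w in EMBEDDINGS if w != word)
    return max(
        candidates,
        key=lambda w: cosine_similarity(EMBEDDINGS[word], EMBEDDINGS[w]),
    )


print(nearest("cat"))  # kitten
print(nearest("dog"))  # puppy
```

With these made-up numbers, "cat" and "kitten" score about 0.96, while "cat" and "dog" score about 0.30, which is the kind of gap a model relies on when it compares embeddings.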
            
            
            
              Another example, just to give you more insight, you know,
            
            
            
              this is how vectors superpower semantic search for
            
            
            
              use cases, you know, like rich media search or
            
            
            
              for product recommendations. So when you go on a website and you get some
            
            
            
              product recommendations based on your previous purchase history
            
            
            
              or what you have typed in or, you know, what you have seen
            
            
            
              in that specific portal and things like that.
            
            
            
              Right? Now, in this scenario and in this screenshot that you see on the screen,
            
            
            
              you can see that semantic search greatly enhances the
            
            
            
              accuracy of the output of the query,
            
            
            
              right? Like, one of the things that you say is bright-colored
            
            
            
              golf shoes, right? So that is like very, very specific,
            
            
            
              and that is how attaching vectors and numbers
            
            
            
              to the search query makes it very, very precise
            
            
            
              in scenarios like this, right?
            
            
            
              Okay,
            
            
            
              so, I've spoken about vectors, and now let's talk about how vectors and
            
            
            
              data work together. Right? Now, many of our customers
            
            
            
              are using vectors for their genai application.
            
            
            
              And one of the feedback that we have got from them is their existing
            
            
            
              databases should have vector enabled
            
            
            
              and it will make them more confident, it will
            
            
            
              meet the requirements of being scalable, available and
            
            
            
              provide durability,
            
            
            
              storage and high compute, right?
            
            
            
              What we have done is make sure that
            
            
            
              your vector and business data can be stored in the same
            
            
            
              place. Your application will run faster,
            
            
            
              because when they are in the same place,
            
            
            
              you don't have to worry about data sync, data movement
            
            
            
              or data silos at all. So we store the vectors
            
            
            
              and the database data together. That is why we have
            
            
            
              enabled vector search across the multiple
            
            
            
              services that you see. We have Amazon OpenSearch,
            
            
            
              Aurora PostgreSQL, RDS for PostgreSQL,
            
            
            
              Neptune and DocumentDB, and DynamoDB also has
            
            
            
              zero-ETL for faster retrieval.
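
              As a rough illustration of keeping vectors next to business data, the
              sketch below stores a made-up embedding in the same row as price and
              stock columns, so a single query can filter on business fields and then
              rank by similarity with no sync between two stores. It uses Python's
              built-in `sqlite3` purely as a stand-in: it is not one of the AWS
              services named above, and the table, column and function names are
              assumptions. On PostgreSQL-based services, an extension such as pgvector
              would typically do the similarity ranking inside the database instead.

```python
import json
import math
import sqlite3


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# One table holds both the business columns and the (hypothetical) embedding.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (name TEXT, price REAL, in_stock INTEGER, embedding TEXT)"
)
rows = [
    ("neon yellow golf shoes", 89.0, 1, [0.9, 0.9, 0.9]),
    ("lime green golf shoes", 120.0, 0, [0.9, 0.9, 0.8]),
    ("black leather golf shoes", 75.0, 1, [0.9, 0.9, 0.1]),
]
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?)",
    [(name, price, stock, json.dumps(vec)) for name, price, stock, vec in rows],
)


def search_in_stock(query_vector, max_price):
    """Filter on business columns in SQL, then rank the survivors by similarity."""
    cursor = conn.execute(
        "SELECT name, embedding FROM products WHERE in_stock = 1 AND price <= ?",
        (max_price,),
    )
    scored = [
        (cosine_similarity(query_vector, json.loads(embedding)), name)
        for name, embedding in cursor
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored]


# Assumed embedding for the query "bright color golf shoes".
matches = search_in_stock([1.0, 1.0, 1.0], max_price=100.0)
print(matches)
```

              The out-of-stock item never reaches the ranking step, because the
              business filter and the vectors live in the same table.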
            
            
            
              So this is our famous flywheel. We start
            
            
            
              off with unify, where you make sure that you
            
            
            
              break down your data silos; you innovate by building
            
            
            
              new GenAI applications; and you modernize your
            
            
            
              data infrastructure. Now, the beauty of this flywheel
            
            
            
              is that you can essentially start your data modernization or
            
            
            
              data strategy from anywhere.
            
            
            
              I've met customers who say, "I'm going to start with innovate,
            
            
            
              where I build GenAI applications,
            
            
            
              work on my LLMs and my use cases,
            
            
            
              then move on to modernizing my data
            
            
            
              infrastructure, and then think about not
            
            
            
              having data silos and making more use of that data."
            
            
            
              That is one way around the flywheel. I have also come across customers
            
            
            
              who say, "Yes, we would like to go cloud first:
            
            
            
              have a modern data infrastructure
            
            
            
              on the cloud, have a great data strategy,
            
            
            
              use all the benefits of the cloud, and then go around the flywheel
            
            
            
              and innovate." So the beauty of this flywheel is that
            
            
            
              you can start from anywhere, and once you
            
            
            
              are in the flywheel, it just powers itself and goes from there.
            
            
            
              Also, this whole flywheel
            
            
            
              avoids the risk of getting locked into a proprietary
            
            
            
              format. It will help you break down data silos
            
            
            
              and empower your team to build GenAI
            
            
            
              applications, building a data foundation to
            
            
            
              fuel your generative AI applications.
            
            
            
              AWS provides a wide variety of
            
            
            
              comprehensive services for each use case.
            
            
            
              We have integrated vector databases and
            
            
            
              zero-ETL, so you can easily connect to your different data
            
            
            
              stores. If you're already in the cloud and using any
            
            
            
              of our services, we have zero-ETL in most of our services, which will
            
            
            
              help you connect to and access your
            
            
            
              data wherever it lives. We also have some
            
            
            
              very good data governance capabilities available to
            
            
            
              secure your data in the cloud and
            
            
            
              to apply policies and user access controls.
            
            
            
              So we have a lot of services aligned to that as well.
            
            
            
              Now, there are a few places where we can help.
            
            
            
              You can go to our Generative AI Innovation
            
            
            
              Center page and request a conversation or, if you're
            
            
            
              already one of our customers, reach out to your
            
            
            
              account team. The first way we can help is
            
            
            
              with getting buy-in from your execs on data strategy.
            
            
            
              The next is that we can help your organization
            
            
            
              envision how data can
            
            
            
              drive business outcomes, maybe through a POC or a
            
            
            
              first pilot. And we also have options to
            
            
            
              modernize your data foundation. So there are
            
            
            
              a few ways we can help you. Reach out to the
            
            
            
              AWS Generative AI Innovation Center to learn more.
            
            
            
              These are two very good workshops. They are
            
            
            
              very technical, deep-dive workshops, so if
            
            
            
              you are interested in learning more and getting your hands dirty,
            
            
            
              scan the code, register for these
            
            
            
              workshops and go from there.
            
            
            
              So this is my last slide. Thank you so much everyone
            
            
            
              for joining today's session and
            
            
            
              have a great day. Thank you.