Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello, everyone.
Welcome to Conf42, the Platform Engineering edition.
I'm really excited to share our experiences and journey so far in building
a serverless generative AI platform.
My name is Murali Malina, and I'm Chief Technology Officer at SoftRAMS.
I've been really lucky to have been building teams and software solutions
for more than 25 years and the last seven years at SoftRAMS itself.
I'm also the co-founder and CEO of a five-year-old nonprofit
organization called Teaching for Good.
This is a very unique edtech nonprofit.
It empowers anybody that is interested to teach, train, mentor, or coach.
At the same time, you can actually use this platform to raise funds
for a nonprofit of your choice.
So in other words, we are a nonprofit, and we support raising funds
for any nonprofit of your choice by leveraging your skills and passion
to teach and mentor others.
I also had the great privilege of collaborating with exceptional
teams worldwide, in Germany, the UK, France, India, China,
Russia, and of course the US, and have built mission-critical systems
in telecom, supply chain (particularly the auto industry), and healthcare,
working with federal agencies.
And my favorite, pretty close to my heart, is edtech.
I'm not that active on Twitter.
Please hit me up on LinkedIn if you want to connect and chat about pretty
much anything you want to chat about.
I've been working with SoftRAMS for almost seven and a half years now.
SoftRAMS is one of the fastest-growing civic digital services firms.
We support a variety of mission-critical workloads for our federal agency customers.
We have been working hard to bring generative AI experiences, and
solutions that depend on generative AI, into our work with our federal agencies.
These agencies present very unique opportunities and different
levels of constraints and complexity.
That's one of the factors that led us to build this Gen AI platform.
It can be deployed in its entirety into any customer environment, for example,
and go from an experiment to production in days to weeks, rather
than the weeks to months it used to take us to build ML experiences.
I will definitely touch on a few aspects of these unique
constraints later in the discussion.
Just a shameless plug: if anybody's looking for an opportunity,
take a look at some of the open positions at SoftRAMS.
It's a great place to work, one of the top workplaces in the US,
and it has been recognized for almost four years in a row for
innovation and leadership culture.
Given that this is an online virtual conference, instead of
polling you to see where you are and what you're doing, I will make a few
assumptions and discuss various aspects.
But please reach out on LinkedIn if you have any specific questions
or want to take this conversation to a different level.
I'm sure many of you are using Gen AI for fun, and some have probably also started
using a variety of Gen AI tools at work.
Maybe GitHub Copilot or similar, for example, or automated peer reviews,
or tools to generate documentation and release notes, just to start with.
And how about building new Gen AI applications?
I'm pretty sure you were as excited as me and everybody else in the ecosystem
when ChatGPT got released.
We all wanted to build these new AI experiences into new applications,
as well as bring these Gen AI experiences to existing applications.
Or, at least, most of you must be experimenting with various tools
and LLM models, and probably some of the software-as-a-service
offerings out there in the ecosystem as well.
As you all know, as we will tell future historians: there is AI and ML
before ChatGPT, and there is AI and ML after ChatGPT.
This transformation is pretty surreal in many ways, and it is
now very pervasive across the board, across the world.
The excitement and ease of building applications
since ChatGPT is totally profound.
Suddenly AI and ML, and especially generative AI,
is on the list of top priorities across the board around the world.
It is shifting the perception of AI and ML from what used to be a more
complicated and costly process, with an even bigger constraint on
talent, like the need for data scientists.
Only so many organizations were able to afford to build those
end-to-end ML experiences.
ChatGPT turned this into something that is now totally
accessible to pretty much everyone.
And though we had come a long way in terms of automation, tooling,
and systems to build amazing AI/ML applications before, ChatGPT and
LLM models in general are taking us to a completely different level, an order of
magnitude faster in terms of speed of innovation.
One fundamental pivot for the entire ecosystem is that LLMs made
natural language the new programming language, making it possible for anybody
in the organization, technical, non-technical, business teams, doesn't
matter, to leverage LLMs and create apps and assistants purely
using plain everyday language.
Of course, they still need to learn a little bit about how to create
those instructions and how to work with these prompts, but they can
do it pretty quickly, and with that, people from all walks of life can now
create exceptional user experiences without the need for large teams and
large coding exercises or workflows.
And to be able to bring these experiences to life in practice:
I definitely believe, and I have seen it firsthand, that
so many of these enterprise applications can now be developed without writing
a single line of code, by specifying instructions in plain text and
leveraging relevant knowledge bases from a variety of data sources,
be it documents, presentations, or even databases, for example.
Teams can now create these custom GPTs, if you will, or chatbots, or assistants, or
agents, tailored to their specific needs.
Another unique thing these LLMs brought is truly conversational
interfaces to interact with a variety of software systems.
Even in conventional software applications, chat is fast becoming the
preferred and primary interaction model.
And to ensure that these AI experiments and AI apps become
successful, something you can rely on and use at your
work, in your organizations and teams,
it is really critical that we provide access to these models in the first place,
create these capabilities in a way that makes them accessible to regular users,
and provide a safe and secure environment so they can actually play
around and build these applications.
The most important part is that now that plain language is becoming the
primary interaction, not only to use applications but also to create them, it's
very important to make these capabilities accessible to every team member in your
organization, technical or non-technical, it doesn't matter at all, and to abstract
away the complexities of creating the infrastructure,
the workflows, and all the other backend infrastructure tasks, if you will.
And of course, creating a shareable catalog of sample applications,
reusable prompt libraries, and a little bit of training will
accelerate the adoption of the platform in your organization.
And precisely in this context, I would like to share the Gen AI
platform that we have built, initially for our internal teams, but now we're
using it for our customers as well.
We have built this generic platform that allows anybody on our teams, irrespective
of technical aptitude, to quickly create a conversational
chatbot or an assistant that is also grounded in information provided
by the author of the chatbot from a variety of data sources.
And this is now possible in about five minutes, once you have an idea.
Once you can get the chatbot up and running in five minutes,
it allows you to iterate faster, quickly test it, refine the
prompts, and refine the knowledge bases.
And once you feel good about the quality of the work, the way it is able to
process and present information,
then you can actually share it with everybody else in the organization.
And as I mentioned, we do support federal agencies, and we made sure
this can be deployed in its entirety into any customer environment.
The initial version of this product was built as a CDK project with everything
needed, including the infrastructure, the databases for vector stores,
the orchestration, the workflows, the pipelines, and the access control.
Everything is built using CDK so that we can deploy to AWS.
We now have different versions available that make this entire platform
deployable into Azure or Google Cloud, or on premises as long
as you have a Kubernetes infrastructure.
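To make that concrete, here is a minimal sketch of what a CDK stack for a platform like this might look like. This is illustrative only, not the actual project: all names are made up, and the orchestrator is shown as a Lambda for brevity even though a container service works just as well.

```typescript
// Minimal, illustrative CDK v2 stack: auth, document storage, orchestrator, API.
import { Stack, StackProps } from "aws-cdk-lib";
import { Construct } from "constructs";
import * as cognito from "aws-cdk-lib/aws-cognito";
import * as s3 from "aws-cdk-lib/aws-s3";
import * as lambda from "aws-cdk-lib/aws-lambda";
import * as apigw from "aws-cdk-lib/aws-apigateway";

export class GenAiPlatformStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Cognito user pool; federate it with your enterprise SSO if needed.
    const userPool = new cognito.UserPool(this, "Users");

    // Bucket where authors upload documents that feed the RAG pipeline.
    const docs = new s3.Bucket(this, "Documents", { enforceSSL: true });

    // Orchestrator: a Lambda here for brevity; a container service in practice.
    const orchestrator = new lambda.Function(this, "Orchestrator", {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: "index.handler",
      code: lambda.Code.fromAsset("orchestrator"),
    });
    docs.grantRead(orchestrator);

    // API gateway fronting the orchestrator, protected by the user pool.
    const api = new apigw.RestApi(this, "PlatformApi");
    const auth = new apigw.CognitoUserPoolsAuthorizer(this, "Auth", {
      cognitoUserPools: [userPool],
    });
    api.root.addResource("chat").addMethod(
      "POST",
      new apigw.LambdaIntegration(orchestrator),
      { authorizer: auth, authorizationType: apigw.AuthorizationType.COGNITO },
    );
  }
}
```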
This platform has been set up to connect to LLM models across the board,
whether they're offered on AWS Bedrock, OpenAI directly, Azure OpenAI,
models on Google Vertex, or pretty much anywhere, as long as you have
access to the API for hosted models.
And if you want to host your own models, then of course it
is very specific to the cloud.
On AWS, for example, you could set them up in SageMaker to
make them available to the platform.
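As a sketch of how that multi-provider setup can stay manageable, here is one thin abstraction you might put in front of the models. The interface itself is an assumption for illustration; only the Bedrock adapter is fleshed out, using the @aws-sdk/client-bedrock-runtime client, and the model ID is just an example.

```typescript
import { BedrockRuntimeClient, InvokeModelCommand } from "@aws-sdk/client-bedrock-runtime";

// One contract for every provider: Bedrock, OpenAI, Azure OpenAI, and Vertex
// adapters all implement the same shape, so apps never care where a model lives.
interface ChatModel {
  complete(prompt: string): Promise<string>;
}

// Concrete adapter for an Anthropic model on Bedrock (model ID is an example).
class BedrockChatModel implements ChatModel {
  private client = new BedrockRuntimeClient({});
  constructor(private modelId = "anthropic.claude-3-haiku-20240307-v1:0") {}

  async complete(prompt: string): Promise<string> {
    const res = await this.client.send(new InvokeModelCommand({
      modelId: this.modelId,
      contentType: "application/json",
      accept: "application/json",
      body: JSON.stringify({
        anthropic_version: "bedrock-2023-05-31",
        max_tokens: 512,
        messages: [{ role: "user", content: prompt }],
      }),
    }));
    // The response body is raw bytes containing a JSON document.
    return JSON.parse(new TextDecoder().decode(res.body)).content[0].text;
  }
}
```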
And the platform comes with handy prebuilt features, for example, that take
care of most of the setup so that end users can just concentrate on fine-tuning
their apps and making sure everything is up to snuff in terms of security.
By making automation tools more accessible across the organization, it definitely
sparks creativity, because everybody is excited about being able to use ChatGPT,
or a tool like ChatGPT that they can build for themselves,
that will help in their own work.
And of course it'll boost their productivity
if they can use it in their own work as well.
This means your team can streamline these workflows,
in all kinds of shapes and capabilities, and present them in
a safe and secure space, so that everybody on the team can
experiment with these ideas and create amazing applications.
To streamline and automate the infrastructure, the
pipelines, the workflows, and most importantly the security aspects,
we looked at different use cases and different applications,
and quickly standardized this whole variety of applications
and use cases, based on a few basic capabilities, into four different groups.
The first and foremost is chatbots.
These are probably the first application any user tries, even on our platform.
The primary use case is that they enable users to interact with
applications using natural language.
You can just ask questions and receive answers based on the context.
And this is completely grounded in the private knowledge bases that the
authors of the chatbots provide.
There are connectors available to load this information
from SharePoint or Confluence,
or you can just upload documents directly.
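To show what that grounding looks like mechanically, here is a small sketch. vectorSearch is a stub standing in for whatever vector store you use, and the prompt wording is purely illustrative.

```typescript
// Illustrative grounding step: retrieve chunks, then answer only from them.
interface Chunk { text: string; source: string; }

// Stub: replace with a real query against your vector store.
async function vectorSearch(query: string, topK: number): Promise<Chunk[]> {
  return [{ text: "Our refund window is 30 days.", source: "policy.pdf" }];
}

async function groundedAnswer(
  question: string,
  llm: (prompt: string) => Promise<string>,
): Promise<string> {
  const chunks = await vectorSearch(question, 5);
  const context = chunks.map((c) => `[${c.source}] ${c.text}`).join("\n");
  // Instruct the model to stay inside the provided knowledge base.
  return llm(
    `Answer using ONLY the context below. If the answer is not there, say so.\n\n` +
    `Context:\n${context}\n\nQuestion: ${question}`,
  );
}
```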
The next set of capabilities is grouped into what we call agents.
Agents are like chatbots, but they're autonomous, in that they
can make decisions on their own to decide the flow or the number of
iterations, for example, to prepare appropriate answers, act on behalf
of the user, or simply navigate a multi-step process or workflow.
These agents have the ability to access different tools, APIs,
or functions, and also have access to data sources to gather all
relevant information and make the decisions
to execute that particular workflow.
They can iterate through multiple steps to provide a
final response to the user.
And sometimes they can also act on behalf of the user, like creating a task list,
placing an order, or updating status in a tool like Jira, for instance.
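Here is a compressed sketch of that agent loop: the LLM picks a tool, the orchestrator runs it, and the observation goes back into the context until the model declares a final answer. The JSON protocol and the single Jira-style tool are assumptions for illustration.

```typescript
// Minimal tool-using agent loop (illustrative protocol, capped iterations).
type Tool = (args: Record<string, string>) => Promise<string>;

const tools: Record<string, Tool> = {
  // Stub: a real implementation would call the Jira API.
  createJiraTask: async (args) => `Created Jira task: ${args.summary}`,
};

async function runAgent(
  goal: string,
  llm: (prompt: string) => Promise<string>,
): Promise<string> {
  let scratchpad = `Goal: ${goal}`;
  for (let step = 0; step < 5; step++) {  // cap the number of iterations
    const reply = await llm(
      `${scratchpad}\nTools: ${Object.keys(tools).join(", ")}\n` +
      `Reply with JSON: {"tool": "...", "args": {...}} or {"final": "..."}`,
    );
    const action = JSON.parse(reply);
    if (action.final) return action.final;              // the agent is done
    const observation = await tools[action.tool](action.args);
    scratchpad += `\n${action.tool} -> ${observation}`; // feed the result back
  }
  return "Stopped after reaching the step limit.";
}
```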
And the next one is my favorite: what we call AI for BI.
Every business collects tons of data, and many enterprises have extensive teams and
systems to extract insights, analyses, and reports every single day.
However, in many cases it usually takes days to weeks
to get an answer when a business team has a question about something.
That's because this typically requires teams to go back and identify the right
data sources, fetch the data, put it in a staging area or a safe
area to process it, or create alternate views to make it
accessible, and then extract those insights and deliver them.
And if the business team happens to have a follow-up question
based on the information provided, which happens all the time,
the next iteration typically goes through a very similar
workflow: again, days to weeks.
But now, thanks to LLMs, we can let these business users ask questions
directly in natural language.
An orchestrator relies on LLMs, first of all, to identify
which data sources are appropriate, fetch the data, process it, and
summarize it to answer the question. In the best case, it
also provides all the necessary background: how it arrived at the answer
and how it processed the data, essentially explaining the process of answering
the question itself, along with the evidence presented to the user.
And this can happen within a matter of one to five
minutes, instead of days or weeks.
We noticed our quickest questions on AI for BI solutions take about 30
to 35 seconds, up to three minutes.
It will take longer or shorter depending on your use case: the number
of data sources it needs to look at,
the number of iterations, the number of queries it needs to make, that kind of stuff.
And with all the new frameworks coming up, it is possible to run
those queries in parallel and iterate quickly, so you
can improve the latency and get the responses even faster.
And these apps can also support data visualization: not just
presenting a textual summary, but also data visualizations, preparing
PowerPoint decks if you want, and generating reports or dashboards
along with the summaries and insights in natural language.
So when you prepare a report, you're not only looking at the visual presentation
of the data, but also a textual explanation of what happened
or what is happening with the data, and the key insights,
in the report itself.
It's so easy and totally accessible that everybody can understand it.
It's textual and graphical combined, and it's automated.
The most important part is that this can happen in real time.
And the fourth group of applications, probably the most powerful in this group,
is what we call agent crews.
An agent crew is essentially a team of self-organizing agents that
collaborate with each other to perform more complicated tasks.
These agents work together, assume different roles and responsibilities,
bring their specializations, and automate and refine those
processes to create the final answer.
They are particularly valuable for multi-step tasks: preparing
reports, for example, or doing analysis, automated security
testing, for instance, and others,
and even preparing comprehensive documentation that
takes multiple iterations and multiple roles and responsibilities.
So these are the four groups of capabilities, and for each group we
were able to identify exactly what is needed to support these use cases:
the infrastructure, the workflows, the pipelines for RAG stores, for example,
plus the observability aspect of being able to see every one of the
backend interactions that happen to facilitate a response,
so that developers and technically savvy users have all the information they
need to evaluate and validate the responses, look at performance
issues, and go refine them as needed.
By offering a self-service platform, team members throughout the organization
can now easily whip up and launch these apps without the long development
cycles we used to have.
It also speeds up innovation, because now everybody that has an idea can
easily and quickly create these apps, and everything else is taken care of for them.
And in our journey, of course, we also teamed up with federal agencies to
see what kinds of workloads and use cases we want to support,
and to make sure that whatever solution we build can smoothly
go live in customer environments.
Every team should be able to try these things out in a safe and
secure space and smoothly move to production whenever they're ready.
To make that happen, we came up with a serverless platform that is super flexible
and includes all the things that are required, like an all-in-one platform,
and it plays nice with all the hosted models as well as your custom models.
For example, if you want to tap into any models that are hosted on AWS
Bedrock, you can do that; or if you want to connect directly to OpenAI or host
your custom model, you can do that as well, in the safe confines
of that specific cloud provider.
In this case, the initial version was built on AWS, so you can
run everything in a safe and secure way in your AWS account itself.
And of course it has the power to reach into a variety of data sources within your
AWS account, as well as across different accounts, guaranteeing a safe setup
and data privacy across your accounts and your applications.
Nothing goes beyond the security boundary as far as
data or privacy is concerned.
I would like to quickly show this simplified platform architecture.
I made sure to reduce it to the essential model so that everybody can
see what it takes to build this platform.
On top of it, of course, there will be lots of additional
modules and little customizations.
But this will give you a good gist of what it takes
to build a platform like this.
To start with, you need an interface, or you will be using some kind of client
application that interacts with the API, so I'm not showing the UI piece here.
You can build it as an SPA using React, Angular, or Vue, for example, which will
serve as the UI layer for your application and interact with the APIs.
This architecture shows what it takes to build the API behind
the scenes that provides these rich capabilities in the platform.
So you have some kind of authorization and authentication mechanism.
We're just using Cognito in this case, federated with single sign-on
if needed to connect to your enterprise IAM controls, and an API gateway that
accepts the incoming requests and supports streaming.
Of course, there are other mechanisms.
I'll touch on those in just a minute. The crux of the solution is what we
call an orchestrator. This could be a container service, a microservice,
or a group or series of microservices that work in collaboration
to orchestrate the workflow you need to answer the question.
For example, if it is a straightforward chatbot, it will directly hit the
Bedrock API and get the response back.
Or, if there is a RAG store, it will go to the RAG store, bring the context,
include that context, and then go back to the Bedrock model to
get the response.
And for every interaction with the agents, AI for BI, or the multi-agent crews,
all the crux of the interaction happens inside the orchestrator.
It relies on LLMs to do some of that work, but the orchestrator is
the most important piece in this specific context.
And it's as simple as that.
Once you have an orchestrator, and once you have an API accessible to reach out to
an LLM, this gives you everything you need to build a Gen AI platform.
Of course, you need access to your data sources,
APIs, functions, and whatnot, as well as workflows and pipelines set
up that allow you to upload documents, convert them, and store them
in your RAG store for easy access later on as part of your queries.
On top of this, you will build lots of other building blocks.
But this is the essential skeletal architecture: anybody can build
this platform very quickly and then refine it later on.
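And the ingestion side of that pipeline reduces to something like this sketch: chunk the uploaded document, embed each chunk, and store the vectors. Here embed and the in-memory store are stubs for your embedding model and vector database.

```typescript
// Illustrative ingestion pipeline: chunk, embed, store for later retrieval.
function chunkText(text: string, size = 800, overlap = 100): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += size - overlap) {
    chunks.push(text.slice(i, i + size));
  }
  return chunks;
}

// Stub embedding call; swap in Bedrock Titan, OpenAI embeddings, etc.
async function embed(text: string): Promise<number[]> {
  return Array.from({ length: 8 }, () => Math.random());
}

// Stub vector store; in practice this is OpenSearch, pgvector, Pinecone, etc.
const vectorStore: { id: string; vector: number[]; text: string }[] = [];

async function ingest(docId: string, text: string): Promise<void> {
  for (const [i, piece] of chunkText(text).entries()) {
    vectorStore.push({ id: `${docId}#${i}`, vector: await embed(piece), text: piece });
  }
}
```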
It's been close to a year now since we launched the first version of our
platform, and we have come a long way, so I want to share a few things
that will probably help you out if you're also thinking about,
or already building, a platform like this on your own.
The secret sauce to crafting a top-notch platform for your organization
and clients is training and education, because everybody's excited, but they're
also a little bit scared of losing their data or creating another security
incident, and worried about what kind of information is fed back into these models
and whether their information might somehow show up in public somewhere.
So it's super important to train everyone in the system, not just
technical folks like developers and testers, but everyone, so that they
have the foundational skills, like writing a simple prompt and asking the
right question the right way, so the LLM can bring back more relevant answers.
Since a lot of these folks are also new to Gen AI apps, allowing
them to experiment is really key.
Because we cannot expect them to come up with the right prompt the first time,
experimentation is key, allowing them to make mistakes is key, and so is giving
them immediate feedback: the cost incurred by their query,
the latency and performance aspects, the bias aspects, and the
efficiency with which it was able to pull information from RAG.
Provide all that information as feedback to the users, but
make sure they can quickly experiment in a private space.
And whenever they're ready, let them share it with
everybody else in the system.
For developers, DevOps, and security folks, by allowing them to start using
tools like GitHub Copilot, they get better at asking questions and working with
copilots, and that also allows them to build copilot-like applications
themselves using these capabilities.
And organize some fun hackathons in safe spaces, so that everybody can dive
into examples, get inspired by looking at what everybody else is doing, and
get the lowdown on how to use it and how to create these apps like a pro.
And when you are building out your platform, it is really crucial to organize
these applications and capabilities based on skills and use cases,
with some sample apps as well.
This will help everybody understand what is available, get inspired by
looking at what everybody else in your organization is doing, and then
quickly start their own experiments.
And make sure that whatever you build, you abstract away all the
complexity and present a simple interface that everybody can use.
That is the key to the success of any platform.
And it's also really important to recognize that it's going to be a long
journey for you and your team building this platform.
Treat it like a product: start with an MVP, gather
some user input, see how they're using it, where they're fumbling, what
is working and what is not, and then constantly beef up those features.
Then you can unlock additional use cases, additional capabilities,
and additional security aspects,
and of course a lot more metrics that will make sense to your users as well.
As I said, it's a journey, a long journey,
and learning and adaptation are really important for building any
platform, but Gen AI platforms specifically.
And when you adopt a more structured approach to training
all kinds of users, giving them safe spaces for experimentation, and
taking their feedback and building iteratively, I think you definitely
will pave the way for a really strong platform that caters
to changing needs and evolving use cases in your organization and, of
course, for your customers as well.
So with that, I want to quickly run down a few highlights as a
summary of this conversation.
It is really important to provide a safe and secure sandbox for your users.
And as I said earlier, try to optimize your experiences for the non-typical
users first, because the lingua franca for generative experiences
is natural language, and we want to empower everybody in your organization
to be able to use it. So make sure you optimize your experiences for the
non-typical users, and also create an environment
where they can learn from each other.
A catalog of applications and capabilities, demos, and
shareable prompt libraries, for example, go a long way.
With that, I would like to take this moment to thank you very much for joining
this discussion on platform engineering, especially building a Gen AI platform.
Feel free to reach out on LinkedIn if you would like to connect
and continue this conversation.
Thank you very much, and have a wonderful day.