Transcript
Are you an SRE? A developer? A quality engineer who wants to tackle the challenge of improving reliability in your DevOps? You can enable your DevOps for reliability with Chaos Native. Create your free account at Chaos Native Litmus Cloud.

Folks, my name is Arthur, with Dasha AI.
Today's session is on how you can use conversational voice AI to handle incidents in the site reliability engineering line of work. These two concepts, voice AI and incident handling, might seem like they exist in different worlds, and maybe they do, but the fact of the matter is that at least one of them, incident handling, can gain a lot of benefit from the other, voice AI. Here's what you should expect from today's session.
We're going to start off with some definitions, basically terminology, and we'll talk about why you might want to add conversational voice AI technologies to your set of tools in your line of work. I'll do a live demonstration of a Dasha AI conversational AI application which I built and integrated with Better Uptime: it handles incidents, closes them completely, resolves them if needed, acknowledges them, notifies me about them, and gives me updates on the status of the Kubernetes cluster, et cetera. Then we'll talk about how it works, and finally I'll give you a rundown of Dasha Studio, the tool set that I used to build this conversational app and which you can use as well to add conversational AI to your set of site reliability engineering tools.
Let's start with the definitions. Conversational voice AI is a set of technologies that lets you create automated conversations powered by machine learning and artificial intelligence services. At Dasha AI, we call these automated conversations Dasha apps, and we'll go into a bit more detail as to how they are run and how they work a little later, after the demo, in the how-it-works section.
Why might you want to use conversational voice AI in site reliability engineering? There are a few use cases. One is notifying you about incidents. Another is resolving those incidents over the phone. The third is handling incidents on the go: essentially, if you're away from your machine, you get notified about an incident, maybe you check some of the statuses and acknowledge the incident, and then resolve it when you get back to your computer. For example, you can handle incidents quicker while at your desk, because you don't have to switch back and forth.
You can ask the AI app what the status of your vital technologies is: for example, what's the status of your TLS certificates, what's the status of the Kubernetes cluster? And you can tell it to acknowledge the incident, to resolve the incident, et cetera, all with your voice over, say, speakerphone. You don't have to tap around on your computer to do this, and that's actually one of the big benefits. Another is that you can tell the AI to quickly notify your colleagues, all of them or specific ones, if you feel there is a need. You can get updates on the status of vital services; we already talked about that. You can take detailed notes on every incident: what I mean is, as you are reporting the incident, you can dictate notes to the AI app and have those notes sent to your email address, or you can push them to wherever you keep this information about incidents. In the demo I show you today, we'll just send them via email. And finally, you can resolve incidents quicker.
Moving into the demo: you're seeing Visual Studio Code and Better Uptime side by side. Why am I using Visual Studio Code? Because Dasha Studio is an extension to VS Code; if you're familiar with VS Code, you'll feel right at home with Dasha Studio. I'll go over all the different components and how you use them to build the Dasha app a little later; right now I'll just run a live demo for you. What we've got here is two servers running. One is just a simple hello-world server, which is being monitored by Better Uptime. The other is a server which takes inbound webhooks from Better Uptime and activates the Dasha application from the Node.js code, which then calls me, talks to me, and helps me resolve these issues in real time. Now, you might ask: why do this with Better Uptime, since Better Uptime already does voice notifications? It does do that, indeed, but those are just "hey, you've got an incident" and "hey, the incident was resolved" types of notifications. What I'll show you today is significantly more advanced. So I've just killed the server that we're monitoring, and any second now we're going to get a webhook coming into our server here and activating the Dasha application. There it goes.
This is Arthur. Hello.
Hello, Arthur. This is Dasha calling you regarding your website. There has been an incident. You can acknowledge or resolve the incident right on the call with me. Please note, I will listen and take notes until you mention that you are ready to resolve or acknowledge.
Okay, let me take a look here.
Okay, I'll wait.
What's the status of the Kubernetes cluster?
Kubernetes is up and running.
And what's the status of the TLS certificate and the health check?
TLS certificate is active; site health checks are not responding.
Okay, let's acknowledge the incident.
Can you please confirm that you want me to acknowledge the incident?
Yes, I can.
Got it. I have set the status in Better Uptime to acknowledged. The next step is to resolve the incident. Would you like me to notify other team members that there is an incident which may affect them?
Yes, please.
Okay, team members notified. I will wait for you to let me know when the incident is resolved.
Okay. Oh, crap.
What happened, Arthur? Did you use the wrong terminal again?
Yeah, something like that. Hold on.
Okay, incident resolved.
Can you please confirm that you want me to acknowledge the incident?
Incident resolved.
Can you please confirm that you want me to resolve the incident?
Yes, I can.
Well done, Arthur. I have set the status in Better Uptime to resolved. Thank you and take care. Goodbye.
Bye.
Right, well, there you have it. As I mentioned earlier, we're also taking down a transcription of the conversation, and I should have just received an email with the transcript. There it is: this is the transcript of the conversation that we have just had. Obviously, as you saw, the incident was handled in real time, acknowledged and resolved, and we were able to pull data from external functions regarding the Kubernetes status, et cetera. Pretty much anything that you might need is at your fingertips, or ear tips, with Dasha, because you can run any HTTPS requests you want; anything that you can do with Node.js code, you can do with Dasha.
I'm going to switch back over to the deck and give you a quick overview of the architecture, and then we'll come back to Visual Studio Code, where I will take you through the actual code which makes this app tick. To take you through the architecture of Dasha apps: there are three parts.
The first is Dasha Studio. That's what you have just seen me use to run the application, and it provides you with such tools as analytics, a debugger, a visual editor, and a code editor. Essentially, the Studio is where you write out the conversation flow using DashaScript, a domain-specific language designed as a state machine of nodes, where each node is responsible for something happening in the conversation. You might have nodes that don't show up in the conversation but that you use to do calculations, et cetera, or to call external functions. External functions are implemented in your index.js file, and from index.js you can call upon any external services; we'll see concrete examples of this in the code walkthrough.
The second part is the Dasha SDK. That's essentially what you import into your Node.js file, and it lets you integrate with APIs, handle your telephony, et cetera. And the third is the Dasha Cloud. This is the part of the whole system which gives you the AI-as-a-service component: it lets your conversations have digressions, lets you customize intents and entities, does slot filling, and provides out-of-the-box natural language generation, natural language understanding, text-to-speech, and speech-to-text, best in class, all proprietary technology.
We're actually rolling out, it's already in live testing and we'll be pushing it into production, emotionally charged speech synthesis, so you can define what types of emotion you want to give the talker, if that's the type of thing that you're into. And by the talker, I mean the AI. And this is how the entire thing works, at an overview: you write the killer app in the Studio, it's loaded into the Dasha Cloud platform through the SDK, and then the conversation happens with the user through a telephony provider.
So we've gone over the architecture; let's now look at the actual code that makes these conversational AI applications work and interface with all of the services that you use, in this case with Better Uptime. As I have mentioned, we've got a few main parts that we'll be looking at. The first is main.dsl. DSL stands for Dasha Scripting Language, a domain-specific language which is used specifically to construct conversations; it denotes the structure of a conversation. The second file that we'll be looking at is data.json, and this is the set of data which is used to train the Dasha AI neural networks in the Dasha Cloud to recognize specific intents or specific named entities that the user requests. We'll look over that as well. The third, which we'll barely look at, is phrasemap.json; I'll show you a couple of things in it and tell you what it's all about. And finally, index.js is the file that puts this all together, and this is where we will actually start today.
As I have mentioned, when I went into the demo, I had two applications running. One is hello-world.js, which is about as simple a server on Node.js as you can set up, and the other is index.js.
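For reference, the monitored server really can be that small. Here's a minimal sketch; the talk only says it's a simple Node.js hello-world server, so the framework and port here are assumptions:

```js
// hello-world.js: the server Better Uptime monitors; a minimal sketch,
// assuming Express and port 3001 (neither is specified in the talk)
const express = require("express");

const app = express();
app.get("/", (req, res) => res.send("Hello world"));
app.listen(3001, () => console.log("hello-world server listening on 3001"));
```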
I didn't mention this during the demo, but obviously these are all running on my local machine, so I used ngrok to give them web addresses, so that Better Uptime can actually monitor one and interface with the other: it monitors this one and interfaces with this one. index.js starts off by importing the Dasha AI SDK, and obviously we're importing Express to run the server here; we're using a few other things as well.
Here is where the webhook listener app begins, so this is where our server actually starts. It gets data via webhook from Better Uptime. The most important piece of data for us is the incident ID, but we also want to know whether the incident has been acknowledged or resolved. The thing is that Better Uptime sends webhooks no matter what happens, but we only want to get a call from Dasha if the webhook was initiated because an incident was created, not resolved or acknowledged. So once we get that type of webhook, for an incident that's just been created, we launch the Dasha application, and the Dasha application calls me and the conversation begins.
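As a rough illustration of that flow, here is a minimal sketch of such a listener. It assumes the deploy/start/createConversation calls from the @dasha.ai/sdk package; the route path, payload field names, and environment variables are illustrative assumptions, not the exact ones from this project or from Better Uptime's schema:

```js
// Sketch of the webhook listener that launches the Dasha app.
// Assumes @dasha.ai/sdk; field names and env vars are assumptions.
const dasha = require("@dasha.ai/sdk");
const express = require("express");

async function main() {
  // deploy the Dasha app once, from the local app folder
  const dashaApp = await dasha.deploy("./app");
  await dashaApp.start({ concurrency: 1 });

  const server = express();
  server.use(express.json());

  server.post("/incident-webhook", async (req, res) => {
    res.sendStatus(200); // acknowledge the webhook immediately

    const incident = req.body?.data;
    // only place a call for freshly created incidents, not acks/resolutions
    const alreadyHandled =
      incident?.attributes?.acknowledged_at || incident?.attributes?.resolved_at;
    if (!incident?.id || alreadyHandled) return;

    // phone and name are the input variables declared in main.dsl
    const conv = dashaApp.createConversation({
      phone: process.env.PHONE,
      name: process.env.NAME ?? "",
    });
    const result = await conv.execute(); // Dasha places the call here
    console.log(result.output);
  });

  server.listen(process.env.PORT ?? 3000);
}

main();
```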
So let's look at the body of the conversation. We start off with two input variables, phone and name. You can look at index.js and see, right here, where we pass these input variables into the Dasha app; there they are. I'm storing these in the .env file, along with all the other things that I don't want to store directly in my code. We also declare external functions here. As mentioned earlier, external functions are a way for you to call up code within index.js from the body of your AI conversation, which can then go on to call any manner of external services. The conversation starts with the node root: we wait to connect to the phone, and the application waits until the user says something. Then it greets the user and, in this case, tells them, you can let me know when you're ready to resolve or acknowledge the incident, and, by the way, you've had an incident.
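To make that concrete, here is a minimal sketch of what those declarations and the root node can look like in main.dsl. It follows the DashaScript syntax used in Dasha's sample apps; the greeting is shortened, and the exact function names and signatures are assumptions:

```dsl
context
{
    // input variables, passed in from index.js when the conversation starts
    input phone: string;
    input name: string = "";
}

// declared here, implemented in index.js via the SDK; signatures are assumed
external function acknowledge(): string;
external function resolve(): string;
external function getStatus(service: string): string;

start node root
{
    do
    {
        #connectSafe($phone);  // dial the input phone number
        #waitForSpeech(1000);  // wait for the person to say something
        #sayText("This is Dasha calling you regarding your website. There has been an incident.");
        wait *;                // from here, digressions drive the conversation
    }
}
```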
From there, we can take the conversation in a few directions. This is a pretty simple script, as far as AI conversations are concerned: essentially, we can either acknowledge, resolve, or ignore the event. We can also ask the application about the status of some vital services, specifically the TLS certificates, the Kubernetes cluster, and the site health check. You can also ask it to wait or to repeat the last question, and there's a bit of an Easter egg: the "did you use the wrong terminal again?" line.
Right. Here's what I want to draw your attention to: the resolve, acknowledge, and ignore nodes are all labeled not as a node, but as a digression. What is a digression, in the context of Dasha? It's a node that can be called up at absolutely any point in the conversation. We developed this for two reasons. One, it's a great way to navigate if you've got a huge, giant menu. Two, it's a way to give Dasha apps that human-like feel. When you're talking to a person, say about site reliability engineering, and suddenly your friend says, "hey, by the way, what's the weather like where you are?", you're able to reply to that. To pass a sort of Turing test, to give the user the feeling that they're talking to a human, we want the AI applications to be able to do the same thing, and digressions do really well with that. Digressions are activated by intents: you can see here conditions like "on message has intent ignore", and such and such digression is activated. In this case, we're looking for the intent ignore, so when these phrases, or any number of phrases built on them, show up, the digression is activated; we'll sketch one of these digressions in a moment.
data.json is a way for you to easily feed data into the Dasha AI-as-a-service neural networks, which are then trained, and retrained over and over in ongoing conversations, to recognize a variety of phrases which may include these words, or which may not even sound exactly like these words, but which have been identified to carry the same weight of meaning as these words do.
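For a sense of what that file looks like, here is a minimal sketch of an intent definition, following the data.json format Dasha's sample apps use; the training phrases here are illustrative:

```json
{
  "version": "v2",
  "intents": {
    "ignore": {
      "includes": [
        "ignore the incident",
        "let's ignore this one for now",
        "just ignore it"
      ]
    }
  }
}
```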
Once the digression is activated, in this case we ask the user to confirm the action. The user can then either confirm, with the intent yes, that they do indeed want to ignore the incident, or say no, in which case they are moved over to the node waiting, where Dasha says that she will wait for additional instructions from the user.
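Putting those pieces together, a digression like ignore, with its intent condition and the yes/no confirmation branch, can look roughly like this in main.dsl; a minimal sketch following the syntax of Dasha's sample apps, with assumed node names:

```dsl
digression ignore
{
    // can fire at any point, triggered by the "ignore" intent from data.json
    conditions { on #messageHasIntent("ignore"); }
    do
    {
        #sayText("Can you please confirm that you want me to ignore the incident?");
        wait *;
    }
    transitions
    {
        // branch on the intents recognized in the user's reply
        confirmed: goto do_ignore on #messageHasIntent("yes");
        declined: goto waiting on #messageHasIntent("no");
    }
}
```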
Finally, what I want to show you is how we identify named entities when we're checking the status of external services. This digression, status, is activated on two conditions: one, the message has to have the intent status; and two, the message has to carry some data, specifically the status entity. What is the status entity? It's a named entity which has a number of values. It's not an open set of data, it's a closed set, which means that only these values will be identified. If this were an open set, Dasha might substitute any number of words placed in the proper position by the user; but in this case we're looking to identify some very specific services: Kubernetes, TLS, and health check. And here are the instructions provided to the neural network for when the message has the intent status and the status entity data. These are the types of phrases the user might say: "what is the status of the Kubernetes cluster?", "what's the status of Kubernetes and TLS?", "tell me about the status of", or "give me an update on the status of" this, that, and the other in the course of the conversation.
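A closed-set entity definition in data.json can look roughly like this; a minimal sketch in the format of Dasha's samples, where the (value)[entity] annotations in the training phrases mark where the entity appears:

```json
{
  "version": "v2",
  "entities": {
    "status": {
      "open_set": false,
      "values": [
        { "value": "kubernetes", "synonyms": ["kubernetes", "kubernetes cluster"] },
        { "value": "tls", "synonyms": ["tls", "tls certificate"] },
        { "value": "health check", "synonyms": ["health check", "site health check"] }
      ],
      "includes": [
        "what is the status of (kubernetes)[status]",
        "what's the status of (kubernetes)[status] and (tls)[status]"
      ]
    }
  }
}
```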
This is how we check the status of these things: we call up an external function, and the service returns the status to us. Going back, we looked at the digression for ignore; now let's look at the digression for resolve, for example. It's the same workflow. If the digression gets called up, the Dasha app asks to confirm that the incident is ready to be resolved, and if that's confirmed, it calls up the external function resolve.
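On the DashaScript side, that call can be sketched like this; the external-call syntax follows Dasha's sample apps, and the node names and phrases are assumptions:

```dsl
digression resolve
{
    conditions { on #messageHasIntent("resolve"); }
    do
    {
        #sayText("Can you please confirm that you want me to resolve the incident?");
        wait *;
    }
    transitions
    {
        confirmed: goto do_resolve on #messageHasIntent("yes");
    }
}

node do_resolve
{
    do
    {
        // hand off to the resolve handler implemented in index.js
        var result = external resolve();
        #sayText("I have set the status in Better Uptime to resolved. Take care and goodbye.");
        exit;
    }
}
```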
Here it is in our code: resolve. It checks whether the incident ID is null, and if it's not, it authorizes with the bearer API token and sends an HTTPS POST request with the incident ID, instructing Better Uptime to resolve the incident. After that, Dasha tells the user that the incident has been resolved: take care and goodbye.
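A minimal sketch of that handler, assuming the Dasha SDK's setExternal hook and Better Uptime's v2 incidents API; the endpoint path and environment variable name are assumptions:

```js
// index.js side of the "resolve" external function declared in main.dsl.
// Assumes @dasha.ai/sdk's setExternal and Better Uptime's v2 API.
const fetch = require("node-fetch");

dashaApp.setExternal("resolve", async () => {
  // incidentId is assumed to have been stored when the webhook arrived
  if (incidentId == null) return "no active incident";

  const response = await fetch(
    `https://betteruptime.com/api/v2/incidents/${incidentId}/resolve`,
    {
      method: "POST",
      headers: { Authorization: `Bearer ${process.env.BETTERUPTIME_TOKEN}` },
    }
  );
  return response.ok ? "resolved" : "failed to resolve";
});
```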
By the same token, you could automate literally any type of activity that you currently do manually, with a Dasha app that's tailored to your specific needs. To put into perspective how easy it is to build with: it took me probably around five or six hours to build this entire thing, and I'm not a very experienced software engineer. I hate to be the person who says "it's that easy", but it really is that easy to build your own apps and make your site reliability engineering workflow even more efficient.
The source code for this application will be linked in the YouTube description below, and if you go to the GitHub repo, you will find in the readme a bit of a tutorial on how to actually put all of this into action. It will also be up on the Conf42 website for you to review, to download, to use, and to build your own applications on top of. I hope this was as exciting for you to watch as it was for me to create. Good luck making your site reliability workflows ever more efficient. Thanks, everybody.