Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello, everyone.
Good morning.
Happy New Year.
My name is Praveen Paili, and I work as a Senior Data Engineer.
Over the years, I have had the chance to build critical data pipelines, manage large
scale warehouses, and create automation tools to tackle complex challenges.
As an organization continues to scale, fostering a data driven culture is
essential for empowering informed decision making across the organization.
Teams rely on data to drive product development, optimize
operations, and shape learning.
However, the complexity and fragmentation of data tools have
historically limited access to insights.
In today's data driven world, organizations are under constant
pressure to make informed decisions quickly and efficiently.
However, the need to access data quickly often conflicts with the
equally important need to keep that data secure.
As you know, enterprises demand quick and accurate data access to drive
informed data driven decision making.
Data democratization, making data accessible to all teams,
has become a priority for organizations.
It also involves security risks.
Security risk grows as access expands, with concerns about protecting sensitive
information and maintaining compliance.
The challenge lies in balancing data accessibility and security
without compromising either.
A solution is needed to empower teams with data while ensuring robust
security protocols.
To address this, we are introducing the AskData AI tool, an AI driven
platform that democratizes data insight generation and optimizes workflows
through natural language interfaces.
This tool simplifies data access for both technical and non technical
teams by allowing anyone to ask questions and retrieve accurate
data directly within their workflow.
This tool offers an AI powered conversational interface that allows
users to access data intuitively using natural language queries.
Our context aware agents are specialized to understand diverse
data contexts, ensuring relevant and accurate data retrieval.
We prioritize enterprise grade security, with role based access control
to protect sensitive information.
To revolutionize your data strategy, start by assessing your current data landscape
to identify pain points and opportunities.
Explore AI powered solutions, such as an AI powered conversational interface and
context aware AI agents, for improved data access.
The AI powered conversational interface offers several benefits.
Users can interact with the system using natural language queries, making it easier
for non technical users to access and understand data.
The conversational interface simplifies the process of data retrieval, reducing
the need for complex query languages or technical expertise, which
helps improve the user experience.
It speeds up the data access process by allowing users to ask questions
and receive answers in real time, enhancing productivity and efficiency.
It democratizes data access across the organization, enabling all team members,
regardless of their technical skills, to leverage data for decision making.
With context aware agents, the interface can understand the context of
queries, providing more accurate and relevant results.
Context aware agents improve data access by understanding diverse data contexts.
They are specialized to comprehend various data environments, ensuring
that the data retrieved is relevant and accurate to the user query.
These agents can interpret natural language queries, making it easier for
users to access that data intuitively without needing technical knowledge.
By understanding the context, these agents can provide more precise and
contextually appropriate data, improving the overall user experience
and efficiency in data retrieval.
This approach ensures that both technical and non technical users can access the
data they need quickly and accurately.
These benefits collectively enhance the overall data strategy of
an organization by making data more accessible,
understandable, and actionable.
Balancing security and usability is crucial in today's
data driven environments.
By leveraging a role based security process, organizations
can ensure users have seamless access to the data they need,
promoting efficient workflows and decision making.
They can maintain robust security, protect sensitive data from unauthorized access,
and adhere to compliance and regulatory standards.
To balance data democratization and security, enable broader access to
data for innovation and insights, and
implement strict controls to safeguard critical information assets.
Achieving this balance ensures both data accessibility and security are
aligned with the organizational goals.
Embrace continuous innovation
to stay updated on advancements and evolve your data strategy.
Now, the technical architecture deep dive for our tool. Our technical architecture
includes Qdrant vector search for efficient similarity based data retrieval.
We use a hybrid search model that combines vector and traditional
search, boosting accuracy by 20%.
Reciprocal rank fusion re ranks results to improve
relevance by 40%.
We have implemented a re ranking model, which improves performance by 40%.
We also integrated ChatGPT 4,
which reduces query latency by 35%.
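Reciprocal rank fusion, as mentioned above, can be sketched in a few lines. This is a minimal illustration of the general technique, not the speaker's implementation: the function name, the sample dataset names, and the constant k=60 (a commonly used default) are assumptions for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so items ranked highly by more than one retriever rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a vector search and a keyword search.
vector_hits = ["orders", "revenue", "customers"]
keyword_hits = ["revenue", "invoices", "orders"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Here "revenue" and "orders" appear in both lists, so they outrank the items that only one retriever returned.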
Key platform features: we mainly focused on enhancing the AI platform.
Qdrant vector search was deployed as a scalable vector
search engine, enhancing overall search capabilities and enabling
hybrid and multistage searches.
We also implemented advanced filtering.
Support was added for advanced filtering on JSON collections
using the Qdrant query language, enabling targeted filtering, such as including
only finance related datasets for Ask Finance.
It also supports team based searches,
so
we can restrict access, and only the respective teams can get their
data for their decision making systems.
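The idea of targeted, team based filtering can be sketched in plain Python. This is an illustration only and does not use the Qdrant API; the `Dataset` type and its `domain` and `teams` fields are hypothetical stand-ins for whatever payload fields the real collections carry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    domain: str        # hypothetical payload field, e.g. "finance"
    teams: frozenset   # hypothetical: teams allowed to see this dataset


def filter_candidates(datasets, domain, user_team):
    """Keep only datasets in the requested domain that the team may access."""
    return [d for d in datasets if d.domain == domain and user_team in d.teams]


catalog = [
    Dataset("quarterly_revenue", "finance", frozenset({"finance", "exec"})),
    Dataset("payroll", "finance", frozenset({"hr"})),
    Dataset("web_traffic", "marketing", frozenset({"marketing", "finance"})),
]
visible = [d.name for d in filter_candidates(catalog, "finance", "finance")]
```

Applying the filter before ranking means restricted datasets never reach the agent at all, rather than being hidden afterwards.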
Popularity and certification based re ranking: we implemented a reciprocal
rank fusion approach, which prioritizes results based on key indicators
like page popularity and certifications.
Semantic dataset collections: the AskData Explorer generates dataset collections
based on relevance, table popularity, and Hubble certification status.
Table sets can be applied to any agent for efficient table suggestions.
Latency optimization, hybrid search model evaluation, and other tools
and integrations.
The hybrid search model increased accuracy by 20 percent
across multiple agents. Scalable back end AI chatbots and batch mode:
this is a new service deployed for back end services in a supercell environment.
The gRPC integration also included security and privacy
approvals for the OpenAI process.
For integrations, we also integrated with Slack, Kafka, and
other tools to improve our product's reach.
We developed BFF and EA components, enabling faster
development and deployment cycles.
We enabled real time feedback collection to enhance service response accuracy.
We also included a Slack bot,
which lets users ask
questions in plain English and get their
results back with the proper information.
We also included logging and user ID propagation, and improved traceability
for logging, so that security related concerns can also be addressed
by looking at these logs.
We improved accuracy and latency, and enhanced platform stability and user
experience by integrating testing.
We added an automated evaluation mechanism and, through prompt engineering,
improved accuracy by 20 percent and latency by 22 percent.
Caching prompt responses significantly improved latency and
reduced LLM cost for repetitive prompts.
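A minimal sketch of caching prompt responses, assuming an exact match cache keyed on a normalized prompt. The class name, the normalization rule, and the `fake_llm` stand-in are all hypothetical; a semantic cache would compare embeddings instead of hashes.

```python
import hashlib


class PromptCache:
    """Exact-match cache: repeated prompts skip the LLM call entirely."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(prompt):
        # Normalize case and whitespace so trivial variations share one entry.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get_or_compute(self, prompt, compute):
        key = self._key(prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        answer = compute(prompt)
        self._store[key] = answer
        return answer


calls = []


def fake_llm(prompt):
    # Stand-in for a real model call; records each invocation.
    calls.append(prompt)
    return f"answer to: {prompt}"


cache = PromptCache()
first = cache.get_or_compute("Total revenue last quarter?", fake_llm)
second = cache.get_or_compute("  total revenue   LAST quarter? ", fake_llm)
```

The second request differs only in case and spacing, so it is served from the cache and the model is called once.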
We performed semantic cache evaluation for the accuracy of
the hybrid caching. We revamped logging and created a thread safe logging
mechanism throughout the application to monitor and debug all requests,
with ID and user mapping for full traceability. We implemented
guardrails that limit the chat history and the context provided to the LLM agent to
a maximum of 8 conversation turns or until the token limit is reached, which
helps reduce latency and LLM cost.
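The chat history guardrail can be sketched as follows. This is an illustration under assumptions: each turn is a plain string, the 2000 token budget is a made-up number, and the default token counter is a crude whitespace split rather than a real tokenizer.

```python
def trim_history(turns, max_turns=8, max_tokens=2000, count_tokens=None):
    """Keep the most recent turns, up to max_turns or the token budget."""
    if count_tokens is None:
        # Crude proxy for a tokenizer: count whitespace-separated words.
        count_tokens = lambda text: len(text.split())
    kept = []
    used = 0
    # Walk from newest to oldest so the latest context is always preferred.
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if len(kept) == max_turns or used + cost > max_tokens:
            break
        kept.append(turn)
        used += cost
    kept.reverse()  # restore chronological order for the prompt
    return kept


history = [f"turn {i}" for i in range(1, 13)]
recent = trim_history(history)               # limited by the 8 turn cap
tight = trim_history(history, max_tokens=5)  # limited by the token budget
```

Whichever limit is hit first wins, so long turns shrink the window below 8 while short ones are capped at 8.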
We implemented BFF model changes to support the new AskData AI gRPC
gateway service.
For the UI, we created a web UI
with more than 75 percent unit test coverage. It supports conversational
chat with rich Markdown support and works seamlessly across different screens.
Event logging and usability testing are
also included in the platform.
So overall,
our platform achieves a 60 percent boost in data discovery efficiency and a 20
percent improvement in query accuracy.
We have reduced query response time by 35 percent and achieved a 99 percent
reduction in unauthorized access.
We worked on enhancing data discovery. Semantic data collection
automatically analyzes, tags, and organizes datasets for
intuitive navigation and discovery.
We also implemented reciprocal rank fusion re ranking methods. Popularity
based ranking ranks datasets dynamically based on user engagement
metrics and collaborative feedback.
Certification based re ranking evaluates trusted data sources through
rigorous quality checks and expert verification.
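One simple way to fold popularity and certification into a ranking is a weighted score. Note that this sketch uses a plain weighted sum, not the reciprocal rank fusion approach the talk describes, and the field names and weights are hypothetical values chosen only for illustration.

```python
def rerank(candidates, popularity_weight=0.2, certified_boost=0.3):
    """Order candidates by relevance plus popularity and certification bonuses."""

    def score(candidate):
        bonus = certified_boost if candidate["certified"] else 0.0
        return (
            candidate["relevance"]
            + popularity_weight * candidate["popularity"]
            + bonus
        )

    return sorted(candidates, key=score, reverse=True)


# Hypothetical candidates with scores normalized to the range 0..1.
candidates = [
    {"name": "sales_raw", "relevance": 0.80, "popularity": 0.10, "certified": False},
    {"name": "sales_curated", "relevance": 0.70, "popularity": 0.90, "certified": True},
    {"name": "sales_tmp", "relevance": 0.75, "popularity": 0.05, "certified": False},
]
ranked = [c["name"] for c in rerank(candidates)]
```

The certified, widely used table overtakes a slightly more relevant raw table, which is the behavior popularity and certification based re ranking aims for.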
The major challenge was data access versus security.
Modern enterprises need seamless real time access to data insights
for agile decision making. While democratizing data access drives
innovation, it is crucial to balance accessibility with robust security
protocols to protect sensitive information and maintain compliance.
So we implemented role based access control (RBAC).
Our role based access control system ensures a 99 percent reduction in
unauthorized access, precise alignment with IT rules, dynamic permission
updates, and granular access control.
Granular access control is fine grained permissions tailored to
specific application requirements.
Dynamic role assignment is the ability to update roles based on
user activity, context, or rules.
Auditing and compliance is also implemented, which provides logs and reports
for monitoring access and ensuring compliance with regulations.
Integration support is compatibility with external identity providers like OAuth,
SAML, and others. We also implemented our tools to auto scale, suitable for a system
with a large number of users, roles, and permissions, and also for cross functional
organizations and cross teams.
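A minimal sketch of role based access control with dynamic role assignment and an audit trail. The role names, the permission table, and the `AccessController` class are hypothetical and only illustrate the ideas described above.

```python
from datetime import datetime, timezone

# Hypothetical role table: each role maps to (resource, action) permissions.
ROLE_PERMISSIONS = {
    "finance_analyst": {("finance", "read")},
    "data_admin": {("finance", "read"), ("finance", "write"), ("hr", "read")},
}


class AccessController:
    def __init__(self, role_permissions):
        self._role_permissions = role_permissions
        self._user_roles = {}
        self.audit_log = []

    def assign_role(self, user, role):
        if role not in self._role_permissions:
            raise ValueError(f"unknown role: {role}")
        self._user_roles.setdefault(user, set()).add(role)

    def revoke_role(self, user, role):
        # Roles can change at runtime; later checks see the update at once.
        self._user_roles.get(user, set()).discard(role)

    def is_allowed(self, user, resource, action):
        roles = self._user_roles.get(user, set())
        allowed = any(
            (resource, action) in self._role_permissions[role] for role in roles
        )
        # Every decision is recorded, granted or denied, for later auditing.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "resource": resource,
            "action": action,
            "allowed": allowed,
        })
        return allowed


acl = AccessController(ROLE_PERMISSIONS)
acl.assign_role("alice", "finance_analyst")
can_read = acl.is_allowed("alice", "finance", "read")
can_write = acl.is_allowed("alice", "finance", "write")
acl.revoke_role("alice", "finance_analyst")
after_revoke = acl.is_allowed("alice", "finance", "read")
```

Checking permissions as (resource, action) pairs keeps the control fine grained, and logging denials as well as grants gives compliance reports the full picture.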
So, the major challenges.
We unified disparate data sources across the ecosystem to address
data integration complexity.
We optimized our architecture to handle over 15,000
concurrent customers and users.
We customized ChatGPT 4 for specific data contexts.
We implemented an intuitive UI and a comprehensive training
program to drive user adoption.
The lessons learned from the overall implementation of this tool.
Balance is key.
It's crucial to democratize data access while maintaining robust security.
Context matters.
AI agents must understand enterprise specific data context
for accuracy in their responses.
Continuous improvement: regular model updates and user feedback loops drive
platform evolution and improvement
in the use of the tool. User centric design: intuitive interfaces and clear
documentation boost adoption rates.
Our future directions.
We are expanding to handle multi modal data integration, including
images, video, and audio.
Integrating with other enterprise tools and workflows
for cross platform synergy.
Implementing AI driven forecasting and trend analysis for
advanced predictive analytics.
Expanding to support queries in multiple
languages for global language support.
Thank you once again for your time and attention.
I look forward to witnessing more groundbreaking advancements in
AI and data in the near future.
Once again, thank you all.