Conf42 DevOps 2025 - Online

- premiere 5PM GMT

AskDataAI: Democratizing Enterprise Data Access Through Context-Aware AI Agents

Abstract

Discover how DoorDash revolutionized data access for 15,000+ employees with AskDataAI, achieving 40% better search relevance and 35% faster queries. Learn how we combined vector search, GPT-4, and robust security to build an AI platform that democratizes data while maintaining enterprise-grade protection.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello, everyone. Good morning, and Happy New Year. My name is Praveen Payili, and I work as a Senior Data Engineer. Over the years, I have had the chance to build critical data pipelines, manage large-scale warehouses, and create automation tools to tackle complex challenges.
As an organization continues to scale, fostering a data-driven culture is essential for empowering informed decision making across the organization. Teams rely on data to drive product development, optimize operations, and shape learning. However, the complexity and fragmentation of data tools have historically limited access to insights. In today's data-driven world, organizations are under constant pressure to make informed decisions quickly and efficiently. However, the need to access data quickly often conflicts with the equally important need to keep that data secure.
As you know, enterprises demand quick and accurate data access to drive informed, data-driven decision making. Data democratization, making data accessible to all teams, has become a priority for organizations. It also involves security risks. Security risk grows as access expands, with concerns about protecting sensitive information and maintaining compliance. The challenge lies in balancing data accessibility and security without compromising either. A solution is needed to empower teams with data while ensuring robust security protocols.
To address this, we are introducing the AskDataAI tool, an AI-driven platform that democratizes data and insight generation and optimizes workflows through natural language interfaces. This tool simplifies data access for both technical and non-technical teams by allowing anyone to ask questions and retrieve accurate data directly within their workflow. It offers an AI-powered conversational interface that allows users to access data intuitively using natural language queries.
Our context-aware agents are specialized to understand diverse data contexts, ensuring relevant and accurate data retrieval. We prioritize enterprise-grade security with role-based access control to protect sensitive information. To revolutionize your data strategy, start by assessing your current data landscape to identify pain points and opportunities. Explore AI-powered solutions, such as a conversational interface and context-aware AI agents, for improved data access.
The AI-powered conversational interface offers several benefits. Users can interact with the system using natural language queries, making it easier for non-technical users to access and understand data. The conversational interface simplifies data retrieval, reducing the need for complex query languages or technical expertise, which improves the user experience. It speeds up data access by allowing users to ask questions and receive answers in real time, enhancing productivity and efficiency. It democratizes data access across the organization, enabling all team members, regardless of their technical skills, to leverage data for decision making. The interface can also understand the context of queries, providing more accurate and relevant results through context-aware agents.
Context-aware agents improve data access by understanding diverse data contexts. They are specialized to comprehend various data environments, ensuring that the data retrieved is relevant and accurate for the user's query. These agents can interpret natural language queries, making it easier for users to access data intuitively without needing technical knowledge. By understanding the context, these agents can provide more precise and contextually appropriate data, improving the overall user experience and the efficiency of data retrieval. This approach ensures that both technical and non-technical users can access the data they need quickly and accurately.
These benefits collectively enhance the overall data strategy of an organization by making data more accessible, understandable, and actionable.
Balancing security and usability is crucial in today's data-driven environments. By leveraging a role-based security process, organizations can ensure users have seamless access to the data they need, promoting efficient workflows and decision making. They can maintain robust security and protect sensitive data from unauthorized access, and adhere to compliance and regulatory standards. Balancing data democratization and security means enabling broader access to data for innovation and insights while implementing strict controls to safeguard critical information assets. Achieving this balance ensures both data accessibility and security are aligned with the organizational goals. Embrace continuous innovation to stay updated on advancements and evolve your data strategy.
Now for the technical architecture deep dive. Our technical architecture includes Qdrant vector search for efficient similarity-based data retrieval. We use a hybrid search model that combines vector and traditional search, boosting accuracy by 20%. Reciprocal rank fusion re-ranks results to improve relevance by 40%. We have implemented a re-ranking model, which improves performance by 40%. We also integrated ChatGPT-4, which reduces query latency by 35%.
Key platform features: we mainly focused on enhancing the AI platform. Qdrant vector search was deployed as a scalable vector search engine, enhancing overall search capabilities and enabling hybrid and multistage searches. We also implemented advanced filtering. Support for JSON collections was added for advanced filtering using the Qdrant query language, enabling targeted filtering, such as including only finance-related datasets for AskFinance, which also uses team-based searches.
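To make the hybrid search idea concrete, here is a minimal sketch of reciprocal rank fusion in plain Python. It is not the platform's actual code: the table names, the two input rankings, and the constant `k=60` (a commonly used default) are all assumptions chosen for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a vector search and a keyword search.
vector_hits = ["orders", "deliveries", "payments"]
keyword_hits = ["payments", "orders", "refunds"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Tables that rank well in both lists rise to the top, while tables found by only one search method are still kept further down the fused list.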
We can restrict access there so that only the respective teams can get their data for their decision-making systems.
Popularity- and certification-based re-ranking: we implemented a reciprocal rank fusion approach, which prioritizes results based on key indicators like page popularity and certifications.
Semantic dataset collections: the AskData expert Explorer generates dataset collections based on relevance, table popularity, and Hubble certification status. Table sets can be applied to any agent for efficient table suggestions.
Latency optimization, hybrid search model evaluation, and integrations: the hybrid search model increased accuracy by 20 percent across multiple agents. A scalable backend for AI chatbots and batch mode is a new service deployed for backend services in a supercell environment. It includes gRPC integration as well as security and privacy approvals for the OpenAI process.
On integrations, we also integrated with Slack, Kafka, and other tools to improve our product's reach. We developed BFF and UI components, enabling faster development and deployment cycles, and enabled real-time feedback collection to enhance service response accuracy. We also included a Slack bot, so users can ask questions in plain English and get their results back with the proper information. We added logging and user ID propagation and improved the traceability of logs, so that security-related concerns can also be addressed by looking at these logs.
Improved accuracy and latency: we enhanced platform stability and user experience by integrating testing. An automated evaluation mechanism, together with prompt engineering, improved accuracy by 20 percent and latency by 22 percent. Caching of prompt responses significantly improved latency and reduced LLM cost for repetitive prompts.
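As an illustration of why caching prompt responses cuts latency and cost, the sketch below implements a simple exact-match cache keyed on a normalized prompt. This is an assumption-laden example, not the platform's implementation: `PromptCache` and `fake_llm` are hypothetical names, and the talk's semantic cache would match on embedding similarity instead of exact text.

```python
import hashlib


class PromptCache:
    """Exact-match cache keyed on a normalized prompt (hypothetical helper)."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(prompt):
        # Lowercase and collapse whitespace so trivial variations share a key.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get_or_compute(self, prompt, compute):
        key = self._key(prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        answer = compute(prompt)  # Only pay for the model call on a miss.
        self._store[key] = answer
        return answer


def fake_llm(prompt):
    # Stand-in for a real model call; assumed for illustration only.
    return f"answer to: {prompt.strip().lower()}"


cache = PromptCache()
first = cache.get_or_compute("Which table has order totals?", fake_llm)
second = cache.get_or_compute("  which table has ORDER totals? ", fake_llm)
print(cache.hits, cache.misses)
```

The second request differs only in case and spacing, so it is served from the cache without a second model call.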
We performed semantic cache evaluation for accuracy with hybrid caching. We revamped logging and created a thread-safe logging mechanism throughout the application to monitor and debug all requests, with ID and user mapping for full traceability. We implemented guardrails: the chat history and context provided to the LLM agent are limited to a maximum of 8 conversation turns or until the token limit is reached, which helps reduce latency and LLM cost. We implemented BFF model changes to support a new AskDataAI gRPC gateway service.
On the UI side, we created the web UI with more than 75% unit test coverage. It supports conversational chat with rich Markdown support and works seamlessly across different screens. Event logging and usability testing are also included in the platform.
Overall, our platform achieves a 60 percent boost in data discovery efficiency and a 20 percent improvement in query accuracy. We have reduced query response time by 35 percent and achieved a 99 percent reduction in unauthorized access.
We worked on enhancing data discovery. Semantic data collections automatically analyze, tag, and organize datasets for intuitive navigation and discovery. We implemented reciprocal rank fusion re-ranking methods: popularity-based ranking ranks datasets dynamically based on user engagement metrics and collaborative feedback, and certification-based re-ranking evaluates trusted data sources through rigorous quality checks and expert verification.
The major challenge was data access versus security. Modern enterprises need seamless, real-time access to data for agile decision making. While democratizing data access drives innovation, it is crucial to balance accessibility with robust security protocols to protect sensitive information and maintain compliance. So we implemented role-based access control.
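A role-based access check of this kind can be sketched in a few lines of Python. Everything in this example is hypothetical: the role names, data domains, and dataset names are invented for illustration, and a production system would load roles from an identity provider instead of a hard-coded dictionary.

```python
# Hypothetical mapping from role to the data domains it may read.
ROLE_PERMISSIONS = {
    "finance_analyst": {"finance"},
    "ops_analyst": {"logistics"},
    "data_admin": {"finance", "logistics", "marketing"},
}

# Hypothetical dataset catalog, each entry tagged with its domain.
DATASETS = [
    {"name": "revenue_daily", "domain": "finance"},
    {"name": "courier_routes", "domain": "logistics"},
    {"name": "campaign_clicks", "domain": "marketing"},
]


def allowed_domains(roles):
    """Union of the domains granted by each role; unknown roles grant nothing."""
    domains = set()
    for role in roles:
        domains |= ROLE_PERMISSIONS.get(role, set())
    return domains


def visible_datasets(roles, datasets=DATASETS):
    """Return only the dataset names the given roles are allowed to see."""
    domains = allowed_domains(roles)
    return [d["name"] for d in datasets if d["domain"] in domains]


finance_view = visible_datasets(["finance_analyst"])
unknown_view = visible_datasets(["intern"])
print(finance_view, unknown_view)
```

Filtering the catalog before search results reach the model means a user never receives suggestions for tables outside their role, and an unrecognized role sees nothing by default.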
Our role-based access control system ensures a 99 percent reduction in unauthorized access, precise alignment with IT rules, dynamic permission updates, and granular access control. Granular access control means fine-grained permissions tailored to specific application requirements. Dynamic role assignment is the ability to update roles based on user activity, context, or roles. Auditing and compliance are also implemented, with logs and reports for monitoring access and ensuring compliance with regulations. Integration support means compatibility with external identity providers such as SAML. We also built our tools to auto-scale, making them suitable for a system with a large number of users, roles, and permissions, and for cross-functional organizations and cross-team use.
Major challenges: we unified disparate data sources across the ecosystem to address data integration complexity. We optimized our architecture to handle over 15,000 concurrent customers and users. We customized ChatGPT-4 for specific data contexts. We implemented an intuitive UI and a comprehensive training program to drive user adoption.
Key lessons learned from the overall implementation of the tool: Balance is key. It is crucial to democratize data access while maintaining robust security. Context matters. AI agents must understand enterprise-specific data context for accuracy in their responses. Continuous improvement: regular model updates and user feedback loops drive platform evolution and improvement. User-centric design: intuitive interfaces and clear documentation boost adoption rates.
Our future directions: we are expanding to handle multimodal data integration, including images, video, and audio. We are integrating with other enterprise tools and workflows for cross-platform synergy. We are implementing AI-driven forecasting and trend analysis for advanced predictive analytics.
We are also expanding to support queries in multiple languages for global language support. Thank you once again for your time and attention. I look forward to witnessing more groundbreaking advancements in AI and data in the near future. Once again, thank you all.
...

Praveen Payili

Senior Data Engineer @ Horizonsoft Solutions



