Conf42 Python 2025 - Online


AI Agents: Customizing AI to Your Needs


Abstract

What if we could leverage the advanced NLP capabilities of LLMs to complete complex tasks that require up-to-date information, planning, and external tools? How can developers automate and scale these complex multi-step workflows? One effective solution is AI agents equipped with custom tools.


Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hi, everyone. My name is Anna. I'm a data scientist working in developer advocacy at IBM. I'm super excited to talk to you today about AI agents and how they're reshaping the future of AI as we speak. It goes without saying that over the last few years, AI has spread across almost every industry. From customer service to healthcare and even agriculture, AI is reshaping the way we work and the way we think about work. I'm sure many of us have used a large language model and been disappointed by its responses. So why is it that some large language models can successfully complete complex tasks, like writing JavaScript code, but can't answer simple questions like "What is today's date?" The answer to filling these information gaps lies in AI agents.

So what exactly is an AI agent? An AI agent is a system or program that is capable of autonomously performing tasks on behalf of a user or another system by designing its own workflow and using available tools. So what makes them any different from a traditional AI assistant? Traditional LLMs, like IBM's Granite models, Meta's Llama models, or Google's Gemma models, to name a few, produce their responses based on the data used to train them, and they're bounded by knowledge and reasoning limitations. That's why an LLM is unable to provide us with today's date: it can't access information it wasn't trained on. In contrast, agentic technology is modular and easier to adapt to personalized tasks. Without the need for human intervention, tool calling is used on the backend to obtain up-to-date information, optimize workflows, and create subtasks autonomously to achieve complex goals. In this process, the autonomous agent learns to adapt to user expectations over time, and the agent's ability to store past interactions in memory and plan future actions enables a personalized experience and comprehensive responses.

This process can be broken down into three stages. First, there's goal initialization and planning. This is where the agent takes in a user query, extracts the main goal, and creates a plan of subtasks to reach that goal. Next, there's tool calling, in which the AI agent uses its available tools to fill any information gaps in the LLM's pre-trained knowledge. These tools can include external datasets, web searches, APIs, and even other agents. After the missing information is retrieved from these tools, the agent can update its knowledge base. This means that at each step of the way, the agent reassesses its plan of action and self-corrects. We have the flexibility to either create custom tools for our AI agents or use pre-built tools like the ones already available through LangChain. For anyone unfamiliar with LangChain, it's an open-source LLM orchestration framework. Additionally, LangGraph is built on top of LangChain, and it's a framework specific to agentic workflows; it's a specialized library within the LangChain ecosystem. We'll be using both in the upcoming demo. The tools available through LangChain are vast, and they include functionality for search, web browsing, code interpretation, productivity, database management, and much, much more. Lastly, the final stage is learning and reflection. This is where an AI agent uses feedback mechanisms, such as other AI agents or a human in the loop, to improve the accuracy of its responses. Feedback mechanisms improve the AI agent's reasoning and accuracy, and this is commonly referred to as iterative refinement. To avoid repeating the same mistakes, AI agents can also store data about solutions to previous obstacles in a knowledge base.
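As a concrete taste of custom tools, here is a minimal sketch, not taken from the talk's notebook, of a tool that closes the "what is today's date?" gap mentioned at the start. It assumes the langchain-core package is installed:

```python
# A minimal sketch of a custom LangChain tool; hypothetical example,
# not part of the talk's demo code.
from datetime import date

from langchain_core.tools import tool


@tool
def get_todays_date() -> str:
    """Return today's date in YYYY-MM-DD format."""
    # The docstring above doubles as the tool description the agent
    # reads when deciding whether to call this tool.
    return date.today().isoformat()
```

An agent given a tool like this can answer a question the underlying LLM alone cannot.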
Alright, now let's bring in some real-world data. As a recent survey of AI developers shows, a whopping 99 percent of them are exploring and developing use cases for AI agents. Clearly, this is a major topic with lots of growth potential.

Now let's apply this preliminary AI agent knowledge to an example that's applicable to many of us today. Let's say we want to plan a vacation, and we're unsure about the number of our remaining vacation days at work. If I were to ask an LLM the user query "How many vacation days do I have left?", the LLM would either hallucinate or simply return a response indicating that it can't provide me with this information. Not only does the LLM not know who I am and where I work, it also doesn't have the ability to figure that out. Whereas if we were to provide the LLM with a tool that searches a database of vacation days for each employee, along with the same user query, an AI agent would be able to provide us with exactly the information we're looking for. This is a simple example of the significance of agentic systems.
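To illustrate, here is a minimal sketch of what such a vacation-days tool could look like. The employee table and the function name are hypothetical, invented purely for illustration; they are not from the talk's demo:

```python
# A sketch of a custom tool backing the vacation-days example above.
# The in-memory dictionary is a hypothetical stand-in for a real HR
# database query.
from langchain_core.tools import tool

_VACATION_DAYS = {"anna": 12, "bob": 5}  # hypothetical employee records


@tool
def get_remaining_vacation_days(employee_id: str) -> int:
    """Look up how many vacation days an employee has left."""
    return _VACATION_DAYS.get(employee_id.lower(), 0)
```

Handed to an agent along with the user's identity, a tool like this turns an unanswerable question into a simple lookup.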
Of course, as with most things in AI, we can make this more sophisticated through the use of reasoning paradigms. There are several reasoning paradigms out there, but the one most commonly used is ReAct, short for reasoning and action, and it's not to be confused with the JS framework, as I've learned once before. Essentially, with ReAct, we're controlling the logic of a compound AI system by putting the LLM in charge. In other words, we're designing the system to think slowly and readjust as it goes. As seen in this diagram, the agent begins with a user query, from which it extracts the main goal as well as any subtasks that goal depends on. Then the agent checks its tools and determines the appropriate ones for each subtask. It then executes the subtasks and finally provides an output. The process doesn't end there, however. Instead, the agent determines whether the output is satisfactory. If not, the agent adjusts and the process is repeated. Otherwise, the agent returns the synthesized response to the user. This encapsulates the end-to-end process an AI agent goes through to produce responses to user queries using ReAct (a toy sketch of this loop appears at the end of this passage).

So, don't AI agents seem great? But what if I told you that they're even more powerful when working together? That's right: multi-agent systems tend to outperform singular agents. There is strength in numbers due to the larger pool of shared resources, optimization, and automation. Instead of multiple agents learning the same policies, one agent can share its learned experiences with the rest to optimize both time complexity and efficiency. In a multi-agent system, agents remain autonomous, but they also cooperate and coordinate within agent structures. Each agent within a multi-agent system has individual properties, but all agents behave collaboratively to achieve desired global properties. Multi-agent systems are particularly valuable for completing large-scale, complex tasks that can encompass hundreds, if not thousands, of agents.

Multi-agent systems can operate under various architectures. In centralized networks, a central unit contains the global knowledge base; it also connects the agents and oversees their information. Some strengths of this structure are the ease of communication between agents and uniform knowledge, while a weakness of the centrality is the dependence on the central unit: if it fails, the entire system of agents fails. In decentralized networks, by contrast, agents share information with their neighboring agents instead of with a global knowledge base. Some benefits of this are robustness and modularity: the failure of one agent doesn't cause the overall system to fail, since there's no central unit. One challenge of decentralized agents is coordinating agent behavior to benefit other cooperating agents.

Multi-agent systems also come with their pros and cons regardless of system structure. Firstly, they're quite flexible: we can adjust the system by removing and adding agents as needed, which ties into their scalability, as we can tackle much more complex goals than a singular agent could. And each agent comes with its own domain specialization, so agents are not relearning the same information; instead, they're collaborating by sharing it. In turn, the greater the number of agents, the greater the chance of an agent malfunction, and the harder it can be to coordinate each agent to benefit the others. Since multi-agent systems involve dependencies between agents for accomplishing subtasks, we may also encounter unpredictable behavior. This behavior can be due to infinite feedback loops, changes in access to certain tools, or malfunctions due to design flaws. We can mitigate this risk by incorporating means of system interruption, human supervision, and feedback. This is especially important when working with sensitive data.

Okay, so given all of this information, you may still be wondering: why should I care, and how does this apply to me? Here are just a few of the incredible ways in which AI agents and multi-agent systems can help us as humans allocate our time to more nuanced and critical tasks that can't be easily automated. Probably the most common use case of AI agents today is integration into websites and apps to enhance the customer experience, whether that's serving as virtual assistants, providing mental health support, simulating interviews, and much more. In healthcare, AI agents have the potential to alleviate the workload of medical professionals by overseeing drug processes and drafting treatment plans, thus allowing their time to be spent on urgent, hands-on tasks. Additionally, AI agents can optimize rescue operations in times of natural disaster by locating those in need of rescue using tools like social media. Lastly, multi-agent systems can be a powerful tool for managing complex transportation systems, like railroad systems, truck assignments, or marine vessels visiting the same ports.

Now, having said this, by no means am I saying that AI agents in their current state can be 100 percent trusted with these listed use cases. Agents applied to these use cases should be under significant supervision and should undergo thorough testing before being fully implemented. Regardless, I think we're at such a pivotal time for agentic AI that it's important to put its capabilities in perspective. If we look at the current trends in how enterprise AI developers are using AI agents, we see that about half of them are using agents for customer service and support, followed by project management, serving as a personal assistant, content creation, HR, transportation, and healthcare, and only 1 percent of AI developers are not currently exploring or developing AI agent use cases.
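Before moving to the demo, here is the toy sketch of the ReAct control flow described earlier: plan, act with a tool, observe, and repeat until the output is judged satisfactory. Everything in it is an illustrative stub, not code from the talk:

```python
# A self-contained, schematic sketch of a ReAct-style loop.
# `plan` stands in for the LLM's reasoning step; `tools` maps tool
# names to callables. Both are illustrative stubs.
from typing import Callable


def react_loop(
    query: str,
    plan: Callable[[str, list[str]], str | None],
    tools: dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> str:
    observations: list[str] = []
    for _ in range(max_steps):
        action = plan(query, observations)  # LLM decides the next step
        if action is None:                  # output judged satisfactory
            break
        tool_name, tool_input = action.split(":", 1)
        observations.append(tools[tool_name](tool_input))  # act + observe
    return observations[-1] if observations else "no answer"
```

The prebuilt ReAct agent we use in the demo below implements this same plan-act-observe cycle for us.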
Now that we've established a foundational understanding of AI agents and we've seen some real-world data about their significance, let's demo how to build an AI agent using LangGraph. We'll use pre-built LangChain tools for a ReAct agent to showcase its ability to differentiate appropriate applications of each tool. The code that I'll share with you is available on GitHub as well as on IBM.com, and I'll include the links to both at the end of this presentation if you'd like to run the project on your own time and experiment with it.

Okay, let's jump right into it. Since we've covered the basics of what an AI agent is and how tool calling works, we can go ahead and take a look at our prerequisites for this demo. You will need an IBM Cloud account, and keep in mind there are hyperlinks throughout this entire demo, so you can go to either the GitHub repository or the IBM.com page and follow along there. Once you have created an IBM Cloud account, you can log in to watsonx.ai, which is step one. There you can create a watsonx.ai project and make a note of your project ID, which we'll need for this demo. Finally, you can either create a Jupyter notebook from scratch or import an already existing one; if you want to download this one from GitHub and import it directly into watsonx.ai, that is also an option.

Moving along to step two, we will set up a watsonx.ai Runtime instance and API key. You can create a free instance by choosing the Lite plan of watsonx.ai Runtime. There you can generate an API key, and finally, you can associate the Runtime instance with the project that you just created in step one.

Once we have all these preliminary steps completed, we can go ahead and install and import the relevant libraries, as well as set up our credentials, in step three. I've already installed these libraries, so I won't rerun this cell, but I just want you to see that we've imported all of the necessary libraries and modules that we'll need, and that we've loaded our environment variables. There are several ways of setting up your environment variables; my personal preference is to use a separate .env file, and that is what I have done here, using the os module. Here we can also set up our credentials for the watsonx API: our API key, our project ID, and the API endpoint. Your endpoint will be specific to your region, so please reference the documentation to make sure you're using the correct endpoint; otherwise you will have a hard time connecting. We can also set up our OpenWeather API key, which you can generate by creating an OpenWeather account; all of this is free to create. We will be using this API in a later step for a LangChain tool.

In step four, we can go ahead and initialize our LLM. In this tutorial, we'll be using the IBM Granite 3.1 8B Instruct model, and I'll be setting it up using the ChatWatsonx wrapper available through LangChain. This wrapper is pretty neat: it allows us to integrate tool calling and chaining. To initialize the LLM, I'll set the model parameters here as well. One thing to point out is that I set the temperature to zero to limit agent hallucinations and overall LLM creativity.
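Here's a sketch of what this credentials and model setup can look like in code. The environment-variable names and the exact parameter set are my assumptions for illustration; defer to the notebook linked at the end for the authoritative version:

```python
# Sketch of step three and step four: load credentials from a .env
# file and initialize the Granite model via LangChain's ChatWatsonx
# wrapper. Variable names are assumptions, not copied from the talk.
import os

from dotenv import load_dotenv
from langchain_ibm import ChatWatsonx

load_dotenv()  # reads WATSONX_APIKEY, WATSONX_PROJECT_ID, etc. from .env

llm = ChatWatsonx(
    model_id="ibm/granite-3-1-8b-instruct",
    url="https://us-south.ml.cloud.ibm.com",  # endpoint is region-specific
    apikey=os.environ["WATSONX_APIKEY"],
    project_id=os.environ["WATSONX_PROJECT_ID"],
    params={"temperature": 0},  # temperature 0 to limit hallucinations
)
```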
Moving right along, in step five we can establish the built-in tools. To do so, we will use the Tool class available through LangChain. The first tool that we'll set up is the OpenWeatherMap tool, which uses the API key that we set up in an earlier step. When I set up these tools, I provide a name for the tool, a description, and the function that is meant to be run when calling each tool. The next tool we'll set up is the YouTube search tool. Pretty self-explanatory: it'll return YouTube links to videos relevant to the user query. And finally, we can set up an online shopping tool, which uses Ionic. This is also available through LangChain, and it's essentially an e-commerce marketplace tool that returns search results relevant to your user query. I'll explain a bit more about its search parameters in a later step. Now that we have our tools defined, we can put them into a list and simply display that list of tools. Here we see how the tools are loaded: we see their names, their descriptions, and their functions, as well as any API keys.

Now, for step six, we will actually get to implement some tool calling. Before we jump into creating an AI agent, I first want to see how the LLM performs when simply provided with a user query and its tools. We will use the bind_tools method available through LangChain, and we will invoke the user query "What are some YouTube videos about Greece?" I'll print the response's additional arguments just to see a little better what's going on here. We can see that the YouTube search function was appropriately selected from the tool base, and that the argument of Greece was also successfully extracted from the user query. But as you can see, the actual tool response is not returned. That's because we're not creating an agentic system; this is simply the LLM with its tools.

Moving right along, we can now create the ReAct agent using the create_react_agent method available through LangGraph. As we've already discussed this, I won't go too in depth again, but just to refresh our memory: the user provides a query to the LLM, which in our case is a ReAct agent. The agent then assesses its tool base and determines the appropriate tools for each subtask within its broader goal. Finally, once it collects all of the tool messages, it formulates a synthesized response to the user, and if successful, the process ends there. If the user is not satisfied with the results, the process repeats. So let's go ahead and initialize our ReAct agent. Super simple, super clean method.
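A condensed sketch of steps five through seven might look like the following. It assumes the community wrappers named below, omits the Ionic tool for brevity, and paraphrases the tool names and descriptions rather than copying them from the notebook:

```python
# Sketch: define tools, contrast bind_tools with a ReAct agent.
# Assumes OPENWEATHERMAP_API_KEY is set in the environment and that
# the langchain-community and langgraph packages are installed.
from langchain_community.tools import YouTubeSearchTool
from langchain_community.utilities import OpenWeatherMapAPIWrapper
from langchain_core.tools import Tool
from langgraph.prebuilt import create_react_agent

weather = OpenWeatherMapAPIWrapper()  # reads OPENWEATHERMAP_API_KEY
youtube = YouTubeSearchTool()

tools = [
    Tool(
        name="weather_search",
        description="Get the current weather for a given location.",
        func=weather.run,
    ),
    Tool(
        name="youtube_search",
        description="Find YouTube videos relevant to a query.",
        func=youtube.run,
    ),
]

# Step six: the bare LLM with bound tools picks a tool and extracts
# arguments, but never actually executes the call.
llm_with_tools = llm.bind_tools(tools)
print(llm_with_tools.invoke("What are some YouTube videos about Greece?").tool_calls)

# Step seven: the ReAct agent executes tool calls and synthesizes an answer.
agent = create_react_agent(llm, tools)
result = agent.invoke(
    {"messages": [("human", "What are some YouTube videos about Greece?")]}
)
for message in result["messages"]:
    message.pretty_print()  # human, AI, tool, and final AI messages
```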
Okay, so now that we've set up our agent, let's ask the same user query: what are some YouTube videos about Greece? Awesome. As you can see, we have the human message, which contains our initial user query. We also have the AI message, which is the one you previously saw with the LLM and the bind_tools method. But the interesting part here is that our AI agent actually calls the tool, as you can see by the tool message. Once it receives that tool response, it's able to formulate a synthesized response. Here it is within the AI message: "Here are some YouTube videos about Greece," with each YouTube video title along with its URL, two URLs in this case. Wonderful.

Okay, so now we can move on to testing whether the AI agent will know when to use tools for non-YouTube-related queries. Let's ask a weather-related question: what is the weather in Athens, Greece? Let's see what we have here. In the human message, we see our initial user input. In the AI message, we see that the weather search tool was selected appropriately and that the argument of Athens, Greece was also extracted correctly. Then, in the tool message, we see a thorough description of the weather in Athens, Greece, retrieved from the weather tool. Finally, in the AI message, we see a synthesized response that tells us the temperature, the wind speed and humidity, whether there's any rain, and the cloud cover in Athens, Greece.

Okay, so finally, let's test whether the agent will know to use the e-commerce marketplace search tool when provided with this user query: we will ask the agent to find some suitcases for less than 10,000 cents. You might be wondering why I'm using US cents instead of dollars. That's one thing to keep in mind when using the Ionic search tool: it uses cents and not dollars, and the equivalent of 10,000 cents is $100. So just keep that in mind when running your user queries. We can go ahead and run this now. We have our human message. We then have our AI message, where we see that the Ionic commerce shopping tool was correctly selected and that the arguments of suitcases under 10,000 cents were also successfully extracted. Then, within our tool message, we should see suitcases under $100, and we can verify this within our synthesized AI message. We see "Here are some suitcases under 10,000 cents": the first result is a suitcase at the price of $80, and the next one is a suitcase that costs $34. As you can see, these suitcases fit our criteria, and we have the synthesized response that we're looking for. There's definitely a lot of room here for creativity and to have fun with this, so feel free to test out different user queries on your own time.

And yes, this is it! This is how you create an AI agent using LangChain and LangGraph. To recap, we created a ReAct agent in Python using the watsonx API, and we used the Granite 3.1 8B Instruct model. We also used the YouTube, weather, and online shopping tools available through LangChain, and we were able to successfully receive the output we were looking for. I hope that this was fruitful, and I hope you enjoy running your own user queries on your own time. Thank you for joining me today.
...

Anna Gutowska

Data Scientist, Developer Advocacy @ IBM

Anna Gutowska's LinkedIn account


