Conf42 Prompt Engineering 2024 - Online

- premiere 5PM GMT

Optimizing IT Performance in Complex Systems: The Role of AI and Prompt Engineering

Abstract

Unlock the power of AI-driven prompt engineering to solve IT’s most persistent challenge: performance optimization. Learn how to overcome bottlenecks in complex systems, from cloud to IoT, with cutting-edge AI techniques that accelerate diagnostics and boost efficiency.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello everyone. Welcome to my talk, titled "Sisyphean Struggle: Navigating Performance Issues in Modern IT Landscapes." My name is Ravitej Veeramachaneni, and I work as a principal software engineer for a software services company. The main topic I'll be discussing today is how performance issues plague the IT industry, why they are such a tough challenge, and why it is a challenge that may never be fully solved. If you have any IT experience, you'll agree that performance issues are an ongoing thing. There's no such thing as users being completely satisfied with performance. Even if they're satisfied today, they won't be a year from now with the same systems and the same performance. So the idea of the talk is to look at how performance varies, the challenges it brings up, and how to overcome them. Let's get started.
First, a review of the contents we'll be discussing. Like I said, we'll discuss why IT performance remains a challenge for any line of business, even a software company, even one of the high-performing tech ones. You cannot rest on your laurels thinking your product is working fine today; you have to account for many unforeseen things. The next thing is the complexity conundrum in IT systems: how the complexity of multiple systems working together impacts performance. We'll also discuss the rising expectations of users when it comes to performance. Like I said at the beginning, even if users are happy today, they will not be a year or six months from now with the same system and the same performance.
Users' expectations keep increasing, and technology keeps advancing. Even if your system performs the same as it did a year ago, you might see competitors come in and perform at a better level than your product. So with advancements in technology, you will always want to be at the top when it comes to performance. Then there's the human element in performance challenges, things like how skilled labor impacts performance. We'll also talk about resource constraints and the cost of optimization. You can always spend a lot of money to improve performance, but obviously it comes at a cost, so how do you go about balancing the two? We'll also talk about advanced solutions in performance management: what the advanced techniques to monitor performance are, and so on. Then come the challenges to definitive solutions. There might be solutions today for tackling performance issues, but we'll look at what the challenges are with those solutions, what the implications are for IT professionals, and the directions for future research. Finally, I'll conclude the paper with why performance will always be an ongoing concern.
So, an introduction to why IT performance remains a challenge. Performance issues in IT systems continue to challenge professionals despite major advancements in technology. Think of it: we should have put performance issues behind us with all the technological advancements we have today, but that's not what I find. No matter which company I work for, you're always debugging something related to performance. It could be the network, the data load, the database, the UI, and so on. Performance is one of the most complex challenges to solve because so many systems are involved.
Let's say something that was working fine yesterday suddenly stops working today. It is hard to pinpoint a reason why. You restart your system and things are back to working as normal. You can look at the logs and do all the debugging, but because so many systems are involved, you sometimes cannot pinpoint why there was a lag in performance at a certain time with a certain system. It's a difficult thing to pinpoint, and like I said, it is an evolving challenge. It will still be a challenge 10 or 20 years from now.
Then there is the impact performance issues have on business. In any industry, if the IT systems are not performing well, it will have a business impact. Even more so if your company is a software services provider or lives in the technology space. Take Google, X, or any social media platform: if your website goes down for even a few minutes, it can cost you millions of dollars. So performance does have an impact on the business.
What are the key causes? Performance challenges mainly arise from increasing system complexity as you integrate more systems. If you're a developer, you understand there is a DevOps side and a deployment side to your code, and then there is the aspect of writing good code. Because complex systems are involved, any of those could lead to performance challenges. So again, the purpose of this presentation is to explore the complex factors causing persistent IT performance issues, evaluate the solutions that we have today, and suggest strategies for navigating these challenges effectively.
So let's discuss the complexity conundrum in IT systems. Complex interconnected environments are one of the reasons for performance issues.
Modern IT landscapes consist of diverse interconnected components, including cloud infrastructure, microservices, and lots of Internet of Things devices. These interconnected environments and complex integrations introduce new bottlenecks. Bottlenecks multiply with complexity: increased distribution and interdependency amplify the risk of performance bottlenecks across systems. Like I said a few minutes ago, it's a challenge to isolate the root cause of a performance problem. Most of the time, what ends up happening is this: let's say you're having a database issue. You restart the database and the performance issue is gone. The issue is gone, but we still cannot pinpoint why there was an issue in the first place, because performance issues are hard to reproduce. That is the challenge of root cause isolation. With each component introducing its own set of performance characteristics, isolating and troubleshooting root causes becomes a demanding task. An example would be a delay in data processing within a microservice that cascades through an entire application. Another good example, like I said, is the database: if it slows down, the data being fetched slows down, and that has an impact on your reports or whatever else you're fetching from the database.
Next, let's talk about the ever-moving target, which is rising expectations and advancements. User expectations are always changing; that's why we call it an ever-moving target. As consumer technology advances, so do user expectations for system speed, responsiveness, and reliability. Just think back to the phones we had 10 years ago. They were a major advancement compared to 20 years ago, but we cannot use those devices today because we demand more speed. So technological advancements also introduce new demands.
Again, the classic example would be your mobile phone. With advancements in chip design, CPUs, and GPUs, phones always perform better than the previous generation. The performance might be only marginally better year over year, but when you compare it to 10 years ago, you see a drastic improvement. Put another way, you cannot take a phone that was launched 10 years ago and use it in today's world. The applications designed for phones today wouldn't work on hardware from 10 years ago. Changing standards are another moving target: what was once considered acceptable is now deemed subpar as expectations rise to keep pace with technological progress. Another example, moving away from the mobile phone, would be AI applications like real-time video processing, which demand high computing power. You get access to 4K, 8K, and we don't know what else in the future. As you unlock higher resolutions, that demands more processing power, and so you need more resources to solve that challenge.
Then there's the human element in performance challenges. One aspect is the impact of code and configuration. Inefficient coding practices obviously can lead to suboptimal performance, and if you use suboptimal algorithms or have misconfigurations, you will see performance issues in those cases too. Another aspect is variability in skills. Not everyone has the same level of skill set; the top engineers go to only a few companies. With variability in skills comes variability in the product that is put out there. The skill levels and practices of developers vary, which can lead to inconsistent performance within systems. Another human element is unpredictable user behavior.
A good example of that would be sudden spikes in traffic. Let's say a new product is launched, or YouTube suddenly lets all users download videos in the highest resolution. If everyone tried to download the same video, it would obviously put a strain on the system. Sudden spikes in traffic or unexpected data growth can create unforeseen system strain. An example of unexpected data growth, again from YouTube: at one point the displayed view count had a certain maximum range, and when some videos exceeded it, they had to change that. Even though that example is not itself a performance issue, you can see why sudden, unforeseen system strain could lead to performance issues. Another example would be a sudden increase in user activity during a promotional event, which can overload systems not designed for such spikes.
So then let's discuss resource constraints and the optimization cost element. A good example is limited IT resources. Organizations often operate under constraints, with limited access to hardware, software licenses, and skilled personnel. With limited resources, you have to work within your budget, and if you're not ready to pay for the newest chip or for the best talent, that will show in the performance of the products you put out there. So it's always a balancing act. IT teams must prioritize between optimizing performance, meeting security and compliance requirements, and staying within budget. Then there are cost-performance trade-offs: upgrading systems for optimal performance can be costly, requiring careful consideration to avoid overspending.
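One common way to handle the balancing act described above is a proportional auto-scaling rule with a hard upper limit that reflects the budget. The following is a minimal sketch, not something shown in the talk: the function name, the 60% target utilization, and the replica limits are all hypothetical values chosen for illustration.

```python
def desired_replicas(current_replicas, cpu_percent, target_percent=60,
                     min_replicas=2, max_replicas=10):
    """Suggest a replica count for the observed average CPU utilization.

    All defaults are illustrative assumptions: target_percent is the
    utilization we would like each replica to run at, and max_replicas
    stands in for the budget ceiling.
    """
    # Proportional rule: scale the replica count by observed / target load.
    # Integer ceiling division avoids floating-point rounding surprises.
    needed = -(-current_replicas * cpu_percent // target_percent)
    # Clamp to the allowed range so a spike cannot exceed the budget.
    return max(min_replicas, min(max_replicas, needed))


print(desired_replicas(4, 90))   # promotional spike: scale out to 6
print(desired_replicas(4, 300))  # extreme spike: capped at 10 by the budget
print(desired_replicas(4, 15))   # quiet period: scale in, but never below 2
```

The cap is where the cost-performance trade-off shows up: during an extreme spike the rule stops adding capacity at `max_replicas`, accepting degraded performance rather than unbounded spend.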
An example of resource constraints would be deciding between upgrading server capacity and allocating funds to other critical IT needs like cybersecurity.
So let's discuss some advanced solutions in performance management. The first is enhanced monitoring techniques. You can have best-in-class, advanced monitoring. Tools now use machine learning for real-time anomaly detection, providing insights that allow a proactive response. Instead of reacting to an issue, AI insights can tell you beforehand if you are going to have a performance issue, and that way you can do what is necessary to prevent it from happening. You also have sophisticated troubleshooting options available today. Some of them are AI-driven root cause analysis, distributed tracing, and chaos engineering. All of these help to proactively detect and address potential issues. Holistic optimization strategies include auto scaling and DevOps practices. These are mostly on the infrastructure side, where you configure your load balancers for auto scaling. DevOps practices allow more efficient resource use and a smoother user experience. An example of an advanced solution would be implementing auto scaling during peak usage to ensure that systems can handle increased traffic smoothly.
Now let's discuss the challenges to definitive solutions. The first is unintended consequences: optimizing one area can inadvertently cause issues in another, especially in tightly integrated systems. I've seen this a lot, where you fix one area of the code and break something else. So solving performance issues involves good communication with other teams. It also involves being aware of not just your own area of expertise, but of the potential impact on someone else when you change something within the system to make performance better.
For example, maybe you have upgraded to a new database connector, and that connector is not supported by one of the integrated systems. Now suddenly your integrations are failing, and so on. So unintended consequences are one challenge. Then you have the trade-off conundrum, which means improvements in one aspect of performance often require sacrifices in another, such as speed versus data consistency. That's pretty self-explanatory: you are optimizing for speed, but your data is missing or you're not getting the cleanest data. Then you have the innovation paradox: rapid technological advancements introduce new performance challenges just as solutions for older challenges emerge. Like I said, what you were satisfied with 10 years ago, you're no longer satisfied with today. Rapid technological advancements always keep the user wanting more. Another good example would be increasing security measures, which may add layers of authentication, potentially impacting system response time.
So what does this imply for IT professionals? It means you have to keep evolving your skill set. Professionals need to shift from reactive troubleshooting to proactive, predictive performance management techniques. You also need cross-functional skills. You can no longer just be an expert in your own area. A developer should know about the DevOps side; similarly, a DevOps engineer should be able to read code and tell bad code from good code. Understanding system architecture, data analytics, and emerging technologies is essential for modern performance management. Then there is the importance of communication, which I touched on earlier. Effectively conveying technical issues and solutions to non-technical stakeholders is critical to gaining support and resources. For many businesses, IT is not their main business. Take an oil and gas company, for example.
If they are going to spend more on IT to improve performance, they first need to understand how it impacts their business. That depends on you being able to communicate effectively. Maybe they don't understand that a breach in an IT system could negatively impact the business. Explaining things like that to them, and explaining how poor performance could cause a loss in business, is really important. The role of IT teams is no longer just maintenance, but also aligning their work closely with business objectives for better overall performance.
So what does future research look like, and where are we headed? One area is advanced predictive models: developing more context-aware predictive models that use big data analytics to anticipate potential performance issues. Another area of future research could be autonomous optimization, that is, research into self-adjusting systems that can automatically configure and allocate resources in real time. We don't have that today. Then there is human-AI collaboration: exploring effective ways to combine AI-driven insights with human expertise to enhance troubleshooting and optimization. An example would be AI techniques that notice trends and alert you to potential performance issues, so that you can proactively jump in and see what's going on. There is also standardization of metrics: creating standardized performance benchmarks to allow comparison across diverse IT environments. Future research could be targeted toward that. You do have benchmarks today, but the same benchmarks do not apply to all systems. An example would be industry-wide standards for system latency or data consistency, which would enable better benchmarking and performance evaluation.
So, in conclusion, the paper acknowledges persistent challenges. Performance issues are challenging, but they drive innovation in IT.
It's because users are never satisfied with performance that you always see innovation. You also have opportunities for proactive strategies: embracing proactive management, leveraging emerging technologies, and fostering a performance-oriented culture are crucial. What lies on the road ahead? Continued research into predictive analytics, autonomous optimization, and standardization of metrics will pave the way forward. And as a final note: even with all of these advancements, a final solution may be elusive, because at some point they will become the norm. There will always be new challenges as of that date and in the years after, which means the quest never ends and you keep trying to improve performance. So although a final solution may be elusive, the pursuit of optimal performance will continue to inspire advancements in IT, pushing professionals to innovate and excel. Thank you.
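To make the proactive monitoring idea from the talk concrete, here is a minimal sketch of anomaly detection on response times. It uses a rolling z-score, which is a simple statistical stand-in rather than the machine learning models the talk refers to. The function name, window size, threshold, and sample latencies are all hypothetical.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate strongly from recent history.

    A sample is flagged when it lies more than `threshold` standard
    deviations away from the mean of the previous `window` samples.
    """
    recent = deque(maxlen=window)
    anomalies = []
    for i, value in enumerate(samples):
        # Only judge a sample once a full window of history is available.
        if len(recent) == window:
            mu = mean(recent)
            sigma = stdev(recent)
            # Skip the check when history is perfectly flat (sigma == 0).
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                anomalies.append(i)
        recent.append(value)
    return anomalies


# Invented response times in milliseconds, with one spike at index 7.
latencies_ms = [101, 99, 100, 102, 98, 100, 101, 250, 100, 99]
print(detect_anomalies(latencies_ms))  # [7]
```

In a real deployment the flagged index would trigger an alert before users notice sustained slowness, which is the shift from reactive to proactive management described in the talk. Note that the spike itself enters the window afterwards and temporarily widens the spread, so a production system would likely need a more robust baseline.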

Ravitej Veeramachaneni

Principal Software Engineer @ Oracle
