Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello everyone.
Welcome to my talk, titled "Sisyphean Struggle: Navigating Performance
Issues in Modern IT Landscapes."
My name is Ravitej Uveeramachaneni.
I work as a principal software engineer for a software services based company.
The main topic I'll be discussing today is how performance issues plague
the IT industry, why that is a tough challenge, and why it is a challenge
that may never be fully solved.
If you have any IT experience, you'll agree that performance issues are an
ongoing thing.
There's no such thing as users being completely satisfied with performance.
Even if they're satisfied today, they won't be a year from now with the
same systems and the same performance.
So the idea of the talk is how performance varies, the challenges it
brings up, and how to overcome them.
So let's get started.
First, a review of the contents we'll be discussing.
As I said, we'll be discussing why IT performance remains a challenge for
any line of business, even a software company, even one of the
high-performing tech ones.
You cannot rest on your laurels thinking your product is working fine today;
you have to account for many unforeseen things.
The next thing is the complexity conundrum in IT systems: how the complexity
of multiple systems working together impacts performance.
We'll also discuss the rising expectations of users
when it comes to performance.
Like I said at the beginning, even if users are happy today, they will
not be a year from now, or six months from now, with the same system
and the same performance.
User expectations keep increasing, and with advancements in technology, even
if your system performs the same as it did a year ago, you might see
competitors come in and perform at a better level than your product.
So with advancements in technology, you will always want to be at the
top when it comes to performance.
Then there's the human element in performance challenges:
things like how skill levels impact performance.
Then we'll talk about resource constraints and the cost of optimization.
You can always spend a lot of money to improve performance, but
obviously it comes at a cost.
So how do you go about balancing the two?
We'll also talk about advanced solutions in performance management:
what the advanced techniques to monitor performance are, and so on.
Then, challenges to definitive solutions.
There might be solutions today for tackling performance issues,
but what are the challenges with those solutions?
We'll look at what the implications for IT professionals are, and at future
research directions when it comes to performance.
Finally, I'll conclude the paper with why performance
will always be an ongoing thing.
So, an introduction as to why IT performance remains a challenge.
Performance issues in IT systems continue to challenge professionals
despite major advancements in technology.
Think about it: we should have put performance issues behind us with all the
technological advancements we have today, but that's not what I find. No matter
which company I work for, you're always debugging something related to performance.
It could be the network.
It could be your data load.
It could be a database.
It could be your UI, and so on.
Performance is one of the most complex challenges to solve because
so many systems are involved.
Let's say something that was working fine yesterday
suddenly stops working today.
It is hard to pinpoint a reason why.
You restart your system, and things are back to working as normal.
You can look at the logs and do all the debugging, but because so many
systems are involved, you sometimes cannot pinpoint why there was a
lag in performance at a certain time with a certain system.
So it's a difficult thing to pinpoint and, like I said, it is an evolving challenge.
It will still be a challenge 10 years from now, 20 years from now.
Then there is the impact performance issues have on business. In any industry, if
the IT systems are not performing well, it will have a business impact.
Even more so if your company is a software services provider or lives in
the technology space: take Google, X, or any social media platform.
If your website goes down for a few minutes,
it can cost you millions of dollars.
So performance does have an impact on the business.
As for the key causes, performance challenges mainly arise from
increasing system complexity.
As you integrate more systems, complexity grows. If you're a developer, you understand
there is a DevOps side, a deployment side to your code, and then
there is the aspect of writing good code.
Because complex systems are involved, any of those could
lead to performance challenges.
Again, the purpose of this presentation is to explore the complex
factors causing persistent IT performance issues, evaluate the solutions that we
have today, and suggest strategies for navigating these challenges effectively.
So let's discuss the complexity conundrum in IT systems.
Complex, interconnected environments
are one of the reasons for performance issues.
Modern IT landscapes consist of diverse interconnected components, including
cloud infrastructure, microservices, and lots of Internet of Things devices.
These interconnected environments and complex integrations
introduce new bottlenecks. Bottlenecks multiply with complexity: increased
distribution and interdependency amplify the risk of performance
bottlenecks across systems.
Like I said a few minutes ago, it's a challenge to isolate what the
root cause of a performance issue was.
Most of the time what ends up happening is, let's say
you're having a database issue.
You restart the database and the performance issue is gone.
The issue is gone, but we still cannot pinpoint why there was an issue
in the first place, because performance issues are hard to reproduce.
So it's a challenge of root cause isolation. With each component introducing its
own set of performance characteristics,
isolating and troubleshooting root causes becomes a demanding task.
An example would be a delay in data processing within a microservice, which can
cascade through an entire application.
Another good example, like I said, is a database: if
it slows down, the data being fetched arrives more slowly, and that
has an impact on your reports or whatever else
you're fetching from the database.
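The cascade described here can be illustrated with a small sketch. It assumes a request that calls each component one after another, so the end-to-end latency is simply the sum of the hops. The component names and millisecond values are hypothetical and chosen only for illustration.

```python
def end_to_end_latency_ms(hops):
    """Total latency of a request that calls each hop sequentially."""
    return sum(hops.values())


# Hypothetical per-hop latencies in milliseconds for one report request.
normal = {"ui": 20, "api_service": 30, "report_service": 50, "database": 40}

# Only the database slows down; every other hop is unchanged.
degraded = dict(normal, database=400)

print(end_to_end_latency_ms(normal))    # 140
print(end_to_end_latency_ms(degraded))  # 500
```

Only one component changed, yet the user sees the whole request slow down, which is part of why the root cause is hard to isolate from the outside.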
Let's also talk about the ever-moving target, which is rising
expectations and advancements.
User expectations are always changing; that's why we call
it an ever-moving target.
As consumer technology advances, so do user expectations for system speed,
responsiveness, and reliability.
Think back to 10 years ago: the phones that we had then
were a major advancement compared to 20 years ago.
But we cannot use those devices today, because we demand more speed.
Technological advancements also introduce new demands.
Again, the classic example would be your mobile phone.
With advancements in chip design, in CPUs and GPUs,
phones always perform better than the previous generation.
The performance might be marginally better year over year, but when you
compare it to 10 years ago, you see a drastic performance improvement.
Put another way, you cannot take a phone that was launched 10
years ago and use it in today's world.
The applications designed for today's phones wouldn't
work on hardware from 10 years ago.
Changing standards are another moving target: what was
once considered acceptable is now deemed subpar as expectations rise
to keep pace with technological progress.
Another example, moving away from the mobile phone: AI
applications like real-time video processing demand high computing power.
You get access to 4K, 8K, and we don't know what else in the future.
But as you unlock higher resolutions,
that will demand more processing power,
and so you need more resources to solve that challenge.
Then there's the human element in performance challenges.
One thing pertaining to the human element is the
impact of code and configuration.
Inefficient coding practices obviously could lead to
suboptimal performance.
If you use suboptimal algorithms, or have misconfigurations,
in those cases too you would obviously see performance issues.
Then there's variability in skills.
Not everyone has the same level of skill;
the top engineers go to only a few companies.
With variability in skills
comes variability in the product that is being put out there.
The skill levels and practices of developers vary, which can lead to
inconsistent performance within systems.
Another human element would be unpredictable user behavior.
A good example of that would be sudden spikes in traffic.
Let's say a new product was launched, or YouTube suddenly
lets all users download videos in the highest resolution.
If everyone tried to download the same video, it would obviously
put a strain on the system.
Sudden spikes in traffic or unexpected data growth
can create unforeseen system strain. Unexpected data growth is growth
you haven't really accounted for. A good example again would be YouTube,
where, at one point, the displayed view count
had a certain maximum, and when some videos
exceeded it, they had to change that.
Even though that example is not a performance issue,
you can see why sudden, unforeseen system strain could
lead to performance issues.
Another example: a sudden increase in user activity
during a promotional event can overload systems not designed for such spikes.
So then let's discuss resource constraints
and the optimization cost element.
A good example of that would be limited IT resources.
Organizations often operate under constraints, with limited
access to hardware, software licenses, and skilled personnel.
With limited resources, you have to work within your budget, and if
you're not ready to pay for the newest chip, or for
the best talent, that will show in the performance of
the products that you put out there.
So it's always a balancing act.
IT teams must prioritize between optimizing performance, meeting
security and compliance requirements, and staying within budget.
Then there are cost-performance trade-offs: upgrading systems for optimal performance
can be costly, requiring careful consideration to avoid overspending.
An example of resource constraints would be deciding between upgrading server
capacity and allocating funds to other critical IT needs like cybersecurity.
So let's discuss some advanced solutions in performance management.
One of the advanced solutions is enhanced monitoring.
You can have best-in-class,
advanced monitoring techniques.
Tools now use machine learning for real-time anomaly detection, providing
insights that allow a proactive response.
So instead of reacting to an issue, AI insights can tell you beforehand if you
are going to have a performance issue.
That way you can do what is necessary to prevent it from happening.
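To make the idea of anomaly detection concrete, here is a minimal sketch. It is not how any particular monitoring product works: it swaps the machine learning model for a simple rolling z-score, and the function name, window size, threshold, and sample latencies are all assumptions made for illustration.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return indexes of samples that deviate from the preceding `window`
    samples by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu, sigma = mean(history), stdev(history)
        # Skip perfectly flat history to avoid dividing by zero.
        if sigma > 0 and abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical response times in milliseconds, with one spike at index 7.
latencies_ms = [100, 102, 98, 101, 99, 100, 103, 250, 101, 100]
print(find_anomalies(latencies_ms))  # [7]
```

A real tool would learn seasonality and trends rather than compare against a short fixed window, but the principle of flagging a deviation from recent behavior is the same.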
You also have sophisticated troubleshooting options
available in today's world.
Some of them are AI-driven root cause analysis,
distributed tracing, and chaos engineering.
All of these help to proactively detect and address potential issues.
Holistic optimization strategies include auto scaling and DevOps practices.
Auto scaling is mostly on the infrastructure side, where you configure your load
balancers for auto scaling.
DevOps practices allow more efficient resource use
and a smoother user experience.
An example of an advanced solution would be implementing auto scaling during
peak usage to ensure that systems can handle increased traffic smoothly.
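As a rough illustration of an auto scaling decision, the sketch below uses a simple proportional rule: scale the instance count by the ratio of observed CPU utilization to a target, then clamp it to a minimum and maximum. The function name, the 60% target, and the bounds are assumptions for this example, not settings from any specific platform.

```python
import math


def desired_instances(current, cpu_percent, target_percent=60,
                      min_instances=2, max_instances=20):
    """Proportional scaling rule: grow or shrink the fleet so that average
    CPU utilization moves toward `target_percent`."""
    wanted = math.ceil(current * cpu_percent / target_percent)
    # Clamp so the fleet never drops below a safe floor or exceeds the budget.
    return max(min_instances, min(max_instances, wanted))


print(desired_instances(4, 90))    # 6  (peak usage: scale out)
print(desired_instances(4, 30))    # 2  (quiet period: scale in)
print(desired_instances(15, 100))  # 20 (capped at max_instances)
```

The upper bound is where the cost trade-off discussed earlier shows up: it caps how much performance you are willing to pay for.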
Now let's discuss the challenges to definitive solutions.
First, unintended consequences: optimizing one area can inadvertently cause
issues in another, especially in tightly integrated systems.
I've seen this a lot, where
you fix one area of the code and break something else.
So solving performance issues involves good communication
with other teams.
It also involves being aware of not just your own
area of expertise, but of the potential impact of
changing something within the
system to make performance better.
It may impact someone else.
Maybe you have upgraded to a new database connector and that
connector is not supported by one of the integrated systems.
Now suddenly your integrations are failing, and so on.
So unintended consequences are one challenge.
Then you have the trade-off conundrum, which means improvements
in one aspect of performance often require sacrifices in another, such
as speed versus data consistency.
That's pretty self-explanatory: you are looking for speed,
but some of your data is missing, or you're not getting
the cleanest of data, and so on.
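One common place where the speed versus data consistency trade-off appears is caching. The sketch below is a hypothetical time-to-live cache: reads are fast while an entry is cached, but they can return stale data until the entry expires. The class name, the `loader` and `clock` parameters, and the sample data are all assumptions for illustration.

```python
import time


class TTLCache:
    """Serve cached values for `ttl_seconds` before reloading from the source."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader      # slow, always-fresh lookup (e.g. a database read)
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so the example needs no real waiting
        self._store = {}           # key -> (value, expires_at)

    def get(self, key):
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now < hit[1]:
            return hit[0]          # fast path, possibly stale
        value = self._loader(key)  # slow path, fresh value
        self._store[key] = (value, now + self._ttl)
        return value


# Hypothetical data source and a fake clock we can advance by hand.
database = {"balance": 100}
now = [0.0]
cache = TTLCache(database.__getitem__, ttl_seconds=30, clock=lambda: now[0])

first = cache.get("balance")   # 100, loaded from the source
database["balance"] = 80       # the source changes
stale = cache.get("balance")   # still 100: fast but inconsistent
now[0] = 31.0                  # entry expires
fresh = cache.get("balance")   # 80: consistent again, at the cost of a slow read
print(first, stale, fresh)     # 100 100 80
```

A longer time-to-live buys more speed and more staleness; a shorter one does the opposite, which is the sacrifice this trade-off is about.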
Then you have the innovation paradox: rapid technological advancements introduce
new performance challenges just as solutions for older challenges emerge.
Again, like I said, what you were satisfied with
10 years ago, you're no longer satisfied with today.
Rapid technological advancements always keep the user wanting more.
Another good example: increasing security measures may add
layers of authentication, potentially impacting system response time.
So what does this imply for IT professionals?
It means you have to keep evolving your skill set. Professionals need to
shift from reactive troubleshooting to proactive, predictive performance
management techniques.
You also need to have cross-functional skills.
You can no longer be an expert in just your own area: a developer
should know about the DevOps side.
Similarly, a DevOps engineer should be able
to read code and tell
bad code from good code, and so on.
Understanding system architecture, data analytics, and emerging
technologies is essential for modern performance management.
Then there's the importance of communication, which I touched on earlier.
Effectively conveying technical issues and solutions to non-technical stakeholders is
critical to gaining support and resources.
For many businesses, IT is not their main business.
Take an oil and gas company, for example.
If they are to spend more on IT to improve performance, and so on, they
first need to understand how it impacts their business.
So you need to be able to communicate effectively. Maybe they don't understand
that a breach in an IT system could negatively impact the business.
So explaining things like that to them, explaining
how poor performance could
cause a loss in business, is really important.
The role of IT teams is no longer just maintenance, but also to align their
work closely with business objectives for better overall performance.
So what does future research say, and where are we headed?
There are advanced technologies on the way.
One is predictive models: developing more context-aware predictive
models that use big data analytics to anticipate potential performance issues.
Another area of future research could be autonomous
optimization, that is, research into self-adjusting systems,
which we don't have
today, that can automatically configure and allocate resources in real time.
Then there's human-AI collaboration: exploring effective ways to
combine AI-driven insights
with human expertise to enhance troubleshooting and optimization.
An example would be AI techniques that notice trends and
alert you to potential performance issues, so you can proactively
jump on and see what's going on.
Then, standardization of metrics: creating standardized performance
benchmarks to allow comparison across diverse IT environments.
Future research could be targeted towards
that. You do have benchmarks today, but the same benchmarks do
not apply to all systems.
An example would be industry-wide standards for system latency or
data consistency, which would enable better benchmarking and performance evaluation.
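As one example of what a shared latency metric might look like, the sketch below summarizes response times as percentiles (p50, p95, p99) using the nearest-rank method. The choice of percentiles, the function name, and the sample values are assumptions; the talk does not name a specific standard.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("percentile() needs at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical response times in milliseconds, with one slow outlier.
latencies_ms = [12, 15, 11, 14, 200, 13, 16, 12, 15, 14]
summary = {f"p{p}": percentile(latencies_ms, p) for p in (50, 95, 99)}
print(summary)  # {'p50': 14, 'p95': 200, 'p99': 200}
```

Reporting the same percentiles, computed the same way, is what would let two very different systems be compared on latency; an average alone would hide the slow outlier that the p95 exposes here.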
So in conclusion, the paper acknowledges persistent challenges.
Performance issues are challenging, but they drive innovation in IT.
It's because users are never satisfied with performance
that you always see innovation.
You have opportunities for proactive strategies.
Embracing proactive management, leveraging emerging technologies, and fostering a
performance-oriented culture are crucial.
What lies on the road ahead?
Continued research into predictive analytics, autonomous optimization,
and standardization of metrics will pave the way forward.
And as a final note: even
all of these advancements will at some point become the norm,
and then there will always be new challenges as of that date and
years from now, which means the quest never ends and you
keep trying to improve performance.
So although a final solution may be elusive, the pursuit of optimal
performance will continue to inspire advancements in IT, pushing
professionals to innovate and excel.
Thank you.