Transcript
This transcript was autogenerated. To make changes, submit a PR.
My name is Dileesh. Welcome to my talk on optimizing network performance
and automation using Apache Airflow, Python, and data science.
So let's get started.
So I'll cover the slides in brief, and then I'll go through how Apache Airflow
works and give a short demo of it.
I have it installed on my local machine, so I'll just
give a quick demo of it later.
So let's get started with the presentation.
So, introduction to network performance automation. There are several modern
challenges that networks face right now, due to the increase in complexity
and scale of devices and the new technologies coming into place.
Everything needs a network, right?
Some of these challenges can be solved by automation.
The main driver for that automation is Python, and the orchestration
platform best built for running Python workflows is Apache Airflow.
So for network challenges, there are several components, right?
Coming to complexity: there are multi-layered architectures with a
mix of legacy and modern systems.
There are different devices that we connect to and different
protocols that they follow.
There are different technologies that exist within a single network as well,
and the network has to support all of these technologies in tandem.
Coming to scale: due to the proliferation of the internet,
the connected devices that we use have exploded in number.
For example, from your watch to your phone, headphones, laptops,
and any IoT devices, and it's still growing.
The prediction is in the billions by 2030, and density has increased as well.
If you take a grid of a certain area and compare the devices
that exist right now with previous generations of devices,
the density has increased a lot.
And if you want to have a smart city or industrial IoT, that puts real
pressure on the current infrastructure and tests its limits.
There are other challenges such as resource allocation, where you
have to distribute bandwidth, compute power, and storage across all these
devices to avoid bottlenecks, and latency as well.
Coming to latency, you need the network to perform in a very
optimized way for these devices to stream whatever content they are streaming.
For example, if you have an AR/VR device that is streaming data, you don't
want any lag in the network.
So all of these are challenges that networks currently face.
So, the role of automation in improving network performance.
Automation is one of the cornerstones for scaling and
optimizing network performance.
It can do proactive fault management, where it detects issues with the
network and reroutes traffic around degraded devices, conditions, and outages.
You can have automated scripts run against the network and resolve the issue.
And there are several Python-powered automation tools.
Python has played a big role previously and plays a big role right now
within networks for proactive fault management.
You can use Ansible, Netmiko, NAPALM.
These are all tools powered by Python.
Coming to dynamic scaling and self-healing networks: whenever we talk
about these, there are software-defined networks, SDNs, and network
function virtualization, NFV, which use Python as well.
Python scripts drive a lot of network behavior and network performance.
They interact with different devices and frameworks, routers, or
any network device that you have.
These devices have controllers that support Python, so you can use Python
to interact with the device directly, and there are built-in libraries
that you can leverage to interact with these devices.
Coming to some real-world examples: you can use Apache Airflow to
orchestrate your network workflows, including provisioning, monitoring, and
scaling, such as SIM provisioning, network speed testing, and semantic similarity.
These are some of the examples we've used in the
past for some of my clients,
and it's proven to be a very good,
very optimized way of provisioning, monitoring, and scaling, and of
improving network performance.
So coming to Apache Airflow.
So what is Apache Airflow?
It's an open source framework, an orchestration tool.
Basically, whatever Python scripts you have, you define them in a way that
they're orchestrated, so that you can run all of your tasks within Python.
The core component of Apache Airflow is the scheduler, which schedules
your Python workflows the way you specify you want to run them.
It's built in Python itself, built on top of Python.
It uses something called DAGs, directed acyclic graphs, where you can
schedule all your components in an acyclic manner.
For example, if you want to run three functions, you can define which
function runs first, second, and third, and so on and so forth, right?
If you want to run the third one first, then the first, then the second,
you can define it that way, making workflows dynamic, reusable, and scalable.
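To make that concrete, here is a minimal sketch of a DAG with three tasks whose order is set explicitly. The DAG name and task functions are made up for illustration, and the imports assume Airflow 2.x:

    from datetime import datetime
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def first():
        print("first function")

    def second():
        print("second function")

    def third():
        print("third function")

    with DAG(dag_id="ordering_example", start_date=datetime(2024, 1, 1),
             schedule_interval=None, catchup=False) as dag:
        t1 = PythonOperator(task_id="first", python_callable=first)
        t2 = PythonOperator(task_id="second", python_callable=second)
        t3 = PythonOperator(task_id="third", python_callable=third)

        # Run the third function first, then the first, then the second,
        # exactly as described above; Airflow just follows the arrows.
        t3 >> t1 >> t2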
Coming to the key features.
Dynamic workflow authoring: you can use Python code to define tasks,
their dependencies, and their schedules.
In the configuration, you can specify how often you want to
run it, just like a cron job, right?
You can write a cron expression for it to run to your liking: every
once in a while, once a day, or however you like.
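For instance, the schedule is just an argument on the DAG, given as a cron expression or one of Airflow's presets (the DAG name here is hypothetical):

    from datetime import datetime
    from airflow import DAG

    with DAG(dag_id="daily_speed_test", start_date=datetime(2024, 1, 1),
             schedule_interval="0 2 * * *",   # cron expression: every day at 02:00
             catchup=False) as dag:           # presets like "@hourly" or "@daily" also work
        ...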
Extensibility: it can integrate with other systems.
You can have custom operators and hooks, and there are already several
built into Apache Airflow.
Everything is written in Python as well, and you can have custom operators
that interface with Azure, AWS, or any other cloud provider or any other
system that you would like.
Scalability: it supports distributed execution as well.
It has components like the Celery, Kubernetes, or local executors that can
perform tasks in a scalable manner.
You basically start out with the local executor.
These are backend functionalities; an executor is just the component behind
the Apache Airflow framework that runs your workflows.
Initially you start with the local executor, where you execute the tasks on
one machine, and if you don't care about the scalability part yet, you can
start local and then scale out.
That's how you make the system distributed: you start out with vertical
scaling, where you have a single machine and increase the resources in that
machine, and then you scale out to multiple systems as your scale increases.
Monitoring and alerting: it has integrations with different platforms such
as Slack, email, Microsoft Teams, or Discord.
All of these are built in, so you can use them to alert you.
And the web UI, which I'll get into later, is a really intuitive
and really great way to visualize, manage, and debug workflows.
Coming to the benefits: it simplifies your Python code a lot.
If you have, let's say, a thousand lines of code, it's really hard to
manage and maintain that code, to know what went wrong and where it went wrong.
Of course you can add some logging statements and find out what's going on,
but Apache Airflow gives you a really intuitive web UI where you can
log in, check how your DAG performs, and see exactly where it went wrong.
So it enhances transparency with logs and visualizations, so you can see
what's going on with your workflow, and it reduces manual intervention.
So, use cases of Apache Airflow in network operations.
You can have Apache Airflow run provisioning: it automates tasks like
provisioning physical or virtual SIMs, managing IMSI ranges, and updating
systems like SRP and others.
Once I implemented this, it turned out to execute SIM provisioning much
faster than the legacy systems.
Apache Airflow gave us a platform where we could leverage Python and different
parameters of the provisioning process to really improve the speed for the
FirstNet HSS. FirstNet is a first responder communication network, where all
of the communication for first responders happens through that FirstNet HSS.
And it turned out to be really great for that use case.
Network speed testing: gathering telemetry data, analyzing key performance
indicators, triggering on anomalies in the network.
It turned out to be a great use case for speed testing as well.
Coming to configuration management: deploying configurations to routers,
switches, and other servers and services.
It makes it very easy to configure all of the devices within your network.
And predictive analytics as well: with the way you can write an Apache Airflow
workflow, you can have all of your ETL tasks performed within the pipeline
itself, and you can have your machine learning models or semantic search
models there as well.
So, how Apache Airflow helps automate workflows, enhance
efficiency, and optimize resource use.
Coming to automation: Airflow eliminates manual intervention,
which is basically how automation works, right?
You trigger your workflow based on the scheduler or external events;
it runs all the tasks you define and picks up everything you want it to do.
Efficiency: if you run Celery as the executor in the backend, it can
parallelize tasks across however many cores you have in the system,
so you can process multiple requests at once and reduce execution time.
It also retries failed tasks intelligently; there is a built-in mechanism
where you can handle retries and failed processes and ensure that the
workflow is reliable.
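As a sketch of that built-in retry behavior, with illustrative values and a made-up DAG name:

    from datetime import datetime, timedelta
    from airflow import DAG

    default_args = {
        "retries": 3,                         # re-run a failed task up to 3 times
        "retry_delay": timedelta(minutes=5),  # wait 5 minutes between attempts
    }

    with DAG(dag_id="reliable_workflow", start_date=datetime(2024, 1, 1),
             schedule_interval="@daily", catchup=False,
             default_args=default_args) as dag:
        ...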
Then resource optimization: there is a concept called task queues, where you
can manage and balance workloads across multiple VMs, and you can use the
Kubernetes executor to scale out pods for containerized workflow execution.
So you can handle your resource allocation as well.
And for automating workflows for large SIM provisioning
projects, you can use Airflow, as we discussed earlier.
So why use Python for Airflow?
Airflow is built on top of Python itself, but within Airflow
you write Python again.
The intuition for using Python is basically its easy-to-learn
syntax and extensive libraries.
It has a really good open source ecosystem, and all of the machine learning
and data-related work happens in Python nowadays, right?
If you want to talk about Spark, there is PySpark within Python itself,
where you can use Spark modules and Spark frameworks from Python.
There are several libraries and frameworks that let you manage all your
network automation needs within Python.
Netmiko or Paramiko, if you've heard about SSH communication, are
frameworks that help you with SSH-based communication to configure network
devices and interface with them.
So you can automate tasks like device configuration, backups, and troubleshooting.
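A rough sketch of what that looks like with Netmiko; the device details here are placeholders, and in practice credentials would come from an inventory or secrets store:

    from netmiko import ConnectHandler

    # Placeholder device definition for illustration only.
    device = {
        "device_type": "cisco_ios",
        "host": "10.0.0.1",
        "username": "admin",
        "password": "secret",
    }

    conn = ConnectHandler(**device)
    backup = conn.send_command("show running-config")       # grab a config backup
    status = conn.send_command("show ip interface brief")   # quick interface check
    conn.disconnect()
    print(status)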
You also have NAPALM and PySNMP; you can check those out later.
I'll just give a brief overview: NetworkX is a really good library
where you can analyze and visualize network graphs, topology
mapping, and simulation of networks.
Pandas and NumPy are for network telemetry and logs, trends, and anomaly
detection. Basically, with pandas you can create data frames and process
that data within Python.
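As an illustration of that kind of telemetry analysis, here is a small pandas sketch; the file name, column names, and threshold are made up:

    import pandas as pd

    # Hypothetical telemetry export with a timestamp and a latency_ms column.
    df = pd.read_csv("telemetry.csv", parse_dates=["timestamp"])

    # Flag samples that sit well above the rolling baseline as anomalies.
    baseline = df["latency_ms"].rolling(window=60, min_periods=10).mean()
    spread = df["latency_ms"].rolling(window=60, min_periods=10).std()
    df["anomaly"] = df["latency_ms"] > baseline + 3 * spread

    print(df.loc[df["anomaly"], ["timestamp", "latency_ms"]])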
So, predictive analytics for network optimization, and the role of data
science in network optimization.
You can identify trends and anomalies in network performance,
you can predict failures before they occur and reduce downtime, and you
can optimize resource allocation, such as bandwidth and processing
power, for network automation.
Machine learning: there are several libraries built in Python for machine
learning, so if you're trying to build a machine learning model, you
don't need to do it from scratch.
Python probably has a library that already does that.
There is scikit-learn, and several other libraries within Python for
machine learning: you have TensorFlow and PyTorch, and pandas and NumPy
for your data processing.
You can use these libraries for use cases such as
failure prediction or anomaly detection.
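A toy sketch of the failure-prediction idea with scikit-learn; the features and labels here are synthetic stand-ins, not real network data:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    # Stand-in data: 500 samples of KPI features (e.g. latency, loss, utilization)
    # and a label for whether the element failed in the following hour.
    rng = np.random.default_rng(0)
    X = rng.random((500, 3))
    y = (X[:, 0] + X[:, 1] > 1.2).astype(int)

    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X, y)
    print(model.predict(X[:5]))   # predicted failure flags for the first few samples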
And automation: you can use those predictions with Apache Airflow.
Python integrates with basically everything, right?
Any network device, any model, any third-party system that has API support
probably has a library you can leverage.
It enables real-time decision making through trigger events.
Let's take an example of a workflow DAG: you collect telemetry data,
you run a machine learning model for prediction, and then you trigger
some kind of automated action.
So, automation of network configuration and provisioning.
Using Apache Airflow and Python, you can do some complex things, such as
SIM provisioning and network setups.
You can set up a script that provisions to systems like
HSS, FirstNet, Gport, HLRmind, and handles range allocation for SIMs and MCs.
Coming to the benefits: it's efficient, accurate, and scalable, and on the
monitoring side of Apache Airflow, everything I mentioned just now, those
are the benefits of using Apache Airflow.
If you take the example of SIM provisioning within Apache Airflow:
you collect all the user requests, SIM types, and MC ranges, and you execute
the provisioning tasks against different systems.
Each system has one component that uses either Telnet, LDAP, SSH, or whatever
way of communication that system supports.
You connect to that system and provision all the SIMs, you validate them,
you log the status, and you can have a workflow step that notifies
someone of the completion.
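A structural sketch of that provisioning flow, with all the system-specific work stubbed out and the system names purely illustrative:

    from datetime import datetime
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def make_task(name, fn):
        return PythonOperator(task_id=name, python_callable=fn)

    with DAG(dag_id="sim_provisioning", start_date=datetime(2024, 1, 1),
             schedule_interval=None, catchup=False) as dag:
        collect = make_task("collect_requests", lambda: print("gather SIM requests and ranges"))
        # One provisioning task per backend system, run in parallel.
        provision = [
            make_task(f"provision_{name}", lambda name=name: print(f"provision SIMs on {name}"))
            for name in ("hss", "hlr", "ldap_store")
        ]
        validate = make_task("validate", lambda: print("validate provisioned SIMs"))
        notify = make_task("notify", lambda: print("notify requester of completion"))

        collect >> provision >> validate >> notify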
So that's basically it for the optimization part.
Let's dive right into how Apache Airflow works.
Airflow is an open source tool, and there is really good
documentation on how it works.
Providers: these are all the active providers that Airflow supports right now.
So you can have a connection to Cassandra, Amazon AWS,
a lot of things, basically.
If you take Amazon, for example, the provider documentation lists all the
versions for each of their services.
Installing it locally: I've installed my version using Docker.
There is really good documentation on how to do it,
so you can play around with it as well.
But first let's make sure our Docker instance is running right now.
The Docker engine is paused, so let's resume that.
I already have Docker installed.
Let's go to a terminal and check whether it's already running.
No. Yes.
I installed it earlier; let's check.
I have the Docker Compose file, so let's do docker compose up.
It creates all the containers and starts up all the services that are
required for Airflow. One of the good things is all of the containers are
defined in Docker Compose, so you don't need to create each instance
separately if you're planning to run a parallelized version of Airflow.
So it's starting up. Let's see.
It's started all the Airflow components: Redis, Postgres.
Postgres 13 is the database that the Airflow backend uses for storing metadata.
And the web server.
It looks like all the services are running except for one Airflow container.
Let's see what's going on.
There you go, I think it started now.
Let's go to localhost and check if it's running.
Let's go to localhost and check if it's running.
So I see the instance is running.
So this is a UI that you get to, you get to log in screen, but default
or airflow and airflow credentials.
so here you have your dags, dags are the workflows that you can schedule, to run in
Python and it shows all the dags, right?
And there are two sections here called active and Post.
So you can have your workflow and you can either have it run or, in
a pause state or an unpause state.
If you want to run it, you can click on this button.
It will make it active.
So there are two, active workflows right now.
If you want to turn it off, you can just turn it off using this button.
So it gives a really good user interface.
You can manage everything here.
Coming to security, it has users, roles, actions, permissions, everything,
so you can manage user permissions.
If you go to Users, you can find all the users here, create users, and drill
down into how you want to manage RBAC for users. You have the Admin section
as well, where you have all your variables and you can create connections to
different services: you can see it has everything related to Amazon AWS,
Azure, Docker, or anything, so it has a lot of built-in integrations.
You can interface with a SQLite database as well.
If you select SQLite, it gives you a connection form for SQLite:
you give it a host, login, and password, and it should be able to connect,
and there is an option to test as well.
So you can create a connection and test it out. Coming to workflows:
this is an example workflow that I ran earlier,
and it gives you a really good interface.
It took 20 seconds to run.
The run type is manual because the schedule is none;
there is no schedule set up for this.
If I run this again, it should automatically trigger a new run.
If you check the graph view, this is how the graph looks.
First it goes here, the logical output is to run this,
and once this is run, you go to this step.
After this executes, you run all of these steps.
So now this is done, right?
Let's drill down into each of these.
This is nothing but the Python code.
Basically, you define all of your functions here;
these are just examples that Airflow provides.
This is the way you specify what the DAG is about.
And this is a function, print_context: it's a task,
an operator, where you print the context of what's
going on within the function.
There are other functions here as well.
Let's see what sleep_for_3 does: it just sleeps for three seconds.
So this is the task sleep_for_3.
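Roughly, the definitions behind those tasks look like the following; this is a sketch in the spirit of Airflow's bundled examples, not the exact file:

    import time
    from datetime import datetime
    from pprint import pprint
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def print_context(**context):
        # Dump the runtime context Airflow passes into the task.
        pprint(context)
        return "Whatever you return gets printed in the logs"

    def sleep_for(seconds):
        # Simulate work by sleeping for a few seconds.
        time.sleep(seconds)

    with DAG(dag_id="context_and_sleep_demo", start_date=datetime(2024, 1, 1),
             schedule_interval=None, catchup=False) as dag:
        run_this = PythonOperator(task_id="print_the_context", python_callable=print_context)
        sleep_3 = PythonOperator(task_id="sleep_for_3", python_callable=sleep_for,
                                 op_kwargs={"seconds": 3})
        run_this >> sleep_3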
You can drill down into the event log to see what happened, see the
details, and go into each individual log to
check what happened there as well.
If you check the logs, this is nothing but the log for this task:
you have the context and variables,
and you have what happened within the operator.
And if you click here, you can check out what's going on there as well.
So that is it for the Apache Airflow UI.
There is an interactive API as well; everything is built in here.
Let's open Swagger.
It automatically uses basic auth, airflow and airflow.
You can use these APIs externally as well: if you have Postman and you want
to test it out, you can hit port 8080 and query a DAG by its DAG ID.
It asks for a few things, basically the DAG ID,
and you can select any DAG ID from here.
Let's check what the DAG ID is for this one.
The DAG ID is example_python_operator,
so let's copy it and paste it here.
Let's do a GET.
It gives you all the information about this DAG:
what the DAG is about, the file token,
where it is, and what's going on with the DAG.
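Outside Swagger, the same call works from Python too, assuming the local Docker setup on port 8080 with basic auth and the default airflow/airflow credentials enabled:

    import requests

    BASE = "http://localhost:8080/api/v1"
    AUTH = ("airflow", "airflow")   # default credentials from the Docker quick start

    # Fetch metadata for one DAG.
    dag = requests.get(f"{BASE}/dags/example_python_operator", auth=AUTH)
    print(dag.json())

    # Trigger a new run of the same DAG.
    run = requests.post(f"{BASE}/dags/example_python_operator/dagRuns",
                        auth=AUTH, json={"conf": {}})
    print(run.status_code, run.json())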
There is a lot more that I want to go over, but I don't think
I can cover it all in one session.
But, yeah, you can go to the documentation and find out more about Apache Airflow.
This is basically what I told you: you can run your workflows as code.
And security is built into Airflow as well; there is something called the
Fernet key that encrypts your passwords, usernames, everything.
So basically security is baked into Apache Airflow.
And this is how you set up a DAG: you define a task,
you give it an operator, and you ask Airflow to run that operator.
So that is pretty much it for Apache Airflow, Python, network
automation, and data science.
If you have any more questions, you can certainly reach out to me.
Thanks for joining me in this session.