Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello, everyone.
My name is Daniel, and I'm a data specialist, or so I like to call myself.
I work across data analysis, machine learning, and even as far
as natural language processing.
Also, I'm a data mentor and a programming mentor.
I tend to assist people that are looking to get into the data space
and also into the programming space.
Today, I'll be speaking on DataSense, DataCentric, which
I shortened to DataSense, an app that I built from scratch,
fully on the Streamlit framework in Python.
I'll be giving a live demo of the app and I'm going to get started.
Turn this on.
Here we go.
So, data analytics with Streamlit, by me.
Before DataSense, I had been using Power BI.
I've been using Tableau too, but mostly Power BI.
And I realized that you cannot really log in to Power BI if you
don't have an organizational email.
Beginners that are new to the field, that don't know anything
about programming or analysis, can't just get into Power BI if
they don't have one, for example students.
I felt like this is a very major constraint for those
learning, from my own experience.
I felt like I should create a hub
that will be easy for them to use, with a gentle learning curve for
them and for me as well.
So I did a lot of research and I came up with DataSense.
So it's a Python-powered solution for data exploration and visualization.
It simplifies the whole process of analysis, starting from
uploading your data down to creating a visualization.
I made analyzing the data set as straightforward as possible,
and it encourages self-learning.
So I found some problems, and I created a solution with DataSense.
I found that manual data analysis is time consuming.
When you get your data into Power BI, it's on you to actually pick the
column that has to be on the x-axis.
And when you put it in, it gives you an error without really telling you
that, oh, this is the wrong data,
this is the wrong column to put on this axis.
And it's so time consuming and error prone, and many of these tools are not user friendly,
especially for learners that are trying to get on the platform.
You probably have to go through learning via Udemy or something to actually
understand the app as it is built.
So I solved this problem by creating a very straightforward interface: you
get your data in, it takes you directly to the data, you make your analysis,
and you're out, as simple as possible.
I added an interactive dashboard section powered by Plotly.
Plotly is a visualization library in Python that gets you more
information on the data compared to the more popular Matplotlib.
For engineers that don't want to start doing so much analysis,
writing the code and all that,
I did an exploratory data analysis section, where you can just click on the
chart and you get your analysis done.
And you know the context of your data.
You know the skew of your numerical data, where it's aligned to.
You get your analysis and you go on further to write your machine learning code.
So some of the key features, the major features of the app:
the first is data upload and management.
You have the option to upload a CSV file, the most common, or upload JSON,
or you can upload an Excel spreadsheet, or you can use the sample data,
and quickly go on to get a descriptive analysis of your data.
You can get how many columns are in your data, how many
rows, and you can get the statistical measures of your data, the mean, the
mode, the standard deviation,
the 25th percentile and whatnot.
Then you can go to the automated exploratory data analysis
that I mentioned just now.
You can just click on the chart and do your box plot analysis.
Or probably you want to check the count plot,
how many of, say, each city are in your data. You can check it easily,
and then you can do your dashboard.
You can create a dashboard with up to four different
graphs, four different charts.
So now I'm going to log on to the app so I can show you the
demo of how everything works.
Let's get started.
So here we are, this is the app,
DataSense. Everything here was written by me.
I wrote everything, with the emojis and the fonts.
So if you're a data analyst short on time,
quickly go to the data page and test the functionality.
Now, at the forefront, you get the navigation.
This is where you can go to the data page.
On the data page, you upload your data.
You can go to the dashboard.
There's no data yet,
so it tells you that no data set is available:
please
go back, click on this link, the data page, to go upload the data.
Then you can also go to the exploratory data analysis,
which tells you the same thing.
Go to the data page, which we will do now.
Here you have two options, upload data.
Just browse your files, you can upload your data locally.
And, put your data in here, the one you want to analyze.
Or, you can use my sample data just to test the app out,
which I'm going to do now.
This text here, you can read it if you want.
It's just a quick guide on how the app works, but we're not going
to go through reading it now,
because I'll be doing the demo.
And then you can pick data.
Yeah, you can pick fire data or you can pick employee data, depending
on which you want to test out.
We'll go with the fire data first, and then it loads successfully.
At the beginning, it gives you everything about the data, the top five rows.
This is akin to calling head() in pandas.
So this is the code: head(5) on the DataFrame
is the code behind this view.
You can zoom in if you want, or you can download this table as it is.
But we're not going to do that now.
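As a minimal sketch of the preview described here, assuming a small made-up frame in place of the real fire data (the column names and values below are illustrative, not taken from the app):

```python
import pandas as pd

# Hypothetical stand-in for the sample fire data; names and values are made up.
df = pd.DataFrame(
    {
        "region": ["North East", "South West", "London", "East", "North West", "South East"],
        "area_burnt": [120.5, 3377.0, 12.0, 88.2, 45.0, 230.1],
    }
)

# head(5) returns the first five rows, which is the view a data preview shows.
preview = df.head(5)
print(preview)
```

Calling head() with no argument gives the same result, since five rows is the pandas default.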
Then you can get the data information.
You can see the non-null count of the rows, and that there are no
null rows or empty columns or rows.
And you can see the data type of each column,
as in object, float, integer, etc. Quickly
you can go on to do your EDA.
If you want to do exploratory data analysis you can just go on to do
your exploratory data analysis.
If you want to create a dashboard, you can go on to create a
dashboard as quick as possible.
Then you can get the row count.
So we have 44 rows and eight columns in the data.
You can check them.
You can count them here to confirm that they are correct.
Then the descriptive analysis of the data. As you can see, this gives
the statistical values of the numerical columns in the data frame,
so other columns are omitted.
You get the count:
there are 44 rows.
You get the mean of each column, and then you can compare it to the max easily to
see if your data is skewed or there are so many, what is it called,
outliers in your data.
You can get the mean, the standard deviation, the 25th percentile,
50th percentile, 75th percentile.
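The row count and descriptive table map onto two standard pandas calls. This is a sketch on a made-up frame, not the app's own code; the column names are assumptions:

```python
import pandas as pd

# Hypothetical data: one text column and two numeric columns.
df = pd.DataFrame(
    {
        "region": ["London", "East", "South West", "North East"],
        "area_burnt": [10.0, 20.0, 30.0, 400.0],
        "incidents": [1, 2, 3, 4],
    }
)

n_rows, n_cols = df.shape      # row and column count
summary = df.describe()        # numeric columns only by default

# Comparing the mean with the median (50%) and max hints at skew or outliers.
mean_area = summary.loc["mean", "area_burnt"]
median_area = summary.loc["50%", "area_burnt"]
max_area = summary.loc["max", "area_burnt"]
print(n_rows, n_cols, mean_area, median_area, max_area)
```

Here the single large value pulls the mean well above the median, which is the kind of gap worth checking before moving on to charts.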
Straightforward. You can go on to the dashboard.
The dashboard is the part I actually find very interesting. I added a how-to-use
guide because not everyone knows Plotly.
It's actually popular now.
But so many people, especially beginners, don't really get to use Plotly because
Matplotlib is a bit easier to get started on and it's more popular than Plotly.
So I did a little how-to-use step for those that are new.
Select your visual from the options provided.
So I have a list of options here,
which I can actually remove from or add to as I see fit.
So, going back to the problem that this solves: with Power
BI, you just get the standard charts that you can use, just like
here, but you can't really alter them. You can go to the visuals and
get more visuals, but some are paid.
The ones that are not paid aren't really that good. But here you can just, behind
the scenes, write the kind of visual you want, put it in, and it appears here.
So that's how interesting this is.
So, getting started with the area chart.
It tells you the x-axis.
I tried to simplify
it as much as possible by removing the numerical columns from the x-axis options, so
you don't get an error saying this is the wrong axis for an area chart,
or get a very wrong chart and then not understand what you're doing.
So we're going
to start with the region of England. And for the y-axis, you cannot really
use two object columns in an area chart,
so I removed the non-numerical columns there as well, to make it
easy for beginners to get started on it.
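One way to restrict the axis choices like this is to split the columns by dtype before offering them. The helper below is a sketch under that assumption; `axis_options` is a hypothetical name, not a function from the app:

```python
import pandas as pd


def axis_options(df):
    """Return (x_candidates, y_candidates) for a chart.

    x-axis options are the non-numeric columns, y-axis options the numeric
    ones, so an invalid pairing can never be selected.
    """
    y_candidates = df.select_dtypes(include="number").columns.tolist()
    x_candidates = df.select_dtypes(exclude="number").columns.tolist()
    return x_candidates, y_candidates


# Hypothetical frame with one text column and two numeric columns.
df = pd.DataFrame(
    {
        "region": ["London", "East"],
        "area_burnt": [12.0, 88.2],
        "incidents": [3, 7],
    }
)
x_cols, y_cols = axis_options(df)
print(x_cols, y_cols)
```

The two lists could then feed the select boxes for each axis, so the user only ever sees valid choices.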
So we'll start with areas burnt in acres.
You can add more areas to the chart if you want, but I'm not going to do that now.
Boom, you get your area chart.
It's interactive, you can zoom in
to see all the necessary points, where it says,
for the region of England, 3377.
You can look at the points, you can interact with your charts.
You can take a snapshot and then add it to your PowerPoint to make your slides.
Zoom out.
We go to visualization two.
You click the list and pick, say, pie chart, which is another of my charts.
Choose the values. Just like with the area chart, to avoid the same error
I limited the options to valid columns.
Let's say, the area, oh, let's go with the incidents
and then region of England.
You confirm.
You can see you get the incidents.
You can see that the highest is 15.2,
which is from the South West,
and then 14.8.
You get your visual.
You get your interactive charts as fast as possible.
And you get a description on top.
You can download it as well like the previous chart, or you can continue.
Let's go for the next one.
The next one is this scatter plot.
For a scatter plot, you need two columns that are, say, numerical,
and I made sure of that here.
Areas, wildfire areas. And sometimes you don't get two columns:
"Choose columns", so this is an error.
You click on reset, and it resets.
And then for the other axis, say, wildfire, and then you go in
and then you get your scatter plot.
Each dot, you get to look into each dot.
Let's zoom in to get each dot.
You get to look into each dot and know the key as to why they are there.
We have some outliers in here.
This one is a small area but high fire duration.
This one has a larger area but low fire duration.
And then we have the major
keys in here.
So moving on. The sweet thing about this is I
can increase it to five visualizations, six visualizations,
seven visualizations, up to infinity.
And I can reduce it to just one visualization to make it easier for users.
That is the great thing about making an app yourself:
it's dynamic to your own taste.
You make the changes as much as you want.
So let's go into the bar chart, which is another popular chart. X-axis: region of
England again, or region code, whatever.
And then, what have I not used? This. And boom, you get it.
You get interaction as well.
You get your header, and then you can zoom in, zoom out, autoscale, whatever.
And something I remember I failed to do is show you
a good feature of the area chart.
Let's say with the line chart: region of England to area.
Then say you want to compare two columns together.
You can click on add additional column.
Then you choose an option.
And the initial column you already added is
automatically removed from the options.
Then you can pick, say, a wildfire column.
You can add as many columns as you want, but we're going to go with one now.
And voila, you get two columns,
the added one plotted in a different color.
Okay, I made sure to add distinct colors in my code to make sure that
every color is different, and it's interactive as well.
Let's zoom in to check out the interactivity.
It's interactive as well, and it tells you the duration, it gives you keys.
Then you go in, it tells you this.
You go in and understand it. So easy and simple to do. You can take a snapshot and
continue your analysis or whatever. Then let's go back to the top.
When you're done you can reset. Automatically it comes back, you
get to pick another visualization and continue as much as you want.
We go back to exploratory data analysis, or you can go to the data
page and come back, it depends on you.
So for exploratory data analysis, I made it as easy as possible.
In the dashboard you get to pick the visualization, do this,
pick the visualization, choose the column, and draw, so it's a bit
slower than exploratory data analysis.
So the best one, which I put at the top, is the pair plot.
Clicking on pair plot, you don't have to do anything.
You just click on it and it gives you the combination of all the
numerical data in your data set.
It compares them together.
So you are basically getting the same thing,
but without the interactivity of
Plotly. The exploratory data analysis is powered by Matplotlib,
so it does not have the interactivity that Plotly does, but it's almost there.
You can see the histogram of each column, how they compare to each other,
the scatter plot of each pair, the outliers, etc. You get to
understand your data as fast as possible.
Moving on, let's say you want to do a box plot, click on it.
And when you click on it, it does not affect the other plot.
It works on its own accord.
Everything is separate from each other.
Even though they're together.
I try to separate them from each other as much as I could.
So going on to the box plot here, you choose the column you want, automatically it
takes the first one, and you get to make your analysis with the box plot.
You can zoom in
if you want, zoom out. You can move on to, say, the count plot.
We want to count the values in a column, and it
has already done the region of England for us.
You can count, say, the number, which is not really readable as a count.
Let's say region code.
So magically it loads, and voila,
it's out.
You can see which is the lowest and then the highest
is E to F. And then the line chart.
Unlike the interactive dashboard, you don't get to
put as many lines as you want.
You just pick one for your visualization.
Again, it does it automatically, and then you can choose any
other column you want.
It loads and,
boom,
it changes.
And then you can change again, say wildfire incident number, it loads,
it changes.
So it's as easy as that.
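The count plot step in this part of the demo comes down to counting how often each value appears in a column. A minimal pandas sketch of that count, with made-up region codes (the real data's codes are not shown in the talk):

```python
import pandas as pd

# Hypothetical region codes; the values are illustrative only.
df = pd.DataFrame({"region_code": ["E1", "E2", "E1", "E3", "E1", "E2"]})

# value_counts() gives the number of rows per distinct value, largest first.
counts = df["region_code"].value_counts()

most_common = counts.idxmax()   # code with the most rows
least_common = counts.idxmin()  # code with the fewest rows
print(counts)
```

A count plot is a bar chart of exactly these numbers, so the highest and lowest bars correspond to `most_common` and `least_common`.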
That is the end of how this app works.
If you are really interested, when you have time you can message me and
then we can probably work together, and
we can increase the functionalities in it.
And then let's go on to how the app actually works.
You can fork this app on GitHub and get started, make your
edits, and I will accept them.
We can work together.
Let's go to how I created this thing.
Home.
And I added this "connect with me on LinkedIn",
and the GitHub repo.
We'll go on to the GitHub repo now.
This is the main repo.
And then we have the dashboard,
dashboard.py.
data.py,
where you can upload the data.
Exploratory data analysis.
The error handler, to make sure that we don't get errors from lagging or
unexpected errors from using the app.
Then another .py file, and then visualization.py,
where all the visualizations actually happen.
So let's start with data.py.
Pandas, obviously.
Pandas is the most popular data analytics library.
We have Dask, but I think pandas is easy to learn and it's very popular,
so for a beginner and an enthusiast in data analysis, it's the first
library you should get started on. Then Streamlit.
Luckily, with Streamlit
you can apply CSS to your app, which is what this stuff in here is:
CSS to beautify the page.
I'm not really good with CSS,
I confess, but I tried my best to make sure that it's easy to read
and the functionalities are easy to understand.
And moving on, let's cut it short.
So navigation: the navigation, which is in here, just at the top
of the screen, lets you navigate.
And, yeah, that is it.
And then main, this is where I apply the CSS.
I call the apply CSS function
down here. And then let's get on to the data.
This is the path for the sample data. The sample data that we used is on my GitHub repo as well.
You can check it out if you want.
It's right at the sample data path.
So this is it.
There's text.xls, but I didn't add it to the, what is it called,
the code.
So going back in here.
Give me a minute.
Navigate to page: you can navigate, which is powered by session state.
With Streamlit there is session state, so when you navigate it automatically
goes to the page
you want, all within the same page. You don't have to load and go on to
another page like in a normal browser;
with session state, you get to
go to any page within the same page.
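The idea behind this kind of navigation can be sketched without Streamlit itself: the current page is just a value kept in a dict-like store that survives reruns. In a Streamlit app that store would be `st.session_state`; here a plain dict stands in for it, and the function and page names are hypothetical, not taken from the app's code:

```python
def navigate_to(state, page_name):
    """Remember which page to render next; `state` is any dict-like store."""
    state["page"] = page_name


def current_page(state, default="Home"):
    """Return the active page, falling back to a default on first load."""
    if "page" not in state:
        state["page"] = default
    return state["page"]


def page_message(state):
    """Mimic the guard from the demo: some pages need a data set first."""
    page = current_page(state)
    if page in ("Dashboard", "EDA") and "data" not in state:
        return "No data set available. Go to the Data page to upload one."
    return f"Showing {page}"


state = {}  # stand-in for st.session_state in this sketch
first = page_message(state)

navigate_to(state, "Dashboard")
blocked = page_message(state)

state["data"] = [[1, 2], [3, 4]]  # pretend a data set was uploaded
allowed = page_message(state)
print(first, blocked, allowed, sep="\n")
```

Because everything is decided from the stored value, switching pages never needs a full page load, which matches the behavior described in the talk.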
Going down to the radio option with Streamlit, let me show you.
This is the radio option, upload data or sample data.
This is the radio option.
And then I did a logical statement here.
If there is no uploaded data yet, the file-uploaded flag is false,
and the picked option is false.
Then it gives you the option to upload your data, with the allowed types.
So I specified the type:
it has to be CSV,
JSON, or Excel, both the old and new Excel formats.
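A sketch of how an uploaded file could be routed to the right pandas reader based on its extension. `load_table` and `READERS` are hypothetical names for illustration, not the app's code; the pandas readers themselves are the standard ones:

```python
import io
from pathlib import Path

import pandas as pd

# Map each allowed extension to the pandas function that reads it.
READERS = {
    ".csv": pd.read_csv,
    ".json": pd.read_json,
    ".xls": pd.read_excel,   # old Excel format
    ".xlsx": pd.read_excel,  # new Excel format
}


def load_table(file_obj, filename):
    """Read an uploaded file into a DataFrame, rejecting unknown types."""
    suffix = Path(filename).suffix.lower()
    reader = READERS.get(suffix)
    if reader is None:
        raise ValueError(f"Unsupported file type: {suffix or filename}")
    return reader(file_obj)


# An in-memory CSV stands in for a real upload here.
csv_text = "region,area\nLondon,12\nSouth West,3377\n"
df = load_table(io.StringIO(csv_text), "fires.csv")
print(df)
```

Reading Excel files with pandas needs an extra engine package installed, so that branch is an assumption about the environment.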
If you don't want to upload your data, then the other option is to use
sample data, like I just showed you.
And you can pick the data, either fire data or employee data, and all
that, to understand the app better. So that I don't waste
so much time with the employee data,
I made sure that the date columns
are converted to datetime so you don't get errors with them.
And with uploaded data, all date
columns in your data are converted automatically using the pandas datetime function.
I made sure to convert all of them to datetime so you don't get errors.
You can work with your data as fast as possible.
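A minimal sketch of that kind of conversion with `pd.to_datetime`. How the app decides which columns are dates is not shown in the talk, so picking them by a "date" substring in the column name is purely an assumption here, and `convert_date_columns` is a hypothetical helper:

```python
import pandas as pd


def convert_date_columns(df):
    """Return a copy with date-like columns parsed to datetime.

    Assumption: a column is treated as a date if its name contains "date".
    Values that cannot be parsed become NaT instead of raising an error.
    """
    out = df.copy()
    for col in out.columns:
        if "date" in str(col).lower():
            out[col] = pd.to_datetime(out[col], errors="coerce")
    return out


# Hypothetical employee data with one unparseable date value.
raw = pd.DataFrame(
    {
        "name": ["Ada", "Ben", "Cy"],
        "hire_date": ["2021-01-05", "2022-03-10", "not a date"],
    }
)
clean = convert_date_columns(raw)
print(clean.dtypes)
```

Using `errors="coerce"` keeps a single bad value from stopping the whole load, at the cost of turning it into a missing value that later steps need to tolerate.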
Then after that you get the tabs: the data information, the data preview, the
column and row count, and the descriptive analysis.
Then eda.py.
Again, here I use matplotlib.pyplot, and seaborn as sns.
Seaborn beautifies Matplotlib.
Then, say, the dashboard.
Okay, the dashboard is connected to visualization.py.
For the dashboard, I use Plotly Express and
plotly.graph_objects as go to give it its interactivity.
So you can check them out in your free time,
but I can't really go through all the code now.
So going back to my demo overview.
So basically, you saw how smooth it was moving between sections.
Now the technologies I used: obviously I used Python, which is
the obvious one, for the backend development. Everything I developed
was based entirely on Python.
Then Streamlit: I used Streamlit-powered code, and then
I hosted it on the Streamlit cloud, which is why sometimes it takes a few
seconds to load, because it's actually free. The good thing about Streamlit is it's free.
You don't have to pay to actually get into it. Like the histogram, for example: it loads,
and oh, I forgot to show the histogram. With the histogram you can select the bins you want.
Normally when I do my coding I put it at 100 bins, and this is 100 bins.
Let's increase it to 200.
Oh, I think there's an issue with this.
Let's get a good numerical column.
And, boom, you get your histogram with, say, an added KDE line.
So let's go back to 100 bins.
Then reduce this.
And we see that the data are not really together.
So you have a lot of outliers in here: from 4,000 to 10,000 you
have a lot of outliers, and most of the data are around 2,000.
For wildfire incident number, let's start with that.
Say it's skewed to the left, even though there are outliers, but fewer
ones than the areas burnt.
Then you can say this is skewed exactly to the left.
You do it this way: increase the bins or reduce the bins as much as you want.
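The effect of the bin count, and the direction of skew, can also be checked numerically. This sketch uses made-up values shaped like the description above (most points near 2,000, a few between 4,000 and 10,000); it is not the app's data or code:

```python
import numpy as np
import pandas as pd

# Hypothetical values: a cluster near 2,000 plus a few large outliers.
values = pd.Series(
    [1800, 1900, 2000, 2000, 2100, 2200, 4000, 7000, 10000], dtype="float64"
)

# Fewer bins give a coarser picture; more bins split the same range finer.
counts, edges = np.histogram(values, bins=4)

# Sample skewness: positive means the long tail points toward large values,
# negative means it points toward small values.
skewness = values.skew()
print(counts, edges, skewness)
```

For reference, a long tail toward large values gives positive skewness, which is usually called right-skewed, so the sign of this number is a quick check on what the histogram appears to show.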
That is how it works with Streamlit. For all my visuals, I
used Matplotlib, Seaborn, and Plotly Express and graph objects, all
inside Plotly, to actually make it as interactive as possible.
Then I deployed it on Streamlit Cloud.
Lessons learned: the key thing that I learned is that you can actually
build anything with programming.
Any app that is out there has been built by someone, and
you can actually build it again,
though it might not be as big, or as developed, or as modern
as the app that is being maintained,
because people get paid to do it.
But you can actually build something within your own
confines, within your own comfort,
and make it work.
I also learned the importance of user-friendly design and interactivity.
And on data handling: though I didn't show it, I've tested the app before,
and it can handle larger data sets than the sample,
up to about 30 columns and about 500 rows.
For future enhancement,
I'm planning to add
machine learning features.
So you can just come in, say you want to do random forest regression, you
can just click on random forest, just like the box plot, and then put in
the X and y train/test split and then do your regression analysis. Or you
can do XGBoost, your extreme gradient boosting analysis, or your classification,
random forest classification, etc. I'm looking to add them in the near future.
And then I'll increase support for data sets of up to a million.
But I think that should depend on the cloud, on the Streamlit cloud,
and how effective it's going to be.
I think this is the end of my
talk, visualization and every other thing, my slides.
If you have any questions, you can reach out to me via LinkedIn.
And if you have
advice or suggestions for me, you can reach out to me as well.
On a final note, I'd like to say thank you to Conf42 for allowing me to
display my work to people out there, and I want to say a huge thank you
to my family, and thank you all for actually watching the video and my talk.
See you next time.
Hello, everyone.
My name is Daniel, and, I'm a data specialist, or I like to call myself.
I revolve around the data analysis, machine learning, and even as far
to natural language processing.
Also, I'm a data mentor and a programming mentor.
I tend to assist people that are looking to get on the data space
and also on the programming space.
Today, I'll be speaking on DataSense, DataCentric, which
I shortened to DataSense, an app that I built from scratch.
Thank you.
Using fully on the streamlit framework in Python.
I'll be giving a live demo of the app and I'm going to get started.
Turn this on.
Here we go.
So data analytics with Streamly by me.
DataSense, I have been using Power BI.
I've been using a tab view, but mostly Power BI.
And I realized that you cannot really get logged in with Power BI if you
don't have an organization on email.
for people, beginners that are just new to the field, that does not know anything
about programming or analytical analysis, they can't just get in Power BI if
they don't have, for example, students.
I felt like this is a very major constraint for those
learning from my own experience.
I felt like I should create a hub.
That will be easy for them to use a learning curve for
them and for me as well.
So I did so many research and I came up with Datasense.
So it's a Python powered solution for data exploration, visualization.
It simplifies the whole process of analysis, starting from
putting, uploading your data down to creating a visualization.
It's a very straightforward, I made it as straightforward as possible,
making, analyzing the data set.
and it's encouraging, encourages self learning.
So I found some problems, and I created a solution with data science.
I found that manual data analysis is time consuming.
When you get your data into Power BI, it's on you to actually pick the
one that has to be on the X axis.
And when you put it in, it gives you error without really telling you
that, oh, this is the wrong data.
This is the wrong column to put into this as is.
And it's so time consuming, error prone, and many of them are non user friendly.
Especially for learners that are trying to get on the platform.
You have to go through probably learning via Udemy or something to actually
understand the app as it is built.
So I solved this problem by creating a very straightforward interface that you
get your data in, takes you directly to the data, you make your analysis,
and you're out as simple as possible.
I added an interactive dashboard session powered with Plotly.
Plotly is a visualization library on Python that gets you more
information on the data compared to the most popular Matplotlib.
Engineers that don't want to start making so many analysis,
writing the codes and all that.
I did an exploratory data analysis for them that you can just click on the
chart and you get your analysis done.
And you know the context of your data.
You know the skew at which your numerical data, where it's aligned to.
You get your analysis and you go on further to do your machine learning code.
So some of the key features, the major features of the app
is data upload by management.
You have the option to upload CSV file, most common CSV upload JSON
and you can upload Excel spreadsheet, and then you can get sample data
from that and quickly go on to get descriptive analysis of your data.
You can get the, as many, how many columns are in your data, how many
rows, and you can get the statistic measure of your data, the mean, the
mode, the mean, the standard deviation.
The 21st, 25th percentile and whatnot.
Then you can go to the automated exploratory data analysis
that I mentioned just now.
You can get, you can just click on the chart and do your post plot analysis.
Or probably you want to check, the count plot.
How many of this, say cities are in your data, you can check it easily
and then you can do your dashboard.
You can get, create a dashboard with up to four, four different
graphs, four different shots.
with as many graphs as possible.
So now I'm going to log up onto the app so I can show you the
demo of how everything works.
Let's get started.
So here we are, this is the app.
data sent, everything here was written by me.
I wrote everything with the emojis and the, and the, the fonts.
So data analysts are short on time.
Quickly go to the data page and test the functionality.
Actually now, at the forefront, you get the navigation.
This is where you can go to the data page.
Data page, upload your data.
You can go to, the dashboard.
There's no data yet.
So it tells you that no data set available.
Please.
Go back, click on this link, the data page, to go upload the data.
Then you can also go to the exploratory dashboards analysis,
which tells you the same thing.
Go to the data page, which we will do now.
Here you have two options, upload data.
Just browse your files, you can upload your data locally.
And, put your data in here, the one you want to analyze.
Or, you can use my sample data just to test the app out,
which I'm going to do now.
These stuffs, you can read them if you want.
It's just a quick guide on how the app works, but we're not going
to go through reading them now.
Because I'll be doing the demo.
And then you can pick data.
Yeah, you can pick FHIR data or you can pick employee data, depends
on which you want to test out.
We'll go with the FHIR data first, and then it loads successfully.
The beginning, it gives you everything about the data, the top five rules.
This is akin to writing pandas.
head.
So this is the code, pandas.
head, bracket five is the code behind this stuff.
You can zoom in if you want, or you can download this place as it is.
But we're not going to do that now.
Then you can get the data information.
You can see that non null, or the rows and there's no nil,
null rows or empty colon or rows.
And you can see the data type of each object.
As in, object floats with number integers, etc. Quickly
you can go on to do your IFIs.
If you want to do exploratory data analysis you can just go on to do
your exploratory data analysis.
If you want to create a dashboard, you can go on to create a
dashboard as quick as possible.
Then you can get the row count.
So we have 44 rows and eight columns in the data.
You can check them.
You can count them here to confirm that they are correct.
Then descriptive analysis of the data, as you can see, this gives
the statistical value of the numerical columns in the data frame.
So other columns are omitted.
You get the counts.
There are 44 rows.
You get the mean of each row, and then you can compare it to the max easily to
see if your data is, if it's skewed or there are so many, what is it called?
Appliers in your data.
You can get the mean data, the standard deviation, 25th percentile,
50th percentile, 75th percentile.
straightforward, you can go on to the dashboard.
The dashboard is where I actually find very interesting because I said auto
use because it's not this closely.
It's actually popular now.
But so many people, especially beginners, don't really get on to use Plotly because
Matplotlib is a bit easier to get started on and it's more popular than Plotly.
So I did a little how to use step for those that are new.
Select your visual from the options provided.
So I have a list of options here.
Which I can actually remove anyone or add anyone as I see fit.
So going on the problem something that a solution here is going with power
bi You just get the random casual shards that you can use just like
here, but you can't really Arturito, you can go to the, the visuals and
get more visuals, but some are bots.
The ones that are not bot aren't really that good, but here you can just behind
the scene, just write the kind of visuals you want, put it in and it appears here.
So that's how interesting this is.
So get started with, area sharp Get started with area sharp So getting
started with area sharp It tells you the exercises So I try to simplify
it as much as possible Removing the numerical value from these exercises So
you don't get an error Say This is the wrong as is because it's an area shot
or you get a very wrong shot And then you don't understand what you're doing.
So I simplified it to remove the numerical value from the exercise So we're going
to start with the region of England and then for this place You cannot really
use two object coulombs in an area shot.
So I removed that as well I removed the non numerical coulombs as well to make it
easy for beginners to get started on it.
So we'll start with areas, bombs and acres.
You can add more area shots if you want, but I'm not going to do that now.
boom, you get your area shot.
Interactive, you can zoom in.
To see all the necessary, shots, the points, where it says,
Region of England is 3377.
You can look at the points, you can interact with your shots.
You can take a snapshot and then upload to your PowerPoint to make your slides.
Zoom out.
We go to visualization two.
You click the link, say, pie chart, which is another of my chart.
Choose the value, just like the same error in area chart.
I limited it to the value.
Let's say, the area would, oh, let's go with the incident
and then region of England.
You confirm.
You can see you get the incident.
You can see that the ISDA is 15.
2, which is from Southwest.
And then 14.
8, you get your visual.
You get your interactive charts as fast as possible.
And you get a description on top.
You can download it as well like the previous chart, or you can continue.
Let's go for the next one.
The next one is the scatter plot.
For a scatter plot, you need two columns that are numerical,
and I made sure of that here.
Area burnt, wildfire areas. And sometimes you don't really get to pick two columns.
It still says "choose column", so this is an error.
You click on it, you reset.
And then for the other column, say, wildfire duration. And then you go in
and you get your scatter plot.
You get to look into each dot.
Let's zoom in to see each dot.
You get to look into each dot and know the key as to why they are there.
We have some outliers in here.
This one has a small area but a high fire duration.
This one has a larger area but a low fire duration.
And then we have the major
keys in here.
So moving on. The sweet thing about this is I
can increase it to five visualizations, six visualizations,
seven visualizations, up to infinity.
And I can reduce it to just one visualization to make it easier for users.
That is the great thing about making an app yourself.
It's dynamic to your own taste.
You make the changes as much as you want.
So let's go into the bar chart, which is another popular chart. X-axis: region of
England again, or region code, whatever.
And then, what have I not done? This. And boom, you get it.
You get interaction as well.
You get your header, and then you can zoom in, zoom out, autoscale, whatever.
And something I remember I failed to do is show you
a good feature of this area chart.
Let's say with the line chart: region of England against area.
Then say you want to compare two columns together.
You can click on "add additional column".
Then you choose an option.
And the initial column you already added is
automatically removed from the options.
Then you can say, wildfire number.
You can add as many columns as you want, but we're going to go with one now.
And voila, you get two columns,
plotted in different colors.
Okay, I made sure to add color handling to my code to make sure that
every column's color is different, and it's interactive as well.
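The two behaviours described here, dropping an already chosen column from the next dropdown and giving every plotted column its own color, can be sketched in a few lines of plain Python. The helper names, the palette, and the column names below are hypothetical; the talk does not show the app's real implementation.

```python
from itertools import cycle

# Hypothetical palette; the app's real colors are not shown in the talk.
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728"]


def remaining_options(all_columns, already_selected):
    """Return the columns still available for the next dropdown."""
    return [c for c in all_columns if c not in already_selected]


def assign_colors(selected_columns, palette=PALETTE):
    """Map each selected column to a color, cycling if the palette runs out."""
    return dict(zip(selected_columns, cycle(palette)))


numeric_columns = ["area_burnt", "incidents", "duration"]
selected = ["area_burnt"]

# The first column no longer appears among the choices for the second one.
options = remaining_options(numeric_columns, selected)
selected.append(options[0])
colors = assign_colors(selected)
```

Colors stay distinct as long as the number of selected columns does not exceed the palette length; a longer palette would be needed to guarantee that for more columns.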
Let's zoom in to check out the interactivity.
It's interactive as well, and it tells you the duration, it gives you the keys.
Then you go in, and it tells you this.
Go in here. It's so easy and simple to do. You can take a snapshot and
continue your analysis or whatever. Then let's go back to the top.
When you're done, you can reset. Automatically it comes back, and you
get to pick another visualization and continue as much as you like.
We go back to exploratory data analysis, or you can go to the data
page and come back; it depends on you.
So for exploratory data analysis, I made it as easy as possible.
In the dashboard you get to pick the visualization, do this,
pick the visualization, choose the column, and draw another, so it's a bit
slower than exploratory data analysis.
So the best one, which I put at the top, is the pair plot.
Clicking on pair plot, you don't have to do anything.
You just click on it and it gives you the combination of all the
numerical data in your data set.
It compares them together.
So you are basically going over it,
but without the interactivity of
Plotly. The exploratory data analysis is powered by Matplotlib,
so it does not have the interactivity that Plotly has, but it's almost there.
You can see the histogram of each column, how they compare to each other,
the scatter plot of each of them, the outliers, etc. You get to
understand your data as fast as possible.
Moving on, let's say you want to do a box plot. Click on it.
And when you click on it, it does not affect the other plot.
It works on its own.
Everything is separate from each other,
even though they're together.
I tried to separate them from each other as much as I could.
So going to the box plot here, you choose the region you want; automatically it
takes the first region, and you get to make your analysis with the box plot.
You can zoom in
if you want, and zoom out. You can move on to, say, the count plot.
We want to count the values in a row or a column, and this has
already done the region of England for us.
You can count, say, the number, which is not really readable as a count.
Let's say region code.
So magically it loads, and voila,
it's out.
You can see the lowest is this one, and then the highest
is E to F. And then the line chart.
Unlike the interactive dashboard, you don't get to
put in as many lines as you want.
You just need this one for your visualization.
Again, it does it automatically, and then you can choose any
other column you want.
It loads, and
boom,
it changes.
And then you can change again, say wildfire incident number; it loads,
it changes.
So it's as easy as that.
That is the end of how this app works.
When you have time, if you are really interested, you can message me and
then we can probably work together, and
we can increase the functionalities in it.
And then let's go on to how the app actually works.
You can fork this app on GitHub and get started, make your
edits, and I will accept them.
We can work together.
Let's go to how I created this thing.
Home.
And I added this "connect with me on LinkedIn"
and the GitHub repo.
We'll go on to the GitHub repo now.
This is the main repo.
And then we have the dashboard,
the Python dashboard.
data.py:
here you can upload the data.
Exploratory data analysis.
The error handler, to make sure that we don't get errors from lagging or
unexpected errors from using the app.
Then the py and then visualization.py,
where all the visualizations actually happen.
So let's start with data.py.
Pandas, obviously.
Pandas is the most popular data analytics library.
We have Dask, but I think Pandas is easy to learn, and it's very popular,
so for a beginner and an enthusiast in data analysis, it's the first
library you should get started on, along with Streamlit.
Whatever I uploaded is there. Then, luckily, with Streamlit
you can apply CSS to your app, which is what this stuff in here is:
CSS to beautify the page, though
I'm not really good with CSS,
I confess. But I tried my best to make sure that it's easy to read
and the functionalities are easy to understand.
And moving on, let's cut this short.
So the navigation, which is in here, just at the top
of the screen, lets you navigate.
And, yeah, that is it.
And then main: this is where I apply the CSS.
I call apply CSS
down here, and then let's get on to the data.
This is the path for the sample data that we use; it's on my GitHub repo as well.
You can check it out if you want.
It's right at the sample data path.
So this is it.
There's text.xls, but I didn't add it to the, what is it called,
the code.
So going back in here.
Give me a minute.
Navigate to page name: you can navigate, which is powered by session state.
With Streamlit there is session state, and when you navigate, it automatically
takes you to the page
you want, all within the same page. You don't have to load and go to
another page like in a normal browser; with session state, you get to
go to any page within the same page.
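To make the session-state idea concrete, here is a minimal sketch of that kind of navigation. It assumes the current page name is stored under a hypothetical `"page"` key; in a Streamlit app, `state` would be `st.session_state`, and a plain dict stands in for it here so the logic can run on its own. The page names and return strings are made up for illustration.

```python
# Hypothetical page registry: each entry renders one "page" of the app.
PAGES = {
    "home": lambda: "Welcome to DataSense",
    "data": lambda: "Upload or pick sample data",
    "dashboard": lambda: "Interactive charts",
}


def navigate_to(state, page_name):
    """Record the requested page so the next rerun renders it."""
    if page_name not in PAGES:
        raise ValueError(f"Unknown page: {page_name}")
    state["page"] = page_name


def render(state):
    """Render whichever page the state points at, defaulting to home."""
    return PAGES[state.get("page", "home")]()


state = {}  # stands in for st.session_state in this sketch
first = render(state)
navigate_to(state, "dashboard")
second = render(state)
```

Because only a value in the state changes, the browser never loads a different URL; the same script simply draws a different section on the next run.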
Going down to the radio option. With Streamlit, let me show you,
this is the radio option: upload data or sample data.
This is the radio option.
And then I did a logical statement here.
If the option is upload data, then "file uploaded" starts as false.
Until you pick a file, it stays false.
Then it gives you the option to upload your particular data, and the type.
So I specified the type.
It has to be CSV,
JSON, or Excel, both old and new Excel formats.
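One way to handle those file types once a file has been chosen is to pick the pandas reader from the file extension. This is only a sketch under that assumption: `load_table` and `READERS` are hypothetical names, and reading Excel files additionally requires an Excel engine such as openpyxl or xlrd to be installed.

```python
import io
from pathlib import Path

import pandas as pd

# Extensions mentioned in the talk, mapped to the pandas reader for each.
READERS = {
    ".csv": pd.read_csv,
    ".json": pd.read_json,
    ".xls": pd.read_excel,
    ".xlsx": pd.read_excel,
}


def load_table(filename, buffer):
    """Read an uploaded file into a DataFrame based on its extension."""
    suffix = Path(filename).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"Unsupported file type: {suffix or filename}")
    return READERS[suffix](buffer)


# Usage with an in-memory CSV standing in for an uploaded file.
upload = io.BytesIO(b"region,incidents\nSouth West,15\nLondon,3\n")
df = load_table("fires.csv", upload)
```

Rejecting unknown extensions up front gives the user a clear message instead of a parsing error from deep inside the reader.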
If you don't want to upload your data, then the other option is to use
sample data, like I just showed you.
And you can pick the data, either fire data, employee data, and all
that, so you have something good to help you understand it better. So I don't waste
too much time: with the employee data,
I made sure that the date and time, the date columns, I turned them into
datetime columns so you don't get errors with them.
And I think with upload data, it actually converts all date
columns in your data automatically using pandas' datetime function.
I made sure to convert all of them to date and time so you don't get errors.
You can work with your data as fast as possible.
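A minimal sketch of that kind of automatic date conversion, assuming `pd.to_datetime` is the pandas function meant and using a simple, hypothetical heuristic (any column with "date" in its name) to decide which columns to convert. The sample data is invented for the example.

```python
import pandas as pd


def convert_date_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy where columns named like dates are parsed as datetimes."""
    out = df.copy()
    for col in out.columns:
        # Assumed heuristic: treat any column with "date" in its name as a date.
        if "date" in str(col).lower():
            # Unparseable entries become NaT instead of raising an error.
            out[col] = pd.to_datetime(out[col], errors="coerce")
    return out


raw = pd.DataFrame(
    {
        "hire_date": ["2021-03-01", "2022-07-15", "not a date"],
        "employee": ["Ada", "Grace", "Linus"],
    }
)
clean = convert_date_columns(raw)
```

Using `errors="coerce"` is one way to avoid errors on messy input: bad values turn into missing dates that can be inspected later rather than stopping the upload.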
Then after that you get the tabulation, the data information, the data preview, the
column and row count, and the descriptive analysis.
eda.py:
again, here I use matplotlib.pyplot and seaborn as sns.
Seaborn beautifies Matplotlib.
So for, say, the dashboard,
okay, the dashboard is connected from visualization.py.
For the dashboard, I use Plotly Express and
plotly.graph_objects as go to give it its interactivity.
So you can check them out in your free time,
but I can't really go through all the code now.
So going back to my demo overview.
So basically, you saw how smooth the transitions between sections were.
Now the technologies I used. Obviously I used Python, which is
the obvious one, for the backend development; everything I developed
was based entirely on Python.
Then Streamlit: I used Streamlit-powered code, and then
I hosted it on the Streamlit cloud, which is why sometimes it takes a few
seconds to load, because it's actually free. The good thing about Streamlit is it's free.
You don't have to pay to actually get into it. Like the histogram, for example: it loads,
and oh, I forgot to show the histogram. With the histogram, you can select the bins you want.
Normally when I do my coding I put it at 100 bins, and this is 100 bins.
Let's increase it to 200.
Oh, I think there's an issue with this.
Let's get a good numerical column.
And, boom, you get your histogram with, say, an added KDE line.
So let's go back to 100 bins.
Then, reduce this.
And we see that the data are not really together.
So you have a lot of outliers in here; from 4,000 to 10,000, you
have a lot of outliers, and most of the data are around 2,000.
That's for area burnt. For wildfire incident number, let's start with that.
Say it's skewed to the left, even though there are outliers, but fewer
ones than in the area burnt.
Then you can say this is skewed exactly to the left.
Let's do it this way: you increase the bins or reduce the bins as much as you want.
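For readers who want to check the effect of the bin count and the direction of skew numerically rather than by eye, here is a small sketch with NumPy and pandas. The values are made up for illustration and are not the demo's data; `bin_counts` is a hypothetical helper.

```python
import numpy as np
import pandas as pd

# Hypothetical values: most observations are small, a few are very large.
values = pd.Series([1800, 1900, 2000, 2000, 2100, 2200, 4000, 7000, 10000])


def bin_counts(series: pd.Series, bins: int) -> np.ndarray:
    """Count how many observations fall into each of `bins` equal-width bins."""
    counts, _edges = np.histogram(series.dropna(), bins=bins)
    return counts


coarse = bin_counts(values, bins=4)
fine = bin_counts(values, bins=8)

# Sample skewness: positive means the long tail is on the right (large values),
# negative means the long tail is on the left (small values).
skewness = values.skew()
```

Changing `bins` only changes how the same observations are grouped, so the counts always add up to the number of rows, while the sign of `skewness` gives a quick check on which side the long tail lies.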
That is how it works with Streamlit. And for all my visuals, I
used Matplotlib, I used Plotly, I used Seaborn, and I used Plotly Express, which is
inside Plotly, to actually make it as interactive as possible.
Then I deployed it on Streamlit Cloud.
Lessons learned: the key thing that I learned is that you can actually
build anything with programming.
Any app that is out there has been built by someone, and
you can actually build it again,
though it might not be as big, or as developed, or as modern
as the app that is being maintained,
because people get paid to do it.
But you can actually build something within your own
confines, within your own comfort,
and make it work.
You also learn the importance of user-friendly design and interactivity.
And though I didn't show it, I've tested the app before:
it can handle larger data sets, not just 47 columns.
It can take up to 30 columns and about 500 rows. For future enhancements,
I'm planning on adding
machine learning features.
So you can just come in, say you want to do random forest regression, you
can just click on random forest, just like the box plot, and then put in
the X and y, train and test, and then do your regression analysis. Or you
can do XGBoost, your extreme gradient boosting analysis, or your classification,
random forest classification, et cetera. I'm looking to add them in the near future.
And then I'll increase support for even data sets of up to a million.
But I think that should depend on the cloud, on the Streamlit cloud,
and how effective it's going to be.
I think this is the end of my
talk, the visualization and everything else, my slides.
If you have any questions, you can reach out to me via LinkedIn.
And if you have
advice or suggestions for me, you can reach out to me as well.
On a final note, I'd like to say thank you to Conf42 for allowing me to
display my app to people out there, and I want to say a huge thank you
to my family, and thank you all for actually watching the video and my talk.
See you next time.