Transcript
This transcript was autogenerated. To make changes, submit a PR.
Greetings, fellow gophers out there, and a warm welcome to
this lightning talk on craft your game with
Go. As I was prepping for this presentation, I was
looking for that aha moment or a punchline to
start off with. I have kind of decided that
I'll reserve it for the next slide.
A little bit of context setting here. How many of you have watched Moneyball
or Million Dollar Arm? How many
of you still recollect this classic sabermetrics scene from
Moneyball? A fun fact. These movies
cumulatively cost 140 million at the
box office and are based on sports analytics.
What's the big deal about sports analytics?
Largely, the measure of a professional sports team's
success has traditionally been pretty straightforward. The team
either wins or loses. But hey, there's a
lot more to it. What happens on
the playing field depends a lot
on what happens outside it, on what decisions are taken
outside it. I'm going to do a bit of a role play here,
a case study, and look at two specific
cases in the next couple of slides.
What is the analytics problem at hand?
These are rather simple cases, and we will see how graphs can
effectively represent them
in a very simple way. The first case I would look at
is related to player focus.
What makes a player tick? What makes a player
tick when it really matters? If I can plot a
quantitative relationship between
high-stakes games and performance, say,
the way a player plays in playoff games versus regular games,
and if I can effectively represent it and use it for my analytics,
that would be a boon. I want this identification
process to be very simple. I don't want to look
at elaborate slides, lots of data
analysis and data dumps. I want a concise, cheat-sheet
kind of representation. More specifically here, take
A-Rod, who used to play very well
in regular-season games but was a bit of a choker when it came
to the playoffs. A lot of very specific training
elements were incorporated by coaches to simulate pressure conditions, etc.,
so that he also ticks in the playoff games.
Let's look at how a graph can be used to represent this situation.
A very familiar structure, so to speak.
You have nodes, you have edges, and then you have relationships.
You have the main actors here represented by the nodes,
or vertices: the coach, A-Rod himself,
and the playoff games. The edges
actually represent the relationships: the analysis, the simulation,
and the actual action of playing. A
graph says a million words. The next
case I would look at is based on team focus.
Do you remember the Chicago Cubs' strategic victory in the 2016 World
Series? How they overcame Cleveland's strong pitching
and Corey Kluber's dominance?
So this was again a relationship analysis which was
done on how to tire
the opposition in this case, and on what effect
increased pitch counts in the beginning had.
Again, this can be effectively represented by
a graph as seen here,
like how the pitch count and the attrition drills by the
coach did play a relationship effectively in the gameplay.
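To make the node-and-edge idea from the two cases concrete, here is a minimal Go sketch of the player-focus graph. It is an illustration only: the type names, the node names ("Coach", "Player", "PlayoffGame") and the relationship labels are assumptions, not taken from the slides, and a real system would keep this data in a graph database rather than in memory.

```go
package main

import "fmt"

// Edge is a directed, labeled relationship between two named nodes.
type Edge struct {
	From, Rel, To string
}

// Graph stores outgoing edges in an adjacency list keyed by source node.
type Graph struct {
	out map[string][]Edge
}

// NewGraph returns an empty graph.
func NewGraph() *Graph {
	return &Graph{out: make(map[string][]Edge)}
}

// AddEdge records a relationship from one node to another.
func (g *Graph) AddEdge(from, rel, to string) {
	g.out[from] = append(g.out[from], Edge{From: from, Rel: rel, To: to})
}

// Neighbors returns the outgoing edges of a node in insertion order.
func (g *Graph) Neighbors(node string) []Edge {
	return g.out[node]
}

// PlayerFocusGraph builds the three-node example: a coach analyzes the
// player, simulates playoff pressure, and the player plays the game.
// All labels here are hypothetical.
func PlayerFocusGraph() *Graph {
	g := NewGraph()
	g.AddEdge("Coach", "ANALYZES", "Player")
	g.AddEdge("Coach", "SIMULATES", "PlayoffGame")
	g.AddEdge("Player", "PLAYS_IN", "PlayoffGame")
	return g
}

func main() {
	g := PlayerFocusGraph()
	for _, e := range g.Neighbors("Coach") {
		fmt.Printf("(%s)-[:%s]->(%s)\n", e.From, e.Rel, e.To)
	}
}
```

The team-focus case would follow the same shape, with nodes for the coach, the pitch count and the drills.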
Let's go on to a broader view of how
this all spans out in the current industry. The graphs we
saw are just like tiny drops in the ocean.
Knowledge graphs are now the norm for powerful
AI-based engines and analytics systems.
Powerful knowledge graphs are now becoming the norm in any
industry, so to speak. Let's now go down to
the underlying technology and
underlying terminologies used and the preferences of
concepts and notations. Why do we use Cypher QL?
Cypher QL is a very effective,
concise and text-based representation.
So it's in fact text-based visual art, so to speak.
So if you actually look at it in this particular
case, the relationship over here
is that a person lives in so-and-so city and
knows this person. So look at the way in which this graph is
effectively represented. It's truly very simple
and very clean to understand.
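As a rough illustration of that "text-based visual art", the sketch below builds the person, city and acquaintance pattern as a Cypher query string from Go. The labels (`Person`, `City`), relationship types (`LIVES_IN`, `KNOWS`) and the `name` property are assumptions for the example; the slide's actual query is not in the transcript.

```go
package main

import "fmt"

// Pattern renders one relationship in Cypher's ASCII-art style:
// (variable:Label)-[:REL_TYPE]->(variable:Label).
func Pattern(fromVar, fromLabel, rel, toVar, toLabel string) string {
	return fmt.Sprintf("(%s:%s)-[:%s]->(%s:%s)", fromVar, fromLabel, rel, toVar, toLabel)
}

// LivesInAndKnows returns a MATCH query for "a person lives in a city
// and knows another person". Labels and properties are hypothetical.
func LivesInAndKnows() string {
	return "MATCH " +
		Pattern("p", "Person", "LIVES_IN", "c", "City") + ", " +
		Pattern("p", "Person", "KNOWS", "q", "Person") + "\n" +
		"RETURN p.name, c.name, q.name"
}

func main() {
	fmt.Println(LivesInAndKnows())
}
```

The round brackets read as nodes and the arrow with square brackets reads as the relationship, which is why the query looks like a small drawing of the graph.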
So you want your representation to be not
confusing, not ambiguous. And I feel Cypher
QL is the perfect choice. There are other choices like
Gremlin, which can make it overly complex. But definitely
in terms of the visual art, the clarity with which the
nouns, the relationships,
the verbs associated with other nouns, so to speak,
are represented is very effective in the Cypher QL
format. Let's look at a 10,000-foot
view of how we want the solution landscape to be
here in a cloud native ecosystem.
At the very heart of it you would see
a Neo4j instance, a Neo4j
graph instance which is used for storing the graph data.
Then there is our favorite Go-based Lambda, the high-concurrency
and high-performance Go-based Lambda, which is
used to actually drive these analytics modules.
The Neo4j database instance would again be deployed in
an AWS cluster.
And then as far as the scalability goes,
there are some claims, at least
as per the case studies published on the Neo4j website, that it scales
like 1000x faster than your SQL for
greenfield development and building systems ground up.
Having looked at the 10,000-foot architecture and
again stressing on the usability, the actual use
case, this is going to serve sports
analytics users both on mobile and web browsers.
There are a host of other tools used here,
right from monitoring, storage, etc.,
also ensuring that the workflow does
authentication and role management, so to
speak. Going on to a few code
snippets. Hey, one thing in the previous slide
I forgot to mention: you would see a small lightning
bolt next to the Gopher. Bolt is the actual
driver used to connect to Neo4j and run
queries in Cypher QL in the Golang ecosystem.
Coming on to our favorite part, a few code snippets.
This is a simple code snippet for the data
definition language, or creating the model, using Neo4j.
A bunch of create and update statements are used for
representing teams, for representing players,
and what kind of relationship the player
had. And a few other attributes for the player
have been mentioned here. It's very simple. It's just boilerplate
snippets to give you the idea.
More ground-level code and Golang awesomeness.
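The slide's snippets are not captured in the transcript, so the following is only a guess at their general shape: a Go helper that assembles a parameterized Cypher statement creating a player and linking it to a team. The `Statement` type, the `Team` and `Player` labels, the `PLAYS_FOR` relationship and the property names are all hypothetical. The sketch deliberately stops short of calling a driver; the query text and parameter map would be handed to whichever Neo4j Bolt driver is in use.

```go
package main

import "fmt"

// Statement pairs a Cypher query with its parameters. Keeping values in
// parameters (rather than formatting them into the query) avoids
// injection and lets the database reuse the query plan.
type Statement struct {
	Query  string
	Params map[string]interface{}
}

// CreatePlayer builds a statement that ensures the team node exists,
// creates the player node and connects the two. All labels, the
// relationship type and the property names are assumptions.
func CreatePlayer(name, position, team string) Statement {
	return Statement{
		Query: "MERGE (t:Team {name: $team})\n" +
			"CREATE (p:Player {name: $name, position: $position})\n" +
			"CREATE (p)-[:PLAYS_FOR]->(t)",
		Params: map[string]interface{}{
			"name":     name,
			"position": position,
			"team":     team,
		},
	}
}

func main() {
	s := CreatePlayer("Sample Player", "Pitcher", "Sample Team")
	fmt.Println(s.Query)
	fmt.Println(s.Params)
}
```

`MERGE` is used for the team so that adding a second player to the same team reuses the existing team node, while `CREATE` always adds a new player.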
In the true spirit of simplicity of Golang, it takes just a few lines of
code to actually connect to the Neo4j
driver. Again, to reiterate, the name of the driver is
Bolt. Just some basic authentication is at play here,
and then a little more awesomeness here in terms of a basic
insert statement in order to insert data
corresponding to an item into
the database. This could be a player, it could be any entity per
se. Going further, a little bit of
deep dive. One of the sports my daughter
specifically likes is lacrosse. Lacrosse is like a fun
combination of badminton, butterfly catching
and, I would say, a bit of hockey in
some way. So a lot of graph-based
relationships can be used in lacrosse, is what I realized over a period of
time. And this is again a snippet which talks
about a few intricacies of
how some basic level of identification can be done
using relationships and Cypher QL. You would look at this
particular idiom here. It's a
low submarine shot and a rainbow
pass, which are pretty tactical moves in the lacrosse space.
And also you have a bunch of other paradigms
here like turnover and save.
That's again the Neo4j
model used to set up data in this regard.
There can be some advanced functionalities here which can be used
from the graph to
analyze in greater detail. One such
functionality, rather a mathematical
functionality, is cosine similarity. So using
a graph database enables us to calculate and compare cosine
similarity between two nodes. In this way,
one can compare opponents, games, players and other nodes against
each other to better understand strengths and weaknesses
and how to make the appropriate adjustments.
With 30 teams in the major leagues, there are bound to be programs
that play similarly to each other from a data science perspective in
aspects of the games that may include type of relief pitching
or stolen base percentages. Evaluation of
your opponent based on similarity allows for anticipating given game
scenarios, which leads to better practice and strategy.
More graph-based concepts. You have
a bunch of correlations and indices from the
graph world which can again be combined effectively
with Go, as demonstrated by the few Go snippets,
to come up with advanced analytics.
One more aspect shows the significance
of why sports betting is such a big deal, and
analytics is the backbone of sports betting.
So daily fantasy sports
was red hot in 2014. About 1.5
million Americans paid more than a billion dollars in
tournament entry fees, and FanDuel grew about 300%
in active customers. So
there are millions and millions of dollars pumped in by high-stakes
companies like ESPN and Disney in
the entire betting world. And
Golang, with all its awesomeness, can be used in this ecosystem.
A few useful links in this regard have
been mentioned here in terms of the repository used and the code snippets
which can be used for further analysis.
Finally, to wrap this up, a retrospective on what
this presentation was meant to be and what this was not meant to be.
This presentation is a blueprint, a thought provoker on how sports analytics
can be clubbed with the awesomeness of the Go ecosystem
to achieve advanced analytics. There is an indirect
emphasis on using a particular tech stack, like sticking to
Cypher QL, which is a more easily representable
format for graphs, and then also teasing the
thought of how this can power knowledge graphs and AIs.
And maybe on the other end of the spectrum, this could also be used to
power tiny devices, like tiny graphs
on TinyGo. Who knows? That's it for
this presentation for now. Hope you liked it.
Please do drop me a note with your feedback. Thank you so much.