Transcript
This transcript was autogenerated. To make changes, submit a PR.
This talk is about the application of blockchain in crowdsourcing
data. What are the problems in data
sourcing? The world is changing around us all
the time. Companies are formed,
others become smaller and go bankrupt.
New technologies come to light,
governments come into power either through democratic
means or otherwise, and government policies
are changing all the time. We live
in a very turbulent time right now due to coronavirus
and the conflict in Ukraine.
Things that we thought were going to stay
the same are changing in a rapid and
massive way. New data sources
are coming online and evolving all the time.
Due to the changing world, we have more and more data
becoming available, and it is difficult to keep up.
The amount of data is growing massively: 2.5 quintillion
bytes are generated every day.
Also, each data source has its own design,
its own format or language, and the culture
it is embedded in might be different.
It might operate under different laws and require different regulations,
which can be very hard to keep up with.
All this change seems to be affecting the world
on a massive scale. Some data sources
are resistant to quantification. Scientists,
engineers, and analysts love data that
is uniform and quantifiable, and that
is easy to enrich and transform.
They love standard data formats, which are important in practice.
For example, when calculating inflation, you need
to know what category a particular product is in,
how to convert units into a standard format,
and what currency it is measured in.
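The inflation example above can be sketched in code. This is a minimal illustration, not the speaker's implementation: the `PriceObservation` record, the conversion tables, and the exchange rates are all assumptions made up for the example. It shows why category, unit, and currency all have to be known before two price observations can be compared.

```python
from dataclasses import dataclass

# Hypothetical conversion tables; real values would come from a maintained source.
GRAMS_PER_UNIT = {"g": 1.0, "kg": 1000.0, "lb": 453.59237, "oz": 28.349523125}
USD_PER_CURRENCY = {"USD": 1.0, "EUR": 1.10, "GBP": 1.25}  # assumed example rates


@dataclass
class PriceObservation:
    product: str
    category: str    # e.g. "dairy"
    price: float     # price as displayed by the source
    currency: str    # currency code of the displayed price
    quantity: float  # package size as displayed
    unit: str        # unit of the package size


def usd_per_kg(obs: PriceObservation) -> float:
    """Convert one observation to a comparable price in USD per kilogram."""
    kilograms = obs.quantity * GRAMS_PER_UNIT[obs.unit] / 1000.0
    usd = obs.price * USD_PER_CURRENCY[obs.currency]
    return usd / kilograms


def category_inflation(old: list[PriceObservation],
                       new: list[PriceObservation]) -> dict[str, float]:
    """Percentage change of the average normalized price per category."""
    def averages(observations):
        grouped: dict[str, list[float]] = {}
        for obs in observations:
            grouped.setdefault(obs.category, []).append(usd_per_kg(obs))
        return {cat: sum(values) / len(values) for cat, values in grouped.items()}

    before, after = averages(old), averages(new)
    # Only categories observed in both periods can be compared.
    return {cat: (after[cat] / before[cat] - 1.0) * 100.0
            for cat in before if cat in after}


last_year = [PriceObservation("cheddar", "dairy", 10.0, "USD", 1.0, "kg")]
this_year = [PriceObservation("cheddar", "dairy", 5.5, "USD", 500.0, "g")]
print(category_inflation(last_year, this_year))
```

Without the unit conversion, the second observation would look like a price drop from 10.0 to 5.5, while the normalized price per kilogram has actually gone up.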
If you are at the scale of Google, everyone will try to serve
you. It is different if you are at the scale of a startup.
A large established corporation such as Google has
a massive scale and a near-monopoly grip on the web,
which means everybody's trying to serve them.
Everybody's trying to adapt their websites
to make it easier for the Google algorithm to extract
the data and to rank the website higher than
their competitors. Everyone is trying to
produce content that is suitable for
the Google algorithm. However, if you
are a small startup, your challenges are going to be much different.
Nobody is going to care to transform the data for you.
That means this is going to be your own responsibility. Your resources
are limited, so your challenge is to do it
at any scale.
So what are the advantages of
crowdsourcing data collection?
Naturally, the first advantage is that you can scale
it up easily. As long as you can incentivize
the crowd to contribute to your project,
the number of data collectors and analysts can be nearly
limitless. This is like a
cloud service provider for computing or storage
services. You can also discover new
means to overcome resistance to data quantification.
A lot of organizations have data silos.
They do not want to necessarily make them available because it
can reduce their competitive advantage.
However, to interact with individuals such as
their customers, they do have to share small
parts of that data set, at least. Some
have implemented automated data scraping protection, which is curious
considering that decades ago, when Google started
scraping without much permission, there was no such thing.
These days everybody knows that Google brings customers
over. However, for others, data access
is often restricted to protect their competitive advantage.
When crowdsourcing this data,
this type of protection can be defeated on several
levels because there is no way to share your data
with the customer while protecting it as well.
Another approach an organization can
take is rapidly modifying its user
interface: changing where things appear,
how they look, and how they are presented.
However, when dealing with a crowd, this protection is no longer viable
because an individual can always understand the interface and
help quantify the data into an easily understandable
format. Sometimes data is also
made available to others through APIs and
interfaces. However, due to the large number of different
interfaces and the lack of standards, it can be difficult
to find the resources to integrate them, and this is where
crowdsourcing data can also help. It can
also enable data analysis at a scale
and depth that is not viable for a small organization.
After gathering raw data, it is important to
perform analysis to convert it into useful
knowledge. Raw data is only as valuable
as the insights you can obtain by analyzing it.
A smaller organization can have difficulty
performing a wealth of different types of analysis
because for each analysis you might need
a different skill set. You might need a different type of expert,
which can be very expensive: you either contract them
for a short time at a high rate, or hire
somebody permanent, who takes time to train and
incorporate into the organization.
But you can also crowdsource analysis if you provide an
incentive. It means that the expert can perform
a small step toward achieving a result,
and you can share whatever benefits come from the analysis with
that person to incentivize them. This means you
don't necessarily have to come up with sufficient capital
in advance to contract
the expert. You can also collect data from online
and offline sources. A lot of the data is online,
and more and more of it becomes available.
However, no matter how effective the collection
and upload means are,
there is always more information
available offline. Data is also often available offline
first, before it
becomes available online. When crowdsourcing
the data, you can use the wisdom of the crowd to
note offline events and incentivize
the crowd to share those events as
they happen. This way you can enable functionality
that works more rapidly than it could without
a crowd. What are the
differences between centralized and decentralized
systems? Centralized systems can be less
complicated to build and to control.
However, they are opaque. They are not tolerant
to attacks and can be less secure.
They can be less scalable and have a central
point of failure, at the very least the organization that created
them. Decentralized systems
can be more scalable and fault tolerant, and can
be more secure. As all nodes are treated fairly,
decentralization eliminates the need for intermediaries
and is more transparent.
So how decentralized are blockchains?
There is no such thing as a perfectly decentralized system.
The Nakamoto coefficient was invented to attempt to
quantify the level of decentralization of a blockchain
system. It takes into account various parameters
such as how many active wallets there are, how many developers there are,
how many nodes are active, and so on.
It might not be a perfect measure,
because Bitcoin has three large mining
pools that in principle could coordinate what's
called a 51% attack. However, its Nakamoto
coefficient is the highest. An attack like that would result
in Bitcoin's price dropping and the
value of their mining work being reduced, and as such,
there is no incentive to do so.
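As a rough illustration of the idea, the Nakamoto coefficient for a single subsystem (for example, mining power) is commonly described as the smallest number of entities that together control more than half of it. The sketch below assumes that single-subsystem definition; the talk mentions several parameters, which would each need their own calculation. The pool names and shares are hypothetical numbers, not real measurements.

```python
def nakamoto_coefficient(shares: dict[str, float], threshold: float = 0.5) -> int:
    """Smallest number of entities whose combined share exceeds `threshold`."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("shares must sum to a positive value")
    running = 0.0
    # Add up the largest holders first until they pass the threshold.
    for count, share in enumerate(sorted(shares.values(), reverse=True), start=1):
        running += share
        if running / total > threshold:
            return count
    return len(shares)


# Hypothetical hash-rate shares, for illustration only.
pools = {"pool_a": 30.0, "pool_b": 25.0, "pool_c": 15.0,
         "others_1": 20.0, "others_2": 10.0}
print(nakamoto_coefficient(pools))  # 2: the two largest pools hold 55%
```

A higher number means more independent parties would have to collude, which is the sense in which a higher coefficient indicates more decentralization.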
What is a blockchain? A blockchain is a decentralized
ledger which is implemented as a
series of blocks. Each block contains a number
of transactions and refers to the previous block
by its hash. This ensures that the order of
transactions can always be verified.
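The block structure described above can be sketched in a few lines. This is a simplified model for illustration: the field names and the JSON-based hashing are assumptions for the example, and real blockchains use more elaborate block headers.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash the block contents deterministically with SHA-256."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: list[dict], transactions: list[str]) -> None:
    """Add a block that points to the hash of the current last block."""
    previous_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({
        "index": len(chain),
        "previous_hash": previous_hash,
        "transactions": transactions,
    })


def is_valid(chain: list[dict]) -> bool:
    """Check that every block references the hash of the block before it."""
    return all(
        chain[i]["previous_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain: list[dict] = []
append_block(chain, ["alice pays bob 5"])
append_block(chain, ["bob pays carol 2"])
append_block(chain, ["carol pays dave 1"])
print(is_valid(chain))  # True

chain[1]["transactions"] = ["bob pays mallory 2"]  # tamper with history
print(is_valid(chain))  # False: the third block no longer matches the second
```

Because each block commits to the hash of its predecessor, changing any earlier transaction breaks every later link, which is what makes the order verifiable.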
Modern blockchains also support having smart
contracts, which can run code based on the conditions
in the blockchain and can interact with the outside
world through interfaces called oracles.
For example, you could set up a smart contract that
pays somebody cryptocurrency into their account every
month, like a loan.
Or you could set up a system where people play roulette
and this would be more transparent than a traditional casino website,
as you could verify everything on the blockchain.
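To make the monthly payment example more concrete, here is a toy simulation of the contract logic in Python. It is only a sketch under stated assumptions: real smart contracts are written in contract languages such as Solidity, the class and method names here are hypothetical, and on chains such as Ethereum a contract typically runs only when a transaction calls it, which the `trigger` method stands in for.

```python
class MonthlyPaymentContract:
    """Toy model of a contract that releases a fixed amount once per period."""

    def __init__(self, payer_balance: int, amount: int,
                 period_seconds: int, start_time: int):
        self.payer_balance = payer_balance
        self.recipient_balance = 0
        self.amount = amount
        self.period_seconds = period_seconds
        self.next_due = start_time + period_seconds

    def trigger(self, block_time: int) -> int:
        """Pay every instalment due at `block_time`; return how many were paid."""
        paid = 0
        # Catch up on all due instalments while the payer can still afford them.
        while block_time >= self.next_due and self.payer_balance >= self.amount:
            self.payer_balance -= self.amount
            self.recipient_balance += self.amount
            self.next_due += self.period_seconds
            paid += 1
        return paid


MONTH = 30 * 24 * 3600  # simplified month length in seconds (assumption)
contract = MonthlyPaymentContract(payer_balance=250, amount=100,
                                  period_seconds=MONTH, start_time=0)
print(contract.trigger(MONTH - 1))  # nothing is due yet
print(contract.trigger(MONTH))      # the first instalment is paid
print(contract.recipient_balance)
```

The rules are fixed in code and the balances are public state, which is the transparency the talk contrasts with a traditional website.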
Blockchain can also incorporate
actual legal contracts with people, which are
called Ricardian contracts. However, these are largely
unrecognized to date.
Blockchain requires a consensus algorithm to
make sure that we know which data is valid.
For example, Bitcoin uses longest
chain consensus, which means that everybody
is trying to build on the longest chain to make
sure their work is included.
Bitcoin uses a proof of work consensus mechanism,
which is very energy intensive.
Alternative consensus mechanisms such as
proof of stake have been implemented to solve that problem
on other blockchains such as Cardano.
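A toy version of proof of work helps show where the energy goes. The sketch below is a simplification: it looks for a SHA-256 digest with a given number of leading zero hex digits, whereas Bitcoin compares a double SHA-256 hash of the block header against a numeric target. The function names and the block data string are made up for the example.

```python
import hashlib


def mine(block_data: str, difficulty: int) -> tuple[int, str]:
    """Search for a nonce whose SHA-256 digest starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}:{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def verify(block_data: str, nonce: int, difficulty: int) -> bool:
    """Verification needs a single hash, while mining needed many attempts."""
    digest = hashlib.sha256(f"{block_data}:{nonce}".encode("utf-8")).hexdigest()
    return digest.startswith("0" * difficulty)


nonce, digest = mine("block 1: alice pays bob 5", difficulty=4)
print(nonce, digest)
print(verify("block 1: alice pays bob 5", nonce, difficulty=4))  # True
```

Each extra zero digit multiplies the expected number of attempts by 16, so the work (and the electricity) grows quickly with difficulty while checking a result stays cheap.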
What is blockchain good at? It is good
at incentivizing people, which means
giving them something of value, often of monetary value.
By providing small incentives for actions done
on the blockchain, you can encourage people to work
together with you. A lot of cryptocurrencies
used on the blockchain are deflationary,
which means over time they are expected to appreciate in value.
That can be a powerful long-term incentive for
some investors of sweat equity.
It is good at determining the order of things.
On a blockchain, all transactions
in a block have to be ordered. It is easy
to determine if someone, for example,
provided a piece of information before somebody else,
and to make sure the first person to do so gets a
reward. It is good at anonymity or
pseudonymity. Most blockchains are pseudonymous,
which means that you do not know who owns a
particular wallet, but normally you can track the
transactions from one address to another.
You can also preserve anonymity by, for example,
generating a new address for each transaction.
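The ordering and pseudonymity points can be combined in one small sketch: given submissions identified only by an address and by their position on the chain, the earliest submitter of each piece of data can be found without knowing who anyone is. The `Submission` record and the addresses below are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    block_height: int  # position of the block in the chain
    tx_index: int      # position of the transaction inside the block
    address: str       # pseudonymous address, not a real-world identity
    data_key: str      # identifies the piece of information submitted


def first_submitters(submissions: list[Submission]) -> dict[str, str]:
    """Map each data key to the address that submitted it earliest on chain."""
    winners: dict[str, str] = {}
    ordered = sorted(submissions, key=lambda s: (s.block_height, s.tx_index))
    for sub in ordered:
        # setdefault keeps the first (earliest) address seen for each key.
        winners.setdefault(sub.data_key, sub.address)
    return winners


submissions = [
    Submission(102, 0, "addr_b", "milk-price-2022-03"),
    Submission(101, 4, "addr_a", "milk-price-2022-03"),
    Submission(101, 7, "addr_c", "bread-price-2022-03"),
]
print(first_submitters(submissions))
```

Here `addr_a` would receive the reward for the milk price because its transaction sits in an earlier block, regardless of the order in which the records were read.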
It is good at artificial scarcity. It is a
way to make something scarce by design.
For example, Bitcoin has a limited supply and therefore,
as demand goes up, the price of Bitcoin goes
up as well. It is easy to make a token that
is artificially scarce on a blockchain, and
you can also create non-fungible tokens, which
represent unique things like pieces of art,
digital or otherwise.
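Scarcity by design comes down to a rule in the token's code that refuses to create more units than a fixed cap. The following is a minimal Python model of that rule, assuming a hypothetical `CappedToken` class; real tokens are implemented as smart contracts on the chain itself.

```python
class CappedToken:
    """Toy fungible token whose total supply can never exceed `max_supply`."""

    def __init__(self, max_supply: int):
        self.max_supply = max_supply
        self.total_supply = 0
        self.balances: dict[str, int] = {}

    def mint(self, to: str, amount: int) -> None:
        """Create new tokens, refusing any mint that would break the cap."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        if self.total_supply + amount > self.max_supply:
            raise ValueError("mint would exceed the supply cap")
        self.total_supply += amount
        self.balances[to] = self.balances.get(to, 0) + amount

    def transfer(self, sender: str, recipient: str, amount: int) -> None:
        """Move existing tokens; transfers never change the total supply."""
        if amount <= 0 or self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient balance or invalid amount")
        self.balances[sender] -= amount
        self.balances[recipient] = self.balances.get(recipient, 0) + amount


token = CappedToken(max_supply=1_000)
token.mint("alice", 900)
token.transfer("alice", "bob", 100)
try:
    token.mint("carol", 200)  # 900 + 200 would exceed the cap of 1,000
except ValueError as error:
    print(error)
```

Since the cap is enforced by code that everyone can inspect, holders do not have to trust an issuer's promise not to create more tokens.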
It is also good at decentralization, as
it can easily scale. If the incentives are set
up right, the number of nodes can increase as
the demand for the network increases. What is
blockchain lacking? It is difficult
to store large amounts of data on the blockchain because
a copy of the data needs to be available on
every node that is participating in the network.
There are attempts to solve this, and some blockchains, such as Filecoin,
are storing large amounts of data.
Also, enabling queries of the data is not yet
a fully solved problem, and a lot of blockchains are
integrating with centralized systems for the storage
of data. You also cannot make
data changes easily, as each
change is preserved and as such is costly.
The trilemma of blockchain challenges
has been described by Vitalik Buterin,
the founder of Ethereum. It means that from three
parameters, decentralized, scalable, and
secure, you can only have two.
Multiple blockchains have made attempts to solve it,
or at least find a workable middle ground. For example,
Polkadot has a small number of validator nodes that are
elected, and as such it can be very fast but
is not very decentralized.
What are the applications
of blockchain in crowdsourcing data? To obtain
the data, first, we can
incentivize people by providing them with a token that
ideally would appreciate in value over time,
so they would be interested in investing their time and effort so
they can obtain more of this token. Also, the blockchain
ensures the security of this reward and enables
trading it in for money if necessary.
We can determine who owns what piece of work.
We can use blockchain to determine who has performed what
work or provided what data first. This can enable people
to easily obtain proof, and therefore
incentivize people who are pioneering a particular type
of task. We can enable on-chain governance.
Blockchain is a good means of governance,
as you can prove who owns a particular number of votes.
The only drawback is that blockchain voting
is currently public only, so to implement private
voting you would need to have new technologies
such as zero-knowledge proofs.
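A simple way to picture on-chain governance is a token-weighted vote, where each address's vote counts in proportion to the tokens it holds. The sketch below assumes that weighting scheme (other schemes exist) and uses hypothetical addresses and balances. Note that every vote here is visible to everyone, which matches the public-only limitation just mentioned.

```python
from collections import defaultdict


def tally(balances: dict[str, int], votes: dict[str, str]) -> dict[str, int]:
    """Sum token balances behind each option; addresses without tokens add nothing."""
    totals: dict[str, int] = defaultdict(int)
    for address, option in votes.items():
        totals[option] += balances.get(address, 0)
    return dict(totals)


def winner(totals: dict[str, int]) -> str:
    """Return the option with the most weight (ties go to the alphabetically first)."""
    return max(sorted(totals), key=lambda option: totals[option])


# Hypothetical token balances and public votes, for illustration only.
balances = {"addr_a": 60, "addr_b": 30, "addr_c": 10}
votes = {"addr_a": "yes", "addr_b": "no", "addr_c": "no", "addr_d": "no"}

totals = tally(balances, votes)
print(totals)
print(winner(totals))
```

In this example three addresses vote "no", but "yes" carries more tokens, which shows how token-weighted voting differs from one-person-one-vote.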
Decentralized autonomous organizations have been
around for a while now and are using on-chain
governance for disrupting the work of executives.
Choosing a blockchain. Ethereum
was the first blockchain to implement
smart contract functionality.
Its founder Vitalik actually tried to
implement smart contracts on Bitcoin first,
but his efforts were rejected and as such
he created his own blockchain. Ethereum is the oldest
one and has the best community, documentation,
and infrastructure to be able to learn more about
programming blockchain. It uses the Solidity
language, which is a standard for most other blockchains.
The drawback of this blockchain is that it is very expensive to make
transactions on it due to scalability challenges.
There are hopes that the next version of Ethereum is going to address these
issues, but this has been promised for a while.
SushiSwap is one of the most popular decentralized exchanges
on Ethereum. Ethereum is also the most
popular platform for non-fungible tokens.
Binance Smart Chain is a fork of
Ethereum that has much lower fees; however, it's much
more centralized, as it relies on nodes that are
run by Binance. A lot of projects use
Binance Smart Chain due to its low fees and
trade on decentralized exchanges such as
PancakeSwap. Solana is
one of the fastest blockchains and a very popular one
as well. It has a very high transactions-per-second rate compared
to others. However, because of its speed and low cost,
it has been attacked and taken down several times by
denial-of-service attacks.
Cardano is my favorite blockchain because it takes
a different approach towards development. They aim
to write academic papers first and only
develop code after the papers are reviewed.
They also employ mathematical proofs to make sure that they
have a good quality of their code
compared to other blockchains like Ethereum and Solana.
As a result, it is a very safe blockchain
with no known exploits to date because
of this approach. However, they have recently been dealing with
some congestion problems after the first popular
decentralized exchange on Cardano,
called SundaeSwap, was launched. This has been
somewhat improved by tweaking the parameters and
hopefully will be improved later with further scalability
advancements such as sharding. Filecoin
is a good blockchain to store a lot of data.
It is somewhat similar to S3 buckets on AWS.
Chainlink is a decentralized oracle network.
It integrates with other blockchains to enable their
smart contracts to obtain data from the offline
or centralized world. For example,
you can make a database query or request
information from a web API to
get the price of Bitcoin. This is only a review
of the top blockchains to work on. New blockchains are always
coming online and they implement new functionality
as well, so there's plenty to watch
out for. Where can you learn more?
Naturally, you can start with the documentation of
the specific blockchains such as
Ethereum. However, once you advance a bit further,
EMURGO provides courses on a variety of technical
and non-technical topics in the blockchain
world, and you can obtain certification if you complete one
of the courses. Buildspace is
also a good place to learn to code on the blockchain. While doing it,
you will also earn non-fungible tokens
for each course you complete on time.
Crowdflation is an open source project building a
decentralized autonomous organization on the blockchain
to crowdsource data from people and build an alternative
inflation index. Participants are rewarded
with a cryptocurrency token. Thank you all for
listening and for making this talk
happen.