Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello everyone, my name is Vladi. Today we'll talk about why and how top
performing apps like Figma and Notion have leveraged the power of real time
collaboration, and give you a quick overview of how you can also
make your Node.js apps collaborative very quickly
and with a lean and open source tech stack.
When I think of modern applications, I have certain expectations in
mind. More specifically, in the formula for creating a modern application,
I would expect to add certain ingredients. For example,
the application would need to be instantly available anywhere
in the world, accessible via the web browser or your mobile device,
optimized for user experience, making it effortless to perform the
task you want to complete. And for those of us who work in distributed
teams, whether because of geographic distribution,
preferred work style, or because of the size of your organization,
you may also expect to see other functionalities:
the ability to share your work with your team, the ability to
get direct feedback without switching to another app, and the
ability to collaborate on the same project.
It's surprising that not all of these expectations are being met
by modern applications today,
especially when it comes to collaboration and
especially knowing that the capabilities are out there and they have
been an integral part, for example, of the gaming industry for several
years. Before I continue, a quick introduction:
I'm the community director for Multiplayer.
My background is in community building and customer service in the tech
space, and I was fortunate to work for other user centric
companies like Prisma and MongoDB in my previous roles.
I hope I don't disappoint too many of you when I say that Multiplayer
is not an MMORPG, a massively multiplayer online role
playing game. Multiplayer is a collaborative SaaS platform that aims
to help teams working on large scale or complex distributed
systems, with numerous components, interactions and dependencies,
to effortlessly align within their team and with other stakeholders (PMs,
Ops, QA) on how to visualize, design and manage
those systems. This need for effortless
alignment and real time visual collaboration is so important to
us that it even inspired our name, and it led us to spend a
lot of time researching, investigating and trying out how
to build collaborative features in our app using a
lean and OSS focused tech stack. Today,
I wanted to share our experience and learnings in adding real time collaborative
features and maybe spark some ideas on how you can also
add these features to your own apps.
Real time collaboration is now table stakes, and that's also due to the
benefits it brings to the user experience. Think back
to how teams used to work. Everyone worked individually on the same project
but in silos and then they would exchange that
project file back and forth with their individual additions and changes,
and somebody would have the dubious pleasure of reconciling, if at
all possible, the work of the entire team and producing the
final, final version of the document. Now compare
that to the experience of working together in a single shared space,
side by side, virtually using a common language and
framework, you can align, communicate, and drive consensus
effortlessly. It's not a coincidence that Figma,
Miro, Canva, Notion, et cetera, have been so popular and successful
in the past few years. And beyond them, many SaaS businesses,
even companies that haven't traditionally thought of themselves as collaboration
first, have embraced this trend. In fact,
besides empowering effortless team collaboration, another big
benefit of collaborative products is that they are virality machines.
For businesses, the ability to invite and involve
more stakeholders in a project ensures, one,
that more users use your product, and two, a higher
adoption and retention rate, because when distributed
teams are able to achieve more together, your software becomes
a key part of how they work. So let's start
by pinning down a bit better what we mean when we say
real time collaborative tech. There have been
many attempts to coin a term to describe this trend,
including multiplayer collaboration with a lowercase m,
deep collaboration, and collaborative enterprise.
We consider an application to be real time collaborative when it has
these four functionalities. The first one is real
time updates: a change made by one user must
propagate to all other users instantly, or as
near to instant as the Internet allows.
The second is live playback: each user should be able to watch in
real time precisely what their collaborators see on their side,
right down to watching each person's cursor move about their screen.
The third is presence and status: an easy way
to see who is online and available to work.
And the final one is data integrity.
With data and changes coming in from multiple people at once,
there is a risk of conflict. And preventing data loss
from conflicts is both a UX and a backend challenge,
as we'll see in this presentation.
Now that we have aligned on the definition,
let's talk about how you would go about concretely
building a real time collaborative product.
So the first approach that might come to mind is to extend
an existing RESTful API and simply broadcast request and
response messages to all connected clients using websockets.
This works fine for changes that don't conflict. For example, both clients
agree on the logo design, but it leads to bad things
when multiple clients try to change the same thing at the same
time, for example, they want to propose different logo
designs. As another example, imagine typing in
a document, but the words show up out of order or individual characters
are in odd places. That's because more than
one person is typing in the same place on the document at
the same time, but you don't have good conflict resolution strategies,
and indeed, handling concurrent editing in a multi user environment
gracefully is very challenging.
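To make the out-of-order typing problem concrete, here is a minimal sketch of what goes wrong when position-based edits are simply broadcast and applied as they arrive. The `Insert` type, `applyInsert`, and the two clients are hypothetical names used only for illustration; nothing here is taken from a real product.

```typescript
// A position-based insert, as a naive "broadcast every request" design would send it.
type Insert = { pos: number; text: string };

function applyInsert(doc: string, op: Insert): string {
  return doc.slice(0, op.pos) + op.text + doc.slice(op.pos);
}

// Both clients start from the same document and type at the same moment.
const base = "ac";
const fromAlice: Insert = { pos: 1, text: "b" }; // Alice expects "abc"
const fromBob: Insert = { pos: 2, text: "d" };   // Bob expects "acd"

// Each client applies its own edit first, then the edit that was broadcast to it.
const aliceDoc = applyInsert(applyInsert(base, fromAlice), fromBob);
const bobDoc = applyInsert(applyInsert(base, fromBob), fromAlice);

console.log(aliceDoc); // "abdc"
console.log(bobDoc);   // "abcd"
```

The two clients end up with different documents because Bob's position was computed against a document that no longer exists on Alice's side. The position alone is not enough; the context in which it was produced matters too.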
In order to merge conflicting changes in a way that makes sense to the
end users, you have to share not just the value
of the change, but the context of the data and the state of the client
making the change. Luckily, there are
some great technologies that come to the rescue. The first one
is operational transformations (OTs), the second
one conflict-free replicated data types (CRDTs),
among others. Operational transformations are
an algorithm for the transformation of operations such that
they can be applied to documents whose states have diverged,
bringing them both back to the same state. Conflict-free
replicated data types, as the name suggests,
provide several data types for handling changes and resolving conflicts
from multiple clients operating on the same data at the same time.
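As a rough illustration of the operational transformation idea, the sketch below transforms one concurrent insert against another so that both clients converge. It covers only text inserts, and the `site` tie-breaker and function names are assumptions made for this example; production OT systems handle deletes, histories, and server ordering as well.

```typescript
type Insert = { pos: number; text: string; site: number };

function applyInsert(doc: string, op: Insert): string {
  return doc.slice(0, op.pos) + op.text + doc.slice(op.pos);
}

// Rewrite `op` so it can be applied after `other` has already been applied.
// If `other` inserted before our position, our position shifts to the right.
// Equal positions are ordered by site id so both sides make the same choice.
function transform(op: Insert, other: Insert): Insert {
  const otherGoesFirst =
    other.pos < op.pos || (other.pos === op.pos && other.site < op.site);
  return otherGoesFirst ? { ...op, pos: op.pos + other.text.length } : op;
}

const base = "ac";
const fromAlice: Insert = { pos: 1, text: "b", site: 1 };
const fromBob: Insert = { pos: 2, text: "d", site: 2 };

// Each client applies its own edit, then the remote edit transformed against it.
const aliceDoc = applyInsert(applyInsert(base, fromAlice), transform(fromBob, fromAlice));
const bobDoc = applyInsert(applyInsert(base, fromBob), transform(fromAlice, fromBob));

console.log(aliceDoc, bobDoc); // "abcd" "abcd"
```

Both clients now reach the same text even though they applied the edits in a different order.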
And to be clear, you could still use websockets,
but it's just a lot more work and time. Also,
depending on your problem space and the desired behavior for different
states, you may find that you'll need a custom approach inspired
by these two traditional methods. For example,
if you're a startup and value the ability to ship features quickly,
OTs, the technology used for Google Docs, might
be too complex. Likewise, you might get inspired by
multiple separate CRDTs and use them to create the final
data structure that best represents your document,
similar to the approach that Figma took. Today,
we will focus on the technology that we use at Multiplayer and
that we feel offers the simplest, fastest, and most powerful approach
to implementing collaborative features: CRDTs. There
are two approaches to CRDTs: operation based CRDTs,
or commutative replicated data types, and state based
CRDTs, or convergent replicated data types.
Both can provide strong eventual consistency,
and this means that even if clients drift because of short term
connection issues or suffer from high latency, the data
on all connected clients will eventually resolve to the same final state.
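A small state-based example may help show why convergence holds. The grow-only counter below is a minimal sketch with hypothetical names (`GCounter`, `increment`, `merge`, `value`). Each replica only increments its own slot, and merging takes the per-replica maximum, so merges can happen in any order, any number of times, and still give the same result.

```typescript
// State-based grow-only counter: one slot per replica.
type GCounter = Record<string, number>;

function increment(counter: GCounter, replicaId: string, by = 1): GCounter {
  return { ...counter, [replicaId]: (counter[replicaId] ?? 0) + by };
}

// Per-replica maximum: commutative, associative and idempotent.
function merge(a: GCounter, b: GCounter): GCounter {
  const out: GCounter = { ...a };
  for (const id of Object.keys(b)) {
    out[id] = Math.max(out[id] ?? 0, b[id] ?? 0);
  }
  return out;
}

function value(counter: GCounter): number {
  return Object.keys(counter).reduce((sum, id) => sum + (counter[id] ?? 0), 0);
}

// Two replicas drift apart while disconnected...
let phone: GCounter = {};
let laptop: GCounter = {};
phone = increment(phone, "phone");
phone = increment(phone, "phone");
laptop = increment(laptop, "laptop");

// ...and converge once they exchange state, whichever side merges first.
const mergedOnPhone = merge(phone, laptop);
const mergedOnLaptop = merge(laptop, phone);
console.log(value(mergedOnPhone), value(mergedOnLaptop)); // 3 3
```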
CRDTs are really a collection of simple algorithms, such
as last-write-wins registers, grow-only counters, positive-negative
counters, grow-only sets, two-phase sets, and
sequence CRDTs, among others.
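To give a feel for how small these building blocks are, here is a sketch of a last-write-wins register. The field names and the tie-break on `replicaId` are assumptions chosen for this example; real implementations often use logical clocks rather than wall-clock timestamps.

```typescript
// Last-write-wins register: the write with the highest (timestamp, replicaId) pair wins.
interface LwwRegister<T> {
  value: T;
  timestamp: number;
  replicaId: string;
}

function write<T>(value: T, timestamp: number, replicaId: string): LwwRegister<T> {
  return { value, timestamp, replicaId };
}

function mergeLww<T>(a: LwwRegister<T>, b: LwwRegister<T>): LwwRegister<T> {
  if (a.timestamp !== b.timestamp) return a.timestamp > b.timestamp ? a : b;
  // Ties are broken by replica id so that every replica picks the same winner.
  return a.replicaId > b.replicaId ? a : b;
}

const fromAlice = write("Logo v1", 10, "alice");
const fromBob = write("Logo v2", 12, "bob");

// The merge order does not matter: both replicas keep the later write.
console.log(mergeLww(fromAlice, fromBob).value); // "Logo v2"
console.log(mergeLww(fromBob, fromAlice).value); // "Logo v2"
```

The trade-off is that the losing write is silently discarded. That is acceptable for a single field like a title or a color, but not for long text, which is where sequence CRDTs come in.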
We chose a popular OSS CRDT implementation called
Yjs. It provides data structures like docs,
maps and arrays that map easily to regular JavaScript data
types. So when we designed our app, we didn't need to radically
refactor things to use Yjs. There are
also other benefits to Yjs. It has a huge number of
ready open source integrations: code editors, whiteboard apps,
rich text editors. Any data structure
can be supported with Yjs shared types.
It also allows for painless client reconnects without losing
client progress. It's network agnostic, it offers
very easy to use awareness features out of the box, and it
also has a very big community. One thing
to keep in mind is that using CRDTs and Yjs
isn't just about data types. You also need to
think about your system architecture. It's possible
to use CRDTs in a peer to peer configuration.
However, because we provide features like snapshotting,
we decided to incorporate a central peer in our system to
resolve conflicts and to provide a definitive source
of truth for our data. Using the
example from the hard way, let's say that both clients are working on
the same logo. Each one makes specific
changes. For example, one client changes the color of one
square, while the other client changes the color of the other
squares and their order. Those changes would
be propagated to the central peer, which uses the
Yjs library and would resolve any conflicts
so that there is no drift between the states. In other words,
everyone is looking at the same version of the logo.
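The logo scenario can be sketched without any library to show the shape of a central peer. This is not how Yjs works internally and it is not Multiplayer's code. It is a simplified stand-in that uses one last-write-wins entry per property, with hypothetical `CentralPeer`, `Client`, and `mergeState` names.

```typescript
// One last-write-wins entry per property of the shared logo.
interface Entry {
  value: string;
  clock: number;
  clientId: string;
}
type LogoState = Record<string, Entry>;

function mergeState(a: LogoState, b: LogoState): LogoState {
  const out: LogoState = { ...a };
  for (const key of Object.keys(b)) {
    const mine = out[key];
    const theirs = b[key];
    const theirsWins =
      !mine ||
      theirs.clock > mine.clock ||
      (theirs.clock === mine.clock && theirs.clientId > mine.clientId);
    if (theirsWins) out[key] = theirs;
  }
  return out;
}

class Client {
  readonly id: string;
  state: LogoState = {};
  constructor(id: string) {
    this.id = id;
  }
  receive(update: LogoState): void {
    this.state = mergeState(this.state, update);
  }
}

class CentralPeer {
  private state: LogoState = {};
  private clients: Client[] = [];

  // A newly connected client immediately receives the current state.
  connect(client: Client): void {
    this.clients.push(client);
    client.receive(this.state);
  }

  // Merge an incoming update, then fan the merged state out to every client.
  submit(update: LogoState): void {
    this.state = mergeState(this.state, update);
    for (const client of this.clients) client.receive(this.state);
  }
}

const central = new CentralPeer();
const alice = new Client("alice");
const bob = new Client("bob");
central.connect(alice);
central.connect(bob);

// Non-conflicting edits: different squares, plus the order of the squares.
central.submit({ "square1.color": { value: "red", clock: 1, clientId: alice.id } });
central.submit({
  "square2.color": { value: "blue", clock: 1, clientId: bob.id },
  "squares.order": { value: "2,1,3", clock: 1, clientId: bob.id },
});

// Conflicting edits to the same property resolve the same way for everyone.
central.submit({ "square3.color": { value: "green", clock: 2, clientId: alice.id } });
central.submit({ "square3.color": { value: "yellow", clock: 2, clientId: bob.id } });

console.log(alice.state["square3.color"].value === bob.state["square3.color"].value); // true
```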
While this solution is simpler to support, it does have a few drawbacks
that I want to mention. So for example, the first one
is that it can be more CPU and memory intensive, since
you need server side resources when editing documents.
The second is that scaling is more complex because
you need to be able to run the central peer close to the clients.
But overall this is a much simpler approach because you
end up needing some central services anyway for document storage
and management. So let's do a quick recap.
This is an overview of the Multiplayer system architecture
as visualized using the Multiplayer UI, and we
have opted to have our CRDT peer sessions managed by
a central peer, the collaboration service.
This service has three main functions. The first one is serving
the latest version of the document when a new client connects.
The second one is sharing edits across all connected
clients, and the third one is saving snapshots
of the document. We keep latency low by deploying
collaboration services close to our users, and we have a directory service
cluster to make decisions about where to place collaboration
sessions, track existing sessions, and even move
sessions. Now that we've seen
how we implemented real time collaboration features,
let's look at the specific features. And before I go
on, I wanted to make a quick aside. We didn't implement all
of the possible real time collaboration features, but only those
that make sense for Multiplayer. Also, we didn't always
use Yjs, we only used it when it made sense.
So this approach makes sense when we require conflict resolution
between changes from different users or for temporary things
that belong to the document. Think for example, user info
or cursors. There are three different categories of
real time collaboration features. The first one is awareness
or presence features, which allow you to automatically track and
communicate the online status of your users. So for example,
avatar stacks. Now note that since
we have a central peer and since Yjs is network agnostic,
we decided to use websockets to display this, also because
we are pulling this information across multiple documents at a project level.
However, it can also be achieved with Yjs.
Other features that fall into this category are live cursors
and user in application, for which you
would use similar logic in Yjs to implement, and typing
indicators, which is a feature we haven't implemented yet,
but we'll likely do in the future. Another category
of collaborative features is state synchronization.
These features include all those user actions and changes that
have to be synced correctly and at low latency, so live updates are
an example. Then you have coediting, undo and
redo. An easy way to see the Yjs
implementation of this category of features in Multiplayer is to
think back to the Multiplayer
system architecture that I showed you a few slides back.
One client is moving the collaboration service to
a different location in the system architecture, while at the same time
another client is renaming that same component, the collaboration
service, to something else, and both changes would be combined together without
issues in a final state. The third
and last category of collaboration features is the pub/sub messaging
features, which are needed to deliver the right message to the right
client in real time. This comprises comments and push
notifications. We use Yjs, as I mentioned,
only for things that require conflict resolution between changes from different
users, or for temporary things that belong to the document.
Comments do not require simultaneous edits,
because comments can have only a single owner, and they are
not temporary. For this reason we
didn't use Yjs to implement this feature. Instead we used REST
API calls and websocket notifications.
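The pub/sub idea of delivering the right message to the right client can be sketched in memory as follows. The `PubSub` class and the `comments:<document>` topic naming are hypothetical. In a real service each handler would write to a client's websocket connection instead of pushing into an array.

```typescript
type Handler = (message: string) => void;

class PubSub {
  private handlers: Record<string, Handler[]> = {};

  // Register a handler for a topic and return a function that unsubscribes it.
  subscribe(topic: string, handler: Handler): () => void {
    const list = this.handlers[topic] ?? (this.handlers[topic] = []);
    list.push(handler);
    return () => {
      const index = list.indexOf(handler);
      if (index >= 0) list.splice(index, 1);
    };
  }

  // Deliver a message to the subscribers of one topic; returns how many received it.
  publish(topic: string, message: string): number {
    const list = (this.handlers[topic] ?? []).slice();
    list.forEach((handler) => handler(message));
    return list.length;
  }
}

const bus = new PubSub();
const aliceInbox: string[] = [];
const bobInbox: string[] = [];

// Alice has doc-1 open, Bob has doc-2 open.
const unsubscribeAlice = bus.subscribe("comments:doc-1", (m) => aliceInbox.push(m));
bus.subscribe("comments:doc-2", (m) => bobInbox.push(m));

// After a comment is saved through the REST API, the server notifies that document's topic.
const delivered = bus.publish("comments:doc-1", "New comment on doc-1");
console.log(delivered, aliceInbox.length, bobInbox.length); // 1 1 0
```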
And although we haven't implemented push notifications yet,
we would be using websockets for those too, although you
can use Yjs for state synchronization. Now,
while using Yjs was surprisingly easy,
there were some challenges. The first learning
was how to keep track of order.
Arrays are supported in Yjs, but order of elements is
not maintained without a little work. The first
approach you might think of is to add an integer for the order.
However, the drawback is that whenever a new element is
inserted, you have to change the order value of every single
element that follows it. This is not a scalable solution,
especially if you have a very long list. What we
ended up using is fractional indexing,
which is also the approach used by Figma.
This solution has the benefit of inserting an element without having
to update all of the elements that follow it. To implement
fractional indexing, it's best to use an arbitrary precision library rather
than the built in JavaScript number type, which is
a 64 bit floating point number. Because it has
limited precision, there is a limit to how
many times you can insert something into a list before you hit the
precision wall of the type, whereas with an
arbitrary precision library, you don't run into this problem.
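Here is a minimal sketch of both halves of that argument. The first function counts how many times you can insert directly after the same element using plain JavaScript numbers before the midpoint collides with a neighbor. The second uses strings of decimal digits as a stand-in for an arbitrary precision library, so keys simply grow longer instead of running out of precision. The names `floatInsertsBeforeCollision`, `midpoint`, and `DIGITS` are hypothetical, and this string scheme is one common variant of fractional indexing, not necessarily what Multiplayer or Figma ship.

```typescript
// Repeatedly insert right after `lo` (between `lo` and its current neighbour)
// until the midpoint is no longer distinct from its neighbours.
function floatInsertsBeforeCollision(lo: number, hi: number): number {
  let count = 0;
  while (true) {
    const mid = (lo + hi) / 2;
    if (mid === lo || mid === hi) return count;
    hi = mid;
    count++;
  }
}

const DIGITS = "0123456789";

// Returns a digit string strictly between `a` and `b`, read as decimal fractions
// ("5" means 0.5). `a` may be "" (meaning 0) and `b` may be null (meaning 1).
// Generated keys never end in "0", so there is always room below any key.
function midpoint(a: string, b: string | null): string {
  if (b !== null && a >= b) throw new Error("expected a < b");
  if (b !== null) {
    // Copy the common prefix, treating missing digits of `a` as "0".
    let n = 0;
    while ((a.charAt(n) || "0") === b.charAt(n)) n++;
    if (n > 0) return b.slice(0, n) + midpoint(a.slice(n), b.slice(n));
  }
  const digitA = a === "" ? 0 : DIGITS.indexOf(a.charAt(0));
  const digitB = b === null ? 10 : DIGITS.indexOf(b.charAt(0));
  if (digitB - digitA > 1) return DIGITS.charAt(Math.round((digitA + digitB) / 2));
  // The first digits are consecutive: reuse b's digit, or extend a with one more digit.
  if (b !== null && b.length > 1) return b.charAt(0);
  return DIGITS.charAt(digitA) + midpoint(a.slice(1), null);
}

// Doubles run out of room after a few dozen inserts in the same spot.
console.log(floatInsertsBeforeCollision(1, 2)); // 52

// String keys keep working: insert at the front of the list 200 times.
let firstKey = midpoint("", null);
for (let i = 0; i < 200; i++) firstKey = midpoint("", firstKey);
console.log(firstKey.length);
```

Because keys compare as ordinary strings, sorting elements by key restores the list order, and inserting a new element touches only that element's key.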
Our second learning, which I alluded to
already when discussing our system architecture, is that while Yjs
supports a peer to peer design where you don't
need to have a central service, that may not fit
your business model. In fact, for us, having a central
collaboration service was very much necessary to
be able to implement features like snapshotting and storing data
in specific locations because of security requirements. Think GDPR.
However, a peer to peer architecture would be better suited if your application
has a requirement, for example, to operate in an offline mode.
Our third learning was about defining the scope of collaboration,
more specifically being selective on where and how
we use real time collaborative features, because they might
be unnecessary or worse, confusing.
So let me give you a little bit of context. Multiplayer supports
branches, change sets and views of the platform architecture and API.
These represent either copies of original documents or
filtered views of the content of a document. When
deciding how to show the awareness features (think
back to avatar stacks and user cursors), we decided to only
show the users who are viewing the same document in the same branch at
the same time, as you can see in these two branches,
instead of showing all of the users who are working on the same platform
architecture across all branches, because that would be
unnecessary and confusing.
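That scoping rule is easy to express as a filter over presence records. The sketch below is illustrative only: the `Presence` shape, the `avatarStack` function, and the 30 second activity timeout are assumptions, not Multiplayer's actual data model.

```typescript
interface Presence {
  userId: string;
  documentId: string;
  branchId: string;
  lastSeenMs: number;
}

// Users to show in my avatar stack: same document, same branch, recently active, not me.
function avatarStack(
  everyone: Presence[],
  me: Presence,
  nowMs: number,
  timeoutMs = 30000 // assumed inactivity cutoff
): string[] {
  return everyone
    .filter(
      (p) =>
        p.userId !== me.userId &&
        p.documentId === me.documentId &&
        p.branchId === me.branchId &&
        nowMs - p.lastSeenMs <= timeoutMs
    )
    .map((p) => p.userId)
    .sort();
}

const nowMs = 1000000;
const me: Presence = { userId: "ana", documentId: "platform", branchId: "main", lastSeenMs: nowMs };
const everyone: Presence[] = [
  me,
  { userId: "ben", documentId: "platform", branchId: "main", lastSeenMs: nowMs - 5000 },
  { userId: "cy", documentId: "platform", branchId: "feature-x", lastSeenMs: nowMs - 1000 },
  { userId: "dee", documentId: "platform", branchId: "main", lastSeenMs: nowMs - 120000 },
  { userId: "eli", documentId: "api", branchId: "main", lastSeenMs: nowMs - 1000 },
];

// Only ben is in the same document and branch and was active recently.
const visible = avatarStack(everyone, me, nowMs);
console.log(visible); // ["ben"]
```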
The last learning I'm keen to share is that Yjs can lift
a lot of the overhead of implementing collaborative features,
but it doesn't do everything for you. For example, it doesn't
support cross document changes where you need to propagate
information from one place to another.
To give you another concrete example, in Multiplayer we
have several documents that have dependencies. For example,
an individual component of a system architecture may be
referenced in different places: component list,
component description, platform architecture. To
ensure that any change to an independent document would flow up
to the other documents that reference it, we needed to wire up
dependencies between different data structures in our app,
listening for changes in the Yjs document. So for
example, if you change the name of a component in the components view,
that change will flow to any other platform document that shows that
component automatically without needing to refresh.
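The wiring described here is essentially an observer pattern across documents. The sketch below uses a tiny `ObservableDoc` class as a stand-in for a collaborative document; it is not the Yjs API, and `propagate` and the key names are hypothetical. With Yjs you would attach the listener through the library's own change observers instead.

```typescript
type Listener = (key: string, value: string) => void;

// Minimal stand-in for a collaborative document that emits change events.
class ObservableDoc {
  private data: Record<string, string> = {};
  private listeners: Listener[] = [];

  get(key: string): string | undefined {
    return this.data[key];
  }

  set(key: string, value: string): void {
    if (this.data[key] === value) return; // unchanged: emit nothing, which also stops loops
    this.data[key] = value;
    this.listeners.slice().forEach((listener) => listener(key, value));
  }

  observe(listener: Listener): void {
    this.listeners.push(listener);
  }
}

// Wire a dependency: whenever `key` changes in `source`, copy it into each dependent.
function propagate(source: ObservableDoc, key: string, dependents: ObservableDoc[]): void {
  source.observe((changedKey, value) => {
    if (changedKey !== key) return;
    dependents.forEach((dependent) => dependent.set(key, value));
  });
}

const components = new ObservableDoc();
const platformArchitecture = new ObservableDoc();
const componentDescription = new ObservableDoc();

const nameKey = "component:42:name";
propagate(components, nameKey, [platformArchitecture, componentDescription]);

// Renaming the component in the components view flows to the documents that reference it.
components.set(nameKey, "collaboration-service");
console.log(platformArchitecture.get(nameKey)); // "collaboration-service"
```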
And this is all to say that you may find that your app requires some
UX choices like that which are not supported by Yjs.
So to recap, why should you build for collaboration within
your app? With real time collaboration you ensure a
better user experience. In other words,
quicker time to fund your users can align,
communicate and drive consensus effortlessly.
Shorter production cycles, quicker time to market:
making sure that your users are aligned on the expectations,
responsibilities, requirements and roadmap ensures
that there is no miscommunication or backtracking.
Therefore they can work more productively and deliver results faster.
Also, they are working and communicating in a shared space,
which means less context switching and wasted time.
And this is not to say that real time collaboration alone is
the ultimate productivity booster. However, it is an
expected enhancer of the user experience.
Also, increased business revenue. Because your tool is
adopted by entire teams, if not entire organizations,
it becomes a key part of the company's workflows,
giving you higher retention rates.
So the final advice we can give you is to
not reinvent the wheel. There are lots of solutions
for adding real time collaboration to your app, and while we decided to
use Yjs and build these functionalities in house,
we can also recommend using a SaaS provider
like Liveblocks. But there are so many others out there
as well. So for example, app playkit collab kit.
Our second advice is to embrace collaboration
from the start if you can. Adding real time collaboration
features as an afterthought to an existing product, or to only
part of an existing product is more messy and difficult.
It's not impossible though. You can certainly introduce Yjs
in an existing system and it would take a lot less work and time
than doing the same with websockets.
However, by including it in your early designs, you ensure
that the full user experience is collaborative and
your technology choices will easily support the
evolution of these features. Thank you for
listening to my talk and I hope you got inspired to add real time collaboration
to your apps. If you'd like to try for yourself the features we built into Multiplayer,
we'll be launching our open beta very soon and we'd love your feedback.
You can find me on X, LinkedIn or around the Discord
server if you have any questions.