Figmex: Making Figma with Elixir

September 2, 2022 · 7 min read

fly phoenix

This blog post explains how an app I wrote, Figmex, works.

Rather than a step-by-step walk-through, I’ve opted to write in a “FAQ” question/answer format. These are some of the questions I was asking myself at the time.


Why Elixir?

Elixir is good at handling many concurrent connections, and it makes distributed clusters VERY easy compared to many other languages. Elixir/Erlang’s tooling lets the developer choose or implement a different distributed-system strategy depending on the situation (whether it’s storage, communication between nodes, etc.). Basically, you have a lot of options without that much work.

The Phoenix web framework really utilizes all of the things mentioned above to make it easy to have a distributed web app. Specifically, I’m talking about Channels, PubSub, and Presence (aka Tracker).

Also shoutout to Fly.io who makes it easy to host a cluster.

How does Figmex work?

How do you create a cluster with Elixir?

Elixir supports clustering very easily with libcluster. I’ve included notes in the README on how to run one locally. Fly.io’s docs are also helpful.
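As a rough illustration (not necessarily what Figmex’s README does), here is a sketch of a libcluster setup for Fly.io using the DNSPoll strategy. The topology name `fly6pn`, the 5-second polling interval, and the `Figmex.ClusterSupervisor` name are assumptions for the example.

```elixir
# config/runtime.exs (sketch; topology name and interval are assumptions)
import Config

if config_env() == :prod do
  # Fly.io sets FLY_APP_NAME for every running instance.
  app_name = System.fetch_env!("FLY_APP_NAME")

  config :libcluster,
    topologies: [
      fly6pn: [
        strategy: Cluster.Strategy.DNSPoll,
        config: [
          polling_interval: 5_000,
          # Fly's private DNS lists every instance under <app>.internal
          query: "#{app_name}.internal",
          node_basename: app_name
        ]
      ]
    ]
end

# In the application's supervision tree (lib/figmex/application.ex),
# a child like this starts the clustering process:
#
#   topologies = Application.get_env(:libcluster, :topologies, [])
#   {Cluster.Supervisor, [topologies, [name: Figmex.ClusterSupervisor]]}
```

For this to work, each node also has to be started with a name of the form `<app_name>@<ip>`, so that the names libcluster builds from the DNS results match the running nodes.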

Is every change to the board being broadcast to every user right away?

A “change to the board” would be moving one’s mouse, creating an object, resizing an object, etc.

Not quite! There is an “event buffer” on each node that accumulates all the recent changes on that node and then broadcasts them periodically (for example, every 30ms) to other nodes. This significantly reduces the number of messages broadcast between nodes.
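To make the idea concrete, here is a minimal sketch of such a buffer as a GenServer. It is not Figmex’s actual code: the module name `EventBuffer`, the `:flush_fun` option, and the event shapes are made up for the example.

```elixir
defmodule EventBuffer do
  @moduledoc "Hypothetical per-node buffer: collects events and flushes them in batches."
  use GenServer

  # opts: :flush_fun (1-arity function that receives the batch),
  #       :interval_ms (defaults to 30), :name (defaults to this module)
  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: Keyword.get(opts, :name, __MODULE__))
  end

  # Queue an event; nothing is sent until the next flush.
  def push(server, event), do: GenServer.cast(server, {:push, event})

  @impl true
  def init(opts) do
    state = %{
      events: [],
      flush_fun: Keyword.fetch!(opts, :flush_fun),
      interval_ms: Keyword.get(opts, :interval_ms, 30)
    }

    schedule_flush(state.interval_ms)
    {:ok, state}
  end

  @impl true
  def handle_cast({:push, event}, state) do
    # Prepend for O(1) inserts; order is restored when flushing.
    {:noreply, %{state | events: [event | state.events]}}
  end

  @impl true
  def handle_info(:flush, state) do
    # Skip empty batches so idle nodes stay quiet.
    if state.events != [], do: state.flush_fun.(Enum.reverse(state.events))
    schedule_flush(state.interval_ms)
    {:noreply, %{state | events: []}}
  end

  defp schedule_flush(ms), do: Process.send_after(self(), :flush, ms)
end
```

In a Phoenix app, the `:flush_fun` could be something like `fn batch -> Phoenix.PubSub.broadcast(Figmex.PubSub, "board", {:events, batch}) end`, where the PubSub name and topic are again assumptions.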

Does Figmex use LiveView?

No. LiveView solves a lot of use cases, but with the amount of event tracking and JavaScript involved, it makes more sense to simply use Phoenix Channels/PubSub, which is effectively all the WebSocket power of LiveView without the live HTML updating.

The requirement to rely on canvas events (and libraries like Fabric.js) quickly makes LiveView the wrong tool for the problem.

How are you tracking users? How are you tracking which user owns which resource?

Each user is tracked using Phoenix Presence, which is built on top of Phoenix Tracker and is basically designed to do just that. Each “Presence” has a “meta” data field, which is a good temporary storage location for something like ownership of a resource on the board.

Does Figmex use an external database?

No, everything is stored in app memory across the cluster. A DB could be handy, but I wanted to see how far I could get without one.

Erlang/Elixir offers a bunch of solutions for caching, like ETS, Mnesia, GenServers, etc., so using something external like Redis isn’t necessary (though in practice you still might).

How is the initial state of the board stored?

It’s stored in a global GenServer, which means it’s stored in app memory on ONE of the nodes in the cluster. There are a lot of downsides to this approach, namely that the farthest node has the worst read/write time. The upside is that it is a really simple way to avoid race conditions: every read and write goes through a single process, so there’s no need to implement a locking system, etc.
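A minimal sketch of that idea follows, assuming a hypothetical `BoardState` module that keeps board objects in a map. The GenServer is registered under a `:global` name, so callers on any node in the cluster reach the same process.

```elixir
defmodule BoardState do
  @moduledoc "Hypothetical cluster-wide board state held by one globally named process."
  use GenServer

  # A {:global, term} name is registered across all connected nodes.
  @name {:global, __MODULE__}

  def start_link(initial \\ %{}) do
    case GenServer.start_link(__MODULE__, initial, name: @name) do
      {:ok, pid} -> {:ok, pid}
      # Another node already runs the single instance; don't start a second one.
      {:error, {:already_started, _pid}} -> :ignore
    end
  end

  # All calls are serialized through the one process, so no locking is needed.
  def put_object(id, attrs), do: GenServer.call(@name, {:put, id, attrs})
  def delete_object(id), do: GenServer.call(@name, {:delete, id})
  def snapshot, do: GenServer.call(@name, :snapshot)

  @impl true
  def init(objects), do: {:ok, objects}

  @impl true
  def handle_call({:put, id, attrs}, _from, objects) do
    {:reply, :ok, Map.put(objects, id, attrs)}
  end

  def handle_call({:delete, id}, _from, objects) do
    {:reply, :ok, Map.delete(objects, id)}
  end

  def handle_call(:snapshot, _from, objects) do
    {:reply, objects, objects}
  end
end
```

A newly joined user would call `BoardState.snapshot()` to load the initial board. Note that this sketch does nothing about the hosting node going down; a fuller version would need to restart the process elsewhere and accept that the in-memory state is lost.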

A good alternative might be Mnesia which can store state across a cluster with good consistency guarantees (transactions for locking, etc.). I didn’t go down this path because I couldn’t find a working example with Fly.io and didn’t want to spend the time solving it on my own.

Otherwise, use Redis or some other NoSQL database that is quick.

Javascript frameworks?

Figmex doesn’t rely on any frameworks like React or Vue, just vanilla JavaScript. However, it is a canvas-based app and relies heavily on Fabric.js. If you look at the code, you’ll see that I tried to wrap, and not expose, the APIs of Fabric and Phoenix’s Channels JavaScript client, but it’s hard to pull off perfectly.

In my opinion, vanilla JavaScript is fine for projects of this size. In a real-world scenario, use whatever works for your team. Otherwise, keep it simple.

Last note, it’s not like a “framework” like React or something would help with a canvas app anyways.

What about race-conditions?

There are a number of ways that a user could run into a “race condition” with Figmex.

Example 1:

Two users click the same object at the same time, so who takes priority? At the moment, “ownership” or a “claim” on an object is saved in a user’s Presence (meta data). When making a new claim, a quick check is run against all current claims to make sure someone else doesn’t already own it. This requires cycling through the Presences of all the current users. As far as I know, Presence prioritizes Availability over Consistency (that’s A over C in the CAP theorem). In this case, it means it’s possible that a value is being read while it’s also being changed somewhere else.

At the very least, if two claims come in at the same time then there is an attempt to take the earlier claim as the owner.
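The claim check can be sketched as a pure function over a map shaped like the result of `Phoenix.Presence.list/1` (user key mapped to `%{metas: [...]}`). The `Claims` module and the `:claimed` / `:claimed_at` meta keys are assumptions for illustration, not Figmex’s real field names.

```elixir
defmodule Claims do
  @moduledoc "Hypothetical ownership check over Presence-style data."

  # Returns the id of the user who owns `object_id`, or nil if it is unclaimed.
  # When several users claim the same object, the earliest claim wins.
  def owner(presences, object_id) do
    claims =
      for {user_id, %{metas: metas}} <- presences,
          # Metas without a matching :claimed key are skipped by the pattern.
          %{claimed: ^object_id} = meta <- metas,
          do: {Map.get(meta, :claimed_at, 0), user_id}

    # Tuples sort by timestamp first, so the head is the earliest claim.
    case Enum.sort(claims) do
      [] -> nil
      [{_claimed_at, user_id} | _rest] -> user_id
    end
  end

  # A user may claim an object if nobody owns it or they already own it.
  def can_claim?(presences, user_id, object_id) do
    owner(presences, object_id) in [nil, user_id]
  end
end
```

Because the Presence data is only eventually consistent across nodes, a check like this narrows the race window rather than closing it.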

The real answer is these sorts of errors don’t happen frequently, but even if they do it’s acceptable as the users are collaborating and not competing so they would self-correct if something went wrong anyways.

Example 2:

Another problem area is if a user joins while a change is being made to the board. In that case, they might load an initial state that is slightly old and thus invalid. A solution here would be to introduce some sort of locking mechanism during reads and writes, perhaps with Mnesia (see “How is the initial state of the board stored?”), or an external database layer like Redis, a relational database, or some alternative.

Improvements?

If I were making Figmex for real, then there are some clear improvements:

A real Figmex alternative

I just want to shout out Excalidraw, which is open-source-ish to a point and does a good job.

It was also pointed out to me by relang that tldraw exists, which I think nails it aesthetically.

Any takeaways?

Collaborative apps like Figma, Google Docs, etc. rely on everyone cooperating (duh). So by design, people tend not to compete with each other for resources. With that in mind, a lot of the “race conditions” from two users clicking the same object are actually avoided.

For example, if I see that my peer is about to click on an object, then I might choose not to click on it (so as not to be a troll).

This is all to say, the collaborative nature of something like Figma is self-correcting for a lot of the annoying race conditions that might occur.