Ever since Meteor was first introduced way back in 2012, the most common question people have asked has been “does it scale?”.

As with most things, the answer is “it depends”. In this case what it depends on most is your app, and specifically how it manages data.

Note that the scaling landscape of Meteor is still changing rapidly, so it’s hard to speak definitively on this subject. But in this post, we’ll do our best to walk you through the overall picture of scaling a Meteor app.

Three Different Models

To begin with, let’s consider the fundamental scaling differences between traditional, single-page, and real-time web applications.

Traditional Web Applications

A traditional web application.

In a traditional application, each time a user accesses a new page on the site, the client (i.e. the browser) makes a request to the server and gets content (typically an HTML file) back in return.

Single-page Web Applications

A single-page web application.

In a single-page application, the initial request to the site downloads a single JavaScript file containing all the code needed to render every page. Separately, the app will also make requests for the raw data (usually in a lightweight format like JSON) to fill in each page.

Real-time Web Applications

A real-time web application.

Finally, in a real-time, single-page application (such as a Meteor app), not only can the client request more data, but the server can also push data to the client (for example, when that data has changed in the database).

This raises some interesting issues: how does the server know when data has changed? And what’s more, how does the server know which user to send those changes to? These questions are key to understanding the performance characteristics of real-time apps.

Poll-And-Diff

Part of Meteor’s magic is an app’s ability to instantly know whenever data has changed. But what exactly is going on under the hood?

To understand this, let’s take a look at the mechanism Meteor first used to monitor the database for changes, poll-and-diff.

Poll-and-diff is on its way out (it's being replaced by Oplog tailing, as we'll see), but it's still important to understand how it works.

Let’s consider a very simple publication:

Meteor.publish('posts', function() {
  return Posts.find();
});

When you publish this simple posts cursor and a user first subscribes to it, a LiveResultsSet (LRS for brevity) is established. An LRS is the mechanism Meteor uses to poll the database.

At this point, it’s important to remember that while data changes will usually happen through the Meteor app itself, it’s also possible for external data changes to occur. These could be triggered by another instance of the same app (when scaling horizontally), a different app altogether, or even manual changes to the database through the Mongo shell.

So every 10 seconds (or whenever the server thinks that the Posts collection may have changed), the LRS will re-run the Posts.find() query against Mongo (with a corresponding hit to your Mongo server).

We don't want to re-send every post to each client, so we need to determine what's different in the new result set. To do this, the app will diff (i.e. compare to find out the differences) the result of the query against what it knows about the Posts collection.
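To make the diff step concrete, here's a minimal sketch of the idea (not Meteor's actual implementation): compare two snapshots of a query's results by `_id` and emit the added/changed/removed messages for whatever differs.

```javascript
// Compare two polls of the same query and describe the differences.
function diffResults(oldDocs, newDocs) {
  const oldById = new Map(oldDocs.map((doc) => [doc._id, doc]));
  const newById = new Map(newDocs.map((doc) => [doc._id, doc]));
  const messages = [];

  for (const [id, doc] of newById) {
    const previous = oldById.get(id);
    if (!previous) {
      messages.push({ msg: 'added', id, doc });
    } else if (JSON.stringify(previous) !== JSON.stringify(doc)) {
      messages.push({ msg: 'changed', id, doc });
    }
  }
  for (const id of oldById.keys()) {
    if (!newById.has(id)) messages.push({ msg: 'removed', id });
  }
  return messages;
}

// Two successive polls of a hypothetical Posts collection:
const poll1 = [{ _id: 'a', title: 'Hello' }, { _id: 'b', title: 'World' }];
const poll2 = [{ _id: 'a', title: 'Hello!' }, { _id: 'c', title: 'New' }];
console.log(diffResults(poll1, poll2));
// one 'changed' (a), one 'added' (c), one 'removed' (b)
```

Even this toy version hints at the cost: every poll touches every document in the result set, which is why frequent polls over large cursors get expensive fast.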

A visual representation of Meteor’s diff algorithm.

Unfortunately, in many cases, running a proper diff over many posts is a computationally expensive exercise. And if you happen to have many such polls going on, things can quickly get out of hand.

So even though in the long term we'd expect per-user memory usage to become the driving scale factor for production Meteor applications, in the short term most Meteor apps still hit CPU bottlenecks first.

LRS Inflation

There are two common reasons why you would have many diffs going on:

  1. Having a lot of different LRSes. Although Meteor tries its best to share LRSes between users (if two people subscribed to our posts publication above, they'd share an LRS), this isn't possible when a publication is user-specific.
  2. Writing a lot to the database, which in turn leads to a lot of polling.
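The first point is worth illustrating. Here's a rough sketch of why identical publications can share an observer while user-specific ones can't: if observers are keyed on the exact query they run, only clients running the exact same query can share one. (The names and structure here are illustrative, not Meteor's internals.)

```javascript
// A cache of observers keyed by collection + selector.
const observers = new Map();

function observerFor(collection, selector) {
  const key = collection + ':' + JSON.stringify(selector);
  if (!observers.has(key)) {
    observers.set(key, { key, subscribers: 0 });
  }
  const observer = observers.get(key);
  observer.subscribers += 1;
  return observer;
}

// Two users subscribing to the same 'posts' query share one observer...
observerFor('posts', {});
observerFor('posts', {});
// ...but user-specific selectors each get an observer of their own.
observerFor('posts', { userId: 'alice' });
observerFor('posts', { userId: 'bob' });

console.log(observers.size); // 3 distinct observers for 4 subscriptions
```

With N users each subscribed to a per-user query, you end up with N observers doing N separate poll-and-diff cycles, instead of one shared observer.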

The end result of all this is that the nature of your published data and the amount of data writes you make will end up dictating how badly this issue affects your app.

Oplog Tailing

With the recent release of Meteor 0.7.0, a much better database monitoring technique has been developed: Oplog tailing.

Oplog tailing does away with the LRS’ poll-and-diff dance. Instead, Meteor uses the Mongo Oplog (a feed of data that’s intended for synchronizing a Mongo Replica Set) to make Mongo a “pseudo-realtime” database.

Now if Mongo were a true real-time database, it would send out the added/changed/removed messages Meteor needs for every query requiring real-time updates, and far less work would be needed on the server to transmit those changes to the client.

Pictured: a real-time Mungo with a tail. Close enough.

The Oplog doesn't quite achieve this: it tells Meteor about changes at the collection level. This means that Meteor still has to do a fair amount of work to determine what has changed at the query level, and whether those changes should be synchronized to the subscribing clients.

Although in general the Oplog tailing technique is much more efficient than the fairly naive poll-and-diff algorithm, there is still a CPU cost for each database change, and it increases with the number of open subscriptions (since each change must be checked against each subscription).
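That per-write cost can be sketched as follows: every incoming oplog entry has to be checked against every open subscription's selector. The entry shape and names below are simplified for illustration, not Mongo's or Meteor's actual formats.

```javascript
// Does a document satisfy a (flat, equality-only) selector?
function matchesSelector(doc, selector) {
  return Object.keys(selector).every((key) => doc[key] === selector[key]);
}

// Route one oplog entry to the subscriptions it affects.
// Note: the work here grows with the number of subscriptions,
// not just with the number of writes.
function routeOplogEntry(entry, subscriptions) {
  return subscriptions.filter(
    (sub) =>
      sub.collection === entry.collection &&
      matchesSelector(entry.doc, sub.selector)
  );
}

const subs = [
  { id: 's1', collection: 'posts', selector: {} },
  { id: 's2', collection: 'posts', selector: { authorId: 'alice' } },
  { id: 's3', collection: 'comments', selector: {} },
];
const entry = {
  op: 'insert',
  collection: 'posts',
  doc: { _id: 'p1', authorId: 'alice' },
};

console.log(routeOplogEntry(entry, subs).map((s) => s.id)); // ['s1', 's2']
```

One write, three subscriptions, three checks: multiply that by a busy collection and thousands of open subscriptions and the CPU cost becomes clear.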

So despite the very real improvements brought by Oplog tailing, it's important to remember that it still doesn't excuse you from thinking carefully about your pub/sub strategy. What's more, even when Oplog tailing is enabled, as of now only a subset of queries are candidates for the technique.

Memory Loss

We’ve looked at how an app detects changes to the data, but the second part of the equation is figuring out which users to send those changes to.

This means keeping a copy of each connected client’s “in-memory database” (in Meteor, the contents of Minimongo) in the server’s memory. In Meteor, the data structure that contains this is the Merge Box.

The direct implication is that each connected client takes up a certain amount of memory on the server. And the more data you publish to the client, the more memory you’ll need on the server to keep track of it.
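The merge box idea can be sketched like this: the server keeps, per connected client, a copy of every document it has sent, so that it can later compute and send minimal deltas. This is a simplification of Meteor's actual data structure, with illustrative names.

```javascript
// One MergeBox per connected client: a server-side record of
// everything that client's Minimongo currently contains.
class MergeBox {
  constructor() {
    this.collections = new Map(); // collection name -> Map(id -> doc)
  }

  added(collection, id, doc) {
    if (!this.collections.has(collection)) {
      this.collections.set(collection, new Map());
    }
    this.collections.get(collection).set(id, doc);
  }

  documentCount() {
    let count = 0;
    for (const docs of this.collections.values()) count += docs.size;
    return count;
  }
}

// Every connected client gets its own merge box, so server memory
// grows with (number of clients × documents published to each).
const clients = [new MergeBox(), new MergeBox()];
for (const box of clients) {
  box.added('posts', 'p1', { title: 'Hello' });
  box.added('posts', 'p2', { title: 'World' });
}
console.log(clients.map((box) => box.documentCount())); // [2, 2]
```

The multiplication at the end is the key point: publishing twice as much data, or keeping twice as many clients connected, roughly doubles this slice of server memory.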

Slimming Down

This brings us to the real question: what can you do about this?

The first step is making sure you only publish the data you need (assuming you don’t need to go through too many contortions to get there).

This is good practice all around: not only will it use fewer resources for the connection between client and server, it will also require less memory in the user's browser. And as we've seen, it has a vital impact on server-side memory use, and thus on how many concurrent users each web node can support.
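In Meteor, the usual way to publish less is the `fields` option on a cursor, e.g. `Posts.find({}, {fields: {title: 1}})`. The helper below sketches what that projection does to each document (it's an illustration, not Meteor's implementation):

```javascript
// Apply a Mongo-style inclusion projection to a single document.
function projectFields(doc, fields) {
  const projected = { _id: doc._id }; // _id is always included
  for (const key of Object.keys(fields)) {
    if (fields[key] && key in doc) projected[key] = doc[key];
  }
  return projected;
}

const post = {
  _id: 'p1',
  title: 'Hello',
  body: 'A very long body…',
  draftNotes: 'not for publication',
};

console.log(projectFields(post, { title: 1 }));
// { _id: 'p1', title: 'Hello' } — the heavy and private fields
// never leave the server
```

Less data per document means less bandwidth, less Minimongo memory on the client, and a smaller merge box entry per client on the server.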

The second takeaway is that it becomes very important to consider how long users stay connected. Even if a user isn't doing anything (for example, leaving a browser tab open in the background), they still generate a certain server load.

Scaling Strategies

You can also get more specific about your observers and how much work Meteor needs to do to keep them up to date:

  • Attempting to minimize the number of unique-per-user subscriptions will allow Meteor to pool work across users as much as possible.
  • Using simpler queries will usually mean there’s less work for Meteor to do to decide if a change affects it.
  • Constraining your database writes (especially against heavily observed collections) will limit the amount of work Meteor needs to do to keep observers up to date.

There are also a few monitoring tools that can help you get a better picture of your app's performance and locate bottlenecks. One of them is the Server Info package, which provides a password-protected URL for getting stats on the current number of open LRSes against each collection.

Arunoda’s upcoming Meteor performance monitoring suite.

The Facts package (part of the Meteor core since 0.7.0) is also a great tool to gather data about open observers, both polling and oplog-based. And keep an eye out as well for Arunoda’s very promising MeteorAPM performance monitoring service.

Finally, to test your app’s scalability, the best tool right now is Adrian Lanning’s Load Test.

Conclusion

As we’ve seen, there are some fundamental reasons why realtime applications are usually more server-intensive than traditional web applications. Hopefully, this article has helped you understand the various factors at play, and will make scaling your Meteor apps a little bit easier.