The first time a developer ran a benchmark against SQLite using Realm, they emailed us to say they thought they had made a mistake. The performance gap was too large to be real.
They hadn't made a mistake.
We were seeing 80x performance gains — not because we had written cleverer algorithms, but because we had solved a problem most database teams never fully confront: data copying.
A conventional mobile database, when you query for an object, does three things with your data. It reads it from disk into a page cache. It copies it into the engine's internal structures. It copies it again into the object your code gets back. Every read operation involves three copies of the same bytes, three allocations, three opportunities for memory pressure to compound.
Before founding Realm, I spent over a decade at Nokia working on software configuration management and build systems for software deployed on hundreds of millions of devices. There I learned something that sounds obvious but isn't: at scale, the work you think you're doing is almost never the actual bottleneck. The bottleneck is always the invisible work: the copies, the allocations, the serialization that your abstraction layer is doing behind your back.
This is what most teams building data-intensive software get wrong. They optimize the queries. They add indexes. They tune the cache. None of it matters as much as the data format itself.
At Realm, we spent months working on a different approach. The insight was simple enough to state: what if the data format on disk and the data format in memory were identical?
If you could design a custom C++ storage engine where the representation on disk was already the representation you needed in memory, you could eliminate deserialization entirely. No parsing. No transformation. No copying. The OS maps the file into memory, and your objects are already there, pointing directly into the mapped region.
This is the zero-copy architecture. The implementation took months of careful engineering, but the principle is something you can explain to a developer in thirty seconds.
What made it work was a constraint that most database teams don't have: we were designing for a specific, known set of data types. Realm objects are typed — you define a schema. The storage engine could be optimized for that schema in a way that a general-purpose database can't.
I remember when we ran the first internal benchmark that showed the architecture was working correctly. The number came back and the engineer who ran it — one of our core storage engine people — came over and showed it to me without saying anything at first. Just put the screen in front of me. The comparison column was SQLite. The multiple was in the high seventies. We both looked at it for a moment. His first comment was something like "that can't be right, let me check the test harness." We spent the next hour going through the benchmark setup to find the error. There wasn't one. That hour of trying to disprove our own result was its own kind of validation — we had been so careful not to get our hopes up that we had built in the skepticism automatically.
The lesson isn't specific to databases. It's about what "working with data at scale" actually requires.
The conventional approach to scale is to add infrastructure: more servers, more caches, more queue layers. This works, but it attacks a symptom rather than a cause. The symptom is that your system is slow. The cause is usually that your system is doing unnecessary work — and that unnecessary work is almost always invisible until you know to look for it.
At Nokia, working on build systems and configuration management for software deployed across dozens of countries, the same pattern held. The build pipelines that worked were the ones where someone had gone through the entire process and eliminated every redundant step. Not optimized them — eliminated them. Found the unnecessary work and removed it from the design.
The 80x gains we saw with Realm didn't come from working harder. They came from working less. From designing a system that did only the necessary work and nothing else.
There's a version of "handling data well" that looks like adding complexity — smarter caching layers, more sophisticated query optimizers, more infrastructure between your data and your users. That version tends to produce systems that are impressive in architecture diagrams and slow in production.
The version that actually works tends to look simpler in the end, not more complex. You arrive at simplicity through a detailed understanding of what work is actually required — and the discipline to eliminate everything else.
That's what the zero-copy architecture was. Not a clever trick. A careful accounting of what work was necessary, followed by a decision to not do the rest.
The database reached 2 billion device installations and was acquired by MongoDB. The benchmark still holds. Before you optimize how you move data, ask how many times you're moving it — and why.