Future of MongoDB: Fireside chat with MongoDB CTO Eliot Horowitz

Last night I attended a Meetup at MongoDB Inc.'s new Palo Alto office to hear MongoDB's CTO, Eliot Horowitz, speak about the product roadmap. With a new production release right around the corner and MongoDB World in the not-so-distant future, the buzz and excitement around all things MongoDB are high. For those who were not able to attend, here is a recap of the major points Eliot made.

MongoDB 2.6

To start things off, Eliot announced some big news: MongoDB's next major production release (v2.6) will reach general availability in the next 4-6 weeks. This release includes extensive rewrites of MongoDB's internal architecture and exciting new features. We'll highlight some of the announcements here; you can find a complete list of improvements in the v2.6 release notes.

Query engine rewrite

The entire query engine has been rewritten, from the execution engine to cursor management. The rewrite had three goals: make the code easier to maintain, increase introspection into what queries are doing, and provide more insight into which indexes queries use and why.

Index intersection

As part of the query engine rewrite, v2.6 introduces a new feature: index intersection. Until now, MongoDB queries have been limited to a single index, regardless of whether it is a single-field or a compound index. With index intersection, a query can use multiple indexes for performance optimization. For collections with many indexes (e.g., upwards of 50), this feature aims to help reduce the number of redundant indexes.
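To make the idea concrete, here is a minimal conceptual sketch in Python. It is not MongoDB's implementation, only a toy model of the technique: each single-field index maps a value to the set of matching document ids, and a query on two fields intersects the two sets instead of requiring a dedicated compound index. The collection contents, the field names (`status`, `city`), and the helper names are hypothetical.

```python
from collections import defaultdict


def build_index(docs, field):
    """Map each value of `field` to the set of document ids holding it."""
    index = defaultdict(set)
    for doc in docs:
        index[doc[field]].add(doc["_id"])
    return index


# Hypothetical sample collection.
docs = [
    {"_id": 1, "status": "active", "city": "Palo Alto"},
    {"_id": 2, "status": "active", "city": "New York"},
    {"_id": 3, "status": "inactive", "city": "Palo Alto"},
]

# Two independent single-field indexes, no compound index.
status_index = build_index(docs, "status")
city_index = build_index(docs, "city")


def find_with_intersection(status, city):
    """Answer a two-field query by intersecting two single-field indexes."""
    by_status = status_index.get(status, set())
    by_city = city_index.get(city, set())
    return sorted(by_status & by_city)


print(find_with_intersection("active", "Palo Alto"))
```

In this model, a compound index on both fields becomes optional rather than required, which is the sense in which intersection may reduce redundant indexes.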

Optimizations to the update() method

The update() method has been rewritten from scratch. The goal was to produce code that is easy to understand, so that a new MongoDB engineer can write a new update operator in a matter of days. Users should expect many more update operators to come.

Significant improvements to network architecture

In perhaps the most interesting segment of the night, Eliot quickly recounted the history of MongoDB and 10gen (now MongoDB Inc.). This history lesson was particularly important because it provided context for why MongoDB was originally architected the way it was: MongoDB Inc. began as a DBaaS company.

To quote the linked post:

"When we first started 10gen in the fall of 2007, we set out to build a full platform as a service stack with MongoDB as the data layer. This was a fully hosted system (still open source), that encompassed a load balancer, auto scaling application server and data tier. The application side was a full server side JavaScript environment...

Writes in that system did not individually wait for a response from the database.  However, the application server itself always checked the database for any errors that occurred during the entire page load (using getLastError and getPrevError) so that the user/system would be notified of any issues... This worked great in the platform, as we were able to control the whole access pattern."

Because that access pattern no longer matches how most applications use the database, MongoDB has been working on a significant overhaul of the networking around database operations. Operations such as inserts, updates, and deletes, which used to require multiple packets, now use only one, resulting in significantly less network overhead.
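The difference can be sketched with a simplified model. This is an illustration under assumptions, not the actual wire format: the dictionaries below stand in for protocol messages, and the function names are hypothetical. The talk recap does not name the mechanism; the sketch assumes the pre-2.6 pattern of a fire-and-forget insert followed by a separate `getLastError` command, and the 2.6 `insert` write command, which carries the write concern in the same request.

```python
def legacy_insert_messages(collection, documents):
    """Older pattern: an unacknowledged insert, then a separate error check."""
    return [
        {"op": "OP_INSERT", "collection": collection, "documents": documents},
        {"op": "OP_QUERY", "command": {"getLastError": 1, "w": 1}},
    ]


def write_command_messages(collection, documents):
    """Write-command pattern: one request whose reply reports the outcome."""
    return [
        {
            "op": "OP_QUERY",
            "command": {
                "insert": collection,
                "documents": documents,
                "ordered": True,
                "writeConcern": {"w": 1},
            },
        }
    ]


new_docs = [{"_id": 1, "name": "example"}]  # hypothetical document
legacy = legacy_insert_messages("users", new_docs)
modern = write_command_messages("users", new_docs)
print(len(legacy), "messages before;", len(modern), "message after")
```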

Unlimited result sizes for aggregation framework queries

When using the MongoDB aggregation framework on large data sets, it's possible for an interim result in the pipeline to exceed the size of RAM. Currently, result sizes from the pipeline are limited by RAM; with the new release, that limit is lifted.
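As a rough sketch of how this might look in practice, the Python snippet below builds an aggregate command document. It assumes the 2.6 `aggregate` command options `allowDiskUse` (let pipeline stages spill to temporary files) and `cursor` (return results through a cursor rather than a single document); neither option was named in the talk, and the collection name, field names, and helper function are hypothetical.

```python
def build_aggregate_command(collection, pipeline, allow_disk_use=True):
    """Build an aggregate command document (a plain dict, nothing is sent)."""
    return {
        "aggregate": collection,         # command name must be the first key
        "pipeline": pipeline,
        "cursor": {},                    # ask for cursor-based results
        "allowDiskUse": allow_disk_use,  # assumption: 2.6 option for large stages
    }


# Hypothetical pipeline: count documents per city, largest groups first.
pipeline = [
    {"$group": {"_id": "$city", "total": {"$sum": 1}}},
    {"$sort": {"total": -1}},
]

command = build_aggregate_command("visits", pipeline)
print(command)
```

The dict only describes the request; sending it to a server requires a driver, which is left out here to keep the sketch self-contained.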

Looking beyond 2.6

Of course, MongoDB will continue to innovate after 2.6 is released. Eliot also shared his vision for the longer-term roadmap.

Vertical scalability

MongoDB has a strong horizontal scaling story and is looking to improve its vertical scaling story. On the roadmap are improvements to CPU concurrency, index RAM consumption, and spinning-disk performance.

Replica sets

Unlike shards, which can run up to the thousands, replica sets are currently limited to twelve member nodes due to election logic. Large MongoDB deployments will be able to deploy more members in the future.

Along with support for additional members, MongoDB is looking to introduce more types of replica set members. For example, some member nodes may be used solely for analytics and not for production traffic.

Got questions?

We look forward to hearing your thoughts and questions about the future of MongoDB. For specific questions, you can write to our team at any time and we'll be happy to discuss or help point you in the right direction.