Introducing Dex: the Index Bot

(update 2012-07-19: A new remote feature detailed here.)
(update 2012-10-09: Version 0.5 detailed here.)

Greetings adventurers! MongoLab is pleased to introduce Dex! We hope he assists you on your quests.

Dex is a MongoDB performance tuning tool, open-sourced under the MIT license, that compares logged queries to available indexes in the queried collection(s) and generates index suggestions based on a simple rule of thumb. To use Dex, you provide a path to your MongoDB log file and a connection URI for your database. Dex is also registered with PyPI, so you can install it easily with pip.

Quick Start

pip install dex


dex -f mongodb.log mongodb://localhost

Dex provides runtime output to STDERR as it finds recommendations:

"index": "{'simpleIndexedField': 1, 'simpleUnindexedFieldThree': 1}",
"namespace": "dex_test.test_collection",
"shellCommand": "db.test_collection.ensureIndex(
  {'simpleIndexedField': 1, 'simpleUnindexedFieldThree': 1},
  {'background': true})"

As well as summary statistics:

Total lines read: 7
Understood query lines: 7
Unique recommendations: 5
Lines impacted by recommendations: 5

Just copy and paste each shellCommand value into your MongoDB shell to create the suggested indexes.

Dex also provides the complete analysis on STDOUT when it's done, so you will see this information repeated before Dex exits. The output to STDOUT is an entirely JSON version of the above, so Dex can be part of an automated toolchain.
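As a rough illustration of the toolchain idea, here is a minimal Python sketch that pulls every shellCommand value out of Dex's JSON output. The exact shape of that JSON is not documented in this post, so the sketch walks the whole document recursively instead of assuming a layout. The function name collect_shell_commands and the sample document are hypothetical; only the "shellCommand" key comes from the output shown above.

```python
import json


def collect_shell_commands(node):
    """Recursively gather every 'shellCommand' string found in parsed JSON."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "shellCommand" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(collect_shell_commands(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_shell_commands(item))
    return found


# Hypothetical sample document; real Dex output may be shaped differently.
sample = json.loads("""
{"results": [{"namespace": "dex_test.test_collection",
              "shellCommand": "db.test_collection.ensureIndex({'a': 1}, {'background': true})"}]}
""")

commands = collect_shell_commands(sample)
for command in commands:
    print(command)
```

In practice you would capture Dex's STDOUT into a file or pipe and pass the parsed result to the same function. Reviewing the commands before running them against a production database is probably wise.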

For more information check out the and tour the source code at Or if you're feeling extra adventurous, fiddle with the source yourself!

git clone

The motivation behind Dex

MongoLab manages tens of thousands of MongoDB databases, heightening our sensitivity to slow queries and their impact on CPU. What started as a set of operational heuristics has been cast as an automated tool. For example, "Create indexes with this order: Exact values first. Sorted fields next. Ranged fields after that." It's worked very well for us and we hope that it'll work as well for your own MongoDB databases, even if you never host one with us. We'll continue to improve Dex -- see the future directions below -- and would love your feedback, suggestions, and requests.

How it works

At a high level, Dex reads the MongoDB log and performs three steps:

  1. Parse the query
  2. Evaluate existing indexes against query
  3. Recommend an index (if necessary)

Step 1: Parse the query

Each query is parsed into an internal representation that classifies each query term into one of:

  • EQUIV - a standard equivalence check; ex: {a: 1}
  • SORT - a sort/orderby clause; ex: .sort({a: 1})
  • RANGE - a range or set check; specifically: '$ne', '$gt', '$lt', '$gte', '$lte', '$in', '$nin', '$all', '$not'
  • UNSUPPORTED - anything else:
    • Composite ($and, $or, $nor)
    • Nested "operators" not included in RANGE above.
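To make the classification concrete, here is a small Python sketch of how query terms could be sorted into these buckets. It is not Dex's actual parser: the function name classify_query, the dictionary-based query format, and the rule that a field appearing in both the query and the sort keeps its query classification are all assumptions made for illustration. The UNSUPPORTED label is the name Step 3 uses for terms Dex cannot handle.

```python
RANGE_OPERATORS = {"$ne", "$gt", "$lt", "$gte", "$lte",
                   "$in", "$nin", "$all", "$not"}
COMPOSITE_OPERATORS = {"$and", "$or", "$nor"}


def classify_query(query, sort=None):
    """Map each field to 'EQUIV', 'SORT', 'RANGE', or 'UNSUPPORTED'."""
    terms = {}
    for field, condition in query.items():
        if field in COMPOSITE_OPERATORS:
            # Composite terms are not analyzed.
            terms[field] = "UNSUPPORTED"
        elif isinstance(condition, dict) and any(
                key.startswith("$") for key in condition):
            # An operator document: RANGE only if every operator is listed.
            if all(op in RANGE_OPERATORS for op in condition):
                terms[field] = "RANGE"
            else:
                terms[field] = "UNSUPPORTED"
        else:
            # A plain value is an equivalence check.
            terms[field] = "EQUIV"
    # Assumption: a sort field that is also queried keeps its query type.
    for field in (sort or {}):
        terms.setdefault(field, "SORT")
    return terms


example = classify_query(
    {"a": 1, "b": {"$gt": 5, "$lt": 9}, "c": {"$regex": "^x"}},
    sort={"d": 1},
)
print(example)
```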

Step 2: Evaluate existing indexes against query 

The query is evaluated against each index according to two criteria:
  • Coverage (none, partial, full) - a less granular description of fields covered. None corresponds to Fields Covered 0 and indicates the index is not used by the query. Full means the number of fields covered is equal to the number of fields in the query. Partial describes any number of fields covered between None and Full.
  • Order (ideal or not) - describes whether the index is partially-ordered according to Dex's notion of ideal index order. This notion of order is:
    Equivalence -- Sort -- Range
    This ordering is a synthesis of conventional indexing wisdom and a rule of thumb developed to avoid expensive MongoDB scanAndOrder operations when performing sorted range queries.

Note: Geospatial queries and indexes are unsupported. Index evaluation is performed but Dex will not make recommendations for Geospatial queries. Analysis continues but the index is no longer considered for recommendation purposes.
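The two criteria above might be sketched in Python as follows. This is an illustration, not Dex's implementation. It assumes "fields covered" means the leading index fields that also appear in the query (counting stops at the first index field the query does not use), and it treats an unused index as not ideally ordered. The function name evaluate_index and the terms dictionary (field name mapped to EQUIV, SORT, or RANGE) are hypothetical.

```python
ORDER_RANK = {"EQUIV": 0, "SORT": 1, "RANGE": 2}


def evaluate_index(index_fields, terms):
    """Return (coverage, ideal_order) for one index against one query.

    index_fields: index keys in index order, e.g. ["a", "b"]
    terms: {field: "EQUIV" | "SORT" | "RANGE"} for the query
    """
    # Assumption: only a leading run of index fields counts as covered.
    covered = []
    for field in index_fields:
        if field not in terms:
            break
        covered.append(field)

    if not covered:
        coverage = "none"
    elif len(covered) == len(terms):
        coverage = "full"
    else:
        coverage = "partial"

    # Ideal order: covered fields never step back from RANGE to SORT
    # or from SORT to EQUIV.
    ranks = [ORDER_RANK[terms[field]] for field in covered]
    ideal_order = bool(covered) and ranks == sorted(ranks)
    return coverage, ideal_order


terms = {"a": "EQUIV", "b": "SORT", "c": "RANGE"}
print(evaluate_index(["a", "b", "c"], terms))
print(evaluate_index(["c", "a"], terms))
```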

Step 3: Recommend an index (if necessary)

Once evaluation is complete, Dex considers an ideal index to have Coverage 'full' and Ideal Order true. If these conditions are not met, and the query itself contains no UNSUPPORTED terms, Dex recommends an ideal index (with an index order of 1 for all fields).
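A minimal Python sketch of this recommendation step is shown below. It is an approximation under stated assumptions, not Dex's code: the function name recommend_index, the existing_is_ideal flag, and the terms dictionary (field name mapped to its classification) are hypothetical, and the output strings simply mimic the format of the sample output earlier in this post.

```python
def recommend_index(terms, collection, existing_is_ideal=False):
    """Build an index recommendation, or return None if none is needed."""
    # No recommendation if an ideal index already exists or if the
    # query contains terms that cannot be analyzed.
    if existing_is_ideal or "UNSUPPORTED" in terms.values():
        return None

    rank = {"EQUIV": 0, "SORT": 1, "RANGE": 2}
    # sorted() is stable, so fields of the same type keep query order.
    ordered = sorted(terms, key=lambda field: rank[terms[field]])

    # Every field gets an index order of 1.
    index = "{%s}" % ", ".join("'%s': 1" % field for field in ordered)
    shell_command = "db.%s.ensureIndex(%s, {'background': true})" % (
        collection, index)
    return {"index": index, "shellCommand": shell_command}


recommendation = recommend_index(
    {"r": "RANGE", "e": "EQUIV", "s": "SORT"}, "test_collection")
print(recommendation["shellCommand"])
```

Sorting by term type reproduces the Equivalence -- Sort -- Range ordering regardless of the order in which the fields appeared in the query.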

Note: Dex does not really need to look at existing indexes in order to suggest the ideal index for a given query. But Dex does examine existing indexes as a courtesy to users who already have them in place -- both to provide analysis of partial indexes (in verbose mode), and to avoid suggesting indexes that already exist.

Future Directions

  • Line Parsers: Better coverage of log lines, with a goal of complete coverage of all indexable queries
  • Analyze the system.profile collection with -p option
  • Constrain analysis by a time range with -t option
  • Add Dex's own "SLOW_MS" to narrow results if desired
  • Support geospatial queries
  • Improved recommendation caching, storing queries by mask and summing:
    • Number of like queries
    • Time consumed
    • Min/max time range
    • Min/max nscanned/nreturned
  • Improved recommendations:
    • Combine like recommendations (or generate recommendations from multiple like queries)
    • Measure cardinality (yay aggregation framework) in the collection to inform recommended index key order.


We're really excited about Dex at MongoLab and look forward to the many improvements that are possible in the MongoDB automation space. I'm presenting Dex at the June MongoDB San Francisco User's Group today, June 19, 2012. If you can make it on such short notice, check out the details here. We will follow up with a link to presentation slides a short time later.

Finally, for those interested in the indexing knowledge we've accumulated as Dex was built, check out my Cardinal $ins blog here.

As always, good luck out there!


(update: Slides from Eric's SF Mongo User Group talk on Jun 19, 2012 are here.)
