Monitor MongoDB (query) load with Custom Metrics

Sometimes standard instrumentation isn't enough to track the root cause of a performance issue. Custom metrics help us track the missing pieces.

Thanks to AppSignal's performance graphs we know exactly what part of our codebase causes performance issues. In the screenshot below you can see we had a severe slowdown because of MongoDB:

What the graph doesn't tell us is which of our many databases, spread across different replica sets, caused the slowdown. With our Custom Metrics platform we can answer that question at a glance.

Mongo::Monitoring

With the new 2.x Ruby driver, the mongo gem exposes a monitoring API. We use this to track every query sent to the database.

Here's a subscriber class that implements the three required methods (started, succeeded and failed) and sends the data to AppSignal:

# config/initializers/mongo_command_subscriber.rb
class MongoCommandSubscriber
  # Only report databases that are configured as Mongoid clients
  VALID_DATABASES = Mongoid.clients.map { |_name, config| config['database'] }

  # Required by the monitoring API; there is nothing to record when a command starts
  def started(event)
  end

  def failed(event)
    finished(event)
  end

  def succeeded(event)
    finished(event)
  end

  # Failed and succeeded commands both count towards the query load
  def finished(event)
    database = event.database_name
    duration = event.duration
    return unless VALID_DATABASES.include?(database)

    Appsignal.increment_counter("query_count.#{database}", 1)
    Appsignal.add_distribution_value("query_duration.#{database}", duration)
  end
end

# Subscribe to all COMMAND queries with our subscriber class
Mongo::Monitoring::Global.subscribe(
  Mongo::Monitoring::COMMAND,
  MongoCommandSubscriber.new
)
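
To see what the subscriber above reports without a running MongoDB or an AppSignal agent, here is a minimal sketch of the same flow. Everything in it is a hypothetical stand-in: FakeEvent mimics only the two event readers we use (database_name and duration), MetricsRecorder replaces the Appsignal calls with in-memory hashes, and SketchSubscriber takes its database whitelist as an argument instead of reading it from Mongoid.

```ruby
# Hypothetical stand-in for a driver event: only the two readers we rely on.
FakeEvent = Struct.new(:database_name, :duration)

# Hypothetical stand-in for Appsignal: keeps the metrics in memory.
class MetricsRecorder
  attr_reader :counters, :distributions

  def initialize
    @counters = Hash.new(0)
    @distributions = Hash.new { |hash, key| hash[key] = [] }
  end

  def increment_counter(name, value)
    @counters[name] += value
  end

  def add_distribution_value(name, value)
    @distributions[name] << value
  end
end

# Same shape as the subscriber in this post, with its dependencies injected.
class SketchSubscriber
  def initialize(metrics, valid_databases)
    @metrics = metrics
    @valid_databases = valid_databases
  end

  # Nothing to record when a command starts.
  def started(_event); end

  def succeeded(event)
    finished(event)
  end

  def failed(event)
    finished(event)
  end

  private

  # Both outcomes are counted, but only for databases on the whitelist.
  def finished(event)
    database = event.database_name
    return unless @valid_databases.include?(database)

    @metrics.increment_counter("query_count.#{database}", 1)
    @metrics.add_distribution_value("query_duration.#{database}", event.duration)
  end
end

metrics = MetricsRecorder.new
subscriber = SketchSubscriber.new(metrics, ["app_production"])

subscriber.succeeded(FakeEvent.new("app_production", 0.012))
subscriber.failed(FakeEvent.new("app_production", 0.250))
subscriber.succeeded(FakeEvent.new("admin", 0.001)) # not whitelisted, ignored

puts metrics.counters.inspect
puts metrics.distributions.inspect
```

The two app_production commands end up in one counter and one distribution, while the command against the admin database is dropped. That whitelist check is what keeps internal databases out of the dashboard.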

Custom Metrics dashboard

Now that we're sending these metrics to AppSignal, we need a dashboard to visualize them. Let's create two graphs, one for the query count and one for the average query duration:

- title: "MongoDB Query Load"
  graphs:
    - title: "Database Query count"
      kind: count
      filter: "query_count/*"
      format: number
    - title: "Database average query duration"
      kind: measurement
      filter: "query_duration/*"
      format: duration

We use the filter key to provide a regex that matches the metric names we sent to AppSignal: one pattern for the query counts and one for the query durations.
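
As a rough illustration of how such a filter picks out one series per database, the sketch below applies the duration pattern to a handful of metric names in plain Ruby. It assumes the filter is evaluated as an unanchored regular expression against the metric name, which may differ from how AppSignal evaluates it; the database names are made up. Under that assumption the trailing `/*` matches zero or more slashes, so the pattern effectively selects by prefix.

```ruby
# Hypothetical metric names, following the naming used by the subscriber.
metric_names = [
  "query_count.app_production",
  "query_count.app_staging",
  "query_duration.app_production",
  "query_duration.app_staging"
]

# Assumption: the dashboard filter behaves like an unanchored Ruby regex.
duration_filter = Regexp.new("query_duration/*")

# Keep only the names the filter matches: one series per database.
duration_series = metric_names.grep(duration_filter)

puts duration_series
```

Because the database name is part of the metric name, a single filter yields a separate line for every database without listing them in the dashboard definition.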

The result

Now that we track the query count and duration per database and have a dashboard in place, we can see which database caused the spike in our performance graph:

This is one of many examples where Custom Metrics help us gain more insight into the overall performance of our (and your!) application.

If you'd like to give Custom Metrics a try, or need help in identifying and tracking valuable metrics, just let us know.