*** Query Optimization with Indexing ***
1. Make sure you have MongoDB Database Tools installed
(https://docs.mongodb.com/database-tools/installation/installation/).
2. Make sure you have a local MongoDB instance running (see lab #2) or use the
remote MongoDB Atlas instance (see lab #1).
3. Download the sample database containing daily NASDAQ summaries
http://mng.bz/ii49
4. Use mongorestore tool to upload the sample NASDAQ database
Syntax: mongorestore <options> <connection-string> <directory or file to restore>
mongorestore --db=stocks "mongodb://mongoadmin:secret@localhost:27888/?authSource=admin" dump/stocks
5. Connect to MongoDB using shell
mongosh "mongodb://mongoadmin:secret@localhost:27888/?authSource=admin"
6. Find the most recent entry for Google's stock price (the query sorts by date, newest first):
> use stocks
> db.values.find({"stock_symbol": "GOOG"}).sort({date: -1}).limit(1)
Notice how slow the query is; MongoDB may also issue a warning.
7. Enable profiling (level 2 records all operations)
> db.setProfilingLevel(2)
8. See the execution statistics for the highest closing price in the data set:
> db.values.find({}).sort({close: -1}).limit(1).explain("executionStats")
Notice the "totalDocsExamined" field shows 4308303, which means that all of the
roughly 4.3 million documents in the collection have been scanned.
9. Create an index on the "close" field
> db.values.createIndex({close: 1})
10. Let's try the same query again
> db.values.find({}).sort({close: -1}).limit(1).explain("executionStats")
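Notice the "totalDocsExamined" field now shows 1, which means that just one document
has been scanned instead of the whole collection: a tremendous difference. A
single-field index can be traversed in either direction, so the ascending index
also serves the descending sort.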
11. Fetch all of Google's closing values greater than 200:
> db.values.find({stock_symbol: "GOOG", close: {$gt: 200}})
Run the query with the "explain" option, like so:
> db.values.find({stock_symbol: "GOOG", close: {$gt: 200}}).explain("executionStats")
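Notice that the "nReturned" number of documents is 730 but the "totalDocsExamined"
number of examined documents is 5299, even though we have an index on the "close"
field. That index narrows the search by closing price only, so documents belonging
to other stocks still have to be examined and discarded.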
12. Create a compound index on stock_symbol and close
> db.values.createIndex({stock_symbol: 1, close: 1})
13. Rerun the previous query
> db.values.find({stock_symbol: "GOOG", close: {$gt: 200}}).explain("executionStats")
Notice that the query execution is now optimal: the number of returned documents
("nReturned") equals the number of examined documents ("totalDocsExamined"), i.e.
730.
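The comparisons in steps 8 to 13 all come down to reading a few fields out of the
explain("executionStats") output. The sketch below shows one possible way to condense
that output into a single ratio. The helper name summarizeExplain and the derived
field docsPerResult are hypothetical (they are not part of MongoDB), and the two
sample objects are stand-ins that only carry the counts quoted in steps 11 and 13.

```javascript
// Hypothetical helper (not a MongoDB API): condenses the output of
// cursor.explain("executionStats") into the numbers compared in this lab.
function summarizeExplain(explainResult) {
  const stats = explainResult.executionStats;
  if (!stats) {
    throw new Error('No executionStats found; run explain("executionStats")');
  }
  const { nReturned, totalKeysExamined, totalDocsExamined, executionTimeMillis } = stats;
  return {
    nReturned,
    totalKeysExamined,
    totalDocsExamined,
    executionTimeMillis,
    // Documents examined per document returned; 1 means no wasted reads.
    // null when the query returned nothing, to avoid dividing by zero.
    docsPerResult: nReturned > 0 ? totalDocsExamined / nReturned : null,
  };
}

// Stand-in objects carrying only the counts reported in steps 11 and 13.
const beforeCompoundIndex = { executionStats: { nReturned: 730, totalDocsExamined: 5299 } };
const afterCompoundIndex = { executionStats: { nReturned: 730, totalDocsExamined: 730 } };

console.log(summarizeExplain(beforeCompoundIndex).docsPerResult.toFixed(2)); // prints 7.26
console.log(summarizeExplain(afterCompoundIndex).docsPerResult);             // prints 1
```

In mongosh, the function can be pasted into the shell and given a real result,
for example:
summarizeExplain(db.values.find({stock_symbol: "GOOG", close: {$gt: 200}}).explain("executionStats"))
A ratio close to 1 suggests the index matches the query well; a large ratio
suggests many documents are fetched only to be filtered out.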
NEXT:
We will put relational databases to shame and will explore the aggregation
pipeline.
Familiarize yourself with aggregation operators (see reference docs at
https://docs.mongodb.com/manual/aggregation/).