Introducing: MongoDB: David J. C. Beach
MongoDB
David J. C. Beach
WARNING: extreme
oversimplification
Past: “Relational” (RDBMS)
Graph (neo4j)
Document Oriented (Mongo, Couch, etc...)
categories are incomplete
Document Databases,
Key-Value Stores,
and RDBMSes.”
automatic sharding
support (v1.6)*
easy-to-use API
C, C++, Java
PHP, JavaScript
Download: http://pypi.python.org/pypi/pymongo/1.7
Database
Collection
Document
>>> db = connection.mydatabase
Contains documents
>>> blog.insert(entry1)
ObjectId('4c3a12eb1d41c82762000001')
document
>>> entry1
{'_id': ObjectId('4c3a12eb1d41c82762000001'),
'body': "Here's a document to insert.",
'title': 'Mongo Tutorial'}
Mongo’s IDs are designed to be unique...
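The two IDs printed on these slides differ in their leading and trailing bytes but share the same middle section. A minimal sketch of why, assuming the 12-byte ObjectId layout starts with a 4-byte big-endian Unix timestamp and ends with a 3-byte counter; the middle 5 bytes are treated as opaque here because their meaning has varied between MongoDB versions. `split_object_id` is a hypothetical helper, not part of PyMongo.

```python
from datetime import datetime, timezone


def split_object_id(hex_id):
    """Split a 24-character ObjectId hex string into its assumed parts."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectId is 12 bytes (24 hex characters)")
    seconds = int.from_bytes(raw[:4], "big")  # leading 4 bytes: Unix timestamp
    return {
        "created": datetime.fromtimestamp(seconds, tz=timezone.utc),
        "middle": raw[4:9].hex(),                   # opaque here (assumption)
        "counter": int.from_bytes(raw[9:], "big"),  # trailing 3 bytes
    }


# The two IDs shown on the slides.
first = split_object_id("4c3a12eb1d41c82762000001")
second = split_object_id("4c3a1a501d41c82762000002")
print(first)
print(second)
```

Both IDs carry the same middle bytes, and the trailing counter goes from 1 to 2, which is what keeps IDs generated by one client from colliding.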
>>> blog.insert(entry2)
ObjectId('4c3a1a501d41c82762000002')
another document
Sunday, August 1, 2010
Indexing
>>> blog.ensure_index("tags")
by multiple values
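Indexing "by multiple values" means an index on a list field such as tags gets one entry per list element. The sketch below is a conceptual illustration in plain Python, not how MongoDB stores its indexes; `build_multikey_index` and the sample documents are made up for this example.

```python
from collections import defaultdict


def build_multikey_index(docs, field):
    """Map every value found in `field` to the _ids of documents holding it."""
    index = defaultdict(list)
    for doc in docs:
        values = doc.get(field, [])
        if not isinstance(values, list):
            values = [values]  # treat a scalar field as a one-element list
        for value in values:
            index[value].append(doc["_id"])
    return index


sample_docs = [
    {"_id": 1, "tags": ["bulk", "Red"]},
    {"_id": 2, "tags": ["bulk", "Green"]},
    {"_id": 3, "tags": ["Green"]},
]
tag_index = build_multikey_index(sample_docs, "tags")
print(dict(tag_index))
```

A lookup for one tag then only touches that tag's entry instead of scanning every document.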
import random

bulk_entries = []
for i in range(100000):
    entry = {"title": "Bulk Entry #%i" % (i + 1),
             "body": "What Content!",
             "author": random.choice(["David", "Robot"]),
             "tags": ["bulk",
                      random.choice(["Red", "Blue", "Green"])]}
    bulk_entries.append(entry)
>>> blog.insert(bulk_entries)
{u'_id': ObjectId('4c3a1e411d41c82762018a89'),
u'author': u'Robot',
u'body': u'What Content!',
u'tags': [u'bulk', u'Green'],
u'title': u'Bulk Entry #99999'}
returned in 0.04s - extremely fast
No index created for “title”!
{u'_id': ObjectId('4c3a1e411d41c82762018a89'),
u'author': u'Robot',
u'body': u'What Content!',
u'tags': [u'bulk', u'Green'],
u'title': u'Bulk Entry #99999'}
presumably, need more entries to effectively test index performance...
>>> green_items = [ ]
>>> for item in blog.find({"tags": "Green"}):
...     green_items.append(item)
- or -
>>> green_items = list(blog.find({"tags": "Green"}))
16646
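The query above asks for a single string, yet it matches documents whose tags field is a list. A simplified sketch of that matching rule in plain Python: a list field matches when any element equals the requested value. `matches` and the sample documents are hypothetical, and real MongoDB matching has more cases than this.

```python
def matches(doc, query):
    """Return True if every field in `query` matches `doc` (simplified)."""
    for field, wanted in query.items():
        actual = doc.get(field)
        if isinstance(actual, list):
            if wanted not in actual:  # list field: any element may match
                return False
        elif actual != wanted:        # scalar field: plain equality
            return False
    return True


sample_docs = [
    {"title": "A", "author": "Robot", "tags": ["bulk", "Green"]},
    {"title": "B", "author": "David", "tags": ["bulk", "Red"]},
    {"title": "C", "author": "Robot", "tags": ["bulk", "Blue"]},
]
green_titles = [d["title"] for d in sample_docs if matches(d, {"tags": "Green"})]
robot_bulk = [d["title"] for d in sample_docs
              if matches(d, {"tags": "bulk", "author": "Robot"})]
print(green_titles, robot_bulk)
```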
use remove(...)
Regular Expressions
{"tags" : re.compile(r"^(Green|Blue)$")}
$or, $not
$elemMatch
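For the regular-expression filter above, anchors and alternation interact in a way that is easy to miss: `^Green|Blue$` parses as "starts with Green" or "ends with Blue", not "exactly Green or exactly Blue". A quick check with Python's `re` module, using made-up sample strings:

```python
import re

loose = re.compile(r"^Green|Blue$")     # (^Green) or (Blue$)
strict = re.compile(r"^(Green|Blue)$")  # exactly "Green" or exactly "Blue"

samples = ["Green", "Blue", "Greenish", "Navy Blue", "Red"]
loose_hits = [s for s in samples if loose.search(s)]
strict_hits = [s for s in samples if strict.search(s)]
print(loose_hits)
print(strict_hits)
```

Grouping the alternatives is the safer choice when the intent is to match whole tag values.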
collection.find(...)
sort("name") - sorting
collection.map_reduce(mapper, reducer)
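In PyMongo the mapper and reducer passed to map_reduce are JavaScript functions that run on the server. The sketch below only imitates the data flow (map, group by key, reduce) in plain Python so it can be run without a database; `run_map_reduce`, `tag_mapper`, `count_reducer`, and the sample documents are hypothetical.

```python
from collections import defaultdict


def run_map_reduce(docs, mapper, reducer):
    """Apply mapper to each doc, group emitted values by key, reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in mapper(doc):
            grouped[key].append(value)
    return {key: reducer(key, values) for key, values in grouped.items()}


def tag_mapper(doc):
    # Emit (tag, 1) once per tag on the document.
    for tag in doc.get("tags", []):
        yield tag, 1


def count_reducer(key, values):
    # Sum the emitted 1s to count documents per tag.
    return sum(values)


sample_docs = [
    {"title": "A", "tags": ["bulk", "Green"]},
    {"title": "B", "tags": ["bulk", "Red"]},
    {"title": "C", "tags": ["bulk", "Green"]},
]
tag_counts = run_map_reduce(sample_docs, tag_mapper, count_reducer)
print(tag_counts)
```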
[Figure: MapReduce logical data flow]
also see: Map/Reduce : A Visual Explanation
Diagram Credit: Hadoop: The Definitive Guide by Tom White; O’Reilly Books, Chapter 2, page 20
MySQL

SELECT
  Dim1, Dim2,
  SUM(Measure1) AS MSum,
  COUNT(*) AS RecordCount,
  AVG(Measure2) AS MAvg,
  MIN(Measure1) AS MMin,
  MAX(CASE
    WHEN Measure2 < 100
    THEN Measure2
  END) AS MMax
FROM DenormAggTable
WHERE (Filter1 IN ('A','B'))
  AND (Filter2 = 'C')
  AND (Filter3 > 123)
GROUP BY Dim1, Dim2
HAVING (MMin > 0)
ORDER BY RecordCount DESC
LIMIT 4, 8

MongoDB

db.runCommand({
  mapreduce: "DenormAggCollection",
  query: {
    filter1: { '$in': [ 'A', 'B' ] },
    filter2: 'C',
    filter3: { '$gt': 123 }
  },
  map: function() { emit(
    { d1: this.Dim1, d2: this.Dim2 },
    { msum: this.measure1, recs: 1, mmin: this.measure1,
      mmax: this.measure2 < 100 ? this.measure2 : 0 }
  );},
  reduce: function(key, vals) {
    var ret = { msum: 0, recs: 0, mmin: 0, mmax: 0 };
    for(var i = 0; i < vals.length; i++) {
      ret.msum += vals[i].msum;
      ret.recs += vals[i].recs;
      if(vals[i].mmin < ret.mmin) ret.mmin = vals[i].mmin;
      if((vals[i].mmax < 100) && (vals[i].mmax > ret.mmax))
        ret.mmax = vals[i].mmax;
    }
    return ret;
  },
  finalize: function(key, val) {
    val.mavg = val.msum / val.recs;
    return val;
  },
  out: 'result1',
  verbose: true
});
db.result1.
  find({ mmin: { '$gt': 0 } }).
  sort({ recs: -1 }).
  skip(4).
  limit(8);

Numbered callouts from the chart:
1. Grouped dimension columns are pulled out as keys in the map function, reducing the size of the working set.
2. Measures must be manually aggregated.
3. Aggregates depending on record counts must wait until finalization.
4. Measures can use procedural logic.
5. Filters have an ORM/ActiveRecord-looking style.
6. Aggregate filtering must be applied to the result set, not in the map/reduce.
7. Ascending: 1; Descending: -1

Revision 4, Created 2010-03-06
Rick Osborne, rickosborne.org
http://rickosborne.org/download/SQL-to-MongoDB.pdf
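The chart's emit, reduce, and finalize steps, followed by a filter and sort on the result, can be traced in plain Python. This is a rough sketch over made-up rows, mirroring only the msum, recs, and mavg fields; the function names and data are hypothetical, and on a real server these steps run as JavaScript.

```python
from collections import defaultdict

rows = [
    {"Dim1": "x", "Dim2": "p", "Measure1": 10},
    {"Dim1": "x", "Dim2": "p", "Measure1": 30},
    {"Dim1": "y", "Dim2": "q", "Measure1": 5},
]


def emit_row(row):
    # Grouping columns become the key; measures go into the value.
    return (row["Dim1"], row["Dim2"]), {"msum": row["Measure1"], "recs": 1}


def reduce_values(values):
    # Measures are aggregated by hand, as in the chart's reduce step.
    total = {"msum": 0, "recs": 0}
    for value in values:
        total["msum"] += value["msum"]
        total["recs"] += value["recs"]
    return total


def finalize(value):
    # The average depends on the record count, so it waits until the end.
    result = dict(value)
    result["mavg"] = result["msum"] / result["recs"]
    return result


grouped = defaultdict(list)
for row in rows:
    key, value = emit_row(row)
    grouped[key].append(value)

results = {key: finalize(reduce_values(vals)) for key, vals in grouped.items()}

# Filtering and ordering are applied to the finished result set.
ranked = sorted(
    (item for item in results.items() if item[1]["msum"] > 0),
    key=lambda item: item[1]["recs"],
    reverse=True,
)
print(ranked)
```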
Map/Reduce
Examples
"weighings": [
    ... ]
Linear scaling
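[Chart: Insert. Log-scale time axis in seconds; x-axis ticks 1k, 10k, 100k. Labelled times: 3.14s, 29.5s, 292s]
[Chart, title not recovered. Same axes. Labelled times: 4.29s, 38.8s, 384s]
[Chart, title not recovered. Same axes. Labelled times: 1.23s, 10s, 108s]
[Chart, title not recovered. Same axes, two series. Labelled times: 1.23s, 10s, 108s and 0.37s, 2.2s, 26s]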
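One way to check the "linear scaling" claim against the Insert chart is to look at the slope on log-log axes, where a slope near 1 means time grows in proportion to size. The sketch assumes the three labelled times (3.14s, 29.5s, 292s) belong to the 1k, 10k, and 100k ticks in increasing order; that pairing is inferred from the chart layout, not stated on the slide.

```python
import math

# (x-axis tick, seconds); the pairing of times to ticks is an assumption.
insert_timings = [(1_000, 3.14), (10_000, 29.5), (100_000, 292.0)]


def loglog_slope(points):
    """Slope between the first and last point on log-log axes."""
    (x0, y0), (x1, y1) = points[0], points[-1]
    return math.log(y1 / y0) / math.log(x1 / x0)


slope = loglog_slope(insert_timings)
ms_per_item = [1000 * seconds / count for count, seconds in insert_timings]
print(round(slope, 3), [round(ms, 2) for ms in ms_per_item])
```

Under that assumption the per-item cost stays close to 3 ms across two orders of magnitude, which is consistent with the slide's title.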
www.mongodb.org
PyMongo
api.mongodb.org/python
MongoDB
The Definitive Guide
O’Reilly
www.10gen.com