
MongoDB Introduction: Presenter: John Page

This document provides an overview of MongoDB, including:
• MongoDB is a modern document-oriented database designed for building applications. It is object-oriented rather than SQL-oriented.
• MongoDB is easy to scale, assumes business-critical use, and learns from 40 years of relational database management system experience.
• The document data model allows for dynamic schemas and embedding of related data.
• MongoDB offers drivers for popular programming languages, a command-line interface, complex queries, aggregation capabilities, and auto-sharding for scalability.
• MongoDB is well suited for complex data access, large data volumes, rapid development, and distributed deployments requiring high availability.

Uploaded by

tmartel_99
Copyright
© All Rights Reserved

MongoDB Introduction

Presenter: John Page


Agenda
• Technology
• Customers
• Company
• Community

Technology
MongoDB.

Modern Document-model database.


Designed to build today’s business applications.
• Object oriented not SQL oriented.
• Easy to scale.
• Business Critical is assumed.
• Lessons learned from 40 years of RDBMS.
Relational Model

EmpID  Name            Dept  Title  Manager  Payband
9950   Dunham, Justin  500   1500   6531     C

EmpBenPlanID  EmpFK  PlanFK
1             9950   100
2             9950   200

PlanID  BenFK  Plan
100     1      PPO Plus
200     2      Standard

BenID  Benefit
1      Health
2      Dental

TitleID  Title
1500     Product Manager

DeptID  Department
500     Marketing
Document Model

EmpID  Name            Dept       Title            Manager  Payband  Benefits
9950   Dunham, Justin  Marketing  Product Manager  6531     C        Health: PPO Plus; Dental: Standard

{ _id : 9950,
  employee_name: "Dunham, Justin",
  department : "Marketing",
  title : "Product Manager, Web",
  report_up: "Neray, Graham",
  pay_band: "C",
  benefits : [
    { type : "Health",
      plan : "PPO Plus" },
    { type : "Dental",
      plan : "Standard" }
  ]
}
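
As a rough illustration of the step from the relational model to the document model, the sketch below resolves the foreign keys from the relational slide once and embeds the results in a single document. It is plain Python with no database involved. The variable names, the `to_document` helper and the `manager_id` field are assumptions made for this example, not part of the deck.

```python
# Rows copied from the relational slide. The dict layout is an assumption
# made for this sketch; only the values come from the deck.
employees = [
    {"EmpID": 9950, "Name": "Dunham, Justin", "Dept": 500,
     "Title": 1500, "Manager": 6531, "Payband": "C"},
]
emp_ben_plans = [
    {"EmpBenPlanID": 1, "EmpFK": 9950, "PlanFK": 100},
    {"EmpBenPlanID": 2, "EmpFK": 9950, "PlanFK": 200},
]
plans = {100: {"BenFK": 1, "Plan": "PPO Plus"},
         200: {"BenFK": 2, "Plan": "Standard"}}
benefits = {1: "Health", 2: "Dental"}
titles = {1500: "Product Manager"}
departments = {500: "Marketing"}


def to_document(emp):
    """Resolve each foreign key once and embed the result (hypothetical helper)."""
    embedded = []
    for link in emp_ben_plans:
        if link["EmpFK"] == emp["EmpID"]:
            plan = plans[link["PlanFK"]]
            embedded.append({"type": benefits[plan["BenFK"]],
                             "plan": plan["Plan"]})
    return {
        "_id": emp["EmpID"],
        "employee_name": emp["Name"],
        "department": departments[emp["Dept"]],
        "title": titles[emp["Title"]],
        # The slide stores the manager's name; only the ID is available in
        # the relational rows, so this sketch keeps the ID.
        "manager_id": emp["Manager"],
        "pay_band": emp["Payband"],
        "benefits": embedded,
    }


doc = to_document(employees[0])
print(doc)
```

Reading the employee back then takes one lookup by `_id` instead of a join across six tables, which is the point the slide makes.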
MongoDB - Agility
Dynamic Schemas

EmpID  Name            Dept       Title            Manager  Payband  Benefits
9950   Dunham, Justin  Marketing  Product Manager  6531     C        Health: PPO Plus; Dental: Standard

EmpID  Name       Title  Payband  Bonus
9952   Joe White  CEO    E        20,000

EmpID  Name           Dept       Title     Manager  Payband  Shares
9531   Neray, Graham  Marketing  Director  9952     D        5000
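
A minimal sketch of what dynamic schemas mean in practice: the three employee records above sit in one collection even though each has a different set of fields. The list stands in for a collection, and `find` and `with_field` are hypothetical helpers written for this example; they are not driver calls.

```python
# The three records from the slide, held in one list that stands in for a
# collection. Field names are lower-cased versions of the slide's columns.
collection = [
    {"_id": 9950, "name": "Dunham, Justin", "dept": "Marketing",
     "title": "Product Manager", "manager": 6531, "payband": "C",
     "benefits": [{"type": "Health", "plan": "PPO Plus"},
                  {"type": "Dental", "plan": "Standard"}]},
    {"_id": 9952, "name": "Joe White", "title": "CEO",
     "payband": "E", "bonus": 20000},
    {"_id": 9531, "name": "Neray, Graham", "dept": "Marketing",
     "title": "Director", "manager": 9952, "payband": "D", "shares": 5000},
]


def find(docs, **criteria):
    """Return documents matching every criterion; a missing field never matches."""
    return [d for d in docs
            if all(k in d and d[k] == v for k, v in criteria.items())]


def with_field(docs, field):
    """Return documents that carry the given field at all."""
    return [d for d in docs if field in d]


marketing_ids = [d["_id"] for d in find(collection, dept="Marketing")]
bonus_ids = [d["_id"] for d in with_field(collection, "bonus")]
print(marketing_ids, bonus_ids)
```

The CEO record has no `dept` field, so it is simply skipped by the department query; no schema change or NULL column is needed to store it.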
MongoDB - Usability

Drivers
Drivers for most popular programming languages and frameworks: Java, Ruby, JavaScript, Perl, Python, Haskell

Shell
Command-line shell for interacting directly with database

> db.collection.insert({product: "MongoDB", type: "Document Database"})
>
> db.collection.findOne()
{
  "_id" : ObjectId("5106c1c2fc629bfe52792e86"),
  "product" : "MongoDB",
  "type" : "Document Database"
}
MongoDB - Utility
• Complex Indexed Queries, e.g. Age > 65 AND Male living near Lyon
• Aggregation, e.g. profit margin by age band:

Age    Profit Margin
1-17   0
18-35  20
36-50  80
51-65  50
66+    5
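
The two capabilities on this slide can be sketched in plain Python: a multi-condition filter and a group-by over age bands. The customer records and the helper names are invented for illustration, and "living near Lyon" is simplified to a city match rather than a geospatial query.

```python
from collections import defaultdict

# Hypothetical customer records; none of these values come from the deck.
customers = [
    {"age": 70, "sex": "M", "city": "Lyon", "margin": 4.0},
    {"age": 68, "sex": "F", "city": "Lyon", "margin": 6.0},
    {"age": 40, "sex": "M", "city": "Paris", "margin": 80.0},
    {"age": 30, "sex": "F", "city": "Lyon", "margin": 20.0},
    {"age": 72, "sex": "M", "city": "Paris", "margin": 5.0},
]

# Age bands as labelled on the slide; None means no upper bound.
BANDS = [(1, 17, "1-17"), (18, 35, "18-35"), (36, 50, "36-50"),
         (51, 65, "51-65"), (66, None, "66+")]


def band(age):
    """Map an age to its band label."""
    for low, high, label in BANDS:
        if age >= low and (high is None or age <= high):
            return label
    raise ValueError(f"age {age} is outside every band")


def older_men_in(city, docs):
    """Complex query: Age > 65 AND Male AND in the given city."""
    return [d for d in docs
            if d["age"] > 65 and d["sex"] == "M" and d["city"] == city]


def mean_margin_by_band(docs):
    """Aggregation: average margin per age band."""
    totals = defaultdict(lambda: [0.0, 0])
    for d in docs:
        entry = totals[band(d["age"])]
        entry[0] += d["margin"]
        entry[1] += 1
    return {label: total / count for label, (total, count) in totals.items()}


matches = older_men_in("Lyon", customers)
summary = mean_margin_by_band(customers)
print(matches, summary)
```

In a real deployment the filter would be served by an index and the grouping by the database's aggregation features; the sketch only shows the shape of the two operations.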
MongoDB - Scalability

• High Availability.
• Auto Sharding.
• Compression of data.
• Lock free access.
• Enterprise Management.
The database landscape.
• Key/Value Store
• Column Family
• Document Store
• Relational
When MongoDB should be used.
• When you need high-speed access to complex objects.
  – Atomic partial updates.
  – Fast retrieval.
  – Secondary queries.
  – Aggregation capabilities.

• When you want to store larger data structures.
  – Large arrays.
  – Long text fields.
  – Inline BLOBs.
When MongoDB should be used.
• When you value rapid development and evolution.
  – Direct object models, with no mapping layer.
  – Application-defined schemas.
  – Rich feature sets and search.

• Where you need to store structures of any shape.
  – Direct object models.
  – Application-defined schemas.
  – Heterogeneous schemas.
When MongoDB should be used.
• When you have large data volumes.
  – When data volumes are growing.
  – Where growth is potentially unlimited.
  – Where you don’t want to pay now for future growth.

• When you want distributed data access or high uptime.
  – Worldwide sites want low access times.
  – Data must legally stay at its point of origin.
  – Data mirroring should be as close to real time as possible.
How to use MongoDB.
• Easy to get a long way with no training or help.
  – But bad habits can be learned.
  – Performance can suffer when it shouldn’t.
  – Issues can surface too late!

• DBAs need to be respected, trained and certified.
  – Developers do many traditional DBA ops.
  – But DBAs have tasks too, and skills to hone.
  – Forget them at your peril.
Company
We’re your partner

MongoDB Overview
• 400+ employees
• 2000+ customers
• Offices in New York, Palo Alto, Washington DC, London, Dublin, Barcelona, Sydney and more
• Over $231 million in funding
Community
MongoDB - Global Community
10,000,000+
MongoDB Downloads
250,000+
Online Education Registrants
30,000+
MongoDB User Group Members
20,000+
MongoDB Days Attendees
35,000+
MongoDB Management Service (MMS) Users

Customers
MongoDB Use Cases
• Big Data
• Product & Asset Catalogs
• Security & Fraud
• Internet of Things
• Database-as-a-Service
• Mobile Apps
• Customer Data Management
• Data Hub
• Social & Collaboration
• Content Management

Example customers:
• Top Global Shipping Company
• Top US Retailer
• Top Investment and Retail Banks
• Top Media Company
• Intelligence Agencies
• Top Industrial Equipment Manufacturer
Case Study
Helping users find love faster through a more engaging dating platform

Problem
• Bi-directional matching process did not scale on a single monolithic database as the service grew: running a matching analysis of the user base was taking 2 weeks, detracting from customer experience
• Richer and more complex data models caused operational complexity and downtime, as schema changes required a full database dump and reload

Why MongoDB
• A flexible data model to seamlessly handle new user attributes
• The ability to scale on commodity hardware and not add operational overhead to a team already managing thousands of servers
• Support for complex, multi-attribute queries that provide the foundation of eHarmony’s compatibility matching system

Results
• 95% faster compatibility matching; matching the entire user base takes 12 hrs instead of 2 weeks
• 30% higher communication between prospective partners; 50% increase in paying subscribers; 60% increase in unique web visits
Case Study
Re-inventing eCommerce personalization for over 2 million users per day

Problem
• Product catalog with over 2 million products took over 12 hrs to update, resulting in stale catalogs and worse customer experiences
• Site was static, slow and expensive to change; it could not react quickly to market changes
• Only small fragments of the site could be changed to personalize the customer experience

Why MongoDB
• Flexible data model allowed Otto to quickly iterate data schema for changes to products, attributes, customer profiles
• All site interactions stored in MongoDB to enable personalized products, navigation and filters
• In-memory speed dramatically improved site response times

Results
• Products get to market faster: product catalog update time reduced to 15 minutes from 12 hrs
• Personalized experience for 30m shoppers, resulting in higher customer engagement, satisfaction and revenues
Case Study
Optimizing performance and reducing costs with tens of billions of records

Problem
• Legacy system became too cumbersome and expensive to manage as data volumes reached the 10s of billions

Why MongoDB
• Scale out on inexpensive commodity servers
• Built-in redundancy
• Flexible schema well suited for massive amounts of variable data

Results
• 10x increase in performance
• 12 billion docs, growing at 1 billion docs per year
• Support growth at a fraction of the footprint of legacy system
Case Study
Insurance leader generates coveted single view of customers in 90 days: “The Wall”

Problem
• No single view of customer, leading to poor customer experience and churn
• 145 years of policy data, 70+ systems, 15+ apps that are not integrated
• Spent 2 years, $25M trying to build single view with Oracle; failed

Why MongoDB
• Built “The Wall” pulling in disparate data and serving single view to customer service reps in real time
• Flexible data model to aggregate disparate data into single data store
• Expressive query language to serve any data in real time

Results
• Prototyped in 2 weeks
• Deployed to production in 90 days
• Decreased churn and improved ability to upsell/cross-sell
PHARMACEUTICALS ACCELERATES R&D WITH MONGODB

Their technology creates a synthetic version of messenger RNA, which helps create protein in cells. If successful, the proteins could fight cancer, among other diseases.

AstraZeneca explained recently that analyzing 88 whole human genomes took 15,000 hours and 171 terabytes (TB) of data. Analyzing a single human genome can take four days on an RDBMS.

With MongoDB, AstraZeneca’s experiment involved taking 10% of all its compounds and pulling in information from its disparate database systems. Using MongoDB, the company was able to execute Tanimoto comparisons on about 500,000,000 compounds in a matter of hours. “All of this, underneath my desk,” Tetrault says.
Molecular Similarity Database

• Store Chemical Compounds – Fingerprints


• Want to find compounds which are “close” to a given
compound
• Need to return quickly a small set of reasonable
candidates
• Use Tanimoto association coefficient to compare two
compounds based on their common fingerprints
Molecular Similarity Database

• Easy to implement with schema-less documents
• Simple to code with Python and the MongoDB driver
• On a single node, screen millions of compounds in subsecond time for queries such as
  – Find compounds with Tanimoto coeff >= 0.8 with compound A
• Easy to extend functionality with M/R and aggregation framework
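
For readers unfamiliar with the measure: for two fingerprint sets A and B, the Tanimoto coefficient is |A ∩ B| / |A ∪ B|. The sketch below computes it in plain Python and screens a small in-memory set of compounds at the 0.8 threshold named on the slide. The compound names and fingerprint values are invented, and the linear scan is only an illustration; it is not the indexed query the deck refers to.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprint sets: |a & b| / |a | b|."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similar(query, compounds, threshold=0.8):
    """Return (name, score) pairs scoring at least `threshold`, best first."""
    scored = ((name, tanimoto(query, fp)) for name, fp in compounds.items())
    return sorted((pair for pair in scored if pair[1] >= threshold),
                  key=lambda pair: -pair[1])


# Hypothetical compounds; each fingerprint is a set of feature IDs.
compounds = {
    "B": {1, 2, 3, 4, 5, 6},
    "C": {1, 2, 9},
    "D": {1, 2, 3, 4, 5},
}
compound_a = {1, 2, 3, 4, 5}

hits = similar(compound_a, compounds)
print(hits)
```

Compound C shares only two of six distinct features with A, so it falls well below the threshold and is left out of the candidate set.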
Big Data Genomics

• Very large base of DNA sample sequences


– Origin, collection method, sequence, date, …
• Enumeration of mutations relative to reference
sequence
– Positions, mutation type, base
• Need to retrieve efficiently all sequences showing a
particular mutation
Big Data Genomics

• Similar to Content Management System pattern


• Add tag array in sequence document with mutation
names
• Index tag array
• Queries looking for affected sequences are indexed
and very fast

• Easy to set up, flexible representation and details for sequences, flexible evolution
• Can scale to massive amounts
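
The tag-array pattern can be sketched as follows: each sequence document carries an array of mutation names, and an index maps each name to the documents containing it. The plain-Python index below only imitates the idea of indexing an array field. The sequence records, mutation names and helper functions are all invented for this example.

```python
from collections import defaultdict

# Hypothetical sequence documents; the "mutations" array plays the role of
# the tag array described on the slide.
sequences = [
    {"_id": "s1", "origin": "lab-1", "mutations": ["mutA", "mutB"]},
    {"_id": "s2", "origin": "lab-2", "mutations": ["mutC"]},
    {"_id": "s3", "origin": "lab-1", "mutations": ["mutA"]},
    {"_id": "s4", "origin": "lab-3"},  # no mutations recorded
]


def build_tag_index(docs, field="mutations"):
    """Map each tag to the set of document IDs whose array contains it."""
    index = defaultdict(set)
    for doc in docs:
        for tag in doc.get(field, []):
            index[tag].add(doc["_id"])
    return index


def find_by_mutation(index, name):
    """Look up affected sequences without scanning every document."""
    return sorted(index.get(name, ()))


index = build_tag_index(sequences)
affected = find_by_mutation(index, "mutA")
print(affected)
```

Each array element gets its own index entry, so a query for one mutation touches only the matching sequences regardless of how many mutations each sequence lists.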
Case Study
Met new requirements for personalization server serving over 20 million customers in record time

Problem
• Personalization server that acts as the ‘master’ storage for customer data was originally built on Oracle (over 14 months) but it performed below expectations, did not scale, and cost too much
• New performance requirements (40% more data to be stored, reload entire data warehouse of 22 million customers daily in a small window) could not be met with Oracle

Why MongoDB
• Dynamic schema for storing variable customer data
• Native fault tolerance and high availability
• Official drivers, production support, and significantly reduced costs

Results
• New version of personalization server was built on MongoDB in ¼ the time with ½ the team
• Performance boosts of more than an order of magnitude, 1/3 the storage requirements
• Decreased costs and increased revenues
Case Study
Quantitative investment manager with over $11.3 billion in assets under management invests heavily in new database

Problem
• AHL needed new technologies to be more agile and gain competitive advantages in the systematic trading space

Why MongoDB
• Once it was determined that MongoDB could significantly improve operations, the database was embraced for a number of applications, replacing RDBMS
• Faster data retrieval; faster compute times; better throughput for tick data

Results
• MongoDB was 100x faster in retrieving data
• Tick Data: Quickly scaled to 250 million ticks per second, a 25x improvement in tick throughput
• Cut disk storage 60%, and realized 40x cost savings by using commodity SSDs
Relational Database Challenges

Data Types
• Unstructured data
• Semi-structured data
• Polymorphic data

Agile Development
• Iterative
• Short development cycles
• New workloads

Volume of Data
• Petabytes of data
• Trillions of records
• Millions of queries per second

New Architectures
• Horizontal scaling
• Commodity servers
• Cloud computing
MongoDB Solution
Agility

RDBMS MongoDB

{
_id : ObjectId("4c4ba5e5e8aabf3"),
employee_name: "Dunham, Justin",
department : "Marketing",
title : "Product Manager, Web",
report_up: "Neray, Graham",
pay_band: "C",
benefits : [
{ type :  "Health",
plan : "PPO Plus" },
{ type :   "Dental",
plan : "Standard" }
]
}

Performance

• Better Data Locality
• In-Memory Caching
• In-Place Updates
Scalability

Auto-Sharding

• Increase capacity as you go


• Commodity and cloud architectures
• Improved operational simplicity and cost visibility
High Availability

• Automated replication and failover


• Multi-data center support
• Improved operational simplicity (e.g., HW swaps)
• Data durability and consistency
MongoDB Architecture
Sharding and Replication
Lower Total Cost of Ownership

Developer/Ops Savings
• Ease of Use
• Agile development
• Less maintenance

Hardware Savings
• Commodity servers
• Internal storage (no SAN)
• Scale out, not up

Software/Support Savings
• No upfront license
• Cost visibility for usage growth
70%+ Cost Takeout
• Dev. and Admin
• Compute: Scale-Up Servers
• Storage: SAN
MongoDB Products and Services
MongoDB Enterprise Advanced

Enterprise build with value-added capabilities


• Advanced Security w/Kerberos
• Ops Manager
– Visualization and alerts on 100+ system metrics
– 1 min RPO Backup
– Automation
– Enterprise Software Integration via SNMP
• Private, On-Demand MongoDB University Training
• Certified OS Support
Consulting

Dedicated Consulting Engineer
• Named MongoDB expert
• Advisory services
• Ongoing basis

Custom Consulting
• Assist with all phases of project
• E.g., config., testing, optimization, best practices

Health Check
• Assess overall status and health of existing MongoDB deployment

Lightning Consults also available


Training

Public
• Dev, admin, and combined courses available
• North America and EMEA

Private
• Customized to your needs
• For devs and admins
• On-Site

Online
• Free
• For devs and admins
• 7 weeks
• Weekly lectures, homework, final exam

Private, On-Demand MongoDB University Training included with MongoDB Enterprise Subscription
MongoDB Management Tools

Cloud-based suite of services for managing MongoDB deployments
• Monitoring, with charts, dashboards and alerts on 100+ metrics
• Backup and restore, with point-in-time recovery, support for sharded clusters
• MMS On-Prem included with MongoDB Enterprise (backup coming soon)
Case Study
Leading solutions provider for smart energy networks captures and stores complex M2M data

Problem
• Many utilities find that 70% of their enterprise data is generated from Smart Grid Networks, yet extracting that information is costly and resource intensive
• Utilities end up relying on siloed apps that are difficult to scale

Why MongoDB
• MongoDB seamlessly captures and stores high volumes of rapidly changing M2M data for new SilverLink Sensor Network
• SSN produces near real-time analytics of fast-moving data across different parameters: temporal, geospatial, sensor type

Results
• Allows utilities and app devs to extract actionable insights into energy usage patterns as they’re happening, for better apps and better customer experiences
• Allows for the delivery of new applications and services for utilities and energy consumers at 10x the speed and 1/10th the cost
Case Study
Multi-national banking and financial services firm meets strict SLAs by replatforming on new technology

Problem
• Globally distributed app did not meet SLAs required in delivering data to traders, resulting in SEC fines and damages to the reputation of the firm
• Complex infrastructure with many components was expensive and difficult to maintain

Why MongoDB
• Replatformed on a single database technology (MongoDB) for a simplified infrastructure
• Data replicated to data centers on multiple continents, bringing it closer to stakeholders and reducing the effects of geographic latency

Results
• Application in compliance with strict SLAs
• $40 M in savings over 5 years with simplified infrastructure and the use of commodity servers
Case Study
Powering next-generation SaaS for mission-critical government services in the Netherlands

Problem
• Brein BV operates in a highly competitive market where each government tender receives responses from multiple vendors; must have competitive advantage to survive
• Existing database technology was not evolving at the pace needed to keep up with new trends: online services, always-connected users and businesses

Why MongoDB
• MongoDB manages all the content from government forms and stores the business rules that enable automated workflow and collaboration
• Dynamic schema brings new flexibility to the solution
• Multi-node replica set, distributed across data centers, ensures always-on availability, critical to SLAs with customers

Results
• Greatly improved customer experience (writes are 23x faster and reads are 12x faster)
• Migration to MongoDB happened in 6 months
• Platform evolves faster with agile dev enabled by dynamic schema
Case Study
Improving drug discovery tests

Problem
• Testing instruments have gotten better at capturing complex and varied genetic data while legacy database technologies struggle to store and use it all
• Team needed to change the schema every time a new lab instrument was introduced, holding up research by 3-6 months

Why MongoDB
• MongoDB captures the variety of data generated by genetic tests and integrates it with the existing environment
• Flexible schema makes the database easy to integrate into existing infrastructure
• MongoDB redesign means that adding new genetic test instruments has no impact on database schema

Results
• New tests can be run in weeks versus 3-6 months using just Oracle
• Reduced time to introduce a new drug, a big difference to patients
Case Study
Keeping costs low while serving over 6 billion images to millions of customers

Problem
• Brittle code base slowing down engineering to a painful crawl, meaning that the business would struggle to provide the highest quality customer experiences
• Subpar performance on expensive database platform (Oracle licensing and hardware costs)

Why MongoDB
• JSON-based data structure allows team to quickly develop use cases and deploy new applications which were difficult and costly to implement on legacy systems
• Automatic failover satisfies high uptime requirements
• Agile, high performance, scalable
• Simplified architecture allowed company to improve performance while keeping costs low

Results
• 80% cost reduction with commodity hardware
• 900% performance improvement
• Faster time-to-market; dev. cycles in weeks vs. tens of months
Case Study
Airline improves customer experience with optimized seat re-assigning system

Problem
• United Airlines experiences significant issues with reissuing boarding passes based on customer preferences, a process called Seat Re-accommodation.
• “When an Irregular Operation occurs (weather – mechanical) we have challenges to re-assign customer preferred seats causing reduced customer satisfaction.”

Why MongoDB
• MongoDB stores customer seat preferences information in a more efficient manner with more detailed information.
• MongoDB assists in re-assigning seats under tight time tables, improving customer experience

Results
• Improved customer experience and loyalty to United Airlines
• Reduced reassign process by up to 75%
• Reduces late departures by up to 10%