Cloudera Overview PDF
Sr. Sergio
Rodríguez de Guzmán
CTO
PUE
www.pue.es
Hadoop & Why Cloudera
Sergio Rodríguez | Systems Engineer | [email protected]
Industry-Leading Consulting and Training
Source: Fortune, “Fortune 500” and “Global 500,” May 2012.
Common BigData Early Problems in a Project
• Infrastructure investment
• Security and Compliance concerns
• Architecture sizing
• Wide and heterogeneous Hadoop ecosystem
• Support
• Ease of management
Best-In-Class Support
Cloudera Platform
Cloudera Enterprise
Making Hadoop Fast, Easy, and Secure
Hadoop delivers:
• One place for unlimited data
• Unified, multi-framework data access
Process (Batch, Stream) | Discover (SQL, Search) | Model (Analytics, ML) | Serve (NoSQL)
Security, Governance, Administration
From Hadoop to an Enterprise Data Hub
Attributes:
• Open Source
• Scalable
• Flexible
• Cost-Effective
• Managed
• Architecture
• Secure and Governed
[Figure: Cloudera’s Enterprise Data Hub, with ✔/✖ marks against the attributes above]
Components shown in the figure:
• Batch processing: MapReduce
• Analytic SQL: Impala
• Search engine: Solr
• Machine learning: Spark
• Stream processing: Spark Streaming
• 3rd party apps
• Workload management: YARN
• Filesystem: HDFS
• Online NoSQL: HBase
• Data management: Cloudera Navigator
• System management: Cloudera Manager
• Sentry
Unified, elastic, resilient, secure
The Only Complete Hadoop Management Suite
Deliver optimum system utilization and meet SLA commitments.
Cloudera Manager
Focus on the solution, not the cluster, with the only complete, zero-downtime administration tool for Apache Hadoop.
Unique Capabilities:
• Unified configuration, management, and monitoring across all services
• Online installation and upgrades
• Direct connection to Cloudera Support
• 3rd Party Extensibility
The Only Portable Cloud Experience for Hadoop
Maximize flexibility in Hadoop deployment architectures.
Cloudera Director
The first portable, self-service solution for deploying and managing enterprise-grade Hadoop in the Cloud.
Unique Capabilities:
• Dynamic cluster lifecycle management
• Cloud blueprints
• Multi-cluster health visibility
• Usage reporting for billing models
Why Cloudera is the Leader in Spark Support
• Integrated with other Cloudera components: Cloudera Manager, Sentry, Navigator, etc.
• Cloudera has more customers running Spark today than all our competitors combined. Installations range from a few nodes to 1,000-node installs.
• Cloudera has supported Spark since early 2014 and was the first Hadoop vendor to do so
• Between them, Cloudera and Intel have over 20 developers working on Spark and 4 committers
• The first and only Spark training class
Apache Kudu
Completes Hadoop's storage layer to enable fast analytics on fast data.
The Only Hadoop Data Governance Solution
Enable compliance and maximize analyst productivity.
Cloudera Navigator
Minimize risk and maintain compliance with the only native end-to-end data governance solution for Apache Hadoop.
Unique Capabilities:
• Auditing
• Lineage
• Metadata Tagging and Discovery
• Lifecycle Management
Adaptive Data Model Management
Improve DBA productivity through continuous optimization.
Navigator Optimizer
Instantly understand data warehouse and Hadoop cluster usage, and drive optimizations to reduce cost and improve performance.
Unique Capabilities:
• Schema and workload profiling
• Data model discovery
• Optimization guidance
• Optimization automation (future)
The Only Comprehensively Secure Hadoop Platform
Meet compliance requirements and reduce risk exposure from storing sensitive data.
MasterCard
Cloudera: The first PCI-Certified Hadoop Platform
Challenge: All applications, databases, or file systems that have the potential to handle personal account-related data must undergo full PCI certification.
Solution: MasterCard’s Cloudera environment fully conforms to the PCI-DSS V 2.0 security standards so it can host PCI datasets and potentially integrate with other internal systems.
“Data privacy and protection is a top priority for MasterCard. As we maximize the most advanced technologies from partners and vendors, they must meet the rigorous security standards we’ve set. With Cloudera’s commitment to the same standards, we now have additional options in how we manage our data center.”
Gary VonderHaar, Chief Technology Officer, Architecture, MasterCard
Security and Governance
Cloudera: Unified, Compliance-Ready, Transparent
Competitors: Fragmented, Incomplete, Complex
Perimeter (protecting access to the cluster)
• Cloudera ●: Kerberos with Cloudera Manager. Automated, industry-standard authentication integrated with existing systems
• Competitors ◐: Kerberos. Manual configuration and integration
Access (securing access to data)
• Cloudera ●: Apache Sentry. Working within the community to deliver centralized, granular RBAC across frameworks
• Competitors ◐: Hive ATZ-NG, Ranger. RBAC configuration silos, GUI “Band-Aid”
Why Cloudera?
Your trusted partner for getting results with enterprise Hadoop.