
Big Data Platforms: Department of Computer Science and Engineering Rajasthan Technical University Kota, Rajasthan

This document provides an overview of several major big data platforms: Hadoop, Cloudera, Amazon Web Services, Hortonworks, MapR, IBM Open Platform, and Microsoft HDInsight. For each platform, a brief description is given of its core functionality and how it relates to Apache Hadoop distributed processing capabilities.

Uploaded by

Deepak Chaudhary

Big Data Platforms

Department of Computer Science and Engineering


Rajasthan Technical University
Kota, Rajasthan
List of Big Data Platforms
1. Hadoop
2. Cloudera
3. Amazon Web Services
4. Hortonworks
5. MapR
6. IBM Open Platform
7. Microsoft HDInsight
What is Hadoop?
Hadoop is an open-source, Java-based programming framework
and server software used to store and analyse data across
hundreds or even thousands of commodity servers in a
clustered environment.
Hadoop is designed to store and process large datasets
quickly and in a fault-tolerant way.
Hadoop uses HDFS (Hadoop Distributed File System) to store
data on clusters of commodity servers. HDFS keeps several
replicas of each data block, so if a server goes down it
re-replicates the affected blocks from the surviving copies
and a single hardware failure does not cause data loss.
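The replication idea above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not how HDFS is actually implemented: the node names, block names, and the functions `place_blocks` and `handle_failure` are hypothetical, the replication factor of 3 is assumed (it is the usual HDFS default), and replicas are placed at random, whereas real HDFS placement is rack-aware and coordinated by the NameNode.

```python
import random

REPLICATION = 3  # assumed replication factor (the usual HDFS default)


def place_blocks(blocks, nodes, replication=REPLICATION, seed=0):
    """Assign each block to `replication` distinct nodes (random toy placement)."""
    rng = random.Random(seed)
    return {b: set(rng.sample(nodes, replication)) for b in blocks}


def handle_failure(placement, failed, live_nodes, replication=REPLICATION, seed=1):
    """Drop the failed node and copy under-replicated blocks to other live nodes."""
    rng = random.Random(seed)
    repaired = {}
    for block, holders in placement.items():
        survivors = holders - {failed}
        # Nodes that are alive and do not already hold this block.
        candidates = [n for n in live_nodes if n not in survivors]
        missing = replication - len(survivors)
        survivors |= set(rng.sample(candidates, min(missing, len(candidates))))
        repaired[block] = survivors
    return repaired


# Hypothetical five-node cluster holding three blocks.
nodes = ["node1", "node2", "node3", "node4", "node5"]
placement = place_blocks(["blk_a", "blk_b", "blk_c"], nodes)

# Simulate the loss of node2 and restore full replication.
live = [n for n in nodes if n != "node2"]
after = handle_failure(placement, "node2", live)
for block, holders in sorted(after.items()):
    print(block, sorted(holders))
```

After the simulated failure, every block is again held by three live nodes, and the replicas that survived the failure stay where they were.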
Hadoop is an Apache Software Foundation project, and many
software packages run on top of the core Apache Hadoop
system.
The Hadoop ecosystem provides the tools and software
needed to handle and analyse Big Data.

What is Cloudera?
Cloudera is one of the first commercial Hadoop-based
Big Data analytics platforms offering Big Data
solutions.
Its product range includes Cloudera Analytic DB,
Cloudera Operational DB, Cloudera Data Science and
Engineering, and Cloudera Essentials.
All of these products are based on Apache Hadoop
and provide real-time processing and analytics of
massive data sets.
What is Amazon Web Services?
Amazon offers a Hadoop environment in the cloud
as part of its Amazon Web Services (AWS) package.
The AWS Hadoop solution runs on Amazon's Elastic
Compute Cloud (EC2) and Simple Storage Service (S3).
Enterprises can use AWS to run their Big Data
processing and analytics in the cloud environment.
What is Hortonworks?
Hortonworks is a big data company based in California.
The company develops and supports applications for
Apache Hadoop.
The Hortonworks Hadoop distribution is 100% open source and
is enterprise ready, with the following features:
Centralized management and configuration of clusters.
Built-in security and data governance.
Centralized security administration across the system.
What is MapR?
MapR is a Big Data platform that handles data through
a Unix-style (POSIX) file system rather than HDFS.
The solution integrates Hadoop, Spark, and
Apache Drill with a real-time data processing
feature.

What is IBM Open Platform?
IBM also offers a Big Data platform based on
Hadoop ecosystem software.
IBM Open Platform features:
Based on 100% open-source software.
Includes Apache Ambari, a tool for provisioning,
managing and monitoring Apache Hadoop clusters.

What is Microsoft HDInsight?
Microsoft HDInsight is Microsoft's commercial big
data platform, and it is also based on a Hadoop
distribution.
It is a Hadoop distribution offering that runs in
the Windows and Azure environments.
