Case Study
321506402090
D.V. SRIJA
4/6 CSE-2
UNIT 7: INSIDE CLOUD
INTRODUCTION TO CLOUD COMPUTING
INTRODUCTION TO MAPREDUCE
BIG DATA AND ITS IMPACT ON CLOUD COMPUTING
HADOOP: OVERVIEW OF BIG DATA
BUSINESS IMPACT OF CLOUD COMPUTING
1. Introduction to Cloud Computing:
What is Cloud Computing?
Cloud computing is the on-demand access to computing resources (physical or virtual servers, data storage, networking capabilities, application development tools, software, AI-powered analytics tools and more) over the internet with pay-per-use pricing.
The cloud computing model offers customers greater
flexibility and scalability compared to traditional on-premises
infrastructure.
Cloud computing plays a pivotal role in our everyday lives, whether we are accessing a cloud application like Gmail, streaming a movie on Netflix or playing a cloud-hosted video game.
Types of Cloud Computing:
Public:
A public cloud is a type of cloud computing in which a cloud service provider makes computing resources available to users over the public internet. These resources range from SaaS applications to individual virtual machines (VMs).
Private:
A private cloud is a cloud environment where all cloud
infrastructure and computing resources are dedicated to one
customer only. Private cloud combines many benefits of
cloud computing—including elasticity, scalability and ease of
service delivery—with the access control, security and
resource customization of on-premises infrastructure.
Hybrid:
A hybrid cloud is just what it sounds like: a combination
of public cloud, private cloud and on-premises environments.
Specifically (and ideally), a hybrid cloud connects a
combination of these three environments into a single,
flexible infrastructure for running the organization’s
applications and workloads.
Modules of Hadoop:
HDFS (Hadoop Distributed File System): Google published its GFS paper, and HDFS was developed on that basis. Files are broken into blocks and stored on nodes across the distributed architecture.
YARN (Yet Another Resource Negotiator): used for job scheduling and cluster management.
MapReduce: a framework that helps Java programs perform parallel computation on data using key-value pairs. The Map task takes input data and converts it into a data set that can be computed over as key-value pairs. The output of the Map task is consumed by the Reduce task, and the output of the reducer gives the desired result (a minimal word-count sketch follows this list).
Hadoop Common: these Java libraries are used to start Hadoop and are used by the other Hadoop modules.
Hadoop Architecture:
The Hadoop architecture is a package of the file system, the MapReduce engine and HDFS (Hadoop Distributed File System). The MapReduce engine can be either the classic MapReduce (MR1) or YARN (MR2).
A Hadoop cluster consists of a single master and multiple slave nodes. The master node runs the NameNode and JobTracker, whereas each slave node runs a DataNode and a TaskTracker.
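To show how the NameNode and DataNodes cooperate, the sketch below asks HDFS where each block of a file is stored, using Hadoop's org.apache.hadoop.fs.FileSystem API; the path /data/sample.txt is a hypothetical placeholder.

// Sketch: listing the block locations of an HDFS file.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockReport {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();  // reads core-site.xml / hdfs-site.xml
    FileSystem fs = FileSystem.get(conf);      // connects to the NameNode
    Path file = new Path("/data/sample.txt");  // hypothetical HDFS file

    FileStatus status = fs.getFileStatus(file);
    // Ask the NameNode where each block of the file is stored.
    BlockLocation[] blocks =
        fs.getFileBlockLocations(status, 0, status.getLen());
    for (BlockLocation block : blocks) {
      System.out.println("offset " + block.getOffset()
          + ", length " + block.getLength()
          + ", hosts " + String.join(",", block.getHosts()));
    }
    fs.close();
  }
}

Each line of output corresponds to one block of the file, together with the DataNodes that hold its replicas.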
Role of Hadoop in Cloud Computing:
Hadoop plays a significant role in cloud computing by
enhancing data storage, processing, and analysis capabilities.
Here are some key aspects:
Scalability: Cloud environments can quickly scale resources
up or down based on demand. Hadoop’s ability to add nodes
easily aligns well with cloud elasticity, allowing organizations
to handle large datasets efficiently.
Cost Efficiency: Using commodity hardware in cloud
environments reduces costs significantly. Organizations can
leverage Hadoop on cloud platforms without the need for
expensive infrastructure.
Data Storage and Management: Hadoop can store vast
amounts of structured and unstructured data in the cloud,
making it easier for organizations to manage and analyze
diverse data sources.
Integration with Other Cloud Services: Hadoop can integrate with various cloud services, including data lakes, analytics tools, and machine learning platforms, providing a comprehensive ecosystem for big data solutions (a configuration sketch follows this list).
Flexibility and Accessibility: Cloud-based Hadoop
deployments allow users to access data and analytics tools
from anywhere, facilitating collaboration and real-time data
processing.
Disaster Recovery and Backup: Cloud providers often
offer robust backup and disaster recovery options, ensuring
that Hadoop data is secure and recoverable.
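As one illustration of that integration, the sketch below points Hadoop's standard FileSystem API at Amazon S3 through the s3a connector from the hadoop-aws module; the bucket name and credential values are hypothetical placeholders, and in practice credentials come from core-site.xml or the environment rather than from code.

// Sketch: listing files in cloud object storage via the s3a connector.
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CloudStorageList {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Set inline only to keep the sketch self-contained; both values
    // are placeholders, not real credentials.
    conf.set("fs.s3a.access.key", "YOUR_ACCESS_KEY");
    conf.set("fs.s3a.secret.key", "YOUR_SECRET_KEY");

    // Same FileSystem API as HDFS, different scheme: data lives in a bucket.
    FileSystem fs = FileSystem.get(URI.create("s3a://example-bucket/"), conf);
    for (FileStatus status : fs.listStatus(new Path("s3a://example-bucket/input/"))) {
      System.out.println(status.getPath() + " (" + status.getLen() + " bytes)");
    }
    fs.close();
  }
}

Because the FileSystem abstraction is the same, MapReduce jobs like the word-count example above can read from and write to s3a:// paths without code changes.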