Hadoop Big Data Solutions

Uploaded by

chandu102103
Copyright
© All Rights Reserved

HADOOP - BIG DATA SOLUTIONS

http://www.tutorialspoint.com/hadoop/hadoop_big_data_solutions.htm

Copyright tutorialspoint.com

Traditional Approach
In this approach, an enterprise has a single computer to store and process big data. The data is
stored in an RDBMS such as Oracle Database, MS SQL Server or DB2, and sophisticated software
is written to interact with the database, process the required data and present it to users
for analysis.

Limitation
This approach works well when the volume of data is small enough to be accommodated by a
standard database server, or stays within the limit of the processor that is handling it. When
it comes to huge amounts of data, however, pushing everything through a single traditional
database server becomes a bottleneck.

Google's Solution
Google solved this problem using an algorithm called MapReduce. This algorithm divides a task
into small parts, assigns those parts to many computers connected over a network, and
collects their results to form the final result dataset.

The accompanying diagram shows various pieces of commodity hardware, which could be single-CPU
machines or servers with higher capacity.
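
The divide, assign and collect flow described for MapReduce can be illustrated with a small word count. This is only a minimal single-machine sketch in Python, not Google's or Hadoop's implementation: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample chunks are assumptions made for illustration, and the "many computers" are simulated by a plain loop.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine the values for each key into a single result."""
    return {key: sum(values) for key, values in groups.items()}

def word_count(chunks):
    # On a real cluster each chunk would be mapped on a different machine;
    # here that distribution is simulated by looping over the chunks.
    emitted = []
    for chunk in chunks:
        emitted.extend(map_phase(chunk))
    return reduce_phase(shuffle(emitted))

# Hypothetical input, already divided into small parts.
chunks = ["big data big cluster", "data node", "big node"]
counts = word_count(chunks)
print(counts)
```

Because each call to map_phase depends only on its own chunk, the chunks can be handed to separate machines without any coordination until the results are collected.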

Hadoop
Doug Cutting, Mike Cafarella and team took the solution provided by Google and started an open
source project called Hadoop in 2005. Doug named it after his son's toy elephant. Apache Hadoop
is now a registered trademark of the Apache Software Foundation.
Hadoop runs applications using the MapReduce algorithm, where the data is processed in parallel
on different CPU nodes. In short, the Hadoop framework makes it possible to develop applications
that run on clusters of computers and perform complete statistical analysis of huge amounts of
data.
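
To make the idea of parallel statistical analysis concrete, the sketch below computes a mean and variance by summarizing each block of data separately and then merging the summaries. It is an illustration under stated assumptions, not Hadoop code: worker threads from Python's standard concurrent.futures module stand in for cluster nodes, and the names partial_stats, combine and cluster_stats, as well as the sample blocks, are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_stats(block):
    """Runs on one 'node': summarize a block as (count, sum, sum of squares)."""
    return len(block), sum(block), sum(x * x for x in block)

def combine(partials):
    """Merge the per-node summaries into a global mean and variance."""
    n = sum(p[0] for p in partials)
    total = sum(p[1] for p in partials)
    total_sq = sum(p[2] for p in partials)
    mean = total / n
    variance = total_sq / n - mean * mean  # population variance
    return mean, variance

def cluster_stats(blocks, workers=3):
    # Threads stand in for cluster nodes (an assumption of this sketch);
    # each worker only ever sees its own block of the data.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_stats, blocks))
    return combine(partials)

# Hypothetical data, already split into blocks as it would be across nodes.
blocks = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
mean, variance = cluster_stats(blocks)
print(mean, variance)
```

Only the small per-block summaries travel back to the combining step, which is why this style of computation scales to data that no single machine could hold.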
