Big Data Analytics Using Hadoop

This document provides an overview of big data analytics using Hadoop. It discusses what big data is, the 5 V's of big data, and how big data analytics involves collecting and analyzing large datasets to find patterns. It then reviews literature on using relational databases versus Hadoop for big data, describes the components and architecture of Hadoop, and gives examples of applications like healthcare and social media. Finally, it outlines advantages like scalability and disadvantages like security concerns for using Hadoop in big data analytics.

Uploaded by

bhargavi

A Seminar On

Big Data Analytics Using Hadoop
Outline
Introduction

Literature Survey

Working of Hadoop in Big Data Analytics

Advantages and Disadvantages of Hadoop

Application of Big Data Analytics Using Hadoop

Conclusion

References
Introduction
BIG DATA
What is Big Data?

“A massive volume of both structured and unstructured data that is so large that it's difficult to process with traditional database and software techniques.”
5 Vs of Big Data
Big Data Analytics
• Big data analytics is the process of collecting, organizing and analyzing large sets of data (called big data) to discover patterns and other useful information.


Literature Survey
Relational database management system

• In earlier days, data volumes were small and easily handled with RDBMS tools. More recently it has become difficult for them to handle huge volumes of data, which is referred to as “big data”.

Relational Databases Are Not Designed To Handle Change


Cost
No support for complex objects such as documents, video, images, etc.
Relational databases have limits on field lengths.
No support for unstructured data.
Older Versions of Hadoop
• 2006 - Yahoo! created Hadoop, based on Google's GFS and MapReduce (with Doug Cutting and team)
• 2007 - Yahoo! started using Hadoop on a 1000-node cluster
• Jan 2008 - Apache took over Hadoop
• Jul 2008 - A 4000-node cluster was tested successfully with Hadoop
• 2009 - Hadoop sorted a petabyte of data in less than 17 hours; it was used to handle billions of searches and to index millions of web pages
• Dec 2011 - Hadoop released version 1.0
• Aug 2013 - Version 2.0.6 available
• Nov 2014 - Release 2.6.0 available
• Dec 2015 - Release 2.6.3 available
• Oct 2016 - Release 2.6.5 available
Disadvantages of Older Versions of Hadoop
• Limited scalability
• Availability issues
• Problems with resource utilization
• Limitations in running non-MapReduce applications
Latest Version Of Hadoop
• 25 January 2017 - Release 3.0.0-alpha2 available
• This is the second alpha in a series of planned alphas and betas leading up to a 3.0.0 GA release. The intention is to "release early, release often" to quickly iterate on feedback collected from downstream users.
HADOOP
• Hadoop was introduced to overcome the disadvantages of RDBMS.
• Hadoop is an open source, Java-based programming framework that supports the processing and storage of extremely large data sets in a distributed computing environment.
Working Of Hadoop In Big Data Analytics
• Many technologies already exist for handling big data, and each of them has its own advantages and disadvantages. A few of them are listed below:
• Column-oriented databases
• NoSQL databases
• MapReduce
• Hive
• Pig
• WibiData
• PLATFORA
• Apache Zeppelin
• Hadoop
Architecture Of Hadoop
Components Of Hadoop

There are two main components of Hadoop:
• MapReduce
• HDFS
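To make the MapReduce component more concrete, here is a minimal single-machine sketch of the map, shuffle and reduce steps in plain Python, using word counting as the task. It does not use Hadoop's own API; the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample lines are assumptions chosen for illustration. On a real cluster the input would be read from HDFS and the map and reduce tasks would run in parallel on many nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group the emitted values by key before reducing."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for files stored in HDFS.
    sample = ["big data needs big storage", "hadoop stores big data"]
    print(word_count(sample))
```

Because each map call looks at only one line and each reduce call looks at only one key, the work can be split across machines, which is what lets this model scale to very large data sets.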
NoSQL
• A NoSQL (originally referring to "non SQL" or "non relational") database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases (RDBMS).
• NoSQL databases are commonly used as the backend database with Hadoop.
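The difference from tabular storage can be illustrated with a small sketch of a document-style store. The DocumentStore class, its put/get/find methods and the sample records are hypothetical and exist only for this illustration; they are not the API of any real NoSQL product. The point is that each record can carry its own fields, including nested ones, without a fixed table schema.

```python
class DocumentStore:
    """Hypothetical in-memory document store (illustration only)."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, document):
        """Store a document (a dict with any fields) under an id."""
        self._docs[doc_id] = document

    def get(self, doc_id):
        """Return the document for an id, or None if it is absent."""
        return self._docs.get(doc_id)

    def find(self, field, value):
        """Return all documents whose top-level field equals value."""
        return [doc for doc in self._docs.values() if doc.get(field) == value]


store = DocumentStore()
# Two records with different shapes: no shared column layout is required.
store.put("p1", {"name": "Asha", "vitals": {"pulse": 72}, "tags": ["cardiology"]})
store.put("p2", {"name": "Ravi", "ward": "ICU"})
```

A relational table would need a column for every field up front, whereas here the second record simply omits the fields it does not have.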
Applications of Hadoop

Health Care Applications

IOT

Social Media
Hadoop Technology In Monitoring Patient Vitals
Advantages of Hadoop

Scalable

Cost effective

Flexible

Fast

Resilient to failure
Disadvantages of Hadoop

Security Concerns

Not Fit for Small Data

Vulnerable By Nature
Conclusion
• Hadoop, an open source software framework, is a popular tool for handling big data and is widely used for big data analytics.
Thank You!!!
