1.3 Module-1


SRI KRISHNA COLLEGE OF TECHNOLOGY

[An Autonomous Institution | Affiliated to Anna University and Approved by AICTE | Accredited by NAAC with ‘A’ Grade]

KOVAIPUDUR, COIMBATORE – 641 042.

21ITE06/BIG DATA ANALYTICS


III YEAR /CSE/VI SEMESTER
MODULE-1
Module 1- Introduction

Session Topic
1.1 Types of Digital Data – Characteristics of Data – Evolution of Big Data – Definition of Big Data – Challenges with Big Data.
1.2 3Vs of Big Data – Non-Definitional traits of Big Data – BI vs. Big Data – Data warehouse and Hadoop environment – Coexistence.
1.3 Big Data Analytics: Classification of analytics – Data Science
1.4 Terminologies in Big Data – CAP Theorem – BASE Concept.
1.5 NoSQL: Types of Databases – Advantages – NewSQL – SQL vs. NoSQL vs. NewSQL.
1.6 Introduction to Hadoop: Features – Advantages – Versions – Overview of Hadoop Ecosystems
1.7 Hadoop distributions – Hadoop vs. SQL – RDBMS vs. Hadoop
1.8 Hadoop Components – Architecture – HDFS
1.9 MapReduce: Mapper – Reducer – Combiner – Partitioner – Searching – Sorting – Compression
1.10 Hadoop 2 (YARN): Architecture – Interacting with Hadoop Ecosystems.

MODULE 1 Introduction to Big Data


1.3 Introduction to Big Data:
Big Data Analytics: Classification of analytics – Data Science

Course Outcome:
Upon completion of the session, students shall be able to:

CO2 Distinguish big data analysis and analytics in optimizing business decisions. [AP]
Big Data Analytics
Big Data Analytics is the process of examining big data
to uncover patterns, unearth trends, and find unknown
correlations and other useful information to make
faster and better decisions.

A few of the top analytics tools are: MS Excel, SAS, IBM SPSS Modeler, R analytics, Statistica, World Programming System (WPS), and Weka.

The open-source analytics tools among these are R analytics and Weka.
Big Data Analytics
1. Technology-enabled analytics: analytical tools help to process and analyze big data.
2. About gaining meaningful, deeper, and richer insights into the business in order to steer it in the right direction, understand the customers’ demographics, better leverage the services of vendors and suppliers, etc.
3. About gaining a competitive edge over competitors through findings that allow quicker and better decision making.
4. A tight handshake between three communities: IT, business users, and data scientists.
5. Working with datasets whose volume and variety exceed the current storage and processing capabilities and infrastructure of the enterprise.
6. About moving code to data. This makes perfect sense as the program for distributed processing is tiny compared to the data.
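
Point 6 can be illustrated with a toy sketch. The "nodes" below are plain Python lists standing in for data partitions stored on separate machines (an assumption; no real cluster is involved), and `count_errors` and `run_on_each_node` are hypothetical names chosen for illustration. The small function is applied where each partition lives, and only the tiny per-partition results are brought together.

```python
# Toy illustration of "moving code to data".
# Assumption: each list stands in for a data partition stored on a separate node.
partitions = {
    "node-1": ["ok", "error", "ok", "ok"],
    "node-2": ["error", "error", "ok"],
    "node-3": ["ok", "ok"],
}

def count_errors(records):
    """Small piece of code that runs where the data lives."""
    return sum(1 for record in records if record == "error")

def run_on_each_node(func, data_by_node):
    """Apply func to every partition and collect only the small results."""
    return {node: func(records) for node, records in data_by_node.items()}

# Only one integer per node travels back, not the records themselves.
partial_counts = run_on_each_node(count_errors, partitions)
total_errors = sum(partial_counts.values())
print(partial_counts, total_errors)
```

In a real distributed platform the partitions would be far larger than the function, which is why shipping the function is cheaper than shipping the data.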

Why move from Big Data to Big Data Analytics?


Classification of analytics

There are basically two schools of thought:


1. Those that classify analytics into basic, operationalized, advanced, and monetized.
2. Those that classify analytics into Analytics 1.0, Analytics 2.0, and Analytics 3.0.


Classification of analytics
First school of thought:
1. Basic analytics: This is primarily slicing and dicing of data to help with basic business insights. It is about reporting on historical data, basic visualization, etc.

2. Operationalized analytics: Analytics is operationalized when it gets woven into the enterprise’s business processes.

3. Advanced analytics: This is largely about forecasting the future by way of predictive and prescriptive modelling.

4. Monetized analytics: This is analytics used to derive direct business revenue.
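
As a rough illustration of the "slicing and dicing" mentioned under basic analytics, the sketch below works on a handful of made-up sales records. The data, the field layout, and the function names are all hypothetical. A slice fixes one dimension, a dice restricts two or more dimensions, and a simple total serves as the historical report.

```python
# Hypothetical historical sales records: (region, product, units_sold)
sales = [
    ("North", "Laptop", 10),
    ("North", "Phone", 25),
    ("South", "Laptop", 7),
    ("South", "Phone", 30),
    ("North", "Laptop", 5),
]

def slice_by_region(records, region):
    """Slice: fix a single dimension (region) to one value."""
    return [r for r in records if r[0] == region]

def dice(records, regions, products):
    """Dice: restrict two dimensions (region and product) at once."""
    return [r for r in records if r[0] in regions and r[1] in products]

def total_units(records):
    """Basic report: total units sold in the selected records."""
    return sum(r[2] for r in records)

north = slice_by_region(sales, "North")
north_laptops = dice(sales, {"North"}, {"Laptop"})
print(total_units(north), total_units(north_laptops))
```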


Classification of analytics : Second school of thought


Challenges facing big data


Different Big Data Analytics Approaches
Reactive – Business Intelligence:
It is about analysis of past or historical data and then displaying the findings of the analysis or reports in the form of enterprise dashboards, alerts, notifications, etc.
Reactive – Big Data Analytics:
Here the analysis is done on huge datasets, but the approach is still reactive as it is still based on static data.
Proactive – Analytics:
This supports futuristic decision making through the use of data mining, predictive modeling, text mining, and statistical analysis. This analysis is not on big data, as it still uses traditional database management practices.
Proactive – Big Data Analytics:
This is sieving through terabytes of information to filter out the relevant data to analyze. It also includes high-performance analytics to gain rapid insights from big data and the ability to solve complex problems using more data.
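
The difference between a reactive and a proactive approach can be sketched on a tiny, made-up series of monthly sales. The figures and function names below are hypothetical, and a least-squares trend line is used only as a simple stand-in for predictive modeling; it is not a method prescribed by these slides. The reactive function reports on what has already happened, while the proactive one projects the next value.

```python
# Hypothetical monthly sales figures (historical, static data).
monthly_sales = [100, 110, 120, 130, 140, 150]

def reactive_report(history):
    """Reactive: summarize what has already happened."""
    return {"total": sum(history), "average": sum(history) / len(history)}

def fit_linear_trend(history):
    """Least-squares line through (0, y0), (1, y1), ...; returns (slope, intercept)."""
    n = len(history)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def proactive_forecast(history, steps_ahead=1):
    """Proactive: project the fitted trend beyond the last observed month."""
    slope, intercept = fit_linear_trend(history)
    next_index = len(history) - 1 + steps_ahead
    return intercept + slope * next_index

report = reactive_report(monthly_sales)
forecast = proactive_forecast(monthly_sales)
print(report, forecast)
```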


What kinds of technologies are we looking toward to help meet the challenges posed by Big Data?

• The first requirement is cheap and abundant storage.
• We need faster processors to help with quick processing of big data.
• Affordable open-source, distributed big data platforms, such as Hadoop.
• Parallel processing, clustering, virtualization, large grid environments, high connectivity, and high throughput rather than low latency.
• Cloud computing and other flexible resource allocation arrangements.


Data Science
• Study of data to extract meaningful insights for
business.
• Multidisciplinary approach that combines
principles and practices from the fields of
mathematics, statistics, artificial intelligence, and
computer engineering to analyze large amounts of
data.
• This analysis helps data scientists to ask and
answer questions like what happened, why it
happened, what will happen, and what can be
done with the results.


Data Science-Business acumen skills

• A person’s ability to understand various business environments and successfully navigate them.
• Strong business acumen skills enable people to
comprehend business processes, assess company
problems, and offer valuable insight on how to
achieve objectives and, in that way, ensure business
success.
• In times of change, they may also adapt and remain
flexible.


The traits to be honed to play the role of a data scientist are:
• Understanding of domain
• Business strategy
• Problem solving
• Communication
• Presentation
• Inquisitiveness


Data Science-Technology expertise

The skills needed to be a technology expert are:


Data Science-Mathematical expertise

The core job of a data scientist is to comprehend, analyse, and dabble with learning algorithms.
The skills needed are:


Responsibility of Data Scientist


Test your Knowledge
Next Session…

1.4 Terminologies in Big Data – CAP Theorem – BASE Concept.
