Unit - 1 (Big Data)
1. Structured Data:
Structured data is created using a fixed schema and is maintained in a tabular
format. Each element in structured data is individually addressable, which makes
the data effective to analyze. It covers all data that can be stored in a SQL
database as tables of rows and columns. Because tables are such a simple way to
manage information, a large share of data is still produced and processed in
this form.
Examples –
Relational data, geo-location data, credit card numbers, addresses, etc.
Consider relational data as an example: suppose a university has to maintain a
record of its students, including each student's name, ID, address, and email.
A relational schema and table such as the following can store these records.
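Since the original schema figure is not reproduced here, the following is a
minimal sketch of such a table using Python's built-in sqlite3 module (the
table and column names are assumptions for illustration):

    import sqlite3

    # In-memory database illustrating a fixed, tabular schema.
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE student (
            student_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL,
            address    TEXT,
            email      TEXT
        )
    """)
    conn.execute(
        "INSERT INTO student (student_id, name, address, email) VALUES (?, ?, ?, ?)",
        (1, "A. Kumar", "Hostel Block B", "a.kumar@university.example"),
    )
    # Every row conforms to the same schema, so each element is addressable.
    for row in conn.execute("SELECT student_id, name, email FROM student"):
        print(row)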
2. Unstructured Data:
Unstructured data is data that does not follow a pre-defined schema or any
organized format. This kind of data does not fit neatly into tables, which
makes it hard to store and analyze with traditional relational tools.
Examples – text documents, PDFs, images, audio, and video.
3. Semi-Structured Data:
Semi-structured data is information that does not reside in a relational
database but has some organizational properties (such as tags or key-value
pairs) that make it easier to analyze. With some processing you can store it
in a relational database, though this is very hard for some kinds of
semi-structured data; the format is kept because it stores such information
more compactly than forcing it into rigid tables.
Examples – XML, JSON.
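As a small illustration, the same student record can be carried as XML or JSON,
where tags and keys supply the organizational properties without a fixed
relational schema (the element and field names are assumptions):

    import json
    import xml.etree.ElementTree as ET

    # The record as XML: tags give it structure, but no schema is enforced.
    xml_record = "<student><id>1</id><name>A. Kumar</name></student>"
    root = ET.fromstring(xml_record)
    print(root.find("name").text)  # A. Kumar

    # JSON is another common semi-structured format; fields may vary per record.
    json_record = '{"id": 1, "name": "A. Kumar", "phone": "+91-0000000000"}'
    print(json.loads(json_record)["phone"])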
Most big data architectures include some or all of the following components:
Data sources: All big data solutions start with one or more data sources. Examples
include:
o Application data stores, such as relational databases.
o Static files produced by applications, such as web server log files.
o Real-time data sources, such as IoT devices.
Data storage: Data for batch processing operations is typically stored in a
distributed file store that can hold high volumes of large files in various formats.
This kind of store is often called a data lake. Options for implementing this storage
include Azure Data Lake Store or blob containers in Azure Storage.
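As a hedged sketch of working with such a store, the azure-storage-blob Python
SDK can list the files landed in a blob container (the connection string and
container name below are placeholders):

    from azure.storage.blob import BlobServiceClient

    # Placeholder connection string and container name; substitute real values.
    service = BlobServiceClient.from_connection_string("<storage-connection-string>")
    container = service.get_container_client("datalake-raw")

    # Enumerate the raw files landed in the container for later batch processing.
    for blob in container.list_blobs():
        print(blob.name, blob.size)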
Batch processing: Because the data sets are so large, often a big data solution
must process data files using long-running batch jobs to filter, aggregate, and
otherwise prepare the data for analysis. Usually these jobs involve reading source
files, processing them, and writing the output to new files. Options include running
U-SQL jobs in Azure Data Lake Analytics, using Hive, Pig, or custom Map/Reduce
jobs in an HDInsight Hadoop cluster, or using Java, Scala, or Python programs in
an HDInsight Spark cluster.
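The following is a minimal PySpark sketch of such a batch job, assuming a Spark
environment is available (the input and output paths and column names are
illustrative):

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("batch-prep").getOrCreate()

    # Read raw source files, filter and aggregate, then write the prepared
    # result as new files for downstream analysis.
    logs = spark.read.json("/data/raw/weblogs/")
    daily_errors = (
        logs.filter(F.col("status") >= 500)
            .groupBy("date")
            .agg(F.count("*").alias("error_count"))
    )
    daily_errors.write.mode("overwrite").parquet("/data/prepared/daily_errors/")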
Real-time message ingestion: If the solution includes real-time sources, the
architecture must include a way to capture and store real-time messages for
stream processing. This might be a simple data store, where incoming messages
are dropped into a folder for processing. However, many solutions need a message
ingestion store to act as a buffer for messages, and to support scale-out
processing, reliable delivery, and other message queuing semantics. Options
include Azure Event Hubs, Azure IoT Hub, and Kafka.
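On the producing side, a sketch using the kafka-python package might buffer
device readings in a Kafka topic (the broker address, topic name, and message
fields are assumptions):

    import json
    from kafka import KafkaProducer

    # Broker address and topic name are placeholders for illustration.
    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )

    # Each IoT reading is buffered in the topic until stream processors consume it.
    producer.send("iot-readings", {"device_id": "sensor-42", "temp_c": 21.7})
    producer.flush()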
Stream processing: After capturing real-time messages, the solution must process
them by filtering, aggregating, and otherwise preparing the data for analysis. The
processed stream data is then written to an output sink. Azure Stream Analytics
provides a managed stream processing service based on perpetually running SQL
queries that operate on unbounded streams. You can also use open source Apache
streaming technologies like Spark Streaming in an HDInsight cluster.
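Below is a minimal Spark Structured Streaming sketch in Python, assuming a
Spark environment (the socket source, port, and one-minute window are purely
illustrative), that filters and aggregates an unbounded stream and writes it
to a sink:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("stream-prep").getOrCreate()

    # Read an unbounded stream (a local socket here, purely for illustration).
    lines = (
        spark.readStream.format("socket")
             .option("host", "localhost")
             .option("port", 9999)
             .load()
    )

    # Filter and aggregate: count non-empty lines per one-minute window.
    events = lines.withColumn("ts", F.current_timestamp())
    counts = (
        events.filter(F.length("value") > 0)
              .groupBy(F.window("ts", "1 minute"))
              .count()
    )

    # Write the processed stream to an output sink (the console, for the sketch).
    query = counts.writeStream.outputMode("complete").format("console").start()
    query.awaitTermination()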
Analytical data store: Many big data solutions prepare data for analysis and then
serve the processed data in a structured format that can be queried using
analytical tools. The analytical data store used to serve these queries can be a
Kimball-style relational data warehouse, as seen in most traditional business
intelligence (BI) solutions. Alternatively, the data could be presented through a
low-latency NoSQL technology such as HBase, or an interactive Hive database that
provides a metadata abstraction over data files in the distributed data store. Azure
Synapse Analytics provides a managed service for large-scale, cloud-based data
warehousing.
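The "metadata abstraction over data files" idea can be sketched with Spark SQL,
assuming a Spark build with Hive support: the table below is only metadata over
Parquet files that stay in the distributed store (the path and names reuse the
hypothetical batch output above):

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.appName("serving-layer")
        .enableHiveSupport()
        .getOrCreate()
    )

    # The table is only metadata; the Parquet files in the data lake stay put.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS daily_errors
        USING PARQUET
        LOCATION '/data/prepared/daily_errors/'
    """)

    # Analytical tools can now query the files through the table abstraction.
    spark.sql(
        "SELECT date, error_count FROM daily_errors ORDER BY error_count DESC"
    ).show()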
(3) Big Data Auditing and Protection: With big data and analytics, financial
reporting risks, fraud, and operational business risks can be identified,
detected, and examined more efficiently and effectively.
Big data privacy involves properly managing big data to minimize risk and
protect sensitive data. Because big data comprises large and complex data
sets, many traditional privacy processes cannot handle the required scale and
velocity.
Classification of analytics
1) Descriptive analytics
Descriptive analytics is a statistical method that is used to search and
summarize historical data in order to identify patterns or meaning.
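A tiny descriptive sketch with pandas (the revenue figures are made up),
summarizing historical data to surface patterns:

    import pandas as pd

    # Hypothetical historical sales data.
    sales = pd.DataFrame({
        "month": ["Jan", "Feb", "Mar", "Apr"],
        "revenue": [120, 135, 128, 160],
    })

    # Descriptive analytics: summarize what already happened.
    print(sales["revenue"].describe())  # count, mean, std, min, max, ...
    print(sales["revenue"].diff())      # month-over-month change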
2) Predictive analytics
Predictive Analytics is a statistical method that utilizes algorithms and
machine learning to identify trends in data and predict future
behaviors.
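A minimal predictive sketch with scikit-learn (the training data is
fabricated): fit a trend on past months and forecast the next one:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Hypothetical history: month index -> revenue.
    X = np.array([[1], [2], [3], [4]])
    y = np.array([120, 135, 128, 160])

    # Predictive analytics: learn the trend, then forecast future behavior.
    model = LinearRegression().fit(X, y)
    print(model.predict(np.array([[5]])))  # forecast for month 5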
3) Prescriptive analytics
Prescriptive analytics is a statistical method used to generate
recommendations and make decisions based on the computational
findings of algorithmic models.
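A toy prescriptive step (the revenue target is invented): turn a model's
forecast into a concrete recommendation:

    def recommend(forecast_revenue: float, target: float = 150.0) -> str:
        # Prescriptive analytics: turn a computational finding into an action.
        if forecast_revenue >= target:
            return "Maintain current marketing spend."
        shortfall = target - forecast_revenue
        return f"Increase marketing spend; forecast misses target by {shortfall:.1f}."

    print(recommend(142.3))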