Unit 1 BDA

unit 1 of big data analytics

Uploaded by

saisri.pentapati

Unit 1: Introduction to Big Data

(10-mark answers for each topic)

1. Big Data and Its Importance

• Definition: Big Data refers to datasets that are too large or complex to process using
traditional methods.

• Importance:

o Enables data-driven decision-making.

o Provides predictive insights in fields like healthcare, finance, and marketing.

o Drives innovation and operational efficiency.

• Key Applications:

o Healthcare: Personalized medicine and real-time monitoring.

o Retail: Enhanced customer personalization and inventory management.

2. Characteristics of Big Data (5 V's)

The key properties of Big Data are summarized as:

1. Volume: The size of data, measured in terabytes or petabytes.

2. Velocity: The speed at which data is generated and processed (e.g., social media).

3. Variety: Data in different formats like text, images, videos, etc.

4. Veracity: The accuracy and reliability of data, which may be noisy, incomplete, or inconsistent.

5. Value: Deriving meaningful insights to enhance business operations.

3. Big Data Analytics

• Definition: The process of analyzing large and varied datasets to uncover hidden
patterns, correlations, and actionable insights.

• Steps in Big Data Analytics:

1. Data Collection: Gathering structured, semi-structured, and unstructured data.

2. Storage and Processing: Using platforms like Hadoop (distributed storage via HDFS) and Spark (in-memory processing).

3. Analysis: Employing algorithms for descriptive, predictive, and prescriptive insights.

• Real-World Example:

o In e-commerce, analytics is used to recommend products based on browsing history.
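The e-commerce example above can be illustrated with a minimal sketch. It assumes a simple co-view heuristic (products viewed in the same session are related); the session data and the function names `build_co_views` and `recommend` are hypothetical, and real recommender systems use far richer models.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_co_views(sessions):
    """Count how often each ordered pair of products shares a session."""
    co_views = defaultdict(Counter)
    for viewed in sessions:
        for a, b in permutations(set(viewed), 2):
            co_views[a][b] += 1
    return co_views


def recommend(history, co_views, top_n=2):
    """Score unseen products by their co-views with the user's history."""
    scores = Counter()
    for product in history:
        for other, count in co_views[product].items():
            if other not in history:
                scores[other] += count
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [product for product, _ in ranked[:top_n]]


# Hypothetical browsing sessions, invented for illustration.
sessions = [
    ["laptop", "mouse", "laptop bag"],
    ["laptop", "mouse"],
    ["phone", "phone case"],
    ["laptop", "laptop bag", "phone"],
]
co_views = build_co_views(sessions)
print(recommend(["laptop"], co_views))
```

A user who has only browsed "laptop" is shown the products most often viewed alongside it in other sessions.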

4. Basic Requirements for Big Data Analytics

1. Hardware Requirements: High-performance servers and storage systems.

2. Frameworks: Tools like Hadoop and Spark for data storage and processing.

3. Scalable Algorithms: Efficient algorithms for handling large datasets.

4. Expertise: Skilled professionals to manage data pipelines.
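As one illustration of a scalable algorithm, the sketch below computes the mean and variance in a single pass with constant memory (Welford's method), so the dataset never has to fit in memory at once. The class name `RunningStats` and the generator `read_values` are hypothetical; the generator stands in for a large data stream.

```python
class RunningStats:
    """Single-pass mean and variance (Welford's method) in O(1) memory."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Population variance of the values seen so far."""
        return self._m2 / self.count if self.count else 0.0


def read_values():
    """Hypothetical stand-in for a stream too large to load at once."""
    for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
        yield x


stats = RunningStats()
for value in read_values():
    stats.update(value)
print(stats.count, stats.mean, stats.variance)
```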

5. Big Data Applications

1. Healthcare: Disease outbreak prediction and real-time patient monitoring.

2. Finance: Fraud detection and algorithmic trading.

3. Retail: Targeted marketing and demand forecasting.

4. Transportation: Traffic prediction and route optimization.

6. MapReduce Framework

• Definition: A programming model for processing large-scale data in parallel.

• Phases:

1. Map Phase: Processes each input record and emits intermediate key-value pairs.

2. Shuffle and Sort: Groups all values that share the same key and sorts the keys.

3. Reduce Phase: Aggregates the values for each key to produce the final result.
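The three phases above can be sketched in plain Python. This is a single-process illustration only, not the Hadoop API: a real framework runs mappers and reducers in parallel across machines. The input records and the helper names (`map_phase`, `shuffle_and_sort`, `reduce_phase`) are hypothetical.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and collect the key-value pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle_and_sort(pairs):
    """Group values by key and return the groups sorted by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(groups, reducer):
    """Aggregate each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups}


# Hypothetical input: "city,temperature" readings.
readings = ["delhi,31", "mumbai,29", "delhi,35", "mumbai,33", "pune,27"]


def parse_reading(line):
    city, temp = line.split(",")
    return [(city, int(temp))]


def max_temp(city, temps):
    return max(temps)


pairs = map_phase(readings, parse_reading)
groups = shuffle_and_sort(pairs)
result = reduce_phase(groups, max_temp)
print(result)
```

Here the map phase emits (city, temperature) pairs, the shuffle groups temperatures by city, and the reduce phase keeps the maximum for each city.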

Diagram: Refer to the MapReduce Workflow.

7. Algorithms Using MapReduce

• Examples:

1. Word Count: Counts the frequency of each word in a dataset.


2. Sorting: Arranges data in a specific order.
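Both examples can be expressed as a mapper and a reducer. The sketch below uses a tiny single-process driver, `run_mapreduce`, which is a hypothetical helper and not part of any framework. For sorting, it assumes range partitioning: the mapper assigns each number to a bucket by value range, each reducer sorts its own bucket, and reading the buckets in key order yields a globally sorted list. The bucket width of 10 is an arbitrary choice for this data.

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Map every record, group values by key, then reduce in key order."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return [(key, reducer(key, groups[key])) for key in sorted(groups)]


# 1. Word Count: emit (word, 1) for each word, then sum the ones.
def wc_mapper(line):
    return [(word, 1) for word in line.lower().split()]


def wc_reducer(word, counts):
    return sum(counts)


lines = ["big data needs big tools", "data tools"]  # hypothetical input
word_counts = dict(run_mapreduce(lines, wc_mapper, wc_reducer))
print(word_counts)

# 2. Sorting: partition numbers by range, sort inside each partition.
BUCKET_WIDTH = 10  # assumed partition size, chosen for this small example


def sort_mapper(number):
    return [(number // BUCKET_WIDTH, number)]


def sort_reducer(bucket, numbers):
    return sorted(numbers)


numbers = [42, 7, 19, 3, 25, 11, 40]  # hypothetical input
buckets = run_mapreduce(numbers, sort_mapper, sort_reducer)
sorted_numbers = [n for _, bucket in buckets for n in bucket]
print(sorted_numbers)
```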

8. NoSQL Databases

• Definition: Non-relational databases designed for flexible schemas and horizontal scaling, which makes them well suited to Big Data.

• Types:

1. Key-Value Databases: Efficient for lookup operations (e.g., Redis).

2. Column-Family Databases: Store each row's data as columns grouped into column families (e.g., Cassandra).

3. Document Databases: Store data as JSON-like documents (e.g., MongoDB).

4. Graph Databases: Nodes and edges represent relationships (e.g., Neo4j).
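To make the four data models concrete, the sketch below represents the same kind of user data in each shape using plain Python structures. It does not use the Redis, Cassandra, MongoDB, or Neo4j APIs; all names and values are hypothetical and only show how the data is organized.

```python
# 1. Key-value: an opaque value looked up by a single key.
kv_store = {"user:1:name": "Asha", "user:2:name": "Ravi"}

# 2. Column-family: row key -> column family -> column -> value.
column_family = {
    "user:1": {
        "profile": {"name": "Asha", "city": "Pune"},
        "activity": {"last_login": "2024-01-05"},
    },
    "user:2": {
        "profile": {"name": "Ravi"},  # rows may have different columns
    },
}

# 3. Document: self-contained, nested JSON-like records.
documents = [
    {"_id": 1, "name": "Asha", "orders": [{"item": "laptop", "qty": 1}]},
    {"_id": 2, "name": "Ravi", "orders": []},
]

# 4. Graph: nodes plus labeled edges that represent relationships.
nodes = {1: {"name": "Asha"}, 2: {"name": "Ravi"}, 3: {"name": "Meena"}}
edges = [(1, "FOLLOWS", 2), (2, "FOLLOWS", 3)]


def followed_by(person_id):
    """Return the names of the people that person_id follows."""
    return [
        nodes[dst]["name"]
        for src, rel, dst in edges
        if src == person_id and rel == "FOLLOWS"
    ]


print(kv_store["user:1:name"])
print(column_family["user:1"]["profile"]["city"])
print(documents[0]["orders"][0]["item"])
print(followed_by(1))
```

Each lookup at the end matches what the corresponding model is good at: direct key access, reading one column of one row, navigating a nested document, and traversing a relationship.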

Diagram: Refer to the SQL vs NoSQL Comparison.
