
MODULE-3 Notes

The document discusses big data concepts including volume, velocity, and variety. It defines big data and provides examples of the large amounts of data generated from sensors. The challenges of big data are storage and analytics while the benefits include improving operations and customer satisfaction.

Uploaded by abellorodelcute

MODULE 3: EVERYTHING GENERATES DATA

3.1.1 What is Big Data?

• Data sources: People, pictures, text, sensors, websites, devices (e.g., cell phones, computers).
• Characteristics of Big Data:
   ◦ Volume: A large amount of data requiring increased storage space.
   ◦ Velocity: Data is generated rapidly and at an increasing speed.
   ◦ Variety: Data is generated in various formats.
• Examples of data volume generated by sensors:
   ◦ Smart connected home: 1 GB of data per week.
   ◦ Autonomous car: 500 GB of data per day.
   ◦ Mining operations: 2.4 TB of data per minute.
   ◦ Airbus A380 engine: 1 PB of data per flight.
• Challenges and benefits of Big Data:
   ◦ Challenges: Storage, analytics.
   ◦ Benefits: Fine-tuning operations, improving customer satisfaction.

3.1.2 Check Your Understanding - Identify Scenarios that Generate Big Data

An orange grove company has sensors in the trees and on the machines that harvest the oranges. A camera mounted on the harvester takes a close-up picture of the orange every 5 minutes. Live data is sent to the distributor, who receives this data from 100 companies. Does the distributor have big data?

• Yes
• No

An independent t-shirt vendor advertises through Facebook and other social media sites. The vendor receives statistics on the customer demographics. Does the vendor have big data?

• Yes
• No

Smart parking meters, real-time street traffic video, and crime statistics feed data to an app that shows users recommended and available parking spaces as well as information regarding cost and a safety/security rating. Does the app use big data?

• Yes
• No

3.1.3 Check Your Understanding - Big Data: Volume, Velocity, or Variety Characteristics

Which of the following refers to different types of data?

• Volume
• Velocity
• Variety

Which of the following refers to data coming in quickly?

• Volume
• Velocity
• Variety

Which of the following refers to a lot of data?

• Velocity
• Volume
• Variety

3.1.4 Large Datasets

• Accessibility to data sets:
   ◦ Companies can make use of freely available data sets.
   ◦ Not every organization needs to collect its own data.

3.1.6 Check Your Understanding - Large Data Sets

A large European city has sensors installed in ambulances and police cars to verify that the emergency response services are adequate in all areas of the city. This example would generate big data.

• True
• False

A corporation generates data from its website, from sensors, and from internal documents. What characteristic of big data does this describe?

• Volume
• Velocity
• Redundancy
• Variety
• Viscosity

Small companies cannot analyze big data since they do not generate enough themselves.

• False
• True

3.2.1 What Are the Challenges of Big Data?

• Growth of data:
   ◦ Predicted daily data generation: 463 EB globally.
   ◦ Rapid data growth poses challenges and opportunities.
• Storage challenges:
   ◦ Traditional technologies struggle with storage needs.
   ◦ Security concerns with cloud storage.
   ◦ Importance of secure, fault-tolerant, replicated Big Data storage solutions.

3.2.2 Where Can We Store Big Data?

• Big data storage:
   ◦ Typically on multiple servers in data centers.
• Edge Computing:
   ◦ Utilizes end-user devices for pre-processing and storage closer to the data source.
   ◦ Reduces bandwidth usage and speeds up communications.

3.2.3 The Cloud and Cloud Computing

• Cloud services:
   ◦ Provided by companies like Google, Microsoft, Apple.
• Advantages for individuals and enterprises:
   ◦ Data storage and access.
   ◦ Access to applications.
   ◦ Cost reduction and efficiency.
• Security concerns with cloud data.

3.2.4 Distributed Processing

• Transition from vertical to horizontal scaling.
• Distributed file systems and Hadoop:
   ◦ Hadoop Distributed File System (HDFS) and MapReduce for distributed data processing.
• Scalability and fault tolerance of Hadoop.
• Importance of distributed processing for managing large volumes of data.

3.2.5 Check Your Understanding - Define Big Data Terms

Which of the following refers to generating high volumes of data?

• Edge
• Sensors
• Velocity
• Hadoop
• Distributed database
• Cloud services

Which characteristic is used to describe big data?

• Edge
• Sensors
• Velocity
• Hadoop
• Distributed database
• Cloud services

Which of the following refers to servers located close to the company network edge for pre-processing data?

• Edge
• Sensors
• Velocity
• Hadoop
• Distributed database
• Cloud services

Which of the following provides storage and access to applications via a web browser?

• Edge
• Sensors
• Velocity
• Hadoop
• Distributed database
• Cloud services

Which tool was created to distribute and process big datasets in smaller quantities?

• Edge
• Sensors
• Velocity
• Hadoop
• Distributed database
• Cloud services

Which of the following refers to a large volume of data broken into smaller pieces and stored on different servers?

• Edge
• Sensors
• Velocity
• Hadoop
• Distributed database
• Cloud services

3.3.1 Why Do Businesses Analyze Data?

• Importance of efficiency and innovation for businesses.
• Value of IoT and data analytics in gaining insights.
• Data as the new oil: Value from processed data.
• Types of processed data: Transactional and analytical.

3.3.2 Sources of Information

• Varied sources of large datasets:
   ◦ Sensor data, social media, web pages, archives, metadata, medical forms, genomics research.
• Categorization of data into structured and unstructured.
• Tools for data collection from web sources.

3.3.3 Data Visualization

• Data mining and visualization:
   ◦ Process of turning raw data into meaningful information.
• Importance of presenting mined data effectively.
• Types of visualizations: Line, column, bar, pie, scatter charts.

3.3.6 Analyzing Big Data for Effective Use in Business

• Importance of analyzing Big Data for value extraction.
• Data analysis process: Inspection, cleaning, transformation, modeling.
• Determining business problem or goal before analysis.
• Range of tools for data analysis: From spreadsheets to dedicated Big Data analytics applications.

3.3.8 Check Your Understanding - Big Data Terms

What is data created using spreadsheets or pre-printed forms called?

• Line
• Transactional
• Pie
• Structured
• RapidMiner
• Unstructured
• JSON

Comma-separated values (CSV) and XML are plaintext file types. What other file type is plaintext?

• Line
• Transactional
• Pie
• Structured
• RapidMiner
• Unstructured
• JSON

What kind of information is captured and processed as events happen?

• Line
• Transactional
• Pie
• Structured
• RapidMiner
• Unstructured
• JSON

Freeform data such as audio or video is what type of data?

• Line
• Transactional
• Pie
• Structured
• RapidMiner
• Unstructured
• JSON

What type of chart represents data as a segment of a whole?

• Line
• Transactional
• Pie
• Structured
• RapidMiner
• Unstructured
• JSON

What type of chart is used when you display a continuous set of data over time?

• Line
• Transactional
• Pie
• Structured
• RapidMiner
• Unstructured
• JSON

What is an example of a Big Data analytics tool?

• Line
• Transactional
• Pie
• Structured
• RapidMiner
• Unstructured
• JSON
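
The quiz in 3.3.8 refers to CSV, XML, and JSON as plaintext file types for structured data. As a minimal sketch, the same record can be written in all three formats with Python's standard library. The sensor record and its field names are made up for illustration and do not come from the course material.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sensor reading; the field names are illustrative only.
record = {"sensor_id": "tree-17", "temperature_c": 21.5, "humidity_pct": 40}


def to_csv(rec):
    """Serialize one record as CSV text: a header row plus one data row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rec), lineterminator="\n")
    writer.writeheader()
    writer.writerow(rec)
    return buf.getvalue()


def to_json(rec):
    """Serialize the record as a JSON object."""
    return json.dumps(rec)


def to_xml(rec):
    """Serialize the record as XML, one child element per field."""
    root = ET.Element("reading")
    for key, value in rec.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


csv_text = to_csv(record)
json_text = to_json(record)
xml_text = to_xml(record)

# All three results are ordinary text that can be opened in any text editor.
for label, text in (("CSV", csv_text), ("JSON", json_text), ("XML", xml_text)):
    print(label, "->", text.strip())
```

Each output is human-readable text with a fixed set of named fields, which is why these formats suit structured data. Freeform audio or video has no such field layout.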

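
Section 3.2.4 names MapReduce as the processing model that Hadoop pairs with HDFS. The sketch below imitates the map, shuffle, and reduce steps in plain Python on a single machine. It is not Hadoop's API. The input chunks and function names are assumptions chosen for illustration. A real cluster would run the map step on many servers at once.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle_phase(mapped_chunks):
    """Shuffle: group the emitted values by key across all chunks."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine each key's values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


# Hypothetical input, split into chunks as if stored on different servers.
chunks = ["big data needs storage", "big data needs analytics", "data grows"]

# Each chunk is mapped independently, so this step could run in parallel.
mapped = [map_phase(chunk) for chunk in chunks]
word_counts = reduce_phase(shuffle_phase(mapped))
print(word_counts)
```

Because each chunk is mapped without reference to the others, capacity can be added by adding servers. This is the horizontal scaling mentioned in 3.2.4.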