Bda Iat-1 Answer Key

Uploaded by

yash.engineering


REG. NO.: 5113

(Approved by AICTE, affiliated to Anna University & Accredited by NBA)

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
INTERNAL ASSESSMENT TEST – I – ANSWER KEY

Sem & Branch :III / CSE’A

Subject :BIG DATA ANALYTICS

Sub. Code :CCS334

Part-A
Answer all the questions [6 x 2 = 12 Marks]
1. Differentiate Big Data processing and distributed processing. (K2)(CO1)
Big Data Processing: Refers to the techniques, tools, and frameworks used to handle and analyze
large datasets (often characterized by the "3 Vs": volume, velocity, and variety) that are too
complex for traditional data processing tools. Big data processing often requires specialized
technologies to capture, store, manage, and analyze these vast amounts of data efficiently.
Distributed Processing: Refers to the computing technique where multiple computers (or nodes)
work together to process data in parallel, splitting the workload among different machines. It is a
broader concept that can be used for any kind of data processing (including big data), but the key
idea is that the workload is divided across multiple systems.
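
As an illustration of the distributed processing idea described above, the following minimal Python sketch splits a workload into chunks, processes the chunks in parallel, and merges the partial results. It is only a sketch under stated assumptions: worker threads on a single machine stand in for separate nodes, and the function names `split_into_chunks`, `process_chunk`, and `distributed_sum` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, num_chunks):
    """Split a list into at most num_chunks contiguous, roughly equal parts."""
    size = max(1, -(-len(data) // num_chunks))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def process_chunk(chunk):
    """Work done by one 'node': here, simply sum the values in its chunk."""
    return sum(chunk)


def distributed_sum(data, num_workers=4):
    """Divide the workload, process the parts in parallel, merge the results."""
    chunks = split_into_chunks(data, num_workers)
    # Threads stand in for nodes (an assumption made to keep the sketch small).
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_sums = list(pool.map(process_chunk, chunks))
    return sum(partial_sums)


if __name__ == "__main__":
    print(distributed_sum(list(range(1, 101))))
```

The same divide, process, and combine pattern applies whether the data is small or very large; big data processing adds the storage and management tooling needed when the dataset no longer fits on one machine.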

2. List the difference between inter and trans firewall analytics. (K1)(CO1)

3. Explain NIST definition to define cloud computing. (K2)(CO1)

Cloud computing refers to the on-demand availability of computing resources over the Internet.
These resources include servers, storage, databases, software, analytics, networking, and
intelligence, all of which can be used according to the customer's requirements. Customers pay
only for what they use. The model is flexible, and resources can be scaled easily as
requirements change.

4. What are the characteristics of firewall? (K1)(CO1)

• Network size and complexity: Larger and more complex networks benefit more from
inter-firewall analytics for comprehensive monitoring.
• Security needs and threats: Trans-firewall analytics is crucial for networks handling
sensitive data and facing advanced threats.
• Budget and resources: Implementing trans-firewall analytics requires
additional investment in specialized hardware and software.
5. Why is Hadoop important? (K3)(CO1)
Salient features of Apache Hadoop:
• Free to use and offers an efficient storage solution for businesses.
• Offers quick access to data via HDFS (Hadoop Distributed File System).
• Highly flexible and can easily be used with MySQL and JSON.
• Highly scalable, as it can distribute a large amount of data in small segments.
• Runs on commodity hardware such as JBOD (just a bunch of disks).
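
The scalability point above, distributing a large amount of data in small segments, can be pictured with a short Python sketch. This is an illustration only and not Hadoop's actual API or block placement policy: the block size, the node names, and the round-robin assignment are all assumptions chosen to keep the example small.

```python
def split_into_blocks(data: bytes, block_size: int):
    """Cut the data into consecutive blocks of at most block_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def assign_blocks(blocks, nodes):
    """Map each node to the block indices it stores.

    Round-robin placement is an illustrative assumption, not HDFS's real policy.
    """
    placement = {node: [] for node in nodes}
    for index in range(len(blocks)):
        placement[nodes[index % len(nodes)]].append(index)
    return placement


if __name__ == "__main__":
    blocks = split_into_blocks(b"abcdefghij", block_size=4)
    print(blocks)
    print(assign_blocks(blocks, ["node1", "node2"]))
```

Because each node holds only some of the blocks, adding nodes increases both storage capacity and the number of blocks that can be read in parallel.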
6. What is machine generated data? (K1)(CO1)
Machine-generated data refers to information created without human intervention, typically by
devices, sensors, software, or machines. In the context of big data analytics, it plays a crucial role
as it provides large volumes of data in real-time or near real-time. This type of data is often
structured and used for predictive analytics, monitoring, and decision-making processes.
Here are some examples of machine-generated data:
1. Log files: Generated by web servers, application servers, or databases, capturing activities,
transactions, errors, and usage patterns.
2. Sensor data: Created by IoT (Internet of Things) devices, monitoring various environmental
parameters like temperature, humidity, motion, and pressure.
3. Telecommunication data: Call records, network traffic data, and performance metrics.
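
To make the log file example above concrete, here is a minimal Python sketch that turns raw log lines into structured records and counts them by severity. The log layout (`<timestamp> <LEVEL> <message>`), the sample lines, and the function names are hypothetical; real servers use their own formats.

```python
import re

# Hypothetical log layout: "<timestamp> <LEVEL> <message>"
LOG_PATTERN = re.compile(r"^(?P<timestamp>\S+) (?P<level>[A-Z]+) (?P<message>.*)$")


def parse_log_line(line):
    """Return a dict with timestamp, level, and message, or None if no match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def count_by_level(lines):
    """Count parsed log records per severity level, skipping malformed lines."""
    counts = {}
    for line in lines:
        record = parse_log_line(line)
        if record is not None:
            counts[record["level"]] = counts.get(record["level"], 0) + 1
    return counts


if __name__ == "__main__":
    sample = [
        "2024-01-01T10:00:00 INFO server started",
        "2024-01-01T10:00:05 ERROR disk full",
        "2024-01-01T10:00:09 INFO request served",
        "not a log line",
    ]
    print(count_by_level(sample))
```

Once the lines are structured in this way, they can feed the monitoring and predictive analytics uses mentioned above.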
Part-B
Answer any two of the following [2 x 16 = 32 Marks]

7. (i) Elaborate the significance of three Vs in the context of Big Data. (10)(K2)(CO1)
Big Data - CONCEPT
Big data refers to extremely large and diverse collections of structured, unstructured, and
semi-structured data that continue to grow exponentially over time. These datasets are so large and
complex in volume, velocity, and variety that traditional data management systems cannot store,
process, or analyze them.
The Vs of big data
Volume
Velocity
Variety
Veracity
Variability
Value

8. (i) Define unstructured data. Compare structured and unstructured data. (8)(K1)(CO1)
Types of Big Data

1. Structured data

2. Semi-structured data

(ii) Explain the concept of web analytics and list its importance in detail. (8)(K2)(CO1)

WEB ANALYTICS
Importance of Web Analytics
Web analytics is needed to assess the success rate of a website and its associated
business. Using web analytics, we can:
• Assess web content problems so that they can be rectified
• Get a clear perspective of website trends
• Monitor web traffic and user flow
• Demonstrate goal acquisition
• Identify potential keywords
• Identify segments for improvement

Key Performance Indicator (KPI)

A KPI depends on the business type and strategy, so KPIs vary from one business to another.
Micro and Macro Level Data Insights
Google Analytics provides detailed and accurate insight into the data. The data can be understood
at two levels: micro level and macro level.
Micro Level Analysis
It pertains to an individual or a small group of individuals. Examples include the number of times
a job application was submitted and the number of times a "print this page" link was clicked.
Macro Level Analysis

It is concerned with the primary business objectives across large groups of people, such as
communities or nations. An example is the number of conversions in a particular demographic.
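
The difference between the micro and macro levels described above can be sketched in a few lines of Python. The event records, the field layout `(user_id, region, event_name)`, and the use of `region` as the demographic attribute are hypothetical and chosen only for illustration; this is not how Google Analytics stores or exposes data.

```python
from collections import Counter

# Hypothetical event records: (user_id, region, event_name)
EVENTS = [
    ("u1", "north", "print_page_click"),
    ("u1", "north", "job_application_submitted"),
    ("u2", "south", "conversion"),
    ("u3", "north", "conversion"),
    ("u1", "north", "print_page_click"),
]


def micro_level(events, user_id):
    """Micro level: count each event type for a single user."""
    return Counter(name for uid, _, name in events if uid == user_id)


def macro_level(events, event_name="conversion"):
    """Macro level: count one event type per region across all users."""
    return Counter(region for _, region, name in events if name == event_name)


if __name__ == "__main__":
    print(micro_level(EVENTS, "u1"))
    print(macro_level(EVENTS))
```

The micro view answers questions about one visitor's behaviour, while the macro view aggregates over a whole segment to track a business objective.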

9. List the role and implications of crowdsourcing analytics in today’s data-driven landscape.
(16)(K2)(CO1)
CROWDSOURCING ANALYTICS

Crowdsourcing is a sourcing model in which an individual or an organization gets support
from a large, open-minded, and rapidly evolving group of people in the form of ideas, micro-tasks,
finances, etc. Crowdsourcing typically involves using the internet to attract a large group of
people to divide tasks or to achieve a target. The term was coined in 2005 by Jeff Howe and Mark
Robinson. Crowdsourcing can help different types of organizations obtain new ideas and solutions,
deeper consumer engagement, optimization of tasks, and several other benefits.
Where Can We Use Crowdsourcing?
Crowdsourcing touches almost all sectors, from education to health. It not only accelerates
innovation but also democratizes problem-solving. Some fields where crowdsourcing can be used:
1. Enterprise
2. IT
3. Marketing
4. Education
5. Finance
6. Science and Health

Examples of Crowdsourcing
1. Doritos: One of the companies that has long used crowdsourcing for advertising. It has used
consumer-created ads for one of its 30-second Super Bowl spots (the Super Bowl is the championship
game of American football).
2. Starbucks: Another large company that used crowdsourcing for idea generation. In its well-known
White Cup Contest, customers decorated a Starbucks cup with an original design, photographed it,
and submitted the photo on social media.
3. Lay's: The "Do Us a Flavor" contest used crowdsourcing to generate ideas. Customers were asked
to suggest the next chip flavor they wanted.
4. Airbnb: A well-known travel website that lets people rent out their houses or apartments by
listing them on the site. All the listings are crowdsourced.
Here is a list of some well-known crowdsourcing and crowdfunding sites:
1. Kickstarter
2. GoFundMe
3. Patreon
4. RocketHub

Part-C (Compulsory)
Answer the questions [1 x 16 = 16 Marks]

10. What is Open-Source technology? Explain the advantages, disadvantages and
applications of Open-Source. (16)(K2)(CO1)

OPEN SOURCE TECHNOLOGIES / BIG DATA ANALYTICS TOOLS

1. Apache Hadoop
Features of Apache Hadoop:
2. Cassandra
Features of Apache Cassandra:
3. Qubole
Features of Qubole:
4. Xplenty
Features of Xplenty:
5. Spark
Features of Apache Spark:
6. MongoDB
Features of MongoDB:
7. Apache Storm
Features of Storm:
8. SAS
Features of SAS:
9. Datapine
Features of Datapine:
10. RapidMiner
Features of RapidMiner:

Knowledge - K1, Comprehension - K2, Application - K3,
Analysis - K4, Synthesis - K5, Evaluation - K6
