Lesson 6 Big Data, Data & Business Analytics

The document discusses big data and data integrity in business analytics. It defines big data as exceptionally large datasets that traditional tools struggle to manage effectively. Big data provides extensive information for businesses through sources like online interactions, sensors, and social media. Ensuring data integrity involves maintaining both physical integrity of accurate data storage and logical integrity through consistency checks. The two key aspects of logical integrity are entity integrity which ensures unique identifiers, and referential integrity which maintains relationships between database tables. Maintaining data integrity is important for making informed business decisions from big data analysis.

Uploaded by

Andrei Cirera
Copyright
© Public Domain

Big Data, Data & Business Analytics
Lesson 6
In this lesson, we will delve deeper
into Big Data and its indispensable
role in Business Analytics.
First, let us recall, “What is Big Data?”
Big Data refers to exceptionally large
and complex datasets that traditional
data processing tools struggle to
manage effectively.
In the context of Business Analytics, Big
Data serves as a crucial resource,
providing extensive and diverse
information.
Its analysis empowers organizations to
extract valuable insights, make informed
decisions, and gain a competitive edge in
the ever-evolving business landscape.
Big Data is collected through various sources and interactions. For
instance, think about when you shop online. When you browse an e-
commerce website, every click, search, and product view is recorded.
Your purchase history, payment transactions, and even the time spent
on each page contribute to the data pool. Additionally, customer
feedback, reviews, and social media interactions further add to the
dataset. In essence, the amalgamation of these diverse sources forms the
basis of Big Data, offering a comprehensive view for analysis and insights
in areas such as customer preferences, market trends, and business
strategies.
But have you ever wondered how
businesses collect this big data?
Of course, they have techniques and
methods to collect this data from us.
Here are the various techniques and methods to help businesses collect data about their
customers.

1. Websites and Apps:
When you use a social media app, every post, like, and comment generates data. The platform collects this
information to understand user behavior and preferences.

2. Sensors and IoT (Internet of Things) Devices:
Smart thermostats in homes collect data on temperature preferences and usage patterns, helping energy companies
optimize their services.

3. Transaction and Financial Data:
Every time you make a purchase using a credit card, transaction details are recorded. This data is valuable for
businesses to analyze spending patterns and tailor marketing strategies.

4. Surveys and Feedback Forms:
Online surveys or feedback forms after a service or purchase provide businesses with direct insights into
customer satisfaction and preferences.

5. Social Media Interactions:
Twitter hashtags, likes on Instagram, and Facebook shares create a wealth of data. Companies analyze this social
media data to understand trends and consumer sentiment.
But aside from collecting data, businesses
must also ensure the integrity of the data
they are collecting. This is called

Data Integrity
Data Integrity refers to the accuracy, consistency, and
reliability of data. In simpler terms, it ensures that the data
you collect and use is correct, reliable, and remains
unchanged throughout its lifecycle.
Data integrity is the overall accuracy, completeness, and consistency of
data. The importance of data integrity in protecting yourself from data
loss or a data leak cannot be overstated: to keep your data safe
from outside forces with malicious intent, you must first ensure that
internal users are handling data correctly. By implementing the
appropriate data validation and error checking, you can help ensure that
sensitive data is not miscategorized or stored incorrectly, either of
which would expose you to potential risk.
For Example:
Consider an online banking system where the account balance is a
critical piece of data. If there is a lack of data integrity, meaning that the
accuracy and consistency of the balance are compromised, users might
experience serious consequences. For instance, if a financial transaction
is recorded inaccurately or if the balance fails to update properly, it
could lead to incorrect financial decisions, overdrawing accounts, or
even unauthorized access due to data leaks. Ensuring data integrity in
such a system is crucial for maintaining trust, preventing financial
errors, and safeguarding sensitive information.
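One common way to keep a balance consistent is to apply both sides of a transfer as a single all-or-nothing database transaction. The following is only a minimal sketch using Python's built-in sqlite3 module, not how any real bank is implemented. The `accounts` table, the helper names, and the rule that a balance may not go below zero are assumptions made for illustration.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " id INTEGER PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"  # balance in cents
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 10000), (2, 5000)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst`; either both updates apply or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            # Credit first, so a failed debit shows the credit being undone.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; nothing was saved.
        return False


def balances(conn):
    """Return {account_id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

In this sketch, a transfer that would overdraw the source account returns False and leaves both balances exactly as they were, so the stored data never reflects a half-finished transaction.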
Another Example:
In Google Ads, data integrity is crucial for businesses aiming to reach
specific audiences. Suppose there's a failure in maintaining data
integrity, leading to inaccuracies in user demographics or behavior
tracking. In such a case, businesses might end up displaying ads to the
wrong audience, resulting in inefficient use of advertising budgets and
reduced campaign effectiveness. By prioritizing data integrity, Google
ensures that businesses can rely on accurate user information,
maximizing the impact of their advertising efforts and enhancing the
overall user experience.
Types of data integrity
Maintaining data integrity requires an understanding of the
two types of data integrity: physical integrity and logical
integrity. Both are collections of processes and methods
that enforce data integrity in both hierarchical and relational
databases.
Types of data integrity
1. Physical integrity. Physical integrity is the protection of the wholeness and accuracy of data as
it is stored and retrieved. When natural disasters strike, power goes out, or hackers disrupt database
functions, physical integrity is compromised. Human error, storage erosion, and a host of other issues
can also make it impossible for data processing managers, system programmers, applications
programmers, and internal auditors to obtain accurate data.

Example. Imagine a company that stores its critical business data on servers in a physical data center. A severe
natural disaster, such as a flood or earthquake, hits the region, causing damage to the data center. The flooding or
structural damage compromises the physical integrity of the stored data. In this scenario, data processing
managers may face challenges in retrieving accurate and intact information due to the impact of the natural
disaster on the physical storage infrastructure. Implementing robust disaster recovery measures and backup
systems becomes crucial to maintaining physical integrity and ensuring data availability even in the face of
unforeseen events.
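Backups only help if damaged data can be recognized. One widely used technique is to record a checksum when a file is stored and compare it when the file is retrieved. The sketch below assumes SHA-256 from Python's standard hashlib module; the function names are hypothetical. A checksum detects corruption but does not prevent or repair it.

```python
import hashlib


def sha256_of(path):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected_digest):
    """True if the file still matches the digest recorded when it was stored."""
    return sha256_of(path) == expected_digest
```

A typical use would be to store the digest alongside each backup and call `is_intact` before trusting a restored copy.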
Types of data integrity
2. Logical Integrity. Logical integrity ensures that data remains consistent and unchanged when used
in various ways within a relational database. It acts as a safeguard against human errors and
unauthorized access, offering protection in a different manner compared to physical integrity. There are
four main types of logical integrity:

Entity Integrity: Ensures each record has a unique identifier, preventing duplicates or
missing entries.
Referential Integrity: Maintains relationships between tables, ensuring that foreign
key values match existing primary key values in related tables.
Domain Integrity: Validates data to fit predefined data types and rules, ensuring
each attribute contains appropriate and valid values.
User-defined Integrity: Allows users to set custom rules specific to their application or
business requirements.
Types of data integrity
2. Logical Integrity
Example. Imagine a university database where student information is stored relationally. Logical
integrity ensures that each student has a unique student ID (Entity Integrity). Now, let's say the
database links student grades to courses they have taken (Referential Integrity). If a student drops a
course, logical integrity ensures that the record of their grades in that course is appropriately handled
to maintain consistency. Additionally, logical integrity ensures that the dates of enrollment and
graduation are within valid ranges (Domain Integrity). This prevents illogical scenarios, such as a
student graduating before enrolling. User-defined integrity might include custom rules, like specifying
the maximum number of courses a student can take in a semester, adding an extra layer of tailored
logical checks. Overall, logical integrity in this university database guarantees consistency and
reliability in student records.
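The four kinds of logical integrity in this university example can be sketched as database constraints. This is an illustrative sketch using Python's built-in sqlite3 module, not a complete student records system. The table and column names are assumptions, and the limit of two courses per semester is a made-up value chosen to keep the demonstration short.

```python
import sqlite3

SCHEMA = """
CREATE TABLE students (
    student_id   INTEGER PRIMARY KEY,          -- entity integrity: unique ID
    name         TEXT NOT NULL,
    enrolled_on  TEXT NOT NULL,                -- ISO dates, e.g. 2023-08-01
    graduated_on TEXT,
    -- domain integrity: graduation cannot come before enrollment
    CHECK (graduated_on IS NULL OR graduated_on >= enrolled_on)
);
CREATE TABLE grades (
    -- referential integrity: the student must already exist
    student_id INTEGER NOT NULL REFERENCES students(student_id),
    course     TEXT NOT NULL,
    semester   TEXT NOT NULL,
    grade      REAL CHECK (grade BETWEEN 0 AND 100),
    PRIMARY KEY (student_id, course, semester)
);
-- user-defined integrity: hypothetical cap of 2 courses per semester
CREATE TRIGGER course_limit BEFORE INSERT ON grades
BEGIN
    SELECT RAISE(ABORT, 'course limit reached')
    WHERE (SELECT COUNT(*) FROM grades
           WHERE student_id = NEW.student_id
             AND semester = NEW.semester) >= 2;
END;
"""


def open_db():
    """Create an in-memory database with the schema above."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(SCHEMA)
    return conn


def try_insert(conn, sql):
    """Run one INSERT; return True if accepted, False if a rule rejected it."""
    try:
        with conn:
            conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False
```

With this schema, a second student with an existing ID, a graduation date earlier than the enrollment date, a grade for a student who does not exist, and a third course in the same semester are all rejected by the database itself rather than by application code.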
Any Question?
Quiz
1. What does data integrity refer to?
a. Speed of data processing
b. Accuracy, completeness, and consistency of data
c. Size of the dataset
d. Security of data storage
2. Which type of integrity focuses on maintaining
the accuracy of data at the storage level?

a. Logical Integrity
b. Entity Integrity
c. Physical Integrity
d. Referential Integrity
3. In a database, what does Entity Integrity
ensure?
a. Each record has a unique identifier
b. Relationships between tables are maintained
c. Data fits predefined data types and rules
d. Users can set custom rules
4. What does Referential Integrity maintain in a
database?
a. Unique identifiers for records
b. Relationships between tables
c. Valid data types and rules
d. Custom rules set by users
5. Which integrity type validates data to fit
predefined data types and rules?
a. Logical Integrity
b. Entity Integrity
c. Domain Integrity
d. Referential Integrity
6. What aspect of data does User-defined Integrity
allow users to control?
a. Storage level accuracy
b. Logical consistency
c. Custom rules and constraints
d. Relationships between tables
7. How does physical integrity get compromised?

a. Human error and hackers


b. Natural disasters, power outages, and hackers
c. Storage erosion and internal auditors
d. System programmers and applications programmers
8. What does logical integrity safeguard against in
a relational database?
a. Physical damage
b. Human error and unauthorized access
c. Data leaks
d. Natural disasters
9. What is the main goal of maintaining Entity
Integrity in a database?
a. Preventing duplicate or missing entries
b. Maintaining relationships between tables
c. Validating data types and rules
d. Custom rule setting by users
10. Which integrity type ensures that foreign key
values match existing primary key values in
related tables?
a. Logical Integrity
b. Entity Integrity
c. Referential Integrity
d. User-defined Integrity
11. What does Domain Integrity prevent in a
database?
a. Duplicate entries
b. Relationships between tables
c. Illogical scenarios in data
d. Human error
12. Which type of integrity allows users to set rules
specific to their application or business
requirements?
a. Domain Integrity
b. Logical Integrity
c. Entity Integrity
d. User-defined Integrity
13. In a file system, which integrity type ensures
that a document file saved on a computer's hard
drive is not corrupted?
a. Logical Integrity
b. Physical Integrity
c. Entity Integrity
d. Domain Integrity
14. How does logical integrity contribute to a
university database system?
a. Ensures physical security
b. Maintains relationships between tables
c. Validates data types and rules
d. Prevents data leaks
15. Which integrity type ensures that a date of
birth field only contains valid dates in a database?

a. Logical Integrity
b. Entity Integrity
c. Domain Integrity
d. Referential Integrity
16. What does User-defined Integrity add to the
data management process?
a. Validation of data types
b. Custom rules specific to user requirements
c. Physical security measures
d. Logical consistency
17. What is the primary role of Referential
Integrity in a relational database?
a. Preventing duplicate entries
b. Ensuring unique identifiers
c. Maintaining relationships between tables
d. Validating data types
18. How does Domain Integrity contribute to data
quality in a database?
a. Preventing duplicate entries
b. Validating data to predefined data types and rules
c. Ensuring unique identifiers
d. Maintaining relationships between tables
19. What is the main purpose of Entity Integrity in
a database system?
a. Maintaining relationships between tables
b. Preventing duplicate or missing entries
c. Validating data types and rules
d. Custom rule setting by users
20. How does logical integrity protect data in a
relational database?
a. Ensures physical security measures
b. Maintains consistency and relationships between tables
c. Validates data types and rules
d. Prevents natural disasters
Answers
1. What does data integrity refer to?
a. Speed of data processing
b. Accuracy, completeness, and consistency of data
c. Size of the dataset
d. Security of data storage
2. Which type of integrity focuses on maintaining
the accuracy of data at the storage level?

a. Logical Integrity
b. Entity Integrity
c. Physical Integrity
d. Referential Integrity
3. In a database, what does Entity Integrity
ensure?
a. Each record has a unique identifier
b. Relationships between tables are maintained
c. Data fits predefined data types and rules
d. Users can set custom rules
4. What does Referential Integrity maintain in a
database?
a. Unique identifiers for records
b. Relationships between tables
c. Valid data types and rules
d. Custom rules set by users
5. Which integrity type validates data to fit
predefined data types and rules?
a. Logical Integrity
b. Entity Integrity
c. Domain Integrity
d. Referential Integrity
6. What aspect of data does User-defined Integrity
allow users to control?
a. Storage level accuracy
b. Logical consistency
c. Custom rules and constraints
d. Relationships between tables
7. How does physical integrity get compromised?

a. Human error and hackers


b. Natural disasters, power outages, and hackers
c. Storage erosion and internal auditors
d. System programmers and applications programmers
8. What does logical integrity safeguard against in
a relational database?
a. Physical damage
b. Human error and unauthorized access
c. Data leaks
d. Natural disasters
9. What is the main goal of maintaining Entity
Integrity in a database?
a. Preventing duplicate or missing entries
b. Maintaining relationships between tables
c. Validating data types and rules
d. Custom rule setting by users
10. Which integrity type ensures that foreign key
values match existing primary key values in
related tables?
a. Logical Integrity
b. Entity Integrity
c. Referential Integrity
d. User-defined Integrity
11. What does Domain Integrity prevent in a
database?
a. Duplicate entries
b. Relationships between tables
c. Illogical scenarios in data
d. Human error
12. Which type of integrity allows users to set rules
specific to their application or business
requirements?
a. Domain Integrity
b. Logical Integrity
c. Entity Integrity
d. User-defined Integrity
13. In a file system, which integrity type ensures
that a document file saved on a computer's hard
drive is not corrupted?
a. Logical Integrity
b. Physical Integrity
c. Entity Integrity
d. Domain Integrity
14. How does logical integrity contribute to a
university database system?
a. Ensures physical security
b. Maintains relationships between tables
c. Validates data types and rules
d. Prevents data leaks
15. Which integrity type ensures that a date of
birth field only contains valid dates in a database?

a. Logical Integrity
b. Entity Integrity
c. Domain Integrity
d. Referential Integrity
16. What does User-defined Integrity add to the
data management process?
a. Validation of data types
b. Custom rules specific to user requirements
c. Physical security measures
d. Logical consistency
17. What is the primary role of Referential
Integrity in a relational database?
a. Preventing duplicate entries
b. Ensuring unique identifiers
c. Maintaining relationships between tables
d. Validating data types
18. How does Domain Integrity contribute to data
quality in a database?
a. Preventing duplicate entries
b. Validating data to predefined data types and rules
c. Ensuring unique identifiers
d. Maintaining relationships between tables
19. What is the main purpose of Entity Integrity in
a database system?
a. Maintaining relationships between tables
b. Preventing duplicate or missing entries
c. Validating data types and rules
d. Custom rule setting by users
20. How does logical integrity protect data in a
relational database?
a. Ensures physical security measures
b. Maintains consistency and relationships between tables
c. Validates data types and rules
d. Prevents natural disasters
Thanks!