
DSP CT 3

Uploaded by

rratishh57

Detailed Brief Answers for Unit 5 Questions:

1. How are data mining techniques used to preserve privacy in databases?


- Anonymization: Techniques like k-anonymity generalize or suppress identifying attributes so
that each individual is indistinguishable from at least \( k-1 \) others in the released data.
- Perturbation: Adding noise to the data, such as additive or multiplicative perturbation, to
mask original values while preserving statistical properties.
- Encryption: Securing data using cryptographic methods to protect it during storage and
transmission.
- Secure Multi-party Computation (SMC): Allows multiple parties to jointly compute a
function over their inputs while keeping those inputs private.
- Differential Privacy: Adding random noise to query results to ensure that the inclusion or
exclusion of a single database item does not significantly affect the outcome.
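
As a rough illustration of the differential privacy bullet, the sketch below answers a counting query with Laplace noise. The function names, the `ages` list, and the choice of `epsilon` are assumptions made for this example; a production system would also have to track the privacy budget across queries.

```python
import random

def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(records, predicate, epsilon, rng):
    """Answer a counting query with Laplace noise.

    Adding or removing one record changes a count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Hypothetical data: ages of eight people.
ages = [23, 35, 41, 29, 52, 38, 61, 47]
rng = random.Random(0)
answer = noisy_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
print(f"noisy count of people aged 40+: {answer:.2f}")
```

A smaller `epsilon` means more noise and stronger privacy; averaged over many runs the noisy answers centre on the true count.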

2. What is Privacy Quantification?


- k-Anonymity: A dataset is k-anonymous if each record is indistinguishable from at least
\( k-1 \) other records with respect to its quasi-identifiers (attributes such as age or ZIP code
that can identify a person when combined).
- l-Diversity: Extends k-anonymity by ensuring that sensitive attributes within a k-anonymous
group have at least \( l \) "well-represented" values.
- t-Closeness: Ensures that the distribution of a sensitive attribute in any equivalence class is
close to the distribution of the attribute in the overall dataset, preventing attribute disclosure.
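
One way to quantify k-anonymity is to group records by their quasi-identifiers and report the smallest group size. The sketch below does that on a made-up table; the column names and values are assumptions for illustration only.

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Return k: the size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

# Hypothetical, already generalized records.
records = [
    {"age": "20-30", "zip": "130**", "diagnosis": "flu"},
    {"age": "20-30", "zip": "130**", "diagnosis": "asthma"},
    {"age": "31-40", "zip": "148**", "diagnosis": "flu"},
    {"age": "31-40", "zip": "148**", "diagnosis": "flu"},
    {"age": "31-40", "zip": "148**", "diagnosis": "diabetes"},
]
print(k_anonymity(records, ["age", "zip"]))  # 2
```

The table is 2-anonymous on `age` and `zip` because the smallest group holds two records.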

3. What are privacy-preserving algorithms available in data mining? Explain any one.
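- Randomized Response: Each respondent answers truthfully only with a known probability and
otherwise gives a randomized answer, so no single response reveals the truth while aggregate
statistics can still be estimated.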
- k-Anonymity: Generalizes or suppresses data so that individuals cannot be re-identified from
the dataset.
- Example: In a dataset, names and specific ages might be replaced with age ranges and
general professions to ensure privacy while still providing useful data for analysis.
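
The randomized response idea can be sketched as follows. This is one common variant, assumed here for illustration: each respondent tells the truth with probability `p_truth` and otherwise reports a fair coin flip. The population is simulated, not real data.

```python
import random

def randomized_answer(truth, p_truth, rng):
    """Report the true answer with probability p_truth, else a fair coin flip."""
    if rng.random() < p_truth:
        return truth
    return rng.random() < 0.5

def estimate_true_rate(answers, p_truth):
    """Invert observed_rate = p_truth * true_rate + (1 - p_truth) * 0.5."""
    observed = sum(answers) / len(answers)
    return (observed - (1 - p_truth) * 0.5) / p_truth

rng = random.Random(42)
truths = [i < 3000 for i in range(10000)]  # simulated population, 30% "yes"
answers = [randomized_answer(t, 0.75, rng) for t in truths]
print(round(estimate_true_rate(answers, 0.75), 3))
```

No individual answer can be trusted, yet the estimate over the whole population should land close to the true 30% rate.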

4. What is Randomization? Explain in brief.


- Additive Noise: Adding random values to the original data points to mask the actual values.
- Multiplicative Noise: Multiplying the original data points by random values to obscure the
data while preserving the relationship between data points.
- Purpose: To protect individual data privacy while allowing for aggregate data analysis.
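
A minimal sketch of both noise types, assuming Gaussian additive noise and a uniform multiplicative factor (the notes do not prescribe a particular distribution). The salary values are made up.

```python
import random
import statistics

def additive_perturb(values, sigma, rng):
    """Add zero-mean Gaussian noise to every value."""
    return [v + rng.gauss(0.0, sigma) for v in values]

def multiplicative_perturb(values, spread, rng):
    """Multiply every value by a factor drawn uniformly from [1 - spread, 1 + spread]."""
    return [v * rng.uniform(1.0 - spread, 1.0 + spread) for v in values]

rng = random.Random(7)
salaries = [30000 + 10 * i for i in range(5000)]  # hypothetical salaries
noisy_add = additive_perturb(salaries, sigma=2000.0, rng=rng)
noisy_mul = multiplicative_perturb(salaries, spread=0.1, rng=rng)

# Individual values change, but the means stay close to the original mean.
print(round(statistics.mean(salaries)))
print(round(statistics.mean(noisy_add)))
print(round(statistics.mean(noisy_mul)))
```

Because both noise sources average out (mean 0 for the additive term, mean 1 for the factor), aggregate statistics such as the mean survive while single values are masked.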

5. What is Group-Based Anonymization?


- k-Anonymity: Grouping data so that each individual's information is hidden within a group
of \( k \) people.
- Example: Grouping ages into ranges (e.g., 20-30, 31-40) to prevent precise age identification
while keeping the data useful for analysis.
- Goal: To make re-identification of individuals difficult or impossible by ensuring they are
part of a larger, indistinguishable group.
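
The age-range example can be sketched as below. The helper name and the sample ages are assumptions; the ranges here are 20-29 and 30-39 rather than 20-30 and 31-40, simply because fixed-width bins are easier to compute.

```python
from collections import Counter

def generalize_age(age, width=10):
    """Map an exact age to a range label such as '20-29'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

ages = [21, 24, 27, 29, 33, 35, 38, 36]  # made-up ages
groups = Counter(generalize_age(a) for a in ages)
print(dict(groups))          # {'20-29': 4, '30-39': 4}
print(min(groups.values()))  # 4
```

After generalization every person shares an age range with three others, so on this single attribute the data is 4-anonymous.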

6. What is Distributed Privacy Preserving?


- Secure Multi-party Computation (SMC): Parties jointly compute a function over their inputs
while keeping those inputs private.
- Federated Learning: Training machine learning models across multiple decentralized devices
or servers holding local data samples, without exchanging them.
- Homomorphic Encryption: Performing computations on encrypted data without decrypting it
first.
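
To make the SMC idea concrete, here is a single-process simulation of a secure sum based on additive secret sharing. It assumes semi-honest parties and inputs whose total stays below `MODULUS`; the names are illustrative, and a real deployment would exchange shares over a network.

```python
import secrets

MODULUS = 2**61 - 1  # all share arithmetic is done modulo this value

def make_shares(value, n_parties):
    """Split value into n_parties random shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]

def secure_sum(private_inputs):
    """Simulate the protocol: only shares and partial sums are ever 'sent'."""
    n = len(private_inputs)
    # Party i splits its input and sends the j-th share to party j.
    all_shares = [make_shares(v, n) for v in private_inputs]
    # Each party j adds the shares it received and publishes that partial sum.
    partial_sums = [sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)]
    return sum(partial_sums) % MODULUS

print(secure_sum([120, 75, 310]))  # 505
```

Each share on its own is a uniformly random number, so a party that sees only the shares addressed to it learns nothing about another party's input beyond what the final total reveals.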

7. What are the applications of privacy-preserving data mining?


- Healthcare: Analyzing patient data for medical research while maintaining patient
confidentiality.
- Finance: Detecting fraud without disclosing individual transaction details.
- Marketing: Personalizing marketing campaigns without compromising consumer privacy.
- Government: Conducting census data analysis while protecting citizens' privacy.

8. Give a general survey about the randomization method.


- Techniques: Includes additive and multiplicative perturbation.
- Goals: To mask individual data values while preserving the overall statistical properties of
the dataset for analysis.
- Strengths: Easy to implement and effective in protecting individual privacy.
- Weaknesses: May reduce data utility if not carefully applied.

9. Why are data mining techniques preferred for preserving privacy?


- Balancing Utility and Privacy: Allows extraction of meaningful patterns without
compromising individual privacy.
- Scalability: Can be applied to large datasets efficiently.
- Compliance: Helps meet regulatory requirements for data privacy and protection.
- Versatility: Applicable across various domains, including healthcare, finance, marketing, and
government.

10. Define Semi-Honest Adversaries.


- Definition: Adversaries who follow the protocol correctly but try to learn additional
information by analyzing the messages they receive.
- Behavior: They do not deviate from the protocol steps but use all available information to
infer private data.
- Assumption: Common in secure multi-party computation protocols where parties are
assumed to be honest-but-curious.
11. State Multiplicative Perturbation.
- Technique: Multiplying data points by random values to mask the original data.
- Purpose: To protect individual data values while preserving the overall data structure for
analysis.
- Example: Multiplying salaries by random factors to obscure exact amounts while allowing
statistical analysis of salary trends.

12. State Additive Perturbation.


- Technique: Adding random noise to data points to mask the original values.
- Purpose: To protect individual data values while allowing aggregate data analysis.
- Example: Adding random values to annual incomes to prevent exact figures from being
known, but still enabling overall income distribution analysis.

13. List out the Malicious Adversaries.


- Definition: Adversaries who actively try to disrupt the protocol and learn as much as
possible about other participants' data.
- Behavior: Deviate from the protocol steps, inject false data, or perform attacks to
compromise data privacy and integrity.
- Types: Internal (disgruntled employees) and external (hackers) adversaries.

14. Analyze the l-Diversity Method.


- Definition: An extension of k-anonymity ensuring that sensitive attributes have at least \( l \)
"well-represented" values within any anonymized group.
- Strengths: Addresses the limitations of k-anonymity by providing better protection against
attribute disclosure.
- Example: In a healthcare dataset, ensuring that within any group of records sharing the same
quasi-identifiers, there are at least \( l \) different diagnoses.
- Goal: To prevent attackers from inferring sensitive information based on homogeneity or
lack of diversity in the data.
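
The check below uses the simplest reading of "well-represented", namely distinct l-diversity: count the distinct sensitive values in each group and report the minimum. The table and column names are hypothetical.

```python
from collections import defaultdict

def distinct_l_diversity(records, quasi_identifiers, sensitive):
    """Return l: the smallest number of distinct sensitive values in any group."""
    groups = defaultdict(set)
    for r in records:
        key = tuple(r[q] for q in quasi_identifiers)
        groups[key].add(r[sensitive])
    return min(len(values) for values in groups.values())

# Hypothetical table that is 2-anonymous on (age, zip).
patients = [
    {"age": "20-30", "zip": "130**", "diagnosis": "flu"},
    {"age": "20-30", "zip": "130**", "diagnosis": "asthma"},
    {"age": "31-40", "zip": "148**", "diagnosis": "flu"},
    {"age": "31-40", "zip": "148**", "diagnosis": "flu"},
]
print(distinct_l_diversity(patients, ["age", "zip"], "diagnosis"))  # 1
```

The second group contains only "flu", so anyone known to be in that group has their diagnosis revealed even though the table is 2-anonymous. This is the homogeneity problem l-diversity is meant to catch.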

UNIT IV

1. What is the need of auditing database? In what ways can it be audited?
- Need for Auditing:
- Security: Detect unauthorized access and potential security breaches.
- Compliance: Meet regulatory and legal requirements.
- Accountability: Track user actions and database changes.
- Performance: Monitor and improve database performance.

- Ways to Audit:
- Manual Auditing: Regularly reviewing logs and reports.
- Automated Tools: Using database auditing tools like Oracle Audit Vault.
- Triggers: Implementing database triggers to log changes.
- Transaction Logs: Analyzing transaction logs for activity monitoring.
- Application Auditing: Embedding auditing features within applications.
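
Trigger-based auditing can be tried without a database server. The sketch below uses Python's built-in `sqlite3` module as a stand-in, so the trigger syntax is SQLite's, not Oracle's or SQL Server's; the table and column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary REAL);
CREATE TABLE audit_log (
    action     TEXT,
    employee   INTEGER,
    changed_at TEXT DEFAULT CURRENT_TIMESTAMP
);
-- One row-level trigger per DML event writes a line to the audit table.
CREATE TRIGGER employees_ins AFTER INSERT ON employees
BEGIN
    INSERT INTO audit_log (action, employee) VALUES ('INSERT', NEW.id);
END;
CREATE TRIGGER employees_upd AFTER UPDATE ON employees
BEGIN
    INSERT INTO audit_log (action, employee) VALUES ('UPDATE', NEW.id);
END;
CREATE TRIGGER employees_del AFTER DELETE ON employees
BEGIN
    INSERT INTO audit_log (action, employee) VALUES ('DELETE', OLD.id);
END;
""")

conn.execute("INSERT INTO employees (id, name, salary) VALUES (1, 'Ana', 5000)")
conn.execute("UPDATE employees SET salary = 5500 WHERE id = 1")
conn.execute("DELETE FROM employees WHERE id = 1")

actions = [row[0] for row in conn.execute("SELECT action FROM audit_log ORDER BY rowid")]
print(actions)  # ['INSERT', 'UPDATE', 'DELETE']
```

Reviewing `audit_log` afterwards corresponds to the manual auditing step: the triggers collect the trail, and a person or report reads it.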

2. What are triggers in Oracle?


- Triggers are stored PL/SQL program units that Oracle runs automatically when a specific event
occurs, such as a DML statement on a particular table or view. They are used for:
- Data Integrity: Enforcing business rules.
- Audit Logging: Tracking changes to data.
- Complex Validation: Performing complex checks that cannot be done with constraints.

3. Why should server activity be audited?


- Security: To detect and respond to unauthorized access and actions.
- Compliance: To adhere to industry standards and legal requirements.
- Troubleshooting: To diagnose and resolve issues by reviewing activity logs.
- Performance Monitoring: To identify and optimize slow queries and processes.
- Change Management: To track and review changes made to the system.

4. What is the difference between Oracle Server and SQL Server 2000 in auditing databases?
- Oracle Server:
- Provides comprehensive auditing options including fine-grained auditing, which allows
detailed control over audit records.
- Can be extended with Oracle Audit Vault and Database Firewall, a separate Oracle product,
for consolidated audit collection and monitoring.

- SQL Server 2000:


- Offers basic auditing capabilities through SQL Server Profiler and triggers.
- Auditing can be implemented using DML triggers and transaction logs but lacks some
advanced features available in Oracle.
5. How do you create a trigger using Oracle?
- Creating a Trigger:
```sql
CREATE OR REPLACE TRIGGER trigger_name
BEFORE INSERT OR UPDATE OR DELETE ON table_name
FOR EACH ROW
BEGIN
-- trigger logic here
END;
```
- Example: Logging changes to a table
```sql
-- Assumes audit_table(action, username, action_time) already exists.
-- USER is a reserved word in Oracle, so the columns are named username and
-- action_time instead of user and timestamp.
CREATE OR REPLACE TRIGGER audit_trigger
BEFORE INSERT OR UPDATE OR DELETE ON employees
FOR EACH ROW
BEGIN
  IF INSERTING THEN
    INSERT INTO audit_table (action, username, action_time)
    VALUES ('INSERT', USER, SYSDATE);
  ELSIF UPDATING THEN
    INSERT INTO audit_table (action, username, action_time)
    VALUES ('UPDATE', USER, SYSDATE);
  ELSIF DELETING THEN
    INSERT INTO audit_table (action, username, action_time)
    VALUES ('DELETE', USER, SYSDATE);
  END IF;
END;
/
```

6. What is the advantage of DDL trigger over other triggers in Oracle?


- Advantages:
- Extended Functionality: Can capture schema-level events like CREATE, ALTER, DROP.
- Security: Enhances security by logging and controlling DDL operations.
- Change Tracking: Provides detailed tracking of structural changes to the database.

7. Create a sample code in SQL Server 2000 to audit a server.


```sql
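-- SQL Server 2000 supports only DML triggers on a table or view; ON DATABASE
-- (DDL triggers) arrived in SQL Server 2005. Employees is a placeholder table.
CREATE TRIGGER AuditTrigger
ON Employees
FOR INSERT, UPDATE, DELETE
AS
BEGIN
    INSERT INTO AuditTable (EventType, EventTime, UserName, ObjectName)
    SELECT
        CASE
            -- Rows in both pseudo-tables mean an UPDATE.
            WHEN EXISTS(SELECT * FROM inserted) AND EXISTS(SELECT * FROM deleted)
                THEN 'UPDATE'
            WHEN EXISTS(SELECT * FROM inserted) THEN 'INSERT'
            ELSE 'DELETE'
        END,
        GETDATE(),
        USER_NAME(),
        OBJECT_NAME(parent_obj)  -- table the trigger is attached to
    FROM sysobjects
    WHERE id = @@PROCID;
END;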
```

8. Explain how auditing is done with a simple case study.

Case Study: Online Database Development


- Scenario: Developing an online retail database.
- Auditing Requirements:
- Track changes to product listings.
- Monitor user access and actions.
- Ensure data integrity and security.

- Implementation:
- Triggers: Set up triggers to log changes to product tables.
- Audit Tables: Create audit tables to store log entries.
- Automated Reports: Generate regular reports for review by database administrators.

```sql
CREATE TRIGGER ProductAudit
ON Products
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
IF EXISTS(SELECT * FROM inserted)
BEGIN
INSERT INTO ProductAuditLog (ActionType, ProductID, ChangeTime, UserID)
SELECT 'INSERT/UPDATE', ProductID, GETDATE(), USER_NAME()
FROM inserted;
END
ELSE
BEGIN
INSERT INTO ProductAuditLog (ActionType, ProductID, ChangeTime, UserID)
SELECT 'DELETE', ProductID, GETDATE(), USER_NAME()
FROM deleted;
END
END;
```

Case Study: Payroll System


- Scenario: Managing payroll for a company.
- Auditing Requirements:
- Track salary changes.
- Monitor access to payroll data.
- Ensure compliance with financial regulations.

- Implementation:
- Triggers: Set up triggers to log changes to payroll tables.
- Audit Tables: Create audit tables to store log entries.
- Access Controls: Implement strict access controls to limit who can view and modify payroll
data.

```sql
CREATE TRIGGER PayrollAudit
ON Payroll
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
IF EXISTS(SELECT * FROM inserted)
BEGIN
INSERT INTO PayrollAuditLog (ActionType, EmployeeID, ChangeTime, UserID)
SELECT 'INSERT/UPDATE', EmployeeID, GETDATE(), USER_NAME()
FROM inserted;
END
ELSE
BEGIN
INSERT INTO PayrollAuditLog (ActionType, EmployeeID, ChangeTime, UserID)
SELECT 'DELETE', EmployeeID, GETDATE(), USER_NAME()
FROM deleted;
END
END;
```

These answers provide detailed yet concise information on the various aspects of database
auditing, triggers, and the practical application of auditing through case studies.
Unit IV - Auditing Database Activities

1. Introduction
- This section provides an overview of the importance of auditing database activities to ensure
security, integrity, and compliance. It discusses the need for monitoring database actions to
detect unauthorized access, data breaches, and performance issues.

2. Using Oracle Database Activities


- Focuses on the tools and techniques available in Oracle for auditing database activities. This
includes using Oracle Audit Vault, Database Firewall, and other Oracle-specific features to
monitor and log database actions.

3. Creating DDL Triggers with Oracle


- Describes how to create Data Definition Language (DDL) triggers in Oracle to automatically
log schema-level changes such as CREATE, ALTER, and DROP commands. These triggers help
in maintaining an audit trail of structural changes to the database.

4. Auditing Database Activities with Oracle Auditing


- Explains the different auditing options available in Oracle, such as standard auditing,
fine-grained auditing, and unified auditing. It covers how to configure and manage these
auditing features to track database activities effectively.

5. Server Activity with SQL Server 2000


- Discusses how to audit server activities in SQL Server 2000. This includes using tools like
SQL Server Profiler, DML triggers, and transaction logs to monitor and log database operations
and user activities.

6. Security and Auditing Project Case Study


- Presents a case study on implementing a security and auditing project. This section includes
real-world examples of setting up an auditing system, configuring audit policies, and analyzing
audit logs to ensure database security and compliance.

Unit V - Privacy Preserving Data Mining Techniques

1. Introduction
- This section introduces the concept of privacy-preserving data mining (PPDM), which aims
to extract useful information from large datasets while protecting the privacy of individuals. It
discusses the importance and challenges of balancing data utility and privacy.

2. Privacy Preserving Data Mining Algorithms


- Describes various algorithms used in PPDM, such as k-anonymity, l-diversity, t-closeness,
differential privacy, and secure multi-party computation. Each algorithm is designed to protect
sensitive information while allowing for data analysis.

3. General Survey
- Provides an overview of the state-of-the-art techniques and methodologies in PPDM. It
surveys existing literature, tools, and frameworks used to implement privacy-preserving data
mining.

4. Randomization Methods
- Discusses randomization techniques that add controlled noise to data, such as additive and
multiplicative perturbation, to mask individual data points while preserving overall data patterns
for analysis.

5. Group Based Anonymization


- Explains methods like k-anonymity and l-diversity, which group data records to make
individual entries indistinguishable within a group, thereby protecting privacy while maintaining
data utility.

6. Distributed Privacy Preserving Data Mining


- Covers techniques that enable multiple parties to collaboratively perform data mining
without revealing their individual datasets. This includes methods like secure multi-party
computation and federated learning.

7. Curse of Dimensionality
- Addresses the challenges posed by high-dimensional data in PPDM. As the number of
dimensions increases, it becomes more difficult to maintain privacy while ensuring data utility.
This section explores strategies to manage and mitigate these challenges.
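
A small synthetic illustration of the effect, assuming a table that contains every combination of four binary attributes twice. As more attributes are treated as quasi-identifiers, the smallest group of indistinguishable records shrinks.

```python
from collections import Counter
from itertools import product

def smallest_group(records, n_attrs):
    """Size of the smallest group when the first n_attrs columns are quasi-identifiers."""
    groups = Counter(r[:n_attrs] for r in records)
    return min(groups.values())

# Synthetic table: all combinations of 4 binary attributes, repeated twice (32 rows).
records = list(product((0, 1), repeat=4)) * 2

for d in range(1, 5):
    print(d, smallest_group(records, d))
# 1 16
# 2 8
# 3 4
# 4 2
```

Each extra attribute halves the group size here, so keeping a fixed \( k \) in high dimensions forces heavier generalization or suppression and therefore lower data utility.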

8. Application of Privacy Preserving Data Mining


- Highlights real-world applications of PPDM in various fields such as healthcare, finance,
marketing, and government. It discusses how these techniques are used to protect sensitive
information while extracting valuable insights from data.
