
Adbms Unit 5


Noida Institute of Engineering and Technology, Greater Noida

DATABASE STANDARDS, SECURITY METHODS AND TECHNIQUES

Unit 5

Advance DBMS
Course Details: B Tech 6th Sem
Faculty Name: Ms. Shweta, Assistant Professor, AIML Dept.

19/07/2024 Ms. Shweta ACSML0603 Unit 5 1


Content

SQL and NoSQL standards


Use of SQL/NoSQL and standards in the industry
Limitations of standardization
Standards for interoperability and integration
Web services
JSON
Data encryption
Redaction and masking techniques
Authentication and Authorization
Database auditing



Evaluation Scheme



Syllabus

SQL and NoSQL standards: Use of SQL/NoSQL and standards in the industry,
Limitations of standardization.
Standards for interoperability and integration: Web services, JSON. Data
encryption, Redaction and masking techniques, Authentication and Authorization,
Database auditing.



Branch wise Application

Advanced DBMS is applied in many fields, for example:


Railway Reservation System
Library Management System
Banking
Universities and colleges
Credit card transactions etc.



Course Objective

This course provides an introduction to advanced database management systems.
The course introduces both theoretical (knowledge-based) and practical
approaches, illustrates the use of advanced databases and tools in a variety of
application areas, and provides insight into many open research problems.



Course Outcomes(COs)



Program Outcomes (POs)

Engineering Graduates will be able to:


1. Engineering knowledge: Apply the knowledge of mathematics, science, engineering fundamentals, and an
engineering specialization to the solution of complex engineering problems.

2. Problem analysis: Identify, formulate, review research literature, and analyze complex engineering problems
reaching substantiated conclusions using first principles of mathematics, natural sciences, and engineering
sciences.

3. Design/development of solutions: Design solutions for complex engineering problems and design system
components or processes that meet the specified needs with appropriate consideration for the public health
and safety, and the cultural, societal, and environmental considerations.

4. Conduct investigations of complex problems: Use research-based knowledge and research methods
including design of experiments, analysis and interpretation of data, and synthesis of the information to
provide valid conclusions.



Program Outcomes (POs)
Conti…
5. Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern
engineering and IT tools including prediction and modeling to complex engineering activities with an
understanding of the limitations.

6. The engineer and society: Apply reasoning informed by the contextual knowledge to assess
societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the
professional engineering practice.

7. Environment and sustainability: Understand the impact of the professional engineering solutions
in societal and environmental contexts, and demonstrate the knowledge of, and need for sustainable
development.

8. Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms
of the engineering practice.



Program Outcomes (POs)
Conti…
9. Individual and team work: Function effectively as an individual, and as a member or leader in diverse
teams, and in multidisciplinary settings.

10. Communication: Communicate effectively on complex engineering activities with the engineering
community and with society at large, such as, being able to comprehend and write effective reports and
design documentation, make effective presentations, and give and receive clear instructions.

11. Project management and finance: Demonstrate knowledge and understanding of the engineering and
management principles and apply these to one’s own work, as a member and leader in a team, to manage
projects and in multidisciplinary environments.

12. Life-long learning: Recognize the need for, and have the preparation and ability to engage in
independent and life-long learning in the broadest context of technological change.



COs and POs Mapping

CO         PO1   PO2   PO3   PO4   PO5   PO6   PO7   PO8   PO9   PO10  PO11  PO12
KCS501.1   2     2     3     3     3     2     3     2     2     2     2     3
KCS501.2   3     3     3     2     2     2     2     2     2     2     2     3
KCS501.3   2     3     3     3     3     2     2     2     2     2     2     2
KCS501.4   2     3     2     2     2     2     2     2     2     3     2     2
KCS501.5   2     3     2     2     2     3     2     2     3     2     2     2
AVG        2.20  2.80  2.60  2.40  2.40  2.20  2.20  2.00  2.20  2.20  2.00  2.40



Program Specific Outcomes (PSOs)

On successful completion of graduation degree the Computer Science & Engineering graduates will be able to:

PSO1:
Design and develop the Hardware and Software systems.
PSO2:
Understand the interdisciplinary computing techniques and an ability to apply them in the design of advanced
computing.
PSO3:
Understand the programming methodology, software development paradigms, design and analysis of Algorithms,
Operating Systems, Digital Logic Design, Theory of Computation, Discrete Mathematics, Compiler Design, etc.
PSO4:
To integrate & manage the various phases/components of software development projects of society.



COs and PSOs Mapping

Program Specific Outcomes

CO          PSO1  PSO2  PSO3  PSO4
KCS-501.1   3     1     3     1
KCS-501.2   3     1     3     1
KCS-501.3   3     1     3     1
KCS-501.4   3     1     3     1
KCS-501.5   3     1     3     1
AVG         3.00  1.00  3.00  1.00



Program Educational Objectives (PEOs)

PEO1:
Able to apply sound knowledge in the field of information technology to fulfill the needs of IT industry.
PEO2:
Able to design innovative and interdisciplinary systems through latest digital technologies.
PEO3:
Able to inculcate professional ethics, team work and leadership for serving the society.
PEO4:
Able to inculcate lifelong learning in the field of computing for successful career in organizations and R&D
sectors.





Question Paper Template
SECTION – A CO

1. Attempt all parts- [10×1=10]

1-a. Question- (1)


1-b. Question- (1)
1-c. Question- (1)
1-d. Question- (1)
1-e. Question- (1)
1-f. Question- (1)
1-g. Question- (1)
1-h. Question- (1)
1-i. Question- (1)
1-j. Question- (1)

2. Attempt all parts- [5×2=10] CO

2-a. Question- (2)


2-b. Question- (2)
2-c. Question- (2)
2-d. Question- (2)
2-e. Question- (2)

SECTION – B CO

3. Answer any five of the following- [5×6=30]


3-a. Question- (6)
3-b. Question- (6)
3-c. Question- (6)
3-d. Question- (6)
3-e. Question- (6)
3-f. Question- (6)
3-g. Question- (6)
SECTION – C CO

4. Answer any one of the following- [5×10=50]

4-a. Question- (10)

4-b. Question- (10)


5. Answer any one of the following-
5-a. Question- (10)

5-b. Question- (10)


6. Answer any one of the following-
6-a. Question- (10)

6-b. Question- (10)


7. Answer any one of the following-
7-a. Question- (10)

7-b. Question- (10)

8. Answer any one of the following-


8-a. Question- (10)

8-b. Question- (10)



Prerequisite and Recap

• The student should have knowledge of relational database management systems (RDBMS) and SQL.
• Knowledge of basic mathematics such as sum, difference, average (mean), median, and mode is an advantage.
• Knowledge of set theory will help.
• A proper understanding of data structures (B and B+ trees) will help you understand the DBMS quickly.



Unit Objective

• Students will learn the concepts behind SQL and NoSQL standards.
• Use of SQL/NoSQL and standards in the industry, limitations of standardization, and an overview of standards for interoperability and integration.
• Introduction to web services, JSON, data encryption, redaction and masking techniques, authentication and authorization, and database auditing.



SQL and NoSQL standards and Usages
When to use SQL instead of NoSQL

You’re working with complex queries and reports.

• With SQL you can build one script that retrieves and presents your data. Most NoSQL stores do not model relations between records, so join-style queries are possible but usually much slower.

You have a high-transaction application.

• SQL databases are a better fit for heavy-duty or complex transactions because they are more stable and ensure data integrity.

You need to ensure ACID compliance.

• ACID (Atomicity, Consistency, Isolation, Durability) defines exactly how transactions interact with a database.

You don’t anticipate a lot of changes or growth.

• If you’re not working with a large volume of data or many data types, NoSQL would be overkill.
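
The ACID point can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The accounts table, its CHECK constraint, and the transfer helper are assumptions made for this example; they show atomicity only, where a transfer that fails halfway leaves no partial update behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Credit dst and debit src as one transaction; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would overdraw src, so the earlier credit is undone too.
        return False

ok = transfer(conn, 1, 2, 30)       # both updates are committed together
failed = transfer(conn, 1, 2, 500)  # debit violates the CHECK constraint
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
print(ok, failed, balances)
```

After the failed transfer, account 2 does not keep the credit that was applied before the debit failed, which is the behavior atomicity promises.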
SQL and NoSQL standards and Usages
When to use NoSQL instead of SQL

When you are constantly adding new features, functions, and data types.

• It’s difficult to predict how the application will grow over time.

When changing the data model in SQL is clunky and requires code changes.

• A lot of time is invested in designing the data model because changes will impact all or most of the layers in the application.

When you want a highly flexible schema design or no predefined schema.

• In NoSQL the data modelling process is iterative and adaptive. Changing the structure or schema typically does not disrupt development cycles or cause downtime
for the application.
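
The contrast between a fixed schema and a flexible one can be sketched as follows. This is an illustration only: the users table and the documents list are hypothetical, and a plain Python list of dictionaries stands in for a document store.

```python
import sqlite3

# Relational side: a new attribute needs a schema change before it can be stored.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'Asha')")
conn.execute("ALTER TABLE users ADD COLUMN phone TEXT")  # schema migration
conn.execute("INSERT INTO users VALUES (2, 'Ravi', '5550100')")
rows = conn.execute("SELECT id, name, phone FROM users ORDER BY id").fetchall()

# Document side: each record carries its own fields, so no migration is needed.
documents = [
    {"_id": 1, "name": "Asha"},
    {"_id": 2, "name": "Ravi", "phone": "5550100", "tags": ["beta"]},
]
phones = [doc.get("phone") for doc in documents]  # missing fields read as None
```

The trade-off is that the application, not the database, now has to cope with records that lack a field.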
SQL and NoSQL standards and Usages
When to use NoSQL instead of SQL

When you do not need to be concerned about data consistency, and 100% data integrity is not your top goal.

• This is the counterpart of the ACID requirement listed for SQL. For example, on social media platforms it isn’t important that everyone sees your
new post at exactly the same time, so strict data consistency is not a priority.

When you need to store a lot of data of many different types, and your data needs will only grow over time.

• NoSQL makes it easy to store different types of data together without investing time in defining, in advance, what type of data you’re storing.

When your data needs to scale up, out, and down.

• NoSQL provides much greater flexibility and the ability to control costs as your data needs change.
Use of SQL/NoSQL and standards in the industry

Why NoSQL?

Today it is easy to capture and access data from many sources, such as Facebook and Google.
Users’ personal information, geographic location data, user-generated content, social graphs, and
machine logging data are some examples of data that is growing rapidly.
Making use of such data requires processing it in large volumes,
a task for which relational databases are often a poor fit. NoSQL databases evolved to handle this
large volume of data properly.



Conti...

Use of NoSQL in industry:

1. Session Store:
Managing session data in a relational database is difficult, especially once an application has
grown very large.
In such cases the right approach is to use a global session store, which manages session information for
every user who visits the site.
NoSQL is suitable for storing such web application session information, which can be very large.
Because session data is unstructured, it is easier to store it in schema-less documents than
in relational database records.



Conti...
2. User Profile Store:
Web and mobile applications must store user profiles to enable online transactions, user preferences,
user authentication, and more.
The number of users of web and mobile applications has grown very rapidly in recent years. A relational database
limited to a single server struggles to handle such a large and rapidly growing volume of user profile data.
3. Content and Metadata Store:
Many companies, such as publishing houses, need a place to store large amounts of data,
including articles, digital content, and e-books, in order to bring various learning tools together on a single
platform.
For content-based applications, NoSQL provides flexibility, faster access to data, and the ability
to store different types of content.



Conti...

4. Mobile Applications:
With a NoSQL database, mobile application development can start small and expand easily
as the number of users increases, which is much harder with relational databases.
5. Internet of Things:
Today, billions of devices are connected to the internet, such as smartphones, tablets, home appliances, and
systems installed in hospitals, cars, and warehouses. These devices generate a large volume and variety of data
and keep on generating it.
6. Social Gaming:
Data-intensive applications such as social games can grow to millions of users. Such growth in the
number of users and the amount of data requires a database system that can store this data and
scale to accommodate the growing user base. NoSQL is suitable for such applications.



Limitations of standardization

What is data standardization?


Data standardization is the process of transforming an incorrect or unacceptable representation of data into an acceptable form. The
best way to achieve standardization of data is to align your data representation, structure, and definition to
organizational requirements.

Why do you need to standardize data?


Every system has its own set of limitations and restrictions, leading to unique data models and their
definitions. For this reason, you may need to transform data before it can be correctly consumed by any
business process.



Conti...

1. Conform incoming or outgoing data:


An organization has many interfaces that exchange data with external stakeholders, such as vendors
or partners. Whenever data enters the enterprise or is exported, it must be conformed to
the required standard; otherwise the unstandardized data mess just gets bigger and bigger.
2. Prepare data for analytics:
The same data can be represented in multiple ways, but most BI tools are not designed to process every
possible representation of data values and may end up treating values with the same meaning differently.
This can lead to biased or inaccurate BI results. Therefore, before you feed data into your BI systems, it
must be cleaned, standardized, and deduplicated so that you obtain correct, valuable insights.



Conti...

3. Consolidate entities to eliminate duplicates


Data duplication is one of the biggest data quality hazards businesses deal with. For efficient and error-free
business operations, you must eliminate duplicate records that belong to the same entity (whether a
customer, product, location, or employee), and an effective data deduplication process requires you to
comply with data quality standards.
4. Share data between departments
For data to be interoperable between departments, it has to be in a format that everyone understands.
Typically, organizations keep customer information in CRMs in a form that only the sales and
marketing teams understand. When other departments need that data, this can introduce delays in task completion and roadblocks in team productivity.



How to standardize data?

A data standardization process has four simple steps: define, test, transform, and retest. Let’s go over each
step in a bit more detail.
1. Define a standard:
In the first step, you must identify the standard that meets your organizational needs. The best way to define a
standard is to design a data model for your enterprise. A data model can be designed as follows:
• Identify the data assets crucial to your business operation. For example, most enterprises capture and
manage data for customers, products, employees, locations, etc.
• Define the data fields of each asset identified and decide on the structural details as well. For example,
you may want to store a customer’s Name, Address, Email, and Phone Number, where the Name field spans
three fields and the Address field spans two.



Conti...
• Assign a data type to every field identified in the asset. For example, the Name field is a string value,
Phone Number is an integer value, and so on.
• Define character limits (minimum and maximum) for each field. For example, a Name cannot be longer
than 15 characters, a Phone Number cannot have more than 8 digits, etc.
• Define the pattern that fields must adhere to; this may not apply to all fields. For example, every
customer’s Email Address should adhere to the pattern [chars]@[chars].[chars].
• Define the format in which certain data elements must be placed within a field. For example, a customer’s
DOB should be specified as MM/DD/YYYY.



Conti...
• Define the measuring unit for numeric values (if applicable). For example, a customer’s Age is measured
in years.
• Define the value domain for fields that must be derived from a certain set of values. For example,
customer Age must be a number between 18 and 50, Gender must be Male or Female, and so on.
[Figure: example data model for a retail company]
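
The field rules described in this step (data type, length limit, pattern, format, value domain) can be written down as data and checked mechanically. The sketch below is one possible encoding, not a prescribed one: CUSTOMER_STANDARD, its keys, and the violations function are hypothetical names, and the limits are the example values quoted in the slides.

```python
import re
from datetime import datetime

# Hypothetical standard for a customer record.
CUSTOMER_STANDARD = {
    "name":  {"type": str, "max_len": 15},
    "phone": {"type": int, "max_digits": 8},
    "email": {"type": str, "pattern": r"[^@\s]+@[^@\s]+\.[^@\s]+"},
    "dob":   {"type": str, "date_format": "%m/%d/%Y"},
    "age":   {"type": int, "min": 18, "max": 50},
}

def violations(record):
    """Return (field, reason) pairs for every value that breaks the standard."""
    problems = []
    for field, rule in CUSTOMER_STANDARD.items():
        value = record.get(field)
        if not isinstance(value, rule["type"]):
            problems.append((field, "wrong type"))
            continue
        if "max_len" in rule and len(value) > rule["max_len"]:
            problems.append((field, "too long"))
        if "max_digits" in rule and len(str(abs(value))) > rule["max_digits"]:
            problems.append((field, "too many digits"))
        if "pattern" in rule and not re.fullmatch(rule["pattern"], value):
            problems.append((field, "pattern mismatch"))
        if "date_format" in rule:
            try:
                datetime.strptime(value, rule["date_format"])
            except ValueError:
                problems.append((field, "bad date format"))
        if "min" in rule and not rule["min"] <= value <= rule["max"]:
            problems.append((field, "out of range"))
    return problems

good = {"name": "Jack", "phone": 55501234, "email": "jack@example.com",
        "dob": "01/31/1999", "age": 25}
bad = {"name": "A" * 16, "phone": 123456789, "email": "no-at-sign",
       "dob": "1999-01-31", "age": 17}
print(violations(good))
print(violations(bad))
```

Keeping the standard as data rather than scattered if-statements means the same definition can drive both the test step and the retest step.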



Conti...

2. Test for standard:


Data standardization techniques start at the second step, since the first step only defines what the data should
be, which is done once and then reviewed and updated incrementally. Testing for the standard involves:
a. Parsing records and attributes
b. Building a data profile report
c. Matching and validating patterns
d. Using dictionaries
e. Testing addresses for standardization
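
Profiling and pattern matching (items b and c) can be sketched in a few lines. The shape and profile helpers below are hypothetical; they reduce each value to a character pattern and count how many values match the expected one, which is roughly what a data profile report summarizes.

```python
import re
from collections import Counter

def shape(value):
    """Reduce a value to a pattern: digits become 9, letters become A."""
    return re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", value))

def profile(values, expected_pattern):
    """Count value shapes and how many values match the expected pattern."""
    shapes = Counter(shape(v) for v in values)
    conforming = sum(1 for v in values if re.fullmatch(expected_pattern, v))
    return {"shapes": shapes, "conforming": conforming, "total": len(values)}

dates = ["01/31/1999", "1999-01-31", "12/05/2001", "5 Dec 2001"]
report = profile(dates, r"\d{2}/\d{2}/\d{4}")
print(report)
```

The shape counts show at a glance which alternative formats are present and therefore which transformations the next step needs.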



Conti...

3. Transform:
In the third step of the data standardization process, it is finally time to convert the non-conforming values
into a standardized format. This can include:
• Transforming field data types, such as converting Phone Number from a string to an integer
and removing any characters or symbols present in phone numbers to obtain the 8-digit number.
• Transforming patterns and formats, such as converting dates in the dataset to the format
MM/DD/YYYY.
• Transforming measurement units, such as converting product prices to USD.



Conti...

• Expanding abbreviated values to complete forms, such as replacing abbreviated U.S. states: NY with
New York, NJ with New Jersey, and so on.
• Removing noise from data values to obtain more meaningful information, such as removing LLC, Inc.,
and Corp. from company names to get the actual names.
• Reconstructing the values in a standardized format in case they need to be mapped to a new application
or a data hub, such as a master data management system.
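
Several of the transformations listed for this step can be sketched as small functions. All names here are hypothetical, the state lookup is deliberately partial, and the accepted input date formats are assumptions. Note that converting a phone number to an integer, as the slides suggest, drops any leading zeros.

```python
import re
from datetime import datetime

STATE_NAMES = {"NY": "New York", "NJ": "New Jersey"}        # partial lookup
DATE_INPUT_FORMATS = ("%m/%d/%Y", "%Y-%m-%d", "%d.%m.%Y")   # assumed variants

def clean_phone(raw):
    """Strip every non-digit character and return the number as an integer."""
    return int(re.sub(r"\D", "", raw))

def clean_date(raw):
    """Re-emit a date as MM/DD/YYYY, trying each assumed input format in turn."""
    for fmt in DATE_INPUT_FORMATS:
        try:
            return datetime.strptime(raw, fmt).strftime("%m/%d/%Y")
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {raw!r}")

def expand_state(code):
    """Replace a known state abbreviation with its full name."""
    return STATE_NAMES.get(code.strip().upper(), code)

def clean_company(name):
    """Drop a trailing legal suffix such as LLC, Inc., or Corp."""
    return re.sub(r"[,\s]+(LLC|Inc\.?|Corp\.?)$", "", name.strip())

print(clean_phone("(555) 01-23 4"), clean_date("1999-01-31"),
      expand_state("ny"), clean_company("Acme Tools, Inc."))
```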



Conti...

4. Retest for standard:


Once the transformation process is over, it is good practice to retest the dataset for standardization
errors. The pre- and post-standardization reports can be compared to understand how many data
errors the configured processes fixed and how the processes can be improved to reach better results.



Standards for interoperability and integration

The terms integration and interoperability are often used interchangeably, but they are very different,
and understanding the impact of an “integration strategy” versus an “interoperability strategy”
can have a dramatic effect on your product and business.
Let's start with some definitions:
Integration:
A connection between two or more products or systems that enables communication, usually through
“middleware” that translates each system’s data.
Interoperability:
The ability of a product or system to communicate with any other product or system
that speaks the same language (i.e., has a common standards-based interface).



Conti...

Imagine that you are visiting an international conference with people from across the globe, and that
everyone speaks a different language. How do you communicate with everybody? There are three basic
options:
1. Everyone learns every single language spoken at the conference
2. Hire a translator for every language at the conference
3. Everyone learns ONE common language
Options 1 and 2 are “integrations”: option 1 is a direct integration (everyone learns everyone else’s language)
and option 2 integrates via “middleware” (i.e., the translator).
If you only ever want a few people at your conference, “integration” will probably work just fine. But if you
want a lot of participants who all interact with each other, “interoperability” is the way to go.



Conti...

Option 3 is “interoperability”: everybody learns the same ONE common language. The benefits of
interoperability in this scenario are clear:
• You only need to learn ONE language, no matter how many people are at the conference.
• When a new person arrives, they only need to learn ONE language, and no one else needs to learn
anything new.
When systems are interoperable, they communicate via a common language with no translation required.
Phones, faxes, railroads, AM/FM radio, the web, and email are all examples of technological interoperability.
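
The conference analogy can be put into numbers. As a rough sketch, assume each pair of systems needs its own connection under direct integration, while under interoperability each system implements the shared standard once; the two function names are made up for this illustration.

```python
def direct_integrations(n):
    """Pairwise connections when every system integrates with every other one."""
    return n * (n - 1) // 2

def standard_interfaces(n):
    """Interfaces needed when every system implements one shared standard."""
    return n

for n in (3, 10, 50):
    print(n, direct_integrations(n), standard_interfaces(n))
```

The pairwise count grows quadratically with the number of systems, while the shared-standard count grows linearly, which is why integration works for a few participants but not for many.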



Conti...

In the world of email, different products such as Outlook, Hotmail, and Yahoo all use a common data format to
transmit emails from one system to another. When Gmail was introduced, it did not add complexity to
the network, and the other systems did not need to do any extra work, because the email ecosystem was
interoperable with any platform that used the published public email protocols and standards. The
interoperability of fax machines is similarly valuable: every fax machine is built to communicate with every
other fax machine.



Benefits of Interoperability
The benefits of interoperability can be summarized as follows:
• Facilitates the creation of a new presentation layer using a new technology while reusing
the existing business components.
• Integrates heterogeneous software components within an enterprise.
• Lower integration effort: standards that have already been agreed upon make the
interfaces between systems and processes compatible with each other.
• Lower ownership and maintenance effort: the cost and effort of ownership and maintenance
decrease with the use of interoperable components supplied by third parties.
• Increased market and technology opportunities: with web services and interoperability,
enterprises have a wider choice of vendors, as technologies are no longer a hindering factor.



What are Web Services?

Web services are a type of internet software that uses standardized messaging protocols in a
distributed environment. They integrate web-based applications over the network using standards such as REST, SOAP,
WSDL, and UDDI. For example, a Java web service can communicate with a .NET application.

Features of Web Services:

• Web services are designed for application-to-application interaction.
• They should be interoperable.
• They should allow communication over the network.



Conti...

Components of Web Services:

A web service must fulfill the following conditions:
• It must be accessible over the internet.
• It must be discoverable through a common mechanism such as UDDI.
• It must be interoperable across programming languages and operating systems.

Uses of Web Services

• Web services are used to reuse code and connect existing programs.
• Web services can be used to link data between two different platforms.
• They provide interoperability between disparate applications.



Conti...

How is data exchanged between applications?

Suppose Application A creates a request to access a web service. The web service
offers a list of services. It processes the request and sends the response back to Application A.
The input to a web service is called a request, and the output from a web service is called a response.
Web services can be called from different platforms.
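
The request and response cycle can be sketched without any network code. In this simplified illustration the SERVICES catalogue, the message fields (operation, params, status, result), and handle_request are all hypothetical; a real web service would receive the same kind of message over HTTP using REST or SOAP.

```python
import json

# Hypothetical catalogue of operations the service offers.
SERVICES = {
    "add": lambda a, b: a + b,
    "upper": lambda text: text.upper(),
}

def handle_request(request_json):
    """Decode a JSON request, run the named operation, encode a JSON response."""
    request = json.loads(request_json)
    operation = SERVICES.get(request.get("operation"))
    if operation is None:
        return json.dumps({"status": "error", "message": "unknown operation"})
    result = operation(**request.get("params", {}))
    return json.dumps({"status": "ok", "result": result})

# Application A builds a request and reads the response.
request_text = json.dumps({"operation": "add", "params": {"a": 2, "b": 3}})
response = json.loads(handle_request(request_text))
unknown = json.loads(handle_request('{"operation": "nope"}'))
print(response, unknown)
```

Because both sides only exchange text in an agreed format, the caller could be written in any language, which is the interoperability property the slides describe.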



What are JSON databases?

A JSON database is a document-type NoSQL database, ideal for storing semi-structured data. It is much
more flexible than the row-and-column format, which is fixed and makes even small schema changes
expensive.

In relational databases, JSON data typically has to be parsed or stored as text, for example in an NVARCHAR column (LOB
storage). Document databases like MongoDB, however, can store JSON data in its natural format, which is
readable by humans and machines.



What is JSON and why is it used?
JSON (JavaScript Object Notation) is a lightweight data interchange format used to exchange data
between client and server in web applications. It is less verbose than XML and has a syntax similar to
JavaScript objects, which makes it a good choice for web applications. JSON data is also easy to store
using JSON databases like MongoDB.
JSON Format: JSON is a readable format for structuring data. It is used for transmitting data between a server
and a web application.
{
  "employee": {
    "id": "00987",
    "name": "Jack",
    "salary": 20000
  }
}
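
A short sketch of reading and writing an employee record like this one with Python's standard json module. The id is kept as a string here because JSON numbers cannot have leading zeros; the is_valid_json helper is a hypothetical name used to show that a parser rejects leading zeros and trailing commas.

```python
import json

text = '{"employee": {"id": "00987", "name": "Jack", "salary": 20000}}'
record = json.loads(text)                 # JSON text -> Python dict
record["employee"]["salary"] += 1500      # work with it as ordinary data
encoded = json.dumps(record, indent=2)    # Python dict -> JSON text

def is_valid_json(candidate):
    """Return True if the text parses as JSON, False otherwise."""
    try:
        json.loads(candidate)
        return True
    except json.JSONDecodeError:
        return False

checks = {
    "leading zeros": is_valid_json('{"id": 00987}'),
    "trailing comma": is_valid_json('{"name": "Jack",}'),
    "well formed": is_valid_json(text),
}
print(encoded)
print(checks)
```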



Advantages of JSON databases

1. JSON databases can be faster for document-style access and have more storage flexibility.
2. JSON databases provide better schema flexibility.
3. JSON documents can be mapped to SQL structures.
4. JSON databases support different index types.
5. JSON databases are well suited to big data analytics.



Data Encryption

Introduction:
Encryption is a security method in which information is encoded so that only authorized users can
read it. It uses an encryption algorithm to generate ciphertext that can only be read once decrypted.

Data encryption converts data into a different form (code) that can only be accessed by people who have a
secret key (formally known as a decryption key) or password. Data that has not been encrypted is called
plaintext, and data that has been encrypted is called ciphertext. Encryption is one of the most
widely used and effective data protection technologies in today’s corporate world.



Types of Data Encryption

There are two types of encryption schemes:

1. Symmetric Key encryption
2. Public Key encryption

1. Symmetric Key encryption:

A symmetric key encryption algorithm uses the same cryptographic key for both encrypting the plaintext and decrypting the
ciphertext.
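
The idea that one shared key both encrypts and decrypts can be shown with a toy XOR cipher. This is only an illustration of the symmetric principle, not a scheme to use in practice (production systems rely on vetted algorithms such as AES); xor_bytes is a hypothetical helper, and the key is a random byte string as long as the message.

```python
import secrets

def xor_bytes(data, key):
    """XOR each data byte with the matching key byte; applying it twice restores the input."""
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))

plaintext = b"salary=20000"
key = secrets.token_bytes(len(plaintext))   # the shared secret key
ciphertext = xor_bytes(plaintext, key)      # encrypt with the key
recovered = xor_bytes(ciphertext, key)      # decrypt with the same key
```

Both parties must already hold the same key, which is the key-distribution problem that public key encryption addresses.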



Conti...

2. Public Key encryption:


A public key encryption algorithm uses a pair of keys: a private key that is kept secret and a public key that can be shared.
The two keys are mathematically linked.
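
The mathematical link between the two keys can be seen in a toy version of textbook RSA, one well-known public key scheme (the slides do not name a specific algorithm). The primes below are far too small to be secure and there is no padding; the sketch only shows that what the public key encrypts, the private key decrypts.

```python
# Toy textbook RSA with tiny primes, for illustration only.
p, q = 61, 53
n = p * q                   # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                      # public exponent, coprime with phi
d = pow(e, -1, phi)         # private exponent: modular inverse (Python 3.8+)

public_key = (e, n)
private_key = (d, n)

def encrypt(message, key):
    """Encrypt an integer smaller than n with the public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)

def decrypt(cipher, key):
    """Decrypt with the private key."""
    exponent, modulus = key
    return pow(cipher, exponent, modulus)

message = 65
cipher = encrypt(message, public_key)
restored = decrypt(cipher, private_key)
print(cipher, restored)
```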



What is Data Redaction?

Data redaction is a method used to protect sensitive data from being compromised or leaked. It involves removing
particular pieces of data from a dataset so that the data cannot be exposed as a whole and used for
malicious purposes.
Basically, this process breaks data down into pieces of information and removes or hides the portions that can be
used to identify or link to a particular person, company, or organization. For instance, if you store your customers’ credit card
information in your database, you may choose to redact the first names of all the
cardholders, or the first and last four digits of the card numbers.
Companies all over the world use data redaction tools to hide and protect their sensitive data.
Redaction not only helps keep the data secure but also preserves the integrity and authenticity of the data that remains.
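
The credit card example can be sketched as a small function. The record layout and the redact_card name are assumptions for illustration; the function returns a redacted copy and leaves the stored record untouched.

```python
def redact_card(record):
    """Return a copy without the first name and with the first and last four card digits hidden."""
    redacted = dict(record)
    redacted.pop("first_name", None)
    number = redacted["card_number"]
    redacted["card_number"] = "XXXX" + number[4:-4] + "XXXX"
    return redacted

customer = {"first_name": "Jack", "last_name": "Rao",
            "card_number": "4111111111111111"}
safe = redact_card(customer)
print(safe)
```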



Types of Redaction

1. Static Redaction
In static redaction, the data is copied or moved to a separate copy to which redaction algorithms and
measures are applied. It can be used to redact sensitive information from large amounts of data, but it requires
considerable time and resources.

2. Dynamic Redaction
Dynamic data redaction redacts sensitive information from data in real time, which is why it is
also known as data-in-transit redaction. The data doesn’t have to go through batch
processing to be redacted. However, it is most suitable for read-only applications and has
significant performance overhead.



What Is Data Masking?

Data masking is a technique used to create a version of data that looks structurally similar to the original but
hides (masks) sensitive information. The version with the masked information can then be used for various
purposes, such as user training or software testing. The main objective of masking data is to create a
functional substitute that does not reveal the real data.
Here are several examples of data masking:
• Replacing personally identifying details and names with other symbols and characters
• Moving details around or randomizing sensitive data like names or account numbers
• Scrambling the data, substituting parts of it for other parts from the same dataset
• Deleting or “nulling out” sensitive values within data records
• Encrypting the data to make it infeasible for unauthorized users to access it without a decryption key
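
Three of the masking techniques listed here (character substitution, shuffling within a column, and nulling out) can be sketched as follows. The function names and the sample rows are hypothetical, and the fixed seed is used only to make the shuffle repeatable.

```python
import random

def mask_digits(value, keep_last=4, symbol="*"):
    """Replace all but the last keep_last characters with a symbol, preserving length."""
    return symbol * (len(value) - keep_last) + value[-keep_last:]

def shuffle_column(rows, field, seed=0):
    """Permute one field's values across the rows; no new values are invented."""
    values = [row[field] for row in rows]
    random.Random(seed).shuffle(values)
    return [{**row, field: v} for row, v in zip(rows, values)]

def null_out(rows, field):
    """Blank a field in every row."""
    return [{**row, field: None} for row in rows]

rows = [
    {"name": "Asha", "account": "1234567890"},
    {"name": "Ravi", "account": "2345678901"},
    {"name": "Jack", "account": "3456789012"},
]
masked = [{**r, "account": mask_digits(r["account"])} for r in rows]
shuffled = shuffle_column(rows, "name")
nulled = null_out(rows, "account")
```

In each case the result keeps the same fields and value lengths as the original, so applications and tests that depend on the data's structure continue to work.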



Types of Data Masking

1. Static data masking: creates a duplicate version of a dataset that contains fully or partially
masked data. The dummy database is maintained separately from the production database.

2. Dynamic data masking: alters information in real time, as it is accessed by users. This technique is
applied directly to production datasets. It ensures that the original data is seen only by authorized users,
and any non-privileged user sees masked data.
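
A minimal sketch of the difference, assuming a small in-memory table and a hypothetical privileged role named "admin" (neither comes from a particular DBMS):

```python
import copy

PRODUCTION = [
    {"name": "Asha Rao", "account": "9876543210"},
    {"name": "Ravi Kumar", "account": "1234567890"},
]

def mask_account(account: str) -> str:
    """Keep the last four digits and replace the rest with X."""
    return "X" * (len(account) - 4) + account[-4:]

def build_masked_copy(rows: list[dict]) -> list[dict]:
    """Static masking: produce a separate dataset, apart from production."""
    masked = copy.deepcopy(rows)
    for row in masked:
        row["account"] = mask_account(row["account"])
    return masked

def read_row(rows: list[dict], index: int, role: str) -> dict:
    """Dynamic masking: mask at access time unless the caller is privileged."""
    row = dict(rows[index])
    if role != "admin":   # "admin" is a hypothetical privileged role
        row["account"] = mask_account(row["account"])
    return row

test_copy = build_masked_copy(PRODUCTION)
for_admin = read_row(PRODUCTION, 1, "admin")
for_analyst = read_row(PRODUCTION, 1, "analyst")
```

Commercial databases usually implement dynamic masking as a policy on the column rather than in application code; the function above only mimics that behaviour.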



Data Redaction vs Data Masking

When reading about data redaction, you will also come across another term: data masking. Both are
tools used in data security, but there are some basic differences between them.

Data redaction is the process of removing certain pieces of sensitive or personally identifiable
information, whereas data masking is a process in which sensitive, authentic information is replaced with
inauthentic information that has the same structure.
Data masking is mostly used to create sample data for testing or training purposes, so that
personally identifiable information or sensitive data is not exposed or manipulated during development or
testing in an organization. It also keeps the data structure and data types intact, so that
the data can still be used in applications.


Data redaction, on the other hand, is used to conceal personally identifiable or classified information
within otherwise readable data, so that sensitive data is not leaked to the public.

In short, data redaction is a method to 'remove' data, while data masking is a
method to 'replace' data with something in a similar format. In many cases, data redaction is considered to
be a sub-type of data masking.



What is Authentication?

Authentication is the process of verifying someone's identity, that is, confirming that a person or system
is who they claim to be.
It is used by both server and client. The server uses authentication when someone wants to access
information and the server needs to know who is accessing it. The client uses it when it
needs to confirm that the server is the one it claims to be.
Authentication by the server is mostly done with a username and password. It can also be done
using cards, retina scans, voice recognition, and fingerprints.
Authentication does not determine which tasks a person can perform or which files they can view,
read, or update. It only establishes who the person or system actually is.
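
Username-and-password authentication on the server side can be sketched as follows. Storing a salted hash instead of the password itself is an assumption added here (the text above does not prescribe a storage scheme), and the iteration count and the in-memory `USERS` table are illustrative only.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000   # illustrative work factor, not a recommendation
USERS = {}             # username -> (salt, password hash); stands in for a database

def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a hash from the password and a per-user random salt."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)

def register(username: str, password: str) -> None:
    salt = os.urandom(16)
    USERS[username] = (salt, hash_password(password, salt))

def authenticate(username: str, password: str) -> bool:
    """Return True only if the username exists and the password matches."""
    entry = USERS.get(username)
    if entry is None:
        return False
    salt, stored_hash = entry
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(stored_hash, hash_password(password, salt))

register("alice", "s3cret")
```

Note that `authenticate` answers only "who is this?"; it says nothing about what the user may do afterwards.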



Authentication techniques

1. Password-based authentication
This is the simplest form of authentication. The user supplies a username and its password. If the
pair matches the details stored in the system's database, the user is successfully
authenticated.

2. Passwordless authentication
In this technique, the user does not need a password; instead, they receive an OTP (one-time password) or a link
on their registered mobile number or email address. When an OTP is used, this is also called OTP-based authentication.

3. 2FA/MFA
2FA/MFA, or two-factor/multi-factor authentication, is a higher level of authentication. It
requires an additional factor, such as a PIN or a security question, before the user is authenticated.


4. Single Sign-on
Single Sign-on, or SSO, enables access to multiple applications with a single set of credentials. The
user signs in once and is then automatically signed in to all other web apps that use the same
centralized directory.

5. Social Authentication
Social authentication does not require separate credentials; instead, it verifies the user with their existing
credentials for a social network.



What is Authorization?

Authorization is the process of granting someone permission to do something. It is a way to check whether the user has
permission to use a resource.
It defines what data and information a user can access. It is also abbreviated as AuthZ.
Authorization usually works together with authentication, so that the system knows who is accessing the information.
Authorization is not always necessary to access information available over the internet. Some data on the
internet can be accessed without any authorization; for example, you can read about any technology on public websites.
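
The permission check described above can be sketched as a lookup from a user's role to the actions that role allows. The role names, users, and actions below are hypothetical.

```python
# Hypothetical roles and the actions each role is permitted to perform.
ROLE_PERMISSIONS = {
    "admin": {"read", "update", "delete"},
    "faculty": {"read", "update"},
    "student": {"read"},
}

# Hypothetical assignment of already-authenticated users to roles.
USER_ROLES = {"asha": "admin", "ravi": "faculty", "meena": "student"}

def is_authorized(user: str, action: str) -> bool:
    """Return True if the user's role includes the requested action."""
    role = USER_ROLES.get(user)
    return action in ROLE_PERMISSIONS.get(role, set())
```

A user with no role, or an unknown user, falls through to the empty set and is denied, so the default is to refuse access.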



Authorization Techniques

1. Role-based access control
Role-based access control, or RBAC, grants access to users according to their role or profile in the organization.
It can be implemented for system-to-system or user-to-system access.

2. JSON web token
JSON web token, or JWT, is an open standard used to securely transmit data between parties in the
form of a JSON object. The token is digitally signed, with either a shared secret or a private/public key pair,
so that the receiver can verify it before authorizing the user.

3. SAML
SAML stands for Security Assertion Markup Language. It is an open standard for passing authentication and
authorization data from an identity provider to service providers. This data is exchanged through digitally signed XML documents.


4. OpenID authorization
It helps clients verify the identity of end users on the basis of authentication performed by an identity provider.

5. OAuth
OAuth is an authorization protocol that lets an application access a user's resources on another service
with the user's consent, without the application ever receiving the user's password.



Difference between Authentication and Authorization



Database audit

Database auditing involves observing a database so as to be aware of the actions of database users.
Database administrators and consultants often set up auditing for security purposes, for example, to ensure
that those without permission to access information do not access it.
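
One common way to record such actions is a trigger that writes every change into an audit table. The sketch below uses Python's built-in `sqlite3` module with hypothetical table and trigger names. SQLite has no user accounts, so this example records what changed and when but not who made the change; server databases provide their own auditing facilities that capture the user as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT, balance INTEGER);

-- Audit table: one row per recorded change
CREATE TABLE audit_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    action TEXT,
    account_id INTEGER,
    old_balance INTEGER,
    new_balance INTEGER,
    logged_at TEXT DEFAULT CURRENT_TIMESTAMP
);

-- Fires after every balance update and records the old and new values
CREATE TRIGGER audit_balance_update AFTER UPDATE OF balance ON accounts
BEGIN
    INSERT INTO audit_log (action, account_id, old_balance, new_balance)
    VALUES ('UPDATE', OLD.id, OLD.balance, NEW.balance);
END;
""")

conn.execute("INSERT INTO accounts (id, owner, balance) VALUES (1, 'alice', 100)")
conn.execute("UPDATE accounts SET balance = 250 WHERE id = 1")

audit_rows = conn.execute(
    "SELECT action, account_id, old_balance, new_balance FROM audit_log"
).fetchall()
```

Because the trigger runs inside the database, the change is logged no matter which application issued the update.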



Recap

Use of SQL/NoSQL and standards in the industry
Limitations of standardization
Standards for interoperability and integration
Web services
JSON
Data encryption



Previous year question paper



Result Analysis



MCQ
1. Which of the following is a reason to use an SQL database?
(a) It can easily store unstructured data.
(b) It's ACID-compliant.
(c) It can enable development in the cloud.
(d) All of the above

2. Which of the following is a characteristic of a NoSQL database?
(a) Uses tables for storage
(b) Needs a schema
(c) Requires JOINs
(d) Uses JSON

3. Which of the following is a primary classification for NoSQL architectures?
(a) Document databases
(b) Graph databases
(c) Key-value databases
(d) All of the above

4. SQL command types include data manipulation language (DML) and data definition language (DDL).
(a) True
(b) False

5. ________ systems are scale-out file-based (HDD) systems moving to more uses of memory in the nodes.
(a) NoSQL
(b) NewSQL
(c) SQL
(d) All of the mentioned



Topic wise link and Video

1. https://www.geeksforgeeks.org/use-of-nosql-in-industry/
2. https://dataladder.com/data-standardization-guide-types-benefits-and-process/
3. https://about.caredove.com/blog/integration-vs-interoperability
4. https://www.geeksforgeeks.org/difference-between-authentication-and-authorization/
5. https://www.javatpoint.com/authentication-vs-authorization
6. https://en.wikipedia.org/wiki/Database_audit
7. Video Link: https://www.youtube.com/watch?v=uakTCU5Z_pg



Thank You