Adbms Unit 5
Unit 5
Advance DBMS
Course Details: B Tech 6th Sem
Faculty Name: Ms. Shweta, Assistant Professor, AIML Dept.
SQL and NoSQL standards: Use of SQL/NoSQL and standards in the industry,
Limitations of standardization.
Standards for interoperability and integration: Web services, JSON. Data
encryption, Redaction and masking techniques, Authentication and Authorization,
Database auditing.
2. Problem analysis: Identify, formulate, review research literature, and analyze complex engineering problems
reaching substantiated conclusions using first principles of mathematics, natural sciences, and engineering
sciences.
3. Design/development of solutions: Design solutions for complex engineering problems and design system
components or processes that meet the specified needs with appropriate consideration for the public health
and safety, and the cultural, societal, and environmental considerations.
4. Conduct investigations of complex problems: Use research-based knowledge and research methods
including design of experiments, analysis and interpretation of data, and synthesis of the information to
provide valid conclusions.
6. The engineer and society: Apply reasoning informed by the contextual knowledge to assess
societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the
professional engineering practice.
7. Environment and sustainability: Understand the impact of the professional engineering solutions
in societal and environmental contexts, and demonstrate the knowledge of, and need for sustainable
development.
8. Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms
of the engineering practice.
10. Communication: Communicate effectively on complex engineering activities with the engineering
community and with society at large, such as, being able to comprehend and write effective reports and
design documentation, make effective presentations, and give and receive clear instructions.
11. Project management and finance: Demonstrate knowledge and understanding of the engineering and
management principles and apply these to one’s own work, as a member and leader in a team, to manage
projects and in multidisciplinary environments.
12. Life-long learning: Recognize the need for, and have the preparation and ability to engage in
independent and life-long learning in the broadest context of technological change.
CO         PO1   PO2   PO3   PO4   PO5   PO6   PO7   PO8   PO9   PO10  PO11  PO12
KCS501.1   2     2     3     3     3     2     3     2     2     2     2     3
KCS501.2   3     3     3     2     2     2     2     2     2     2     2     3
KCS501.3   2     3     3     3     3     2     2     2     2     2     2     2
KCS501.4   2     3     2     2     2     2     2     2     2     3     2     2
KCS501.5   2     3     2     2     2     3     2     2     3     2     2     2
AVG        2.20  2.80  2.60  2.40  2.40  2.20  2.20  2.00  2.20  2.20  2.00  2.40
On successful completion of graduation degree the Computer Science & Engineering graduates will be able to:
PSO1:
Design and develop the Hardware and Software systems.
PSO2:
Understand the interdisciplinary computing techniques and an ability to apply them in the design of advanced
computing.
PSO3:
Understand the programming methodology, software development paradigms, design and analysis of Algorithms,
Operating Systems, Digital Logic Design, Theory of Computation, Discrete Mathematics, Compiler Design, etc.
PSO4:
To integrate & manage the various phases/components of software development projects of society.
CO          PSO1  PSO2  PSO3  PSO4
KCS-501.1   3     1     3     1
KCS-501.2   3     1     3     1
KCS-501.3   3     1     3     1
KCS-501.4   3     1     3     1
KCS-501.5   3     1     3     1
AVG         3.00  1.00  3.00  1.00
PEO1:
Able to apply sound knowledge in the field of information technology to fulfill the needs of IT industry.
PEO2:
Able to design innovative and interdisciplinary systems through latest digital technologies.
PEO3:
Able to inculcate professional ethics, team work and leadership for serving the society.
PEO4:
Able to inculcate lifelong learning in the field of computing for successful career in organizations and R&D
sectors.
The student should have knowledge of relational database management systems (RDBMS) and SQL.
Knowledge of basic mathematics and statistics, such as sum, difference, average, mean, median, and mode,
will be a plus.
Knowledge of set theory will help.
A proper understanding of data structures (B and B+ trees) will help you understand DBMS quickly.
Introduction of Web services, JSON. Data encryption, Redaction and masking techniques,
Authentication and Authorization, Database auditing.
• With SQL, a single script can join related tables to retrieve and present your data. Most NoSQL databases do not enforce relations between records, so relationship-heavy queries are possible but typically slower.
• SQL databases are a better fit for heavy-duty or complex transactions because they are more stable and ensure data integrity.
• SQL databases provide ACID compliance (Atomicity, Consistency, Isolation, Durability), which defines exactly how transactions interact with a database.
• If you are not working with a large volume of data or many data types, NoSQL would be overkill.
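The atomicity part of ACID can be sketched with Python's built-in sqlite3 module. This is only an illustration: the accounts table, the names, and the transfer helper are assumptions made for this example and are not part of the course material.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit above is undone.
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # both updates are applied
failed = transfer(conn, "alice", "bob", 500)  # neither update is kept
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

In the failed transfer the credit to the second account runs first and is then rolled back, so the database never keeps half of a transaction.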
19/07/2024 Ms. Shweta ACSML0603 Unit 5 19
SQL and NoSQL standards and Usages
When to use NoSQL instead of SQL
• It is difficult to predict how the application will grow over time.
• In SQL, a lot of time is invested in designing the data model, because changes will impact all or most of the layers in the application.
In NoSQL, we work with a highly flexible schema design or no predefined schema.
• The data modelling process is iterative and adaptive. Changing the structure or schema will not impact development cycles or create any downtime
for the application.
When you are not concerned about strict data consistency, and 100% data integrity is not your top goal.
• This relates to the SQL requirement for ACID compliance above. For example, on social media platforms it is not important that everyone sees your
new post at exactly the same time, so data consistency is not a priority.
When you have a lot of data, many different data types, and data needs that will only grow over time.
• NoSQL makes it easy to store different types of data together, without having to define in advance what type of data you are storing.
• NoSQL provides much greater flexibility and the ability to control costs as your data needs change.
Use of SQL/NoSQL and standards in the industry
Why NoSQL?
Today, data can easily be captured and accessed from various sources, such as Facebook and Google.
Users’ personal information, geographic location data, user-generated content, social graphs, and
machine logging data are some examples of data that is growing rapidly.
Making use of this data requires processing it in large volumes, a task for which relational databases
are often not well suited. NoSQL databases evolved to handle such large volumes of data properly.
1. Session Store:
Managing session data with a relational database is difficult, especially once an application has grown
very large.
In such cases the right approach is to use a global session store, which manages session information for
every user who visits the site.
NoSQL is suitable for storing such web application session information, which can be very large in size.
Since session data is unstructured, it is easier to store it in schema-less documents than in relational
database records.
4. Mobile Applications:
With a NoSQL database, mobile application development can start small and expand easily as the number of
users increases, which is much harder with relational databases.
5. Internet of Things:
Today, billions of devices are connected to the internet, such as smartphones, tablets, home appliances, and
systems installed in hospitals, cars, and warehouses. These devices generate, and keep generating, a large
volume and variety of data.
6. Social Gaming:
Data-intensive applications such as social games can grow to millions of users. Such growth in both users
and data requires a database system that can store the data and scale to accommodate the growing number of
users. NoSQL is suitable for such applications.
A data standardization process has four simple steps: define, test, transform, and retest. Let’s go over each
step in a bit more detail.
1. Define a standard:
In the first step, you must identify what standard meets your organizational needs. The best way to define a
standard is by designing a data model for your enterprise. A data model can be designed as follows:
• Identify the data assets crucial to your business operation. For example, most enterprises capture and
manage data for customers, products, employees, locations, etc.
• Define the data fields of each asset identified and decide on the structural details as well. For example,
you may want to store a customer’s Name, Address, Email, and Phone Number, where the Name field spans
three fields and the Address field spans two.
3. Transform:
In the third step of the data standardization process, it is finally time to convert the non-conforming values
into a standardized format. This can include:
• Transforming the field data types, such as converting Phone Number from string to an integer data type
and eliminating any characters or symbols present in phone numbers to attain the 8-digit number.
• Transforming patterns and formats, such as converting dates present in the dataset to the format
MM/DD/YYYY.
• Transforming measurement units, such as converting product prices to USD.
• Expanding abbreviated values to complete forms, such as replacing the abbreviated U.S. states: NY to
New York, NJ to New Jersey, and so on.
• Removing noise present in data values to attain more meaningful information, such as removing LLC, Inc.,
and Corp. from company names to get the actual names without any noise.
• Reconstructing the values in a standardized format in case they need to be mapped to a new application
or a data hub, like a master data management system.
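A few of the transformations listed above can be sketched in Python. The function names, the two-entry state table, and the sample record are hypothetical and chosen only for illustration. Converting a phone number to an integer drops any leading zeros, so a real pipeline may prefer to keep the cleaned digits as a string.

```python
import re
from datetime import datetime

STATE_NAMES = {"NY": "New York", "NJ": "New Jersey"}  # illustrative subset only
COMPANY_NOISE = re.compile(r",?\s+(LLC|Inc\.?|Corp\.?)$", re.IGNORECASE)

def standardize_phone(raw):
    """Drop every non-digit character and return the number as an int."""
    return int(re.sub(r"\D", "", raw))

def standardize_date(raw, source_format):
    """Parse a date in a known source format and re-emit it as MM/DD/YYYY."""
    return datetime.strptime(raw, source_format).strftime("%m/%d/%Y")

def expand_state(code):
    """Expand a known state abbreviation; unknown values pass through unchanged."""
    return STATE_NAMES.get(code.strip().upper(), code)

def strip_company_noise(name):
    """Remove a trailing LLC, Inc., or Corp. suffix from a company name."""
    return COMPANY_NOISE.sub("", name.strip())

record = {
    "company": "Acme Widgets, Inc.",
    "phone": "(555) 010-9999",
    "state": "ny",
    "joined": "2024-07-19",
}
clean = {
    "company": strip_company_noise(record["company"]),
    "phone": standardize_phone(record["phone"]),
    "state": expand_state(record["state"]),
    "joined": standardize_date(record["joined"], "%Y-%m-%d"),
}
```

After a transformation like this, the retest step would check every value in `clean` against the defined standard again.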
The terms integration and interoperability are often used interchangeably, but they are very different
creatures, and understanding the impact of an “integration strategy” versus an “interoperability strategy”
can have a dramatic effect on your product and business.
Let's start with some definitions:
Integration:
A connection between two or more products or systems, enabling communication, usually with the use of
“middleware” to translate each system’s data.
Interoperability:
A characteristic of a product or system to be capable of communicating with any other products or systems
that speak the same language (i.e., have a common standards-based interface)
Imagine that you are visiting an international conference with people from across the globe, and that
everyone speaks a different language. How do you communicate with everybody? There are three basic
options:
1. Everyone learns every single language spoken at the conference
2. Hire a translator for every language at the conference
3. Everyone learns ONE common language
Options 1 and 2 are “integrations”, where 1 is a direct integration (everyone learns everyone else’s language)
and 2 integrates via “middleware” (i.e., the translator).
If you only ever want a few people at your conference, “integration” will probably work just fine. But, if you
want a lot of participants that all interact with each other, “interoperability” is the way to go.
Option 3 is “interoperability” - everybody learns the same ONE common language. The benefits of
interoperability in this scenario are clear:
• You only need to learn ONE language, no matter how many people are at the conference
• When a new person arrives, they only need to learn ONE language, and no one else needs to learn
anything new
When systems have interoperability, they communicate via a common language with no translation required.
Phones, faxes, railroads, AM/FM radio, the web, and email are all examples of technological interoperability.
In the world of email, the different products Outlook, Hotmail and Yahoo all use a common data format to
transmit emails from one system to another. When Gmail was introduced, it did not add more complexity to
the network - and the other systems did not need to do any extra work, because the email ecosystem was
interoperable with any platform that used published public email protocols and standards. The
interoperability of faxes has incredible value, as every single fax machine is built to communicate with every
other fax machine.
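The conference analogy can be turned into a small calculation. This sketch assumes, as in the analogy, that each of the n participants speaks a different language. The function names are made up for this example.

```python
def direct_integration_effort(n):
    """Option 1: each of n participants learns the other n - 1 languages."""
    return n * (n - 1)

def interoperability_effort(n):
    """Option 3: each of n participants learns the one common language."""
    return n

# Total languages learned for a 10-person conference.
direct_10 = direct_integration_effort(10)
common_10 = interoperability_effort(10)

# Extra learning caused by one more arrival.
direct_growth = direct_integration_effort(11) - direct_10
common_growth = interoperability_effort(11) - common_10
```

The direct approach grows quadratically with the number of participants, while the common-language approach grows linearly. This is why interoperability scales better as more systems join.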
Web services are internet software components that communicate through standardized messaging protocols in a
distributed environment. They integrate web-based applications over the network using standards such as REST,
SOAP, WSDL, and UDDI. For example, a Java web service can communicate with a .NET application.
A JSON database is a document-type NoSQL database, ideal for storing semi-structured data. It is much
more flexible than the row-and-column format, which is fixed and expensive to change even for small
schema updates.
In relational databases that lack a native JSON type, JSON data needs to be parsed into columns or stored as
text, for example in an NVARCHAR column (LOB storage). Document databases like MongoDB, however, can store
JSON data in its natural format, which is readable by humans and machines.
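A minimal sketch of the "store JSON as text" approach, using Python's json and sqlite3 modules. The sessions table and the two sample documents are assumptions made for illustration. Note that the two documents have different fields, yet no schema change is needed to store them.

```python
import json
import sqlite3

# Two documents with different shapes (hypothetical session data).
documents = [
    {"user": "u1", "cart": ["book", "pen"]},
    {"user": "u2", "theme": "dark", "last_login": "2024-07-19"},
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")

# Serialize each document to a JSON string and store it in a text column.
conn.executemany(
    "INSERT INTO sessions (body) VALUES (?)",
    [(json.dumps(doc),) for doc in documents],
)

# Reading back requires parsing the text into Python objects again.
rows = conn.execute("SELECT body FROM sessions ORDER BY id")
restored = [json.loads(body) for (body,) in rows]

# A field missing from a document is simply absent, not a schema error.
themes = [doc.get("theme") for doc in restored]
```

The database here sees only opaque text, so filtering on a field such as `theme` happens in application code. A document database would index and query such fields directly.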
Introduction:
Encryption is a security method in which information is encoded in such a way that only authorized users can
read it. It uses an encryption algorithm to generate ciphertext that can only be read once it is decrypted.
Data encryption converts data into a different form (code) that can only be accessed by people who have a
secret key (formally known as a decryption key) or password. Data that has not been encrypted is referred
to as plaintext, and data that has been encrypted is referred to as ciphertext. Encryption is one of the most
widely used and successful data protection technologies in today’s corporate world.
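The plaintext, key, and ciphertext relationship can be shown with a toy one-time pad built on XOR. This is only a teaching sketch under simplifying assumptions. Production databases rely on vetted ciphers such as AES through a maintained cryptography library, and the helper names below are invented for this example.

```python
import secrets

def generate_key(length):
    """Create a random secret key with the same length as the message."""
    return secrets.token_bytes(length)

def xor_bytes(data, key):
    """XOR data with the key; the same call both encrypts and decrypts."""
    if len(key) != len(data):
        raise ValueError("one-time pad key must match the data length")
    return bytes(d ^ k for d, k in zip(data, key))

plaintext = "card=4111111111111111".encode("utf-8")
key = generate_key(len(plaintext))

ciphertext = xor_bytes(plaintext, key)                  # unreadable without the key
recovered = xor_bytes(ciphertext, key).decode("utf-8")  # the key restores the plaintext
```

Only a holder of `key` can turn `ciphertext` back into the original text, which is the property the definition above describes.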
Data redaction is a method used to protect sensitive data from being compromised or leaked. It involves the removal
of particular pieces of data from the whole of it, in an effort to keep it from being exposed as a whole and used for
malicious or nefarious purposes.
Basically, this process breaks down data into various pieces of information, and removes or hides portions that can be
used to identify or link to a particular person, company, or organization. For instance, if you have credit card
information of your customers stored in your database, you may choose to redact the first names of all the
cardholders, or the first and last four digits of the card numbers.
Data redaction tools are used by companies all over the world to hide and protect their sensitive data.
Redaction not only helps keep the data secure but also preserves its integrity and authenticity.
1. Static Redaction
In static redaction, the data is copied or moved to a separate location where redaction algorithms and
measures are already in place. It can be used to redact sensitive information from large amounts of data,
but it requires considerable time and resources.
2. Dynamic Redaction
Dynamic data redaction redacts sensitive information in real time, which is why it is also known as
data-in-transit redaction. The data does not have to go through batch processing to be redacted. However,
it is better suited to read-only applications and carries significant performance overhead.
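The two redaction styles can be contrasted in a short Python sketch. The 16-digit card pattern, the placeholder text, and the function names are assumptions made for this example. Real redaction tools use far more thorough detection rules.

```python
import re

CARD_PATTERN = re.compile(r"\b\d{16}\b")  # simplistic: exactly 16 consecutive digits

def redact_cards(text, placeholder="[REDACTED]"):
    """Remove anything that looks like a 16-digit card number."""
    return CARD_PATTERN.sub(placeholder, text)

def static_redaction(records):
    """Build a separate, fully redacted copy of the dataset in one batch."""
    return [redact_cards(record) for record in records]

def read_redacted(records, index):
    """Dynamic style: redact a single record at read time, leaving storage untouched."""
    return redact_cards(records[index])

records = ["Asha paid with 4111111111111111", "No card on file"]

redacted_copy = static_redaction(records)  # batch job, produces a second dataset
on_the_fly = read_redacted(records, 0)     # per-request work, nothing is copied
```

The static version pays its cost once up front, while the dynamic version repeats the work on every read. This matches the trade-off described above.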
Data masking is a technique used to create a version of data that looks structurally similar to the original but
hides (masks) sensitive information. The version with the masked information can then be used for various
purposes, such as user training or software testing. The main objective of masking data is to create a
functional substitute that does not reveal the real data.
Here are several examples of data masking:
• Replacing personally-identifying details and names with other symbols and characters
• Moving details around or randomizing sensitive data like names or account numbers
• Scrambling the data, substituting parts of it for other parts from the same dataset
• Deleting or “nulling out” sensitive values within data records
• Encrypting the data to make it infeasible for unauthorized users to access it without a decryption key
1. Static data masking—involves creating a duplicated version of a dataset, containing fully or partially
masked data. The dummy database is maintained separately from the production database.
2. Dynamic data masking—alters information in real time, as it is accessed by users. This technique is
applied directly to production datasets. It ensures that the original data is seen only by authorized users,
and any non-privileged user sees masked data.
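Static and dynamic masking can be sketched as follows. The row layout, the "keep the last four digits" rule, and all function names are assumptions made for illustration only.

```python
import random
import string

def mask_card(number):
    """Keep the last four digits and replace the rest with 'X'."""
    return "X" * (len(number) - 4) + number[-4:]

def substitute_digits(value, seed=0):
    """Replace each digit with a random digit so the format is preserved."""
    rng = random.Random(seed)
    return "".join(rng.choice(string.digits) if c.isdigit() else c for c in value)

def static_mask(rows):
    """Create a separate masked copy, e.g. for a test or training database."""
    return [{**row, "card": mask_card(row["card"])} for row in rows]

def dynamic_mask(row, user_is_privileged):
    """Mask at access time: only privileged users see the original value."""
    return row if user_is_privileged else {**row, "card": mask_card(row["card"])}

production = [{"name": "Asha", "card": "4111111111111111"}]

test_copy = static_mask(production)
admin_view = dynamic_mask(production[0], user_is_privileged=True)
guest_view = dynamic_mask(production[0], user_is_privileged=False)
fake_account = substitute_digits("4111-1111")
```

In both cases the masked value keeps the length and shape of the original, so applications that expect a card-like string continue to work.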
If you are reading up on data redaction, you would also come across another term: data masking. Both of
these are tools used in data security, but they have some basic differences among them.
While data redaction is the process of removing certain pieces of sensitive or personally identifiable
information, data masking is a process in which sensitive and authentic information is replaced with
inauthentic information that has the same structure.
Data masking is mostly used for creating sample data for testing or training purposes, so that any
personally identifiable information or sensitive data isn’t exposed or manipulated during the production or
testing phase in an organization. This method also keeps the data structure and data types intact, so that
data can be used in applications.
On the other hand, data redaction is used to conceal personally identifiable or classified information within
otherwise comprehensible data, so that no sensitive data gets leaked to the public.
Therefore, we can safely say that while data redaction is a method to ‘remove’ data, data masking is a
method to ‘replace’ data with something in a similar format. In many cases, data redaction is considered to
be a sub-type of data masking.
Authentication is the process of verifying someone's identity by confirming that the person is who they
claim to be.
It is used by both server and client. The server uses authentication when someone wants to access
information and the server needs to know who is accessing it. The client uses it when it wants to confirm
that the server is the one it claims to be.
Server-side authentication is mostly done with a username and password. It can also be done using cards,
retina scans, voice recognition, and fingerprints.
Authentication does not determine what tasks a person can perform in a process or what files they can view,
read, or update. It only identifies who the person or system actually is.
1. Password-based authentication
It is the simplest way of authentication. It requires the password for the particular username. If the password
matches with the username and both details match the system's database, the user will be successfully
authenticated.
3. 2FA/MFA
2FA/MFA, or two-factor/multi-factor authentication, is a stronger level of authentication. It requires an
additional factor, such as a PIN or security question, before it authenticates the user.
4. Single Sign-on
Single Sign-on or SSO is a way to enable access to multiple applications with a single set of credentials. It
allows the user to sign-in once, and it will automatically be signed in to all other web apps from the same
centralized directory.
5. Social Authentication
Social authentication does not require additional security; instead, it verifies the user with the existing
credentials for the available social network.
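Password-based authentication, the first method listed above, can be sketched with Python's standard library. Storing a salted PBKDF2 hash instead of the plain password is an assumption added here as common practice. The in-memory `users` dictionary stands in for the system's database, and the iteration count is illustrative.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # illustrative work factor, not a recommendation

users = {}  # username -> (salt, password hash); stand-in for a real user table

def hash_password(password, salt=None):
    """Derive a hash from the password and a per-user random salt."""
    salt = salt if salt is not None else secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def register(username, password):
    users[username] = hash_password(password)

def authenticate(username, password):
    """Return True only if the username exists and the password matches."""
    record = users.get(username)
    if record is None:
        return False
    salt, expected = record
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, expected)  # constant-time comparison

register("asha", "s3cret-pass")
good_login = authenticate("asha", "s3cret-pass")
bad_password = authenticate("asha", "wrong")
unknown_user = authenticate("ravi", "s3cret-pass")
```

A second factor (2FA/MFA) would add one more check after `authenticate` succeeds, for example a one-time code.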
Authorization is the process of granting someone permission to do something. It is a way to check whether the
user has permission to use a resource or not.
It defines what data and information a user can access. It is also referred to as AuthZ.
Authorization usually works together with authentication, so that the system knows who is accessing the information.
Authorization is not always necessary to access information available over the internet. Some data can be accessed
without any authorization; for example, you can read about any technology on many public websites.
3. SAML
SAML stands for Security Assertion Markup Language. It is an open standard that provides authorization
credentials to service providers. These credentials are exchanged through digitally signed XML documents.
4. OpenID authorization
It helps clients verify the identity of end users on the basis of authentication performed by an authorization server.
5. OAuth
OAuth is an authorization protocol that enables an application to access requested resources through an API on the user's behalf.
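The basic idea of authorization, checking whether an already authenticated user may perform an action, can be sketched with a simple role table. Role-based checking is used here as an assumed example. It is not one of the techniques listed above, and the roles, users, and actions are hypothetical.

```python
# Hypothetical role and permission tables.
ROLE_PERMISSIONS = {
    "admin": {"read", "write", "delete"},
    "analyst": {"read"},
}
USER_ROLES = {"asha": "admin", "ravi": "analyst"}

def is_authorized(username, action):
    """Return True if the user's role grants the requested action."""
    role = USER_ROLES.get(username)
    return action in ROLE_PERMISSIONS.get(role, set())

admin_can_delete = is_authorized("asha", "delete")
analyst_can_read = is_authorized("ravi", "read")
analyst_can_write = is_authorized("ravi", "write")
stranger_can_read = is_authorized("unknown", "read")
```

Authentication answers who the user is. A check like `is_authorized` then decides what that user is allowed to do.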
Database auditing involves observing a database so as to be aware of the actions of database users.
Database administrators and consultants often set up auditing for security purposes, for example, to ensure
that those without permission to access information do not access it.
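One simple way to record actions on a table is a trigger that writes to an audit table. The sketch below uses SQLite through Python's sqlite3 module. The salaries and audit_log tables are assumptions made for this example. SQLite has no user accounts, so this log records what changed but not who changed it. Server databases usually provide built-in auditing that also records the user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE salaries (emp TEXT PRIMARY KEY, amount INTEGER NOT NULL);

    CREATE TABLE audit_log (
        id         INTEGER PRIMARY KEY,
        action     TEXT NOT NULL,
        emp        TEXT NOT NULL,
        old_amount INTEGER,
        new_amount INTEGER,
        logged_at  TEXT DEFAULT CURRENT_TIMESTAMP
    );

    -- Every UPDATE on salaries leaves one row in audit_log.
    CREATE TRIGGER audit_salary_update AFTER UPDATE ON salaries
    BEGIN
        INSERT INTO audit_log (action, emp, old_amount, new_amount)
        VALUES ('UPDATE', OLD.emp, OLD.amount, NEW.amount);
    END;
    """
)

conn.execute("INSERT INTO salaries VALUES ('asha', 50000)")
conn.execute("UPDATE salaries SET amount = 55000 WHERE emp = 'asha'")

audit_rows = conn.execute(
    "SELECT action, emp, old_amount, new_amount FROM audit_log"
).fetchall()
```

An administrator can later query `audit_log` to review which rows were changed, along with the old and new values.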
Limitations of Standardization
Web Services
JSON
Data Encryption
SQL command types include data manipulation language (DML) and data definition language (DDL).
(a) True
(b) False
________ systems are scale-out file-based (HDD) systems moving to more uses of memory in the nodes.
(a) NoSQL
(b) NewSQL
(c) SQL
(d) All of the mentioned
1. https://www.geeksforgeeks.org/use-of-nosql-in-industry/
2. https://dataladder.com/data-standardization-guide-types-benefits-and-process/
3. https://about.caredove.com/blog/integration-vs-interoperability
4. https://www.geeksforgeeks.org/difference-between-authentication-and-authorization/
5. https://www.javatpoint.com/authentication-vs-authorization
6. https://en.wikipedia.org/wiki/Database_audit