M3A1

A - Data Quality Dimensions and Problems

Data Quality Dimensions


Introduction:

In the realm of data management, the concept of data quality (DQ) is pivotal,
serving as the cornerstone for ensuring that data is apt for its intended uses in
operations, decision-making, and planning. Given the diversity in the types of data
and the variety of contexts in which they are used, DQ is inherently a
multidimensional concept. A widely recognized framework by Wang et al.
delineates DQ into four primary categories: intrinsic, contextual, representation,
and access.

Data Quality Dimensions

Intrinsic Data Quality:


The intrinsic category of DQ relates to the veracity and credibility of data.
Accuracy, objectivity, and reputation are the pillars that support this category.
 Accuracy: It is the most fundamental aspect of DQ, ensuring that data
correctly reflects reality. For instance, the accuracy of birth dates in a
database is paramount; any deviation from the actual date constitutes a direct
compromise of data quality.
 Objectivity: Data should be collected and presented without bias, ensuring
that decisions made based on the data are fair and impartial.
 Reputation: The source of data significantly impacts its trustworthiness.
Data from reputable sources is more likely to be accurate and objective.
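An accuracy check of the kind described above is typically implemented by comparing stored values against a trusted reference source. The following is a minimal sketch; both datasets and the names in them are hypothetical illustration data, not a prescribed method:

```python
from datetime import date

# Sketch: accuracy check comparing stored birth dates against a trusted
# reference source. Both datasets are hypothetical illustration data.
stored = {"alice": date(1990, 5, 1), "bob": date(1985, 3, 12)}
reference = {"alice": date(1990, 5, 1), "bob": date(1985, 3, 21)}  # digits transposed

# Any deviation from the reference date is a direct accuracy violation.
inaccurate = [name for name, d in stored.items() if reference.get(name) != d]
print(inaccurate)  # ['bob']
```

In practice the "reference" is an authoritative system of record (e.g., a civil registry extract), and mismatches are routed to the responsible data owner for correction.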

Contextual Data Quality:


Contextual DQ focuses on the practical application of data, assessing its
relevance and utility for specific tasks.
 Completeness: This dimension assesses whether all necessary data is
present. For example, missing values in an 'Email' column can hinder
communication efforts.
 Relevance: Data should be pertinent to the context in which it is used.
Irrelevant data can lead to misinformed decisions and strategies.
 Timeliness: The utility of data is often time-sensitive. Outdated data can
lead to missed opportunities or continued use of inefficient processes.
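A completeness check like the one mentioned for the 'Email' column can be sketched in a few lines of Python. The customer records below are hypothetical illustration data:

```python
# Minimal sketch of a completeness check for an 'Email' field.
# The customer records are hypothetical illustration data.
customers = [
    {"name": "Alice", "email": "alice@example.com"},
    {"name": "Bob", "email": None},   # missing email
    {"name": "Carol", "email": ""},   # empty string also counts as missing
]

def completeness(records, field):
    """Fraction of records whose `field` is present and non-empty."""
    filled = sum(1 for r in records if r.get(field))
    return filled / len(records)

print(f"Email completeness: {completeness(customers, 'email'):.0%}")  # prints "Email completeness: 33%"
```

Measured this way, completeness becomes a number that a data steward can track over time rather than an impression formed by scanning the table.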

Representation Data Quality:
This category assures that data is presented in a manner that is
understandable and interpretable by users.
 Consistency: Uniformity in data representation across different platforms
and datasets is crucial. Consistency ensures that data interpreted in one
context will be understood in the same way in another.
 Interpretability: The data must be in a language and symbols that are
familiar to the user, ensuring clear communication.

Access Data Quality:


Data should be easily and securely accessible to those who need it.
 Accessibility: If data is not readily available to users, it loses its value. Data
locked behind overly stringent security measures may be as inaccessible as
data that is lost.
 Security: While data must be accessible, it must also be protected from
unauthorized access. Balancing these two aspects is a continuous challenge
in DQ management.

Data Quality Problems

The quality of data has become a paramount concern for organizations in our
digitally-driven era. Data quality (DQ) is not a singular, monolithic attribute but a
multidimensional concept where problems can manifest in various forms. From the
genesis of data to its final application, several factors can undermine its integrity.
This paper explores common DQ problems identified within the framework of
multiple data sources, subjective judgment in data production, limited computing
resources, the overwhelming volume of data, and the evolving nature of data
needs.

Multiple data sources: multiple sources holding the same data may produce
duplicates – a problem of consistency.

Subjective judgment in data production: data produced using human judgment
(e.g., opinions) can yield biased information – a problem of objectivity.

Limited computing resources: a lack of sufficient computing resources and/or
digitalization may limit the accessibility of relevant data – a problem of
accessibility.

Volume of data: large volumes of stored data make it difficult to access
needed information in a reasonable time – a problem of accessibility.

Changing data needs: data requirements change on an ongoing basis due to
new company strategies or the introduction of new technologies – a problem
of relevance.

Different processes using and updating the same data: concurrent use and
updates by different processes may leave copies out of sync – a problem of
consistency.
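The first problem above, duplicates arising from multiple sources, is commonly detected by normalizing records to a comparable key and grouping on it. A minimal sketch follows; the two source lists and the normalization rule (trim whitespace, lowercase) are illustrative assumptions, not a general-purpose matching algorithm:

```python
# Sketch: detecting duplicate customer records merged from two sources.
# The records and the normalization rule are illustrative assumptions.
source_a = [{"name": "Ann Lee", "city": "Boston"}]
source_b = [{"name": "ann lee ", "city": "Boston"}]

def key(record):
    """Normalize a record into a comparable deduplication key."""
    return (record["name"].strip().lower(), record["city"].strip().lower())

merged, seen, duplicates = [], set(), []
for rec in source_a + source_b:
    k = key(rec)
    if k in seen:
        duplicates.append(rec)   # same entity already ingested
    else:
        seen.add(k)
        merged.append(rec)

print(len(merged), len(duplicates))  # prints "1 1": one unique record, one duplicate
```

Real deduplication pipelines use fuzzier matching (phonetic codes, edit distance), but the shape is the same: reduce each record to a key, then treat key collisions as candidate duplicates.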
B - Roles in Data Management

Introduction:
In an era where data acts as the lifeblood of organizations, managing this
vital resource demands a variety of specialized roles, each contributing to the
overall quality and value of the data. These roles range from the architects who
design the data structures to the stewards and scientists who extract meaning and
ensure its integrity. This paper outlines the various job profiles within the context
of data management, emphasizing their importance in maintaining high data
quality (DQ) and driving business value.

Information Architect:
At the forefront of data structuring stands the Information Architect, the
visionary who lays down the foundational blueprint of an organization’s data
landscape. Charged with the design of the conceptual data model, the Information
Architect must possess a keen understanding of business processes and be adept at
translating these into IT solutions. By working in tandem with the business users
and database designers, the Information Architect ensures that the data model is
robust, scalable, and flexible, effectively bridging the gap between business needs
and technological capabilities.

Database Designer:
The Database Designer plays the critical role of converting the conceptual
blueprint into a practical framework. This involves the translation of the
conceptual data model into logical and internal data models that underpin the
functioning of database applications. The Database Designer’s domain extends to
aiding application developers with external data model views and establishing
company-wide naming conventions. This uniformity is crucial for future database
maintenance and consistency across the enterprise's data ecosystem.

Data Owner:
Ownership imparts accountability, and in the data realm, the Data Owner is
entrusted with the ultimate authority over data fields within the organization's
databases. This responsibility includes decisions on data access and usage. A Data
Owner must not only understand the data’s meaning but also ensure its currency
and accuracy. In cases where DQ issues arise, the Data Owner is the go-to person
for data stewards to initiate corrective actions, thus playing a pivotal role in the DQ
lifecycle.
Data Steward:
As guardians of DQ, Data Stewards are tasked with the ongoing assessment
and assurance of both business data and metadata quality. Through meticulous and
regular DQ checks, they apply and analyze various DQ indicators and metrics,
taking initiative based on their findings. Although Data Stewards do not correct
data themselves, they are instrumental in identifying and understanding the root
causes of DQ issues and designing preventive measures to eliminate these
problems at the source. Their role is preventative, ensuring that data integrity is
baked into the systems from the very beginning, thus saving costs and resources in
the long run.

Database Administrator:
The Database Administrator (DBA) is the technical custodian of the
database environment. Their role encompasses a broad spectrum of
responsibilities, including the installation, maintenance, and performance
optimization of the DBMS. DBAs are pivotal in disaster recovery planning,
securing data, and ensuring that the data infrastructure runs seamlessly. By
collaborating with network and system managers, and interfacing with database
designers, DBAs help minimize operational costs while maintaining service levels,
thereby directly influencing the performance and reliability of the data
management systems.

Data Scientist:
The Data Scientist is the alchemist of the data management world, turning
raw data into golden insights. With a diverse skill set that spans ICT, quantitative
modeling, business acumen, and creativity, the Data Scientist digs deep into data to
unearth patterns and predictions that inform strategic decisions. Their role is
crucial in interpreting and leveraging data, which, when done correctly, can lead to
breakthroughs in understanding customer behavior and market trends.

Conclusion:
The complexity and scale of modern data management necessitate a
multi-faceted team of professionals, each specializing in different aspects of the data
lifecycle. From the strategic foresight of the Information Architect to the analytical
prowess of the Data Scientist, these roles collectively ensure that data is not only of
high quality but also a driving force for innovation and growth. As organizations
continue to navigate the data-driven landscape, the synergy among these roles will
be crucial in transforming data into actionable business value.
C - Legacy Databases
Introduction:

In the realm of data management, legacy databases serve as a testament to
the evolution of technology and the enduring nature of data as a resource. Despite
their age and perceived obsolescence, these databases still play a significant role in
the current data architecture of many organizations. This paper explores the logical
data models of legacy database technologies, their expressive power, limitations,
and the reasons why they remain relevant in today’s fast-paced technological
landscape.

The Relevance of Legacy Database Technologies:


Legacy databases often remain entrenched in organizations due to historical
implementations and limited IT budgets. Their basic characteristics are essential
knowledge for the maintenance of existing database applications and potential
migration to modern Database Management Systems (DBMSs). Moreover, the
principles underpinning these older systems offer invaluable insights into the
semantic richness of newer technologies. Notably, the procedural Data
Manipulation Language (DML) and navigational access, hallmarks of these legacy
systems, have found their way into more recent databases, such as Object-Oriented
Database Management Systems (OODBMSs).

The Hierarchical Model:


One of the earliest data models, the hierarchical model, emerged during the
Apollo moon program, when IBM developed it to manage the missions' vast
quantities of data. Its best-known implementation, the Information Management
System (IMS), lacks a formal model description and is characterized by structural
limitations, rendering it a legacy technology.

Building Blocks of the Hierarchical Model:


The hierarchical model is built on two main components: record types
and relationship types. Record types represent sets of records that describe
similar entities, such as products or suppliers, each consisting of various
fields or data items. Relationship types define the connections between these
record types, allowing only for hierarchical (1:N) relationships.
Consequently, a parent record may have multiple child records, but a child
record is limited to a single parent. This model inherently supports the
construction of hierarchical structures, with a single root record type at the
top and multiple leaf record types at the bottom.
Expressive Power and Limitations:
The hierarchical model's expressive power is notably restricted. It
supports only 1:N relationship types and does not accommodate N:M or 1:1
relationships without implementing workarounds that lead to a loss of
semantics and data redundancy. The retrieval of record data in this model is
also procedurally driven, requiring navigation from the root node down
through the hierarchy, which is inefficient by modern standards.

Illustrative Example: Department, Employee, and Project Structure


To exemplify the hierarchical model, consider a simple structure
involving departments, employees, and projects. The department record type
includes fields like department number, name, and location, and is linked to
employees and projects through parent-child relationships. Departments can
have multiple employees and projects, but each employee or project is tied
to exactly one department, underscoring the model's rigidity.
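The tree just described can be mimicked with nested records. The field names below follow the text; the sample values are hypothetical. Note how retrieval must navigate from the root (the department) downward, as the model's procedural DML requires:

```python
# Sketch of the hierarchical Department -> Employee/Project structure.
# Field names follow the text; the sample values are hypothetical.
department = {
    "dnumber": 10,
    "dname": "Research",
    "dlocation": "Brussels",
    "employees": [                       # 1:N parent-child relationship
        {"ssn": "123", "ename": "Smith"},
        {"ssn": "456", "ename": "Jones"},
    ],
    "projects": [                        # 1:N parent-child relationship
        {"pnumber": 1, "pname": "ProductX"},
    ],
}

def find_employee(root, ssn):
    """Navigational access: enter at the root and walk down the hierarchy."""
    for emp in root["employees"]:
        if emp["ssn"] == ssn:
            return emp["ename"]
    return None

print(find_employee(department, "456"))  # prints "Jones"
```

There is no way to reach an employee record except through its parent department, which is precisely the navigational, root-first access pattern that makes retrieval in this model inefficient by modern standards.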

Implementing N:M Relationships in the Hierarchical Model:


Implementing N:M relationships within the hierarchical model
necessitates mapping these to 1:N relationships, a suboptimal solution that
often results in redundancy and semantic loss. For instance, in an employee-
project relationship, making the project a parent and the employee a child
would distort the true network structure into an artificial tree structure, with
implications for the integrity and utility of the data.
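The redundancy this workaround introduces can be made concrete: if projects are parents and an employee works on two projects, that employee's child record must be stored once under each project. The records below are hypothetical:

```python
# Sketch: mapping an N:M employee-project relationship onto 1:N hierarchies.
# Each project owns its employee child records, so an employee on two
# projects is duplicated -- the redundancy and semantic loss the text notes.
projects = [
    {"pname": "ProductX", "employees": [{"ssn": "123", "ename": "Smith"}]},
    {"pname": "ProductY", "employees": [{"ssn": "123", "ename": "Smith"}]},
]

copies = sum(
    1
    for proj in projects
    for emp in proj["employees"]
    if emp["ssn"] == "123"
)
print(copies)  # prints 2: Smith is stored once under each parent project
```

Every update to Smith's record must now be applied in two places, and forgetting one silently creates an inconsistency, which is why such workarounds compromise both integrity and utility.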

Conclusion:
The hierarchical model, with its procedural DML and navigational access, is
emblematic of the legacy databases that many organizations still grapple with.
Understanding these models is crucial not only for maintaining and potentially
upgrading these systems but also for appreciating the advanced semantic
capabilities of modern databases. Legacy databases serve as a reminder of the
technological journey that data management has undergone and continue to inform
the development of more sophisticated, efficient, and semantically rich database
technologies.
