M3A1
In data management, data quality (DQ) is pivotal: it determines whether data is
fit for its intended uses in operations, decision-making, and planning. Because
data types and the contexts in which data is used vary widely, DQ is inherently a
multidimensional concept. A widely recognized framework by Wang et al.
groups DQ dimensions into four categories: intrinsic, contextual, representation,
and access.
The quality of data has become a paramount concern for organizations in our
digitally driven era. Data quality (DQ) is not a single, monolithic attribute but a
multidimensional concept, and problems can manifest in various forms. From the
creation of data to its final use, several factors can undermine its quality.
This paper explores common DQ problems that arise from multiple data sources,
subjective judgment in data production, limited computing resources, the
overwhelming volume of data, and the evolving nature of data needs.
Multiple data sources: multiple sources with the same data may produce
duplicates – a problem of consistency.
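The duplicate problem can be illustrated with a minimal Python sketch. It assumes two hypothetical sources (here called crm and billing) that describe customers by name and email, and it treats two records as the same customer when those fields match after trimming spaces and ignoring case. Real matching rules are usually richer; the field names and the matching rule are assumptions made for illustration only.

```python
from collections import defaultdict


def normalize(record):
    """Build a comparison key from name and email, ignoring case and outer spaces."""
    return (record["name"].strip().lower(), record["email"].strip().lower())


def find_duplicates(*sources):
    """Group records from all sources by key; keep only keys seen more than once."""
    groups = defaultdict(list)
    for source_name, records in sources:
        for record in records:
            groups[normalize(record)].append((source_name, record))
    return {key: hits for key, hits in groups.items() if len(hits) > 1}


# Hypothetical sample data: the same customer is stored slightly differently
# in each source.
crm = [{"name": "Ann Lee", "email": "ann@example.com"}]
billing = [
    {"name": " ann lee", "email": "ANN@example.com"},
    {"name": "Bob Roy", "email": "bob@example.com"},
]

duplicates = find_duplicates(("crm", crm), ("billing", billing))
```

Here duplicates holds a single key for Ann Lee with one entry per source, while Bob Roy, who appears in only one source, is not reported.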
Introduction:
In an era where data acts as the lifeblood of organizations, managing this
vital resource demands a variety of specialized roles, each contributing to the
overall quality and value of the data. These roles range from the architects who
design the data structures to the stewards and scientists who extract meaning and
ensure its integrity. This paper outlines the various job profiles within the context
of data management, emphasizing their importance in maintaining high data
quality (DQ) and driving business value.
Information Architect:
At the forefront of data structuring stands the Information Architect, the
visionary who lays down the foundational blueprint of an organization’s data
landscape. Charged with the design of the conceptual data model, the Information
Architect must possess a keen understanding of business processes and be adept at
translating these into IT solutions. By working in tandem with the business users
and database designers, the Information Architect ensures that the data model is
robust, scalable, and flexible, effectively bridging the gap between business needs
and technological capabilities.
Database Designer:
The Database Designer plays the critical role of converting the conceptual
blueprint into a practical framework. This involves the translation of the
conceptual data model into logical and internal data models that underpin the
functioning of database applications. The Database Designer’s domain extends to
aiding application developers with external data model views and establishing
company-wide naming conventions. This uniformity is crucial for future database
maintenance and consistency across the enterprise's data ecosystem.
Data Owner:
Ownership implies accountability, and in the data realm, the Data Owner
holds ultimate authority over data fields within the organization's
databases, including decisions on data access and usage. A Data
Owner must not only understand the data’s meaning but also ensure that it is
current and accurate. When DQ issues arise, data stewards turn to the Data Owner
to initiate corrective action, which gives the Data Owner a pivotal role in the DQ
lifecycle.
Data Steward:
As guardians of DQ, Data Stewards are tasked with the ongoing assessment
and assurance of both business data and metadata quality. Through meticulous and
regular DQ checks, they apply and analyze various DQ indicators and metrics,
taking initiative based on their findings. Although Data Stewards do not correct
data themselves, they are instrumental in identifying and understanding the root
causes of DQ issues and designing preventive measures to eliminate these
problems at the source. Their role is preventative, ensuring that data integrity is
baked into the systems from the very beginning, thus saving costs and resources in
the long run.
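As an illustration of the kind of DQ indicator a Data Steward might track, the following Python sketch computes completeness per field, that is, the share of records in which a field is filled in. The sample records, the field names, and the 0.75 threshold are assumptions chosen for the example, not values prescribed by any framework.

```python
def completeness(records, field):
    """Share of records whose value for `field` is present and non-empty."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


# Hypothetical customer records with some missing values.
customers = [
    {"id": 1, "email": "ann@example.com", "country": "BE"},
    {"id": 2, "email": "", "country": "BE"},
    {"id": 3, "email": "bob@example.com", "country": None},
    {"id": 4, "email": None, "country": "NL"},
]

# Completeness score per field, and the fields that fall below the
# (assumed) threshold of 0.75.
report = {field: completeness(customers, field) for field in ("email", "country")}
flagged = [field for field, score in report.items() if score < 0.75]
```

In this sample, email is complete in two of four records and country in three of four, so only email is flagged. In line with the role described above, the steward would pass such a finding to the Data Owner rather than correct the data directly.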
Database Administrator:
The Database Administrator (DBA) is the technical custodian of the
database environment. Their role encompasses a broad spectrum of
responsibilities, including the installation, maintenance, and performance
optimization of the DBMS. DBAs are pivotal in disaster recovery planning,
securing data, and ensuring that the data infrastructure runs seamlessly. By
collaborating with network and system managers, and interfacing with database
designers, DBAs help minimize operational costs while maintaining service levels,
thereby directly influencing the performance and reliability of the data
management systems.
Data Scientist:
The Data Scientist is the alchemist of the data management world, turning
raw data into golden insights. With a diverse skill set that spans ICT, quantitative
modeling, business acumen, and creativity, the Data Scientist digs deep into data to
unearth patterns and predictions that inform strategic decisions. Their role is
crucial in interpreting and leveraging data, which, when done correctly, can lead to
breakthroughs in understanding customer behavior and market trends.
Conclusion:
The complexity and scale of modern data management necessitate a
multifaceted team of professionals, each specializing in different aspects of the data
lifecycle. From the strategic foresight of the Information Architect to the analytical
prowess of the Data Scientist, these roles collectively ensure that data is not only of
high quality but also a driving force for innovation and growth. As organizations
continue to navigate the data-driven landscape, the synergy among these roles will
be crucial in transforming data into actionable business value.
C- Legacy Databases
Introduction:
Conclusion:
The hierarchical model, with its procedural DML and navigational access, is
emblematic of the legacy databases that many organizations still grapple with.
Understanding these models is crucial not only for maintaining and potentially
upgrading these systems but also for appreciating the advanced semantic
capabilities of modern databases. Legacy databases serve as a reminder of the
technological journey that data management has undergone and continue to inform
the development of more sophisticated, efficient, and semantically rich database
technologies.
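To make navigational access concrete, the following Python sketch models a small hierarchy of records and retrieves data by walking from parent to child one record at a time. It is an illustration only: the Record class, the record types, and the sample data are hypothetical and do not reproduce the DML of any actual hierarchical DBMS.

```python
class Record:
    """One record in a hierarchy: a single parent, any number of children."""

    def __init__(self, record_type, data):
        self.record_type = record_type
        self.data = data
        self.children = []

    def add_child(self, child):
        self.children.append(child)
        return child


def find_employees(root, department_name):
    """Navigate from the root, record by record, to one department's employees."""
    names = []
    for department in root.children:
        if department.record_type != "department":
            continue
        if department.data["name"] != department_name:
            continue
        for employee in department.children:
            if employee.record_type == "employee":
                names.append(employee.data["name"])
    return names


# Hypothetical hierarchy: company -> departments -> employees.
company = Record("company", {"name": "Acme"})
sales = company.add_child(Record("department", {"name": "Sales"}))
sales.add_child(Record("employee", {"name": "Ann"}))
sales.add_child(Record("employee", {"name": "Bob"}))
hr = company.add_child(Record("department", {"name": "HR"}))
hr.add_child(Record("employee", {"name": "Cy"}))

sales_staff = find_employees(company, "Sales")
```

The application code spells out how to reach the data, step by step along the parent-child links. A declarative query language, by contrast, lets the user state only what data is wanted and leaves the access path to the DBMS.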