
Components of Data Mgmt

The document outlines the components of data management, including data governance, architecture, administration, integration, and master data management. It emphasizes the importance of data quality, governance principles, and compliance with regulations such as HIPAA, GDPR, and CCPA. Best practices for data management are also discussed, including data cleaning, lineage, traceability, and the establishment of standards and procedures.

Uploaded by

ganesh697todkari
Copyright
© © All Rights Reserved

Components of Data Management
Data Management
• Data Governance
• Data Architecture
• Database Administration
• Data Integration
• Master Data Management
Data Governance
Data governance is a set of principles and practices that ensure data is
managed and used consistently throughout an organization.

Data governance is the data management discipline that focuses on the quality, security and availability of an organization’s data. It helps ensure data integrity and data security by:
• Defining and implementing policies, standards and procedures for data collection, ownership, storage, processing and use.
• Maintaining data quality, data lineage and data traceability.
• Cleaning data.
• Conducting data audits.
Who Does Data Governance?
• Data governance is jointly owned by IT and the business.
• It needs support from upper management.
Roles in Data Governance

• Chief Data Officer


• Data steward
• Database Administrators
• Data Stakeholders
• Data Users
Data Governance Principles
• Data accuracy
• Data accessibility
• Data consistency
• Data compliance
• Data integrity
• Data stewardship
• Data transparency
Best Practices for Data Governance
• Set format standards for your data.
• Account for unmanaged data.
• Map your business goals for governance.
• Focus on simplicity in most areas.
• Establish governance team roles.
• Classify and tag all of your data.
• Measure progress with multiple metrics
• Automate as much as possible.
Data Governance Program
A data governance program needs to include the following:
• Sponsorship from both senior management and business units
• A data steward manager to support, train, and coordinate the data stewards
• Data stewards for different business units, data subjects, source systems, or
combinations of these elements
• A governance committee, headed by one person, but composed of data steward
managers, executives and senior vice presidents, IT leadership (e.g., data
administrators), and other business leaders, to set strategic goals, coordinate
activities, and provide guidelines and standards for all enterprise data management
activities

The goals of data governance are transparency, both within the organization and outside it to regulators, and increasing the value of data maintained by the organization.
Other aspects of Data Governance
Data Quality
The degree to which data is accurate, consistent, complete, and reliable
for its intended use. Data quality is a key part of data governance,
which ensures that data is managed, protected, and used consistently.

Why Data Quality is important :


• Minimize IT project risk
• Make timely business decisions
• Ensure regulatory compliance
• Expand the customer base
Characteristics of Quality Data
Summary of the eight characteristics:
• Uniqueness: each entity is present only once in the database.
• Accuracy: agreement with a recognised authority or system.
• Consistency: values across databases and tables are consistent.
• Completeness: data have values assigned if they need to have values.
• Timeliness: when data will be available; timestamp on data created; retention period.
• Currency: the degree to which data are recent enough to be useful.
• Conformance: data are stored, exchanged, or presented in the specified format.
• Referential integrity: all of the data's references are valid.
Characteristics of Quality Data
• Uniqueness : means that each entity exists no more than once within the database, and there is a key that can be used to
uniquely access each entity. This characteristic requires identity matching (finding data about the same entity) and
resolution to locate and remove duplicate entities.
• Accuracy : has to do with the degree to which any data correctly represents the real-life object it models. Often accuracy is
measured by agreement with some recognized authority data source (e.g., one source system or even some external data
provider). Data must be both accurate and precise enough for their intended use. For example, knowing sales accurately is
important, but for many decisions, knowing sales only to the nearest $1000 per month for each product is sufficient. Data
can be valid (i.e., satisfy a specified domain or range of values) and not be accurate.
• Consistency : means that values for data in one data set (database) are in agreement with the values for related data in
another data set (database). Consistency can be within a table row (e.g., the weight of a product should have some
relationship to its size and material type), between table rows (e.g., two products with similar characteristics should have
about the same prices, or data that are meant to be redundant should have the same values), between the same
attributes over time (e.g., the product price should be the same from one month to the next unless there was a price
change event), or within some tolerance (e.g., total sales computed from orders filled and orders billed should be roughly
the same values). Consistency also relates to attribute inheritance from super- to subtypes. For example, a subtype
instance cannot exist without a corresponding supertype, and overlap or disjoint subtype rules are enforced.
• Completeness : refers to data having assigned values if they need to have values. This characteristic encompasses the NOT
NULL and foreign key constraints of SQL, but more complex rules might exist (e.g., male employees do not need a maiden
name but female employees may have a maiden name). Completeness also means that all data needed are present (e.g., if
we want to know total dollar sales, we may need to know both total quantity sold and unit price, or if an employee record
indicates that an employee has retired, we need to have a retirement date recorded). Sometimes completeness has an
aspect of precedence. For example, an employee in an employee table who does not exist in an applicant table may
indicate a data quality issue.
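
The uniqueness and completeness rules above can be checked mechanically. The following is a minimal sketch in Python, assuming records are held as plain dictionaries. The field names (`emp_id`, `status`, `retirement_date`) and the helper functions are hypothetical and chosen only to mirror the retired-employee example in the text.

```python
from collections import Counter

# Hypothetical employee records; field names are illustrative assumptions.
employees = [
    {"emp_id": 1, "name": "Asha", "status": "active", "retirement_date": None},
    {"emp_id": 2, "name": "Ravi", "status": "retired", "retirement_date": None},
    {"emp_id": 2, "name": "Ravi K.", "status": "retired", "retirement_date": "2023-03-31"},
]

def find_duplicate_keys(records, key):
    """Uniqueness: return key values that occur more than once."""
    counts = Counter(r[key] for r in records)
    return sorted(k for k, n in counts.items() if n > 1)

def find_incomplete(records):
    """Completeness: a retired employee must have a retirement date."""
    return [r for r in records
            if r["status"] == "retired" and r["retirement_date"] is None]

duplicates = find_duplicate_keys(employees, "emp_id")
incomplete_names = [r["name"] for r in find_incomplete(employees)]
print(duplicates)
print(incomplete_names)
```

Resolving the duplicates that such a check finds (deciding which record survives) is a separate identity-resolution step and is not shown here.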
Characteristics of Quality Data
• Timeliness : means meeting the expectation for the time between when data are expected and when they
are readily available for use. As organizations attempt to decrease the latency between when a business
activity occurs and when the organization is able to take action on that activity, timeliness is becoming a
more important data quality characteristic (i.e., if we don’t know in time to take action, we don’t have
quality data). A related aspect of timeliness is retention, which is the span of time for which data represent
the real world. Some data need to be time-stamped to indicate from when to when they apply, and missing
from or to dates may indicate a data quality issue.
• Currency : is the degree to which data are recent enough to be useful. For example, we may require that
customers’ phone numbers be up-to-date so we can call them at any time, but the number of employees
may not need to be refreshed in real time. Varying degrees of currency across data may indicate a quality
issue (e.g., if the salaries of different employees have drastically different updated dates).
• Conformance : refers to whether data are stored, exchanged, or presented in a format that is as specified by
their metadata. The metadata include both domain integrity rules (e.g., attribute values come from a valid
set or range of values) and actual format (e.g., specific location of special characters, precise mixture of text,
numbers, and special symbols).
• Referential integrity : means that data that refer to other data need to be unique and satisfy requirements to exist (i.e.,
satisfy any mandatory one or optional one cardinalities).
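
Conformance, currency and referential integrity can be tested in the same spirit. This sketch assumes a phone-number format rule, a maximum record age and a customer/order relationship. All of these, along with the function names, are illustrative assumptions rather than rules taken from the slides.

```python
import re
from datetime import date

# Hypothetical data; names, formats and thresholds are illustrative assumptions.
customers = {
    "C001": {"phone": "+91-9876543210", "updated": date(2024, 1, 10)},
    "C002": {"phone": "98765", "updated": date(2019, 6, 1)},
}
orders = [
    {"order_id": 1, "customer_id": "C001"},
    {"order_id": 2, "customer_id": "C999"},
]

PHONE_FORMAT = re.compile(r"^\+\d{1,3}-\d{10}$")  # assumed metadata format rule

def nonconforming(customers):
    """Conformance: customers whose phone does not match the specified format."""
    return sorted(cid for cid, c in customers.items()
                  if not PHONE_FORMAT.match(c["phone"]))

def stale(customers, as_of, max_age_days):
    """Currency: customers not refreshed within max_age_days of as_of."""
    return sorted(cid for cid, c in customers.items()
                  if (as_of - c["updated"]).days > max_age_days)

def orphans(orders, customers):
    """Referential integrity: orders whose customer does not exist."""
    return [o["order_id"] for o in orders if o["customer_id"] not in customers]

bad_format = nonconforming(customers)
out_of_date = stale(customers, as_of=date(2024, 6, 1), max_age_days=365)
orphan_orders = orphans(orders, customers)
```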
Data Cleaning
• Data cleaning is the process of fixing or removing incorrect,
corrupted, incorrectly formatted, duplicate, or incomplete data within
a dataset.
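
As an illustration of the definition above, here is a minimal cleaning sketch. It assumes contact records with `name` and `email` fields and treats the lower-cased email as the duplicate key. Both choices are assumptions made for the example.

```python
def clean(records):
    """Trim whitespace, normalise email case, drop incomplete rows and duplicates."""
    seen = set()
    cleaned = []
    for r in records:
        name = (r.get("name") or "").strip()
        email = (r.get("email") or "").strip().lower()
        if not name or "@" not in email:   # incomplete or incorrectly formatted
            continue
        if email in seen:                  # duplicate of an earlier record
            continue
        seen.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned

raw = [
    {"name": "  Asha ", "email": "Asha@Example.com"},
    {"name": "Asha", "email": "asha@example.com "},
    {"name": "", "email": "nobody@example.com"},
    {"name": "Ravi", "email": "not-an-email"},
]
result = clean(raw)
```

This sketch removes bad rows. Whether to fix them instead (for example by looking up a missing name) depends on the governance policy in place.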
Metadata Management
• Metadata management involves capturing, organizing, and maintaining
metadata, which provides essential context and information about the data
itself.

Types of metadata:
• Administrative Metadata
• Technical Metadata
• Operational Metadata
• Structural Metadata
• Usage Metadata
Data Lineage
• Data lineage is the process of tracking the flow of data over time,
providing a clear understanding of where the data originated, how it
has changed, and its ultimate destination within the data pipeline.

Data lineage helps with:
• Transparency through visibility
• Faster debugging and improved quality
Data Traceability
• Data traceability is the ability to ensure that your data is completely
traceable across the entire landscape.
• This allows you to easily follow your data all the way back to its
original source.
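
One simple way to picture lineage and traceability is to have every transformation step record what it read and what it produced. The sketch below assumes an in-memory `Dataset` class and an `apply_step` helper. Both are hypothetical and do not correspond to any particular lineage tool.

```python
from dataclasses import dataclass, field

@dataclass
class Dataset:
    name: str
    rows: list
    lineage: list = field(default_factory=list)  # ordered history of steps

def apply_step(ds, step_name, func, new_name):
    """Apply func to the rows and append a lineage entry describing the step."""
    entry = {"step": step_name, "input": ds.name, "output": new_name}
    return Dataset(new_name, func(ds.rows), ds.lineage + [entry])

def trace_to_source(ds):
    """Traceability: follow the lineage back to the original source name."""
    return ds.lineage[0]["input"] if ds.lineage else ds.name

source = Dataset("crm_export", [{"amount": "10"}, {"amount": "x"}, {"amount": "5"}])
valid = apply_step(source, "drop_non_numeric",
                   lambda rows: [r for r in rows if r["amount"].isdigit()],
                   "crm_valid")
typed = apply_step(valid, "cast_amount_to_int",
                   lambda rows: [{"amount": int(r["amount"])} for r in rows],
                   "crm_typed")

steps = [e["step"] for e in typed.lineage]
origin = trace_to_source(typed)
```

Because each entry names its input and output, a wrong value in `crm_typed` can be followed back step by step to `crm_export`, which is the debugging benefit listed above.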
Data Audits
Data audits are a systematic review of how an organization manages its data,
and can help identify issues and improve data management practices.

Data Audits help with:


• Compliance
• Security
• Data accuracy
• Data management
• Data retention
Example
• Data Security Policy
• All sensitive data must be encrypted at rest and in transit using industry-standard
protocols.
• Security incidents must be reported within 24 hours to the Data Security Officer.
• Standards for Data Security
• Incident Response Standards
• Security incidents must be logged with timestamps, source IP, and impacted systems.
• Incidents must be categorized into severity levels (e.g., Critical, High, Medium, Low) based on
their impact and urgency.
• Procedures for Data Security
• Incident Management
• Triage incidents based on severity and assign a response team.
• Contain the breach by isolating affected systems from the network.
• Investigate root causes, apply fixes, and restore operations.
• Document findings and actions taken for post-incident analysis.
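
The logging standard and the triage procedure in this example could be sketched as follows. The severity levels and the 24-hour window come from the text above. The `Incident` class, its field names, and the choice to start the 24-hour clock at detection time are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

SEVERITIES = ("Critical", "High", "Medium", "Low")  # levels named in the standard
REPORTING_WINDOW = timedelta(hours=24)              # from the policy above

@dataclass
class Incident:
    detected_at: datetime      # timestamp required by the logging standard
    source_ip: str
    impacted_systems: list
    severity: str

    def __post_init__(self):
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity}")

    def report_deadline(self):
        """Assumption: the 24-hour window is counted from detection."""
        return self.detected_at + REPORTING_WINDOW

def triage(incidents):
    """Order incidents most severe first, then oldest first."""
    return sorted(incidents,
                  key=lambda i: (SEVERITIES.index(i.severity), i.detected_at))

t0 = datetime(2024, 5, 1, 9, 0, tzinfo=timezone.utc)
queue = triage([
    Incident(t0, "10.0.0.5", ["hr-db"], "Low"),
    Incident(t0 + timedelta(hours=1), "10.0.0.9", ["billing-api"], "Critical"),
])
first = queue[0]
```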
Example
• Data Integration Policy :
• All integrations between systems must use approved APIs with secure authentication
protocols.
• Data mappings must be documented and reviewed before implementation.

• Standards for Data Integration


• Data Mapping and Transformation Standards
• Standardized mapping templates must be used for field-level alignment between systems.
• Transformations must be documented, including logic and tools used (e.g., ETL software).

• Procedures for Data Integration


• Data Mapping and Transformation
• Use the organization’s standard mapping template to align source and target fields.
• Perform transformations using approved ETL tools like Talend or Informatica.
• Validate transformed data with sample records to ensure accuracy.
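
A mapping template of the kind this example describes can be as simple as a table of target field, source field and transformation. The sketch below assumes such a table held as a Python dictionary. The field names and the `transform` and `validate_samples` helpers are hypothetical, and a real ETL tool would express the same idea in its own configuration.

```python
# Hypothetical mapping template: target field -> (source field, transformation).
MAPPING = {
    "customer_id": ("CustID", str.strip),
    "full_name":   ("Name", str.title),
    "balance":     ("Bal", float),
}

def transform(source_row, mapping=MAPPING):
    """Build a target row by applying each documented transformation."""
    return {target: func(source_row[src])
            for target, (src, func) in mapping.items()}

def validate_samples(sample_source, sample_expected, mapping=MAPPING):
    """Compare transformed sample records with the expected target records."""
    return [transform(s, mapping) == e
            for s, e in zip(sample_source, sample_expected)]

sample = {"CustID": " 42 ", "Name": "asha rao", "Bal": "10.50"}
expected = {"customer_id": "42", "full_name": "Asha Rao", "balance": 10.5}
target = transform(sample)
checks = validate_samples([sample], [expected])
```

Keeping the mapping as data rather than code makes it easy to review before implementation, which is what the policy asks for.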
Data Architecture

• Data architecture describes how data is managed, from collection through to transformation, distribution and consumption.
• It is the blueprint for data and the way it flows through data storage systems.

• Design for data architecture includes


• Conceptual Data Model
• Logical Data Model
• Physical Data Model
• Data Architecture Pillars
• Business architecture
• Data architecture
• Applications architecture
• Technical architecture
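
To make the difference between the logical and physical data models concrete, the sketch below expresses one small model both ways. The conceptual model is assumed to be "a customer places orders". The entity names, attributes and the choice of SQLite are illustrative assumptions.

```python
import sqlite3
from dataclasses import dataclass

# Logical model: entities and attributes, independent of any database product.
@dataclass
class Customer:
    customer_id: int
    name: str

@dataclass
class Order:
    order_id: int
    customer_id: int   # relationship: each order belongs to one customer
    total: float

# Physical model: concrete tables, types and constraints for SQLite.
DDL = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    total       REAL NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(DDL)
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
```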
Data Administration
• Planning - Hardware and Software needs
• Design - space requirements, estimate performance
• Implementation – Install software, create databases, transfer data
• Operation – Monitor performance, backup and recovery
• Growth and Change – monitor and forecast storage needs
• Security
• Maintenance
Data Integration
• Data integration is the process of combining and harmonizing data from multiple sources into a
unified, coherent format.
• The integrated data is put to use for various analytical, operational and decision-making purposes.

• How it works
• Data Source identification
• Data Extraction
• Data mapping
• Data Validation and QA
• Data Transformation
• Data Loading
• Data Synchronization
• Data Governance and Security
• Metadata Management
• Data Access and analysis
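
The extraction, validation, transformation and loading steps listed above can be sketched end to end with the Python standard library. The CSV content, the `payments` table and the function names are hypothetical. A production pipeline would normally use a dedicated ETL tool.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; in practice this would come from a file or API.
SOURCE_CSV = "id,amount,currency\n1,10.5,usd\n2,,usd\n3,7,eur\n"

def extract(text):
    """Data extraction: read source rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def validate_rows(rows):
    """Data validation: keep only rows that have an amount."""
    return [r for r in rows if r["amount"]]

def transform(rows):
    """Data transformation: cast types and normalise the currency code."""
    return [(int(r["id"]), float(r["amount"]), r["currency"].upper())
            for r in rows]

def load(conn, rows):
    """Data loading: write rows to the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS payments "
                 "(id INTEGER PRIMARY KEY, amount REAL, currency TEXT)")
    conn.executemany("INSERT OR REPLACE INTO payments VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(validate_rows(extract(SOURCE_CSV))))
loaded = conn.execute(
    "SELECT id, amount, currency FROM payments ORDER BY id").fetchall()
```

Using `INSERT OR REPLACE` keyed on `id` means the load can be re-run without creating duplicates, which is one simple form of the synchronization step.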
Master Data Management
• Master data management (MDM) is an approach to managing an organization's critical data across the
enterprise.
• It uses technology, tools and processes to create a unified
master data service that consolidates key enterprise data
assets.
• It involves establishing workflows to streamline these processes
and ensure consistent data handling across the organization.
• It is supported by a well-defined data model and solid data
stewardship.
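
Consolidation into a single master record is often described as building a "golden record". The sketch below assumes two source systems that describe the same customer and a simple survivorship rule: the most recently updated non-empty value wins. The rule, the field names and the `build_golden_record` helper are all assumptions chosen for illustration.

```python
def build_golden_record(records):
    """Merge records for one entity; the newest non-empty value wins."""
    golden = {}
    # ISO dates sort correctly as strings, so older records are applied first.
    for rec in sorted(records, key=lambda r: r["updated"]):
        for field_name, value in rec.items():
            if field_name in ("source", "updated"):
                continue                    # bookkeeping fields, not master data
            if value not in (None, ""):
                golden[field_name] = value  # later records overwrite earlier ones
    return golden

crm = {"source": "crm", "updated": "2024-01-05",
       "customer_id": "C001", "email": "asha@example.com", "phone": ""}
billing = {"source": "billing", "updated": "2024-03-01",
           "customer_id": "C001", "email": "", "phone": "+91-9876543210"}

golden = build_golden_record([billing, crm])
```

Real MDM workflows usually let data stewards define survivorship rules per attribute, so this single rule should be read as the simplest possible case.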
Data Management: Summary
• Data Governance
• Policies
• Standards and Procedures
• Data Quality
• Data Lineage, Traceability
• Audits
• Data Architecture
• Data modelling
• Conceptual, Logical, Physical design
• Database Administration
• Database creation and maintenance
• Data Security
• Database Backup and Recovery
• Performance
• Data Integration
• Combining data from multiple sources
• ETL Process
• Master Data Management
• Processes to create a unified master data service that consolidates key enterprise data assets
Compliance Management
• Compliance management ensures that organizations adhere to
relevant laws, regulations, and internal policies.
• It involves identifying applicable standards and implementing
procedures to meet these requirements.
• The goal is to minimize legal risks and protect the organization’s
reputation and operational integrity.
• Data compliance is the act of handling and managing personal and
sensitive data in a way that adheres to regulatory requirements,
industry standards and internal policies involving data security and
privacy.
Data compliance regulations and standards - HIPAA
• Standard: HIPAA
• Standard description: Health Insurance Portability and Accountability Act
• Country: USA
• Standard for: Guidelines for patients' protected health information (PHI)
• Applicable to: healthcare providers and insurance plans; business associates with access to PHI, providers of data transmission services, medical transcription service providers, software companies, insurance firms, and more.
Data compliance regulations and standards - GDPR
• Standard: GDPR
• Standard description: General Data Protection Regulation
• Country: European Union
• Standard for: Guidelines for personally identifiable information (PII)
• Applicable to: organizations within and outside Europe. It requires them to be transparent about their data collection practices and grants individuals greater control over their PII.
Data compliance regulations and standards - CCPA
• Standard: CCPA
• Standard description: California Consumer Privacy Act
• Country: USA
• Standard for: Guidelines for personally identifiable information (PII)
• Applicable to: businesses. It places the onus on them to be transparent about their data practices and empowers individuals to have more control over their personal information.
Data compliance regulations and standards - PCI-DSS
• Standard: PCI-DSS
• Standard description: Payment Card Industry Data Security Standards
• Country: not country-specific; issued by the Payment Card Industry Security Standards Council (independent regulatory body)
• Standard for: Guidelines to safeguard credit card data
• Applicable to: those handling cardholder data, as mandated by credit card companies
Data Standards
Data standards are policies or best practices that determine how types
of data should be formatted and what metadata and documentation
needs to be included.

Implementing data standards makes it easier for data managers and scientists to maintain uniformity across all of the organization's data.
Standards make data usable by more than just the project or person that created the data.
Standards are useful for integrating data from multiple sources.
Standards Bodies
• The International Organization for Standardization (ISO)
• provides requirements, specifications, guidelines or characteristics that can be used
consistently to ensure that materials, products, processes and services are fit for
their purpose.
• The International Electrotechnical Commission (IEC)
• standards for electrical and electronic technologies.
• The International Telecommunication Union (ITU)
• allocates global radio spectrum and satellite orbits,
• develops technical standards for interconnecting networks and other technologies in
international telecommunications.
• Institute of Electrical and Electronics Engineers (IEEE)
• a technical professional organization
• serves a global community through its highly cited publications, conferences,
technology standards, and professional and educational activities.
