
Session 2

The document discusses key aspects of Metadata Management and Data Quality within the context of Data Management. It defines metadata as 'data about data' and categorizes it into descriptive, structural, and administrative types, while also outlining the six core dimensions of data quality. Additionally, it highlights the importance of managing data quality throughout its lifecycle and provides an overview of various data management knowledge areas and job titles related to data science.

1.2.

1.4.8 Metadata Management

Data Management

Metadata is “data about data.”

Business Intelligence tools produce various types of Metadata

Metadata is a kind of data, and it should be managed as such.

Metadata is typically categorized into three types: descriptive (describes the
content of the data), structural or technical (describes the technical details
and the systems that store the data), and administrative or operational
(describes how the data is processed and accessed).
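
As a rough illustration of the three metadata types, the sketch below attaches one record of each type to a single dataset. All class names, field names, and sample values are hypothetical and chosen only for this example; they do not come from the DMBOK or from any particular metadata tool.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List


@dataclass
class DescriptiveMetadata:
    """Describes the content of the data (hypothetical fields)."""
    title: str
    definition: str
    keywords: List[str] = field(default_factory=list)


@dataclass
class StructuralMetadata:
    """Technical details and the system that stores the data (hypothetical fields)."""
    system: str
    table: str
    columns: Dict[str, str]  # column name -> data type


@dataclass
class AdministrativeMetadata:
    """How the data is processed and accessed (hypothetical fields)."""
    owner: str
    last_loaded: date
    access_roles: List[str] = field(default_factory=list)


@dataclass
class DatasetMetadata:
    """Groups the three metadata types for one dataset."""
    descriptive: DescriptiveMetadata
    structural: StructuralMetadata
    administrative: AdministrativeMetadata


# Illustrative values only.
customer_metadata = DatasetMetadata(
    descriptive=DescriptiveMetadata(
        title="Customer",
        definition="A person or organization that has purchased a product.",
        keywords=["customer", "sales"],
    ),
    structural=StructuralMetadata(
        system="sales_db",
        table="customer",
        columns={"customer_id": "INTEGER", "email": "VARCHAR(255)"},
    ),
    administrative=AdministrativeMetadata(
        owner="Sales Operations",
        last_loaded=date(2024, 1, 15),
        access_roles=["analyst", "steward"],
    ),
)

print(customer_metadata.structural.columns["email"])
```

Keeping the three types in separate structures mirrors the categorization above: each type tends to have a different owner and a different rate of change, so each can be managed as data in its own right.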

ISO/IEC 11179 (metadata registries standard)


1.4.9 Data quality


The term data quality refers both to the characteristics associated with high-quality
data and to the processes used to measure or improve the quality of data.

The quality of data should be managed across the data lifecycle.

Data quality management applies quality management techniques to data
to ensure that it is fit for consumption and meets the needs of data
consumers.

Formal data quality management is similar to continuous quality management
for other products.

A Data Quality program should focus on the data most critical to the enterprise
and its customers.

The focus of a Data Quality program should be on preventing data errors.
In 2013, DAMA UK produced a white paper describing six core dimensions of data quality:

• Completeness: The proportion of data stored against the potential for 100%.

• Uniqueness: No entity instance (thing) will be recorded more than once based
upon how that thing is identified.

• Timeliness: The degree to which data represent reality from the required point
in time.

• Validity: Data is valid if it conforms to the syntax (format, type, range) of its
definition.

• Accuracy: The degree to which data correctly describes the ‘real world’ object
or event being described.

• Consistency: The absence of difference, when comparing two or more
representations of a thing against a definition.
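
Several of these dimensions can be expressed as simple ratios. The sketch below is a minimal illustration over a small in-memory list of records, not the method from the DAMA UK paper: the sample data, function names, email rule, and 30-day freshness threshold are all assumptions made for this example. Accuracy and consistency are left out because they need a trusted reference or a second representation to compare against.

```python
import re
from datetime import date

# Hypothetical sample records; row 3 duplicates id 2 and has a malformed email.
records = [
    {"id": 1, "email": "ana@example.com", "updated": date(2024, 1, 10)},
    {"id": 2, "email": None, "updated": date(2024, 1, 12)},
    {"id": 2, "email": "bob@example", "updated": date(2023, 6, 1)},
    {"id": 3, "email": "cy@example.com", "updated": date(2024, 1, 11)},
]

# Deliberately simple syntax rule (an assumption, not a full email validator).
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def completeness(rows, column):
    """Share of rows in which `column` is populated."""
    filled = sum(1 for r in rows if r.get(column) not in (None, ""))
    return filled / len(rows)


def uniqueness(rows, key):
    """Number of distinct `key` values divided by the number of rows."""
    return len({r[key] for r in rows}) / len(rows)


def validity(rows, column, pattern):
    """Share of populated values that conform to the syntax rule."""
    values = [r[column] for r in rows if r.get(column) not in (None, "")]
    if not values:
        return 1.0
    return sum(1 for v in values if pattern.match(v)) / len(values)


def timeliness(rows, column, as_of, max_age_days):
    """Share of rows updated within `max_age_days` of the `as_of` date."""
    fresh = sum(1 for r in rows if (as_of - r[column]).days <= max_age_days)
    return fresh / len(rows)


print("completeness(email):", completeness(records, "email"))
print("uniqueness(id):     ", uniqueness(records, "id"))
print("validity(email):    ", validity(records, "email", EMAIL_PATTERN))
print("timeliness(updated):", timeliness(records, "updated", date(2024, 1, 15), 30))
```

Each function returns a value between 0 and 1, so the scores can be tracked over time or compared against a target agreed with data consumers.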


Abstract of Knowledge Areas

1. Data Governance provides direction and oversight for data management
by establishing a system of decision rights over data that accounts for the
needs of the enterprise.

2. Data Architecture defines the blueprint for managing data assets by
aligning with organizational strategy to establish strategic data requirements
and designs to meet these requirements.

3. Data Modeling and Design is the process of discovering, analyzing,
representing, and communicating data requirements in a precise form called
the data model.

4. Data Storage and Operations includes the design, implementation, and
support of stored data to maximize its value. Operations provide support
throughout the data lifecycle, from planning to disposal of data.

5. Data Security ensures that data privacy and confidentiality are
maintained, that data is not breached, and that data is accessed
appropriately.

6. Data Integration and Interoperability includes processes
related to the movement and consolidation of data within and between data
stores, applications, and organizations.

7. Document and Content Management includes planning, implementation, and control
activities used to manage the lifecycle of data and information found in a range of
unstructured media, especially documents needed to support legal and regulatory
compliance requirements.

8. Reference and Master Data includes ongoing reconciliation and maintenance of core
critical shared data to enable consistent use across systems of the most accurate, timely,
and relevant version of truth about essential business entities.

9. Data Warehousing and Business Intelligence includes the planning, implementation,
and control processes to manage decision support data and to enable knowledge workers
to get value from data via analysis and reporting.

10. Metadata includes planning, implementation, and control activities to enable access
to high quality, integrated Metadata, including definitions, models, data flows, and other
information critical to understanding data and the systems through which it is created,
maintained, and accessed.

11. Data Quality includes the planning and implementation of quality management
techniques to measure, assess, and improve the fitness of data for use within an
organization.

Context diagram of knowledge areas

1.2. Job Titles of Data Scientists

•Data Scientist •Chief Actuary of GeoSpatial Analytics and Modeling •Director, Business Planning & Analytics
•Data Analyst •Chief Strategy & Analytics Officer •Assistant Professor
•Research & Analytics Director •Database Manager •Machine Learning Engineer
•Business Analyst •Customer Analytics & Pricing •Python Developer
•Project Coordinator •Data Visualization Analyst •Analytics Officer
•Director - Advanced Analytics •Assistant Vice President •Executive Director
•Chief Credit & Analytics Officer •Research Analyst •Director, Big Data Analytics and Segmentation
•Director, Business Intelligence and Analytics •Director of Technology •Data Engineer
•Chief Analytic Officer •Chief Analytics & Algorithms Officer •Database Administrator
•Data Learning Engineer •Data Architect •Strategic Data Analytics Analyst
•Chief Analytic Officer •Statistician •Data and Analytics Manager
•Director of Risk Analytics and Policy •AI Product Manager •Director, Data Warehousing & Analytics
•GIS Analyst •Information Security Analyst •AI Architect
•Data Visualizers •Research Analyst •Data Science Director
•Chief Technology Officer •Statistical Modeling and Analytics •Data Ecologists
•Health Analytics •Principal Big Data Architect •Forensic Data Analytics
•Director Marketing Analytics •Customer Analytics •Data Manager
•Big Data Developer •Web Analytics •Director, Database Marketing & Analytics
•Data Developer •Risk and Business Analytics •Director of Analytics
•Clinical Analytics •Geospatial Data Scientist •Reporting/Analytics
•Big Data Architect •R&D Engineer Data Scientist •Predictive Analytics
•Python Data Developer

Presentation

Group size: 4-6 students with similar preferred jobs

Duration: 15-20 minute presentation per group

Subject:
Introduce the job, its characteristics, and the required skills.
Present the most important knowledge areas (based on the DMBOK) for this job.
Present potential techniques, tools, and methods in these knowledge areas.

We will have one presentation per week.


Now form your groups, identify the members, and schedule the date of your
presentation.
