Topic 05 Data Management Best Practice

The document outlines best practices for effective data management, emphasizing the importance of a centralized data repository, clear governance policies, stringent access controls, and data security. It also highlights the need for continuous monitoring, predictive analytics, and quality data-management software to enhance decision-making. Additionally, it discusses the roles of data lakes and warehouses in storing and analyzing data, along with the significance of a robust data governance framework for compliance and improved business performance.


Topic 05: Best Practices for Data Management
@CIT 2024

4/9/2024 @CIT 2024 1


Introduction
• Data management is a critical business driver used to
ensure data is acquired, validated, stored, and
protected in a standardized way. It is essential to
develop and deploy the right processes so end users
are confident their data is reliable, accessible, and up
to date.
• To make sure that your data is managed effectively
and efficiently, consider the following ten best
practices for your business.
Best Practices for Data Management
1. Create a Centralized Data Repository
• Creating a centralized data repository is essential for
businesses to derive value from their data and ensure data
integrity and accessibility.
• Why? Because it is crucial to get a 360-degree perspective on
business metrics, and for that, your models need data from
different sources.
• A single source of truth (SSOT) is a data warehouse or
repository that provides a complete and authoritative
picture of a company, allowing users to access all the data
they need without jeopardizing any mission-critical database.
Best Practices for Data Management
• 2. Define Clear Data Governance Policies
• Data governance is the foundation of effective data
management. It involves defining policies, procedures, and
guidelines for managing data throughout its lifecycle.
• By establishing clear data governance policies, organizations
can ensure that data is accurate, consistent, and accessible
to authorized users.
• This includes defining roles and responsibilities, establishing
data quality standards, and implementing data classification
and security measures.



Best Practices for Data Management
• 3. Facilitate Access with Stringent Controls
• Ensure that essential data does not get lost on servers;
give your team access to the information it needs while upholding
the necessary controls. Valuable data often goes underused
because key personnel are unaware that it exists.
• Establish a comprehensive record-keeping system outlining
available data and access privileges. Utilize customizable
dashboards to monitor specific company metrics, enhancing
visibility during company-wide discussions.
• Implement data classification techniques to restrict access based
on job roles. For example, confidential contract details might not
be relevant for a junior market research analyst.
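Restricting access by job role and data classification can be sketched as follows. This is only an illustration: the classification levels, the role names, and the `can_access` helper are assumptions made for the example, not part of any particular product.

```python
from enum import IntEnum


class Classification(IntEnum):
    """Hypothetical classification levels, ordered from least to most sensitive."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3


# Hypothetical mapping from job role to the highest level that role may read.
ROLE_CLEARANCE = {
    "junior_market_research_analyst": Classification.INTERNAL,
    "contracts_manager": Classification.CONFIDENTIAL,
}


def can_access(role: str, data_class: Classification) -> bool:
    """Return True if the role's clearance covers the data's classification."""
    # Unknown roles fall back to PUBLIC, so sensitive data is denied by default.
    clearance = ROLE_CLEARANCE.get(role, Classification.PUBLIC)
    return data_class <= clearance
```

With this mapping, a contracts manager can read confidential contract details, while a junior market research analyst cannot.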



Best Practices for Data Management
• 4. Implement Data Backups and a Disaster Recovery Plan
• Data loss poses significant financial and operational risks. The
loss of critical data can result in downtime and missed
opportunities, impacting your business’s functionality. Various
factors like accidental deletions, cyber theft, or physical damage
to servers can contribute to data loss.
• To ensure continuity, establishing data backups and a disaster
recovery plan is imperative. Engage with your data server
vendors or cloud providers to institute necessary data recovery
protocols and synchronization points. Develop comprehensive
business continuity plans that serve as a reference during
unforeseen circumstances.
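One small building block of a backup routine is a timestamped copy that is verified after it is written. The sketch below assumes local files and uses hypothetical helper names (`sha256_of`, `backup_file`); a real disaster recovery plan would also cover off-site copies, retention, and restore tests.

```python
import hashlib
import shutil
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir under a UTC-timestamped name and verify it."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = backup_dir / f"{source.stem}_{stamp}{source.suffix}"
    shutil.copy2(source, target)  # copy2 also preserves file metadata
    # A backup that does not match its source is not a usable backup.
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"Backup verification failed for {source}")
    return target
```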
Best Practices for Data Management
• 5. Continuously Monitor and Improve Data Management
Processes
• Data management is an ongoing process that requires
continuous monitoring and improvement. Organizations should
regularly assess their data management processes, identify areas
for improvement, and implement necessary changes. This
includes monitoring data quality metrics, analyzing performance
indicators, and soliciting feedback from users.
• By continuously refining data management processes,
organizations can adapt to evolving business needs and ensure
the effectiveness of their data management practices.
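Monitoring data quality metrics can start with very simple measures. The sketch below computes two illustrative ones, completeness and duplicate rate, over a list of records; the function name and the choice of metrics are assumptions made for the example.

```python
def quality_metrics(records, required_fields):
    """Return completeness and duplicate rate for a list of dict records."""
    total = len(records)
    if total == 0:
        return {"completeness": 1.0, "duplicate_rate": 0.0}
    # A record is complete when every required field has a non-empty value.
    complete = sum(
        1
        for record in records
        if all(record.get(name) not in (None, "") for name in required_fields)
    )
    # Records count as duplicates when all of their fields and values match.
    unique = len({tuple(sorted(record.items())) for record in records})
    return {
        "completeness": complete / total,
        "duplicate_rate": (total - unique) / total,
    }
```

Tracking these numbers over time shows whether changes to a data management process are actually improving the data.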



Best Practices for Data Management
• 6. Prioritize Data Security
• Did you know that the cost of data breaches is expected to top $5 trillion by
2024?
• Ensuring the safety of your company’s data should be a top concern for all
teams. Confidential financial information, like budgets and investment plans,
if breached, can result in a loss of competitive advantage, damage to
reputation, and regulatory penalties.
• It’s crucial to verify that any vendors or partners comply with the highest data
protection standards. Implement data security tools that encrypt sensitive
data and facilitate secure sharing.
• Additionally, enforce standard security protocols for accessing data daily,
such as mandating two-factor authentication for employee access to
company apps.
Best Practices for Data Management
• 7. Enhance Decision-Making with Predictive Analytics
• While not a conventional aspect of data management practices, leveraging
predictive analytics is crucial for unlocking the full value of data. Data’s
inherent worth lies in its predictive potential, enabling the forecasting of
future outcomes.
• For example, let’s say you want to anticipate when a customer will pay their
due invoice. Analyzing past payment trends and integrating variables like
previous payment dates and invoice amounts into a predictive model allows
for estimating the likely payment date.
• Integrating the appropriate data into robust forecasting models facilitates
predictions in complex business scenarios. Presently, predictive technologies
are increasingly sophisticated and accessible. Employing tools like predictive
analytics software empowers accurate anticipation of business outcomes,
enabling proactive responses to challenges and opportunities alike.
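The invoice example can be sketched with a deliberately naive baseline: estimate the next payment date by adding the customer's average past delay to the new due date. This is an assumption-laden illustration, not a real predictive model; it ignores invoice amounts and any other variables, and the function name is hypothetical.

```python
from datetime import date, timedelta
from statistics import mean


def estimate_payment_date(history, due_date):
    """Estimate a payment date from past (due_date, paid_date) pairs."""
    # Delay in days for each past invoice; negative values mean early payment.
    delays = [(paid - due).days for due, paid in history]
    average_delay = round(mean(delays)) if delays else 0
    return due_date + timedelta(days=average_delay)


history = [
    (date(2024, 1, 31), date(2024, 2, 5)),  # paid 5 days late
    (date(2024, 2, 29), date(2024, 3, 3)),  # paid 3 days late
    (date(2024, 3, 31), date(2024, 4, 7)),  # paid 7 days late
]
estimate = estimate_payment_date(history, date(2024, 4, 30))
```

Here the average delay is 5 days, so the estimate for an invoice due on 2024-04-30 is 2024-05-05.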
Best Practices for Data Management
• 8. Build Strong File Naming and Cataloging Conventions
• If you are going to utilize data, you have to be able to find it. You can’t
measure it if you can’t manage it. Create a reporting or file system that is
user- and future-friendly—descriptive, standardized file names that will be
easy to find and file formats that allow users to search and discover data sets
with long-term access in mind.
• To list dates, a standard format is YYYY-MM-DD or YYYYMMDD.
• To list times, it is best to use either a Unix timestamp or a standardized
24-hour notation, such as HH:MM:SS. If your company operates nationally or
globally, include the time zone so users can tell where the information
comes from and find it by time zone.
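A naming convention along these lines can be enforced with a small helper. The sketch below is one possible convention, not a standard: it normalizes the time to UTC, writes the date as YYYY-MM-DD, and writes the time as HHMMSS without colons, because colons are not allowed in file names on some systems. The function name and the exact layout are assumptions.

```python
import re
from datetime import datetime, timezone


def standard_filename(dataset: str, created: datetime, extension: str) -> str:
    """Build a sortable file name such as 2024-04-09T143000Z_q1-sales-report.csv."""
    if created.tzinfo is None:
        raise ValueError("created must be timezone-aware")
    # Lowercase the name and collapse anything non-alphanumeric into hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", dataset.lower()).strip("-")
    # Convert to UTC so files from different offices sort consistently.
    stamp = created.astimezone(timezone.utc).strftime("%Y-%m-%dT%H%M%SZ")
    return f"{stamp}_{slug}.{extension}"
```

Putting the date first means that an alphabetical listing of the directory is also a chronological one.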

Best Practices for Data Management
• 9. Carefully Consider Metadata for Data Sets
• Essentially, metadata is descriptive information about the data you are using.
It should contain information about the data’s content, structure, and
permissions so it is discoverable for future use. If you don’t have this specific
information that is searchable and allows for discoverability, you cannot
depend on being able to use your data years down the line.
• Catalog items such as:
• Data author
• What data this set contains
• Descriptions of fields
• When/Where the data was created
• Why this data was created and how
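The catalog items listed above map naturally onto a structured metadata record. The sketch below is a minimal illustration; the class name, field names, and the `matches` search helper are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DatasetMetadata:
    """Descriptive record for one data set (hypothetical field names)."""
    author: str
    contents: str
    field_descriptions: dict  # field name -> description of that field
    created_on: date
    created_where: str
    purpose: str  # why the data was created
    method: str   # how the data was created

    def matches(self, term: str) -> bool:
        """Case-insensitive search across the descriptive text fields."""
        haystack = " ".join(
            [
                self.author,
                self.contents,
                self.purpose,
                self.method,
                *self.field_descriptions.keys(),
                *self.field_descriptions.values(),
            ]
        )
        return term.lower() in haystack.lower()
```

Keeping such records searchable is what makes a data set discoverable years after it was created.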
Best Practices for Data Management
• 10. Invest in Quality Data-Management Software
• When considering these best practices together, it is recommended, if not
required, that you invest in quality data-management software. Putting all
the data you are creating into a manageable working business tool will help
you find the information you need.
• Then you can create the right data sets and data-extract scheduling that
works for your business needs. Data management software will work with
both internal and external data assets and help configure your best
governance plan.
• Tableau offers a Data Management Add-On that can help you create a robust
analytics environment leveraging these best practices.



Data Storage Options
Data can be stored in a:
• Data Lake
• Database
• Data warehouse



Data Lake
• What is a data lake?
• A data lake is a centralized repository that allows you to store
all your structured and unstructured data at any scale.
• You can store your data as-is, without having to first
structure the data, and run different types of analytics—from
dashboards and visualizations to big data processing, real-
time analytics, and machine learning to guide better
decisions.



Why do you need a data lake?
• Organizations that successfully generate business value from
their data will outperform their peers. An Aberdeen survey found that
organizations that implemented a data lake outperformed
similar companies by 9% in organic revenue growth.
• These leaders were able to do new types of analytics like
machine learning over new sources like log files, data from click-
streams, social media, and internet connected devices stored in
the data lake.
• This helped them to identify, and act upon opportunities for
business growth faster by attracting and retaining customers,
boosting productivity, proactively maintaining devices, and
making informed decisions.
What are the essential elements of a data lake and
analytics solution?
• As organizations build data lakes and an analytics
platform, they need to consider a number of key capabilities,
including:
• Data movement
• Data lakes allow you to import any amount of data, including
data that arrives in real time. Data is collected from multiple
sources and moved into the data lake in its original format.
This process allows you to scale to data of any size while
saving the time otherwise spent defining data structures,
schemas, and transformations.



What are the essential elements of a data lake and
analytics solution?
• Securely store and catalog data
• Data lakes allow you to store relational data, such as data from
operational databases and line-of-business applications, and
non-relational data, such as data from mobile apps, IoT devices,
and social media. They also give you the ability to understand
what data is in the lake through crawling, cataloging, and
indexing of data. Finally, data must be secured to ensure your
data assets are protected.



What are the essential elements of a data lake and
analytics solution?
• Analytics
• Data lakes allow various roles in your organization like data
scientists, data developers, and business analysts to access
data with their choice of analytic tools and frameworks.
• This includes open source frameworks such as Apache
Hadoop, Presto, and Apache Spark, and commercial offerings
from data warehouse and business intelligence vendors.
Data lakes allow you to run analytics without the need to
move your data to a separate analytics system.



What are the essential elements of a data lake and
analytics solution?
• Machine Learning
• Data lakes will allow organizations to generate different
types of insights including reporting on historical data, and
doing machine learning where models are built to forecast
likely outcomes, and suggest a range of prescribed actions to
achieve the optimal result.



How does a data warehouse compare to a
data lake?
• Depending on its requirements, a typical organization will need
both a data warehouse and a data lake, as they serve different needs
and use cases.
• A data warehouse is a database optimized to analyze relational data
coming from transactional systems and line-of-business applications.
The data structure and schema are defined in advance to optimize
for fast SQL queries, where the results are typically used for
operational reporting and analysis.
• Data is cleaned, enriched, and transformed so it can act as the “single
source of truth” that users can trust.



How does a data warehouse compare to a
data lake?
• A data lake is different because it stores both relational data from
line-of-business applications and non-relational data from mobile apps,
IoT devices, and social media. The structure of the data, or schema, is
not defined when data is captured.
• This means you can store all of your data without careful design or
the need to know what questions you might need answers for in the
future. Different types of analytics on your data like SQL queries, big
data analytics, full text search, real-time analytics, and machine
learning can be used to uncover insights.
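The difference between defining the schema in advance (warehouse) and applying it when the data is read (lake) can be sketched in a few lines. This is a toy model using in-memory lists and JSON strings; the schema, the function names, and the sample records are all assumptions made for the example.

```python
import json

# Hypothetical warehouse schema, fixed before any data is loaded.
WAREHOUSE_SCHEMA = {"order_id": int, "amount": float}


def load_into_warehouse(row: dict) -> dict:
    """Schema-on-write: reject or convert the row before it is stored."""
    if set(row) != set(WAREHOUSE_SCHEMA):
        raise ValueError(f"Row does not match schema: {sorted(row)}")
    return {name: kind(row[name]) for name, kind in WAREHOUSE_SCHEMA.items()}


def land_in_lake(lake: list, raw: str) -> None:
    """Store the raw payload as-is; no structure is required up front."""
    lake.append(raw)


def read_amounts(lake: list) -> list:
    """Schema-on-read: interpret only the records this question needs."""
    amounts = []
    for raw in lake:
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # not JSON, for example a free-text log line
        if isinstance(record, dict) and "amount" in record:
            amounts.append(float(record["amount"]))
    return amounts
```

The lake accepts an order, a log line, and a sensor reading alike; only when a question about amounts is asked does any structure get imposed.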



What is a Database?
• A database is an organized collection of data stored in a computer
system and usually controlled by a database management system
(DBMS). The data in common databases is modeled in tables, making
querying and processing efficient. Structured query language (SQL) is
commonly used for data querying and writing.
• Databases are an essential part of daily life. Many everyday
activities involve interacting with a database, for example at
the bank, at the railway station, in school, or in a grocery store.
In each of these settings, a large amount of data must be stored in
one place and retrieved easily.
• A database supports daily operations; it is operational in nature.
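As a small illustration of tables, SQL, and operational use, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `accounts` table and its rows are invented for the example.

```python
import sqlite3

# An in-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Data is modeled as a table with typed columns.
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, owner TEXT NOT NULL, balance REAL NOT NULL)"
)

# Writing: insert rows, then record a deposit as a day-to-day operation.
conn.executemany(
    "INSERT INTO accounts (owner, balance) VALUES (?, ?)",
    [("Amina", 120.0), ("Brian", 75.5)],
)
conn.execute(
    "UPDATE accounts SET balance = balance + ? WHERE owner = ?",
    (30.0, "Brian"),
)
conn.commit()

# Querying: fetch the current balances with SQL.
rows = conn.execute(
    "SELECT owner, balance FROM accounts ORDER BY owner"
).fetchall()
```

The `?` placeholders pass values separately from the SQL text, which avoids SQL injection when values come from users.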
Data Governance Framework
• A data governance framework is a set of rules,
processes, and responsibilities that dictate how an
organization collects, organizes, stores, and uses its
data.
• The goal of a data governance framework is to set a
standard on how data is managed (to ensure its
integrity), leveraged by internal teams, and protected
from security risks.



The importance of data governance frameworks

• Data democratization
• A data governance framework allows you to establish data
democratization, giving employees of all technical skill sets
the ability to access and act on data.
• This autonomy and confidence in data allows teams to
accurately set goals, measure performance, strategize, and
discover new opportunities.



The importance of data governance frameworks
• Standardized and trustworthy data
• An important aspect of good data governance is clear
guidelines on how to label and categorize data. Guidelines
allow you to standardize data that the entire organization
can trust.
• Efforts to standardize data may include creating a shared
data dictionary to ensure consistency across teams in what is
being tracked and in the naming conventions used.
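A shared data dictionary can be as simple as an agreed list of field names with definitions, plus a check that incoming records use them. The sketch below is illustrative only; the dictionary entries, the alias table, and the `standardize_fields` helper are assumptions made for the example.

```python
# Hypothetical shared data dictionary: canonical field name -> definition.
DATA_DICTIONARY = {
    "customer_id": "Unique identifier assigned to a customer",
    "signup_date": "Date the customer registered, as YYYY-MM-DD",
}

# Hypothetical team-specific spellings mapped to the canonical names.
ALIASES = {"cust_id": "customer_id", "CustomerID": "customer_id"}


def standardize_fields(record: dict) -> dict:
    """Rename known aliases and reject fields missing from the dictionary."""
    standardized = {}
    for name, value in record.items():
        canonical = ALIASES.get(name, name)
        if canonical not in DATA_DICTIONARY:
            raise KeyError(f"Field '{name}' is not in the data dictionary")
        standardized[canonical] = value
    return standardized
```

Rejecting unknown fields forces a conversation about adding them to the dictionary, which is how the dictionary stays authoritative.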



The importance of data governance frameworks
• Compliance with regulatory requirements
• The global rise of customer data privacy regulations, such as
the European Union’s General Data Protection Regulation
(GDPR), has made it necessary for organizations to know
exactly how they collect, store, and use data. Now, certain
privacy regulations dictate that a user has a right to request
their personal data be deleted by an organization, or that
data needs to be stored and processed locally (i.e., data
sovereignty laws).
• A data governance framework helps ensure that a business
adheres to these larger privacy and security regulations.



The importance of data governance frameworks
• Improved business performance
• Data governance sets clear processes for the collection,
storage, and use of data. When employees know how to
collect and where to find important data, the results are
improved efficiency and data accuracy.



Data governance framework models and examples
• The models are based on how data governance decisions will
flow through your organization.
• Top-down: Company leadership implements data
governance policies that are then passed down to individual
business units and shared with the rest of the company.
• Bottom-up: Employees at the lower levels implement data
governance practices, such as standardizing naming
conventions, which spread to the higher levels of the
organization.
• Center-out: The team or individual responsible for data
governance sets data standards that the entire organization
follows.
Data governance framework models and examples

• Silo-in: Various departments come together to align on data
governance while keeping in mind the needs of each group.
• Hybrid: Data governance decisions involve different levels of
the organization. For example, a company uses a center-out
model to suggest a course of action but employs a top-down
model to make the final decision.



Data security and privacy considerations
• Data security protects information from unauthorized
access, use, and disclosure. It also protects it from
disruption, modification, or destruction.

• Data privacy is the right to control who gets to see
your personal information, such as credit card numbers
and bank account balances.



Discussion Questions
• What are the threats to data security?
• How can organizations deal with data security
threats?



End of Presentation
• Thank you for listening

