Cloud Data Governance and Catalog Data Sheet 4152en
Cloud Data Governance and Catalog
While digital transformation has been an organizational priority for years, the
focus for most enterprises has shifted towards scaling digital business by
leveraging modern, AI-enabled applications. However, to drive trusted outcomes
with AI, enterprises need accurate and reliable data. The need to harness large
volumes of trustworthy data, manage complex data landscapes and minimize
fragmentation has made this task increasingly challenging.
To effectively leverage data and AI for various use cases, including enhancing
customer experience, enabling innovation and ensuring greater compliance
with regulatory authorities, data consumers must be able to trust and have
visibility into the data. Comprehensive, AI-powered, data intelligence is essential
for organizations aspiring to accelerate value from their digital transformation
initiatives.
Cloud Data Governance and Catalog: Predictive Data Intelligence for Data and AI Governance
The Informatica® Cloud Data Governance and Catalog (CDGC), a service of
the Informatica Intelligent Data Management Cloud™ (IDMC), combines data
governance, data catalog and data quality capabilities into a singular tool for
automating data intelligence insights. This IDMC service is built for organizations
that want to maximize their investments by deriving value from their vast
data assets.
CDGC delivers predictive data intelligence powered by the Informatica CLAIRE® AI
and ML engine.
• Enhance data-driven decision-making by improving data literacy with faster insights
• Advance data asset usability with business context via automation, bulk curation
and crowdsourcing
• Assess and mitigate exposure risks with sensitive data discovery and automatic assignment
of recommended data policies
• Enable trusted use of governed AI models and their underlying data
• Deliver trustworthy data and develop a data governance framework
informatica.com
Organizations that drive business value from trusted data can leverage automated and recommendation-
driven data classification, bulk data curation, relationship discovery and sensitive data discovery. Just
as importantly, they can provide data consumers with the business context they need through metadata
insights. The IDMC service enables efficient self-service analytics and data governance by unifying
the capabilities of data discovery, data lineage, data profiling, data quality, business glossary creation,
stakeholder and policy management, and the ability to document and govern AI models and their
implementations.
CDGC integrates into your existing data landscape and scans hybrid sources, including cloud data lakes
and warehouses, analytics/BI systems, databases, ETL tools and other enterprise systems. The IDMC
service is cloud-native, meaning you can deploy it into your existing infrastructure almost immediately and
at the scale needed.
Key Capabilities
CDGC offers broad and deep metadata connectivity that spans multi-cloud and on-premises environments.
It allows you to extract metadata across the source categories listed below.
The IDMC service provides a centralized, comprehensive view of your data. It features universal metadata
connectivity, supporting nearly all your data sources. Additionally, it provides a runtime option to run
serverless or within your on-premises or virtual private cloud.
Inspect scripts, procedures and processes to fully understand logic and internal data flow. Obtain complete
column-level data lineage, including an inventory of potential lineage sources with rich details. Scan static
and dynamic code and perform language parsing for automated data lineage across the enterprise.
With the CDGC custom metadata framework, you can use simple Excel files to ingest custom metadata.
You can also derive data lineage and relationship links from critical systems where automated scanners
are unavailable. Model virtually any data source or data lineage across systems.
• Informatica Solutions and Capabilities: PowerCenter, data integration, multidomain master data
management (MDM) and business 360 applications
• Cloud Platforms: Amazon Web Services (AWS) S3, AWS Redshift, AWS RDS (Oracle, MS SQL Server,
PostgreSQL, MySQL), DynamoDB, Azure SQL DB, Azure Synapse, Azure ADLS Gen 2, Azure Blob, Google
Cloud Storage, Google BigQuery, Snowflake, Databricks Delta Tables, Oracle Cloud Storage, Oracle ADB
• On-Premises: Oracle, IBM Db2, Netezza, SQL Server, Teradata, JDBC, MySQL, SAP HANA DB, Postgres,
MongoDB, Local/Shared Filesystem
• Database Scripts: DB2 LUW SQL, Microsoft SQL Server SQL, Oracle SQL, Snowflake SQL, Teradata BTEQ
• BI and Analytics Platforms: Tableau, Microsoft Power BI, QlikView, Qlik Sense, Qlik Sense Cloud,
Microsoft SSRS, Cognos, Google Looker, TIBCO Spotfire, Oracle BI
• Other ETL and Data Science Platforms: Azure Data Factory, Databricks Notebooks, Databricks Unity
Catalog, Microsoft SSIS, Microsoft SSAS, Talend
• Enterprise Applications: Salesforce, Kafka, Workday, Marketo, SAP BW, SAP BW/4HANA, SAP ECC, SAP
S/4HANA, SAP Business Objects, Dynamics CRM, Microsoft OneDrive, Microsoft SharePoint
The solution offers automated data discovery and classification to reduce the time and effort spent on
tedious manual processes that do not scale. Data stewards can curate, review and accept more than 215
out-of-the-box automated data classification associations recommended by CLAIRE. Users can further
modify and extend these classifications or add new ones as needed.
CDGC also learns from associations and can auto-tag similar fields and columns across the enterprise
using rule-based and AI-based methodologies. The IDMC service automatically associates glossary terms
to data and infers relationships, such as joins among datasets, using AI/ML capabilities, including
schema matching.
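To make the idea of rule-based auto-tagging concrete, here is a minimal sketch in Python. It is not CDGC's implementation or API, which this data sheet does not describe. The `Rule` class, the `RULES` list, the `recommend_labels` function and the 0.8 match threshold are all hypothetical names and values chosen for illustration. The sketch recommends a classification label when either the column name or most of the sampled values match a pattern.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical classification rule (illustrative only, not a CDGC object)."""
    label: str
    name_pattern: re.Pattern   # searched for inside the column name
    value_pattern: re.Pattern  # matched against each sampled value


# Two example rules; a real rule set would be far larger and curated by stewards.
RULES = [
    Rule("EMAIL",
         re.compile(r"e[-_]?mail", re.IGNORECASE),
         re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")),
    Rule("US_SSN",
         re.compile(r"ssn|social", re.IGNORECASE),
         re.compile(r"^\d{3}-\d{2}-\d{4}$")),
]


def recommend_labels(column_name, sample_values, min_match_ratio=0.8):
    """Return labels whose name rule or value rule matches the column.

    min_match_ratio is an assumed threshold: the share of non-empty
    sample values that must match before a label is recommended.
    """
    values = [v for v in sample_values if v]
    labels = []
    for rule in RULES:
        name_hit = bool(rule.name_pattern.search(column_name))
        matched = sum(1 for v in values if rule.value_pattern.match(v))
        value_hit = bool(values) and matched / len(values) >= min_match_ratio
        if name_hit or value_hit:
            labels.append(rule.label)
    return labels
```

In this sketch the output is only a recommendation. In the workflow described above, a data steward would still review each suggested association and accept or decline it.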
With the CLAIRE activity page, users can view analytics related to automated glossary and classification
associations, including metrics on accepted, pending and declined associations. The page provides a
central location to identify and act on pending curation actions. These insights help drive the usage of
automated associations powered by CLAIRE and can be utilized to calculate the time saved for curation
activities across the organization.
Figure 1. CLAIRE activity analytics provide a summary of intelligent glossary and classification associations across
the organization.
Figure 2. Quickly find data assets using powerful semantic search capabilities.
Figure 3. Interactive graphical views of data relationships help users discover and understand assets across the organization.
Figure 4. Explore data-element-level lineage from source to target and gain an additional understanding of your data by
displaying graphical overlays, including data quality scores.
Figure 5. Collaboration capabilities include the ability to drive discussions and share comments and user ratings for data
assets. Users can also certify technical assets.
AI Model Governance
AI model governance advances explainable AI by providing organizational visibility and transparency
into models and their underlying algorithms, which are a black box for many organizations. It details
how a model was developed, the training data used to create it, the quality and lineage of that data and
the relevant policies. AI model governance also helps track and monitor model performance and key metrics,
such as data drift, that may lead to model performance degradation and unreliable business outcomes.
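As an illustration of what a data drift metric can look like, the sketch below computes the population stability index (PSI), one common drift measure, between a baseline sample and a current sample of a numeric feature. This data sheet does not say which drift metrics CDGC uses, so treat the metric choice, the function name `population_stability_index` and the default of 10 equal-width bins as assumptions made for this example.

```python
import math


def population_stability_index(baseline, current, bins=10, eps=1e-6):
    """Population stability index between two numeric samples.

    Bins are equal-width over the baseline range; values outside that
    range are counted in the first or last bin. Empty bins are floored
    at eps so the logarithm stays defined.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant baselines

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        return [max(c / len(values), eps) for c in counts]

    p, q = shares(baseline), shares(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))
```

A PSI of 0 means the two binned distributions are identical, and larger values indicate a larger shift. The threshold at which a shift should trigger model review is a judgment call for each organization.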
Goal-Oriented Dashboards
Interactive and graphical dashboards put the user in command, providing summarized information in
a visual form, including stakeholder/owner assignments and glossary metrics. Users can also monitor
automated predefined workflows, check task completions and view notifications. With a variety of
visualizations and drill-down capabilities, users can quickly view the summary status and explore details
as needed.
Figure 6. Manage your business, technical and governance assets from centralized, configurable, interactive dashboards.
Key Benefits
Advance Data Asset Usability with Business Context via Automation, Bulk Curation
and Crowdsourcing
Boost productivity and maximize the usability and value of data by automating and augmenting common
data management tasks at scale. With CDGC, automatically scan data and metadata across cloud and on-
premises environments and streamline common data curation processes. Data professionals can spend
less time on tedious tasks and focus on higher-value work with AI-enabled capabilities such as automatic
data classification, automatic association of business glossary terms to technical data assets and bulk
data curation. CDGC also captures crowdsourced tags, annotations, ratings, and reviews to further increase
the value of data. This “wisdom of crowds” helps with data enrichment and curation, making it even more
valuable throughout the organization while encouraging collaboration among stakeholders.
Assess and Mitigate Exposure Risks with Sensitive Data Discovery and Automatic Assignment
of Recommended Data Policies
CDGC includes more than 215 out-of-the-box classifications to facilitate the automatic discovery and
classification of potentially sensitive data. This includes data relevant to regulatory frameworks
such as GDPR, as well as PII, PHI and PCI-related data. The IDMC service can also automatically assign
recommended data policies to relevant data classifications. These classifications allow data stewards to
use data lineage to quickly identify datasets and sharing activity that may indicate potential privacy and
similar exposure risks.
With improved transparency, your data protection and data sharing plans can help ensure compliance
with policies for sensitive information use. This approach helps limit customer and intellectual property
information exposure and aids in averting risks from abuse and data loss.
Enable Trusted Use of Governed AI Models and Their Underlying Data
In this age of data science, AI models are often opaque, built with poor quality
datasets and potentially noncompliant with organizational policies. AI model
governance capabilities provide insights into AI models and the data used to train
models. Insights are also provided for the outputs produced, related policies’
potential impact and which models are available for reuse. This approach
ensures that models that are used are relevant, their lineage is understood, and
policies applied are checked. It also provides visibility into data drift to check the
impact on the model’s prediction capability. Informatica offers a holistic solution
for integrated governance of AI models and the data the models utilize.
Next Steps
To learn more about intelligent data governance tools that can help you
connect data consumers with trusted data, visit
www.informatica.com/cloud-data-governance-and-catalog.
Worldwide Headquarters
2100 Seaport Blvd., Redwood City, CA 94063, USA Phone: 650.385.5000, Toll-free in the US: 1.800.653.3871
Informatica (NYSE: INFA) brings data and AI to life by empowering businesses to realize
the transformative power of their most critical assets. When properly unlocked, data
becomes a living and trusted resource that is democratized across your organization,
turning chaos into clarity. Through the Informatica Intelligent Data Management Cloud™,
companies are breathing life into their data to drive bigger ideas, create improved
processes, and reduce costs. Powered by CLAIRE®, our AI engine, it’s the only cloud
dedicated to managing data of any type, pattern, complexity, or workload across any
location — all on a single platform.
IN17-4152-0124
© Copyright Informatica LLC 2024. Informatica and the Informatica logo are trademarks or registered trademarks of
Informatica LLC in the United States and other countries. A current list of Informatica trademarks is available on the
web at https://www.informatica.com/trademarks.html. Other company and product names may be trade names or
trademarks of their respective owners. The information in this documentation is subject to change without notice
and provided “AS IS” without warranty of any kind, express or implied.