ANH – NGUYEN

Promotion 2022 – 2024

Master of Science in
Health Management & Data Intelligence

Data Governance in Pharmaceuticals

Date of Defense: 27 November 2023


Name of Supervisor: ARNAUD JAOUL

Copyright emlyon business school


By agreement with the CFC
Transfer and reproduction prohibited
Msc Health Management & Data Intelligence

Table of Contents

1 Acknowledgments
2 Abstract
3 Introduction
3.1 Overview of Data Governance and its importance in the pharmaceutical industry.
3.1.1 Data Management
3.1.2 Data Governance
3.1.3 Data Integrity
3.1.4 Why Data Governance is crucial in ensuring Data Integrity
3.2 Research topic and gap in the literature.
3.3 Purposes of the study.
3.4 Outline of the structure of the thesis
4 Literature Review
4.1 Data governance in the pharmaceutical industry
4.1.1 Data Lifecycle in BioPharma
4.1.2 Ensuring Data Lifecycle strategies
4.1.3 Data Integrity practices
4.1.4 Nine common issues with Data Integrity in Pharmaceutical
4.1.5 Ensuring Data Integrity strategies
4.1.6 Opportunities for the future
4.2 Review the existing data governance models and their applicability in the pharmaceutical context.
4.2.1 Centralized Data Governance Model
4.2.2 Federated Data Governance Model
4.2.3 Hybrid Data Governance Model
4.2.4 Data Governance in Cloud Computing environments
4.2.5 Challenges and Considerations in general
4.3 Critical success factors (CSFs) for Data Governance in general
4.3.1 Employee data competencies
4.3.2 Clear data processes and procedures
4.3.3 Flexible data tools and technologies
4.3.4 Standardized easy-to-follow data policies
4.3.5 Established data roles and responsibilities
4.3.6 Clear inclusive data requirements
4.3.7 Focused and tangible data strategies
5 Research Methodology
5.1 The conceptual model guiding the thesis
5.2 Research context and philosophy: The criteria for selecting participants.
5.3 Instruments/measures
5.4 Data analysis approach
6 Discussion of Research Results combined with Literature Review
6.1 Data Governance Practices at Pharmaceuticals from Survey Results
6.1.1 Deployment of Data Governance Program
6.1.2 Data Governance in Cloud Computing environment
6.1.3 Data Governance best practices for Policies Compliance in Pharmaceutical
6.1.4 Awareness about importance of data governance
6.1.5 Challenges and obstacles of implementing data governance program
7 Critical Success Factors (CSFs) for Implementing Data Governance in Pharmaceutical Industry
7.1 Sponsorship from Leaders
7.2 Employee Data Competency Training
7.3 Clear Roles and Responsibilities
7.4 Advanced Technology
7.5 Collaboration
7.6 Data Driven Culture
7.7 Clear data processes and procedures
8 Limitation and future research
9 Conclusion
10 References

1 Acknowledgments

I would like to begin by expressing my sincere gratitude to the individuals who graciously
participated in the survey, providing invaluable insights into the realm of Data Governance. To
respect their wishes for personal identity confidentiality, I will refrain from revealing their
names here. Your willingness to share your expertise and experiences was instrumental in
shaping this thesis, and I am deeply appreciative of your contributions. Thank you for your
support and commitment to advancing our understanding of Data Governance.

This Master’s thesis would not have been possible without the invaluable contributions,
guidance, and expertise of the data professionals in the field. I am filled with gratitude for the
individuals who have played a pivotal role in shaping this work.

First and foremost, I extend my heartfelt thanks to Arnaud Jaoul, my thesis supervisor and
professor at École des Mines de Saint-Étienne Engineering School for the Big Data and Artificial
Intelligence course. His unwavering patience and profound knowledge of Data Management
not only provided me with a clear direction for developing my arguments but also helped me
enhance the quality of my thesis.

I am also grateful to Mathieu Verriere and Bruno Versaevel, both outstanding professors at
emlyon Business School, who facilitated valuable connections with data experts in Lyon,
enabling me to conduct interviews and formulate the survey questions.

My deepest appreciation goes out to all those who generously spared their time to participate
in interviews, thus offering profound insights into the challenges of Data Governance, which
ultimately inspired the formulation of specific questions for the pharmaceutical sector. I would
like to express my heartfelt thanks to:

• Ole Olesen-Bagneux, author of "The Enterprise Data Catalog," for his extensive
knowledge and constructive guidance.

• Laura Madsen, author of "Disrupting Data Governance" and "Data Driven Healthcare:
How Analytics and BI are Transforming the Industry," for sharing her invaluable
perspectives on the meaningful aspects of the data governance role, although she always
humorously says, "I hate Data Governance!"

• Thomas de Charentenay, Data & Business Transformation at Sanofi Pasteur, for his
thoughtful review of my survey questions and advice on how to improve.

• Charlotte Ledoux, Data Governance Manager at Pernod Ricard, for her invaluable
insights into the critical role of C-level sponsorship in the success of data governance
initiatives within change management.

• Pierre-Charles Igras, Data Officer Trading at TotalEnergies in Switzerland, a
distinguished data governance expert and the founder of the Data Literacy Institute
training program, for all of his valuable guidance to improve my data governance
literacy.

• Phil Black, the founder of Data QG, whose data governance education platform
provided me with a deeper understanding of the world of data governance.

• Damien Azzopardi, Founder of Datadidakt, for generously sharing his wealth of
knowledge on data governance and management, as well as dedicating his time to
review and improve my survey questions.

• Juan Felipe Arias Aguirre, Data Scientist at Cityscoot, for offering his valuable insights
into the challenges of data governance.

I would like to say a big "thank you" to each of these people for their constant help and the
important roles they played in making this thesis happen.

2 Abstract

The field of information systems recognizes the growing importance of data governance in

efficiently handling the ever-increasing volume of data in organizations (Alhassan et al., 2016).

It highlights the need for effective strategies to manage and utilize data resources effectively

(Alhassan et al., 2016). This holds true particularly in the pharmaceutical sector, where data

governance plays a key role in dealing with challenges posed by various data sources that can

make it difficult to share and use data effectively (Truong et al., 2017). Data governance is not

only vital for managing large datasets but also for complying with regulations and maintaining

data accuracy (Cave, 2017).

This research explores data governance within the pharmaceutical sector, combining insights
drawn from a thorough literature review with a survey. The survey targeted professionals
spanning roles in Data, R&D, Marketing, Production, Sales, Clinical, and Finance within the
pharmaceutical industry. Despite a modest sample size of 19 and constraints that precluded
in-depth interviews, the survey responses provided valuable insights. Quantitative analysis of
the responses supported an exploration of several facets of data governance, including its
challenges and its successful implementation.

The literature underscores that integrating data governance into the quality management
system is a vital strategy for meeting regulatory expectations (Khin et al., 2020).
Additionally, the growing influence of artificial intelligence (AI) in decision-making and the
quest for enhanced medicines should not be underestimated (Marcelo Corrales Compagnucci et al.,
2022). Within this context, challenges to sustaining data integrity span an inadequate quality
culture, organizational and individual behavior, leadership, processes, and technology
(Charoo et al., 2023). Notably, regulatory authorities endorse the adoption of the F.A.I.R. and
ALCOA+ principles as guiding tenets for upholding data integrity (Hodgson et al., 2017).

The primary findings were compared with the literature review, leading to the identification
of seven Critical Success Factors (CSFs) essential for effective data governance within the
pharmaceutical industry. Six of these factors are securing leadership sponsorship, providing
employee data competency training, clarifying roles and responsibilities, leveraging advanced
technology, promoting collaboration among IT, Data Departments, and business units, and
fostering a data-driven culture. The seventh factor is transparency in data processes, guided
by principles such as F.A.I.R., ALCOA+, and cGMP.

While acknowledging the limitations posed by sample size, future research holds the potential

to mitigate these drawbacks and glean deeper insights. Additionally, the consideration of

geographical differences in data governance challenges within the pharmaceutical industry

presents an opportunity for more focused and region-specific investigations. By explaining

these details, this investigation highlights how important data governance is in influencing

healthcare and pharmaceutical progress, emphasizing the balanced relationship between theory

and practical use.

3 Introduction

In today's digital age, data has become the lifeblood of organizations across various industries,

and the pharmaceutical sector is no exception (Truong et al., 2017). The rapidly growing

volume of data generated within pharmaceutical companies presents both opportunities and

challenges (Alhassan et al., 2016). In the ever-evolving landscape of data infrastructure and

digital solutions, the pharmaceutical industry is witnessing a paradigm shift in how data is

generated, managed, and utilized (Alhassan et al., 2016). With the proliferation of digital

technologies, pharmaceutical companies now have access to an unprecedented amount of data,

including clinical trial results, patient records, real-world evidence, and supply chain

information (Marcelo Corrales Compagnucci et al., 2022). While this data abundance presents

remarkable opportunities, it also poses significant challenges that need to be addressed to fully

leverage the potential of data-driven decision-making (Henstock, 2019).

Data infrastructure in the pharmaceutical sector has become increasingly complex, comprising

diverse sources and formats (Marcelo Corrales Compagnucci et al., 2022). This evolution has

given rise to the critical need for robust data governance (Alhassan et al., 2016). Data

governance is a comprehensive framework encompassing policies, processes, and roles

designed to ensure data quality, security, integrity, and compliance (Ladley, 2019). It addresses

the challenges inherent in managing large and sensitive datasets, safeguarding patient privacy,

and meeting stringent regulatory requirements (Ladley, 2019).

In this context, it is imperative to examine the significance of data governance in the

pharmaceutical industry and understand how it can play a pivotal role in overcoming existing

challenges. The key problem at hand is the effective management and utilization of data to

make informed decisions while maintaining compliance and data privacy (Hodgson et al.,

2017). Data governance provides a solution to this problem by establishing guidelines for data

access, usage, and accountability throughout the organization's data lifecycle (Weiss, 2022).

The importance of data governance in the pharmaceutical context extends beyond operational

efficiency (Neumeyer, 2020). It is essential for maintaining patient safety and drug efficacy

throughout the entire drug development and post-marketing phases (Neumeyer, 2020). By

adhering to stringent data governance practices, pharmaceutical companies can ensure ethical

and secure handling of patient data, minimizing risks associated with adverse events and

ensuring regulatory compliance (Neumeyer, 2020).

Supply chain optimization is another area where data governance proves invaluable (Koh et al.,

2011). It enables enhanced supply chain visibility, reducing the likelihood of drug shortages

and counterfeit risks, thereby benefitting patients and healthcare providers alike (Bagozzi &

Lindmeier, 2017).

Finally, data governance plays a significant role in driving commercial effectiveness in the

pharmaceutical industry (EMEA, 2010). By leveraging well-governed data, companies can gain

valuable market insights, improve sales force effectiveness, and enhance customer relationship

management, ultimately leading to more effective commercial strategies (EMEA, 2010).

3.1 Overview of Data Governance and its importance in the pharmaceutical industry.

Data governance is often misconstrued as being synonymous with data management. However,

according to DAMA International (2009), “data governance is the exercise of authority and

control (planning, monitoring, and enforcement) over the management of data assets”. It should

be noted that data governance complements data management but does not replace it (Wende,

2007). Therefore, it is crucial to make a clear distinction between data management and data

governance. The characteristics explained below reveal the fundamental contrast between the

operational handling of data and the strategic oversight of data-related processes and policies:

3.1.1 Data Management

The DAMA Guide to the Data Management Body of Knowledge, authored by DAMA

International in 2009, stands as the definitive reference for all endeavors in the realm of data

management. Often referred to as the "bible" of data management since it is referenced in most

articles about data governance, this authoritative guide offers comprehensive insights and best

practices for professionals seeking to navigate the complexities of data management. According

to DAMA International (2009), data management encompasses the strategic process of

managing and utilizing data and information assets within an organization. It aims to meet the

information requirements of stakeholders by ensuring data availability, security, and quality

(DAMA International, 2009).

Figure 1 - Data Management Functions (DAMA International, 2009).

Figure 1 above illustrates the ten major constituent functions of data management as outlined

by DAMA International (2009):

1) Data Governance: This function involves the strategic planning, supervision, and

control of data management and utilization. It includes activities such as planning,

monitoring, and enforcement to effectively manage data assets.

2) Data Architecture Management: This function defines the blueprint for managing data

assets. It involves determining the data requirements of the organization and designing

comprehensive master blueprints to fulfill those needs. It also establishes connections

with application system solutions and projects that implement the enterprise

architecture.

3) Data Development: This function encompasses the activities involved in analyzing,

designing, implementing, testing, deploying, and maintaining solutions to address the

data needs of the organization. It focuses on data-related tasks within the system

development lifecycle (SDLC) and includes data modeling, requirements analysis, and

the design, implementation, and maintenance of data-related components of databases.

4) Data Operations Management: This function involves the planning, control, and support

for structured data assets throughout their lifecycle, from creation and acquisition to

archival and purging.

5) Data Security Management: This function focuses on planning, developing, and

executing security policies and procedures to ensure data privacy, confidentiality, and

appropriate access. It includes mechanisms for authentication, authorization, access

control, and auditing of data.

6) Data Quality Management: This function encompasses planning, implementation, and

control activities that utilize quality management techniques to define, monitor,

improve, and ensure the fitness of data for its intended use.

7) Reference and Master Data Management: This function involves planning,

implementation, and control activities to maintain consistency with a designated

"golden version" of contextual data values. It includes managing and synchronizing

reference and master data across various systems and applications.

8) Data Warehousing and Business Intelligence Management: This function focuses on

planning, implementation, and control processes that facilitate the provision of decision

support data and support for knowledge workers engaged in reporting, querying, and

analysis activities.

9) Document and Content Management: This function encompasses planning,

implementation, and control activities to store, protect, and provide access to data stored

in electronic files and physical records, including text, graphics, images, audio, and

video.

10) Metadata Management: This function involves planning, implementation, and control

activities aimed at enabling easy access to high-quality, integrated metadata. Metadata

provides essential information about the data assets and their characteristics.

These ten functions collectively form the comprehensive scope of the data management

function as defined by DAMA International (2009). They provide a structured framework for

organizations to effectively manage their data assets and ensure their strategic utilization

(DAMA International, 2009).

As illustrated in Figure 1 above, data governance holds a central and crucial position in the

management of data assets, as depicted in the circular representation of the ten data

management functions. The following section explains the Data Governance concept in more

detail.

3.1.2 Data Governance

Data governance focuses on determining who holds decision-making authority over data assets

within an organization (Khatri & Brown, 2010), with the aim of ensuring data quality,

consistency, usability, security, privacy, and availability (Panian, 2010).

Figure 2 - Data Dimensions (Huff et al., 2019)

Data can only bring value to business decision-making when its quality and integrity are
ensured (Otto, 2015). As depicted in Figure 2 above, Huff et al. (2019) emphasize five data
dimensions to ensure data integrity, where data governance plays a crucial role in terms of
patient care, treatment plans, record retention, key performance indicators (KPIs), and
scorecards.

As depicted in Figure 2, data integrity and quality emerge as the central facets critical for

ensuring effective decision-making, security, record-keeping, and contractual obligations—all

of which fall under the purview of data governance endeavors (Huff et al., 2019).

3.1.3 Data Integrity

Data integrity is a critical aspect of data that contributes to its trustworthiness and reliability

(Huff et al., 2019). According to Huff et al. (2019), data integrity refers to the systems and

processes involved in data capture, correction, maintenance, transmission, and retention.

Figure 3 - Schematic of ALCOA+ (Charoo et al., 2023; Rattan, 2018)

To ensure data integrity, several attributes must be upheld, as defined by ALCOA+ (Figure 3)

and elaborated by Rattan (2018). These attributes include (Rattan, 2018):

1) Completeness: Data should be comprehensive and include all relevant information

without any missing or omitted values.

2) Consistency: Data should be coherent and harmonious, with no contradictions or

conflicts between different sources or instances.

3) Enduring: Data should be retained for the required duration as specified by regulatory

and organizational requirements. It should be stored and maintained in a manner that

ensures its long-term preservation and accessibility.

4) Available: Data should be accessible and retrievable when needed, ensuring its

availability for authorized users or processes. Adequate measures should be in place to

prevent data loss or unavailability due to technical failures, disasters, or other

disruptions.

5) Attributability: Data should be traceable and linked to its source, that is, who collected it

and when, allowing for accountability and auditability.

6) Legibility: Data should be clear and easily readable, ensuring its understandability and

accessibility.

7) Contemporaneousness: Data should be recorded in a timely manner, capturing events

and information as they occur.

8) Original or True Copy: Data should be in its original form or an authorized and accurate

reproduction of the original, preserving its authenticity.

9) Accuracy: Data should be precise and free from errors, reflecting the true values and

characteristics it represents.

These characteristics must be maintained throughout the entire data lifecycle, which

encompasses various stages such as creation, modification, processing, maintenance, archiving,

retrieval, transmission, and disposal after the designated retention period (Hodgson et al., 2017).

By adhering to the principles of ALCOA+ and ensuring data integrity, organizations can

enhance the reliability, credibility, and usability of their data assets (Hodgson et al., 2017).
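
As an illustration only, a few of the ALCOA+ attributes listed above can be expressed as automated checks on a single record. The sketch below is not drawn from the cited sources: the `LabRecord` structure, its field names, and the 15-minute threshold for contemporaneous recording are hypothetical assumptions, and a real system would cover all nine attributes under validated procedures.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class LabRecord:
    """Hypothetical record structure; the field names are illustrative only."""
    value: Optional[float]         # the measured result
    recorded_by: Optional[str]     # who collected the data
    event_time: datetime           # when the observation happened
    recorded_time: datetime        # when it was written down
    is_original: bool              # original entry or verified true copy


def alcoa_findings(record: LabRecord,
                   max_delay: timedelta = timedelta(minutes=15)) -> list[str]:
    """Return the ALCOA+ attributes this record appears to violate.

    The max_delay threshold is an assumed value, not a regulatory figure.
    """
    findings = []
    if not record.recorded_by:
        findings.append("attributable")
    if record.value is None:
        findings.append("complete")
    if record.recorded_time - record.event_time > max_delay:
        findings.append("contemporaneous")
    if not record.is_original:
        findings.append("original")
    return findings


record = LabRecord(
    value=7.2,
    recorded_by=None,                               # no author captured
    event_time=datetime(2023, 5, 1, 9, 0),
    recorded_time=datetime(2023, 5, 1, 11, 30),     # written down 2.5 hours later
    is_original=True,
)
print(alcoa_findings(record))  # ['attributable', 'contemporaneous']
```

Returning the list of violated attributes, rather than a single pass or fail flag, keeps each finding traceable to the principle it breaks.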

Data integrity serves as a fundamental principle and commitment within the pharmaceutical

industry to ensure the production of safe, effective, and high-quality drugs (Charoo et al., 2023).

By upholding data integrity, the industry aims to comply with established standards and

regulations, reinforcing its dedication to patient safety and public health (Charoo et al., 2023).

In the context of drug manufacturing processes, data integrity plays a vital role in enabling

regulatory authorities, such as the US Food and Drug Administration (FDA) and the United

Kingdom Medicines and Healthcare products Regulatory Agency (MHRA-UK), to effectively

monitor and assess the integrity of data generated throughout the entire lifecycle of a drug

(Charoo et al., 2023). This is crucial for ensuring the reliability and accuracy of data submitted

in support of marketing authorization applications (Charoo et al., 2023).

Recognizing the significance of data integrity, the FDA Center for Drug Evaluation and Research

(CDER) and MHRA-UK conducted a joint Good Clinical Practice (GCP) workshop in October 2018 (Khin et al., 2020). The

workshop focused on discussing data integrity and clinical data management within the

pharmaceutical industry (Khin et al., 2020). The outcome of the workshop emphasized that

maintaining data integrity is of paramount importance, as any issues or breaches in data

reliability can have a profound impact on the acceptability of the data submitted for regulatory

review (Khin et al., 2020).

Moreover, violations of data integrity can pose serious safety risks to human volunteers

participating in clinical trials and undermine the regulatory efforts to protect human subjects

(Khin et al., 2020). Therefore, it is imperative for organizations to establish and maintain robust

procedures for maintaining the study blind during clinical trials, ensuring the integrity and

confidentiality of data collected throughout the trial process (Khin et al., 2020).

By prioritizing data integrity, the pharmaceutical industry demonstrates its commitment to

upholding rigorous standards, safeguarding patient safety, and promoting public trust in the

development and manufacturing of pharmaceutical products (Khin et al., 2020).

3.1.4 Why Data Governance is crucial in ensuring Data Integrity

Data governance plays a paramount role in the biopharmaceutical industry by acting as the

guardian of data integrity across every stage of the data lifecycle (Weiss, 2022). The data lifecycle

encompasses various stages including creation, modification, processing, maintenance,

archiving, retrieval, transmission, and disposal of data after the designated retention period

(Hodgson et al., 2017). From the inception of data collection through its processing, analysis,

storage, and eventual utilization, data governance ensures that data remains accurate, consistent,

and trustworthy (Hodgson et al., 2017).
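
The lifecycle stages named above can be sketched as an append-only audit trail that records who acted on a data item at each stage. This is a hedged illustration rather than a method described in the cited literature: the `Stage` and `AuditTrail` names are hypothetical, as are the two rules enforced here (the first event must be creation, and nothing may follow disposal).

```python
from enum import Enum


class Stage(Enum):
    """Lifecycle stages as listed in the text (Hodgson et al., 2017)."""
    CREATION = "creation"
    MODIFICATION = "modification"
    PROCESSING = "processing"
    MAINTENANCE = "maintenance"
    ARCHIVING = "archiving"
    RETRIEVAL = "retrieval"
    TRANSMISSION = "transmission"
    DISPOSAL = "disposal"


class AuditTrail:
    """Append-only log of lifecycle events for one data item (illustrative)."""

    def __init__(self) -> None:
        self._events: list[tuple[Stage, str]] = []

    def record(self, stage: Stage, actor: str) -> None:
        # Assumed rule 1: a data item must start its life with a creation event.
        if not self._events and stage is not Stage.CREATION:
            raise ValueError("the first event must be creation")
        # Assumed rule 2: once disposed, the item accepts no further events.
        if self._events and self._events[-1][0] is Stage.DISPOSAL:
            raise ValueError("no events are allowed after disposal")
        self._events.append((stage, actor))

    def history(self) -> list[str]:
        return [f"{stage.value} by {actor}" for stage, actor in self._events]


trail = AuditTrail()
trail.record(Stage.CREATION, "analyst_a")
trail.record(Stage.PROCESSING, "pipeline_1")
trail.record(Stage.ARCHIVING, "records_team")
print(trail.history())
# ['creation by analyst_a', 'processing by pipeline_1', 'archiving by records_team']
```

Because events can only be appended, never edited or removed, the history supports the attributability and auditability that the ALCOA+ principles call for.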

Ensuring data integrity is increasingly crucial in this modern landscape, where the industry

undergoes transformative changes propelled by new drug modalities and advancements in

digital technologies (Marcelo Corrales Compagnucci et al., 2022; Self, 2014; Truong et al.,

2017). Digital transformation initiatives leveraging big data, cloud computing, artificial

intelligence (AI), and the Internet of Things (IoT) are being pursued by many pharmaceutical

companies to improve efficiency and gain a competitive edge (10 Internet of Things (IoT)

Healthcare Examples, 2023; Marcelo Corrales Compagnucci et al., 2022; Selvaraj &

Sundaravaradhan, 2019). However, it is imperative to establish a solid foundation of good data

governance to ensure compliance with strict regulatory requirements and align with the Industry

4.0 movement (Weiss, 2022).

While digital transformation initiatives are often prioritized by the C-suite, data governance,

data integrity, and regulatory compliance may not always receive sufficient attention (Weiss,

2022). This difference in attention could arise from the perception that data governance lacks

the immediate appeal of subjects such as data analytics or machine learning (Weiss, 2022). Data

governance, in contrast, involves intricate and time-intensive strategies centered around

procedural complexities, rather than yielding the rapid and tangible outcomes associated with

analytics or machine learning (Levy, 2021). Nevertheless, experts underscore the significance

of data governance through the adage "bad input, bad output" (Levy, 2021; Marcelo Corrales

Compagnucci et al., 2022; Vamathevan et al., 2019). Essentially, data governance acts as a

cornerstone, ensuring the integrity of data quality (DAMA International, 2009; Wende, 2007).

This, in turn, safeguards the credibility of results derived from data analytics and machine

learning endeavors, thereby culminating in the delivery of genuine value to the organization

(Vamathevan et al., 2019). At the same time, digital transformation initiatives present an opportunity to modernize

legacy processes and replace manual methods with integrated systems that can enhance overall

data quality and reduce human effort (Weiss, 2022).

In recent years, there has been a significant increase in FDA regulatory warning letters. Data

integrity issues accounted for 47% of all warning letters issued by the FDA in 2019 and reached

65% by the end of 2021 (Eglovitch, 2022). This trend has prompted pharmaceutical

organizations to reassess their infrastructure and digitalize their business and operational

processes to ensure compliance and mitigate future risks (Eglovitch, 2022).

The adoption of tools such as electronic laboratory notebooks (ELN), laboratory information

management systems (LIMS), and manufacturing execution systems (MES) has helped the

biopharmaceutical industry improve data integrity and manage operational data (Weiss, 2022).

Large organizations are also investing in centralized data repositories like data lakes where data

is shared and used among business units to support their digitization efforts and break down

data silos (Weiss, 2022). However, integrating these systems while maintaining data integrity

and regulatory compliance remains a significant challenge that necessitates a strong foundation

of good data governance (Weiss, 2022).

In conclusion, data governance is essential for achieving a successful digital transformation in

the biopharmaceutical industry (Alosert et al., 2022; Weiss, 2022; Wise et al., 2018). It ensures

the consistency, coherence, and regulatory compliance of data throughout its lifecycle, enabling

organizations to leverage digital technologies effectively while maintaining data integrity

(Alosert et al., 2022; Rattan, 2018). By prioritizing data governance, the industry can navigate

the challenges of Industry 4.0 and establish a solid framework for future advancements (Alosert

et al., 2022).

3.2 Research topic and gap in the literature.

In the rapidly evolving biopharmaceutical industry, digital transformation initiatives have

gained significant momentum as companies seek to leverage cutting-edge technologies to

improve operational efficiency, reduce costs, and gain a competitive advantage (Leesakul et al.,

2022). However, amidst these transformative efforts, maintaining data integrity and ensuring

regulatory compliance remain critical challenges (Alosert et al., 2022; Buytaert-Hoefen, 2019;

Khin et al., 2020). The effective implementation of digital technologies must be underpinned

by a strong foundation of data governance, which encompasses the strategic planning, control,

and management of data assets (DAMA International, 2009).

Previous studies have highlighted the importance of data governance and data integrity in

various industries, including the biopharmaceutical sector (Alosert et al., 2022; Khin et al.,

2020; Neumeyer, 2020; Truong et al., 2017). However, there is a limited understanding of the

specific challenges and best practices for implementing data governance in the context of digital

transformation initiatives in the pharmaceutical industry. While some research has explored the

adoption of specific tools and systems to improve data integrity (Hodgson et al., 2017;

Neumeyer, 2020; Rattan, 2018), there is a lack of comprehensive studies that examine data

governance adoption rates and their impact on digital transformation outcomes.

3.3 Purposes of the study.

The purpose of this thesis is to explore and analyze the current rate of adoption of data governance,

and awareness of its importance, in facilitating successful digital transformation in the

biopharmaceutical industry. The thesis aims to investigate the current state of data governance

practices, identify challenges and gaps in implementing data governance, and propose strategies

and recommendations to enhance data governance frameworks within the industry.

Additionally, the study will involve conducting surveys with key stakeholders in leading

biopharmaceutical companies, to gain insights into their data governance models, challenges,

opportunities, and change management processes.

By addressing this research gap and incorporating insights from industry, this study aims to

provide valuable insights and guidance for industry practitioners, helping them understand the

critical role of data governance in digital transformation. The findings and recommendations

will help raise awareness about this subject and equip organizations in the biopharmaceutical

sector with the necessary tools and strategies to ensure regulatory compliance, data integrity,

and successful implementation of Industry 4.0 initiatives.

3.4 Outline of the structure of the thesis

The structure of the thesis is outlined as follows:

1. Literature Review

• Explores data governance in the pharmaceutical industry, focusing on data integrity

practices, common issues with data integrity, the data lifecycle in the BioPharma sector,

and data governance in cloud computing environments.

• Identifies critical success factors (CSFs) for data governance in general.

• Reviews existing data governance models (centralized, federated, hybrid) and assesses

their applicability in the pharmaceutical context.

2. Research Methodology

• Describes the research design and methodology, including the approach to data

collection and analysis.

• Explains the criteria for selecting pharmaceutical companies for the survey.

• Outlines the instruments and measures used for data collection.

• Details the data analysis techniques and procedures employed.

3. Data Governance Models in Pharmaceutical Companies

• Examines the data governance models implemented in pharmaceutical organizations

based on survey results.

4. Critical Success Factors for Implementing Data Governance in Pharmaceuticals

• Provides an in-depth analysis of critical success factors for data governance

implementation by identifying key factors necessary for successful data governance

adoption in the pharmaceutical industry.

5. Conclusion

• Summarizes the findings from the literature review and empirical research.

• Discusses the implications of the study for practitioners and suggests future research

directions.

4 Literature Review

4.1 Data governance in the pharmaceutical industry

4.1.1 Data Lifecycle in BioPharma

4.1.1.1 Problems with Data Lifecycle in BioPharma.

This section examines the data integrity issues that the literature identifies in the pharmaceutical

context. The representation and transmission of data in biopharmaceutical

processes currently lack established standards (Weiss, 2022). Although some standards are

emerging, they are not widely adopted by hardware and software vendors due to the absence of

industry consensus or the immaturity of these standards (Weiss, 2022). This issue is

compounded by insufficient programmatic interfaces provided by vendors, resulting in isolated

data silos and hindered system-to-system integration (Alosert et al., 2022; Weiss, 2022).

An often overlooked yet crucial aspect of data integrity is data contextualization (Buytaert-

Hoefen, 2019; Huff et al., 2019; Rattan, 2018). Simply extracting data from a specific system,

such as chromatography, is insufficient without combining it with information from other

systems, including experimental conditions or sample processing methods (Weiss, 2022).

Maintaining this contextual information, also known as the "chain of custody," is essential not

only for interpreting experimental findings but also for attaining the necessary business

intelligence to optimize drug candidate attributes or manufacturing processes (Weiss, 2022).

The combination of data from multiple systems is frequently performed manually by operators,

often relying on intermediary tools like spreadsheets (Weiss, 2022). This manual process is

time-consuming and prone to errors, with the risk of mistakes increasing with each additional

transcription step (Weiss, 2022). Moreover, manual transcription workflows necessitate

extensive quality checks to ensure data integrity and compliance with regulatory requirements,

such as 21 CFR Part 11 or GxP (FDA, 2003).

The ability to generate and access high-quality contextualized data poses a significant

bottleneck in implementing advanced analytical techniques like machine learning (Marcelo

Corrales Compagnucci et al., 2022; Vamathevan et al., 2019; Weiss, 2022). Highly trained data

scientists can spend an excessive amount of time searching for, combining,

and cleaning data to generate datasets for training and validating models (Weiss, 2022).

Therefore, addressing the lack of standards, improving programmatic interfaces, and

automating data integration processes are essential for enhancing data integrity and accessibility

in biopharmaceutical processes, enabling more efficient and accurate analysis for decision-

making and optimization (Weiss, 2022).

4.1.1.2 Emerging informatics trends and problems for BioPharma

The integration of digital topography with custom solutions is challenging for many

organizations due to the lack of IT/software development skills, resources, and time, as well as

varying support from hardware/software vendors (Weiss, 2022). Developing custom software

code for system integration often creates a resource overhead debt that is difficult to maintain

(Weiss, 2022). Therefore, when procuring new informatics systems or instruments, it is crucial

to consider the vendor's API and documentation quality, support services, and examples of

successful integration projects (Weiss, 2022). Hardware/software vendors should adhere to

F.A.I.R principles (findability, accessibility, interoperability, and reusability) in designing their

products to meet the growing demand for integration support (Weiss, 2022).

For small to mid-sized organizations, building an integrated ecosystem of hardware and

software for automation may be beyond their technical and budgetary capabilities (Levy, 2021).

To address this, some companies have developed software solutions that simplify the

integration of laboratory hardware and software platforms (Levy, 2021). These solutions offer

libraries of connectors to common laboratory equipment and informatics applications,

facilitating automated data exchange between systems (Levy, 2021).

The task of correcting, contextualizing, and aligning data from different sources can be

daunting, involving the combination of partial data sets, harmonization of terms and identifiers,

and ensuring data alignment (Buytaert-Hoefen, 2019). Manual data processing by human

operators introduces significant data integrity risks and is prone to errors (Buytaert-Hoefen,

2019). One approach to automating this process is to use a centrally managed

metadata/ontology library that automatically annotates data or creates knowledge graphs to

integrate and map terms from different systems (Shafiei et al., 2015). Open-source initiatives

and commercial vendors specializing in ontology management, semantic enrichment, and

knowledge graph tools have expanded the availability of such solutions (Shafiei et al., 2015).
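
The idea of a centrally managed metadata/ontology library can be illustrated with a minimal Python sketch. It assumes a small, hypothetical vocabulary that maps the field names used by different source systems to one canonical term; the terms, field names, and function names below are invented for illustration and do not come from any specific ontology tool.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical central vocabulary: canonical term -> synonyms used by source systems.
ONTOLOGY: Dict[str, List[str]] = {
    "sample_id": ["SampleID", "sample no", "Sample_Identifier"],
    "ph": ["pH", "pH value"],
    "temperature_c": ["Temp (C)", "temperature"],
}


def normalize(term: str) -> str:
    """Lower-case a term, treat underscores as spaces, and collapse whitespace."""
    return " ".join(term.replace("_", " ").lower().split())


def build_lookup(ontology: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every normalized synonym (and the canonical term itself) to the canonical term."""
    lookup: Dict[str, str] = {}
    for canonical, synonyms in ontology.items():
        for term in [canonical, *synonyms]:
            lookup[normalize(term)] = canonical
    return lookup


def harmonize(record: Dict[str, Any], lookup: Dict[str, str]) -> Tuple[Dict[str, Any], List[str]]:
    """Rename known fields to canonical terms and list the fields that need human review."""
    harmonized: Dict[str, Any] = {}
    unmapped: List[str] = []
    for key, value in record.items():
        canonical = lookup.get(normalize(key))
        if canonical is None:
            unmapped.append(key)  # unknown term: flag it instead of guessing
        else:
            harmonized[canonical] = value
    return harmonized, unmapped
```

Unmapped fields are returned rather than silently dropped, so that a data steward can decide whether the central vocabulary needs a new entry.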

While the primary focus of systems integration and automated data contextualization is on

hardware and operational tools, there are challenges in capturing all observations in the wet-lab

work dynamic (Weiss, 2022). Researchers often document data manually in handwritten notes,

which may be crucial for experiment qualification but may not be included in the electronic

information chain of custody (Weiss, 2022). This introduces risks of data loss, transcription

errors, or intentional omission from the record (Weiss, 2022).

To address this issue, some companies have developed scientifically intelligent digital voice

assistants that enhance the lab workflow and data capture process (Weiss, 2022). These

assistants provide voice instructions to guide users through complex protocols, import data

directly from lab equipment, and capture important observations and notes by transcribing voice

dictation (Weiss, 2022). These digital tools can operate independently or integrate with other

systems such as ELN (Weiss, 2022).

4.1.2 Ensuring Data Lifecycle strategies

Achieving the goals of automation and analytics in the context of Industry 4.0 necessitates the

improvement of data governance practices and system integration (Alosert et al., 2022). The

FDA has adopted the ALCOA+ framework to establish its expectations regarding data integrity,

which helps industry professionals comply with 21 CFR Part 11 (Weiss, 2022). According to

ALCOA, data must be attributable, legible, contemporaneous, original, and accurate (Chubb,

2021). Building on these principles, ALCOA+ further specifies that data must also be complete,

consistent, enduring, and available (Alosert et al., 2022).

In the realm of scientific data management and stewardship, the F.A.I.R Principles, established

in 2014, advocate for similar concepts to address broader challenges related to data integration

and system automation (Chubb, 2021). The F.A.I.R Principles emphasize the findability,

accessibility, interoperability, and reusability of data (IDBS, n.d.). They promote the adoption

of a unified system that enables seamless communication between different systems in a

machine-readable format, thereby generating data that is comprehensible, reusable, and

contextualizable (Chubb, 2021). While more of a design principle than a standard, the F.A.I.R

Principles have gained popularity as a valuable tool for data management and stewardship

(Chubb, 2021). In the pharmaceutical industry, the implementation of F.A.I.R principles is

expected to bring several benefits:

1. Accelerated innovation: By implementing FAIR principles and making numerous data

sources available, the pharmaceutical industry can leverage a wide range of data to drive

innovation. This access to diverse data sources can contribute to the discovery of new

insights and the development of novel approaches (Holub et al., 2017; Roe, 2021).

2. Reduced time frames in drug discovery: The ready accessibility of data enabled by

FAIR implementation can expedite the drug discovery process. Researchers can easily

access and utilize relevant data, leading to more efficient and timely decision-making

(Fox, 2019; Wise et al., 2019).

3. Elimination of data silos: FAIR implementation promotes the elimination of data silos

within organizations. This fosters internal and external collaboration, as data becomes

more discoverable, accessible, and reusable across different teams and stakeholders.

Breaking down data silos encourages collaboration and knowledge sharing, leading to

enhanced productivity and innovation (Van Vlijmen et al., 2020; Wise et al., 2018).

4. Facilitated use of sophisticated analytical methods: FAIR implementation provides a

solid foundation for leveraging advanced analytical methods, such as artificial

intelligence (AI) and other cutting-edge technologies. The availability of FAIR data

allows for more effective utilization of these sophisticated analytical methods, enabling

researchers to gain deeper insights and make more informed decisions (Fleming, 2018;

Vamathevan et al., 2019).

F.A.I.R. and ALCOA+ principles complement each other by focusing on different aspects of

data management. F.A.I.R. emphasizes the importance of metadata in enhancing the reliability

of electronic data capture, while ALCOA+ addresses data integrity challenges to improve the

trustworthiness of the data output (Weiss, 2022). By embracing both sets of principles,

organizations can work towards creating a more comprehensive and reliable data management

system (Chubb, 2021).

However, aligning research and development (R&D) datasets with

FAIR principles requires a significant investment of time, effort, and resources (Alharbi et

al., 2021). It is a complex process that involves various steps, including data management,

standardization, documentation, and infrastructure development (Alharbi et al., 2023).

According to Alharbi et al. (2023), to align R&D datasets with FAIR principles, collaboration

between multiple pharmaceutical stakeholders is crucial. This collaboration may involve

researchers, data scientists, IT professionals, data managers, and other relevant stakeholders

within the organization (Alharbi et al., 2023). It may also extend to external collaborations with

academic institutions, regulatory bodies, and other industry partners (Alharbi et al., 2023).

4.1.3 Data Integrity practices

The pharmaceutical industry is currently facing significant concerns regarding data integrity,

as evidenced by recent FDA observations and warning letters (Eglovitch, 2022). To address the

underlying causes of data integrity problems, a comprehensive approach is necessary,

considering various factors such as quality culture, organizational behavior, leadership,

processes, and technology (Charoo et al., 2023).

According to section 501(a)(2)(B) of the FD&C Act, a drug is deemed adulterated if the methods, facilities, or

controls used for its manufacture, processing, packing, or holding do not conform to current good manufacturing

practice (cGMP), potentially leading to issues related to safety, identity,

strength, quality, and purity (Code of Federal Regulations, 2022). Studies indicate that

approximately 1 in 10 medical products in developing countries may be substandard or falsified,

while 62% of drug shortages in the United States between 2013 and 2017 were attributed to

manufacturing or product quality problems (Bagozzi & Lindmeier, 2017). Non-compliance

with data integrity requirements can result in unvalidated results, leading to potential risks such

as post-marketing issues and product recalls (Charoo et al., 2023). In this context, the

significance of data integrity cannot be overstated, as it reinforces the pharmaceutical industry's

commitment to manufacturing drugs that are safe, effective, and compliant with quality

standards (Alosert et al., 2022; Rattan, 2018). Data integrity also serves as a critical tool for

regulatory authorities in safeguarding public health (Charoo et al., 2023).

Violations of data integrity in relation to current Good Manufacturing Practice (cGMP) have

prompted regulatory actions such as warning letters, import alerts, and consent decrees

(Charoo et al., 2023). The Code of Federal Regulations (CFR) outlines the essential criteria of

cGMP data integrity in 21 CFR 211 and 212 (Figure 4), covering record keeping, data storage,

record retention, and production and control records, to ensure that pharmaceuticals adhere to

standards of safety, identity, strength, quality, and purity (Code of Federal Regulations, 2022).

Figure 4 - Data integrity requirements in 21 CFR 211 and 212. (Charoo et al., 2023)

To gain a deeper understanding of the importance of adhering to cGMP (current Good

Manufacturing Practices), it is crucial to analyze common issues related to data integrity. The

following analysis helps in defining the risk of non-compliance.

4.1.4 Nine common issues with Data Integrity in Pharmaceutical

In the highly regulated pharmaceutical industry, companies are required to operate with great

care and rigor to meet the stringent standards set by regulatory authorities (Neumeyer, 2020).

However, it's important to recognize that some practices can have significant compliance

implications (Neumeyer, 2020). Nine common issues highlighted below aim to raise awareness

within the pharmaceutical sector about the importance of avoiding these pitfalls to ensure

compliance with the law.

4.1.4.1 Data Retention

Data retention in the pharmaceutical industry refers to the practice of retaining and preserving

various types of data and records related to drug manufacturing, testing, quality control, and

regulatory compliance for a specific period of time as mandated by regulatory authorities

(Alosert et al., 2022). The specific duration for which data must be retained can vary depending

on the type of data and the applicable regulations, such as those set forth by the U.S. Food and

Drug Administration (FDA) in the United States and similar agencies in other countries

(Neumeyer, 2020).

The deliberate destruction of production, control, distribution, and quality records of the

manufacturing facility, before a scheduled FDA inspection, as reported by Neumeyer (2020)

and Cox (2019), is a serious violation of data integrity and regulatory requirements. Such

actions undermine the accuracy and reliability of data, raising concerns about the integrity of

manufacturing processes and the quality of pharmaceutical products (Cox, 2019). This not only

exposes companies to regulatory sanctions but also poses potential harm to patients relying on

these products for their health and well-being (Neumeyer, 2020).

To prevent breaches of data integrity and ensure the quality and safety of pharmaceutical

products, proper data governance, management oversight, and adherence to current Good

Manufacturing Practice (cGMP) regulations are essential, as highlighted by Charoo et al.

(2023). Unfortunately, data from 2005 to 2017 reveal that a significant percentage (23%) of

warning letters issued by regulatory bodies cited the deletion or destruction of cGMP original

records, which are crucial for demonstrating compliance (Charoo et al., 2023).

Title 21 of the Code of Federal Regulations (CFR), section 211.180, specifies that production, control, or

distribution records necessary for cGMP compliance should be retained for a specified period

(Code of Federal Regulations, 2022). For example, records should be kept for at least one

year after the batch expiration date or, for certain over-the-counter (OTC) drug products that

lack an expiration date, three years after distribution of the batch (Code of Federal Regulations, 2022). Critical records, such

as batch documents, marketing authorization application data and traceability data, may require

long-term retention with appropriate archival systems to ensure data integrity during extended

storage (MHRA, 2015). It is essential to retain data in its original form or true copies, such as

photocopies, microfilm, or microfiche, or any other form that can accurately replicate the

original records to ensure data security (MHRA, 2015). Once the retention period has expired,

the documents should be disposed of following the prescribed procedure, and a record or

register should be maintained to demonstrate proper and timely archiving or destruction of

retired records in compliance with Good Manufacturing Practice (GMP) requirements (PIC,

2021).
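
The minimum retention rule of 21 CFR 211.180 summarized above (one year after batch expiration, or three years after distribution when there is no expiration date) can be sketched in Python as follows. This is an illustrative calculation only: the function names and the leap-day handling are assumptions made for the example, and longer retention periods required by other regulations or by company policy are not modeled.

```python
from datetime import date
from typing import Optional


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years; 29 February rolls forward to 1 March if needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Target year has no 29 February. Rolling forward (not back) keeps the
        # retention period at "at least" the required length.
        return date(d.year + years, 3, 1)


def earliest_disposal_date(expiration: Optional[date],
                           distribution: Optional[date]) -> date:
    """Earliest date on which a batch record may become eligible for disposal."""
    if expiration is not None:
        return add_years(expiration, 1)    # one year after batch expiration
    if distribution is not None:
        return add_years(distribution, 3)  # no expiration date: three years after distribution
    raise ValueError("an expiration date or a distribution date is required")
```

For a batch expiring on 30 June 2024, the sketch returns 30 June 2025. In practice, the result would only mark the start of the documented disposal procedure described above, not an automatic deletion.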

EudraLex Volume 4 Annex 11 emphasizes the importance of incorporating a system-generated

audit trail for GMP-relevant changes and deletions (EudraLex, 2011). GMP-relevant data

should not be altered or deleted without proper justification, and audit trails should be easily

accessible, convertible to widely understandable formats, and subject to routine review to

comply with regulatory requirements (McDowall, 2020). Examples of GMP-relevant data

changes may include adjustments to sample weight, sequence aborts, batch number changes for

samples, and manual integration of High-Performance Liquid Chromatography (HPLC)

chromatogram peaks (McDowall, 2020).

4.1.4.2 Managing out of specification (OOS) results

A laboratory's failure to integrate chromatography peaks according to standard operating

procedures, the disabling of peak detection, and the inadequate investigation of unknown peaks and

Out-of-Specification (OOS) results are serious concerns in terms of data integrity and regulatory

compliance (Charoo et al., 2023). These practices undermine the reliability and accuracy of

laboratory data, which is critical for ensuring the quality and safety of pharmaceutical products

(Charoo et al., 2023; Neumeyer, 2020).

In response to such issues, the FDA has released a draft guidance titled "Submission of Quality

Metrics Data" to develop compliance and inspection policies and practices (FDA, 2016). This

guidance aims to utilize quality metrics, including the Invalidated Out-of-Specification Rate

(IOOSR), for purposes such as scheduling drug manufacturer inspections based

on risk assessment, forecasting and mitigating drug shortages, and promoting the

implementation of innovative quality management systems in pharmaceutical manufacturing

(FDA, 2016).

The IOOSR is a measure defined by the FDA as the ratio of invalidated out-of-specification

(OOS) test results for lot release and long-term stability testing, caused by measurement process

aberration, to the total number of lot release and long-term stability OOS test results within a

specified reporting timeframe (FDA, 2016). This metric provides valuable insights into the

reliability and accuracy of testing processes and can help identify potential issues with data

integrity and laboratory practices (FDA, 2016).
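
The IOOSR definition above reduces to a simple ratio, which the following Python sketch illustrates. The function name and the input checks are assumptions made for this example; the FDA draft guidance remains the authoritative source for how the counts are to be collected.

```python
def invalidated_oos_rate(invalidated_oos: int, total_oos: int) -> float:
    """Share of OOS results invalidated due to measurement process aberration.

    Both counts refer to lot release and long-term stability tests within the
    same reporting timeframe.
    """
    if invalidated_oos < 0 or total_oos < 0:
        raise ValueError("counts must be non-negative")
    if invalidated_oos > total_oos:
        raise ValueError("invalidated results cannot exceed total OOS results")
    if total_oos == 0:
        raise ValueError("rate is undefined when there are no OOS results")
    return invalidated_oos / total_oos
```

For instance, 3 invalidated results out of 12 OOS results in a reporting period give a rate of 0.25. A persistently high rate may indicate that OOS results are being invalidated too readily and that the laboratory investigation process warrants review.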

By implementing quality metrics like the IOOSR, regulatory authorities can gain better

visibility into the integrity of laboratory data, identify areas of concern, and take appropriate

regulatory actions to ensure compliance and the production of safe and effective pharmaceutical

products (FDA, 2016).

4.1.4.3 Metadata and audit trail

Metadata refers to structured data that provides information about the context and management

of the data (DAMA International, 2009). An audit trail, by contrast, is a secure, electronically

generated record that contains a time-stamped sequence of events related to the creation,

modification, or deletion of electronic records (Shafiei et al., 2015).

According to Charoo et al. (2023), both metadata and audit trails play crucial roles in

maintaining cGMP-compliant record-keeping procedures, ensuring data integrity, and

facilitating data retrieval and use.

For example, in 2015, the FDA issued a warning letter to Zhejiang Hisun Pharmaceutical

Company for significant deviations from current Good Manufacturing Practice (cGMP)

regulations, which included the deletion of failing data and the repetition of tests until passing

results were obtained. Moreover, the letter indicated that supporting raw data was deleted,

metadata was not archived, and audit trails were unavailable (FDA warning letter, 2015).

To accurately reconstruct cGMP activities, both the data and its associated metadata must be

retained throughout the specified retention period, preserving their relationships in a secure and

traceable manner (Charoo et al., 2023). Examples of metadata components for a specific set of

data may include the date/time stamp, user ID of the person conducting the test or analysis,

instrument ID used for data acquisition, material status data, material identification number,

and audit trails (Charoo et al., 2023).

On the other hand, audit trails enable the reconstruction of activities performed on electronic

records (Charoo et al., 2023). For instance, an audit trail for a High-Performance Liquid

Chromatography (HPLC) run may include information such as the username, date/time of the

run, integration parameters used, details of any reprocessing performed, and documentation

justifying the need for reprocessing (Charoo et al., 2023). Audit trails are essential for

maintaining data integrity, ensuring traceability, and supporting regulatory compliance in

GMP-compliant record keeping (Shafiei et al., 2015).
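
To make the audit trail concept more concrete, the sketch below shows a minimal append-only log in Python. It records who did what to which record and when, and it requires a justification for every modification or deletion. The class and field names are hypothetical, and the sketch keeps entries in memory only; a production system would also need secure, tamper-evident storage and access control.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional, Tuple


@dataclass(frozen=True)  # frozen: an entry cannot be edited after creation
class AuditEntry:
    timestamp: datetime
    user_id: str
    record_id: str
    action: str              # "create", "modify" or "delete"
    reason: Optional[str]    # justification for the change


class AuditTrail:
    ALLOWED_ACTIONS = ("create", "modify", "delete")

    def __init__(self) -> None:
        self._entries: List[AuditEntry] = []  # append-only by convention

    def log(self, user_id: str, record_id: str, action: str,
            reason: Optional[str] = None) -> AuditEntry:
        if action not in self.ALLOWED_ACTIONS:
            raise ValueError(f"unknown action: {action}")
        if action != "create" and not reason:
            raise ValueError("modifications and deletions require a justification")
        entry = AuditEntry(datetime.now(timezone.utc), user_id, record_id, action, reason)
        self._entries.append(entry)
        return entry

    def history(self, record_id: str) -> Tuple[AuditEntry, ...]:
        """Return the time-ordered events for one record, for review or reconstruction."""
        return tuple(e for e in self._entries if e.record_id == record_id)
```

A reviewer could call `history("HPLC-RUN-7")` (a made-up record identifier) to reconstruct the sequence of events for a single chromatography run, including the stated reason for any reprocessing.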

In line with FDA recommendations (2018), it is crucial to securely preserve the backup (a true

copy) of the original data, including metadata, throughout the entire records retention period.

Metadata is treated as an integral part of the backup data (FDA, 2018). Electronic data generated

to meet cGMP standards should also include relevant metadata, which must be evaluated as

part of the batch release criteria (Charoo et al., 2023). This ensures that both the original data

and its accompanying metadata are preserved and assessed for compliance with cGMP

requirements during the records retention period (Charoo et al., 2023).

To mitigate the risk of inadvertent data loss or deletion, measures such as

server-based data acquisition with nightly archiving and restricted access to

laboratory staff can be implemented (Charoo et al., 2023). Cloud-based storage applications

that are commercially available and compliant with regulatory requirements can be utilized for

long-term data retention in a cost-effective manner (Charoo et al., 2023). These cloud-based

storage solutions typically have secure protocols in place to control data entering and leaving

the cloud, ensuring data security and integrity (Charoo et al., 2023). However, it is essential to

ensure that the chosen cloud-based storage solution meets all relevant regulatory requirements

for data retention, security, and privacy (Charoo et al., 2023). Appropriate measures should be

taken to safeguard against unauthorized access or data breaches (Charoo et al., 2023). Regular

reviews and audits of the cloud-based storage system's security measures should also be

conducted to ensure ongoing compliance with regulatory requirements (Charoo et al., 2023).

In addition, the European Health Data Space is a foundational

element of the European Health Union and the first EU data space

dedicated to a specific domain under the broader European data strategy (European Health Data

Space, 2023). It builds on the regulatory framework established by the General Data Protection

Regulation (GDPR), the proposed Data Governance Act, the draft Data Act, and the Network

and Information Systems Directive (European Health Data Space, 2023).

The French Digital Healthcare Agency (ANS) sets a rigorous framework for practices related

to hosting healthcare data, in which HDS certification is a mandatory requirement (HDS

Certification, n.d.). After initial approval in 2016, OVHcloud received HDS certification in

2019, so that all of its healthcare sector customers could benefit from this guarantee (HDS

Certification, n.d.).

4.1.4.4 Testing into compliance

The practice of pre-injections, also known as "testing into compliance," before the official

sample sequence is not supported by credible scientific evidence and is considered a violation

of cGMP guidelines (Charoo et al., 2023).

According to the FDA (2018), companies should use scientific principles to determine the

number of retests that can be performed, and this number should be predetermined in the

Standard Operating Procedures (SOP) and not changed based on the results obtained during

analysis. The SOP should also specify the time frame within which repeat testing should not be

performed, and any deviations from this procedure should be recorded and justified in

accordance with regulatory requirements (FDA, 2018).

Furthermore, the FDA (2018) recommends that if additional testing is deemed necessary, a

protocol should be created and approved by the facility's quality unit. This protocol should

outline the additional analytical testing to be conducted and provide details on the scientific

and/or technical handling of the data (FDA, 2018). It is crucial to follow proper procedures and

document any deviations from established protocols or procedures to ensure compliance with

cGMP requirements and uphold data integrity (FDA, 2018).
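
The retest rule described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the class names `RetestPolicy` and `SampleTestRecord`, the protocol reference format, and the choice to raise an error are hypothetical and are not taken from the FDA guidance or from any laboratory system. The point it shows is that the retest limit is fixed before analysis and that further testing is only accepted when an approved protocol is referenced.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RetestPolicy:
    """Retest limit fixed in the SOP before any analysis (frozen: cannot be changed later)."""
    max_retests: int


@dataclass
class SampleTestRecord:
    sample_id: str
    policy: RetestPolicy
    results: list = field(default_factory=list)

    def add_result(self, value, approved_protocol=None):
        """Record a result; retests beyond the SOP limit need an approved protocol."""
        retests_done = max(len(self.results) - 1, 0)  # the first result is not a retest
        is_retest = len(self.results) >= 1
        over_limit = is_retest and retests_done >= self.policy.max_retests
        if over_limit and approved_protocol is None:
            raise PermissionError(
                "retest limit reached; additional testing requires a protocol "
                "approved by the quality unit"
            )
        # Every result is kept, together with the protocol that justified it (if any).
        self.results.append({"value": value, "protocol": approved_protocol})
```

In this sketch the limit cannot be adjusted after results are seen, which mirrors the requirement that the number of retests is not changed based on the results obtained.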

4.1.4.5 Contemporaneous

The contemporaneous requirement states that data should be recorded as it occurs (EMEA,

2010). To meet this requirement, analysts should be trained to document each activity at the

time it is performed, and procedures should be in place to ensure adherence to this practice (Charoo et al.,

2023).

Neumeyer's study (2020) revealed that logbooks and batch records were signed by employees

much later than their execution date, which directly conflicts with the "contemporaneous"

requirement of ALCOA+ (EMEA, 2010). Fortunately, there are commercially available

software platforms that can record activity details in real-time and in a secure audit trail that

cannot be edited, providing a solution to improve contemporaneous documentation practices

(Charoo et al., 2023).

Another violation of the contemporaneous requirement was observed during the tablet

manufacturing process, where tablet samples taken at different time intervals were tested in an

area outside the tableting room, but no checks were made to ensure the accuracy and traceability

of the results (Neumeyer, 2020). This violates Chapter 4(4.8) of EU GMP, which states that

records should be made or completed at the time each action is taken, and all significant

activities related to the manufacture of medicinal products should be traceable (Neumeyer,

2020).

The use of electronic laboratory notebook (ELN) systems can contribute to improving

contemporaneous documentation practices (Charoo et al., 2023). ELN systems allow for real-

time recording of data as activities are conducted, ensuring compliance with the

contemporaneous requirement (Charoo et al., 2023). Furthermore, any changes or edits made

to the original data are recorded with a date, time, and signature stamp, enhancing transparency

and accountability in the documentation process (Charoo et al., 2023). By utilizing ELN

systems, discrepancies can be minimized, and data can be accurately recorded and maintained

according to regulatory requirements (Charoo et al., 2023).
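
The audit-trail behaviour described above (each change recorded with a date, time, and signature, in a trail that cannot be silently edited) can be sketched as follows. This is a minimal illustration and not the design of any particular ELN product: the class name `AuditTrail`, the field names, and the use of hash chaining to detect tampering are assumptions made for the example.

```python
import hashlib
import json
from datetime import datetime, timezone


class AuditTrail:
    """Append-only change log; each entry is chained to the previous one by a hash."""

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(entry):
        # Hash every field except the hash itself, in a stable key order.
        payload = {k: v for k, v in entry.items() if k != "hash"}
        return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()

    def record(self, user, field_name, old_value, new_value, reason):
        """Append one change with a timestamp and the identity of the person making it."""
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "field": field_name,
            "old_value": old_value,
            "new_value": new_value,
            "reason": reason,
            "prev_hash": self._entries[-1]["hash"] if self._entries else "",
        }
        entry["hash"] = self._digest(entry)
        self._entries.append(entry)
        return entry

    def is_intact(self):
        """Return False if any stored entry was altered or removed after the fact."""
        prev_hash = ""
        for entry in self._entries:
            if entry["prev_hash"] != prev_hash or entry["hash"] != self._digest(entry):
                return False
            prev_hash = entry["hash"]
        return True

    def entries(self):
        return [dict(e) for e in self._entries]
```

A correction is therefore a new entry that keeps the old value visible, rather than an overwrite of the original data.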

4.1.4.6 Aborting runs

Aborted chromatography sample set runs with deleted data are not in compliance with cGMP

requirements (FDA, 2018). While occasional aborted runs may occur, it is important that the

data from these runs is not deleted, and investigations should be conducted to determine the

cause of the abort (Charoo et al., 2023). All data generated, including aborted runs, should be

retained as per FDA recommendations (FDA, 2018).

Processes should be designed to ensure that data cannot be modified without proper record-

keeping of the modification, including aborted injections (FDA, 2018). This prevents runs from

being intentionally aborted to avoid generating out-of-specification (OOS) results, thus

ensuring data integrity and compliance with regulatory requirements (FDA, 2018). By retaining

all data, including aborted runs, and conducting appropriate investigations, companies can

demonstrate their commitment to data integrity and regulatory compliance (FDA, 2018).
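
As a hedged illustration of the retention principle above, the following sketch keeps every run, refuses deletion, and lists aborted runs that still lack an investigation. The class `RunLog`, its method names, and the status values are hypothetical and do not describe any chromatography data system.

```python
class RunLog:
    """Keeps every run, including aborted ones; deletion is not offered."""

    def __init__(self):
        self._runs = {}

    def add_run(self, run_id, status, data):
        if run_id in self._runs:
            raise ValueError("run already recorded; existing runs are never overwritten")
        self._runs[run_id] = {"status": status, "data": data, "investigation": None}

    def delete_run(self, run_id):
        # Deliberately unsupported: all generated data must be retained.
        raise PermissionError("runs, including aborted ones, must be retained")

    def attach_investigation(self, run_id, reference):
        """Link an aborted run to the investigation that explains the abort."""
        self._runs[run_id]["investigation"] = reference

    def aborted_without_investigation(self):
        """Aborted runs whose cause has not yet been investigated."""
        return sorted(
            run_id
            for run_id, run in self._runs.items()
            if run["status"] == "aborted" and run["investigation"] is None
        )
```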

4.1.4.7 Original, accurate and available

The inability of the company to locate raw data from a standard curve and the use of papers for

recording data in an analytical assay are clear violations of cGMP standards (FDA, 2018).

To ensure good document control, it is recommended to use bound paginated notebooks that

are stamped for official use by a document control group (Neumeyer, 2020). These notebooks

allow for easy detection of unofficial notebooks and help identify any gaps in notebook pages,

ensuring the completeness and integrity of data (Neumeyer, 2020).

If blank forms are used for data recording, they should be controlled by the Quality Assurance

(QA) department and reconciled upon completion to ensure that no pages are missing or

tampered with (Code of Federal Regulations, 2022). Recording data on paper and later

transcribing it into a permanent laboratory notebook violates cGMP standards, as data should

be recorded at the time of performance and saved in a way that accurately replicates the original

records (Code of Federal Regulations, 2022; FDA, 2018).
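
The reconciliation of controlled blank forms mentioned above amounts to comparing what was issued with what was returned. The sketch below shows that comparison under simple assumptions: the function name `reconcile_forms` and the form identifiers are hypothetical, and a real document control process would involve more than a set comparison.

```python
from collections import Counter


def reconcile_forms(issued, returned):
    """Compare form numbers issued by QA with the forms returned after use."""
    issued_set, returned_set = set(issued), set(returned)
    counts = Counter(returned)
    return {
        # Issued but never returned: possibly lost or discarded pages.
        "missing": sorted(issued_set - returned_set),
        # Returned but never issued: possibly unofficial forms.
        "unissued": sorted(returned_set - issued_set),
        # Returned more than once: possibly copied or rewritten pages.
        "duplicates": sorted(form for form, n in counts.items() if n > 1),
    }
```

Any non-empty list in the result would be a prompt for follow-up by the QA department rather than proof of misconduct.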

The use of unofficial and uncontrolled electronic spreadsheets for storing initial data is also a

violation of cGMP standards, according to the letter sent to FACTA Farmaceutici S.p.A (FDA

warning letter, 2017). It is crucial for data to be properly recorded, stored, and accessible to

analysts and operators for retrieval and review during inspections or important quality decisions

related to batch disposition (Code of Federal Regulations, 2022).

Proper training of analysts and operators is essential to ensure compliance with cGMP

requirements and enable them to retrieve all data related to cGMP processes when needed (Code

of Federal Regulations, 2022). This includes training on data recording practices, document

control procedures, and the importance of accurate and accessible data for regulatory

compliance (Code of Federal Regulations, 2022).

4.1.4.8 Access to computer systems

FDA's finding indicates the lack of appropriate controls over computer systems and access

privileges at BBC Group Limited (FDA warning letter, 2021). Failure to establish proper

controls and access rules for computer systems is a violation of regulatory requirements,

specifically 21 CFR 211.68(a) (Code of Federal Regulations, 2022).

To address these issues and ensure compliance with cGMP guidelines, it is essential for the

company to implement robust controls and access rules for their computer systems (Charoo et

al., 2023). The FDA (2018) recommends assigning the role of system administrator to personnel

who are not responsible for the content of the records. This separation of duties helps prevent

unauthorized modifications to records and ensures the integrity and security of the data (FDA,

2018).

Maintaining a record of authorized personnel with access privileges to each cGMP computer

system is crucial (Charoo et al., 2023). This can be accomplished through a list of authorized

individuals, clearly documenting who has access to the system (Code of Federal Regulations,

2022). Only authorized personnel should be allowed to change computerized records, including

cGMP records, or to enter laboratory data into computerized systems (Code of Federal

Regulations, 2022).

Standard Operating Procedures (SOPs) should define the roles and responsibilities of the system

administrator, as well as the access privileges for each cGMP computer system in use (Charoo

et al., 2023). It is important to assign the system administrator role to personnel who are

independent from those responsible for the record content, such as laboratory personnel

(Charoo et al., 2023). SOPs should outline the specific access privileges and responsibilities of

the system administrator and authorized personnel to ensure appropriate controls are in place

to prevent unauthorized access, changes, or deletions of records (Charoo et al., 2023).

4.1.4.9 Data integrity challenges with contract development and manufacturing organizations

In the pharmaceutical industry, legacy supply agreements often lack the necessary transparency

and data integrity requirements for regulatory compliance and adherence to Good

Manufacturing Practices (GMP) (Charoo et al., 2023). Even well-prepared agreements may not

fully address the challenges arising from the absence of a digital data management system in

many Contract Development and Manufacturing Organizations (CDMOs) (Charoo et al., 2023).

The reliance on spreadsheets for data sharing is common but susceptible to errors and

manipulation (Schell, 2019). Moreover, the lack of a digital data management system can lead

to delays in critical activities such as technology transfer, product scale-up, data submissions,

and regulatory approvals (Schell, 2019).

To ensure compliance with GMP guidelines, contract manufacturing facilities must establish

robust controls to guarantee the accuracy and integrity of data and test results (Friedman, 2012).

Manufacturers or sponsors should thoroughly review the data generated by the contracted

facility as part of their quality assurance process before releasing the product (Friedman, 2012).

Outsourcing companies should ensure that their contractors have comparable data governance

systems and include data integrity requirements in their contractor qualification program

(Friedman, 2012). Manufacturers should conduct audits of the internal practices of CDMOs to

ensure compliance with data integrity requirements (Friedman, 2012). The contractor's data

governance system should be periodically assessed, and the agreement should include

provisions for process control and data integrity measures such as computer system validation,

access privileges, metadata, audit trails, and backups (Unger, 2017).

In addition to the issues discussed, there are supplementary strategies available to ensure data

integrity in the pharmaceutical context:

4.1.5 Ensuring Data Integrity strategies

To ensure data security and reliability, companies should implement data management policies,

and integrate them into their quality systems (Pérez, 2017). Organizational controls include

instructions for record completion, retention of records, staff training, authorization for data

generation/approval, and design of data governance systems, while technical controls include

computerized system validation, qualification, control, and automation (Pérez, 2017).

Implementing both organizational and technical controls facilitates effective data management

and ensures data integrity (Pérez, 2017).

To address data integrity issues, the following measures can be adopted (Shafiei et al., 2015):

4.1.5.1 Validation based on sound risk assessment principles.

To ensure data integrity, minimize the risk of data loss, and maintain compliance with

regulatory requirements, the following measures should be taken into consideration based on a

risk assessment (Shafiei et al., 2015):

1. Validate Electronic Data Storage: All electronic data storage locations, including

printouts and system-generated reports, should undergo validation to ensure their

integrity and reliability. This validation process should be based on a risk assessment

that considers the criticality of the data and the potential impact on product quality.

2. Define Roles and Responsibilities: Clearly define the roles and responsibilities for

validation and revalidation activities. This includes identifying individuals or teams

responsible for conducting the validation, establishing procedures for data storage

validation, and defining criteria for assessing the integrity and reliability of the data

storage locations.

3. Enhance Storage Solutions: Implement measures to enhance the storage solutions and

protect data from unauthorized access, loss, or modification. This may involve using

secure storage systems, encryption techniques, access controls, and backup and

recovery processes to ensure the integrity and availability of the data.

4. Risk-Based Approach to Data Governance: Adopt a risk-based approach to data

governance, considering the criticality of the data and the potential risks associated with

its modification or deletion. This involves identifying and assessing the risks to data

integrity, prioritizing resources and controls based on the level of risk, and

implementing appropriate measures to mitigate those risks.

5. Regular Data Validation and Revalidation: Establish a process for regular data

validation and revalidation to ensure ongoing compliance with data integrity

requirements. This includes conducting periodic assessments of the data storage

locations, reviewing, and updating validation procedures as needed, and performing

revalidation activities when changes or upgrades are made to the storage systems.

4.1.5.2 Defining system access privileges.

To ensure strict control over data access and enhance data security, identify vulnerable points,

and mitigate the risks associated with unauthorized data access, the following measures should

be implemented (Shafiei et al., 2015):

1. Authorization and Role-Based Access Control: Define clear roles and permissions for

administrators and content record personnel. Implement a role-based access control

(RBAC) system where access to data is granted based on the specific tasks and

responsibilities assigned to each individual. This approach ensures that only authorized

personnel have access to data and limits access to sensitive information on a need-to-

know basis.

2. Permissions Management Policy: Establish a permissions management policy that

outlines the criteria and procedures for granting and revoking access privileges. Access

should be granted on a need-to-know basis, considering the specific job functions

and responsibilities of individuals. Regular reviews should be conducted to ensure that

access permissions are aligned with the current roles and responsibilities of personnel.

3. User Authentication and Strong Password Policies: Implement robust user

authentication mechanisms to verify the identity of individuals accessing the data. This

can include the use of strong passwords, multi-factor authentication, and periodic

password updates. Enforce password policies that require the use of complex passwords

and discourage password sharing or reuse.
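
The role-based access control and separation of duties described in this section can be sketched as follows. The role names, permission names, and functions (`assign_roles`, `is_allowed`) are hypothetical and chosen only for illustration; the sketch is not a description of any specific system and omits authentication entirely.

```python
# Hypothetical roles and permissions, for illustration only.
ROLE_PERMISSIONS = {
    "system_admin": {"manage_accounts", "configure_system"},
    "analyst": {"create_record", "read_record"},
    "reviewer": {"read_record", "approve_record"},
}

# Roles responsible for record content; an administrator should hold none of them.
CONTENT_ROLES = {"analyst", "reviewer"}


def assign_roles(registry, user, roles):
    """Store a user's roles, rejecting unknown roles and admin/content combinations."""
    roles = set(roles)
    unknown = roles - set(ROLE_PERMISSIONS)
    if unknown:
        raise ValueError(f"unknown roles: {sorted(unknown)}")
    if "system_admin" in roles and roles & CONTENT_ROLES:
        raise ValueError("system_admin cannot be combined with a content role")
    registry[user] = roles


def is_allowed(registry, user, action):
    """Grant an action only if one of the user's roles includes it."""
    return any(action in ROLE_PERMISSIONS[role] for role in registry.get(user, ()))
```

The `registry` dictionary also serves as the list of authorized individuals, so a periodic access review can be run against it.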

4.1.5.3 Back up.

Data backup is a critical aspect of data integrity and plays a crucial role in ensuring data safety

and reliability. Here are some important considerations related to data backups and archives to

ensure data integrity (Shafiei et al., 2015):

1. Comprehensive Backup Strategy: The backup files should include all data from the

original record, along with its associated metadata. It is essential to ensure that the

backup process captures accurate and complete data to maintain data integrity. The

backups should be performed on a regular basis, ideally daily, to minimize the risk of

data loss.

2. Data Protection: Backup files must be protected from loss, erasure, or alteration.

Implementing appropriate data protection measures, such as secure storage, encryption,

and access controls, helps prevent unauthorized access and maintains the integrity of

the backup data. Regular validation of the backup process is essential to ensure the

effectiveness and reliability of the backups.

3. Retrieval of Backup Information: During internal audits or in case of data discrepancies,

the ability to retrieve backup information is crucial. It facilitates the verification of data

accuracy, completeness, and consistency. The backup data serves as a reference point

for validating and cross-checking the integrity of the primary data.

4. Electronic Archives: Electronic archives, where long-term storage of data is maintained,

should be validated, secured, and controlled throughout the entire data lifecycle. This

ensures that the archived data remains intact and accessible over time. Adequate

controls should be in place to prevent unauthorized modifications or deletions and to

ensure the integrity of the archived data.
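
The points above on capturing metadata with the backup and on validating the backup can be sketched with a checksum manifest. This is a simplified illustration under assumptions: the function names (`back_up`, `verify_backup`), the manifest file layout, and the metadata keys are hypothetical, and a production backup would also need access controls and off-site copies.

```python
import hashlib
import json
import shutil
from pathlib import Path


def sha256_of(path):
    """Checksum of a file, read in chunks so large files are handled."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source, backup_dir, metadata):
    """Copy a record to the backup location and write a manifest with its metadata."""
    source, backup_dir = Path(source), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, backup_dir / source.name)
    manifest = {"file": source.name, "sha256": sha256_of(source), "metadata": metadata}
    manifest_path = backup_dir / (source.name + ".manifest.json")
    manifest_path.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return manifest_path


def verify_backup(manifest_path):
    """Return True only if the backed-up file still matches the recorded checksum."""
    manifest_path = Path(manifest_path)
    manifest = json.loads(manifest_path.read_text())
    return sha256_of(manifest_path.parent / manifest["file"]) == manifest["sha256"]
```

Running `verify_backup` on a schedule is one way to obtain evidence that the backups remain complete and unaltered.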

4.1.5.4 Training staff and raising awareness.

Training and educating staff members about data security policies and data integrity guidelines

are crucial for ensuring the successful implementation of data security processes. Here are some

important points to consider (Shafiei et al., 2015):

1. Staff Training: Staff members should receive comprehensive training on data integrity

guidelines, data management practices, and the importance of data consistency and

reliability. This training should cover topics such as data security measures, proper data

handling procedures, access controls, and the consequences of data integrity lapses. By

equipping employees with the necessary knowledge and skills, organizations can

promote a culture of data integrity.

2. Accountability: Individuals responsible for data integrity lapses should be identified and

appropriately addressed. This may involve removing them from positions where they

could influence current Good Manufacturing Practice (cGMP) or drug application data

to prevent any potential compromise of data integrity. Holding individuals accountable

for their actions reinforces the importance of data integrity and sends a clear message

that non-compliance will not be tolerated.

3. Establishing a Culture of Integrity: Creating a culture of integrity and data management

within the organization is essential. This involves fostering an environment where

employees understand the significance of data integrity, take ownership of their

responsibilities, and actively contribute to maintaining data consistency and reliability.

Leadership should promote and prioritize data integrity as a core value, setting an

example for others to follow.

4. Incentives for Compliance: Offering incentives can be an effective way to encourage

employees and pharmaceutical companies to comply with data integrity regulations.

Incentives, such as recognition or rewards, can be provided for periods of time without

any data integrity issues or for implementing effective data security measures. These

incentives can motivate employees and organizations to maintain high standards of data

integrity (Yang et al., 2010).

4.1.5.5 Third party auditor.

The FDA (2018) emphasizes the importance of engaging a third-party auditor to conduct an

impartial evaluation of data integrity practices within organizations. This independent

assessment enables the identification of any existing gaps or areas for improvement,

empowering organizations to proactively address issues and implement necessary corrective

actions (FDA, 2018).

Furthermore, the implementation of mechanisms like anonymous reporting systems can foster

an environment where employees feel comfortable reporting suspected data integrity breaches

without fear of retaliation (FDA, 2018). These reporting systems encourage transparency and

facilitate the identification and resolution of data integrity concerns within the organization

(FDA, 2018; Shafiei et al., 2015).

Data governance plays a crucial role in overseeing data integrity practices and ensuring

adherence to established policies and procedures (FDA, 2018; Shafiei et al., 2015). By

establishing robust data governance frameworks, organizations can effectively monitor and

regulate data integrity throughout its lifecycle (FDA, 2018; Shafiei et al., 2015).

Improving quality oversight is another essential aspect of addressing data integrity issues. This

can involve implementing enhanced documentation practices, increasing staff training on data

integrity guidelines and best practices, and refining existing processes to better align with data

integrity requirements (Shafiei et al., 2015).

4.1.5.6 Technology and automation.

The integration of technology and automation in existing systems has proven to be instrumental

in enhancing compliance with data integrity regulations, as highlighted by Shafiei et al. (2015).

Automating data collection processes has been shown to minimize incorrect entries, ensure the

completion and verification of all fields in patient information, and achieve a comprehensive

data set with 100% completeness (Charoo et al., 2023).
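
One way automated collection can enforce complete records is to reject any entry with an empty required field at the point of capture. The sketch below illustrates this idea only; the field names and the functions `missing_fields` and `accept` are hypothetical and are not drawn from the cited studies.

```python
# Hypothetical required fields for one captured record.
REQUIRED_FIELDS = ("sample_id", "batch_number", "result", "recorded_by", "recorded_at")


def missing_fields(record, required=REQUIRED_FIELDS):
    """Names of required fields that are absent or empty in the record."""
    return [name for name in required if record.get(name) in (None, "")]


def accept(record, required=REQUIRED_FIELDS):
    """Accept a record only when every required field has been filled in."""
    missing = missing_fields(record, required)
    if missing:
        raise ValueError(f"record rejected, missing fields: {missing}")
    return record
```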

In 2003, the FDA introduced the concepts of quality by design (QbD) and process analytical

technology (PAT) with the aim of incorporating quality into the product manufacturing process

from its inception (FDA, 2004). PAT tools enable real-time measurement of critical process

parameters, facilitating timely product release for commercial distribution (EMEA, 2010).

Apart from providing competitive advantages such as low rejection rates, cost-effective

analysis, and high yield, these tools also possess robust data integrity features (Spivey, 2022).

A recent proposal by Floryanzia et al. (2022) introduces the Computer Vision for Disintegration

(CVD) system, which can be employed alongside traditional tablet disintegration testers to

monitor tablet pieces and differentiate them from the surrounding liquid. By utilizing machine

learning models to analyze and interpret data captured by cameras, the CVD system offers high

efficiency and improves our understanding of the disintegration mechanism (Floryanzia et al.,

2022; Vamathevan et al., 2019).

To enhance data security and integrity, it is essential to upgrade computer systems and software,

minimizing potential security risks (Shafiei et al., 2015). Regular software updates should be

ensured, and robust access controls and permissions should be implemented (Shafiei et al.,

2015). Utilizing advanced technologies like data encryption, firewalls, and intrusion detection

systems can provide an additional layer of protection against data breaches (Shafiei et al., 2015).

4.1.6 Opportunities for the future

As the landscape of drug development evolves with the introduction of newer modalities,

questions arise regarding the suitability of legacy informatics and hardware systems designed for

small-molecule drug development (Weiss, 2022). Furthermore, the emergence of mid-market

biotechnology companies with substantial purchasing power and little to no legacy

infrastructure presents an opportunity for a more innovative and disruptive approach to

traditional informatics (Weiss, 2022).

The adoption of these newer technologies is driven by their ability to embrace a holistic

integrated approach to BioPharma Lifecycle Management (Weiss, 2022). This approach entails

leveraging out-of-the-box workflow execution, preconfigured system and hardware

integrations, contextualized data stores built on FAIR (Findable, Accessible, Interoperable, Reusable) principles, and integrated analytics to

drive business intelligence through a unified digital platform (Weiss, 2022). The successful

adoption of these technologies will depend on their ease of implementation, immediate business

benefits, cost-effectiveness, and scalability in supporting data lifecycle management to

accelerate Industry 4.0 initiatives (Weiss, 2022).

4.2 Review the existing data governance models and their applicability in the

pharmaceutical context.

Data governance plays a pivotal role in the pharmaceutical industry, which generates vast

amounts of data throughout the drug development lifecycle, clinical trials, and post-market

surveillance (Khin et al., 2020; Roe, 2021; Truong et al., 2017; Weiss, 2022). Effective data

governance models are essential for managing this data, ensuring its quality, and adhering to

regulatory requirements (FDA, 2016). Currently there are three prominent models: centralized,

federated, and hybrid (Abu-Elkheir et al., 2013; Al-Ruithe et al., 2019; Dehghani, 2022; Ladley,

2019; Weber et al., 2009):

4.2.1 Centralized Data Governance Model

The centralized data governance model establishes a single, centralized authority responsible

for governing and managing data across the entire pharmaceutical organization (Weber et al.,

2009). Under this model, a central team defines data standards, policies, and procedures,

ensuring consistency and uniformity throughout the organization (Ladley, 2019). This approach

enables better control over data quality, integrity, and security (Ladley, 2019). However, it may

face challenges in scalability, flexibility, and accommodating diverse stakeholder needs

(Dehghani, 2022; Machado et al., 2022; Weber et al., 2009).

Based on the description above, it could be concluded that the centralized model suits

pharmaceutical organizations with a strong need for standardization, regulatory compliance,

and centralized decision-making. It is particularly useful for managing sensitive data, such as

patient records, intellectual property, and clinical trial data. Moreover, this model aids in

maintaining data integrity, facilitating auditing, and enforcing security protocols.

4.2.2 Federated Data Governance Model

The federated data governance model distributes data governance responsibilities across

various business units, departments, or subsidiaries within the pharmaceutical organization

(Abu-Elkheir et al., 2013; Nadal et al., 2023). This model promotes localized decision-making,

allowing each unit to tailor data governance practices according to their specific needs (Abu-

Elkheir et al., 2013; Dehghani, 2022; Nadal et al., 2023). The federated approach encourages

collaboration, flexibility, and responsiveness to local regulatory requirements (Abu-Elkheir et

al., 2013; Nadal et al., 2023). However, ensuring consistency and coordination across diverse

entities can be challenging, potentially leading to data silos and inconsistent data practices

(Nadal et al., 2023).

Based on the description above, it could be concluded that the federated model is well-suited

for large pharmaceutical organizations with diverse business units or subsidiaries operating in

different regions. It allows for localized decision-making, accommodating regional regulations,

and fostering collaboration among stakeholders. This model supports agility and responsiveness

to local market dynamics, ensuring compliance while catering to varying data needs across

different entities.

4.2.3 Hybrid Data Governance Model

The hybrid data governance model combines elements of both centralized and federated models

(Al-Ruithe et al., 2019). It seeks to strike a balance between centralized control and local

autonomy (Al-Ruithe et al., 2019). In this model, core data governance principles and standards

are established centrally, ensuring consistency and adherence to global policies (Al-Ruithe et

al., 2019). Simultaneously, local units have the flexibility to adapt and extend these standards

to cater to their unique requirements (Al-Ruithe et al., 2019). The hybrid model allows for

efficient data management while accommodating local nuances (Al-Ruithe et al., 2019).

However, it demands careful coordination and communication to maintain alignment across

different governance layers (Al-Ruithe et al., 2019).

Based on the description above, it could be concluded that the hybrid model is beneficial for

pharmaceutical organizations that seek a balance between centralized control and local

flexibility. It enables the establishment of core data governance principles while empowering

local units to adapt to their specific requirements. The hybrid model ensures consistency in

critical areas while accommodating customization and innovation at the local level.

4.2.4 Data Governance in Cloud Computing environments

In today's world, aside from centralized, federated, and hybrid data governance models, it is also

vital to consider challenges in data governance within cloud computing environments. The

following section is based entirely on an article from Harvard Business School (Iansiti et

al., 2021) that explores the case of Moderna. This article highlights the utilization of cloud

computing and digital technologies by Moderna to enhance its operations and sheds light on

the challenges in the context of cloud computing.

4.2.4.1 Cloud Computing in Pharmaceuticals

1. Modernizing Operations through Cloud Computing and Digital Technologies

Iansiti et al. (2021) share that according to Moderna's Chief Digital Officer, Damiani, operating

in the cloud is a pivotal decision that brings about numerous benefits, including increased

security, cost-efficiency, agility, resilience, and disaster recovery capabilities. Damiani states,

"Operating in the cloud rather than building our own infrastructure was foundational to

everything else we did. It was the first decision we made" (Iansiti et al., 2021).

Since 2013, Moderna has been utilizing Amazon Web Services (AWS) and has continued to

expand its partnership with the platform over time. This strategic collaboration with AWS has

played a vital role in Moderna's digital transformation journey, providing them with the

necessary infrastructure and tools to bolster their operations (Iansiti et al., 2021).

Figure 5, presented below, illustrates Moderna's digitization building blocks, showcasing the

various components and elements that contribute to the company's digital strategy and

implementation. It serves as a visual representation of Moderna's efforts in harnessing cloud

computing and digital technologies to optimize their operations (Iansiti et al., 2021).

Figure 5 - Moderna’s Digitization Building Blocks (Iansiti et al., 2021)

2. Data Integration, Automation, and AI in Moderna's Digital Transformation

Iansiti et al. (2021) also highlight that another fundamental principle embraced by Moderna is

the concept of data integration. Damiani and other executives recognize the detrimental effects

of siloed data on efficiency and productivity within the organization. To address this challenge,

Moderna strives to harmonize data across systems, aiming to enter it once and enable its

seamless flow to relevant teams. Furthermore, the company extends data integration to connect

laboratory instruments through the Internet of Things (IoT), thereby enhancing overall data

integration capabilities (Iansiti et al., 2021).

Automation and robotics play a significant role in Moderna's digital transformation journey.

However, Damiani emphasizes the importance of process maturity before embarking on

extensive automation initiatives. Rather than automating everything at once, Moderna adopts

an incremental approach, gradually automating error-prone manual activities. These automated

islands are then integrated into a cohesive whole, ensuring a cautious and measured approach

to automation (Iansiti et al., 2021).

Analytics and artificial intelligence (AI) are integral components of Moderna's digitization

strategy. Damiani views AI as the "holy grail" and recognizes the value of digitization in

generating structured data, which is crucial for developing algorithms that support the creation

of next-generation medications. In his words, "We relied on digitization early

on, not for the sake of digitization but for generating data. Today, we have a lot of structured

data, for instance in research and pre-clinical production. When we run experiments, we collect

even more data. This allows us to build better algorithms, which helps build the next generation

of medication. It’s a virtuous cycle" (Iansiti et al., 2021).

Figure 6 - Digital Integration at Moderna (Iansiti et al., 2021).

Figure 6, presented above, illustrates the digital integration efforts at Moderna, showcasing the

interconnectedness of various digital components within the organization (Iansiti et al., 2021).

It provides a visual representation of Moderna's commitment to leveraging digital technologies

to enhance its operations (Iansiti et al., 2021).

3. Sourcing Digital Solutions and Adoption of Technologies at Moderna

Iansiti et al. (2021) show that Moderna adopts a mixed approach when it comes to sourcing

digital solutions. For undifferentiated processes like finance and HR, the company leverages

off-the-shelf Software as a Service (SaaS) tools. It is estimated that approximately 85% of the

tools used for such processes are existing SaaS solutions. However, for company-specific

processes and innovation, such as research and technical development, Moderna relies on

custom-built tools tailored to its specific needs (Iansiti et al., 2021).

Different functions within Moderna exhibit varying levels of adoption when it comes to digital

and AI technologies, with pre-clinical activities leading the way. Despite this variation,

Damiani, Moderna's Chief Digital Officer, expresses confidence in the organization's progress

and estimates that they have achieved approximately 60% to 70% of his vision for digital

transformation. Automation has played a crucial role in reducing cycle times, enabling Moderna

to operate at a faster pace than other pharmaceutical and biotech companies (Iansiti et

al., 2021).

According to Dave Johnson, the head of Informatics, Data Science, and AI at Moderna, artificial

intelligence (AI) provides the company with a competitive advantage by supporting decision-

making processes and enabling predictions that would be unattainable for humans within a

reasonable timeframe. This scalability empowers Moderna to accelerate its operations and gain

a competitive edge in the industry (Iansiti et al., 2021).

Although Moderna's success in cloud environments is inspiring, it should be noted that

cloud data governance still poses significant and complex challenges.

4.2.4.2 Challenges in cloud data governance

Al-Ruithe & Benkhelifa (2017) observe that the transition to cloud computing environments

presents numerous challenges for many organizations. As a result, data governance strategies

need to adapt in terms of structure, human resources, technology, procedures, roles, and

responsibilities (Al-Ruithe & Benkhelifa, 2017). Although cloud computing offers various

benefits such as cost efficiency, unlimited storage, backup and recovery capabilities, automatic

software integration, easy data access, quick deployment, scalability, and the availability of

new services, its widespread use is hindered by several concerns (Ko et al., 2011).

One of the most significant concerns is related to security and privacy issues, with 41% of these

concerns attributed to governance and legal issues (Khanghahi & Ravanmehr, 2013). Therefore,

data governance plays a critical role in successful cloud governance. The barriers to

implementing a cloud data governance strategy can be classified into technological,

organizational, legal, policy, financial, and knowledge-related factors (Al-Ruithe & Benkhelifa,

2017). Another constraint is the business value of data governance, which necessitates the

development of a charter for a data governance program, including its mission and vision (Al-

Ruithe & Benkhelifa, 2017). Poor communication between staff and other stakeholders is

identified as a reason for failure in data governance programs within organizations (Ladley,

2019), emphasizing the need for a well-defined communication plan for data governance to

ensure successful implementation in the cloud context (Morabito, 2015).

In the cloud landscape, the absence of cloud regulations and the lack of cloud data governance

requirements are two environmental factors that affect the implementation of data governance

(Al-Ruithe & Benkhelifa, 2017). While some organizations attempt to establish their own cloud

data governance, barriers such as limited knowledge and financial resources hinder their

progress (Khanghahi & Ravanmehr, 2013; Mary et al., 2011; Self, 2014). Additionally,

organizational challenges like general culture and mindset, leading to resistance to change, pose

further obstacles (Al-Ruithe & Benkhelifa, 2017). The complexity of implementing cloud data

governance arises from the involvement of external stakeholders and the need for innovative

tools and architectures to address regulatory and disciplinary requirements (Tountopoulos et

al., 2014). Insufficient and inadequate technologies can have a negative impact on the feasibility

of cloud data governance initiatives (Tountopoulos et al., 2014).

In addition, 2023 is witnessing an exponential growth of SaaS solutions, commonly referred to

as the modern data stack. “The modern data stack is a combination of various software tools

used to collect, process, and store data on a well-integrated cloud-based data platform. It is

known to have benefits in handling data due to its robustness, speed, and scalability” (Chia,

2023). How easily pharmaceutical companies can navigate the modern data stack depends on

several factors, including their existing infrastructure, data complexity, regulatory

requirements, and the level of expertise within their organization (Chia, 2023). However, the

complexity and variety of available tools and technologies can make it difficult to determine

which combination best fits their specific needs, as Figure 7 below illustrates:

Figure 7 - State of Data Engineering Map 2022 (Horner, 2023)

Horner (2023) states that while the modern data stack represents a substantial advancement

compared to the conventional practice of manually coding data pipelines using outdated tools,

it has also been criticized for falling short of its stated benefits in numerous respects, such

as tool sprawl, procurement and billing issues, high cost of ownership, lengthy setup, integration,

and maintenance, a disjointed user experience, knowledge silos, and staffing challenges.

To handle the complexity of the cloud environment, Al-Ruithe et al. (2016) propose critical

success factors for cloud data governance, which are examined in detail below:

4.2.4.3 Critical Success Factors for Cloud Data Governance

Cloud data governance success can be attributed to two key aspects: organizational and

technological factors (Al-Ruithe & Benkhelifa, 2017; Mary et al., 2011; Self, 2014).

Organizational factors encompass clarity of roles and responsibilities, alignment of business

and IT, executive sponsorship, and the establishment of a data governance center of excellence

(Al-Ruithe & Benkhelifa, 2017; Murray, 2023). On the other hand, technological factors

revolve around automating the data integration lifecycle to support data governance objectives

(Khanghahi & Ravanmehr, 2013).

A study identified ten data governance success factors, including strategic accountability,

standards, addressing managerial blind spots, recognizing the complexities of data, cross-

divisional collaboration, data quality metrics, partnership with other companies, strategic points

of control, training and awareness for data stakeholders, and compliance monitoring (Cheong

& Chang, 2007).

Another article suggests the development of data governance guidelines, principles, policies,

organizational structures, and procedures to support the data governance framework (Rifaie et

al., 2009). The authors highlight three essential elements: structure, process, and

communication. Structure involves defining decision-makers, organizing the structure, and

outlining the roles and responsibilities of each member (Rifaie et al., 2009). Process

encompasses decision-making processes for data assets, reviewing and approving data-related

investments (Rifaie et al., 2009). Communication involves communicating the results of

monitoring and measurement of these processes and decisions, as well as the mechanisms for

communicating data investment decisions to stakeholders (Rifaie et al., 2009).

Successful cloud data governance requires the collaboration of diverse expertise from various

departments within the organization to achieve consistency, transparency, repeatability of

processes, accountability, scope and focus maintenance, KPI measurement, compliance and

legal support, and improvement of income opportunities and customer/partner relationships

(Rifaie et al., 2009).

Addressing security and compliance in a cloud environment is crucial for the development of a

robust cloud data governance program (Rebollo et al., 2015). Cloud computing's disruptive

nature necessitates the implementation of comprehensive data governance strategies that may

vary depending on the deployment or delivery model (Al-Ruithe et al., 2016). Involving cloud

actors as integral stakeholders is vital for successful cloud data governance (Felici et al., 2013).

However, the complexity of legal contracts between cloud actors often poses challenges for

ordinary cloud consumers to understand (Dogo et al., 2013). Therefore, aligning the data

governance strategy with cloud computing regulations is essential to incorporate all relevant

requirements into the service level agreement when adopting cloud computing services (Badger

et al., 2012).

Figure 8 - A conceptual Cloud Data Governance Critical Success Factors Identification

Framework (Al-Ruithe & Benkhelifa, 2017)

Figure 8, developed by Al-Ruithe & Benkhelifa (2017), presents a simplified conceptual

framework for critical success factors in cloud data governance. The framework comprises four

dimensions (Al-Ruithe & Benkhelifa, 2017):

1. Cloud data governance strategy formulation: This dimension focuses on developing a

cloud data governance strategy that considers the unique challenges and requirements

of the organization.

2. Cloud data governance CSFs: This dimension identifies the critical success factors

crucial for successfully implementing the cloud data governance strategy. These factors

may encompass technological, environmental, organizational, and other relevant

considerations.

3. Cloud data governance CSFs evaluation: This dimension involves assessing and

evaluating the identified CSFs to determine their importance and potential impact on

implementing the cloud data governance strategy. This evaluation helps prioritize and

allocate resources effectively.

4. Cloud data governance strategy implementation: The final dimension pertains to

executing and operationalizing the cloud data governance strategy based on the

identified CSFs. This includes implementing policies, procedures, and controls to

ensure effective data governance in the cloud environment.

By following this framework, organizations can enhance their understanding of critical factors

contributing to successful cloud data governance and make informed decisions throughout the

implementation process (Al-Ruithe & Benkhelifa, 2017).
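
The evaluation dimension of the framework can be illustrated with a short sketch. The framework itself does not prescribe a scoring method; the example below simply assumes that each CSF is rated on two hypothetical 1-to-5 scales (importance and impact) and ranks the factors by the product of the two. The factor names are taken from the organizational and technological factors discussed above, while the ratings are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass
class CSF:
    name: str
    importance: int  # hypothetical rating, 1 (low) to 5 (high)
    impact: int      # hypothetical rating, 1 (low) to 5 (high)


def prioritize(csfs):
    """Rank CSFs by importance x impact, highest first; ties keep input order."""
    return sorted(csfs, key=lambda c: c.importance * c.impact, reverse=True)


# Illustrative ratings only; they are assumptions, not survey or literature results.
candidates = [
    CSF("executive sponsorship", 5, 4),
    CSF("automated data integration", 3, 4),
    CSF("clear roles and responsibilities", 5, 5),
]
ranked = [c.name for c in prioritize(candidates)]
```

A ranking of this kind only supports the prioritization step; how the ratings are collected and weighted would remain a decision for each organization.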

4.2.5 Challenges and Considerations in general

Implementing any data governance model in the pharmaceutical context poses a number of

challenges. These include ensuring data privacy and security, aligning with evolving regulatory

landscapes, managing data across diverse systems and stakeholders, and fostering a culture of

data governance throughout the organization (Khin et al., 2020; Panian, 2010; Truong et al.,

2017). Adequate technological infrastructure, strong governance frameworks, and continuous

communication and collaboration among stakeholders are essential for successful

implementation of data governance models in the pharmaceutical context (Al-Ruithe et al.,

2016; Cheong & Chang, 2007; Rifaie et al., 2009; Weber et al., 2009). Additionally,

organizations must consider the scalability of their chosen model to accommodate future

growth, the integration of emerging technologies such as artificial intelligence and machine

learning, and the ability to adapt to evolving data governance best practices (Levy, 2021;

Marshall & Wallace, 2019; Vamathevan et al., 2019).

4.3 Critical success factors (CSFs) for Data Governance in general

Data governance has become increasingly important in modern business operations as

organizations recognize the value of data (Khatri & Brown, 2010). With the growing volume

of data used within organizations, the effective management of data has become critical for

successful business operations (Tallon et al., 2013). Data plays a crucial role in both operational

and strategic decision-making processes (Tallon et al., 2013). Establishing trust in data is of

paramount importance, as a lack of trust can mean that a significant share of time (50%) is wasted “hunting

for data” (Redman, 2013). Conversely, when data is trusted, it is more likely to be shared,

resulting in higher returns on data investments (Otto, 2015).

In a study conducted by Alhassan et al. in 2019, seven critical success factors (CSFs) for data

governance were identified and ranked based on their perceived importance (Table 1). The

study highlights the significance of understanding the relationships and interconnectedness

among these CSFs to ensure effective data governance.

Table 1 - CSFs for Data Governance (Alhassan et al., 2019)

4.3.1 Employee data competencies

The competencies of employees directly influence their ability to define, implement, and

monitor data processes, procedures, policies, and requirements, as well as their capability to

handle data governance activities (Alhassan et al., 2019). These competencies are crucial for

top managers in establishing an overall data governance strategy and treating data as a strategic

asset (Alhassan et al., 2019). It is also essential for employees to possess the necessary

capabilities and awareness to handle data entry and access, ensuring the integrity and security

of the organization's data (Alhassan et al., 2019). To ensure appropriate employee data

competencies, continuous training is recommended for implementing data policies and

processes (Alhassan et al., 2019). This training should be conducted both internally and

externally to increase employees' awareness of the importance of data accuracy and the secure

handling of sensitive information (Alhassan et al., 2019).

4.3.2 Clear data processes and procedures

Data processes and procedures are a major focus in data governance, ensuring the reliable and

effective flow and use of data (DAMA International, 2009). While it is essential to define,

implement, and monitor data processes and procedures across all aspects of data management,

specific emphasis is often placed on areas such as data quality, data access, and data recording

and storage (Alhassan et al., 2019). The absence of clear and well-defined data processes and

procedures can raise doubts about the reliability of data, which can be attributed to factors such

as undefined procedures or missing components like data testing (Alhassan et al., 2019).

To achieve effective data processes and procedures, it is recommended to embed them into the

system itself (Koh et al., 2011). This can be done through the inclusion of mandatory fields,

validation methods, and data flow requirements that guide users in adhering to established

processes (Alhassan et al., 2019). Regular checks and updates of existing processes and

procedures are also essential to ensure their ongoing relevance and effectiveness (Alhassan et

al., 2019).
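
To make the idea of embedding processes into the system more concrete, the following minimal sketch shows how mandatory fields and validation methods might be enforced at data entry. It is an illustration under stated assumptions, not a method taken from the cited studies: the field names (batch_id, operator, recorded_at, quantity_kg) and the single validation rule are hypothetical.

```python
from typing import Callable

# Hypothetical mandatory fields for a batch record; names are illustrative only.
MANDATORY_FIELDS = ("batch_id", "operator", "recorded_at")


def is_positive_number(value: object) -> bool:
    """Accept ints and floats greater than zero; reject booleans and other types."""
    return (
        isinstance(value, (int, float))
        and not isinstance(value, bool)
        and value > 0
    )


# Hypothetical validation methods, keyed by the field they apply to.
VALIDATORS: dict[str, Callable[[object], bool]] = {
    "quantity_kg": is_positive_number,
}


def validate_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    # Mandatory fields must be present and non-empty.
    for name in MANDATORY_FIELDS:
        if record.get(name) in (None, ""):
            problems.append(f"missing mandatory field: {name}")
    # Validation rules apply only to fields that were actually supplied.
    for name, check in VALIDATORS.items():
        if name in record and not check(record[name]):
            problems.append(f"invalid value for {name}: {record[name]!r}")
    return problems
```

A system that rejects or flags a record whenever this list is non-empty guides users toward the established process at the point of entry, rather than relying on later correction.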

4.3.3 Flexible data tools and technologies

Flexible data tools and technologies have a noteworthy impact on various critical success

factors (CSFs) in data governance, including the establishment of standardized and easily

understandable data policies, clear data processes and procedures, and the implementation of

inclusive data requirements (Alhassan et al., 2019). By leveraging flexible data tools and

technologies, organizations can embed these policies and processes into suitable systems,

ensuring proper formatting and enforcement, such as making specific fields mandatory or

employing automated validation methods (Alhassan et al., 2019). The availability of robust data

tools and technologies further enables the successful implementation of other CSFs (Alhassan

et al., 2019).

To effectively address flexible data tools and technologies, it is essential to have the appropriate

IT infrastructure and integrated data environment (Alhassan et al., 2019; Mary et al., 2011;

Tallon et al., 2013). This may involve the adoption of advanced technologies for data

integration, allowing for automated data validation and seamless data flow (Alhassan et al.,

2019). Additionally, thorough testing procedures should be implemented to ensure the

reliability and flexibility of systems, with consideration given to accommodating future changes

(Alhassan et al., 2019). Data privacy and availability should also be considered when

integrating internal and external systems, ensuring compliance with relevant regulations and

safeguards (Alhassan et al., 2019).

4.3.4 Standardized easy-to-follow data policies

The establishment of standardized and easily understandable data policies is crucial for

providing high-level guidelines and rules for handling data within an organization (Alhassan et

al., 2019). When specific data lacks well-defined policies, it can create uncertainty among

employees and hinder decision-making processes due to a lack of understanding on how the

data should be processed (Alhassan et al., 2019).

Furthermore, accessing unnecessary data that compromises privacy can have a negative impact

on business performance (Alhassan et al., 2019). The study suggests that data policy documents

should follow a specific template, keeping them basic and up to date to ensure employees

comprehend and value the importance of adhering to the guidelines (Alhassan et al., 2019).

However, merely having defined data policies is not sufficient for effective data governance. It

is highly recommended to implement these policies by embedding them into systems with

mandatory fields and validation methods (Alhassan et al., 2019). This integration ensures that

data is handled in accordance with the established policies and facilitates compliance (Alhassan

et al., 2019). Additionally, regular monitoring and periodic updates of data policies through

audits are essential to ensure their continued relevance and effectiveness (Alhassan et al., 2019).

4.3.5 Established data roles and responsibilities

Clearly defined data roles and responsibilities are essential in data governance to identify

individuals accountable for various data-related activities within the organization (Khatri &

Brown, 2010). These roles include responsibilities such as defining data policies and processes,

ensuring data quality and integrity, managing data access and security, and overseeing data

governance initiatives (Khatri & Brown, 2010).

Without well-defined roles and responsibilities, even with good processes in place, there is a

risk of errors and confusion in data management (Alhassan et al., 2019). Assigning specific

responsibilities to individuals or teams helps establish accountability and ensures that the

necessary actions are taken to maintain data accuracy, consistency, and compliance with

policies and regulations (Alhassan et al., 2019; Khatri & Brown, 2010).

By clearly delineating data roles and responsibilities, organizations can improve coordination

and collaboration among stakeholders, enhance data governance practices, and mitigate the risk

of data-related issues (Khatri & Brown, 2010). It is important to regularly review and update

these roles and responsibilities to adapt to evolving business needs and technological

advancements in data management (Khatri & Brown, 2010).

4.3.6 Clear inclusive data requirements

Clear and inclusive data requirements play a crucial role in the successful implementation of

data governance practices (Alhassan et al., 2019). These requirements define various aspects of

data implementation, including data flows, integration, mandatory fields, and validation

methods (Alhassan et al., 2019).

Business owners have a significant responsibility in understanding and articulating data

requirements effectively to ensure a clear understanding of data needs by the IT team (Alhassan

et al., 2019). This requires a strong understanding of the business processes and objectives that

rely on data, as well as the ability to communicate those requirements in a formal and detailed

manner (Alhassan et al., 2019).

Effective communication between data owners and implementers is essential to ensure that data

requirements are properly translated into IT systems and processes (Alhassan et al., 2019). This

includes specifying the data flow, identifying mandatory fields that need to be captured, and

defining validation methods to ensure data accuracy and consistency (Alhassan et al., 2019).

To address data requirements comprehensively, it is crucial to consider the expertise and

competencies of employees involved in data governance activities (Alhassan et al., 2019).

Employees with strong data competencies can contribute significantly to understanding and

defining the right data requirements for the organization (Alhassan et al., 2019).

4.3.7 Focused and tangible data strategies

Developing focused and tangible data strategies that align with organizational goals is essential

for achieving success in data governance (Alhassan et al., 2019). These data strategies should

encompass the decision domain of data principles and provide a framework for guiding data-

related activities (Alhassan et al., 2019).

To ensure the effectiveness of data strategies, it is essential to consider both short-term and

long-term objectives (Alhassan et al., 2019). Short-term objectives help address immediate data

governance needs, while long-term objectives provide a roadmap for future data management

initiatives (Alhassan et al., 2019).

Recognizing data as strategic assets is a fundamental aspect of data governance (Alhassan et

al., 2019). Organizations should acknowledge the value of data in driving business outcomes

and treat it as a critical resource (Alhassan et al., 2019). Assigning a top management committee

for data governance can provide the necessary leadership and oversight to support the

development and implementation of data strategies (Alhassan et al., 2019). This committee

should have a clear mandate to define data governance goals, establish policies, allocate

resources, and monitor the progress of data initiatives (Alhassan et al., 2019).

5 Research Methodology

5.1 The conceptual model guiding the thesis

The research methodology applied in this study adopts a systematic approach to investigate and

understand the data governance models prevalent in the pharmaceutical sector. The conceptual

model guiding this thesis is based on a quantitative research design, utilizing surveys conducted

with key pharmaceutical companies. These surveys will provide quantitative insights into the

adoption and prevalence of Centralized, Federated, or Hybrid data governance approaches

within the industry. Additionally, the research will assess the experiences of these organizations

in terms of change management, aiming to identify specific challenges and opportunities

encountered. By analyzing the quantitative survey responses, the study seeks to pinpoint

Critical Success Factors (CSFs) in data governance that contribute to effective implementation

within the pharmaceutical context. The collected quantitative data will enable a robust analysis that

sheds light on the most effective approaches and practices in pharmaceutical data governance,

thereby offering valuable insights for industry practitioners and researchers alike.

5.2 Research context and philosophy: The criteria for selecting participants.

The philosophy underpinning this research is rooted in a pragmatic approach, seeking to derive

practical insights from real-world practices. The selection criteria for pharmaceutical

companies in this study reflect a purposeful approach, drawing on personal connections

within the pharmaceutical industry to ensure a representative sample. Through these

networks, invitations to participate in the survey were extended to individuals with diverse roles

across various departments, such as Data, Marketing, Sales, Production, and R&D, ensuring

a multifaceted perspective.

Recognizing the sensitivity and confidentiality of strategic data governance information held

by these organizations, the survey is administered anonymously. This anonymity is essential to

foster candid and truthful responses, as respondents can freely share their experiences,

challenges, and opportunities without apprehension. By focusing on departments rather than

specific company names or individual identities, the study maintains a high level of privacy and

confidentiality. This approach enables the research to extract unbiased and authentic insights,

contributing to a more comprehensive understanding of the critical success factors in data

governance within the pharmaceutical domain.

5.3 Instruments/measures

The central tool employed in this study is a structured questionnaire. This instrument has been

carefully designed to comprehensively encompass crucial variables associated with data

governance models, change management undertakings, and the critical success factors specific

to the pharmaceutical industry. Drawing on an established theoretical foundation rooted in

literature covering data governance frameworks, organizational change management principles,

and industry best practices, the questionnaire explores various dimensions. It delves into

data governance models, particularly examining the adoption of Centralized, Federated, or

Hybrid approaches, including in cloud environments, and their alignment with the

overarching organizational strategies.

The survey is partitioned into distinct themes for an organized exploration. Topics encompass

the implementation of data governance practices within the organizational framework, the

intricate landscape of data governance in cloud computing environments, prevailing data

governance practices and methodologies, as well as the identification of challenges and

obstacles encountered during implementation. Furthermore, a specialized section is tailored to

address the intricate nuances of data governance pertinent to the pharmaceutical sector,

scrutinizing compliance with FDA regulations. This entails aspects like data retention policies,

metadata and audit trail management, contemporaneous practices, and regulated access to

computer systems, thus providing a comprehensive view of how companies align themselves

with regulatory mandates.
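
As an illustration of what audit trail management and contemporaneous recording can mean in practice, the sketch below models a simple append-only audit trail in which every change is stamped with the user, the time, and the reason. It is a minimal sketch under stated assumptions: the class and field names are hypothetical, and a real regulated system would additionally need secure storage and access control, which are not shown here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    """One immutable audit-trail record: who changed what, when, and why."""
    user: str
    field: str
    old_value: str
    new_value: str
    reason: str
    recorded_at: datetime


class AuditTrail:
    """Append-only by convention: no method is offered to edit or delete entries."""

    def __init__(self):
        self._entries = []

    def record(self, user, field, old_value, new_value, reason):
        # The timestamp is taken at the moment of recording (contemporaneous), in UTC.
        entry = AuditEntry(user, field, old_value, new_value, reason,
                           recorded_at=datetime.now(timezone.utc))
        self._entries.append(entry)
        return entry

    def entries(self):
        # Return a tuple so callers cannot modify the internal list through it.
        return tuple(self._entries)
```

Keeping the previous value alongside the new one preserves the original record, so a reviewer can reconstruct the history of a data point rather than seeing only its latest state.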

Moreover, the questionnaire inquires about change management experiences, eliciting insights

into challenges, successes, and lessons learned during the implementation of data governance

initiatives. Critical Success Factors (CSFs) are assessed through a set of structured questions

designed to capture participants' perspectives on factors contributing to the effectiveness of data

governance practices.

5.4 Data analysis approach

With 19 participants from many countries, spanning departments

including Data, R&D, Marketing, Production, Sales, Clinical, and Finance, a comprehensive

approach has been adopted (Figure 9). Acknowledging the importance of unbiased and holistic

comprehension of data governance across the entire organization, the data is treated as a

cohesive entity rather than being analyzed departmentally. This approach ensures that insights

are not skewed by department-specific perspectives, aligning with the broader strategy of clear

and comprehensive communication of data governance throughout the company.

Figure 9 – Department Names of Survey Participants (19 responses)

The amalgamated data from all participants is subjected to a comprehensive analysis aimed at

understanding the various facets of data governance. Each question within the survey is

examined based on the percentage distribution of responses. By assessing the proportion of

different responses, conclusions are drawn about distinct aspects of data governance. This

approach allows for a comprehensive portrayal of how data governance is comprehended and

operationalized within pharmaceutical companies by amassing insights from a diverse spectrum

of perspectives. Thus, the data analysis program synthesizes a comprehensive mosaic,

illuminating the intricate landscape of data governance's comprehension and implementation.

6 Discussion of Research Results combined with Literature Review

6.1 Data Governance Practices at Pharmaceuticals from Survey Results

6.1.1 Deployment of Data Governance Program

In 2023, as shown by Figure 10 below, a favorable outcome emerges: 63% of organizations

have deployed data governance programs (36% with a comprehensive

implementation and 27% with a partial approach). A further 27% report

being unaware of any data governance initiative, while the remaining 9% confirm that a data

governance program will be deployed soon.

Figure 10 – The deployment of data governance program (11 responses)

The result above shows that the establishment of a data governance model often evolves through

a pragmatic and progressive approach. This approach is frequently adopted to gradually involve

teams and departments, relying on compelling internal results and success stories. The decision

to follow a "partial" approach is often a deliberate choice to build step by step, measure

immediate outcomes, and then venture into more extensive deployments.

According to Figure 11, among the organizations that are implementing data governance

programs, approximately 30% report each of “over the past 3 years” and “less than

3 years”. Notably, around 40% of these programs have a deployment timeline of “5 to 10

years”. This pattern can be interpreted as a signal of the increasing embrace of the partial or

progressive model across different functions within an organization. It may also reflect

the significance of such a program in the pharmaceutical industry.

Figure 11 - Duration of the deployment of data governance program (10 responses)

The findings presented in Figures 12 and 13 reveal that only 50% of organizations assign the

responsibility for directing and implementing data governance programs to Data Governance

or Data Managers who dedicate their efforts entirely to the Data Governance program.

Conversely, the remaining organizations apportion these duties among individuals with other

primary responsibilities, including the Chief Information Officer (CIO), the entire IT

Department, and the Data Department, each accounting for 10% of cases. Additionally, 20% of

organizations delegate these responsibilities to individuals within the Data team, such as Data

Analysts or Data Scientists.

This outcome highlights the relatively immature state of personnel organization within Data

Governance. A likely explanation is that organizations in the early stages of

implementing this program often lack a dedicated position to oversee it and therefore rely

on existing personnel within their data or IT teams to manage the responsibilities.

Figure 12 – Responsible personnel for data governance implementation (11 responses)

Figure 13 – Dedication effort in data governance (11 responses)

A positive aspect is that all organizations acknowledge the widespread use of data across their

various departments. However, as indicated in Figure 14, the pharmaceutical industry continues

to heavily rely on a centralized data team, accounting for 70% of cases. This team is responsible

for tasks like data collection, transformation, and provisioning to all departments. Only 20% of

organizations have taken the step of assigning at least one data specialist to each department,

and a mere 10% have designated a data specialist for just some departments.

Figure 14 – Centralized or decentralized model (10 responses)

This outcome underscores the pharmaceutical industry's inclination towards a controlled

approach to data sharing and utilization with the centralized model. The centralized data

governance model establishes a single authority responsible for governing and managing data

across a pharmaceutical organization (Weber et al., 2009). It ensures consistency and control

over data quality and security but may encounter challenges in scalability, flexibility, and

meeting diverse stakeholder needs (Dehghani, 2022; Machado et al., 2022; Weber et al., 2009).

The promising sign is that a subset of organizations is starting to recognize the significance of

embedding data experts within business units to harness the value of data more efficiently.

However, this shift is still at a nascent stage.

Figure 15 – Center of Excellence (4 responses)

Figure 15 illustrates that 75% of organizations maintain a Center of Excellence staffed with

Data experts. This Center is responsible for overseeing, governing, and providing technical

assistance to all the data members in each department (Al-Ruithe & Benkhelifa, 2017; Murray,

2023). Conversely, the remaining 25% of organizations lack a centralized data team, and

consequently, there is no data sharing among departments at all. The survey results reveal a

significant shift in data management practices within the pharmaceutical sector. Notably, a

substantial percentage of organizations are adopting hybrid data governance models to balance

decentralization and centralized control. These findings highlight the industry's drive to ensure

data security, compliance, and collaboration while adapting to evolving technological

landscapes as explained by Al-Ruithe et al. (2019).

6.1.2 Data Governance in Cloud Computing environment

Cloud computing adoption in the pharmaceutical industry is predominantly driven by Cost

Savings, garnering 100% support (Figure 16). Following closely, Scalability and Flexibility,

alongside Enhance Data Security and Privacy, share the second spot with 85% agreement each.

Improved Collaboration and Accessibility, combined with the potential for Disaster Recovery

and Business Continuity, secure the third position with a robust result of 71%. This outcome

highlights cloud computing's critical role in pharmaceutical organizations, vital for managing

the escalating volume of data or big data and ensuring their operational efficiency.

Figure 16 – Reasons of adopting Cloud Computing (7 responses)

The result is consistent with the conclusion of Ko et al. (2011): although cloud computing

presents a range of advantages, including cost efficiency, boundless storage capacity, robust

backup and recovery capabilities, automated software integration, seamless data accessibility,

swift deployment, scalability, and the provision of novel services, its broad

adoption is impeded by several notable apprehensions.

As depicted in Figure 17 below, while data practitioners grasp the crucial significance of data

security and privacy in the pharmaceutical realm, only 17% of organizations have a well-

defined Cloud Data Governance program. Conversely, a significant 50% have not yet initiated

implementation, with 33% currently in the developmental phase.

Figure 17 - Data Governance in Cloud Computing (6 responses)

Figure 18 provides insight into the challenges faced, where 100% of respondents encounter

issues integrating cloud data with on-premises systems. Furthermore, 83% struggle with

ensuring data privacy and meeting regulatory requirements, as well as managing data access

and user permissions. Additionally, 67% grapple with complications related to data residency

and jurisdiction concerns. These findings underline the complexity of establishing effective

Cloud Data Governance in the pharmaceutical landscape.

The findings unequivocally affirm that within the spectrum of concerns, security and privacy

issues assume a significant prominence, with their roots traced back to governance and legal

factors, as identified by Khanghahi & Ravanmehr (2013). Considering this, the role of data

governance becomes paramount in orchestrating an efficacious oversight of cloud

environments. The hurdles that obstruct the seamless execution of a cloud data governance

strategy can be methodically classified into discrete realms, encompassing technological,

organizational, legal, policy-oriented, financial, and knowledge-based dimensions (Al-Ruithe

& Benkhelifa, 2017).

What challenges does your organization face with Data Governance in cloud environment?

Figure 18 – Challenges in data governance in cloud environment (6 responses)

6.1.3 Data Governance best practices for Policies Compliance in Pharmaceutical

In the realm of pharmaceuticals, addressing the prevalent challenges associated with data

integrity and complying with FDA and cGMP regulations requires a proactive approach (FDA,

2018). Neumeyer (2020) outlines key strategies to mitigate these issues, highlighting the critical

aspects of Data Retention, Metadata and Audit Trails, Data Quality Validation,

Contemporaneous Recording, and Access Control:

6.1.3.1 Data Retention

To prevent breaches in data integrity and uphold pharmaceutical standards, meticulous

adherence to current Good Manufacturing Practice (cGMP) regulations is pivotal. Notably,

21 CFR 211.180 of the Code of Federal Regulations establishes a critical guideline, stipulating that

records pertinent to production, control, or distribution, necessary for cGMP compliance, must

be retained for a specified duration (Code of Federal Regulations, 2022).
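As a rough illustration of how such a retention rule could be checked in practice, the sketch below computes the earliest disposal date for a record from the batch expiry date. The function names and the default of one year after batch expiry are assumptions made for this example; the actual period must be taken from the applicable regulation and the organization's own retention policy.

```python
from datetime import date


def retention_end(batch_expiry: date, years_after_expiry: int = 1) -> date:
    """Earliest date a record may be disposed of (assumed rule: N years after batch expiry)."""
    target_year = batch_expiry.year + years_after_expiry
    try:
        return batch_expiry.replace(year=target_year)
    except ValueError:
        # 29 February has no counterpart in a non-leap year: fall back to 1 March.
        return date(target_year, 3, 1)


def may_dispose(batch_expiry: date, today: date, years_after_expiry: int = 1) -> bool:
    """True once the assumed retention period has fully elapsed."""
    return today >= retention_end(batch_expiry, years_after_expiry)


# Hypothetical batch that expires on 30 June 2024.
print(retention_end(date(2024, 6, 30)))
print(may_dispose(date(2024, 6, 30), today=date(2025, 1, 15)))
```

Keeping the retention period as a parameter, rather than hard-coding it, lets the same check serve record types that are subject to different retention durations.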

Figure 19 – Data Retention Definition Documents (8 responses)

In this context, the survey findings presented in Figure 19 above show that a mere 38% of

organizations possess a defined document outlining the duration of data retention along with

suitable archival systems for critical records. These encompass essential elements such as batch

documents, marketing authorization application data, and traceability data for human-derived

starting materials. Additionally, a significant 50% of respondents acknowledge the absence of

such a document, while an additional 12% remain unaware of its existence. These results

highlight the pressing need for heightened attention to establishing clear guidelines for data

retention and robust archival mechanisms within organizations to uphold data integrity and

regulatory compliance.

On the other hand, maintaining the security of data requires that it be kept in its original form

or accurate duplicates, like photocopies, microfilm, or microfiche, or any other format that

precisely replicates the initial records (MHRA, 2015). A record or log should also be

maintained, demonstrating the correct and timely archiving or elimination of no-longer-needed

records, in line with Good Manufacturing Practice (GMP) standards (PIC, 2021).

Figure 20 – Data Retention in True Copies (8 responses)

The survey outcomes indicate that only 25% of respondents confirm that photocopies, microfilm,

or microfiche are retained as true copies (Figure 20). A significant 50% are uncertain, which might be because

they aren't directly responsible for these files. Still, an alarming 25% confirm that these records

aren't maintained as accurate copies, potentially leading to a violation of cGMP requirements.

6.1.3.2 Metadata and audit trail

As outlined by Charoo et al. (2023), both metadata and audit trails assume pivotal roles in

upholding compliant record-keeping practices aligned with current Good Manufacturing

Practice (cGMP) standards. These elements are vital in ensuring the integrity of data,

facilitating seamless data retrieval, and enabling effective data utilization (Charoo et al., 2023).

Specifically, metadata encompasses structured data that provides contextual insights and data

management information (Charoo et al., 2023).

Figure 21 – Data Catalog, Metadata and Audit Trail (11 responses)

Figure 21 reveals that the integration of Metadata, Data Catalogs, and audit trails within the

pharmaceutical sector is characterized by a lack of uniformity. Encouragingly, a substantial

70% of organizations indicate the presence of metadata, while a slightly lower yet notable 60%

affirm the existence of a data catalog and 50% report audit trails within their operations. It's worth

noting that a discernible 10% to 20% of respondents remain uncertain about the presence of

these resources, hinting at a potential gap in awareness or communication among data users.

This uncertainty could mean that these valuable assets aren't being used or shared effectively.

In addition, the list of metadata elements in a particular dataset includes a wide range of

components. These include the date and time stamp of the activity, the unique user identification

linked to the individual executing the test or analysis, the identification code of the instrument

employed for data capture, information about the material's status, a distinctive material

identification number, as well as comprehensive audit trails that provide a documented history

of interactions and changes (Charoo et al., 2023). Figure 22 presented below shows that every

responding organization (100%) has incorporated these specific elements within its

metadata practices. Of these, 80% also include audit trails, and 20% add that it depends

on the data domain.

Figure 22 – Elements included in Metadata (5 responses)

To enhance data governance, Shafiei et al. (2015) recommended establishing mechanisms that

automatically generate audit trails for various data activities, including creation, modification,

and deletion. These trails should encompass crucial information such as timestamps, user IDs,

and event specifics, ensuring a comprehensive and accurate record of data events (Shafiei et al.,

2015).
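The mechanism recommended above can be sketched as a small append-only log. This is an illustrative design under stated assumptions, not the approach described by Shafiei et al. (2015): the class name `AuditTrail`, its field names, and the hash chaining used to make tampering detectable are all choices made for this example.

```python
import hashlib
import json
from datetime import datetime, timezone


class AuditTrail:
    """Append-only log; each entry stores the hash of the previous one (hypothetical design)."""

    ALLOWED_ACTIONS = {"create", "modify", "delete"}

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(entry):
        # Hash every field except the stored hash itself.
        payload = {k: v for k, v in entry.items() if k != "hash"}
        return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()

    def record(self, user_id, action, record_id, details=""):
        """Append one event with its timestamp, user ID, and event specifics."""
        if action not in self.ALLOWED_ACTIONS:
            raise ValueError(f"unsupported action: {action}")
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "user_id": user_id,
            "action": action,
            "record_id": record_id,
            "details": details,
            "prev_hash": self._entries[-1]["hash"] if self._entries else "",
        }
        entry["hash"] = self._digest(entry)
        self._entries.append(entry)
        return dict(entry)

    def verify(self):
        """Return True if no stored entry has been altered, removed, or reordered."""
        prev = ""
        for entry in self._entries:
            if entry["prev_hash"] != prev or entry["hash"] != self._digest(entry):
                return False
            prev = entry["hash"]
        return True

    def entries(self):
        """Return copies so callers cannot edit the stored log directly."""
        return [dict(e) for e in self._entries]


trail = AuditTrail()
trail.record("analyst01", "create", "BATCH-001", "initial entry")
trail.record("analyst01", "modify", "BATCH-001", "corrected weight")
print(trail.verify())
```

Because every entry depends on the one before it, a later edit to any stored event causes `verify()` to fail, which is one way of approximating an audit trail that cannot be silently modified.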

Figure 23 below demonstrates a positive outcome: every responding organization (100%)

includes these elements in its audit trails.

Figure 23 – Elements included in Audit Trail (4 responses)

6.1.3.3 Clear data processes and procedures

Data governance places significant emphasis on data processes and procedures, which are

pivotal for ensuring the reliable and efficient flow and utilization of data (DAMA International,

2009). While encompassing all aspects of data management, specific attention is often directed

towards elements such as data quality, data access, and data recording and storage (Alhassan et

al., 2019). The absence of well-defined data processes and procedures can cast doubt on data

reliability, stemming from factors like undefined protocols or incomplete components such as

data testing (Alhassan et al., 2019). To establish robust data processes and procedures,

integration within the system itself is recommended (Alhassan et al., 2019). This can be

achieved by incorporating mandatory fields, validation methods, and data flow prerequisites

that guide users in adhering to established protocols (Alhassan et al., 2019).
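A minimal sketch of such system-level checks is shown below, combining mandatory fields with a simple range rule. The field names in `REQUIRED_FIELDS`, the function `validate_record`, and the 0 to 100 range are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical mandatory fields for a laboratory result record.
REQUIRED_FIELDS = ("batch_id", "analyst_id", "result", "recorded_at")


def validate_record(record, low=0.0, high=100.0):
    """Return a list of rule violations; an empty list means the record passes."""
    errors = []
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            errors.append(f"missing mandatory field: {field}")
    result = record.get("result")
    if result not in (None, ""):
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(result, bool) or not isinstance(result, (int, float)):
            errors.append("result must be numeric")
        elif not low <= result <= high:
            errors.append(f"result {result} outside [{low}, {high}]")
    return errors


complete = {
    "batch_id": "B-001",
    "analyst_id": "analyst01",
    "result": 98.5,
    "recorded_at": "2023-05-01T10:00:00",
}
incomplete = {"batch_id": "B-002", "result": 120}
print(validate_record(complete))
print(validate_record(incomplete))
```

Returning all violations at once, instead of stopping at the first, gives the user a complete list of what to correct before the record can be saved.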

Figure 24 notably highlights that only 70% of organizations have initiated processes for

outlining data quality requirements, coupled with the establishment of data quality rules and

the implementation of validation and cleansing procedures. This underscores the need for

enhanced attention and systematic efforts in building and fortifying data processes to bolster

overall data governance.

Figure 24 – Data quality requirements & rules (10 responses)

6.1.3.4 Contemporaneous

Neumeyer's research (2020) uncovered that employees were signing logbooks and batch

records much later than when the actual tasks were performed, contradicting the

"contemporaneous" principle of ALCOA+ (Attributable, Legible, Contemporaneous, Original,

and Accurate, plus Complete, Consistent, Enduring, and Available) (EMEA, 2010). This principle emphasizes recording data as events happen. To

tackle this issue, it's crucial to train analysts to document their work promptly after completion

and to establish procedures that enforce this practice (Charoo et al., 2023). Fortunately, there

are software platforms available that can capture activity details in real-time, creating a secure

audit trail that can't be modified (Charoo et al., 2023). These platforms offer a solution to

enhance the practice of contemporaneous documentation (Charoo et al., 2023).
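One way to monitor the contemporaneous principle is to compare the time an activity was performed with the time it was recorded. The sketch below assumes that both timestamps are available for each entry; the 15-minute tolerance and the field names `performed_at` and `recorded_at` are hypothetical and would need to be set by the organization's own procedures.

```python
from datetime import datetime, timedelta


def is_contemporaneous(performed_at, recorded_at, max_delay=timedelta(minutes=15)):
    """True if the entry was recorded at or after the activity and within max_delay of it."""
    delay = recorded_at - performed_at
    return timedelta(0) <= delay <= max_delay


def late_entries(entries, max_delay=timedelta(minutes=15)):
    """Return the entries recorded too late, or timestamped before the activity itself."""
    return [
        e for e in entries
        if not is_contemporaneous(e["performed_at"], e["recorded_at"], max_delay)
    ]


# Hypothetical logbook entries.
start = datetime(2023, 5, 1, 10, 0)
logbook = [
    {"id": "E1", "performed_at": start, "recorded_at": start + timedelta(minutes=5)},
    {"id": "E2", "performed_at": start, "recorded_at": start + timedelta(hours=3)},
]
print([e["id"] for e in late_entries(logbook)])
```

A report of this kind does not replace training or procedures, but it can show where late signing of logbooks and batch records is occurring.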

Figure 25 – Utilization of Logbooks, LIMS and MES (8 responses)

The survey findings shown in Figure 25 indicate that only 50% of organizations use tools like

Logbooks and batch records (which permanently record, in real time, who carried out the

work, when, and why), LIMS (laboratory information management

system), and MES (manufacturing execution systems) to manage operational data, while 30%

to 40% are unaware of these tools. The relatively low adoption of tools like Logbooks, LIMS,

and MES to manage operational data suggests a potential area for improvement in ensuring

real-time and accurate recording of activities.

6.1.3.5 Access Control

To make sure computer systems follow the rules of current Good Manufacturing Practices

(cGMP), Standard Operating Procedures (SOPs) are vital (Charoo et al., 2023). These

procedures should define the roles of system administrators and also outline who has access to

different cGMP computer systems (Charoo et al., 2023). It's important to choose system

administrators who are different from the people in charge of the records, like those working in

the lab (Charoo et al., 2023). This helps keep things fair and reduces the risk of unauthorized

actions (Charoo et al., 2023). Through clear SOPs, the specific rights and duties of both system

administrators and authorized staff can be set, creating strong rules that prevent unauthorized

entry, changes, or removal of records (Charoo et al., 2023).
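The separation between system administrators and record owners can be checked automatically against role assignments. The sketch below is an illustration only: the function name `segregation_conflicts`, the system names, and the user IDs are hypothetical, and real role data would come from the organization's identity or access management system.

```python
def segregation_conflicts(system_admins, record_owners):
    """Map each system to the users who are both administrator and record owner on it."""
    conflicts = {}
    for system, admins in system_admins.items():
        overlap = set(admins) & set(record_owners.get(system, ()))
        if overlap:
            conflicts[system] = sorted(overlap)
    return conflicts


# Hypothetical role assignments per cGMP computer system.
admins = {"LIMS": ["it_admin01"], "MES": ["it_admin02", "analyst07"]}
owners = {"LIMS": ["analyst03", "analyst04"], "MES": ["analyst07", "operator11"]}
print(segregation_conflicts(admins, owners))
```

Any user listed in the result holds both roles on the same system and would be a candidate for review under the SOPs described above.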

Figure 26 – The system administrator role (8 responses)

However, the survey results show a gap, with only 75% of organizations assigning the role of

system administrator (with the power to change files and settings) to people who aren't

responsible for the record content (Figure 26). Fixing this gap is crucial for keeping data

systems secure and trustworthy.

Simultaneously, Figure 27 illustrates that only 60% have implemented processes encompassing

access controls, encryption, and data masking, which are crucial components for maintaining

data quality standards in the pharmaceutical context. However, a notable 10% to 20% of

respondents express uncertainty regarding the existence of these processes. This ambiguity

could potentially stem from the diverse backgrounds of respondents, including those from Sales

and Marketing, who might not possess comprehensive knowledge about these technical

processes, highlighting the need for improved communication and awareness across various

departments.

Figure 27 - Access controls, encryption, and data masking (10 responses)

Figure 28 demonstrates that only 50% of organizations give each employee a unique individual

account and password for system access. Alarmingly, a substantial 38% report that employees might be

compelled to share their accounts. This situation significantly complicates the task of tracing

the identities of those who log in. Neglecting to institute effective controls and access

regulations for computer systems constitutes a breach of regulatory mandates, particularly 21

CFR 211.68(b) (Code of Federal Regulations, 2022).

Figure 28 – Unique Account and Password for each employee (8 responses)

6.1.4 Awareness about importance of data governance

Both Figures 29 and 30 affirm a significant trend: the importance of data governance is not

being adequately emphasized by leaders or the C-suite. Only 50% of organizations provide

training to their employees on the significance and best practices of data governance.

Surprisingly, 55% of leaders are deemed to have only a partial awareness of the impact of

effective data governance, with only 27% considered fully aware and 18% having some

level of awareness.

Figure 29 - Training employees on data governance (10 responses)

Figure 30 – Awareness of leaders on data governance (11 responses)

These results align with the findings of Weiss (2022), highlighting that while digital

transformation initiatives often take precedence, aspects like data governance, integrity, and

regulatory compliance might not receive the requisite attention from the C-suite.

Due to the limited awareness evident among employees and leaders alike, the implementation

of incentives to promote adherence to data governance regulations is understandably rare.

According to Figure 31, a mere 12% of organizations have currently instituted such incentive

schemes. This dearth of emphasis on incentives mirrors the prevailing deficiency in awareness

levels, underscoring the urgent necessity for more comprehensive educational and awareness

campaigns. These campaigns aim to cultivate a robust culture of data governance within

organizational contexts.

Figure 31 - Incentives for employees (8 responses)

As mentioned by Yang et al. (2010), regarding the facilitation of compliance, the application

of incentives emerges as a promising strategy. By offering inducements, such as

acknowledgment or rewards, both employees and pharmaceutical enterprises can be effectively

motivated to conform to data integrity regulations. These incentives might span periods

characterized by the absence of data integrity related concerns or the successful implementation

of robust data security measures. Through this approach, employees and organizations could be

galvanized to uphold elevated benchmarks of data integrity (Yang et al., 2010).

6.1.5 Challenges and obstacles of implementing data governance program

As reviewed extensively in the literature review sections, a wealth of studies has delved into the myriad

challenges and obstacles that surround the implementation of data governance within the

pharmaceutical industry. This endeavor is characterized by a nuanced interplay of factors,

encompassing regulatory mandates, the intricate fabric of organizational culture, technological

advancements, and the complexities of day-to-day operations (Cheong & Chang, 2007;

Redman, 2013; Rifaie et al., 2009; Spivey, 2022). As illuminated by these studies, successfully

navigating these challenges is not merely a strategic imperative for organizational triumph, but

it also bears far-reaching consequences for public health and the industry's overall credibility

(Eglovitch, 2022; FDA, 2016; Holub et al., 2017; Truong et al., 2017).

The survey results, as shown in Figure 32, provide valuable insights into the primary challenges

faced in implementing data governance within the pharmaceutical industry. The findings

highlight the multifaceted nature of these challenges, which encompass various aspects of

information and document management.

What are the most important challenges in Data Governance within your organization?

Figure 32 – Challenges in Data Governance (11 responses)

The transition to a fully digital environment is identified as the most noteworthy challenge, with

90% of respondents acknowledging its complexity. This transition entails not only the technical

aspects of digitization but also the need to adapt existing workflows and practices to the digital

landscape.

Organizing access and facilitating the sharing of information and knowledge emerge as

significant hurdles, with a notable 82% of respondents highlighting this challenge. This reflects

the complexity of ensuring seamless and secure access to critical data while fostering

collaboration across different organizational units.

Equally prominent is the challenge of mastering the risks associated with document and

information management, such as the potential for loss or unauthorized modification. This

challenge also garners an 82% response rate, underscoring the industry's recognition of the

importance of safeguarding the integrity and security of pharmaceutical data.

Defining and implementing rules and processes for document management, including

versioning, workflows, and document naming, emerges as the third challenge, with 54% of

respondents indicating its significance. This highlights the need for standardized and efficient

practices that ensure consistency, compliance, and traceability in managing documents.

Organizing the long-term sustainability of specific documents and information, valuing

information, and mastering document-related costs are challenges recognized by 36% of

respondents each. These challenges underline the broader implications of data governance,

including the need to ensure the preservation of critical information over time, effectively assess

the value of various data assets, and manage costs associated with document management and

information systems.

As per Figure 33 below, foremost among the identified obstacles is the budget

required for the implementation of comprehensive information governance, cited by

100% of respondents. This underscores the financial considerations that organizations must grapple

with, highlighting the resource-intensive nature of establishing robust data governance

frameworks.

What are the main obstacles to implementing Data Governance program?

Figure 33 – Main obstacles in Data Governance implementation (11 responses)

A closely related obstacle is the lack of willingness from decision-makers and managers,

resonating at 64%. The finding aligns with the previous observations, indicating a disparity in

prioritization between data governance and other strategic imperatives. This lack of enthusiasm

from leadership underscores the necessity for advocating the value and significance of data

governance across all levels of the organization.

The survey further reveals challenges stemming from knowledge gaps. Specifically, the lack of

knowledge about methodologies commands a significant 82%, reflecting the complexity of data

governance methodologies and frameworks. Concurrently, the lack of awareness regarding the

risks entailed by inadequate data governance—such as the loss of critical information and the

exposure of personal data—constitutes a substantial obstacle at 73%. The result accentuates the

need for educational initiatives that empower stakeholders with the requisite understanding of

data governance's intricacies and its implications for organizational success and compliance.

Interestingly, the lack of knowledge about obligations and standards is reported as a less

prominent challenge, representing 36% of respondents. It can be attributed to the highly

regulated nature of the pharmaceutical industry, where compliance with obligations and

standards is inherently ingrained in operations. However, it's worth noting that even within a

regulated environment, maintaining awareness and alignment with evolving standards remains

a vital facet of effective data governance.

Figure 34 below provides an overview of the significant challenges encountered in Data

Governance within pharmaceutical organizations. It showcases various aspects that require

attention. The most prominent challenge, highlighted at 82%, is document lifecycle

management. This underscores the complexity of effectively managing documents throughout

their entire existence. Following closely behind are Electronic Document Management (EDM)

and handling sensitive data, both at 73%. These figures emphasize the critical need for robust

protocols in these areas.

Figure 34 – Most complicated activities on Data Governance (11 responses)

Market intelligence/business intelligence activities rank at 63%, indicating the importance of

Data Governance in generating actionable insights. Electronic archiving, collaboration on

office files, disparity of solutions within the organization, and document resource management

all pose challenges at a rate of 36%. Managing paper archives, including the associated costs

and space requirements, presents a challenge at 27%. Intranet content management, enterprise

social media, and digitalizing stocks encounter hurdles at 18%, while email management and

digitization of workflow pose the least challenges at 9%.

This figure illustrates the diverse and multifaceted landscape of Data Governance, necessitating

comprehensive approaches to ensure regulatory compliance and data integrity. These findings

aim to underscore the intricate nature of Data Governance and its broad scope within a

company. The objective is to raise awareness among leaders regarding the importance of

establishing a well-structured Data Governance plan that covers all the aspects mentioned

above. This ensures that the organization effectively addresses these challenges and maintains

data quality and compliance.

The survey findings presented above both reflect and deepen the understanding of

data governance practices within pharmaceutical companies. These insights, combined with

the critical success factors (CSFs) for data governance and the suggestions

identified in the Literature Review, inform the following critical success

factors for Data Governance in the pharmaceutical industry:

7 Critical Success Factors (CSFs) for Implementing Data Governance in Pharmaceutical Industry

The Critical Success Factors (CSFs) proposed below are drawn from a comprehensive

understanding of industry-specific needs and best practices thanks to the survey and literature

review. Specifically, they build on the seven critical success factors (CSFs) for data governance

concluded by Alhassan et al. in 2019 (section 4.3), the six measures to ensure data integrity

proposed by Shafiei et al. in 2015 (section 4.1.5), and the CSFs for data governance

in cloud computing from the research of Al-Ruithe & Benkhelifa in 2017 (section 4.2.4.3).

Challenges encompass navigating intricate regulatory landscapes, fostering a data-driven

culture, integrating data from diverse sources, and ensuring data integrity and security.

Conversely, opportunities lie in enhancing decision-making through data-driven insights,

optimizing operational efficiency, complying with regulatory mandates, and capitalizing on

data for innovation. To effectively address these challenges and capitalize on the available

opportunities, a multi-faceted approach is recommended:

7.1 Sponsorship from Leaders

In the pharmaceutical industry, sponsorship from organizational leaders is essential to the successful implementation of a robust data governance framework (Al-Ruithe & Benkhelifa, 2017; Murray, 2023). Sponsorship plays a dual role: it draws attention to the initiative and speeds up the allocation of critical resources, both human and financial (Al-Ruithe & Benkhelifa, 2017; Murray, 2023). With such sponsorship, pharmaceutical companies can ensure that their data governance efforts receive the prominence and backing needed for effective execution (Al-Ruithe & Benkhelifa, 2017; Murray, 2023). Given the industry's rigorous regulations, complex data, and reliance on data-driven decision-making, the backing and commitment of leaders underscore the strategic significance of data governance (Al-Ruithe & Benkhelifa, 2017; Murray, 2023). This signal carries across the entire organization and sets a precedent for all departments to prioritize and proactively engage with the principles of data governance.

Furthermore, this sponsorship can serve as a catalyst for introducing incentives, a potent strategy for encouraging both employees and pharmaceutical companies to comply with data integrity regulations (Yang et al., 2010). These incentives, in the form of acknowledgments or rewards, can recognize periods free of data integrity issues or the successful implementation of robust data security measures (Yang et al., 2010). Such incentives motivate employees and organizations to uphold high standards of data integrity (Yang et al., 2010).

7.2 Employee Data Competency Training

In the pharmaceutical industry, comprehensive employee data competency training is of pivotal importance. Such training aims to improve the data literacy and proficiency of the workforce while fostering a deep understanding of regulatory compliance requirements (Alhassan et al., 2019; Shafiei et al., 2015). By instituting these targeted programs, pharmaceutical companies can mitigate the risks of mishandling data and protect the accuracy, credibility, and security of sensitive information (Alhassan et al., 2019).

The complexity of the pharmaceutical industry makes it all the more important that each staff member has strong data competency and a clear understanding of regulatory protocols (Alhassan et al., 2019; MHRA, 2015; PIC, 2021; Shafiei et al., 2015). Through well-designed training, a culture of conscientious data stewardship spreads across departments, supporting well-informed decision-making, regulatory adherence, and the optimal use of data assets (Alhassan et al., 2019; Shafiei et al., 2015). In an industry committed to precision, safety, and compliance, equipping employees to navigate a complex data landscape not only improves operational efficiency but also reinforces the industry's dedication to data-driven excellence (Truong et al., 2017).

Adequate training for analysts and operators is a prerequisite for adherence to current Good Manufacturing Practice (cGMP) requirements and for enabling them to access all data relevant to cGMP processes when required (Charoo et al., 2023). This includes instruction on sound data recording practices, proper document management, and the crucial role of accurate and accessible data in upholding regulatory compliance (Code of Federal Regulations, 2022).

Employees should also receive comprehensive training on the requirements of 21 CFR Part 11, which pertains to electronic records and electronic signatures (Charoo et al., 2023). A central expectation today is that data be recorded contemporaneously, at the time the activity takes place (Charoo et al., 2023). To meet this requirement, analysts should be trained to document each activity as soon as it is completed, and mechanisms must be in place to ensure adherence to this procedure (Charoo et al., 2023).

7.3 Clear Roles and Responsibilities

Within the pharmaceutical realm, where precision, transparency, and unwavering regulatory

compliance form the bedrock, the establishment of well-defined roles and open channels of

communication stands as a powerful mechanism (Alhassan et al., 2019; Al-Ruithe et al., 2016;

Khatri & Brown, 2010; Murray, 2023). In this context, everyone within the organization is

empowered to uphold data integrity, aligning with the overarching mission of advancing

healthcare through meticulous data management (Pérez, 2017; Spivey, 2022; Unger, 2017).

Given the intricate nature of pharmaceutical operations encompassing research, development,

clinical trials, regulatory adherence, and patient safety, a methodical approach to data handling

becomes imperative (Khin et al., 2020). By meticulously outlining roles and responsibilities,

pharmaceutical entities foster a comprehensive understanding among team members regarding

their pivotal contributions to data governance and regulatory alignment (Alhassan et al., 2019;

Al-Ruithe et al., 2016; Khatri & Brown, 2010; Murray, 2023). This cultivation of awareness

nurtures a culture of accountability and ownership, effectively safeguarding data accuracy,

security, and ethical utilization (Rebollo et al., 2015; Tountopoulos et al., 2014).

Furthermore, the establishment of effective communication pathways serves to seamlessly

disseminate critical information across diverse departments, creating a collaborative ecosystem

where data governance principles naturally interweave with day-to-day operational practices

(Alhassan et al., 2019). This concerted endeavor ensures that precision, transparency, and

regulatory adherence remain steadfast at every level, fortifying the bedrock of conscientious

data management and the pursuit of enhanced healthcare outcomes (Charoo et al., 2023; FDA,

2018; Rifaie et al., 2009).

In line with this, the FDA (2018) recommends the assignment of the system administrator role

to individuals not responsible for record content. This segregation of duties serves to thwart

unauthorized alterations to records, thereby upholding data integrity and security (FDA, 2018).
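
The segregation of duties recommended by the FDA (2018) can be illustrated with a minimal sketch. The role names, the User class, and the helper functions below are hypothetical and chosen only for illustration; they do not come from the FDA guidance or from any specific system. The sketch simply flags accounts that combine the system administrator role with responsibility for record content.

```python
from dataclasses import dataclass, field

# Hypothetical role names, chosen only for this illustration.
SYSTEM_ADMIN = "system_administrator"
RECORD_CONTENT_ROLES = {"analyst", "record_reviewer"}


@dataclass
class User:
    name: str
    roles: set = field(default_factory=set)


def violates_segregation(user: User) -> bool:
    """Return True when a user is both system administrator and responsible for record content."""
    return SYSTEM_ADMIN in user.roles and bool(user.roles & RECORD_CONTENT_ROLES)


def find_violations(users):
    """Return the names of users whose role assignments combine both duties."""
    return [u.name for u in users if violates_segregation(u)]
```

In practice, a check of this kind would more plausibly run against the access control lists of the systems actually in use, as part of a periodic access review, rather than against a hand-written list of users.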

7.4 Advanced Technology

The adoption of advanced technological solutions is a fundamental factor, driving improved data management practices through Artificial Intelligence (AI)-driven analytics and secure storage solutions (Al-Ruithe et al., 2016; Iansiti et al., 2021). The pharmaceutical

industry functions within a landscape that is progressively data-intensive, where copious

amounts of information are generated and scrutinized for various purposes such as research,

development, clinical trials, regulatory adherence, and patient safety (Khin et al., 2020). In this

intricate terrain, tapping into the potential of advanced technologies, notably AI, not only

accelerates the processing and analysis of data but also augments the precision and accuracy of

the insights gleaned (Al-Ruithe et al., 2016; Iansiti et al., 2021). AI-driven analytics can uncover

patterns, correlations, and trends within massive datasets that might be difficult to discern

through traditional methods (Al-Ruithe et al., 2016; Iansiti et al., 2021).

In essence, the infusion of cutting-edge technology into the pharmaceutical context transcends

mere efficiency gains. It serves as the bedrock for data-driven decision-making, enriches

research and development undertakings, and reinforces the industry's dedication to patient

safety and regulatory conformity (Alosert et al., 2022; Charoo et al., 2023; Khin et al., 2020).

As the industry progresses, the strategic incorporation of advanced technology seamlessly

aligns with the dynamic landscape of the sector, equipping pharmaceutical companies to

adeptly navigate challenges and seize opportunities presented by an era increasingly

characterized by the centrality of data (Alhassan et al., 2016).

Furthermore, secure storage solutions become paramount due to the sensitive nature of pharmaceutical data, which includes proprietary research, patient records, and regulatory documentation (Shafiei et al., 2015). By employing advanced encryption techniques and secure storage systems, pharmaceutical companies can safeguard critical data from unauthorized access, breaches, and other cyber threats (Iansiti et al., 2021; Khin et al., 2020; Shafiei et al., 2015). This fortified data infrastructure not only supports compliance with stringent regulatory requirements but also fosters trust among stakeholders, including regulatory agencies, partners, and patients (Shafiei et al., 2015).

The integration of tools such as electronic laboratory notebooks (ELN), laboratory information management systems (LIMS), and manufacturing execution systems (MES) has emerged as a cornerstone of improved data management practices within the biopharmaceutical sector (Weiss, 2022). ELN systems enable real-time data recording during activities, ensuring compliance with the contemporaneous requirement (Charoo et al., 2023). Additionally, commercially available software platforms can record activity details in a secure, unalterable audit trail, which further strengthens contemporaneous documentation practices (Charoo et al., 2023).
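
One way to picture an audit trail that resists silent modification is a hash-chained, append-only log. The sketch below is an illustrative assumption, not a description of how the commercial platforms discussed by Charoo et al. (2023) are built: the AuditTrail class and its field names are hypothetical, and hash chaining makes alterations detectable rather than impossible.

```python
import hashlib
import json
from datetime import datetime, timezone


class AuditTrail:
    """Append-only log in which each entry stores the hash of the previous one."""

    def __init__(self):
        self._entries = []

    def record(self, user, action, timestamp=None):
        """Append an entry at the time of the activity and return it."""
        prev_hash = self._entries[-1]["hash"] if self._entries else "0" * 64
        body = {
            "user": user,
            "action": action,
            "timestamp": timestamp or datetime.now(timezone.utc).isoformat(),
            "prev_hash": prev_hash,
        }
        entry = dict(body, hash=self._digest(body))
        self._entries.append(entry)
        return entry

    @staticmethod
    def _digest(body):
        # Serialize with sorted keys so the same content always hashes identically.
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def verify(self):
        """Return True if no stored entry has been altered or reordered."""
        prev_hash = "0" * 64
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash or self._digest(body) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True
```

Because every entry embeds the hash of its predecessor, editing an earlier entry invalidates the chain from that point onward, which is the property a reviewer would rely on when checking that records were not changed after the fact.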

7.5 Collaboration

Cultivating a culture of collaboration between Information Technology (IT), Data Departments,

and various business units emerges as a fundamental catalyst, facilitating the seamless

integration of diverse data sources and fostering cohesive decision-making across the

organization (Khatri & Brown, 2010). The pharmaceutical industry operates within a complex

landscape where departments such as research, clinical trials, regulatory affairs, manufacturing,

and data management each generate and manage a wealth of data (Khatri & Brown, 2010).

Promoting collaboration between these units, Data Departments, and IT ensures that data flows

harmoniously across functional boundaries, enabling a comprehensive perspective that informs

strategic decision-making (Khatri & Brown, 2010).

This collaboration takes on added significance in the pharmaceutical context because the industry must meet intricate demands and stringent regulations while maintaining a patient-centric focus (Khatri & Brown, 2010). Shared data gives departments access to precise and current information, improving the efficiency of processes such as clinical trial management, regulatory submissions, and pharmacovigilance (Schell, 2019). Moreover, as pharmaceutical companies work to pioneer innovative treatments and therapies, close cooperation between IT, Data Departments, and business units creates an environment in which data-driven insights can guide and expedite research and development, support well-informed choices, and contribute substantively to better healthcare outcomes (Cheong & Chang, 2007; Redman, 2013; Rifaie et al., 2009).

7.6 Data-Driven Culture

By championing the value of data stewardship across all ranks, a shared sense of collective

ownership and commitment to the data governance framework is fostered (Rosenbaum, 2010;

Shafiei et al., 2015). This cultural shift goes beyond the mere adoption of technologies; it

encapsulates a mindset where data is recognized as a strategic asset, and its responsible

management is ingrained in the fabric of the organization's ethos (Al-Ruithe et al., 2016; Charoo

et al., 2023; Shafiei et al., 2015).

The pharmaceutical industry operates within a framework that requires meticulous attention to

detail, adherence to regulatory requirements, and the pursuit of scientific breakthroughs (Iansiti

et al., 2021; Rosenbaum, 2010). A data-driven culture equips employees with the tools and

mindset to make informed decisions, grounded in accurate and reliable data (Shafiei et al.,

2015). By embracing this culture, pharmaceutical companies can facilitate streamlined

processes, enhance research and development endeavors, and optimize patient care (Charoo et

al., 2023; Friedman, 2012; Shafiei et al., 2015).

7.7 Clear Data Processes and Procedures

In an industry where strict adherence to regulations takes precedence, aligning data governance

practices with compliance requirements becomes imperative (FDA, 2018; Seddon & Currie,

2013). Through transparent workflows, precise documentation, and unwavering compliance,

the integrity of data is upheld while potential risks are mitigated (Charoo et al., 2023; Shafiei

et al., 2015).

Establishing clear data processes and procedures forms a cornerstone in effective data

governance within the pharmaceutical industry (Charoo et al., 2023; Shafiei et al., 2015). By

delineating step-by-step guidelines for data collection, storage, usage, sharing, and disposal,

pharmaceutical companies ensure consistency, accuracy, and compliance throughout their data

lifecycle (Weiss, 2022). These standardized processes streamline operations, mitigate errors,

and enhance data integrity, allowing the organization to make informed decisions based on

reliable information (Weiss, 2022).

The gravity of clear data processes and procedures is heightened in the pharmaceutical realm

due to its intricate regulatory environment and the paramount significance of data in domains

like drug development, clinical trials, and patient safety (Alosert et al., 2022; Charoo et al.,

2023; Khin et al., 2020). Regulatory bodies mandate transparent documentation of data

handling practices to validate research outcomes and safeguard patient well-being (FDA, 2018).

By adhering rigorously to meticulously defined data processes, pharmaceutical entities not only

harmonize with regulatory mandates but also reduce the risk of non-compliance, data breaches,

and erroneous conclusions (Alosert et al., 2022; Charoo et al., 2023; Khin et al., 2020).

In crafting these processes and procedures, pharmaceutical companies can draw on established frameworks such as F.A.I.R. (Findable, Accessible, Interoperable, Reusable) and ALCOA+ (Attributable, Legible, Contemporaneous, Original, Accurate, plus Complete, Consistent, Enduring, and Available). These frameworks help ensure the quality, traceability, and authenticity of data (Alharbi et al., 2021; Chubb, 2021; Fox, 2019; Roe, 2021). Additionally, integrating principles from current Good Manufacturing Practice (cGMP) standards further strengthens the dependability and integrity of data (Spivey, 2022). cGMP, which requires meticulous documentation and standardized processes, plays a pivotal role in sustaining product quality and patient safety (Spivey, 2022).
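
Some ALCOA+ attributes lend themselves to simple automated checks on record metadata. The following is a minimal sketch under stated assumptions: the record schema (author, performed_at, recorded_at, value) and the five-minute tolerance are hypothetical and are not regulatory figures. It covers only the Attributable, Contemporaneous, and Complete attributes; the remaining attributes depend on procedural controls and human review.

```python
from datetime import datetime, timedelta

# Hypothetical record schema and tolerance, used only for illustration.
REQUIRED_FIELDS = ("author", "performed_at", "recorded_at", "value")
MAX_DELAY = timedelta(minutes=5)  # assumed tolerance, not a regulatory figure


def alcoa_findings(record):
    """Return findings for the ALCOA+ attributes that can be checked mechanically."""
    findings = []
    # Complete: every required field must be present and non-empty.
    missing = [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]
    if missing:
        findings.append("Complete: missing " + ", ".join(missing))
    # Attributable: the record must name who performed the activity.
    if not record.get("author"):
        findings.append("Attributable: no author recorded")
    # Contemporaneous: the entry must be made shortly after the activity.
    performed, recorded = record.get("performed_at"), record.get("recorded_at")
    if performed and recorded and recorded - performed > MAX_DELAY:
        findings.append("Contemporaneous: recorded too long after the activity")
    return findings
```

A record that passes returns an empty list; any returned finding would be a prompt for investigation rather than proof of a data integrity breach.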

After defining clear procedures, communication emerges as a critical aspect, encompassing the

dissemination of monitoring and measurement results, decision-making processes, and

mechanisms to communicate data-related investments to stakeholders (Rifaie et al., 2009). The

significance of effective communication is underscored by identified instances of data

governance program failures attributed to inadequate staff and stakeholder communication

(Ladley, 2019). This highlights the urgency for well-structured communication strategies

within the context of data governance implementation, especially in cloud environments

(Morabito, 2015).

8 Limitation and future research

While this study has provided valuable insights into data governance within the pharmaceutical

industry, certain limitations warrant acknowledgment and suggestions for future research. One

limitation is the sample size, as the survey garnered responses from only 19 participants,

potentially limiting the breadth of coverage across the industry. Although the findings are

informative, a larger and more diverse sample could provide a more comprehensive

understanding of data governance challenges and practices. Additionally, the inability to

conduct one-on-one interviews due to confidentiality constraints restricted the depth of insights.

In future studies, it would be beneficial to explore ways to convince participants to engage in

interviews or utilize alternative methods that ensure anonymity, thereby facilitating more

profound and nuanced discussions on data governance matters.

Furthermore, the global nature of the pharmaceutical industry introduces a geographical aspect

to data governance challenges. As participants represented different countries, variations in

regulatory frameworks and industry practices could impact the results. To address this, future

studies could focus on specific geographic zones, delving into the intricacies of data governance

issues within that region. By narrowing the scope, researchers can gather more localized, in-

depth insights that align with the regulatory and operational nuances of that specific zone. This

approach would yield more targeted and actionable recommendations for companies operating

within a particular geographical context.

In retrospect, mitigating the limitations of sample size and participant interaction, while

adopting a more localized research focus, would enhance the comprehensiveness and

applicability of findings, driving a deeper understanding of data governance intricacies and

facilitating the implementation of effective strategies in the pharmaceutical sector.

9 Conclusion

This thesis has undertaken a thorough investigation into data governance within the

pharmaceutical industry, yielding valuable insights into the core factors that contribute to

effective data management. Through a careful blend of literature review and survey analysis,

several significant findings have come to light.

The literature review has shed light on various aspects, highlighting the importance of data

management, data governance, the challenges related to data integrity in pharmaceuticals, data

governance in cloud computing environments, and the data lifecycle in BioPharma. While

earlier studies have emphasized the significance of data governance and integrity across

different industries, including the biopharmaceutical sector, there remains a gap in

understanding the specific challenges and best practices for implementing data governance

within the context of digital transformation in this industry. While some research has explored

the adoption of specific tools to improve data integrity, comprehensive studies examining data

governance frameworks and their impact on digital transformation outcomes are lacking.

As the landscape of drug development evolves with the introduction of new approaches,

questions arise regarding the suitability of existing informatics and hardware systems for small-

molecule drug development (Weiss, 2022). Moreover, the emergence of mid-market

biotechnology companies with considerable purchasing power but limited legacy infrastructure

presents an opportunity for a fresh and innovative approach to traditional informatics (Weiss,

2022).

The adoption of these new technologies is driven by their potential to offer a holistic approach to BioPharma Lifecycle Management (Weiss, 2022). This approach involves out-of-the-box workflow execution, preconfigured system and hardware integrations, contextualized data repositories based on F.A.I.R. principles, and integrated analytics that support business intelligence through a unified digital platform (Alharbi et al., 2021, 2023; Fox, 2019; Wise et al., 2019). The success of these technologies hinges on their ease of implementation, immediate business benefits, cost-effectiveness, and scalability in supporting data lifecycle management for Industry 4.0 initiatives (Roe, 2021; Weiss, 2022).

Building upon this context, this thesis developed a survey to examine the real challenges faced by the pharmaceutical industry and identified seven critical success factors to address them. Sponsorship from organizational leaders, employee data competency training, a culture that values data, and clearly defined roles and responsibilities together provide a strong foundation for implementing effective data governance practices. Additionally, advanced technology and collaboration between IT, Data Departments, and business units are essential to the smooth integration of data sources and to better decision-making. Finally, clear data processes and procedures, guided by principles such as F.A.I.R., ALCOA+, and cGMP, ensure data integrity, regulatory compliance, and transparency throughout the data lifecycle.

Drawing from these insights, tailored recommendations can be developed for different types of

pharmaceutical companies. While Centralized data governance may suit larger organizations

requiring a unified approach, Federated models might benefit companies with distinct business

units seeking autonomy. Hybrid approaches, blending elements from both models, could

provide flexibility and adaptability for companies with diverse data needs. The implications of

this research extend widely. Effective data governance empowers pharmaceutical companies to

optimize decision-making, innovate with confidence, and align with regulatory requirements.

Future research could explore the measurable impact of different data governance models on

operational efficiency, patient outcomes, and overall business performance. Additionally,

investigating the influence of emerging technologies such as blockchain and AI on data

governance practices could provide valuable insights into evolving industry trends.

10 References

10 internet of things (IoT) healthcare examples. (2023). [Blog]. Ordr. https://ordr.net/article/iot-healthcare-examples/

Abu-Elkheir, M., Hayajneh, M., & Ali, N. A. (2013). Data Management for the Internet of Things: Design Primitives and Solution. Sensors, 13(11), Article 11. https://doi.org/10.3390/s131115582

Alharbi, E., Skeva, R., Juty, N., Jay, C., & Goble, C. (2021). Exploring the Current Practices, Costs and Benefits of FAIR Implementation in Pharmaceutical Research and Development: A Qualitative Interview Study. Data Intelligence, 3(4), 507–527. https://doi.org/10.1162/dint_a_00109

Alharbi, E., Skeva, R., Juty, N., Jay, C., & Goble, C. (2023). A FAIR-Decide framework for pharmaceutical R&D: FAIR data cost–benefit assessment. Drug Discovery Today, 28(4), 103510. https://doi.org/10.1016/j.drudis.2023.103510

Alhassan, I., Sammon, D., & Daly, M. (2016). Data governance activities: An analysis of the literature. Journal of Decision Systems, 25(sup1), 64–75. https://doi.org/10.1080/12460125.2016.1187397

Alhassan, I., Sammon, D., & Daly, M. (2019). Critical Success Factors for Data Governance: A Theory Building Approach. Information Systems Management, 36(2), 98–110. https://doi.org/10.1080/10580530.2019.1589670

Alosert, H., Savery, J., Rheaume, J., Cheeks, M., Turner, R., Spencer, C., Farid, S. S., & Goldrick, S. (2022). Data integrity within the biopharmaceutical sector in the era of Industry 4.0. Biotechnology Journal, 17(6), 2100609. https://doi.org/10.1002/biot.202100609

Al-Ruithe, M., & Benkhelifa, E. (2017). Analysis and Classification of Barriers and Critical Success Factors for Implementing a Cloud Data Governance Strategy. Procedia Computer Science, 113, 223–232. https://doi.org/10.1016/j.procs.2017.08.352

Al-Ruithe, M., Benkhelifa, E., & Hameed, K. (2016). Key Dimensions for Cloud Data Governance. 2016 IEEE 4th International Conference on Future Internet of Things and Cloud (FiCloud), 379–386. https://doi.org/10.1109/FiCloud.2016.60

Al-Ruithe, M., Benkhelifa, E., & Hameed, K. (2019). A systematic literature review of data governance and cloud data governance. Personal and Ubiquitous Computing, 23(5), 839–859. https://doi.org/10.1007/s00779-017-1104-3

Badger, M. L., Grance, T., Patt-Corner, R., & Voas, J. (2012). Cloud computing synopsis and recommendations (NIST SP 800-146). National Institute of Standards and Technology. https://doi.org/10.6028/NIST.SP.800-146

Bagozzi, D., & Lindmeier, C. (2017). 1 in 10 medical products in developing countries is substandard or falsified. World Health Organization. https://www.who.int/news/item/28-11-2017-1-in-10-medical-products-in-developing-countries-is-substandard-or-falsified

Buytaert-Hoefen, K. (2019, April 23). A Harmonized Approach to Data Integrity. BioProcess International. https://bioprocessintl.com/manufacturing/information-technology/a-harmonized-approach-to-data-integrity/

Cave, A. E. (2017). Exploring Strategies for Implementing Data Governance Practices. ProQuest. https://www.proquest.com/openview/7f3db1944e62aa8e5476538565c86ca2/1?pq-origsite=gscholar&cbl=18750

Charoo, N. A., Khan, M. A., & Rahman, Z. (2023). Data integrity issues in pharmaceutical industry: Common observations, challenges and mitigations strategies. International Journal of Pharmaceutics, 631, 122503. https://doi.org/10.1016/j.ijpharm.2022.122503

Cheong, L. K., & Chang, V. (2007). The Need for Data Governance: A Case Study.

Chia, J. (2023). The Modern Data Stack Explained: What The Future Holds | Alation. Alation. https://www.alation.com/blog/modern-data-stack-explained/

Chubb, P. (2021, December 7). Setting the Standard: FAIR & ALCOA+ in research during the pandemic. Labfolder. https://labfolder.com/fair-alcoa-and-the-pandemic/

Code of Federal Regulations. (2022). 21 CFR Part 211 - Current Good Manufacturing Practice for Finished Pharmaceuticals. Code of Federal Regulations. https://www.ecfr.gov/current/title-21/chapter-I/subchapter-C/part-211

Cox. (2019, July 16). US FDA Warning Letter Hits Strides On Data Integrity. Pharma Intelligence. https://pink.pharmaintelligence.informa.com/PS140515/US-FDA-Warning-Letter-Hits-Strides-On-Data-Integrity

DAMA International. (2009). The DAMA Guide to The Data Management Body of Knowledge (DAMA-DMBOK Guide). Technics Publications, LLC. https://doi.org/10.5555/3165209

Dehghani, Z. (2022). Data Mesh. Marcombo.

Dogo, E. M., Salami, A. F., & Salman, S. (2013). Feasibility Analysis of Critical Factors Affecting Cloud Computing in Nigeria. https://uilspace.unilorin.edu.ng/handle/20.500.12484/2832

Eglovitch, J. (2022). Experts Say FDA Enforcement Focus Unchanged, Use Of Alternative Tools To Grow. raps.org.

EMEA. (2010). Guideline on real time release testing. EMEA. http://www.ema.europa.eu/docs/en_GB/document_library/Scientific_guideline/2010/03/WC500075028.pdf

EudraLex. (2011). The Rules Governing Medicinal Products in the European Union: Volume 4, Good Manufacturing Practice, Medicinal Products for Human and Veterinary Use. European Commission Health And Consumers Directorate-General. https://health.ec.europa.eu/system/files/2016-11/annex11_01-2011_en_0.pdf

European Health Data Space. (2023, May 12). https://health.ec.europa.eu/ehealth-digital-health-and-care/european-health-data-space_en

FDA. (2003). Part 11, Electronic Records; Electronic Signatures - Scope and Application. U.S. Food and Drug Administration. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/part-11-electronic-records-electronic-signatures-scope-and-application

FDA. (2004). Guidance for Industry PAT - A Framework for Innovative Pharmaceutical Development, Manufacturing, and Quality Assurance. U.S. Department of Health and Human Services Food and Drug Administration. https://www.fda.gov/media/71012/download

FDA. (2016). Submission of Quality Metrics Data Guidance for Industry. U.S. Department of Health and Human Services Food and Drug Administration. https://www.fda.gov/media/93012/download

FDA. (2018). Data Integrity and Compliance With Drug CGMP Questions and Answers Guidance for Industry. U.S. Department of Health and Human Services Food and Drug Administration. https://www.fda.gov/media/119267/download

FDA warning letter. (2015). FDA warning letter Zhejiang Hisun Pharmaceutical Co., Ltd. https://www.fdanews.com/ext/resources/files/2016/01/01-12-16-ZhejiangHisunPharma.pdf?1520830779

FDA warning letter. (2017). FDA warning letter FACTA Farmaceutici S.p.A. MARCS-CMS 495986. https://www.fda.gov/inspections-compliance-enforcement-and-criminal-investigations/warning-letters/facta-farmaceutici-spa-495986-01132017

FDA warning letter. (2021, October 8). BBC Group Limited - 614659 - 08/04/2021. FDA. https://www.fda.gov/inspections-compliance-enforcement-and-criminal-investigations/warning-letters/bbc-group-limited-614659-08042021

Felici, M., Koulouris, T., & Pearson, S. (2013). Accountability for Data Governance in Cloud Ecosystems. 2013 IEEE 5th International Conference on Cloud Computing Technology and Science, 2, 327–332. https://doi.org/10.1109/CloudCom.2013.157

Fleming, N. (2018). How artificial intelligence is changing drug discovery. Nature, 557(7707), S55–S57. https://doi.org/10.1038/d41586-018-05267-x

Floryanzia, S., Ramesh, P., Mills, M., Kulkarni, S., Chen, G., Shah, P., & Lavrich, D. (2022). Disintegration testing augmented by computer vision technology. International Journal of Pharmaceutics, 619, 121668. https://doi.org/10.1016/j.ijpharm.2022.121668

Fox, B. (2019). Leveraging the FAIR principles of data in pharma. Pharmaphorum. https://pharmaphorum.com/views-analysis-digital/leveraging-the-fair-principles-of-data-in-pharma

Friedman, R. L. (2012). Current Expectations for Pharmaceutical Quality Systems. FDA. https://www.fda.gov/media/84744/download

HDS certification: Healthcare data hosting | OVHcloud. (n.d.). Retrieved September 18, 2023, from https://www.ovhcloud.com/en-gb/enterprise/certification-conformity/hds/

Henstock, P. V. (2019). Artificial Intelligence for Pharma: Time for Internal Investment. Trends in Pharmacological Sciences, 40(8), 543‑546. https://doi.org/10.1016/j.tips.2019.05.003

Hodgson, D., Maini, F., Greenrose, W., Christiani, S., Chan, S., & Hargitai, B. (2017). Under the spotlight: Data integrity in life sciences. Deloitte, London, UK. https://www2.deloitte.com/content/dam/Deloitte/uk/Documents/life-sciences-health-care/deloitte-uk-data-integrity-report.pdf

Holub, P., Kohlmayer, F., Prasser, F., Mayrhofer, M. T., Schlunder, I., Martin, G. M., & Litton, J. E. (2017). Enhancing Reuse of Data and Biological Material in Medical Research: From FAIR to FAIR-Health. Biopreserv. Biobank., 97‑105.

Horner, M. (2023). The Modern Data Stack is Broken. TimeXTender. https://www.timextender.com/blog/data-empowered-leadership/the-modern-data-stack-is-broken

Huff, N. S., DHA, MBA, CHC, & CHSP. (2019). Maintaining data integrity. https://compliancecosmos.org/maintaining-data-integrity

Iansiti, M., Lakhani, K. R., Mayer, H., & Herman, K. (2021). Moderna (A). Harvard Business School Publishing.

IDBS. (n.d.). IDBS - The FAIR principles: A quick introduction. IDBS. Retrieved 3 May 2023, from https://www.idbs.com/the-fair-principles-a-quick-introduction/

Khanghahi, N., & Ravanmehr, R. (2013). Cloud Computing Performance Evaluation: Issues and Challenges. International Journal on Cloud Computing: Services and Architecture (IJCCSA), Vol. 3, No. 5, 3. https://doi.org/10.5121/ijccsa.2013.3503

Khatri, V., & Brown, C. V. (2010). Designing data governance. Communications of the ACM, 53(1), 148‑152. https://doi.org/10.1145/1629175.1629210

Khin, N. A., Francis, G., Mulinde, J., Grandinetti, C., Skeete, R., Yu, B., Ayalew, K., Cho, S.-J., Fisher, A., Kleppinger, C., Ayala, R., Bonapace, C., Dasgupta, A., Kronstein, P. D., & Vinter, S. (2020). Data Integrity in Global Clinical Trials: Discussions From Joint US Food and Drug Administration and UK Medicines and Healthcare Products Regulatory Agency Good Clinical Practice Workshop. Clinical Pharmacology & Therapeutics, 108(5), 949‑963. https://doi.org/10.1002/cpt.1794

Ko, R. K. L., Jagadpramana, P., Mowbray, M., Pearson, S., Kirchberg, M., Liang, Q., & Lee, B. S. (2011). TrustCloud: A Framework for Accountability and Trust in Cloud Computing. 2011 IEEE World Congress on Services, 584‑588. https://doi.org/10.1109/SERVICES.2011.91

Koh, S. C. L., Gunasekaran, A., & Goodman, T. (2011). Drivers, barriers and critical success factors for ERPII implementation in supply chains: A critical analysis. The Journal of Strategic Information Systems, 20(4), 385‑402. https://doi.org/10.1016/j.jsis.2011.07.001

Ladley, J. (2019). Data Governance: How to Design, Deploy, and Sustain an Effective Data Governance Program. Academic Press. https://books.google.fr/books?id=AkW9DwAAQBAJ&lpg=PP1&ots=OO7TJLvFwG&dq=Ladley%20J.%20Data%20Governance%20Program&lr&hl=fr&pg=PR12#v=onepage&q&f=false

Leesakul, N., Oostveen, A.-M., Eimontaite, I., Wilson, M. L., & Hyde, R. (2022). Workplace 4.0: Exploring the Implications of Technology Adoption in Digital Manufacturing on a Sustainable Workforce. Sustainability, 14(6), 3311. https://doi.org/10.3390/su14063311

Levy, D. (2021). Overcoming Challenges to Machine Learning Adoption and Implementation in the Lab. Lab Manager. https://www.labmanager.com/insights/overcoming-challenges-to-machine-learning-adoption-and-implementation-in-the-lab-26870

Machado, I. A., Costa, C., & Santos, M. Y. (2022). Data Mesh: Concepts and Principles of a Paradigm Shift in Data Architectures. Procedia Computer Science, 196, 263‑271. https://doi.org/10.1016/j.procs.2021.12.013

Marcelo Corrales Compagnucci, Michael Lowery Wilson, Mark Fenwick, Nikolaus Forgó, & Till Bärnighausen. (2022). AI in EHealth: Human Autonomy, Data Governance and Privacy in Healthcare. Cambridge University Press. https://search.ebscohost.com/login.aspx?direct=true&db=nlebk&AN=3341835&site=ehost-live&scope=site

Marshall, I. J., & Wallace, B. C. (2019). Toward systematic review automation: A practical guide to using machine learning tools in research synthesis. Systematic Reviews, 8(1), 163. https://doi.org/10.1186/s13643-019-1074-9

Mary, B., Mccarthy, P., & Hill, S. (2011). Cloud adoption points to IT risk and data governance challenges. Directorship, 7, 209‑211.

McDowall, R. (2020). Nightmare on lab street—Are you haunted by hybrid systems? Agilent Technologies Inc 2020. Agilent Open Lab. https://www.agilent.com/cs/library/articlereprints/public/Agilent-LCGC-ebook-data-integrity-tips-for-regulated-laboratories-part-1.pdf

MHRA. (2015). MHRA GMP Data Integrity Definitions and Guidance for Industry March 2015. Regulating Medicines and Medical Devices. https://www.ipqpubs.com/wp-content/uploads/2015/04/Data_integrity_definitions_and_guidance_v2.pdf

Morabito, V. (2015). Big Data Governance. In V. Morabito (Ed.), Big Data and Analytics: Strategic and Organizational Impacts (p. 83‑104). Springer International Publishing. https://doi.org/10.1007/978-3-319-10665-6_5

Murray, S. (2023). Organizing Talent: Return Of The Data Center Of Excellence. Monte Carlo Data. https://www.montecarlodata.com/blog-data-center-of-excellence/

Nadal, S., Abelló, A., Romero, O., Vansummeren, S., & Vassiliadis, P. (2023). Graph-Driven Federated Data Management. IEEE Transactions on Knowledge and Data Engineering, 35(1), 509‑520. https://doi.org/10.1109/TKDE.2021.3077044

Neumeyer, M. (2020, June 23). Data Integrity: 2020 FDA Data Integrity Observations in Review. http://www.americanpharmaceuticalreview.com/Featured-Articles/565600-Data-Integrity-2020-FDA-Data-Integrity-Observations-in-Review/

Otto, B. (2015). Quality and Value of the Data Resource in Large Enterprises. Information Systems Management, 32(3), 234‑251. https://doi.org/10.1080/10580530.2015.1044344

Panian, Z. (2010). Some Practical Experiences in Data Governance.

Pérez, J. R. (2017). Maintaining Data Integrity: Avoiding regulator scrutiny in the medical products industry. Quality Progress. https://bec-global.com/wp-content/uploads/2017/10/Article-Data-Integrity.pdf

PIC. (2021). Good Practices For Data Management And Integrity In Regulated GMP/GDP Environments. Pharmaceutical Inspection Convention Pharmaceutical Inspection Co-Operation Scheme. http://organex.com.br/wp-content/uploads/2021/07/4234.pdf

Rattan, A. K. (2018). Data Integrity: History, Issues, and Remediation of Issues. PDA Journal of Pharmaceutical Science and Technology, 72(2), 105‑116. https://doi.org/10.5731/pdajpst.2017.007765

Rebollo, O., Mellado, D., Fernández-Medina, E., & Mouratidis, H. (2015). Empirical evaluation of a cloud computing information security governance framework. Information and Software Technology, 58, 44‑57. https://doi.org/10.1016/j.infsof.2014.10.003

Redman, T. C. (2013). Data’s Credibility Problem. Harvard Business Review, 91(12), 84‑88.

Rifaie, M., Alhajj, R., & Ridley, M. (2009). Data governance strategy: A key issue in building Enterprise Data Warehouse. Proceedings of the 11th International Conference on Information Integration and Web-based Applications & Services, 587‑591. https://doi.org/10.1145/1806338.1806449

Roe, R. (2021). Lack of FAIR data reduces life sciences innovation in laboratory informatics. 30‑31.

Rosenbaum, S. (2010). Data Governance and Stewardship: Designing Data Stewardship Entities and Advancing Data Access. Health Services Research, 45(5p2), 1442‑1455. https://doi.org/10.1111/j.1475-6773.2010.01140.x

Schell, D. (2019). How To Avoid Data-Integrity Woes In Pharma. https://www.lifescienceleader.com/doc/how-to-avoid-data-integrity-woes-in-pharma-0001

Seddon, J. J. M., & Currie, W. L. (2013). Cloud computing and trans-border health data: Unpacking U.S. and EU healthcare regulation and compliance. Health Policy and Technology, 2(4), 229‑241. https://doi.org/10.1016/j.hlpt.2013.09.003

Self, R. J. (2014). Governance Strategies for the Cloud, Big Data, and Other Technologies in Education. 2014 IEEE/ACM 7th International Conference on Utility and Cloud Computing, 630‑635. https://doi.org/10.1109/UCC.2014.101

Selvaraj, S., & Sundaravaradhan, S. (2019). Challenges and opportunities in IoT healthcare systems: A systematic review. SN Applied Sciences, 2(1), 139. https://doi.org/10.1007/s42452-019-1925-y

Shafiei, N., Montardy, R. D., & Rivera-Martinez, E. (2015). Data Integrity—A Study of Current Regulatory Thinking and Action. PDA Journal of Pharmaceutical Science and Technology, 69(6), 762‑770. https://doi.org/10.5731/pdajpst.2015.01082

Spivey, C. (2022). Ensuring CGMP Standards for Data Integrity. In the Lab eNewsletter, 17(6). https://www.pharmtech.com/view/ensuring-cgmp-standards-for-data-integrity

Tallon, P. P., Ramirez, R. V., & Short, J. E. (2013). The Information Artifact in IT Governance: Toward a Theory of Information Governance. Journal of Management Information Systems, 30(3), 141‑178. https://doi.org/10.2753/MIS0742-1222300306

Tountopoulos, V., Felici, M., Pannetrat, A., Catteddu, D., & Pearson, S. (2014). Interoperability Analysis of Accountable Data Governance in the Cloud. In F. Cleary & M. Felici (Eds.), Cyber Security and Privacy (p. 77‑88). Springer International Publishing. https://doi.org/10.1007/978-3-319-12574-9_7

Truong, T., George, R., & Davidson, J. (2017). Establishing an Effective Data Governance System. Pharmaceutical Technology, 41(11), 42‑45.

Unger, B. (2017). Best Practices For Data Integrity Oversight At Your Contract Manufacturer. Pharmaceutical Online. https://www.pharmaceuticalonline.com/doc/best-practices-for-data-integrity-oversight-at-your-contract-manufacturer-0001

Vamathevan, J., Clark, D., Czodrowski, P., Dunham, I., Ferran, E., Lee, G., Li, B., Madabhushi, A., Shah, P., Spitzer, M., & Zhao, S. (2019). Applications of machine learning in drug discovery and development. Nature Reviews Drug Discovery, 18(6), Article 6. https://doi.org/10.1038/s41573-019-0024-5

Van Vlijmen, H., Mons, A., Waalkens, A., Franke, W., Baak, A., Ruiter, G., Kirkpatrick, C., da Silva Santos, L. O. B., Meerman, B., Jellema, R., Arts, D., Kersloot, M., Knijnenburg, S., Lusher, S., Verbeeck, R., & Neefs, J.-M. (2020). The Need of Industry to Go FAIR. Data Intelligence, 2(1‑2), 276‑284. https://doi.org/10.1162/dint_a_00050

Weber, K., Otto, B., & Österle, H. (2009). One Size Does Not Fit All—A Contingency Approach to Data Governance. Journal of Data and Information Quality, 1(1), 4:1-4:27. https://doi.org/10.1145/1515693.1515696

Weiss, S. (2022). An Integrated Approach to the Data Lifecycle in BioPharma: Successful digital transformation in biopharma requires an integrated approach to the data lifecycle. Pharmaceutical Technology, 46(8), 44‑46.

Wende, K. (2007). A Model for Data Governance – Organising Accountabilities for Data Quality Management. ACIS 2007 Proceedings. https://aisel.aisnet.org/acis2007/80

Wise, J., De Barron, A. G., Splendiani, A., Balali-Mood, B., Vasant, D., Little, E., Mellino, G., Harrow, I., Smith, I., Taubert, J., Van Bochove, K., Romacker, M., Walgemoed, P., Jimenez, R. C., Winnenburg, R., Plasterer, T., Gupta, V., & Hedley, V. (2019). Implementation and relevance of FAIR data principles in biopharmaceutical R&D. Drug Discovery Today, 24(4), 933‑938. https://doi.org/10.1016/j.drudis.2019.01.008

Wise, J., Möller, A., Christie, D., Kalra, D., Brodsky, E., Georgieva, E., Jones, G., Smith, I., Greiffenberg, L., McCarthy, M., Arend, M., Luttringer, O., Kloss, S., & Arlington, S. (2018). The positive impacts of Real-World Data on the challenges facing the evolution of biopharma. Drug Discovery Today, 23(4), 788‑801. https://doi.org/10.1016/j.drudis.2018.01.034

Yang, L., Sun, G., & Eppler, M. J. (2010). Making Strategy Work: A Literature Review on the Factors Influencing Strategy Implementation. In P. Mazzola & F. Kellermanns, Handbook of Research on Strategy Process (p. 13234). Edward Elgar Publishing. https://doi.org/10.4337/9781849807289.00015
