Original Manuscript

Research data management and sharing awareness, attitude, and behavior of academic researchers

Information Development
1–15
© The Author(s) 2021
Article reuse guidelines: sagepub.com/journals-permissions
DOI: 10.1177/02666669211048491
journals.sagepub.com/home/idv

Muhammad Rafiq
University of the Punjab

Kanwal Ameen
University of Home Economics, Lahore

Abstract
This study assesses the research data management (RDM) awareness, attitude, practices, and behaviors of Pakistan’s academic researchers. A quantitative survey research method was adopted to meet the research objectives, with an internationally designed structured questionnaire as the data collection instrument, and data were collected from academicians and researchers of four premier universities of Pakistan. The study reveals the data file formats used and produced, data acquisition sources, data storage patterns, metadata and tagging practices, data sharing patterns, and the RDM awareness, attitude, and behavior of the respondents by investigating their self-reported opinions on extensive sets of structured questionnaire items. It is a comprehensive assessment of the phenomenon from the perspective of a developing country where research data management policies are absent at the national and institutional levels. The findings have theoretical implications for researchers and practical implications for policymakers, university administrators, university library administrators, and educational trainers.

Keywords
data literacy, data management practices, data sharing behaviors, data management skills, data management
training, higher education, metadata behaviors, tagging behaviors, Pakistan

Submitted: 5 April 2021; accepted: 19 August 2021.

Corresponding author:
Muhammad Rafiq, Associate Professor, Institute of Information Management, University of the Punjab, Lahore, Pakistan.
Cell: + 92(0)333-3110909.
Email: rafi[email protected] | dr.rafi[email protected]

Introduction

The research and progress in Information and Communication Technologies (ICTs) have resulted in exponential research data growth, mainly in digital formats. A large amount of research data is created and collected in digital formats by all disciplines as digital revolution and infrastructure facilitate storage, sharing, and re-usage of data (Van den Eynden and Corti, 2020). National and international attention focused on RDM during the 2000s (Rice and Haywood, 2011). The ‘Principles and Guidelines for Access to Research Data from Public Funding’ issued by The Organization for Economic Co-operation and Development (OECD, 2007) and the ‘Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities’ (Berlin Declaration, 2003) are considered key drivers towards the opening up of research data for public access. The preceding years witnessed several policy enforcements from different policy institutions to enforce public access to the research data of publicly funded research projects. In the United States, the US Office of Science and Technology Policy (OSTP) memorandum (OSTP, 2013) outlined a vision for academic papers, scholarly products, and research data produced by public funds for research and development. This memo made it
mandatory for all the US federally funded agencies and research bodies, with a threshold of $100 million budget, to ensure the open access (OA) availability of research data and products. UK Research and Innovation’s Data Policy, Common Principles on Data Policy (UKRI, n.d.), and Concordat on Open Research Data (UKRI, 2016) established the framework of effective research data management and sharing by the researchers. Similar developments have taken place in other regions like Australia (National Health and Medical Research Council, 2018), Canada (Government of Canada, 2016), and the European Union (Donnelly, 2017). However, in developing countries’ contexts, policies on data management and sharing do not exist.

Research data may be defined as “Data that are used as primary sources to support technical or scientific enquiry, research, scholarship, or artistic activity, and that are used as evidence in the research process and/or are commonly accepted in the research community as necessary to validate research findings and results. All other digital and non-digital content has the potential of becoming research data. Research data may be experimental data, observational data, operational data, third-party data, public sector data, monitoring data, processed data, or repurposed data” (CASRAI, n.d.).

Data may be conceptualized in different ways and from different perspectives. Researchers/scholars work with many kinds of data and sources. Humanities scholars might talk about their primary sources or texts. Social scientists think in terms of survey results, interviews, observations, and tests. Natural science data come from experiments and observations. On the other side, research data can be qualitative, quantitative, or both and available in print, analogue, or digital formats. Moreover, it may further be divided into numeric, images, audio, video, text, tabular data, modeling data, spatial data, instrumentation data, etc. There are various formats within one type of digital data, e.g., image data may be in JPEG, BMP, GIF, TIFF, JFIF, Exif, PNG, BAT, BPG, WebP, etc. Such dynamics add complexity to Research Data Management (RDM).

CASRAI (n.d.) defines RDM as “the storage, access, and preservation of data produced from a given investigation. Data management practices cover the entire lifecycle of the data, from planning the investigation to conducting it and from backing up data as it is created and used to the long-term preservation of data deliverables after the research investigation has concluded. Specific activities and issues that fall within the category of data management include: file naming (the proper way to name computer files); data quality control and quality assurance; data access; data documentation (including levels of uncertainty); metadata creation and controlled vocabularies; data storage; data archiving and preservation; data sharing and re-use; data integrity; data security; data privacy; data rights; notebook protocols (lab or field).”

Research data is a valuable resource and requires a great deal of time and other resources to create, preserve, manage and re-use. The role of researchers is vital in RDM and sharing. Researchers are the primary agents in collecting, organizing, and sharing data. They hold their data rights, and sharing data with others largely depends on their will and attitude. Policies of major funding bodies around the world put the burden of sharing research data on researchers. Researchers’ awareness, attitude, and behavior correspond to successful RDM and sharing practices. Thus, considering the importance of RDM and the researchers’ role, it seemed appropriate to conduct a study to assess the RDM practices, awareness, and attitude of Pakistani researchers. The literature search revealed a dearth of studies on the topic. This study is based on the primary data collected from Pakistani researchers in connection with an international project on RDM literacy skills and is the first one from Pakistan addressing RDM in the context of individual researchers.

Literature review

Major databases, including Science Direct, Emerald Insight, LISTA, LISA, Google Scholar, etc., were searched by formulating and applying different search strategies to identify the related literature on the topic. A review of the identified literature is presented in this section.

In terms of current RDM practices, the literature showed that individual faculty’s research data falls within the gigabyte range (Akers and Doty, 2013; Chen and Wu, 2017). Researchers rely heavily on personal computers (Chen and Wu, 2017; Wolff-Eisenberg et al., 2016), local computers (Aydinoglu, Dogan, and Taskin, 2017), and mostly at their institutions (Berghmans et al., 2017) to store data, particularly during the active research project. Cloud-based storage such as Google Drive, Dropbox, etc., is used by a smaller proportion of researchers (Wolff-Eisenberg et al., 2016). For long term storage and back up, basic
science researchers used university-based servers and relied heavily on specialized instruments such as hard drives (Akers and Doty, 2013; Burnette, Williams, and Imker, 2016). On the other hand, arts and humanities researchers depend heavily on personal computer/external hard drives as well as Internet-based storage (Akers and Doty, 2013). After a research project is completed, data sharing, with data stored in data repositories, is primarily driven by the requirements of funding bodies. A large number of repositories are available (Data Repositories, 2018; Registry of Research Data Repositories, 2018; Scientific Data, 2018; University of Minnesota, 2018) for a variety of disciplines, including humanities, social sciences, science and technology, health sciences, and other allied and multidisciplinary subjects. However, at present, sharing data in most scientific disciplines (with notable exceptions in genomics, astronomy, physics) is still nominal (Piwowar, 2011; Warrd, Rotman, and Lauruhn, 2014), and research data sharing practices differ at the discipline level (Mallasvik and Martins, 2020). Berghmans et al. (2017) also noted the relationship between data-sharing practices and the field of research. In academic disciplines such as Soil Science, Human Genetics, and Digital Humanities, where data sharing is well established and researchers work collaboratively, data sharing is integral. Open data practices are less uniform in other fields, and data remains limited to the researcher in personal, departmental, or institutional archives. On the other hand, Mallasvik and Martins (2020) indicated that “research data sharing behaviors are heavily mediated by institutional rules and rationalities that inform researchers’ attitudes”.

Complexities in data formats, standards, infrastructure, etc., pose challenges to the researchers to meet RDM requirements. The ease with which researchers collect large and complex data sets is outpacing their knowledge and skill to properly manage them (Whitmire et al., 2015). Accuracy, completeness, and timeliness of data, the disparity in metadata standards, incompatibility between commercial products and institutional databases and online systems, etc., add further complications in RDM. Researchers showed the willingness to share their data in many studies (Aydinoglu et al., 2017; Berghmans et al., 2017; Burnette et al., 2016; Chen and Wu, 2017). However, the extent of individual researchers making their own data available to others is lower (Fecher et al., 2015). Certain fears, concerns, issues, and barriers hinder the researchers from sharing their data. A number of studies reported the problems being faced by the researchers in meeting the RDM requirements. Tenopir et al. (2011) identified common barriers encountered by the researchers in RDM: insufficient time, lack of funding, and lack of standards in managing research data. The researchers also expressed some degree of concern or anxiety regarding their abilities to effectively meet the challenges of RDM (Bardyn et al., 2012). Several other studies highlighted similar issues being faced by the researchers in meeting the requirements of RDM, such as lack of sufficient time to handle the data management (Federer et al., 2016); data storage, integrity, backup options (Mclure et al., 2014); lack of technical skills and knowledge needed (Aydinoglu et al., 2017); and lack of metadata knowledge (Akers and Doty, 2013; Aydinoglu et al., 2017; Burnette et al., 2016). Van Panhuis et al. (2014) conducted a systematic review of the literature on barriers to data sharing in public health. The study identified several researchers’ concerns hindering data sharing and divided these concerns into technical, motivational, economic, political, legal, and ethical. Van den Eynden and Bishop (2014) also highlighted numerous concerns of researchers in data sharing such as fear of competition, fear of being scooped, the cost in both time and money to prepare data and documentation for sharing, absence of funding, absence of professional rewards for data sharing, lack of standards and data infrastructure, and ethical and legal concerns. Similar findings were reported in a qualitative study by Cheah et al. (2015) and through a questionnaire by Schmidt et al. (2016).

In a recent study on researchers of the top 25 most productive Turkish universities, deficiency in technical skills and expertise to meet the RDM requirement was observed along with the absence of RDM policy, procedures, and guidelines; lack of organizational support and training opportunities; lack of finances; and lack of necessary tools and technical support for researchers to meet the RDM requirements (Aydinoglu et al., 2017). A more recent study (Houtkoop et al., 2018) on 600 psychology researchers (identified from Web of Science) also established that researchers have certain concerns in data sharing, such as their perception that sharing data requires extra work, lack of training on sharing data, and fears that data might be misinterpreted and that they might be scooped.

Researchers feel the need for enhanced skills and support in RDM. The researchers showed interest in faculty workshops on data management practices and assistance in preparing data management plans
for grant applications (Akers and Doty, 2013). Chen and Wu (2017) identified the researchers’ requirements of RDM services which include: tools for data recording and processing; introduction of policies of research funding agencies and academic journals’ data requirements; methods and standards for collecting data and ways for publishing and submitting data papers. The researchers expressed their intentions to access RDS through special lectures, social media, online courses, phone, email, instant messengers, training, a platform for knowledge exchange and sharing, workshop, the library microblogging, etc. In Pakistan, Piracha and Ameen (2018) conducted a small-scale qualitative study on RDM practices of university faculty members. The data was collected through semi-structured interviews from ten purposively selected faculty members of the University of the Punjab, Lahore. The study reported that faculty members store their research data on personal computers and devices but prefer a central repository at their university premises for long term data storage. The study reported the need for metadata training and RDM guidance for faculty members.

The literature review established that both quantitative and qualitative studies exist on RDM in the context of the developed world where data management and sharing policies of government and research funders have been devised and implemented. However, in developing countries’ perspectives, policies on data management and sharing do not exist. Thus, such developing countries, like Pakistan, present a special context to be studied. There is a scarcity of studies on RDM awareness, attitudes, and behaviors concerning the developing countries’ researchers, particularly of Pakistan. This study fills the literature gap and comprehensively addresses the phenomenon by unfolding the perceived awareness, attitude, and typical behaviors of academic researchers with regards to the types of data used and produced, sources of data acquisition, data storage patterns and preferences, assigning metadata to research data, and collaboration and data sharing.

Research objectives

The objectives of the study were to:

1. identify the patterns of research data used and produced by the academic researchers.
2. determine the research data collaboration and sharing practices.
3. explore the state of awareness, practices, and attitude of academic researchers regarding RDM.
4. assess the status of RDM training needs of academic researchers.

Research design and methodology

The study adopted a quantitative research design and conducted a questionnaire-based survey to collect data. The study was a part of an international multicultural research project that aimed to collect data about the data literacy and RDM skills of academics and researchers in higher education institutions of different countries and to compare the findings based on disciplines and participating countries. The questionnaire was developed by a team of academicians from information schools of the UK and Turkey. We participated in this survey from Pakistan, and permission was granted to us to use this data for a separate publication that we are submitting here.

This study’s sample included academic researchers from four premier institutions of Pakistan: University of the Punjab, Lahore; GC University, Lahore; University of Engineering and Technology, Lahore; and National University of Science and Technology, Islamabad. The first two universities are the oldest institutions of general education, while the other two are top national universities of specialized disciplines in the country. The survey was launched online, and respondents were invited to participate through emails, listservs, and university teachers’ social media groups. After multiple follow-ups, the study received 271 responses. However, 11 responses were incomplete and discarded. Finally, 260 responses were analyzed to present the findings. The data were analyzed by applying appropriate statistics using SPSS, Ver. 20.

Data analysis

Demographic characteristics of respondents

Data on demographic variables show that the majority of the respondents were male (55%) and research students (57%), were aged 18–35 years (70%), and had less than five years of research experience. Half of the total respondents belonged to science and technology disciplines, followed by social sciences (45.4%) and humanities (4.6%).
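
The percentages reported for the demographic variables can be reproduced from the frequencies listed in Table 1. The following is a minimal Python sketch of that arithmetic, not the authors' SPSS procedure; the function name `percent_share` and the dictionary layout are illustrative assumptions, and the counts are the discipline frequencies from Table 1.

```python
def percent_share(counts, total):
    """Return each category's share of `total` as a percentage (one decimal)."""
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Discipline frequencies as listed in Table 1 (N = 260 analyzed responses).
discipline_counts = {
    "Science and Technology": 130,
    "Social Sciences": 118,
    "Humanities": 12,
}

# Hypothetical helper applied to a single-response variable:
# the categories are mutually exclusive, so the counts add up to N.
shares = percent_share(discipline_counts, total=260)

for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

Because each respondent belongs to exactly one discipline, the three counts add up to the 260 analyzed responses; this would not hold for the multiple-response items reported in the later sections.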

Re-use of data

File types used by the respondents

Researchers were predominantly (81.2%) using standard office documents such as text, spreadsheets, presentations, etc. In contrast, Internet and web-based data (webpages, emails, blogs, social network data, etc.), images (JPEG, GIF, TIFF, PNG, etc.), and structured scientific and statistical data were each used by around 50% of respondents (Figure 1).

Figure 1. File types of data used (N = 260).

Sources of data acquisition

The majority of the respondents (151; 58%) acquire data from multiple known sources, and a half (132; 51%) of the respondents create new data (Figure 2). A little less than half of the respondents acquire data from their own research team/group at the university (111; 43%) and/or from their own research networks such as personal and professional networks (107; 41%).

Figure 2. Sources of research data (N = 260).

Use of data acquired from others/outside resources

Almost half of the respondents use the acquired data for their research after spending a lot of time and effort to make it usable for the project (126; 49%). A similar percentage (123; 47%) reported the use of data with a bit of effort for some cleaning and modifications, while 44 (17%) respondents mentioned that they do not use data from outside sources. Only 13% of respondents mentioned that they use data, as it is, without any problem for their research (Figure 3). Thus, most of the respondents have to modify or clean the data before using it, and significant efforts and time are required for such work.

Data produced

File types of data produced

Most of the respondents (189; 73%) produced data in standard office formats during their research projects (Figure 4). Structured scientific and statistical data (e.g. SPSS, GIS, etc.) formats were produced by ∼50%, followed by data in image formats (JPEG, GIF, TIFF, PNG, etc.) produced by 44%. Only 21 (8%) respondents mentioned audio file production and source code as data during their research work. Thus, in most cases, respondents produce data in
‘standard office documents’ followed by images and structured scientific and statistical formats (Figure 4). The percentages of topmost file formats of data produced are similar to the topmost file formats of open data used that were acquired from other sources (Figure 1).

Figure 3. Use of data from outside sources (N = 260).

Figure 4. File types of data produced (N = 260).

Storage of the data

Respondents were storing their research data using different sources (Figure 5), predominantly (238; 92%) on their own devices (computer, tablets, external drive, etc.). Only 49 (23.6%) respondents would use cloud computing for their data storage, whereas 33 (15%) store data on their universities’ central services or repositories. Data storage in repositories outside of respondents’ institutions was very low (18; 7%).

Assigning metadata to research data

A little less than half of the respondents (120; 47%) assign administrative information such as creator, date of creation, file name, access terms/restrictions, etc., to the research data (Figure 6). One-third of the respondents assign technical metadata (file format, file size, software/hardware needed to manage their research data, etc.) (33%), or discovery information (creator, funding body, project title, project ID, keywords, etc.) (31%). Almost a quarter of the respondents assign descriptive data (28%) or
do not assign any metadata to their research data (23%). It appears that assigning metadata to research data is not imperative to Pakistani researchers.

Figure 5. Storage of produced data (N = 260).

Figure 6. Metadata and tagging datasets (N = 260).

Collaboration and sharing of data

In response to collaboration and data sharing with other researchers, 142 (55%) respondents mentioned that they collaborate and share research data within their teams, and 110 (42%) share with the researchers in the same university. Only one-third (89; 34%) of respondents collaborate and share data beyond their universities. Interestingly, 42 (16%) respondents do not collaborate and share their research data (Figure 7).

Figure 7. Collaboration and sharing of data (N = 260).

Data availability

In response to data availability, 102 (39%) respondents mentioned that their research data is available openly on a request basis. Almost a similar number (94; 36%) reported that their research data is available only to their research team. Only one-fourth of respondents (68; 26%) offer open access to their data to all. However, an almost equal number of respondents put restricted access (46; 18%) or do not provide access (42; 16%) to their research data (Figure 8).

Concerns for sharing data

There were certain concerns about sharing data (Figure 9). Legal and ethical issues (98; 38%) and misuse of data (92; 35%) were the topmost concerns, each mentioned by more than one-third of the respondents, followed by lack of appropriate policies and rights protection (64; 25%), misinterpretation of data (52; 20%), fear of losing the scientific edge (39; 15%), and lack of resources (technical, financial, personal, etc.) (28; 11%). Nevertheless, 81 (31%) respondents have no concern about sharing data with others.

Preferred location of data storage for long term access

The respondents were asked about their preferred location of data storage for long-term access (Figure 10). Two-thirds of respondents (178; 69%) preferred to store data at their universities, whereas 107 (41%) preferred unpaid external storage, followed by storage with the funding body (68; 26%) and paid external storage (53; 20%).

Figure 8. Availability of data (N = 260).

Figure 9. Concerns for sharing data (N = 260).

Preferred funding source for storage and public access to data sets

The respondents were asked about who should pay for the storage and public access of the data created. More than two-thirds (73%) of respondents consider that their university should pay for their research data’s long-term storage, whereas 45% considered that the funding body should pay and 35% mentioned a national body. Only 19% of respondents considered paying by themselves or their teams.

Awareness, skills, and attitude towards data management

The respondents were given a set of statements to assess their awareness, skills, and attitude regarding RDM. The data are presented in Figure 12, Table 2, and Table 3. The data (Figure 12) reveal that the majority of the respondents were unfamiliar or uncertain about the institutional Data Management Plan (DMP) and 107 (51.4%) never used DMP for their own research. They were not aware of any institutional DMP and did not possess skills in using DMP. Only about half (106; 51%) of the respondents were aware of Digital Object Identifier (DOI).

Nevertheless, more than two-thirds were aware of their university’s citing data requirements, and 148 (71.5%) respondents would use the standard style for citing research data. They have skills in using the standard style of citing research data. Regarding their awareness about metadata, 109 (52.4%) were familiar with the term. The respondents wanted to learn about DMP and expressed a need for formal training on metadata, as almost two-thirds (129; 62.3%) agreed that DMP helps in better research data management, and 137 (65.9%) considered that formal training on metadata would be useful for them.

Some other statements related to the skill-based practices were given for responses on a five-point Likert type scale. Table 2 shows that the respondents cite research data often. However, they only sometimes use file naming conventions or standards, work with data that have restricted access, have different versions of the same dataset(s), use system/techniques for version control, and use their own/in-house tags and metadata. The use of metadata standards for tagging their data and the use of datasets that are already tagged with standard metadata are rare. These practices are based primarily on typical skills. Thus, the data (Table 2) reveal that skill-based RDM practices (excluding citing research data) are followed only sometimes or rarely, not often.

Attitude related statements are presented in Table 3, where mean scores reveal that the respondents agree that
their university should have a DMP. Universities should recommend using a standard file naming system and metadata set for uploading data into a repository. However, the respondents neither agreed nor disagreed with two statements: “I am comfortable and willing to share my research data with others,” and “I foresee no problems with sharing my research data”. It means that many of the respondents have certain reservations about sharing data with others. It is also highlighted by the respondents’ agreement with the statement, “I perceive data ethics could be an issue when research data is shared with others.” It may be inferred that respondents have a positive attitude towards RDM and see a major role for their universities in recommending a standard file naming system and metadata sets for the management of their research data. However, they have certain uncertainties about sharing their data.

Figure 10. Preferred location of data storage for long term access (N = 260).

Figure 11. Preferred funding source for data storage and public access (N = 260).

Formal training attained on RDM

Figure 13 presents details about attending formal training on RDM related skills such as data management plan, metadata, file naming, version controlling of data sets, and data citation styles. Almost half of the respondents (49%) did not attend any training related to RDM topics. However, 88 (34%) respondents attended training on data citation styles, followed by 43 (17%) on DMP, 42 (16%) on metadata, 23 (9%) on consistent file naming, and 19 (7%) on version control of datasets. (18.3%) respondents mentioned training on metadata.

Opinion about formal training needs on RDM

Data about the respondents’ interest in formal training (Figure 9) show that most of the respondents (184; 71%) were interested in training on DMP, followed by metadata (144; 55%), data citation styles (133; 51%), file naming (127; 49%), and version control of datasets (119; 46%). Only 35 (14%) respondents showed a lack of interest in attending any sort of training.
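
The ordering of training topics described above can be checked with a short calculation. The following is a minimal Python sketch under stated assumptions, not the authors' SPSS analysis: the name `rank_by_share` and the dictionary layout are hypothetical, the counts are those quoted in the text, and each percentage is assumed to be taken against the 260 analyzed responses.

```python
N_RESPONDENTS = 260  # analyzed responses reported in the study

# Respondents interested in formal training, per topic (counts quoted in the text).
interest_counts = {
    "Data management plan (DMP)": 184,
    "Metadata": 144,
    "Data citation styles": 133,
    "File naming": 127,
    "Version control of datasets": 119,
}


def rank_by_share(counts, total):
    """Sort topics by count (descending) and attach whole-number percentages."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(topic, n, round(100 * n / total)) for topic, n in ranked]


# Hypothetical helper applied to a multiple-response item.
ranking = rank_by_share(interest_counts, N_RESPONDENTS)

for topic, n, pct in ranking:
    print(f"{topic}: {n} ({pct}%)")
```

Respondents could select more than one topic, so the shares are computed per topic against N and are not expected to add up to 100%.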

Figure 12. Analysis of statements on RDM awareness (N = 260).

Discussion and conclusions

This study addressed the researchers’ RDM practices, awareness, and attitude by investigating the respondents’ opinions on extensive sets of statements related to the study’s objectives. The conclusions and discussions are presented below according to the research objectives.

Patterns of data used and produced by the researchers

The absence of equivalent research instruments across international contexts makes comparisons difficult (Wolff-Eisenberg et al., 2016). This study’s findings are quite similar to the findings of Aydinoglu et al. (2017) in terms of file types of data used and data produced. Standard office documents (such as text, spreadsheets, presentations, etc.), images (such as JPEG, GIF, TIFF, PNG, etc.), and structured scientific and statistical data formats were used and produced by most of the respondents. Furthermore, the Internet and web-based data were also used by more than half of the respondents. The respondents acquire data from multiple sources.

Nevertheless, most of them have to modify or clean it before using it. In terms of current data storage places, the respondents’ own devices (computer, tablets, external drive, etc.) were largely used to store the research data. The findings are similar to Piracha and Ameen (2018) in this regard. Moreover, 24% of the respondents of this study use cloud computing for their data storage, whereas 15% rely on their
universities’ central services or repositories. The indicated percentage is also in line with the study of Wolff-Eisenberg et al. (2016), who revealed that institutional research data storage among researchers in universities is under 40%. Less than half of the respondents assign some sort of metadata to their data files for better management and retrieval. Notably, we did not ask about any particular metadata standard being practiced by the respondents. Thus the findings may vary in the case of metadata standardization. The current percentage is satisfactory because it includes very basic attributes assigned to the data by the respondents.

Table 1. Characteristics of respondents (N = 260).

                                Frequency   Percent
Gender
  Male                             143        55
  Female                           109        42.3
  Don’t want to disclose             7         2.7
Primary Role
  Academic staff                   110        42.3
  Research student                 147        56.5
Discipline
  Social Sciences                  118        45.4
  Science and Technology           130        50
  Humanities                        12         4.6
Age
  18–25 years                       90        34.6
  26–35 years                       91        35
  36–45 years                       48        18.5
  46–55 years                       27        10.4
  56–65 years                        4         1.5
Research Involvement
  <5 years                         142        54.6
  5–10 years                        64        24.6
  11–15 years                       23         8.8
  16–20 years                       11         4.2
  >20 years                          7         2.7
  Never involved in research        13         5

Table 2. Analysis of statements on RDM practices (N = 260).

How often do you practice the following statements?                               N     Min   Max   Mean   Std. Dev.
Citing research data                                                              260   1     5     3.73   1.371
Working with data that are generally in the public domain                         260   1     5     3.26   1.298
Using file naming convention or standard                                          260   1     5     2.95   1.377
Having different versions of the same dataset(s)                                  260   1     5     2.85   1.299
Working with data that have restricted access                                     260   1     5     2.81   1.312
Using systems/techniques for version control to easily recognize a specific version   260   1     5     2.78   1.337
Using your own/in-house (your research team) tags and metadata                    259   1     5     2.67   1.419
Using metadata standard for tagging your data                                     260   1     5     2.36   1.329
Using datasets that are tagged with standard metadata                             260   1     5     2.35   1.322

Note: 1 = Never, 2 = Rarely, 3 = Sometimes, 4 = Often, 5 = Almost Always. Cronbach alpha = .086.

Moreover, almost two-thirds of respondents recommended storing data in their universities’ cyberspace. This revelation is similar to the study of Piracha and Ameen (2018), who reported that the respondents (faculty of Pakistani universities) of their qualitative research felt the need for a central repository of the University for research data storage. Based on the respondents’ current practices and wishes, it seems appropriate to recommend that individual universities provide data storage facilities along with some metadata standardization. The purpose of data sharing is the re-use of data, and for data to be re-used, it must be preserved in a standardized and discoverable manner. It is usually observed that data collected at a local scale is contextual and more useful for local researchers. Thus, data availability in a local institutional or national repository may help long-term preservation, discovery, and re-use by other researchers. Higher Education Commission of Pakistan (HEC), the apex body of higher education in the country, may play a vital role in developing such a data repository on the lines of Pakistan Research Repository (http://prr.hec.gov.pk/jspui/). Higher educational institutions (HEIs) may also share their resources to set up such
repositories. Sharing resources by HEIs will particularly be instrumental in a country like Pakistan where HEIs are facing financial constraints in initiating such projects.

Table 3. Analysis of statements on attitude towards RDM (N = 260).

Statements                                                                                   Min  Max  Mean  Std. Dev.
Every university should have a Data Management Plan (DMP)                                    1    5    4.18  .970
Universities should recommend and use a standard file naming system                          1    5    4.14  .957
Every university should have a prescribed metadata set for uploading data into a repository  1    5    4.09  .974
I perceive data ethics could be an issue when research data is shared with others            1    5    3.90  .955
I would like to store my research datasets beyond the lifetime of the project                1    5    3.88  .883
I am comfortable and willing to share my research data with others                           1    5    3.42  1.120
I foresee no problems with sharing my research data                                          1    5    3.27  1.038

Note: 1 = Strongly Disagree, 2 = Disagree, 3 = Neither Agree/Nor Disagree, 4 = Agree, 5 = Strongly Agree. Cronbach alpha = .774.

Figure 13. Formal training attained (N = 260).

Figure 14. Interest in formal training (N = 260).

Collaboration and sharing of data
Collaboration and data sharing is occurring amongst team members and within the university to some extent. Collaboration and sharing of research data beyond researchers’ own universities are reported by one-third of the respondents. In the absence of standardized RDM facilities, infrastructure, and services, sharing of research data depends merely on individual researchers. Centralized repositories, either at the national or university level, may enhance the sharing of research data. Centralized repositories will be more useful for preserving and storing data on a long-term basis, usually after completing the research project. In contrast, institutional-level data storage facilities may be instrumental for successful RDM practices. Almost 40% of respondents mentioned that their research data is available for other researchers on a request basis. Researchers show concerns about sharing their research data, such as legal and ethical issues, misuse of data, etc. Lack of proper policies, rights protection, and misinterpretation of data were also mentioned by almost one-fifth of the respondents. Data sharing requires resources including time, finances, infrastructural support, technical knowledge, and skills. However, given the lack of skills and infrastructural support, together with the concerns reported in the literature and mentioned by the respondents of this study, data sharing remains minimal. This is where policy institutions, research bodies, and institutions need to intervene by revisiting their current policies and/or developing new policies and the necessary infrastructural, informational, and technical support to enhance data-sharing practices. It is necessary to address the concerns of the researchers that are hindering data sharing. Such concerns may be addressed by introducing better standards of data sharing along with the right rewards mechanism. Currently, academic promotions and incentives largely depend on academic
publications, and contributing data sets does not help in this regard. It is necessary to incorporate policies for the recognition of data contributors and to introduce a reward mechanism for data set contributions, alongside academic publications, in academic promotions and incentives.

RDM awareness, practices, and attitude
More than half of the respondents never used a data management plan (DMP) for their research project. Almost 80% were uncertain about the availability of a DMP in their institutions, or they consider that their institutions do not have any DMP. These two findings portray the primitive stage of RDM in Pakistani universities. RDM does not exist from an institutional perspective in the Pakistani higher education sector. The main funding agencies in Pakistan are the Higher Education Commission of Pakistan (HEC) and the Provincial Higher Education Commissions (PHECs). Neither has an RDM policy/strategy nor asks for an RDM plan from the scientists it funds. The situation is similar in the universities, as they do not have formal policies, procedures, software, infrastructure, staff, and services to support the researchers in RDM activities. For example, the University of the Punjab, Lahore (the oldest and largest institution in the country) funds research projects every year, but no RDM policy exists. The situation is similar to the finding of Aydinoglu et al. (2017) in Turkey. It is recommended that HEC and the PHECs should devise a formal policy and set up a mechanism (procedure, software, hardware, training, etc.) to integrate RDM into research activities in Pakistan. This is essential to meet the new norms and requirements of scholarly communication and the scientific community. As mentioned in this paper’s introductory section, funding agencies worldwide have already made data management plans mandatory for every publicly funded research project. Respondents cite research data often. Thus, it may be inferred that Pakistan’s academic researchers have better skills in citing research data, while possessing a moderate level of skill in working with open data, using a file naming convention or standard, working with restricted data, using file version control, and using their own or in-house tags and metadata.

Moreover, they lack the skills to use metadata standards to tag their data and to use datasets tagged with standard metadata, as the respondents rarely practiced these two elements. It is interesting to note that almost half of the respondents do not know or are uncertain about the term metadata. This revelation is like the findings of Aydinoglu et al. (2017), who found a lack of metadata knowledge among Turkish researchers.

The respondents have a positive attitude towards RDM and see a major role for their universities in recommending standard file naming systems and metadata sets to manage their research data. However, they have uncertainties about sharing their data. Such uncertainties are because of the absence of an established mechanism (policies, procedures, hardware, software, training, services, etc.) of RDM in universities. These uncertainties may be diminished by putting an RDM mechanism in place at the national and institutional levels.

Training attained and needs
The data revealed that the RDM training component is absent. The highest number of respondents who attained any training was less than one-third of the total respondents, and interestingly that training was about data citation styles. More than 80% of the respondents never attended any training on metadata or DMP. The situation regarding training on consistent file naming and version control presents a bleak picture too. It is encouraging that the respondents showed a great interest in getting training on DMP, metadata, data citation styles, file naming, and file version control. Two-thirds of the respondents mentioned that formal training on metadata would be useful for managing research data. This is the area where institutions need to plan and intervene by actively programming seminars, workshops, and training programs. Such programs may be instrumental in addressing the challenges of RDM by enhancing the researchers’ knowledge and skills. With enhanced knowledge of data formats, metadata standards, the RDM framework, and the requirements of policy institutions, funding bodies, journals, and data repositories, along with enhanced RDM skills and technical capabilities, it may be assumed that researchers will be more inclined to treat managing and sharing their data as an imperative.

Originality
The study offers a comprehensive assessment of the phenomenon from a developing country perspective where research data management policies are absent at national and institutional level. The study fills the literature gap and is the first of its kind in the context of Pakistani researchers.

Limitations
Participation in the study was voluntary; thus, the generalizability of the results in statistical terms is limited.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) received no financial support for the research, authorship and/or publication of this article.

ORCID iD
Kanwal Ameen https://orcid.org/0000-0001-7909-1862
Muhammad Rafiq https://orcid.org/0000-0002-8291-2569

References
Akers KG and Doty J (2013) Disciplinary differences in faculty research data management practices and perspectives. International Journal of Digital Curation 8(2): 5–26.
Aydinoglu AU, Dogan G and Taskin Z (2017) Research data management in Turkey: Perceptions and practices. Library Hi Tech 35(2): 271–289.
Bardyn TP, Resnick T and Camina SK (2012) Translational researchers’ perceptions of data management practices and data curation needs: Findings from a focus group in an academic health sciences library. Journal of Web Librarianship 6(4): 274–287.
Berghmans S, Cousijn H, Deakin G, et al. (2017) Open data: the researcher perspective. Available at: https://doi.org/10.17632/bwrnfb4bvh.1 (accessed 18 October 2020).
Berlin Declaration (2003) Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities. Available at: https://openaccess.mpg.de/Berlin-Declaration (accessed 15 June 2020).
Burnette MH, Williams SC and Imker HJ (2016) From plan to action: Successful data management plan implementation in a multidisciplinary project. Journal of EScience Librarianship 5(1): 6.
Cheah PY, Tangseefa D, Somsaman A, et al. (2015) Perceived benefits, harms, and views about how to share data responsibly: A qualitative study of experiences with and attitudes toward data sharing among research staff and community representatives in Thailand. Journal of Empirical Research on Human Research Ethics 10(3): 278–289.
Chen X and Wu M (2017) Survey on the needs for chemistry research data management and sharing. Journal of Academic Librarianship 43(4): 346–353.
Consortia Advancing Standards in Research Administration Information (CASRAI) (n.d.) The CASRAI dictionary. Available at: https://casrai.org/rdm-glossary/ (accessed 16 June 2020).
Data Repositories (2018) Available at: http://oad.simmons.edu/oadwiki/Data_repositories (accessed 15 June 2020).
Donnelly M (2017) An analysis of national open data and open science policies in Europe: process and findings. Report, Digital Curation Centre/SPARC-Europe. Available at: https://sparceurope.org/new-sparc-europe-report-analyses-open-data-open-science-policies-europe/.
Fecher B, Friesike S, Hebing M, et al. (2015) A reputation economy: Results from an empirical survey on academic data sharing. arXiv preprint arXiv:1503.00481.
Federer LM, Lu YL and Joubert DJ (2016) Data literacy training needs of biomedical researchers. Journal of the Medical Library Association: JMLA 104(1): 52–57.
Government of Canada (2016) Tri-agency statement of principles on digital data management. Available at: https://www.ic.gc.ca/eic/site/063.nsf/eng/h_83F7624E.html (accessed 15 June 2020).
Houtkoop BL, Chambers C, Macleod M, et al. (2018) Data sharing in psychology: A survey on barriers and preconditions. Advances in Methods and Practices in Psychological Science 1(1): 70–85.
Mclure M, Level AV, Cranston CL, et al. (2014) Data curation: A study of researcher practices and needs introduction and research questions. Portal Libraries and the Academy 14(2): 139–164.
Mallasvik ML and Martins JT (2020) Research data sharing behaviour of engineering researchers in Norway and the UK: Uncovering the double face of Janus. Journal of Documentation 77(2): 576–593.
National Health and Medical Research Council (2018) Australian code for the responsible conduct of research. Available at: https://www.nhmrc.gov.au/about-us/publications/australian-code-responsible-conduct-research-2018 (accessed 15 June 2020).
Office of Science and Technology Policy (OSTP) (2013) OSTP increasing access to the results of federally funded scientific research. Available at: https://obamawhitehouse.archives.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf (accessed 13 March 2018).
Organisation for Economic Cooperation and Development (OECD) (2007) OECD principles and guidelines for access to research data from public funding, OECD Publishing. Available at: https://doi.org/10.1787/9789264034020-en-fr.
Piracha HA and Ameen K (2018) Research data management practices of faculty members. Pakistan Journal of Information Management and Libraries 20: 60–75.
Piwowar HA (2011) Who shares? Who doesn’t? Factors associated with openly archiving raw research data. PLoS ONE 6(7): 1–13.
Registry of Research Data Repositories (2018) Available at: https://www.re3data.org/.
Rice R and Haywood J (2011) Research data management initiatives at University of Edinburgh. International Journal of Digital Curation 6(2): 232–244.
Schmidt B, Gemeinholzer B and Treloar A (2016) Open data in global environmental research: The Belmont Forum’s Open data survey. PLoS ONE 11(1): e0146695. https://doi.org/10.1371/journal.pone.0146695.
Scientific Data (2018) Recommended data repositories. Available at: https://www.nature.com/sdata/policies/repositories#close (accessed 25 July 2020).
Tenopir C, Allard S, Douglass K, et al. (2011) Data sharing by scientists: Practices and perceptions. PLoS ONE 6(6): e21101. https://doi.org/10.1371/journal.pone.0021101.
UK Research and Innovation (UKRI) (2016) Concordat on open research data. Available at: https://www.ukri.org/files/legacy/documents/concordatonopenresearchdata-pdf/ (accessed 16 June 2020).
UK Research and Innovation (UKRI) (n.d.) Data policy - UK Research and Innovation. Available at: https://www.ukri.org/funding/information-for-award-holders/data-policy/ (accessed 16 June 2020).
University of Minnesota (2018) Discipline-based data archives. Available at: https://www.lib.umn.edu/data-management/datacenters (accessed 18 June 2020).
Van den Eynden V and Bishop L (2014) Sowing the seed: incentives and motivations for sharing research data, a researcher’s perspective. Available at: https://repository.jisc.ac.uk/5662/1/KE_report-incentives-for-sharing-researchdata.pdf (accessed 5 July 2020).
Van den Eynden V and Corti L (2020) The importance of managing and sharing research data. In: Corti L, Eynden V, Bishop L and Woollard M (eds) Managing and Sharing Research Data. London: Sage Publications, 1–32.
Van Panhuis WG, Paul P, Emerson C, et al. (2014) A systematic review of barriers to data sharing in public health. BMC Public Health 14(1): 1–9.
Waard AD, Rotman D and Lauruhn M (2014) Research data management at institutions: visions, bottlenecks and ways forward. Library Connect Digest: 18–21. Available at: https://www.elsevier.com/__data/assets/pdf_file/0020/1046603/Library-Connect-Digest-2014.pdf (accessed 15 March 2021).
Whitmire AL, Boock M and Sutton SC (2015) Variability in academic research data management practices: Implications for data services development from a faculty survey. Program: Electronic Library and Information Systems 49(4): 382–407.
Wolff-Eisenberg C, Rod AB and Schonfeld RC (2016) UK Survey of Academics 2015: Ithaka S + R | Jisc | RLUK. Available at: https://doi.org/10.18665/sr.282736 (accessed 15 March 2021).

About the authors

Muhammad Rafiq is currently serving as Associate Professor at the Department of Information Management, University of the Punjab, Lahore, Pakistan. Dr Rafiq has over 20 years of experience in teaching, conducting and supervising research, and administrating libraries and archives of government and non-government organizations. He has published several research articles, book chapters, and conference papers in esteemed journals and book series. Dr Rafiq has also received the Jay Jordan IFLA/OCLC Early Career Development Fellowship 2005, the ASIS&T Best Paper Award 2009, and a Fulbright Fellowship, USA for his Post-Doc studies at the State University of New York at Buffalo, NY, USA. He has also been serving as the Editor of the Pakistan Journal of Information Management and Libraries (PJIM&L), a Scopus journal, since 2014. His research interests are: open access, research data management, social media, ICT applications in information settings, and digital libraries. Contact: Institute of Information Management, University of the Punjab, Lahore [Pakistan]. Email: rafi[email protected].pk | drrafi[email protected] Cell: + 92(0)333-3110909

Kanwal Ameen was a Professor in Information Management at the University of the Punjab, Lahore [Pakistan]. She has served as Chairperson (2009–2018) of the Department of Information Management (University of the Punjab), Chairperson of the Doctoral Program Coordination Committee (DPCC), and Director, Directorate of External Linkages. Since 2018, she has been serving as Vice-Chancellor of the University of Home Economics, Lahore. Ameen is one of the most prolific authors of her discipline in Pakistan, has published more than 150 scholarly publications, and remained the chief editor of the Pakistan Journal of Information Management and Libraries till 2018. She has also served ASIS&T, ALISE, and IFLA in different capacities. Contact: Institute of Information Management, University of the Punjab, Lahore [Pakistan]. Email: [email protected]
