ORCID: a system to uniquely identify researchers



Learned Publishing, 25: 259–264


doi:10.1087/20120404
LEARNED PUBLISHING VOL. 25 NO. 4 OCTOBER 2012

Laurel L. HAAK (a), Martin FENNER (a), Laura PAGLIONE (a), Ed PENTZ (a, b) and Howard RATNER (a, c)
(a) ORCID Inc.
(b) CrossRef
(c) Nature Publishing Group

ABSTRACT. The Open Researcher & Contributor ID (ORCID) registry presents a unique opportunity to solve the problem of author name ambiguity. At its core the value of the ORCID registry is that it crosses disciplines, organizations, and countries, linking ORCID with both existing identifier schemes as well as publications and other research activities. By supporting linkages across multiple datasets – clinical trials, publications, patents, datasets – such a registry becomes a switchboard for researchers and publishers alike in managing the dissemination of research findings. We describe use cases for embedding ORCID identifiers in manuscript submission workflows, prior work searches, manuscript citations, and repository deposition. We make recommendations for storing and displaying ORCID identifiers in publication metadata, with CrossRef integration as a specific example. Finally, we provide an overview of ORCID membership and integration tools and resources.

© Laurel L. Haak, Martin Fenner, Laura Paglione, Ed Pentz and Howard Ratner 2012

The ORCID initiative

ORCID (Open Researcher & Contributor ID) [1] is an international, interdisciplinary, open and not-for-profit organization created to solve the researcher name ambiguity problem for the benefit of all stakeholders, including research institutions, funding organizations, publishers, and researchers themselves. The core mission of ORCID is to provide a registry of persistent unique identifiers for researchers and scholars. Working with stakeholders to embed these identifiers in research workflows, including manuscript submission, will support timely and complete attribution by automating the contributor–research linkage. In turn, the ORCID registry can serve an important role in supporting efforts in the publishing community including conflict-of-interest reporting and author role acknowledgement.

While several author identifier initiatives exist already, they are limited by organization, discipline, or geographic region – or they are part of a proprietary system. However, researchers increasingly work across disciplines and institutions, and are geographically mobile. ORCID is designed for the researcher community: the organization works across all of these boundaries to provide a registry for individuals or their organizations to create identifiers and manage ORCID records. In its first phase, ORCID will provide a self-claim system that allows individuals fine-grained control of privacy settings, as well as data exchange with grant and manuscript submission systems and other identification systems such as Scopus [2], RePEc [3], ResearcherID [4], and VIVO [5].

Use of an identifier must have clear benefits, otherwise it will not be adopted. In discussions with many stakeholders it became clear that simply providing a unique
identifier would not be enough, and that the ORCID identifier needed to be integrated into research workflows and linked to information on research activities such as publications, grants, patents, and datasets. ORCID has taken the stance that the use of an identifier should first and foremost reduce the reporting burden for researchers, both in the immediate task of filling out basic information on forms, as well as in longer-term progress reporting. A good example of this is the manuscript submission process, for which much of the identifying information requested from authors (name, affiliation, email address) could be retrieved from the ORCID registry. System-to-system authentication also provides a researcher with the option to create a ‘trusted’ relationship with a publisher, so that when a manuscript is accepted the publisher can update the researcher’s ORCID record with publication metadata. Other systems can then use the data from the ORCID record to maintain, for example, university profile databases and local digital research repositories, or to support grant progress reporting at funding agencies.

ORCID also recognizes that, first and foremost, individuals own their record. A central principle of the ORCID initiative is that researchers control the defined privacy settings of their own ORCID record data. Individual record holders can control what information is displayed publicly, what is shared with trusted partners, and who those trusted partners are. Furthermore, ORCID does not collect sensitive information. The only information required to register for an identifier is name and email address, and only the ORCID identifier is always publicly available. All other information in the registry can be marked as non-public.

Pain points for publishers

Similar to researchers, publishers are affected by name ambiguity in direct and indirect ways. On the direct path are author databases, which for many journals are collections of duplicate records requiring a substantial investment to disambiguate and manage. This in turn has an impact on the ability to understand an author’s history, perform accurate name-based searches, and find and manage reviewers. Another pain point is the processing of citation metadata, which without author identifiers requires manual disambiguation to match authors and articles.

Perhaps a little less directly, publishers face issues with authorship roles on a daily basis. Who should be listed as an author? How can an author role be appropriately acknowledged? How can publishers discern author responsibility? Linked to this is conflict-of-interest reporting. Who needs to report what, and in what context? Clearly, there is a role for a central registry that crosses disciplines, work places, sectors, and national boundaries. By supporting linkages across multiple datasets – clinical trials, publications, patents, datasets – such a registry becomes a switchboard for researchers and publishers alike in managing the dissemination of research findings.

Use cases

Unique author identifiers are the only way we can address these issues. To be effective, identifiers will need to be incorporated in several publisher workflows. Perhaps the most straightforward is the manuscript submission system, where an ORCID identifier can be collected from the corresponding author at the time of submission. Associating a unique identifier with an author reduces duplicate author accounts and enables publishers to provide a more accurate representation of an author’s prior work and citations. If additional data are linked with the identifier, association with the author profile may assist the author in filling out the submission form, including affiliation information. ORCID identifiers for co-authors are preferably collected after the manuscript is accepted for publication. In this scenario journals could ask all co-authors to enter their ORCID identifiers via authentication with the ORCID service, together with the information on author contributions and potential conflicts of interest. At this stage authors could also agree to a trusted party relationship with the publisher for subsequent ORCID record updates.
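
As a rough illustration of the submission use case above, the sketch below keys author accounts on the ORCID identifier, so that different name forms for the same person collapse into a single account instead of creating duplicates. The class and function names (`AuthorDatabase`, `find_or_create`, `is_valid_orcid`) are hypothetical and are not part of any ORCID or publisher API. The check-character rule (ISO 7064 MOD 11-2 over the first 15 digits) is taken from ORCID's published description of the identifier structure, not from this paper. A checksum only catches typing errors; it does not show that the identifier belongs to the person submitting, which is why collection via authentication is preferred.

```python
import re
from dataclasses import dataclass, field

# Four hyphen-separated groups of four characters; the last may be 'X'.
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the string is well formed and its check character matches."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_character(digits[:15]) == digits[15]


@dataclass
class AuthorAccount:
    """Hypothetical publisher-side author account."""
    orcid: str
    name_variants: set = field(default_factory=set)


class AuthorDatabase:
    """Hypothetical author store holding at most one account per ORCID identifier."""

    def __init__(self):
        self._accounts = {}

    def find_or_create(self, orcid: str, name: str) -> AuthorAccount:
        """Return the account for this identifier, recording the name form used."""
        if not is_valid_orcid(orcid):
            raise ValueError(f"not a valid ORCID identifier: {orcid!r}")
        account = self._accounts.setdefault(orcid, AuthorAccount(orcid))
        account.name_variants.add(name)
        return account

    def __len__(self):
        return len(self._accounts)


# Two submissions under different name forms resolve to the same account.
# 0000-0002-1825-0097 is the sample identifier ORCID uses in its documentation.
db = AuthorDatabase()
first = db.find_or_create("0000-0002-1825-0097", "Josiah Carberry")
second = db.find_or_create("0000-0002-1825-0097", "J. S. Carberry")
print(first is second, len(db))
```

Keying on the identifier rather than the name means the account also survives a name change, at the cost of requiring every author to have an identifier before an account can be matched.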

In addition to manuscript submission, another use case for publishers is searching for prior work, to support vetting of reviewers, creation of author profiles, or to assist in the processing of citations, not all of which might be using DOIs or other unique identifiers. Searching by name is fraught, in particular because authors and/or the journals they publish in use multiple name forms for the same person. Any search needs to incorporate these multiple forms. An identifier linked with an author name would eliminate this ambiguity, and allow not just for multiple variants of the same name but also continuity along a career for those authors who have changed their names. Many publishers have started to build value-added services for their authors, and professional societies that are also publishers are linking author and membership information. Publishers are also connecting author and reviewer databases. ORCID identifiers will greatly facilitate the creation and maintenance of these author profiles, in particular the disambiguation of authors and the linking to external information.

Including ORCID identifiers in manuscript citations would greatly improve citation accuracy with regard to author names, which in turn would assist authors and their institutions in managing publication lists.

Many authors are asked to deposit their accepted manuscripts into institutional and/or discipline-specific repositories. In some cases, the publisher handles this process; in other cases, it is the author’s responsibility to deposit, or a library’s task to find articles produced by university faculty. An ORCID identifier will facilitate local deposition, in particular if done by the library on behalf of their researchers. For those repositories linked to research funding, an identifier is a key component of efforts to link funding with research outputs, and would assist in a number of research reporting efforts currently underway.

Other use cases for publishers include managing login credentials, cross-journal transfer, improving the collection of disclosure information from authors, and management of open access fees and permissions.

Best practices

For any identifier to be effective, it must be widely adopted by the research community – not only among individuals but also at the touch points where research findings are disseminated: publication submission, dataset deposition, research grant and contract applications, faculty and staff profiles, patent applications, etc.

How an identifier is collected at these touch points is critical. Providing a field for a researcher to type in their identifier may create more problems than it solves, in particular due to typographical errors, lack of validation, and inability to update with current data. A more effective method of collection would employ authentication and storage in an existing touch point ‘profile’ such as the author profile, where the identifier could be reused each time the author submits a manuscript. This process would support disambiguation and maintenance of author databases, and could also be used to support unified sign-on, another method to reduce the likelihood of duplicate profile creation.

For longer-term benefits to be realized, publishers will need not only to collect the ORCID identifier but also to store it (and the ID type) with the paper’s metadata, deposit it with the paper in various systems, and determine how best to display that metadata in versions of the manuscript. ORCID uses a semantically opaque identifier [6], meaning that it is not possible to deduce the name or other identifying information from the identifier. It is therefore important to provide a visual display of the ORCID in association with the author name. Best practices for publishers and other organizations on how to include and display ORCID identifiers are still evolving, but based on discussions with publishers and repository managers we recommend including ORCID identifiers at least in the following scenarios:

1. as a footnote or in-line in the HTML and PDF versions of published manuscripts;
2. in the article metadata used on journal websites;
3. in the article metadata sent to CrossRef and bibliographic databases such as PubMed;
4. in downloadable reference lists using the RIS, BibTeX or Endnote format.

Similar recommendations apply to the use and display of ORCID identifiers in research datasets and grant applications. The ultimate goal for ORCID is to have the ORCID unique identifier used whenever a scholarly contribution is made or reported. This not only includes journal articles and books, but also conference abstracts, research datasets, scientific presentations, and other scholarly contributions. Scholarly publishers will play a central role in the adoption of the ORCID identifier.

Publishers use a number of different DTDs. Ideally, the ORCID identifier should be associated with the author name field, so that each author has an identifier and identifier type specified in a <IDtype> format. Some organizations have already made recommendations on how to include ORCID identifiers in the documents they provide. The DataCite Metadata schema [7] describes the core metadata properties for datasets using DataCite DOIs and includes fields for the ORCID identifier (<nameIdentifier>, <nameIdentifierScheme>). The Journal Article Tag Suite (JATS) [8], a NISO draft standard for standardized markup for journal articles based on the NLM DTDs developed at the National Library of Medicine, uses the <contrib> element to identify authors and other contributors to a work and allows the <ext-link> element to store identifier and identifier type inside of it. The @ext-link-type attribute can be used to give an indication of the type of resource to which the external link points. The JATS tag set does not constrain the values of this attribute, and ordinary text is acceptable. Thus, to specify ORCID, a value of <ext-link ext-link-type="orcid"> would be used.

ORCID/CrossRef integration

From the start, the CrossRef and ORCID systems have been envisioned as complementary infrastructures for uniquely identifying researchers and enabling researchers to connect with their publications. CrossRef’s current system for recording author names makes no effort to normalize or disambiguate names. This makes author-name-based queries on the system impossible to do with any meaningful reliability. CrossRef members would like to see ORCIDs deposited along with CrossRef author metadata to improve the querying system. CrossRef would need to make minimal changes in its data structures and API extensions to support ORCID deposits, and publishers would need to ensure that ORCIDs were gathered at manuscript submission. By submitting ORCIDs to CrossRef with publication metadata, the publisher would be verifying the linkage between author and publication necessary for updating an author’s ORCID record.

With the launch of ORCID, researchers will be able to retrospectively search CrossRef metadata for their own publications and add them to their ORCID record. Once ORCIDs start being deposited with CrossRef metadata, CrossRef will be in a position to help ORCID ‘push’ updates to researchers every time a new publication is deposited with ORCIDs into CrossRef. At launch, researchers also will be able to link their ORCID record to other external IDs, and thereby import past publication information into their ORCID record. At a later date, ORCID will support importing of additional information types through this linking method.

These kinds of ‘pushed’ updates from trusted sources such as CrossRef will eventually play an important role in the ORCID system. In essence, a CrossRef pushed update which is then claimed by the researcher indicates that two parties (the publisher and the researcher) agree that the publication DOI and the ORCID are related to each other. Contrast this with the use case of a researcher manually importing or entering a publication into their ORCID record. In this latter case, the ‘publisher’ has played no role in ensuring that that particular ORCID was associated with the respective DOI. In short, this would be a ‘pure self claim’ as opposed to a ‘verified’ claim. In addition to the claims by researchers and publishers, claims by other partners – such as the institution of the author or the funder who paid for the research – will further increase the trust in the claims made in
the ORCID registry. This community review and validation process also offers a level of assurance that the data in ORCID are accurate. More details on how ORCID will deal with different claims about the same piece of work can be found in a white paper [9].

Another important benefit of ‘pushed’ updates is in ensuring that ORCID records are current, which in turn will increase the value of ORCID to all stakeholders. This functionality will be available once publishers start to collect ORCID identifiers and deposit them along with CrossRef metadata.

Integrating with the ORCID service

ORCID is an open initiative; individuals may create, share, and maintain an ORCID record free of charge. ORCID software is made available under an MIT Open Source license. The public data in the ORCID Registry is searchable without a license, and ORCID will provide an annual public data file for free under a Creative Commons Zero waiver. ORCID is sustained by organizational memberships. For the fee, member organizations receive more frequently updated data and authenticated access to the ORCID registry, are registered as trusted organizations, and may create or update ORCID records on behalf of employees or students [10].

ORCID currently provides two application programming interface (API) classes for the community to use. The Tier 1 service can be used by individuals and organizations to query and retrieve public data. The Tier 1 Query API may be used without any registration or configuration. The Tier 2 service is intended for third parties who need to query and retrieve limited access data, update or add new record data, and require production-level integration. Authentication and authorization follow the OAuth 2 standard. Tier 2 is available to member organizations and requires registration of the client application as an OAuth ‘consumer’. The technical design of the ORCID service has been described in more detail elsewhere [11].

The ORCID developer’s portal (http://dev.orcid.org) provides a number of resources to developers in the research community. These include use cases, API classes, a test server, and help documents. These resources are updated as feedback from the community is received. Currently, a number of publishers are testing ORCID integration using these resources, including Nature Publishing Group, Springer, and Wiley; professional society publishers including the American Association for Cancer Research and the Association for Computing Machinery; and service providers including Aries, eJournal Press, HighWire Press, and ScholarOne. The goal is to have ORCID integration in the first manuscript tracking systems implemented when the ORCID registry launches in October 2012, alongside ORCID records created by early adopter academic institutions. This will kick-start the use of ORCID identifiers by the research community.

Summary

By creating a registry for researchers and working with stakeholders to link digital research documents and other contributions to this registry, ORCID aims to provide a high-fidelity solution to the name ambiguity problem in scholarly communication. Benefits include reduced reporting workload, improved attribution, and a better understanding of knowledge flows to support research, collaboration, and evaluation. This vision is only possible if all of the stakeholders work together. This paper details steps for publishers to integrate ORCIDs into manuscript submission systems, and provides recommendations for specifying and displaying ORCID metadata. In addition to publishers, ORCID is working with research organizations, funders, and researchers to detail use cases and requirements to ensure that the system is responsive to their needs.

References

1. Open Researcher & Contributor ID (ORCID), http://about.orcid.org (accessed 27 July 2012).
2. Scopus, http://www.info.sciverse.com/scopus/about (accessed 27 July 2012).
3. Research Papers in Economics (RePEc), http://repec.org/ (accessed 27 July 2012).
4. ResearcherID, http://www.researcherid.com/ (accessed 27 July 2012).
5. VIVO | enabling national networking of scientists, http://vivoweb.org (accessed 27 July 2012).
6. Structure of the ORCID Identifier. http://dev.orcid.org/structure-orcid-identifier (accessed 16 August 2012).
7. DataCite Metadata Schema for the Publication and Citation of Research Data. 2011. doi:10.5438/0005
8. Journal Publishing Tag Set 2011. http://jats.nlm.nih.gov/publishing/ (accessed 25 July 2012).
9. Bilder, G. Disambiguation without de-duplication: modeling authority and trust in the ORCID system, http://about.orcid.org/content/disambiguation-without-de-duplication-modeling-authority-and-trust-orcid-system (accessed 27 July 2012).
10. ORCID membership and subscriptions, http://about.orcid.org/membership (accessed 27 July 2012).
11. Wilson, B. and Fenner, M. 2012. Open Researcher & Contributor ID (ORCID): solving the name ambiguity problem. Educause Review, 47: 54–55.

Laurel L. HAAK, Laura PAGLIONE
ORCID, Inc.
Email: [email protected]

Martin FENNER
Board member ORCID

Ed PENTZ
Board member ORCID and CrossRef

Howard RATNER
Chair ORCID Board and Nature Publishing Group
