
Research Paper 1 (Health Informatics)

Research paper on integrating health data in EHR format, with a discussion of openEHR.


Semantic Interoperability in Standardized Electronic Health Record Databases
SHELLY SACHDEVA and SUBHASH BHALLA, University of Aizu, Japan

Different clinics and hospitals have their own information systems to maintain patient data. This hinders
the exchange of data among systems (and organizations). Hence there is a need to provide standards for
data exchange. In digitized form, the individual patient’s medical record can be stored, retrieved, and shared
over a network through enhancement in information technology. Thus, electronic health records (EHRs)
should be standardized, incorporating semantic interoperability. A subsequent step requires that healthcare
professionals and patients get involved in using the EHRs, with the help of technological developments.
This study aims to provide different approaches in understanding some current and challenging concepts
in health informatics. Successful handling of these challenges will lead to improved quality in healthcare
by reducing medical errors, decreasing costs, and enhancing patient care. The study is focused on the
following goals: (1) understanding the role of EHRs; (2) understanding the need for standardization to
improve quality; (3) establishing interoperability in maintaining EHRs; (4) examining a framework for
standardization and interoperability (the openEHR architecture); (5) identifying the role of archetypes for
knowledge-based systems; and (6) understanding the difficulties in querying EHR data.
Categories and Subject Descriptors: A.1 [General Literature]: Introductory and Survey; H.2.1 [Database
Management]: Logical Design; J.3 [Computer Applications]: Life and Medical Sciences—Medical infor-
mation systems
General Terms: Design, Languages, Standardization
Additional Key Words and Phrases: Electronic health records, data quality in healthcare, archetype-based
EHR, quality-based EHR, semantic interoperability, standardization in EHR, openEHR
ACM Reference Format:
Sachdeva, S. and Bhalla, S. 2012. Semantic interoperability in standardized electronic health record
databases. ACM J. Data Inf. Qual. 3, 1, Article 1 (April 2012), 37 pages.
DOI = 10.1145/2166788.2166789 http://doi.acm.org/10.1145/2166788.2166789

1. INTRODUCTION
Healthcare is an information-intensive activity producing large quantities of data from
laboratories, wards, operating theatres, primary care organizations, and from wearable
and wireless devices [Simonov et al. 2005]. Thus, the management of information across
systems and organizations requires collaboration, portability, and data integration. In
addition, both patient safety and healthcare costs influence the quality of healthcare.
To obtain these, efficient and accurate data capture is needed. In view of these contin-
gencies, EHRs are becoming a method by which physicians are able to electronically
capture high-quality data at a fast speed and at low cost.

Authors’ addresses: S. Sachdeva, Graduate Department of Computer and Information Systems, Univer-
sity of Aizu, Japan; email: [email protected]; S. Bhalla, Graduate Department of Computer and
Information Systems, University of Aizu, Japan; email: [email protected].
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted
without fee provided that copies are not made or distributed for profit or commercial advantage and that
copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for
components of this work owned by others than ACM must be honored. Abstracting with credit is permitted.
To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this
work in other works requires prior specific permission and/or a fee. Permissions may be requested from
Publications Dept., ACM, Inc., 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212)
869-0481, or [email protected].
© 2012 ACM 1936-1955/2012/04-ART1 $10.00
DOI 10.1145/2166788.2166789 http://doi.acm.org/10.1145/2166788.2166789

ACM Journal of Data and Information Quality, Vol. 3, No. 1, Article 1, Publication date: April 2012.

1.1. Role of Electronic Health Records


An Integrated Care EHR [ISO/TC215 2003] is defined as: “a repository of information
regarding the health of a subject of care in computer processable form, stored and
transmitted securely, and accessible by multiple authorized users. It has a commonly
agreed logical information model which is independent of EHR systems. Its primary
purpose is the support of continuing, efficient and quality integrated healthcare and
it contains information which is retrospective, concurrent and prospective”. EHRs are
made easily accessible through the World Wide Web (WWW). Consequently, this data
can be used by clinicians, patients, healthcare organizations, and decision makers for
a variety of purposes.
A record of the longitudinal health history of each patient is required to improve qual-
ity of care. Some of the main concerns in maintaining EHRs are privacy, security, stan-
dardization, and interoperability. EHRs will play an important role in telemedicine,
emergency situations, homecare, epidemiological situations, and in creating an e-health
environment. The new environment will help to prevent medication errors, reduce du-
plication, and save time. It will facilitate better coordination of long-term patient data.
Hence, EHRs must be designed to capture relevant clinical data using standardized
data definitions and standardized quality measures. These will help in improving pre-
ventive care and in increasing physician efficiency [Poissant et al. 2005; Øvretveit
et al. 2007]. Several organizations are working to create EHR standards, such as
openEHR Foundation [openEHR 2009]; Consolidated Health Informatics Initiative
(CHI) [EHR Standards 2009]; Certification Commission for Healthcare Information
Technology (CCHIT) [EHR Standards 2009]; Healthcare Information and Manage-
ment Systems Society (HIMSS) [EHR Standards 2009]; International Organisation
for Standardisation (ISO); American National Standards Institute (ANSI) [EHR Standards
2009]; Canada Health Infoway [EHR Standards 2009]; and European Committee
for Standardization (CEN’s TC 251) [CEN/TC251]. The European Institute for Health
Records, or EuroRec Institute, is a European certification body that defines functional
criteria and provides quality labelling for EHRs [EuroRec 2010]. It has published a
document that details the management and maintenance policies for EHR interoperability
resources based on both ISO 13606 and openEHR [Kalra et al. 2008]. The
standards created by these organizations are formalized, controlled, and documented
[Lewis et al. 2008].
Standards provide a definitional basis for interoperability [Atalag et al. 2010].
The single-vendor, closed-data paradigm of commercial development is broken by
the open-source software development sector promoting shared, universal standards.
The openEHR Foundation has proposed openEHR standards that support version-
controlled health records. Version control enables all past states of a health record to
be investigated in the event of clinical or medico-legal investigations. The openEHR
stores the most frequently used information in a separate record for fast lookup and
querying. In this report, we study the openEHR proposals and use the term “openEHR”
to mean the foundation or the standard, depending on the context. The openEHR Foundation
was established by University College London and Ocean Informatics [2009]; it is
an international foundation working towards semantic interoperability of EHRs
and improvement of healthcare. Recently, Microsoft has also adopted the openEHR
approach for EHRs [Microsoft 2009]. Other organizations are also developing standards,
such as HL7 version 3 and CEN 13606, with similar goals [Eichelberg 2005].
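
The role of version control described above can be illustrated with a minimal sketch. It assumes a simple append-only store; `VersionedRecord`, `commit`, `latest`, and `as_of` are hypothetical names chosen for illustration and are not taken from the openEHR specifications.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class VersionedRecord:
    """Append-only record: every commit is kept, nothing is overwritten."""
    versions: List[Dict[str, Any]] = field(default_factory=list)

    def commit(self, content: Dict[str, Any], author: str) -> int:
        """Store a new state and return its 1-based version number."""
        self.versions.append({"author": author, "content": dict(content)})
        return len(self.versions)

    def latest(self) -> Dict[str, Any]:
        """Return the most recently committed state."""
        return self.versions[-1]["content"]

    def as_of(self, version: int) -> Dict[str, Any]:
        """Return the state exactly as it was at an earlier version."""
        return self.versions[version - 1]["content"]


record = VersionedRecord()
v1 = record.commit({"allergy": "none known"}, author="dr_a")
v2 = record.commit({"allergy": "penicillin"}, author="dr_b")
```

Because earlier states are never modified, `record.as_of(v1)` still returns what was recorded at the first commit, which is the property a medico-legal investigation would rely on.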

1.2. Data Quality in Electronic Health Records


EHR data quality is often considered only within the narrow scope of data verification
and validation. Data quality should also concern the equally critical aspect of assuring


[Figure 1 diagram: Hospital A and Hospital B each hold an EHR database containing
demographic data and medical records; the two are connected by semantic exchange.]

Fig. 1. Standardization of an electronic health record.

that EHR data is appropriate for use [Orfanidis et al. 2004]. The various data issues
can be incompleteness (missing information), inconsistency (information mismatch be-
tween various sources or within the same EHR data sources), and inaccuracy (nonspe-
cific, nonstandards-based, inexact, incorrect, or imprecise information) [Gendron and
D’Onofrio 2001; Hristidis 2009]. Such inaccuracies in the attribute values of patient
records make it difficult to find specific patient records [Mikkelsen and Aasly 2005].
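
Two of the data issues listed above, incompleteness and inconsistency, can be checked mechanically. The following is a minimal sketch under stated assumptions: the field names, the `REQUIRED` tuple, and the two sample records are hypothetical and only illustrate the idea.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]

# Assumed set of required fields; a real system would define its own.
REQUIRED = ("patient_id", "birth_date", "blood_group")


def missing_fields(record: Record) -> List[str]:
    """Incompleteness: required fields that are absent or empty."""
    return [name for name in REQUIRED if not record.get(name)]


def mismatched_fields(a: Record, b: Record) -> List[str]:
    """Inconsistency: fields filled in by both sources with different values."""
    return sorted(name for name in a.keys() & b.keys()
                  if a[name] and b[name] and a[name] != b[name])


hospital_a = {"patient_id": "P1", "birth_date": "1970-01-02", "blood_group": "A+"}
hospital_b = {"patient_id": "P1", "birth_date": "1970-02-01", "blood_group": None}
```

Here `missing_fields(hospital_b)` reports the empty blood group, and `mismatched_fields(hospital_a, hospital_b)` reports the conflicting birth dates.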
The remaining part of this article is organized as follows. Section 2 emphasizes the
role of standardization of EHR for improvement of quality. Section 3 describes semantic
interoperability with emphasis on a dual-model approach. In Section 4, the standard-
ized openEHR architecture is discussed. Section 5 details enhancement of quality by
use of archetypes. It also explains an archetype description language, a language rich
enough to capture and model entities within the medical care domain. In Section 6, the
real challenge of achieving quality in healthcare systems is addressed. Section 7 ex-
plains querying the EHR data, describing various research challenges in querying and
presenting a brief description of an archetype query language. Section 8 presents dis-
cussions of the challenges in design and implementation. Section 9 presents high-level
query language interfaces. Section 10 describes the semantic interoperability consid-
erations. Finally, the summary and conclusions for the research study are included in
Section 11.

2. STANDARDIZATION OF EHR FOR IMPROVING QUALITY


Data collected in various systems can have quality faults [Miettinen and Korhonen
2008]. It can, for instance, be noncoherent or include contradictory information. The
desired data may be completely missing. For example, the unit of a temperature reading may
not be recorded explicitly as degrees Celsius or Fahrenheit, or the value may lie outside the
permissible range. There is a need for a communication format and protocol for the purpose of
standardization, since a patient’s health information is shared in a multi-disciplinary
(shared care) environment. Thus, the development and adoption of national and in-
ternational standards for EHR interoperability is essential. It is necessary to support
interoperability between software from different vendors. Standardization will enhance
the quality of EHR systems [Miettinen and Korhonen 2008; Maldonadoa et al. 2007],
and, in this regard, many research studies discuss different approaches to improve
quality [Øvretveit 2003].
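
The temperature example above can be written as a small check. This is a sketch only: the unit labels `degC` and `degF` and the numeric ranges in `PERMITTED` are assumptions for illustration, not clinical guidance or values from any standard.

```python
from typing import List, Optional

# Assumed plausible ranges per unit (illustrative values only).
PERMITTED = {"degC": (25.0, 45.0), "degF": (77.0, 113.0)}


def check_temperature(value: float, unit: Optional[str]) -> List[str]:
    """Return a list of quality faults found in one temperature reading."""
    problems = []
    if unit is None:
        problems.append("unit missing")
    elif unit not in PERMITTED:
        problems.append(f"unknown unit: {unit}")
    else:
        low, high = PERMITTED[unit]
        if not low <= value <= high:
            problems.append(f"value {value} outside {low}-{high} {unit}")
    return problems
```

A reading of 98.6 with no unit is flagged as ambiguous, and the same number labelled as Celsius is flagged as out of range, which are the two faults the paragraph describes.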
It is useful to consider the Internet as an analog in development of standardization.
As a result of many years of research, the Internet is based on standards such as TCP/IP,
SMTP, UTF-8, XML, and HTML. In most cases, a user is able to use any browser on
any platform to navigate the World Wide Web. Similarly, there is a need to achieve
timely and consistent access to EHRs. Currently, there is no single universally accepted
clinical data model that will be adhered to by all [Blobel and Pharow 2008]. Figure 1
illustrates why standardization is needed in interorganization transfer of data. The
meaning of information must be preserved across various applications, systems, and


enterprises. However, the major problem is the large number of different (proprietary
or standardized) interfaces that are in use [Bott 2004]. For example,
(i) Message or interface standards: Health Level 7 (HL7) [HL7 2010]; Electronic Data
Interchange for Administration, Commerce and Transport (EDIFACT); and Digital
Imaging and Communications in Medicine (DICOM);
(ii) Content-oriented standards: Logical Observation Identifiers Names and Codes
(LOINC); the International Statistical Classification of Diseases and Related
Health Problems, 10th Revision (ICD-10); and the International Classification of
Procedures in Medicine (ICPM); and
(iii) Hybrid standards: CEN 13606 and openEHR.
A recent study compares the available EHR standards [Blobel and Pharow 2008], and
there are mappings being developed among standards. For example, ISO 13606-1 is the
model to enable data to be shared between different EHR systems [ISO 13606-1 2008].
A mapping algorithm is also being developed that allows a bidirectional transform
between openEHR and ISO 13606 [Beale 2010].
Formally, four layers of standardization have been recognized: content, structure,
technological, and organizational [Bott 2004].
(i) The content layer addresses aspects of coding. It uses terminological systems such
as classifications or controlled vocabularies.
(ii) The structure layer focuses on regulations concerning the structure of EHR elements.
Examples include XML files that are based on standardized DTDs (or
XML Schemas). Several content-oriented aspects, such as a discharge letter, are
usually modeled by defining the structure.
(iii) The technological layer contains regulations concerning aspects such as software
and hardware components, distribution, objects, and services, and the Public Key
Infrastructure (PKI) for data security.
(iv) The organizational layer focuses on changes caused by the usage of an EHR system
in an organization. These concern business processes, guidelines, protocols, roles,
and PKI.
Organizations adopt the standards to achieve interoperability and promote informa-
tion quality [Lewis et al. 2008]. However, there are problems in reaching agreements
on standards. There is a technical problem involving the development of a language
sufficiently rich to capture and model the medical domain. Also, there is a human
problem involving agreement on what is contained within the domain, and why it is
important. This study addresses these problems in Sections 5 and 6.

2.1. Standardized EHRs

In essence, the proposed electronic health records (EHRs) have a complex structure
that may include data from about 100 to 200 parameters, such as temperature, blood
pressure, and body mass index. Each parameter has its own contents, which together offer
complete knowledge about a clinical context: “data” (the attributes of the data captured,
e.g., for a blood pressure observation); “state” (context for interpretation of data); and
“protocol” (information regarding gathering of data), as shown in Figure 2 (depicting
completeness).
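
The three-part structure of a parameter (data, state, protocol) can be sketched as nested record types. The class and field names below are hypothetical; they borrow labels from the blood pressure figure, and the use of mmHg for the two pressures is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BPData:
    systolic_mmhg: float            # unit is an assumption of this sketch
    diastolic_mmhg: float
    comment: Optional[str] = None


@dataclass
class BPState:
    position: Optional[str] = None        # e.g. "sitting"
    exertion_level: Optional[str] = None  # e.g. "at rest"


@dataclass
class BPProtocol:
    instrument: Optional[str] = None
    cuff_size: Optional[str] = None
    location: Optional[str] = None        # e.g. "arm"


@dataclass
class BloodPressureObservation:
    data: BPData
    state: BPState
    protocol: BPProtocol

    def missing_context(self) -> List[str]:
        """Names of state/protocol items that were not recorded."""
        gaps: List[str] = []
        for part_name in ("state", "protocol"):
            part = getattr(self, part_name)
            gaps += [f"{part_name}.{key}"
                     for key, value in vars(part).items() if value is None]
        return gaps


reading = BloodPressureObservation(
    data=BPData(120.0, 80.0),
    state=BPState(position="sitting"),
    protocol=BPProtocol(instrument="manual cuff", location="arm"),
)
```

Keeping state and protocol next to the measured values makes it possible to ask how complete a reading is, for example with `reading.missing_context()`.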
In order to serve as an information interchange platform, EHRs aim to use archetypes
to accommodate various forms of content [Beale and Heard 2008a; ISO 13606-1 2008].
The EHR data will have a multitude of representations. The contents may be struc-
tured, semi-structured or unstructured, or a mixture of all three. These may be plain
text, coded text, paragraphs, measured quantities (with values and units); date, time,
date-time (and partial date/time); encapsulated data (multimedia, parsable content);


[Figure 2 diagram: the blood pressure concept with four branches. Data: systolic blood
pressure, diastolic blood pressure, mean arterial pressure, comment. State: position
(standing, sitting, lying) and exertion level (at rest, exercise, post-exercise). Specific
events: baseline reading, 5 minutes reading, 10 minutes reading, postural change,
paradox. Protocol: instrument, cuff size, location of measurement (leg, arm), side.]

Fig. 2. Blood pressure as a concept.

basic types (such as Boolean, state variable); container types (list, set); and uniform
resource identifiers (URI).

3. ESTABLISHING INTEROPERABILITY IN MAINTAINING EHR


Different clinics and hospitals have their own complex information systems to maintain
patient data. These may be made up of paper records or electronic medical records
(EMRs). EMRs consist of data such as patient demographics, medical history, medication
and allergy lists (including immunization status), laboratory test results, radiology
images, billing records, and advance directives. There is redundancy in existing data
because of distributed and heterogeneous data resources. The same data is sometimes
input several times, leading to quality faults. An additional concern is the complexity
of the health domain, which is evolving at a fast rate. Healthcare-related knowledge is
becoming broad, deep, and rich with time. Thus, there is a need for legacy migration
of data for interoperability and health information exchange [Walker et al. 2005].
EHRs can be standardized and should incorporate semantic interoperability. The National
Health Information Network (NHIN) defines semantic interoperability as “the ability to
interpret, and, therefore, to make effective use of the information so exchanged” [NHIN
2005]. Similarly, IEEE Standard 1073 defines semantic interoperability as shared data
types, shared terminologies, and shared codings [Kennelly 1998].
Consequently, standardized terminology is a critical requirement for healthcare applications
to ensure accurate diagnosis and treatment. It has led to the development of standards
such as the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT)
[SNOMED 2009]. Shared codings refer to establishing standard encodings to be shared
among systems. Such codings refer not only to encoding software functions, but also to
encoding medical diagnoses and procedures for claims processing purposes, research,
and statistics gathering.1 Shared data types refer to the types of data exchanged by
systems. Interoperability requires that systems share data types on many different
levels, including messaging formats (e.g., XML, ASCII), and programming languages
(e.g., integer, string). Shared terminologies refer to establishing a common vocabulary
for the interchange of information. Semantic interoperability may also require support

1 For example, the International Statistical Classification of Diseases and Related Health Problems, 10th
Revision (ICD-10) and Current Procedural Terminology, a systematic coding system for reporting
medical services and procedures performed by physicians.


[Figure 3 diagram: three actors. The clinical domain expert is associated with the clinical
model (archetypes and templates); the software expert is associated with the database
schema (reference model); the third actor is the clinical user.]

Fig. 3. Two-level modeling approach.

of ontological mappings at the conceptual level (Figure 2 and Section 5). The follow-
ing sections present the model (Section 3) and specifications (Section 4) for achieving
interoperability.

3.1. Dual Model Approach


Traditionally, three methodologies have been used for building systems such as EHR
systems [Patrick et al. 2006]:
—an unstructured approach,
—the “BIG” model approach, and
—the generic approach.
The unstructured approach to EHR is simply a warehouse filled with unstructured
text; the “BIG” model approach has a separate table for each clinical concept leading
to excessively large schemas. However, the generic model is designed to allow a wide
variety of data to be accommodated in a general-purpose set of data structures. The
stored data is similar to that of the unstructured process.
In order to overcome the problem of lower data quality (DQ) from generic model-
ing, a constraint mechanism must be introduced (for each parameter) to ensure that
the stored information is valid in terms of the clinical domain (Figure 3). The mech-
anism is referred to as the “Archetype Model”. This model expresses the character
of clinical data attributes and stores this information as data in the database rather
than in the database schema. Quality improves because the software need not be changed
each time the domain knowledge changes. The model was initially
developed by the Good Electronic Health Record project [GEHR], and later adopted by
the openEHR Foundation. It is well aligned with, and makes advances on, the CEN 13606
standard [CEN/TC251].
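
The idea of keeping constraints as data rather than in the schema can be sketched as follows. This is not the openEHR archetype model: the `ARCHETYPES` table, its rule format, and the numeric limits are hypothetical and serve only to show that a new clinical concept can be added without touching the validation code.

```python
from typing import Any, Dict, List

# Generic storage: the same structure is used for every clinical concept.
Entry = Dict[str, Any]

# Constraints are held as data (illustrative limits, not clinical values).
ARCHETYPES: Dict[str, Dict[str, Dict[str, Any]]] = {
    "blood_pressure": {
        "systolic": {"type": float, "min": 0.0, "max": 1000.0},
        "diastolic": {"type": float, "min": 0.0, "max": 1000.0},
    },
}


def validate(concept: str, entry: Entry) -> List[str]:
    """Check a generic entry against the constraints stored for its concept."""
    errors = []
    for name, rule in ARCHETYPES[concept].items():
        if name not in entry:
            errors.append(f"{name}: missing")
            continue
        value = entry[name]
        if not isinstance(value, rule["type"]):
            errors.append(f"{name}: expected {rule['type'].__name__}")
        elif not rule["min"] <= value <= rule["max"]:
            errors.append(f"{name}: out of range")
    return errors


# A change in domain knowledge is a data change only; validate() is untouched.
ARCHETYPES["body_temperature"] = {
    "value": {"type": float, "min": 0.0, "max": 100.0},
}
```

The last statement is the point of the sketch: supporting body temperature required a new table entry, while `validate` and the storage structure stayed the same.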
Further, in order to achieve interoperability, a two-level modeling approach for sep-
aration of information and knowledge has been used. It is specified in open elec-
tronic health record architecture (EHRA) [Beale and Heard 2008a; ISO 13606-1 2008].
Examples of dual model EHR architecture are CEN/TC251 EN13606 [CEN/TC251] (developed
by the European Committee for Standardization) and openEHR. The modeling
helps improve quality by sharing archetypes via a repository with versioning, by assigning
unique archetype identifiers, and by relying on a widely accepted underlying reference
model.


[Figure 4 diagram, recoverable labels: the modeller provides terminology; the domain
expert creates archetypes and templates, a conceptual description expressed in ADL; the
user creates information, which is data (a normal instance with class conformance to the
reference model) and is in semantic conformance with the archetypes and templates; the
ADL Archetype Model/Language gives the semantics of constraint.]

Fig. 4. Expansion of healthcare knowledge: Archetype meta-architecture.

The two-level approach consists of a reference model (RM) and the domain-level
definitions in the form of archetypes and templates (Figure 3). The concept behind it is
the introduction of a level of abstraction between the program logic and the database
schema [Beale and Heard 2008a]. This mechanism provides data independence, similar
to the case of conventional database management systems (DBMSs) [Silberschatz et al.
2010]. EHR systems based on this approach have the capability of incorporating new
standardized data elements in a timely manner. A domain expert designs archetypes,
and the user creates the information item which is mapped to an archetype (Figure 4)
[Beale and Heard 2008a]. The dual model EHRA specifications have already been
adopted by Microsoft [Microsoft 2009]; HL7 is also in the process of adopting a dual-model
approach.

3.2. A Reference Model and a Conceptual Model


At the lower level, the reference model (RM) is an object-oriented model. It contains
basic entities for representing any entry in an EHR. In openEHR, the RM concepts from
which software and data are built are invariant. They comprise a small set of
classes that define the generic building blocks to construct EHRs. At the upper level,
semantic interoperability is achieved by a precise definition of information items, called
archetypes. The EHR is based on such archetypes [Beale and Heard 2005], which are
exchanged as formal definitions of different clinical concepts in the form of structured
and constrained combinations of the entities of an RM.
A conceptual definition of data as archetype can be developed in terms of constraints
on structure, types, values, and behaviors of RM classes. These can be created for each
concept in the domain for which the user may have a need. For example, a generic class
“PARTY” can represent different domain concepts such as patient, doctor, or nurse
(Figure 5). Thus, the archetype model (conceptual model) level focuses on individual,
self-contained, clinical attributes that are independent meta-descriptions of clinical


[Figure 5 diagram: Patient, Doctor, and Nurse are each linked to PARTY by an “is a”
relationship.]

Fig. 5. PARTY, as a generalization of concepts.

information such as “blood pressure” (Figure 2), “mode of delivery” and “birth weight”.
This model expresses the character of these clinical data attributes. It stores this
information as data in the database rather than in the database schema. Additional
software is needed to manage this meta-data. A component named the “modeler”
supports data capture with reference to the archetype model (Figure 4).
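
The PARTY example can be sketched with one generic class whose domain meaning comes from definitions held as data. The `Party` class, the `PARTY_ARCHETYPES` table, and the detail keys are hypothetical illustrations, not structures from the openEHR reference model.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Party:
    """Generic class: it knows nothing about patients, doctors, or nurses."""
    name: str
    details: Dict[str, str] = field(default_factory=dict)


# Assumed role definitions, expressed as the detail keys each concept requires.
PARTY_ARCHETYPES: Dict[str, Set[str]] = {
    "patient": {"medical_record_number"},
    "doctor": {"licence_number", "speciality"},
    "nurse": {"licence_number"},
}


def conforms(party: Party, concept: str) -> bool:
    """True if the generic party carries every detail the concept requires."""
    return PARTY_ARCHETYPES[concept].issubset(party.details.keys())


alice = Party("Alice", {"medical_record_number": "MRN-001"})
bob = Party("Bob", {"licence_number": "L-42", "speciality": "cardiology"})
```

Both `alice` and `bob` are instances of the same class; whether each counts as a patient or a doctor is decided by `conforms`, not by the class hierarchy.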
Standardization can be achieved in this manner. Whenever there is a change in the
clinical knowledge (or requirements), the software need not be changed. Only the archetypes
need to be modified (or added) in conformity with the RM. This leads to enhancement in
terms of data quality and information quality (IQ). The segregation of information from
knowledge is shown by the dual levels and directional arrows in Figure 3 and Figure 4.
The terminologies contain facts about the real world. The clinical user can enter and
access the information through clinical application. The clinical domain expert can
record and maintain the clinical model through the modeler. The modeler is software
needed to manage the archetypes. The clinical model addresses aspects of coding the
content of EHR-Element using terminological systems like classifications or controlled
vocabularies.
Thus, archetypes separate the internal model data from formal
terminologies. The internal data is assigned local names which can later be bound
or mapped to external terminology codes. This feature eliminates the need to make
changes to the model whenever the terminology changes. Matching clinical data to
codes in controlled terminologies is the first step towards achieving standardization of
data for safe and accurate data interoperability.
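
The separation between local names and external terminology codes can be sketched with two lookup tables. All identifiers below are assumptions: the `at0001`-style local codes, the terminology names, and the external codes are placeholders and do not correspond to real terminology entries.

```python
from typing import Dict, Optional

# Local names used inside a hypothetical archetype.
LOCAL_TERMS: Dict[str, str] = {"at0001": "Systolic", "at0002": "Diastolic"}

# Bindings to external terminologies are kept apart from the model itself.
# Terminology names and codes are placeholders, not real entries.
TERM_BINDINGS: Dict[str, Dict[str, str]] = {
    "TERMINOLOGY_X": {"at0001": "X-0001", "at0002": "X-0002"},
}


def external_code(local_code: str, terminology: str) -> Optional[str]:
    """Look up the external code bound to a local name, if any."""
    return TERM_BINDINGS.get(terminology, {}).get(local_code)


# A terminology change touches only the binding table, never LOCAL_TERMS.
TERM_BINDINGS["TERMINOLOGY_Y"] = {"at0001": "Y-9001"}
```

Data recorded against `at0001` stays valid when a new terminology is bound or an existing binding is revised, because stored records refer only to the local name.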
Archetype Definition Language (ADL) syntax has been proposed by openEHR. It is
one possible serialization of an archetype. It is used to describe constraints on data
which are instances of the reference model (information model). The Archetype Model
structurally expresses the semantics of the ADL (Figure 4). ISO has accepted ADL as
a standard language for description of archetypes [ISO 13606-2 2008].

4. STANDARDIZATION AND INTEROPERABILITY (openEHR ARCHITECTURE IN A DATA QUALITY PERSPECTIVE)
The preceding dual-level system has been developed and implemented by openEHR to
improve data quality. It is a two-level software engineering approach that separates
knowledge from information. The information is generated, stored, manipulated, and
consumed in a manner that improves data quality. Other organizations with activities
close to openEHR are ASTM, CEN, HL7, IHE, ISO, and IHTSDO. We have chosen
openEHR as a case for study because the openEHR system puts special emphasis on
semantic interoperability to improve the quality of data exchanged among multiple
organizations or within a single organization. It takes into account the clinical workflows
and operational contexts, thus ensuring interoperability across enterprises. Its
standards supply both system specifications for data and archetypes for clinical

[Figure 6 diagram, recoverable labels: abstract specifications, defined as UML notation and
formal textual class specification; available as computable UML, which can be substituted
for them; system development platform; three ITSs, one each for programming system and
schema language A, B, and C.]

Fig. 6. OpenEHR Specifications.

phenomena [Pishev 2006]. Together, these facilitate a stable reference implementation
and open-source software for the users [openEHR 2009].
With openEHR, clinicians are not just passive users of openEHR-enabled software
and systems, but actively determine the possible breadth, depth, and richness of pa-
tient data kept in EHR systems. This directly affects the quality of patient care through
clinicians’ pivotal role in creating, revising, and updating archetypes. The openEHR
community has a growing library of high-quality authored archetypes for use in clinical
care and tools to support their maintenance, governance, and release [CKM 2009].
Systems based on openEHR have been implemented in hospitals (the emergency department
of Austin Health in Australia, and maternity care in a hospital in Cambodia)
[Gok 2008; MOSS8 2009].

4.1. OpenEHR Specifications


The openEHR specifications have been developed to standardize international elec-
tronic health records. The openEHR project [Beale and Heard 2008a] deliverables in-
clude requirements, abstract specifications, implementation technology specifications
(ITS), computable expressions and notations for conformance criteria (Figure 6). The
abstract specifications consist of the reference model (RM), archetype model (AM), and
the service model (SM) (Figure 7). RM represents the semantics of storing and process-
ing in the system. It contains a set of generic data structures that are flexible enough to
model most of the logical structures for knowledge representation (occurring in clinical
records). AM contains the knowledge-enabling environment by defining domain-level
structure and constraints on the generic data structures described in the RM. Thus AM
describes the semantics of archetypes and templates, and their use within openEHR.
At the user level, the openEHR service model includes definitions of basic services in
the health information environment, centered around the EHR (Figures 7 and 8).
The abstract specifications published by openEHR are defined using the UML
notation and formal textual class specifications [Beale and Heard 2008a]. These are
also available in a tool-oriented computable UML format in order to enable development
of software and systems [Beale and Heard 2008a] (Figure 6). For all practical purposes,
the computable expressions can be assumed to be a lossless representation
of the published abstract specifications. The implementation technology specifications
(ITS), on the other hand, correspond to the expression of abstract specifications in


[Figure 7 diagram, recoverable labels. DBMS side: user views 1, 2, and 3; conceptual/logical
level; physical/internal level. openEHR side: health integration platform, application
development platform, and knowledge management platform (service model); templates,
queries, terminology interface, and concepts as archetypes (archetype model); basic entities
as service objects and versions (reference model).]

Fig. 7. Comparison of DBMS architecture with openEHR architecture.

various programming and schema languages. The approach to implementing any of
the openEHR abstract models in a given implementation technology is to first define
an ITS for the particular technology, then to use it by formally mapping the abstract
models into expressions in that technology.
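
The two-step approach just described (define a mapping for a technology, then apply it to the abstract model) can be sketched in a few lines. Everything here is a simplification made for illustration: `ABSTRACT_CLASS`, the two type tables standing in for an ITS, and the generated output are hypothetical and are not taken from the published openEHR ITS documents.

```python
from typing import Dict

# Hypothetical abstract specification of one class: attribute -> abstract type.
ABSTRACT_CLASS = {
    "name": "QUANTITY",
    "attributes": {"magnitude": "Real", "units": "String"},
}

# Step 1: one assumed type mapping per target technology.
ITS_SQL: Dict[str, str] = {"Real": "DOUBLE PRECISION", "String": "VARCHAR(255)"}
ITS_PYTHON: Dict[str, str] = {"Real": "float", "String": "str"}


# Step 2: apply the mapping to express the abstract class in that technology.
def to_sql(spec: dict, its: Dict[str, str]) -> str:
    columns = ", ".join(f"{attr} {its[kind]}"
                        for attr, kind in spec["attributes"].items())
    return f"CREATE TABLE {spec['name'].lower()} ({columns});"


def to_python(spec: dict, its: Dict[str, str]) -> str:
    lines = [f"class {spec['name'].title()}:"]
    lines += [f"    {attr}: {its[kind]}"
              for attr, kind in spec["attributes"].items()]
    return "\n".join(lines)


sql_text = to_sql(ABSTRACT_CLASS, ITS_SQL)
python_text = to_python(ABSTRACT_CLASS, ITS_PYTHON)
```

The same abstract definition yields a schema-language expression and a programming-language expression; only the mapping table differs between the two.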

4.2. EHR Levels of Abstraction


The three levels of abstract specifications in EHR architecture are similar to those in
database management system (DBMS) architecture [Silberschatz et al. 2010] (Figure 7
and Figure 17).
(i) Physical level. The lowest level of abstraction describes the details of the reference
model, such as identification, access to knowledge resources, data types and structures,
versioning semantics, support for archetyping and semantics of enterprise level health
information types.
(ii) Logical level. The conceptual level describes the clinical concepts that are to be
stored in the system; these are represented in the form of archetypes and templates.
A user of the logical level does not need to be aware of their complexity. Clinical domain
experts use the logical level.
In common with object model classes, archetypes can be specialized, as well as composed
(i.e., aggregated) [Beale and Heard 2005]. An example of an archetype formed by
composition is openEHR-EHR-OBSERVATION.laboratory.v1. An archetype is a specialization of
another archetype if it names that archetype as its parent and only changes
its definition such that its constraints are “narrower” than those of the parent (as
in openEHR-EHR-OBSERVATION.laboratory-glucose.v1).
(iii) View level. The highest level of abstraction describes only a part of the entire
EHR architecture depending upon the need. This corresponds to the service model.
Several views may be defined, which users can see. In addition to hiding details of
the logical level for simplicity, the views also provide a security mechanism to prevent
users from accessing certain parts of the EHR architecture.
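
The specialization rule described under the logical level (a child archetype may only narrow the constraints of its parent) can be illustrated with a minimal Python sketch. The `Interval` and `Archetype` classes, the `result/value` path, the numeric limits, and the `laboratory-example` identifier are all hypothetical and chosen only to exercise the check; they are not taken from openEHR archetypes.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Interval:
    """Closed numeric interval permitted for one data point (illustrative)."""
    low: float
    high: float

    def within(self, other: "Interval") -> bool:
        # True when this interval lies entirely inside `other`.
        return other.low <= self.low and self.high <= other.high


@dataclass
class Archetype:
    archetype_id: str
    constraints: Dict[str, Interval] = field(default_factory=dict)
    parent: Optional["Archetype"] = None

    def is_valid_specialization(self) -> bool:
        """Every constraint shared with the parent must be equal or narrower."""
        if self.parent is None:
            return True
        return all(
            interval.within(self.parent.constraints[path])
            for path, interval in self.constraints.items()
            if path in self.parent.constraints
        )


# Hypothetical paths and limits, used only for illustration.
laboratory = Archetype(
    "openEHR-EHR-OBSERVATION.laboratory.v1",
    {"result/value": Interval(0.0, 1000.0)},
)
glucose = Archetype(
    "openEHR-EHR-OBSERVATION.laboratory-glucose.v1",
    {"result/value": Interval(0.0, 50.0)},
    parent=laboratory,
)
too_wide = Archetype(
    "openEHR-EHR-OBSERVATION.laboratory-example.v1",  # made-up identifier
    {"result/value": Interval(-5.0, 2000.0)},
    parent=laboratory,
)
```

Here `glucose` passes the check because its interval lies inside that of `laboratory`, while `too_wide` fails because it widens the parent's constraint.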
[Figure 8 (package diagram), listed from top to bottom:
  SM (service model): Virtual EHR; Terminology service; Demographic service; EHR service;
    Archetype service.
  AM (archetype model): Template OM; openEHR Archetype Profile; Archetype OM; ADL.
  RM (reference model):
    domain: EHR Extract; EHR; Demographic; Integration; Composition.
    patterns: Security; Common.
    core: Data Structures; Data types; Support (identifiers, terminology access).]
Fig. 8. The openEHR package structure [Beale and Heard 2008a].

The architecture is similar to the DBMS environment. To obtain any information,
there must be a significant amount of pre- and postprocessing to decompose and reconstruct
information from the generic data structures. For example, to insert an instance
of EHR information (e.g., a blood pressure reading) during runtime, the software layer
must query and construct its corresponding archetype from an archetype repository,
and then perform a comparison to make sure the data instance adheres to all constraints
and rules imposed by the archetype.
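
The insertion path described above (look up the archetype, compare the instance against it, then store) can be sketched as follows. This is a minimal illustration under stated assumptions: the repository is a plain dictionary, the archetype is reduced to numeric ranges, and the element names and limits are hypothetical rather than copied from a published archetype.

```python
class ArchetypeNotFound(LookupError):
    """Raised when the repository holds no archetype for the given id."""


# Hypothetical in-memory archetype repository: id -> {element: (low, high)}.
ARCHETYPE_REPOSITORY = {
    "openEHR-EHR-OBSERVATION.blood_pressure.v1": {
        "systolic": (0, 1000),    # illustrative limits only
        "diastolic": (0, 1000),
    }
}


def insert_instance(store, archetype_id, instance):
    """Fetch the archetype, validate the instance against it, then store it."""
    try:
        constraints = ARCHETYPE_REPOSITORY[archetype_id]
    except KeyError:
        raise ArchetypeNotFound(archetype_id) from None

    for name, (low, high) in constraints.items():
        if name not in instance:
            raise ValueError(f"missing element: {name}")
        if not low <= instance[name] <= high:
            raise ValueError(f"{name}={instance[name]} is outside [{low}, {high}]")

    # Only data that satisfied every constraint reaches the store.
    store.append({"archetype_id": archetype_id, "data": dict(instance)})
```

A real system would construct the archetype from its ADL source and check far richer constraints; the sketch only shows where validation sits relative to storage.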

4.3. The openEHR Model


The three abstraction levels provide a macro-level view of the EHR architecture. The
design aim of openEHR is to provide a coherent, consistent, and reusable type system
for health computing. The components within each level are described further. The
“core” of the reference model (RM) provides various common design patterns that can
be reused ubiquitously in the upper layers of the RM, as well as in the archetype model
(AM) and service model (SM) layers. Figure 8 illustrates the relationships between
RM, AM, and SM packages. Dependencies only exist from higher packages to lower
packages.

—Reference Model (RM). As mentioned earlier, the RM provides identification, access to
knowledge resources, data types and structures, versioning semantics, and support
for archetyping. The components within the RM are organized into three package groups:
core, patterns, and domain (Figure 8). The core group is generic. It is used
by all openEHR models and in all the outer packages. The packages in the patterns
and domain groups define the semantics of enterprise-level health information types,
including the EHR and demographics, which are described as follows.
—Core. The main component in this group is support. It consists of the ‘definitions’,
‘identification’, ‘terminology’ and ‘measurement’ packages. The semantics defined in
‘support’ allows all other models to use identifiers and to have access to knowledge
services like terminology and other reference data. The use of standardized data
types enhances the interoperability of low-level data semantics across systems.


—Pattern. Components include: Security and Common. The Security Information
Model defines the semantics of access control and privacy settings for information
in the EHR. The Common Information Model (IM) contains classes (such as LOCATABLE
and ARCHETYPED) that provide the link between information and archetype
models.
—Domain. The EHR IM defines the containment and context semantics of the concepts
EHR, COMPOSITION, SECTION, and ENTRY. Components within this “domain”
are described in Section 4.4. The EHR Extract IM defines how an EHR extract is built
from COMPOSITIONs, demographic, and access control information from the EHR.
The integration model defines the class GENERIC_ENTRY, a subtype of ENTRY
used to represent free-form legacy or external data as a tree. The demographic
model defines generic concepts of PARTY, ROLE and related details such as contact
addresses.
Archetype Model (AM). The openEHR AM package contains the models necessary to
describe the semantics of archetypes and templates, and their usage within openEHR.
The openehr_profile package defines a profile of the generic archetype model defined
in the archetype package for use in openEHR (and other health computing endeavors)
(Figure 8). Further details of this level are discussed in Section 5.
Service Model (SM). The openEHR service model includes definitions of basic services
centered around the EHR. These service sets will evolve over time. Some details
of the service model are discussed in Section 7. Its components are described in the
following (refer to Figure 8).
(i) The virtual EHR application programming interface (API) defines the interface to
EHR data at the level of compositions and below. It allows an application to create new
EHR information, and to request parts of an existing EHR and modify them. This API
enables fine-grained archetype-mediated data manipulation. Changes to the EHR are
committed via the EHR service.
(ii) The EHR service model defines the coarse-grained interface to electronic health
record service. It also defines the semantics of server-side querying, that is, queries
which cause large amounts of data to be processed, generally returning small ag-
gregated answers, such as averages, or sets of identifications of patients matching a
particular criterion.
(iii) The archetype service model defines the interface to online repositories of
archetypes.
(iv) The terminology interface service provides the means for all other services to
access any terminology available in the health information environment. The termi-
nology service is the gateway to all ontology- and terminology-based knowledge services
in the environment.
(v) The demographic service provides all services regarding the demographic infor-
mation so as to provide privacy and security.
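
To make the idea of server-side querying in item (ii) concrete, the following Python sketch shows a toy, in-memory stand-in for the coarse-grained EHR service. The class `EhrService`, its method names, and the record layout are assumptions made for illustration; they are not the openEHR service interface.

```python
from statistics import mean
from typing import Callable, Dict, List, Optional


class EhrService:
    """Toy, in-memory stand-in for a coarse-grained EHR service (hypothetical API)."""

    def __init__(self) -> None:
        self._entries: Dict[str, List[dict]] = {}   # ehr_id -> committed entries

    def commit(self, ehr_id: str, entries: List[dict]) -> None:
        """Commit a batch of entries to one EHR."""
        self._entries.setdefault(ehr_id, []).extend(entries)

    def average(self, archetype_id: str, element: str) -> Optional[float]:
        """Scan all EHRs but return a single aggregated value."""
        values = [
            e["data"][element]
            for entries in self._entries.values()
            for e in entries
            if e["archetype_id"] == archetype_id and element in e["data"]
        ]
        return mean(values) if values else None

    def matching_ehr_ids(
        self, archetype_id: str, element: str, predicate: Callable[[float], bool]
    ) -> List[str]:
        """Return only the identifiers of EHRs with at least one matching entry."""
        return sorted(
            ehr_id
            for ehr_id, entries in self._entries.items()
            if any(
                e["archetype_id"] == archetype_id
                and element in e["data"]
                and predicate(e["data"][element])
                for e in entries
            )
        )


BP = "openEHR-EHR-OBSERVATION.blood_pressure.v1"
service = EhrService()
service.commit("ehr-1", [{"archetype_id": BP, "data": {"systolic": 120}}])
service.commit("ehr-2", [{"archetype_id": BP, "data": {"systolic": 160}}])
```

Both query methods process every stored entry yet return small answers (one number, or a short list of identifiers), which is the pattern the EHR service model describes.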

4.4. Detailed EHR Reference Model Architecture


4.4.1. EHR Extract Information Model. Each content item potentially contains a whole
hierarchy of information items, only some of which are generally of interest to the
requestor. The typical database idea of a “query result” is usually expected to return
only such fine-grained pieces. A clinician or a software application may need to obtain some or all of
a single patient’s EHR. A hospital or clinic may require (from a laboratory) results of
testing done for multiple patients. The openEHR extract supports detailed access to the
versioned view of data (Figure 8 and Figure 9(b)). Information transferred in an EHR
extract needs to be self-standing in the clinical sense, that is, it can be understood by
the requestor without assuming any other means of access to the responding system.


[Figure 9 (diagram). (a) A requesting system sends a request for subjects X and Y to a
responding system, which returns an extract. (b) The responding system holds a record for
each subject (Subject X, Subject Y); each subject record holds version containers, each
version container holds versions, and each version holds content.]
Fig. 9. (a) Information extraction (requesting system and responding system). (b) Operational openEHR
environment for extracts based on Beale and Frankel [2007].

As the topmost layer in RM, the EHR extract information model defines an architecture
for communication of EHR extracts, or documents (Figure 9(a) and 9(b)). It is prescribed
under “domain” in RM as “EHR extract” (Figure 8).
The responding system contains one or more subject records. Each subject record
consists of one or more version containers, each of which contains the version history
of a particular piece of content. Each version corresponds to the state of a particular
content item at some point in time when it was committed by a user. A contribution
corresponds to the set of versions (each from a different version container) committed
at one time by a particular user to a particular system. For example, a patient may
have an EHR both at a clinic and on a home PC. Whenever changes are made at either place,
it is possible to copy to the other device just the changes made since the last
synchronization (by copying contributions), thus enhancing the quality of the information system.
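
The relationship between version containers, versions, and contributions, and the synchronization example above, can be sketched in Python as follows. The class and method names (`SubjectRecord`, `commit`, `contributions_since`, `apply`) and the integer contribution counter are assumptions made for this illustration, not part of the openEHR specifications.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Version:
    container_id: str      # the version container this version belongs to
    content: str
    contribution_id: int   # the change-set that committed it


@dataclass
class Contribution:
    contribution_id: int
    committer: str
    versions: List[Version]


class SubjectRecord:
    def __init__(self) -> None:
        self.containers: Dict[str, List[Version]] = {}  # history per content item
        self.contributions: List[Contribution] = []     # ordered change-sets

    def commit(self, committer: str, changes: Dict[str, str]) -> int:
        """Commit one change-set; `changes` maps container id to new content."""
        cid = len(self.contributions) + 1
        versions = [Version(k, v, cid) for k, v in changes.items()]
        for v in versions:
            self.containers.setdefault(v.container_id, []).append(v)
        self.contributions.append(Contribution(cid, committer, versions))
        return cid

    def contributions_since(self, last_synced_id: int) -> List[Contribution]:
        """Everything committed after the last synchronization point."""
        return [c for c in self.contributions if c.contribution_id > last_synced_id]

    def apply(self, contributions: List[Contribution]) -> None:
        """Replay contributions copied from another system."""
        for c in contributions:
            for v in c.versions:
                self.containers.setdefault(v.container_id, []).append(v)
            self.contributions.append(c)


clinic, home = SubjectRecord(), SubjectRecord()
clinic.commit("clinician", {"problem_list": "v1", "medications": "v1"})
home.apply(clinic.contributions_since(0))      # first full synchronization
clinic.commit("clinician", {"problem_list": "v2"})
delta = clinic.contributions_since(1)          # only the newer change-set
home.apply(delta)
```

The sketch only covers one-way copying from the clinic to the home PC; merging concurrent changes made at both places is outside its scope.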
4.4.2. The openEHR EHR Information Model. This model is a part of RM (domain) which
defines a logical EHR information architecture (rather than just an architecture for
communication of EHR extracts or documents between EHR systems). The package structure
of openEHR EHR contains the elements ehr, composition, and content (Figure 10).
EHR. The EHR consists of distinct, coarse-grained items known as compositions
added over time and organized by folders. Each composition consists of entries, orga-
nized by sections within the composition (Figure 10). The audit information for each
context is recorded at the corresponding level of the EHR.
[Figure 10 (package diagram). Packages shown: ehr, composition, and content; the content
package contains the navigation and entry packages.]
Fig. 10. Package structure of openEHR EHR Information model [Beale et al. 2008].

The root EHR object records three pieces of information that are immutable after
creation: the identifier of the system in which the EHR was created (system_id); the
identifier of the EHR (distinct from any identifier for the subject of care) (ehr_id);
and the time of creation of the EHR (time_created). It acts as an access point for the
component parts of the EHR. It contains the versioned objects by reference. This
package contains the top-level structure, the EHR (the root object, identified by a
globally unique EHR identifier), which consists of an EHR_ACCESS object (containing
access control settings for the record); an EHR_STATUS object (containing various
status and control information, optionally including the identifier of the subject (i.e.,
patient) currently associated with the record); and versioned data containers in the form of
VERSIONED_COMPOSITIONs (containers of all clinical and administrative content of
the record), optionally indexed by a hierarchical directory of FOLDERs (which contain
compositions by reference). A collection of CONTRIBUTIONs is also included, which
documents the changes to the EHR over time.
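
A minimal Python sketch of the root EHR object follows. It assumes a frozen dataclass as the way to keep the identifying fields fixed after creation, and it models the parts of the record as plain string references; the `Ehr` class, the `create_ehr` helper, and the reference format are hypothetical.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Ehr:
    # Fixed at creation; a frozen dataclass rejects later reassignment.
    system_id: str
    ehr_id: str
    time_created: datetime
    # The root holds references only; the versioned objects live elsewhere.
    ehr_access_ref: str
    ehr_status_ref: str
    composition_refs: List[str] = field(default_factory=list)
    contribution_refs: List[str] = field(default_factory=list)
    directory_ref: Optional[str] = None   # optional FOLDER directory


def create_ehr(system_id: str) -> Ehr:
    """Create a root object with a globally unique EHR identifier."""
    ehr_id = str(uuid.uuid4())
    return Ehr(
        system_id=system_id,
        ehr_id=ehr_id,
        time_created=datetime.now(timezone.utc),
        ehr_access_ref=f"{ehr_id}::access",   # made-up reference format
        ehr_status_ref=f"{ehr_id}::status",
    )
```

Freezing the dataclass blocks reassignment of every field, while the reference lists themselves can still grow as compositions and contributions are added over time.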
Composition. The composition is the EHR’s top level “data container”. It is described
by the COMPOSITION class. The main data of the EHR is found in its compositions
(Figure 10). There are two general categories of information at the coarse level, which
are found in an EHR: event items and persistent items. Events record what happens
during healthcare system events (with or for the patient), such as patient contacts.
These also record sessions in which the patient is not a participant or is not present
(e.g., pathology testing). Persistent compositions record items of long-term interest in
the record; they can be thought of as proxies for the state or situation of the patient.
The composition concept in the openEHR’s EHR originated from the transaction
concept of the GEHR project. It was based on the concept of a unit of information
corresponding to the interaction of a healthcare agent with the EHR. It was designed
to satisfy the ACID characteristics [Silberschatz et al. 2010] along with indelibility,
modification, and traceability.
Content. The content package contains the CONTENT_ITEM class, the ancestor class
of all content types, and the navigation and entry packages, which contain SECTION,
ENTRY and related types. The classes in the package describe the structure and
semantics of the contents of compositions in the health record.
(a) Navigation. The SECTION class provides a navigational structure to the record,
similar to “headings” in the paper record. ENTRYs and other SECTIONs can appear
under SECTIONs. Sections provide both a logical structure for the author to arrange
entries, and a navigational structure for readers of the record. The main benefit of
sections is that they may significantly improve the performance of querying by
automated systems.
(b) Entry. This package contains the generic structures for recording clinical state-
ments. All information created in the openEHR health record is expressed as an in-
stance of a class in the entry package, containing the ENTRY class and a number
of descendants. An ENTRY instance is logically a single “clinical statement”. It may
contain a significant amount of data (e.g., a microbiology result, a psychiatric exami-
nation, a complex prescription). In terms of clinical content, the entry classes are the
most important in the openEHR EHR information model. These define the semantics
of all the “hard” information in the record; these are intended to be archetyped.
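
The containment described above (compositions hold sections and entries, and sections may nest) can be sketched as a small tree with a traversal that collects entries regardless of how deeply they are filed. The three classes below are simplified stand-ins that keep only a name and child items; they are not the reference model classes.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Union


@dataclass
class Entry:
    name: str   # one clinical statement


@dataclass
class Section:
    heading: str
    items: List[Union["Section", Entry]] = field(default_factory=list)


@dataclass
class Composition:
    name: str
    content: List[Union[Section, Entry]] = field(default_factory=list)


def iter_entries(items) -> Iterator[Entry]:
    """Yield every entry in document order, descending through sections."""
    for item in items:
        if isinstance(item, Entry):
            yield item
        else:
            yield from iter_entries(item.items)


encounter = Composition(
    "encounter",
    [
        Section("vital signs", [
            Entry("blood pressure"),
            Section("details", [Entry("pulse")]),
        ]),
        Entry("diagnosis"),
    ],
)
```

Because the traversal ignores headings, a consumer can process each entry on its own, while the sections remain available for display and navigation.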


[Figure 11 (cycle diagram). Elements: patient; investigator; investigator agents; published
evidence base; personal knowledge base. Numbered flows: 1. observations; 2. opinions;
3. instructions; 4. actions. Additional label: treatment plan.]
Fig. 11. Clinical investigator recording process [Beale and Heard 2008a].

The design of the entry package is based on the clinical investigator recording process
as shown in Figure 11. The observation, evaluation, instruction, and action cycle for
building the elements of EHR is analogous to data quality improvement by following
the cycles of define, measure, analyze, and improve [Madnick et al. 2009].
The details of the generic structures within the entry package are shown in Figure 12.
Entry types include CARE_ENTRY, which contains information that relates to the care
process. OBSERVATION includes all observed phenomena, including those mechanically
or manually measured, and responses in interviews. EVALUATION includes assessments,
diagnoses, and plans. INSTRUCTION contains actionable statements such as
medication orders, recalls, monitoring, and reviews. ACTION contains information
recorded as a result of performing instructions. Consider a few examples.
Examples of contents in the ENTRY package: Under entry, in the EHR information
model (Figure 12), the CARE_ENTRY class is an abstract precursor of classes that
express information of any clinical activity in the care process around the patient. The
CARE_ENTRY type includes two attributes particular to all clinical entries, namely
the protocol and guideline_id, which allow the “how” and “why” aspects of any clinical
recording to be expressed. Also, the ADMIN_ENTRY is used to capture administrative
information. Administrative information is created by staff. It expresses details to do
with coordinating the clinical process, including admission information, appointments,
discharge/dismissal, billing, and insurance information.
Examples of contents in CARE_ENTRY: The clinical information about the problem
list is recorded inside a persistent composition. It is maintained as one or more
“evaluations” (which are generated by clinicians), as a result of “observations”. The clinical
information about a referral is recorded inside an event composition and is recorded as
“instructions”. The concepts in these examples are defined in terms of archetypes of entry
and other reference model types in openEHR.
Furthermore, “actions” are interventions, whereas “observations” record only information
relating to the situation of the patient (not what is done to him/her). Observation is
expressed in terms of “data,” “state,” and “protocol” (Figure 2) as shown in Table I.
The “time” in the observation category has a linear historical structure, whereas in
the instruction category it has a branching, potentially cyclic structure. “Evaluation” is
used for all kinds of statements which evaluate other information, such as interpretations
of observations, diagnoses, differential diagnoses, hypotheses, risk assessments, goals
and plans. It has the attribute data in the form of a spatial data structure.


[Figure 12 (UML class diagram), transcribed as text.]

Inheritance chain: PATHABLE (rm.common.archetyped), LOCATABLE (rm.common.archetyped),
CONTENT_ITEM (rm.composition.content), ENTRY (package entry).

ENTRY
    language[1]: CODE_PHRASE
    encoding[1]: CODE_PHRASE
    subject[1]: PARTY_PROXY
    provider[0..1]: PARTY_PROXY
    workflow_id[0..1]: OBJECT_REF
    subject_is_self: Boolean

ADMIN_ENTRY (subtype of ENTRY)
    data[1]: ITEM_STRUCTURE

CARE_ENTRY (subtype of ENTRY)
    protocol[0..1]: ITEM_STRUCTURE
    guideline_id[0..1]: OBJECT_REF

OBSERVATION (subtype of CARE_ENTRY)
    data[1]: HISTORY<ITEM_STRUCTURE>
    state[0..1]: HISTORY<ITEM_STRUCTURE>

EVALUATION (subtype of CARE_ENTRY)
    data[1]: ITEM_STRUCTURE

INSTRUCTION (subtype of CARE_ENTRY)
    narrative[1]: DV_TEXT
    expiry_time[0..1]: DV_DATE_TIME
    wf_definition[0..1]: DV_PARSABLE
    activities[*]: ACTIVITY

ACTION (subtype of CARE_ENTRY)
    time[1]: DV_DATE_TIME
    description[1]: ITEM_STRUCTURE
    ism_transition[1]: ISM_TRANSITION
    instruction_details[0..1]: INSTRUCTION_DETAILS

ACTIVITY
    description[1]: ITEM_STRUCTURE
    timing[1]: DV_PARSABLE
    action_archetype_id[1]: STRING

ISM_TRANSITION
    current_state[1]: DV_CODED_TEXT
    transition[0..1]: DV_CODED_TEXT
    careflow_step[0..1]: DV_CODED_TEXT

INSTRUCTION_DETAILS
    instruction_id[1]: LOCATABLE_REF
    activity_id[1]: STRING
    wf_details[0..1]: ITEM_STRUCTURE

Fig. 12. The rm.composition.content.entry package (in UML) [Beale et al. 2008].

Table I. Observation Expressed as Data, State, and Protocol

Data       Expressed in the form of a History of Events. It can be List, Table,
           Single (value), or Tree.
State      Information about the state of the Entry (necessary to correctly
           interpret the data).
Protocol   Details of how the observation was carried out.

“Instruction” is used to specify actions in the future. It enables simple and complex
specifications to be expressed, including in a fully-computable workflow form. “Activity”
defines a single activity within an instruction, such as a medication administration.
“Action” is used to record a clinical action that has been performed, which may have
been ad hoc, or due to the execution of an activity in an instruction workflow (recorded
in the attribute instruction_details).
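
The link between an instruction, its activities, and the actions that later carry them out can be sketched in Python as below. This is a deliberately reduced model: attribute types are plain strings, `Activity` carries an `activity_id` field for convenience (an assumption of this sketch), and the state labels are placeholders rather than coded terms.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Activity:
    activity_id: str      # simplification: identifier stored on the activity itself
    description: str


@dataclass
class Instruction:
    instruction_id: str
    narrative: str
    activities: List[Activity] = field(default_factory=list)


@dataclass
class InstructionDetails:
    instruction_id: str   # which instruction the action belongs to
    activity_id: str      # which activity within that instruction


@dataclass
class Action:
    description: str
    current_state: str    # placeholder label, not a coded term
    instruction_details: Optional[InstructionDetails] = None  # None when ad hoc


def actions_for(instruction: Instruction, actions: List[Action]) -> List[Action]:
    """Actions recorded as a result of executing the given instruction."""
    return [
        a for a in actions
        if a.instruction_details is not None
        and a.instruction_details.instruction_id == instruction.instruction_id
    ]


order = Instruction(
    "instr-1",
    "Medication order (illustrative)",
    [Activity("act-1", "administer dose")],
)
recorded = [
    Action("dose given", "completed", InstructionDetails("instr-1", "act-1")),
    Action("wound dressing changed", "completed"),   # ad hoc, no instruction
]
```

The optional `instruction_details` field is what distinguishes an action that executes a planned activity from one performed ad hoc.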

4.4.3. Demographic. The demographic IM defines demographic information. The general
approach of openEHR is to enable the complete separation of demographic information
(particularly patient-identifying information) from health records. This is in the interests
of privacy (in some cases required by national legislation) and separate data management.
Thus, the demographic information regarding a patient is kept separate from
the medical record of a patient. It is shared if the patient agrees to share it. This helps
in maintaining information quality.
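
The separation of demographic data from the health record can be illustrated with a short Python sketch. The `DemographicService` class, its consent flag, and the `new_health_record` helper are hypothetical; the point is only that the health record stores an opaque reference to the subject, and that identifying details are released only after the patient agrees.

```python
import uuid
from typing import Dict, Set


class DemographicService:
    """Holds identifying details apart from the health record (hypothetical API)."""

    def __init__(self) -> None:
        self._parties: Dict[str, dict] = {}   # party_id -> identifying details
        self._consented: Set[str] = set()     # party ids that agreed to sharing

    def register(self, name: str, address: str) -> str:
        party_id = str(uuid.uuid4())
        self._parties[party_id] = {"name": name, "address": address}
        return party_id

    def give_consent(self, party_id: str) -> None:
        self._consented.add(party_id)

    def lookup(self, party_id: str) -> dict:
        if party_id not in self._consented:
            raise PermissionError("subject has not agreed to share demographic data")
        return dict(self._parties[party_id])


def new_health_record(party_id: str) -> dict:
    # The record keeps only an opaque reference to the subject, no name or address.
    return {"ehr_id": str(uuid.uuid4()), "subject_ref": party_id, "compositions": []}
```

With this split, a copy of the health record alone does not reveal who the patient is, and the two stores can be managed under different access rules.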


5. ARCHETYPES IN A KNOWLEDGE-BASED SYSTEM


5.1. Introduction to Archetypes
As per the Concise Oxford Dictionary, an archetype is “an original model, prototype or
typical specimen”. Knowledge representation of clinical concepts is through archetypes
which enable semantic interoperability of heterogeneous systems. These define data
quality constraints to be placed on the organization and the content of record entries.
An archetype defines a data structure, including optionality and multiplicity, data value
constraints, and relevant bindings to natural language and terminology systems. An
archetype might define or constrain relationships between data values within a data
structure. These are expressed as algorithms, formulas, or rules. Its metadata defines
its core concept, purpose, use, evidence, authorship, and versioning. An archetype
ensures a maximal dataset. It contains all the relevant information regarding a clin-
ical concept. For example, the archetype for blood pressure contains all the relevant
information (the data, state, and protocol parts) (Figure 2).
Consequently, there are checks on quality of information for every clinical concept.
For example, an archetype on blood pressure measurement would comprehensively
and formally describe what a clinician needs to know about a measurement, including
its clinically safe interpretation [Beale and Heard 2005]. High-quality archetypes with
high-quality clinical content are the key to semantic interoperability of clinical systems
[Bisbal and Berry 2009]. Archetypes incorporate an ontology section; such
domain-specific ontologies provide a common semantics for all shared data.
Different archetypes are instances of an archetype model. The archetype object model
is an object-oriented data model. An object can be specialized as well as composed
(aggregated). An archetype may logically include other archetypes, and may be a spe-
cialization of another archetype. Thus, they are flexible and vary in form. In terms
of scope, they are general-purpose, reusable and composable. Archetypes are sepa-
rate from the data, and are stored in archetype repositories. The archetype repository
at any particular location will include archetypes from well-known online archetype
libraries.
The function of archetypes and templates at runtime is to facilitate data valida-
tion (at data capture or import time). These guarantee that elements of data con-
form to the reference model (and to the archetypes themselves) [Beale and Heard
2005]. Data validation with archetypes is mediated by the use of openEHR tem-
plates. Figure 13 illustrates the relationships between EHR and archetypes. Hence
the clinical user can share data with other health record systems (HRS’s) through
archetypes.

5.1.1. Archetypes and Data Validation. By design, archetypes incorporate rules. Data en-
tered into an archetype-enabled system will only be captured if it fits those rules. This
feature improves the data quality of a system considerably, and archetype-powered
searches or queries on the EHR are able to find specific data. Similarly, archetyped
data can be viewed consistently and reproducibly, no matter where it appears
within a single EHR or within any number of EHR systems. Some of the rules that
clinicians can set within archetypes to make information captured or viewed fit their
requirements include the following:

—the maximum and minimum value of a measurement (for example, not allowing a pulse
rate that is less than zero);
—the allowed units of measurement (e.g., weight in gm or kg);
—the appropriate set of terms from a terminology (for example, a set of blood groups
including the associated rhesus typing);
—an internal value set that is allowed for an element (for example, in a subjective
assessment of blood loss there may be the options of none, light, normal, and heavy);
and
—whether a piece of clinical data is required (or optional).

[Figure 13 (diagram). A clinical user can submit clinical data, search for test results,
view clinical data, and share clinical data with other HRS’s. These requests map to the
functions validate data, search for data, format clinical data (using archetypes), and
preserve clinical knowledge when sharing data (using archetypes), all of which pass through
the archetypes layer to reach the EHR database.]
Fig. 13. All functions use archetypes to refer to the clinical data in the EHR systems.
These rules constrain data entry, thus considerably improving data quality. Some
of the key data quality problems such as missing values, syntax violation, domain
violation, existence of synonyms, heterogeneity of syntaxes, and heterogeneity of mea-
sure units are taken care of by archetypes.
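
The kinds of rules listed above can be sketched as a small Python validator. The `ElementRule` structure, the element names, and the data layout (a dictionary with `value` and optional `units`) are assumptions made for illustration; real archetypes express these constraints in ADL, and terminology value sets would normally be coded rather than given as plain strings.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ElementRule:
    required: bool = False
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    units: Optional[Tuple[str, ...]] = None            # allowed units
    allowed_values: Optional[Tuple[str, ...]] = None   # term set or internal value set


# Hypothetical rule set mirroring the examples in the list above.
RULES: Dict[str, ElementRule] = {
    "pulse_rate": ElementRule(required=True, minimum=0, units=("/min",)),
    "weight": ElementRule(minimum=0, units=("gm", "kg")),
    "blood_group": ElementRule(
        allowed_values=("A+", "A-", "B+", "B-", "AB+", "AB-", "O+", "O-")
    ),
    "blood_loss": ElementRule(allowed_values=("none", "light", "normal", "heavy")),
}


def validate(rules: Dict[str, ElementRule], data: Dict[str, dict]) -> List[str]:
    """Return a list of rule violations; an empty list means the data is accepted."""
    errors: List[str] = []
    for name, rule in rules.items():
        item = data.get(name)
        if item is None:
            if rule.required:
                errors.append(f"{name}: required element is missing")
            continue
        value = item.get("value")
        if rule.minimum is not None and value < rule.minimum:
            errors.append(f"{name}: {value} is below the minimum {rule.minimum}")
        if rule.maximum is not None and value > rule.maximum:
            errors.append(f"{name}: {value} is above the maximum {rule.maximum}")
        if rule.units is not None and item.get("units") not in rule.units:
            errors.append(f"{name}: unit {item.get('units')!r} is not allowed")
        if rule.allowed_values is not None and value not in rule.allowed_values:
            errors.append(f"{name}: {value!r} is not in the allowed value set")
    for name in data:
        if name not in rules:
            errors.append(f"{name}: element is not defined by the rules")
    return errors
```

Returning all violations at once, rather than stopping at the first, is a design choice of this sketch so that a data-entry form could report every problem in one pass.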
5.2. Major Categories of openEHR Archetypes
With reference to openEHR specifications, an archetypable data instance in an EHR
is a composition, a section, an entry or an item structure. According to the openEHR
reference model [Beale et al. 2008], there are five sorts of entries and four types of item
structures which form categories within archetypes in openEHR EHRs (Figure 14)
[Thurston 2006]. Every type of clinical knowledge (information) can be mapped to one
of these categories.
(1) Composition (or document). This contains information committed to the EHR.
Compositions contain sections, or organizing classes, which themselves contain entries.
Examples of compositions are documents created by a clinician and stored in the EHR,
such as a laboratory report, an ECG report, a problem list, and a family history list.
(2) Section. This allows information within a composition to be organized. Sections
provide both a logical structure and a navigational structure. Sections are archetyped
in trees with each tree containing a root section, one or more subsections, and any
number of entries at each node.
(3) Entry. An entry is like the leaf node of the document and contains information,
such as blood pressure, assessment, diagnosis, and medication order. The “entries”
can be interpreted independent of the composition (or the section) within which they
are located. This is a key principle of the openEHR methodology. It is very important
for automatic processing of EHR data. An ENTRY instance is logically a single
“clinical statement”. It may be a single short narrative phrase, and may also contain a
significant amount of data, for example, a microbiology result, a psychiatric examination,
or a complex prescription. In terms of clinical content, these define the semantics
of all the “hard” information in the record. Archetypes for entries make up the vast
majority of defined important clinical archetypes. For example, an entry can be a “care
entry” or an “admin entry” (relating to similar entities in RM, Section 4.4.2, Figure 12).
(4) Item structures. An EHR requires structured data in the form of single values
(e.g., weight, height), or as lists of values (e.g., blood test results), or as tables (e.g.,
visual acuity results) or trees (e.g., biochemistry results) of values.

[Figure 14 (diagram), transcribed as text.
Hierarchical structure of the EHR: EHR_EXTRACT, organised by FOLDERs; COMPOSITIONS,
organised by SECTIONS; ENTRY, organised by CLUSTERS; ELEMENT. Legend label: OPTIONAL.
Categories of openEHR archetypes:
  Compositions (openEHR-EHR-COMPOSITION.x)
  Sections (openEHR-EHR-SECTION.x)
  Entries:
    Observations (openEHR-EHR-OBSERVATION.x)
    Evaluations (openEHR-EHR-EVALUATION.x)
    Instructions (openEHR-EHR-INSTRUCTION.x)
    Actions (openEHR-EHR-ACTION.x)
    Administrative (openEHR-EHR-ADMIN.x)
  Item structures:
    Single items (openEHR-EHR-ITEM_SINGLE.x)
    Item lists (openEHR-EHR-ITEM_LIST.x)
    Item tables (openEHR-EHR-ITEM_TABLE.x)
    Item trees (openEHR-EHR-ITEM_TREE.x)]
Fig. 14. Hierarchical structure of EHR and categories of openEHR archetypes.
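
Archetype identifiers such as openEHR-EHR-OBSERVATION.laboratory-glucose.v1 encode the reference model class, and therefore the category, of an archetype. The following Python sketch parses such identifiers and maps them to the categories of this section. The regular expression reflects only the identifier pattern visible in the examples in this paper, and the category table (including the use of ADMIN_ENTRY as the administrative class name, taken from Figure 12) is an assumption of the sketch.

```python
import re
from typing import NamedTuple, Tuple

# originator-model-RM_CLASS.concept[-specialization...].vN
_ID = re.compile(
    r"^(?P<originator>[A-Za-z]\w*)-(?P<model>[A-Za-z]\w*)-(?P<rm_class>[A-Za-z]\w*)"
    r"\.(?P<concept>[A-Za-z]\w*(?:-\w+)*)\.v(?P<version>\d+)$"
)

CATEGORY = {
    "COMPOSITION": "composition",
    "SECTION": "section",
    "OBSERVATION": "entry",
    "EVALUATION": "entry",
    "INSTRUCTION": "entry",
    "ACTION": "entry",
    "ADMIN_ENTRY": "entry",
    "ITEM_SINGLE": "item structure",
    "ITEM_LIST": "item structure",
    "ITEM_TABLE": "item structure",
    "ITEM_TREE": "item structure",
}


class ArchetypeId(NamedTuple):
    originator: str
    model: str
    rm_class: str
    concept: str
    specializations: Tuple[str, ...]
    version: int
    category: str


def parse_archetype_id(text: str) -> ArchetypeId:
    """Split an archetype identifier into its parts and derive its category."""
    m = _ID.match(text)
    if m is None:
        raise ValueError(f"not a recognised archetype identifier: {text!r}")
    concept, *specializations = m["concept"].split("-")
    return ArchetypeId(
        m["originator"],
        m["model"],
        m["rm_class"],
        concept,
        tuple(specializations),
        int(m["version"]),
        CATEGORY.get(m["rm_class"], "other"),
    )
```

For the glucose identifier above, the parser reports the class OBSERVATION, the base concept laboratory, one specialization (glucose), version 1, and the category "entry".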

5.3. Archetypes and Data Quality


Archetypes are stand-alone entities and can be created, shared, reused, specialized,
revised, and versioned. They are both language- and terminology-independent. They
provide knowledge-level interoperability. Thus, systems reliably communicate with
each other at the level of knowledge concepts. Archetypes contain validation rules for
all the data entered into systems, and thus help in preventing errors in health records.
They can help in epidemiological and other public health research functions. There
will be a transformation in clinical care and research by virtue of the ability to work
from a shared knowledge framework. Thus, archetypes can support intelligent, generic
decision-support programs and future-proof EHR systems (designed not to become
obsolete) that enhance patient care and safety over the course of a lifetime.
Archetypes aid the quality of data, as they are defined by clinical domain experts
and not by software experts for their own use, and thus provide maximal information
regarding a clinical concept [Leslie 2006]. They are measured and analyzed by the
clinical review board before being published (i.e., they are developed through a domain
knowledge governance tool (clinical knowledge manager) [CKM 2009]). They can be
improved, as there is a process for versioning of archetypes. With expansion of knowl-
edge, we can have a new version of the archetype or a new specialization as per the
specialization property of the archetype (see Section 5.5.2).

5.4. Terminology and Archetypes


To share information (for computers rather than humans) we need to share two things:
a schema or rules for how the information is stored and a domain vocabulary for each
point in the schema.


The openEHR approach is to have a single logical schema for all systems. It is
attuned to the European approach (CEN 13606), which allows storage and retrieval of
potentially infinitely complex information. This means that everyone can receive data
(as atomic information, not text) from everyone else. To add meaning to the data, it
must conform to a second “schema” or set of rules, held in files that can be shared,
called “archetypes”. For computers to understand the data, it must be compliant with
at least one archetype (as a specialization of one of the archetypes).
This simplifies the requirements for terminology, as we now have shared data
points that require a specific vocabulary. It is appropriate to describe these within
the archetype itself, as there is no useful classification of small domain vocabularies
used for one data point. This has a number of advantages.
—The domain vocabularies can be safely translated, as it is quite clear what the context
of the term is within the archetype.
—The vocabularies can be “bound” to the different terminologies used within systems
(there are many such cases).
Terminologies, coding, and classification systems are for structured data collection.
By providing a basis for semantic-level interoperability and a common language, they
facilitate the exchange of information among different applications. They remove am-
biguity from information and language dependence and also enable proper automatic
processing such as medical decision support. Some data points within an archetype (via
a constraint definition and binding) point to an external terminology. openEHR and the
International Health Terminology Standards Development Organization (IHTSDO)
collaborate to explore how clinical terminologies and archetype-based record struc-
tures can best be aligned to support electronic health records [IHTSDO collaboration
2009].
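
The two advantages listed in this section (safe translation of a small domain vocabulary, and binding of that vocabulary to external terminologies) can be sketched in Python as follows. The internal codes follow the at-code style used in archetypes, but the dictionary layout, the terminology name EXAMPLE_TERMINOLOGY, and the external codes are hypothetical placeholders, not real terminology content.

```python
from typing import Optional

# Hypothetical ontology section for one data point (subjective blood loss).
ONTOLOGY = {
    "term_definitions": {
        "en": {"at0001": "none", "at0002": "light", "at0003": "normal", "at0004": "heavy"},
        "de": {"at0001": "keine", "at0002": "leicht", "at0003": "normal", "at0004": "stark"},
    },
    "term_bindings": {
        # Placeholder terminology and codes, for illustration only.
        "EXAMPLE_TERMINOLOGY": {"at0002": "EX-0002", "at0004": "EX-0004"},
    },
}


def label(code: str, language: str, fallback: str = "en") -> str:
    """Display text for an internal code, falling back to the original language."""
    definitions = ONTOLOGY["term_definitions"]
    return definitions.get(language, {}).get(code) or definitions[fallback][code]


def external_code(code: str, terminology: str) -> Optional[str]:
    """External code bound to an internal code, or None when no binding exists."""
    return ONTOLOGY["term_bindings"].get(terminology, {}).get(code)
```

Stored data would carry only the internal code (for example at0002), so the same record can be displayed in any translated language or mapped to whichever external terminology a receiving system uses.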

5.5. Detailed Description-Archetype Definition Language (ADL)


Archetype Definition Language (ADL) is a formal language for expressing archetypes,
which are constraint-based models of domain entities, or “structured business rules”
(footnote 2). It has evolved from notions similar to those behind KML in the Google Maps
API, or the use of XML for web documents or databases. The support available from XML
alone is, however, not enough for expressing healthcare objects [Sokolowski 1999]. Table II
compares ADL and XML.
Every ADL archetype is written with respect to a reference model. Archetypes are
applied to data via the use of templates, which are defined at a local level. The openEHR
archetype object model (AOM) [Beale 2008] describes the definitive semantic model of
archetypes, in the form of an object model.
The AOM defines relationships that must hold true between the parts of an archetype
for it to be valid as a whole. The ADL syntax is one possible serialization of an archetype.
ADL and AOM have been adopted by CEN/TC 251, the Health Telematics Committee of
the European standards agency, for use in its revised EN 13606 Electronic Health Record
standard.
Previously, archetypes have been expressed as XML instance documents conforming
to W3C XML schemas, for example, in the Good Electronic Health Record [GEHR]
and openEHR projects. XML archetypes are equivalent to serialized instances of the
parse tree, that is, particular ADL archetypes serialized from objects into XML
instances. Archetypes connect information structures to formal terminologies. They are
completely path-addressable in a manner similar to XML data, using path expressions
that are directly convertible to XPath expressions. With ADL parsing tools, it is
possible to convert ADL to any number of forms, including various XML formats. XML
instances can be generated from the object form of an archetype in memory. An XML
schema corresponding to the ADL object model has been published at openEHR.org
[openEHR 2009].

2 A business rule is a widely used documentation concept for conceptual schemas. Business rules are used
to describe the properties of an application (e.g., the fact that an employee cannot earn more than his or
her manager). A business rule can be the description of a concept relevant to the application (also known
as a business object); an integrity constraint on the data of the application; or a derivation rule, whereby
information can be derived from other information within a schema.

ACM Journal of Data and Information Quality, Vol. 3, No. 1, Article 1, Publication date: April 2012.

Table II. Comparison between ADL and XML

Property | ADL | XML
Machine processable | Yes | Yes
Human readable | Yes | Sometimes unreadable (e.g., XML-schema instance, OWL-RDF ontologies)
Leaf data types | More comprehensive set, including interval of numerics and date/time types | String data; with XML schema option, a more comprehensive set
Structure | Universal schema for temporal database (EHRs) (history database) | Semistructured data (rooted acyclic graph with unique path from root to leaf)
Adhering to object-oriented semantics | Yes, particularly for container types | XML schema languages do not follow object-oriented semantics
Ontological reference | Domain entities/archetypes | Global terms/concepts
Representation of object properties | Uses attributes | Uses attributes and sub-elements
Space (for storage) | Uses nearly half of space for tags | May have data redundancy in contents
Efficiency | Is a domain-specific language (sufficiently rich to capture and model medical domain) | Good for web document modeling with limited ability to represent database contents
Path syntax | ADL path is semantically a subset of the XPath query language | Hierarchical
An example of XML/ADL use in openEHR: In order to accept a report from a pathol-
ogy laboratory for inclusion in the EHR repository of a patient (in the ADL form), an
XML form is generated using the archetype. This form is shared with the laboratory for
on-site validation of data input. Thus, XML is used as an input and transport medium.
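
The statement that XML instances can be generated from the in-memory object form of an archetype can be illustrated with a small Python sketch. It assumes a toy constraint tree held in plain dictionaries; the dictionary layout and the XML element and attribute names (`constraint`, `rm_type`, `node_id`) are hypothetical and do not follow the published openEHR XML schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory form of a tiny constraint tree (not the real AOM classes).
constraint = {
    "rm_type": "ELEMENT",
    "node_id": "at0004",
    "children": [
        {"rm_type": "DV_QUANTITY", "node_id": "", "children": []},
    ],
}


def to_xml(node):
    """Recursively serialize one constraint node into an XML element."""
    element = ET.Element("constraint", rm_type=node["rm_type"])
    if node["node_id"]:
        # Only archetyped nodes carry a node identifier.
        element.set("node_id", node["node_id"])
    for child in node["children"]:
        element.append(to_xml(child))
    return element


# Serialize the object form to XML text that could be shared with another site.
xml_text = ET.tostring(to_xml(constraint), encoding="unicode")
```

The same tree could be rendered in other forms by writing another serializer, which is the sense in which the object form, rather than any one syntax, is the reference.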

5.5.1. Organization of ADL. In serialized form, archetypes are represented in the
Archetype Definition Language (ADL) [Beale and Heard 2008b], and in XML-based
serializations [Beale and Heard 2008a]. ADL is an abstract language based on frame
logic queries (also known as F-logic) with the addition of terminology. F-logic is a
knowledge representation and ontology language. It accounts in a declarative fash-
ion for structural aspects of an object-oriented and frame-based language. An ADL
archetype is a guaranteed 100% lossless rendering of the semantics of any archetype,
and is designed to be a syntactic analog of the AOM (Figure 8). Thus, ADL is a textual
language for specifying constraints on data instances of an RM in a formal way [Beale
and Heard 2008b]. An archetype expressed in ADL is composed of four main parts:
header, definition, ontology, and revision history (Figure 15). The header section con-
tains the archetype metadata. In the definition section, the modeled clinical concept is
represented in terms of a particular RM class. This description is built by constraining
several properties of classes and attributes, such as existence, occurrences, or cardi-
nality, or by constraining the domain of atomic attributes. In this section, only those
entities appear that need to be constrained. In the ontology section, the entities defined
in the definition section are described and bound to terminologies. Finally, the revision
history section contains the audit of changes to the archetype.

(A) archetype (adl_version=1.4)
      archetype_id
      [specialise]: archetype_id
      concept: concept_id
      language: dADL (language details)
      [description]: dADL (descriptive metadata)
      [declarations]: FOPL (declaration statements)
(B) definition: cADL (formal constraint definition)
(C) [invariant]: FOPL (assertion statements)
(D) ontology: dADL (terminology and language definitions)
(E) [revision history]: dADL (history of change audits)
Sections in square brackets are optional.

Fig. 15. ADL archetype structure [Beale and Heard 2008b].

5.5.2. Structure of the Archetype in ADL. ADL uses three other syntaxes, cADL (con-
straint form of ADL); dADL (data definition form of ADL); and a version of first-order
predicate logic (FOPL), to describe constraints on data which are instances of some
information model (e.g., expressed in UML) [Beale and Heard 2008b]. Thus, ADL can
be used to write archetypes for any domain where formal object model(s) exist, which
describe data instances. Further, when archetypes are used at runtime in particular
contexts, they are composed into larger constraint structures, with local or specialist
constraints added, via the use of templates. The formalism of templates is presented
by using dADL. The cADL syntax is used to express the archetype definition, while the
dADL syntax is used to express data which appears in the language, description, ontol-
ogy, and revision history sections of an ADL archetype. The various keywords in ADL
are archetype, specialise/specialize, concept, language, description, definition, invari-
ant, ontology. The top-level structure of an ADL archetype is shown in Figure 15; see
Appendix A (example of ADL archetype structure).
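
As a rough illustration of the top-level structure, the following Python sketch splits an ADL text into sections by the keywords listed above. It is a simplification and not an ADL parser: it assumes that each keyword starts in the first column and that section content is indented, and the sample archetype text (including the identifier `openEHR-EHR-OBSERVATION.example.v1`) is a hypothetical, abbreviated fragment.

```python
import re

# Top-level ADL keywords as listed in the text.
ADL_KEYWORDS = {"archetype", "specialise", "specialize", "concept", "language",
                "description", "definition", "invariant", "ontology"}


def split_adl_sections(text):
    """Group the lines of an ADL text under the keyword that opens each section."""
    sections = {}
    current = None
    for line in text.splitlines():
        match = re.match(r"[a-z]+\b", line)  # assumption: keywords start in column 0
        if match and match.group(0) in ADL_KEYWORDS:
            current = match.group(0)
            sections[current] = []
            rest = line[match.end():].strip()
            if rest:
                sections[current].append(rest)
        elif current is not None and line.strip():
            sections[current].append(line.strip())
    return sections


# Hypothetical, heavily abbreviated archetype used only to exercise the function.
SAMPLE_ADL = """\
archetype (adl_version=1.4)
    openEHR-EHR-OBSERVATION.example.v1
concept
    [at0000]
language
    original_language = <[ISO_639-1::en]>
definition
    OBSERVATION[at0000] matches {*}
ontology
    term_definitions = <...>
"""

sections = split_adl_sections(SAMPLE_ADL)
```

A real tool would hand the content of each section to the matching sub-parser (dADL for language, description, and ontology; cADL for definition), which this sketch does not attempt.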
The details of an archetype in ADL with dADL, cADL, and assertion language are
described in recent references [Beale and Heard 2008b]. The example illustrates that
the ADL formal model facilitates conversion to the XML form.


[Figure 16: layered diagram. From top to bottom: clinical application (GUI: Hospital); templates and archetypes (templates: Eye, Pediatric, ...; archetypes: Weight, Temperature, BP, ...); terminology; data resources (data repository).]

Fig. 16. EHR data quality controls based on archetypes.

6. EHR DATA ENTRY, VALIDATION AND QUERY - A DQ PERSPECTIVE


The two-level model supports data quality by keeping data accurate and consistent. Data
can be entered into the EHR repository if and only if they satisfy the constraints
defined in templates and archetypes. For example, the unit of temperature which can
be entered into the EHR system can be either degree Celsius or degree Fahrenheit and
cannot be any other unit. Also, the archetypes and templates refer to the terminology
that is standardized (e.g., the SNOMED-CT). This has an impact on the data quality
(Figure 16).
The mechanism of data entry and validation in EHR systems differs from that of
other common information systems. Before data are captured, they must be
validated against archetypes. For example, in data entry for the BP parameter, the
unit of measurement must be ‘mm[Hg]’ (millimeters of mercury). For the purpose of data entry and
validation, the archetypes are used at runtime by templates. The template is a directly
usable definition which composes archetypes into larger structures often corresponding
to a screen form, document, report or message [Beale and Heard 2008a].
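
The rule that data enter the repository only if they satisfy the archetype constraints can be sketched as follows. This is a minimal illustration using the temperature example from the text; the constraint table, the parameter name `body_temperature`, and the function names are hypothetical, and real archetypes express such constraints in ADL rather than in Python.

```python
# Hypothetical, simplified constraint table; real archetypes are written in ADL.
ALLOWED_UNITS = {
    "body_temperature": {"°C", "°F"},
}


def validate_entry(parameter, magnitude, units):
    """Return a list of constraint violations; an empty list means the entry is valid."""
    errors = []
    allowed = ALLOWED_UNITS.get(parameter)
    if allowed is None:
        errors.append(f"no archetype registered for '{parameter}'")
    elif units not in allowed:
        errors.append(f"unit '{units}' not allowed for '{parameter}'")
    if isinstance(magnitude, bool) or not isinstance(magnitude, (int, float)):
        errors.append("magnitude must be a number")
    return errors


def store(repository, parameter, magnitude, units):
    """Append the entry to the repository only when validation succeeds."""
    errors = validate_entry(parameter, magnitude, units)
    if errors:
        raise ValueError("; ".join(errors))
    repository.append({"parameter": parameter, "magnitude": magnitude, "units": units})
```

For example, `store(repo, "body_temperature", 37.2, "°C")` would add a record, whereas the same call with the unit `"K"` would raise an error and leave the repository unchanged.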
Thus, there are templates on top of archetypes. All the data created due to the use
of templates is guaranteed to conform to the referenced archetypes. The generating
archetype allocates an archetype node identifier on every node of data. This forms a
semantic imprint. As an example, the XML form of EHR data refers to a template to
generate a suitable structure for data.

6.1. Templates
Templates are artifacts that enable the content defined in archetypes to be used for a
particular business purpose [Beale and Heard 2005]. They are typically department-specific
or disease-specific, and follow the medical record format of a department such as
cardiology, eye, or liver. They support bindings to terminology subsets specific
to their intended use, and can be used to generate or partly generate a number of other
artifact types including screen forms and message schemas, as shown in Figure 17.
In general, these comprise the complete application-level lumps of information to be
captured or sent. They are generally developed and used locally, while archetypes are
usually widely used.
An openEHR template is a specification that defines a tree of one or more archetypes,
each constraining instances of various reference model types, such as composition,
section, entry subtypes, and so on. Thus, while there are likely to be archetypes for
such things as “biochemistry results” (an observation archetype) and “SOAP headings”
(a section archetype), templates are used to put archetypes together to form whole
compositions in the EHR (e.g., for “discharge summary” and “antenatal exam”).


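[Figure 17: diagram of the openEHR semantic architecture. Labeled components: screen forms, message schemas, reports, data conversion schemas, templates, terminology bindings, archetypes, terminologies, querying, and the reference model. Relationship labels shown in the figure: 1:n, n:n, and 1:n.]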

Fig. 17. The openEHR semantic architecture [Beale and Heard 2009].

The following specifications are related to templates [Beale and Heard 2009].
(i) Template definition language (TDL): an abstract language for expressing template
definitions in a syntactic fashion;
(ii) Template object model (TOM): an object model that expresses the same semantics
as TDL in a structural fashion;
(iii) Operational template model (OTM): an object model describing the stand-alone,
operational template which is generated from template definitions and referenced
archetypes and terminologies.

6.2. Three-Layered Building Block Structure


Figure 18 illustrates the three-layered building block structure of components in Fig-
ure 17. The first layer consists of archetypes, which are medical parameters (concepts)
that define content on the basis of topic or theme (such as height, weight, BP), inde-
pendent of any particular business event. These can be created using the archetype
editor tool (which is a free open-source tool provided by Ocean Informatics) [Archetype
Editor 2009].
The second layer consists of templates. Each template refers to one or more archetypes
and usually imposes further constraints. Templates provide a way of using a particular
set of archetypes, choosing a particular set of nodes from each, and then limiting values
and/or terminology in a way specific to a particular kind of event, such as “diabetic
patient admission” and “discharge”. A template is often a direct precursor to a form
in the presentation layer of the application software. Templates are the principal means
of using archetypes in runtime systems. Thus the functions of templates are archetype
slot-filling, tightening constraints, providing default values, and metadata (including
node-level annotations).
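
Two of the template functions named above, tightening constraints and providing default values, can be sketched in Python. This is an illustration under simplifying assumptions: constraints are reduced to sets of permitted units, and the names `tighten`, `ARCHETYPE_TEMPERATURE`, and `TEMPLATE_ADMISSION` are hypothetical.

```python
def tighten(archetype_constraint, template_constraint):
    """Combine constraints; the template may only narrow what the archetype allows."""
    allowed = archetype_constraint["units"]
    chosen = template_constraint.get("units", allowed)
    if not chosen <= allowed:
        # A template must not permit anything the archetype forbids.
        raise ValueError(f"template widens archetype: {sorted(chosen - allowed)}")
    result = {"units": set(chosen)}
    default = template_constraint.get("default_units")
    if default is not None:
        if default not in chosen:
            raise ValueError(f"default '{default}' is outside the permitted units")
        result["default_units"] = default
    return result


# Archetype: temperature may be recorded in either unit (example from Section 6).
ARCHETYPE_TEMPERATURE = {"units": {"°C", "°F"}}
# Hypothetical local template: only Celsius, pre-selected on the form.
TEMPLATE_ADMISSION = {"units": {"°C"}, "default_units": "°C"}

operational = tighten(ARCHETYPE_TEMPERATURE, TEMPLATE_ADMISSION)
```

The subset check is the design point: data that satisfy the template necessarily satisfy the archetype, which is why data created through templates conform to the referenced archetypes.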
The third layer consists of the patient record. The EHR of a specific patient can
be created by assembling appropriate template(s). The record of patient 1 refers to
data from multiple departments. The record of patient 2 refers to data from a single
department. A new page of EHR will be created for a new date, as shown in Figure 18.

[Figure 18: three stacked layers. Layer 1: Archetype 1, Archetype 2, ..., Archetype n. Layer 2: Template 1 and Template 2, each composed of Archetype 1, Archetype 2, ..., Archetype n. Layer 3: EHR Patient 1 and EHR Patient 2, assembled from Template 1 and Template 2 and organized by date (Date 1, Date 2).]

Fig. 18. Three-layered building block structure [MOSS8 2009].

7. SERVICE MODEL
The service model consists of service definitions for the major services in the EHR
computing environment. These interfaces help application developers to safely assume
a “standard” API, regardless of which implementation they use. openEHR implementations
are available for different technologies, such as Java and .NET [openEHR Java Project
and openEHR .Net Project]. The service model helps back-end system implementers
know what interfaces they need to expose in order to enable middleware and application
developers. Also, through these interfaces, healthcare enterprises engaged in system
procurement can rely on a standardized middleware “bus” definition, which ensures
that the environment that is built is always open when purchasing services.

7.1. Screen Forms


For the graphical interface of an application, screen forms play an important role. They
are derived from archetypes through templates.

7.2. Querying EHR Data Using AQL


Archetype paths form the basis of reusable semantic queries on archetyped data. These
can be used to construct queries that specify data items at a domain level. Hence, they
are not limited to the directly connected classes and attributes of the reference model.
This is in contrast to a query in standard database theory. For example, paths from
a “blood pressure measurement” archetype may identify the systolic blood pressure
(baseline), systolic pressures for other time offsets, the patient position, and numerous
other data items [Beale and Heard 2005].
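
How an archetype path identifies a data item can be sketched with a small resolver. This is a minimal illustration, assuming data held as nested Python dictionaries and lists in which archetyped nodes carry a `node_id` key; that storage layout and the sample values are hypothetical, while the path shape and the node codes follow the blood pressure example used later in Figure 19.

```python
import re

# One path segment: an attribute name with an optional [atNNNN] node predicate.
SEGMENT = re.compile(r"^(?P<attr>\w+)(\[(?P<node_id>at\d+)\])?$")


def resolve(node, path):
    """Follow an archetype-style path through nested dicts and lists."""
    for segment in path.split("/"):
        match = SEGMENT.match(segment)
        if match is None:
            raise ValueError(f"unsupported path segment: {segment!r}")
        value = node[match["attr"]]
        node_id = match["node_id"]
        if node_id is not None:
            # Pick the child whose archetype node identifier matches the predicate.
            candidates = value if isinstance(value, list) else [value]
            matching = [c for c in candidates if c.get("node_id") == node_id]
            if not matching:
                raise KeyError(segment)
            value = matching[0]
        node = value
    return node


# Hypothetical data instance; the values are illustrative only.
observation = {
    "data": {
        "node_id": "at0001",
        "events": [{
            "node_id": "at0006",
            "data": {
                "node_id": "at0003",
                "items": [
                    {"node_id": "at0004", "value": {"magnitude": 142}},
                    {"node_id": "at0005", "value": {"magnitude": 91}},
                ],
            },
        }],
    }
}

SYSTOLIC = "data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/magnitude"
DIASTOLIC = "data[at0001]/events[at0006]/data[at0003]/items[at0005]/value/magnitude"
```

Here `resolve(observation, SYSTOLIC)` selects the item by its archetype node identifier rather than by its position in the list, which is what makes the path reusable across records built from the same archetype.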
Hence, all the above components of data elements are captured by using archetypes.
Thus, openEHR data elements are guaranteed to conform to the “semantic paths”.
They are created by the composition of archetypes within a template. The paths are
incorporated within a familiar SQL-style syntax, to form queries that can be evaluated
to retrieve items on a semantic basis. Queries are expressed in a language that is
a synthesis of SQL (SELECT/FROM/WHERE) and W3C XPaths, extracted from the
archetypes. The language is called the archetype query language (AQL) [AQL 2009].
At the present stage, the technical and design aspects of openEHR have largely been
outlined. EHRs should be designed with clinicians in mind, and in the next phase,
clinicians will be involved in archetype development. The function of an archetype
is to act as a design basis for queries. However, this concerns complex access plans.
Querying has been identified as a major area of review by NEHTA due to the lack of
clear standards in the EHR query services [NEHTA 2006].
7.3. Research Challenges in Querying
EHRs allow multiple representations [Chunlan et al. 2007]. In principle, EHRs can be
represented as relational structures (governed by an object/relational mapping layer),
and in various XML storage representations. There are many properties and classes
in the reference model, but the archetypes will constrain only those parts of a model
that are meaningful to constrain. These constraints cannot be weaker than those in
the reference model. For example, if an attribute is mandatory in RM, it is not valid
to express a constraint allowing the attribute to be optional in the archetype (ADL).
So the ADL file (alone) is not sufficient for querying. The user may want to query
some properties or attributes from RM, along with the querying from properties in
archetypes. In order to create a data instance of a parameter of EHR, we need different
archetypes in ADL, and these archetypes may also belong to different categories of
archetypes.
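
The mandatory-attribute example above can be expressed as an interval check. The sketch below is an illustration only: it assumes that existence is modeled as a (lower, upper) pair, and the names `conforms`, `MANDATORY`, and `OPTIONAL` are hypothetical.

```python
def conforms(rm_existence, archetype_existence):
    """True when the archetype interval lies within the reference-model interval."""
    rm_low, rm_high = rm_existence
    at_low, at_high = archetype_existence
    return rm_low <= at_low and at_high <= rm_high


MANDATORY = (1, 1)  # the attribute must be present
OPTIONAL = (0, 1)   # the attribute may be absent
```

With these definitions, `conforms(MANDATORY, OPTIONAL)` is false, matching the example in the text, while an archetype that makes an optional RM attribute mandatory passes the check.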
At query time, a user or an application faces a further problem: the different archetype
categories have different structures. To create a data instance for blood pressure, we need
two different archetypes, namely encounter and blood pressure. These archetypes belong to
different categories. Thus, the
following archetypes must be included in querying: the encounter archetype (belonging
to the COMPOSITION category of RM) and the blood pressure archetype (belonging
to the Observation category of RM). This problem can be addressed by the use of
templates. Archetypes are encapsulated by templates for the purpose of intelligent
querying [Beale and Heard 2009]. The templates are used for archetype composition or
chaining. Archetypes provide the pattern for data rather than an exact template. As a
result, the structure of data in any top-level object conforms to the constraints defined
in a composition of archetypes chosen by a template.
Querying the system with the dual-model architecture is not the same as querying a
relational database system or an XML database system. At the user level, querying data
regarding “blood pressure” (BP) must be made very simple. The user only knows BP as
a parameter and will query that parameter only. There is a need for a query support
that is neutral to system implementation, application environment, and programming
language. The domain professionals and software developers, both, should be able to
use the query language. For example, a patient may need to query details of medicines
actually taken by him or her.
7.4. Archetype Query Language
A query language, AQL, is being supported to query data described by archetypes in AM
and RM. This is the query support that is provided by openEHR and is going to be pro-
posed as a standardized language for querying dual-model-based (standardized) EHRs.
AQL is a declarative language, neutral to EHR systems, programming languages, and
system environments. It depends on an openEHR archetype model and semantics. It
was developed on the basis of many observations, namely, a set of clinical query scenarios,
study of the currently available query language syntaxes (including XQuery, SQL,
and object query language), and study of the archetypes technology, openEHR RM, and
openEHR path mechanisms. It was first named EQL (EHR query language) [Chunlan
et al. 2007]. It has since evolved and introduces two innovations:
(i) utilizing the openEHR path mechanism to represent the query criteria and returned
results (Figure 19); and (ii) using a “containment” mechanism to indicate the
data hierarchy and constrain the source data to which the query is applied (Figure 19).
The openEHR path mechanism enables any node within a top-level structure to be specified
from the top of the structure via a “semantic” (i.e., archetype-based), XPath-compatible
path. The use of a common RM, archetypes, and a companion query language,
such as AQL, facilitates semantic interoperability of EHR information. The syntax of
AQL is illustrated with the help of an example. The syntax makes use of the path
expression, naming retrieved results, the class expression, and archetype predicate, as
shown in the example in Figure 19.
Query: Find all blood pressure values where the systolic value is greater than or
equal to 140, and the diastolic value is greater than or equal to 90, within a specified
EHR.

    SELECT obs/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/magnitude AS Systolic,
           obs/data[at0001]/events[at0006]/data[at0003]/items[at0005]/value/magnitude AS Diastolic
    FROM EHR [ehr_id/value=$ehrUid]
        CONTAINS COMPOSITION [openEHR-EHR-COMPOSITION.encounter.v1]
        CONTAINS OBSERVATION obs [openEHR-EHR-OBSERVATION.blood_pressure.v1]
    WHERE obs/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/magnitude >= 140
      AND obs/data[at0001]/events[at0006]/data[at0003]/items[at0005]/value/magnitude >= 90

    Figure annotations: identified path (the systolic path in the SELECT clause); naming
    retrieved results (AS Systolic, AS Diastolic); class expression (EHR, COMPOSITION,
    OBSERVATION obs); archetype predicate (the archetype identifiers in square brackets).

Fig. 19. Syntax of AQL [Chunlan et al. 2007].
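
To make the selection and renaming in this query concrete, the following Python sketch applies the same two thresholds to a list of readings held in memory. It is not an AQL engine and does not use any openEHR API: the function name, the record layout, and the sample readings are hypothetical, and the path and containment parts of the query are assumed to have been resolved already.

```python
def high_blood_pressure(readings, systolic_min=140, diastolic_min=90):
    """Select readings meeting both thresholds, renamed as in the AS clauses."""
    return [
        {"Systolic": r["systolic"], "Diastolic": r["diastolic"]}
        for r in readings
        if r["systolic"] >= systolic_min and r["diastolic"] >= diastolic_min
    ]


# Illustrative readings for one EHR; the values are made up for the example.
READINGS = [
    {"systolic": 150, "diastolic": 95},
    {"systolic": 145, "diastolic": 85},
    {"systolic": 120, "diastolic": 80},
    {"systolic": 140, "diastolic": 90},
]

result = high_blood_pressure(READINGS)
```

Only the first and last readings satisfy both conditions; the second fails the diastolic threshold even though its systolic value is high, which reflects the AND in the WHERE clause.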
7.5. Archetype Query Language versus other Query Languages
With existing query languages (such as XQuery, SQL, OQL), users must know the
persistent data structure of an EHR in order to write an appropriate query for querying
EHR data. Thus, none of these can be directly used as a query language required by
integrated care EHRs. It is possible to convert a specification (in ADL) and patient data
into an equivalent XML form. A great variety of software exists for
querying XML. Table III shows the comparison between AQL and XQuery through
sample queries.
Some of the features of AQL are still under development, whereas XQuery is a stan-
dard language incorporating all the important features of a database query language.
The structure of AQL query results is still not standardized because the representation
of the results has to be neutral to the system environment and the structure
should be flexible (results may be structured using relational tables or represented
using a hierarchical structure). However, the AQL query builder uses a generic Result-
set, which has structure similar to a table [AQB 2009]. All products provided by Ocean
Informatics (including query builder) are based on release 1.0.1 of the openEHR spec-
ifications. They are designed for deployment within traditional and service-oriented
architectures, and support major published and emerging standards, including CEN
EN13606, HL7 Clinical Document Architecture (CDA), and HL7/OMG HSSP [Ocean
Informatics 2010].

Table III. Comparison of AQL versus XQuery

S.No | Features | AQL | XQuery
1 | Expression syntax | Neutral | Dependent on system implementation and environment
2 | Path expression | Yes | Yes
3 | Existential quantification | Yes | Yes
4 | Projection | Yes | Yes
5 | Selection predicates | Yes | Yes
6 | Relational operators/Boolean operators | Yes | Limited to simple cases
7 | Renaming | Yes | Yes
8 | Parameterization support | Yes | No
9 | Construction of new elements | No | Yes
10 | Negation | Yes | Yes
11 | Nesting | Yes | Yes
12 | Portable | Yes | No
13 | Value range as leaf data | Yes | No
14 | TOP operator | Yes | No
15 | Querying multiple | Allows multiple archetypes | Allows multiple documents
16 | Universal quantification | Not yet | Yes
17 | Cartesian product | Not yet | Yes
18 | Arithmetic functions | Yes (still to be finalized) | Yes
19 | Timewindow clause | Yes | No

8. DISCUSSION
Traditionally, clinicians and system users were not considered users in the development
and design phases of the systems. Also, few people are trained to work at
the intersection of biomedicine and IT. Ultimately, implementing and enforcing the
standards can help in improving quality [Øvretveit 2003]. At the present stage, it
is important to develop a query capability that allows healthcare professionals to
examine the data from a variety of perspectives.
Similarly, patients generally do not have highly advanced computer skills. They
cannot access their EHR and patient health information (PHI) without the help of an
easy query interface. Thus, there is a need to bridge the gap between these consumers
and EHR systems.
Database query languages that assist database programmers have been around for
over a decade. The languages are very good and versatile (the specifications have
evolved well over the years); but these are too demanding for the hospital-based users.
AQL is a language that is at the developers’ level; SQL and XQuery are at the application
level. Thus, AQL is even one level lower than SQL and XQuery. SQL is very
suitable for querying relational databases. XQuery is well-suited for semi-structured
data. Object-oriented query languages are meant for object-oriented databases, but are
complex. None of the above can support medical personnel. For skilled users, query
builders and “input form and search” techniques [Jayapandian and Jagdish 2009] are
available for querying systems. At the system developers’ level, ADL requires highly
skilled programmers for developmental stages.
One of the possible approaches for querying EHRs is to use ADL to generate a storable
XML output of the corresponding XML database and then to use XQuery. Also, we can
use XQBE on the top of the generated XML file [Sachdeva and Bhalla 2009]. The corre-
sponding XQuery and XQBE for the example in Figure 19 are given in Figures 20 and
21. XQuery uses extensible mark-up language (XML) as its underlying data model. It
is limited to purely XML data environments. Direct use of XQuery for archetype-based
EHR would require that all data be generated in XML format [Sachdeva and Bhalla
2009, 2010]. This approach suffers from many difficulties. First, openEHR is designed
as an object-oriented framework. It allows for a multitude of data representations
(Figure 6). These include programming language persistent objects (e.g., in the form of
Java objects in a product such as db4o2), relational structures (governed by an
object/relational mapping layer), and various XML storage representations (e.g., XML blob
or XML databases) [Chunlan et al. 2007]. These systems may lose form or content
detail with changes in data representations. Usage of XQuery is therefore problematic,
because the query syntax is directly tied to the representational format of the data.
Considerable efforts would be required to convert openEHR data in each deployment
context to XML just for the purpose of querying; such transformation may well be
custom made in each case [Chunlan et al. 2007; Sachdeva and Bhalla 2009].

Fig. 20. XQuery syntax for the BP query example.

Fig. 21. XQBE interface for the BP query example [Sachdeva and Bhalla 2009].

9. HIGH-LEVEL QUERY INTERFACES


At the end-user level in a hospital environment, the openEHR proposal supports forms
through templates. There is a strong need for studying the available query languages
and high-level query language interfaces for a match with healthcare professional
needs. The existing query languages such as XQuery and SQL are not suitable for
users. Many new approaches are being studied in different domains to provide high-
level query interfaces. Several high-level query languages such as QBE, QBO, XQBE,
and XML-GL exist [Zloof 1975; Rahman et al. 2006; Braga et al. 2005; Ceri et al. 1999].
At the system level, the AQL language is supported for development of initial support
infrastructure. Higher-level support is an active area of research. Many research efforts
aim to improve user interaction facilities [Jayapandian and Jagdish 2009; Braga et al.
2005]. This will improve the quality of care.

Table IV. IQ Categories and Dimensions [Wang and Strong 1996]

IQ category | IQ dimensions
Intrinsic IQ | Accuracy, Objectivity, Believability, Reputation
Accessibility IQ | Accessibility, Security
Contextual IQ | Relevancy, Value-Added, Timeliness, Completeness, Amount of Info
Representational IQ | Interpretability, Ease of Understanding, Concise Representation, Consistent Representation

Table V. DQ Aspect in Standardized EHRs

DQ dimension | DQ enhancements in EHRs
Accuracy and Validity | Business rules defined in archetypes; “null flavor” in ELEMENT class of RM
Believability | RM based on clinical investigator recording process; archetypes developed by domain experts
Reliability | Two-level modeling approach (stable RM); richer data structures and data types; data independence among RM, AM, and SM
Accessibility | Sharable archetypes
Security | Security information model of RM
Timeliness | Version control in RM; new and modified archetypes are developed as clinical knowledge expands
Completeness | Standardized data definitions, content, and structure
Interpretability | Fine granularity of data in archetypes; linkage to terminology standards
Ease of Understanding | User-level query ability and usability
Concise Representation | Rich health data definition (archetypes)
Consistency | Interfaces for legacy systems (non-standardized); interfaces to other systems (HL7, CEN 13606); ontology-based archetype transformation process (e.g., openEHR archetypes to HL7 CDA archetypes or CEN 13606 archetypes)
One possible proposal is to provide an interface at the user level. Query-by-object
(QBO) is a high-level query interface which is user-friendly and simple to query [Bhalla
and Hasegawa 2006; Rahman et al. 2006]. It follows a step-by-step procedure, which
helps in removing ambiguities in a user’s intentions. It is based on the information
requirement elicitation (IRE) approach [Sun 2003]. IRE is an interactive communica-
tion activity in which an information system helps users specify their requirements
with adaptive choice prompts. This is a calculator-oriented approach using an object-
by-object query. The users need not possess programming skills prior to accessing the
web-based information system. A similar approach, such as QBE [Zloof 1975] or QBO,
may be used by the end-users (such as clinicians, decision makers, and patients) to sim-
plify the process of querying EHR. An aim of the new approaches is that the user is not
required to know details of persistent data. In the near future, it is desirable to support
such a high-level query interface that will help to collect the required knowledge.
10. SEMANTIC INTEROPERABILITY CONSIDERATIONS
10.1. Data Quality in EHRs
Wang and Strong [1996] describe various categories and dimensions of information
quality as shown in Table IV.
Some of the data quality requirements are accuracy and validity, believability, ac-
cessibility, security, timeliness, completeness, interpretability, ease of understanding,
and consistency [Orfanidis et al. 2004]. Table V gives a narrative view of how the DQ
aspect is enhanced in standardized EHRs.


Users | Tasks
Collectors (medical and administrative staff) | Data-production processes
Custodians (DBA or computer scientists) | Data storage, maintenance, and security
Consumers (physicians, researchers, managers, patients) | Data utilisation processes

Fig. 22. Data users impacting data quality.

Example of accuracy. In the openEHR RM, the class ELEMENT has the attribute
null_flavor. It is used to mark a lack of data. Using this attribute was inspired by
(a) the need to do something about marking missing data in health information and (b)
the use of data quality markers in SCADA control systems, which show on the screen
when a measured value from the field is out of date or wrong due to technical failure to
obtain the current value. In the development of openEHR, a data quality marker has
been made available for a similar reason: to indicate technical incapacity to obtain data.
Thus, the problems of incompatible basic data types and overlapping and incompatible
definitions of clinical content have been addressed and solved by openEHR.
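
The idea of marking missing data explicitly can be sketched with a small Python class. This is a simplified stand-in for the RM ELEMENT class and not its actual definition: the field names, the single numeric value type, the example flavor string "unknown", and the assumption that a value and a null flavor are mutually exclusive are all made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """Simplified stand-in for the RM ELEMENT class (hypothetical Python rendering)."""
    name: str
    value: Optional[float] = None
    null_flavor: Optional[str] = None

    def __post_init__(self):
        # Assumption: exactly one of value / null_flavor must be present.
        if (self.value is None) == (self.null_flavor is None):
            raise ValueError("provide either a value or a null flavor, but not both or neither")

    def is_null(self):
        """True when the element records the absence of data."""
        return self.value is None
```

An element such as `Element("temperature", null_flavor="unknown")` then states that no value could be obtained, which a consumer can distinguish from a field that was simply never filled in.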

10.2. Data Users Impacting DQ in EHR systems


DQ is analyzed in terms of three groups of data users: collectors, custodians, and
consumers [Wang et al. 2001]. In healthcare, these users and their associated tasks
are shown in Figure 22. Quality in data collection is achieved by adherence to
guidelines and data definitions. Sufficient data checks at the point of data entry
are enforced by user interfaces generated on top of archetypes (i.e., from templates).
The medical staff requires training to avoid typing errors, transcription errors, and
incomplete transcription. The protocol of data collection is also significant for
DQ. The openEHR standard gives it special attention: a protocol section is provided
in its clinical ENTRY classes (the CARE_ENTRY subtypes of the RM). The data custodians
can easily maintain the data quality of the system because, in two-level modeling,
software development is separated from domain knowledge. Healthcare professionals, as
consumers, require aggregated and integrated patient information that may be
distributed across multiple sites.
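
The entry-time checks described above can be sketched as follows. This is a minimal Python illustration, assuming a hypothetical constraint table (FIELD_RULES) that stands in for the constraints a template would supply. The field names and limits are invented for the example and carry no clinical authority.

```python
from typing import Any, Dict, List

# Hypothetical constraints, standing in for what a template would supply.
FIELD_RULES: Dict[str, Dict[str, Any]] = {
    "patient_name": {"required": True, "type": str},
    "pulse_rate": {"required": True, "type": int, "min": 0, "max": 300},
    "position": {"required": False, "type": str,
                 "allowed": {"sitting", "standing", "lying"}},
}


def check_entry(form: Dict[str, Any]) -> List[str]:
    """Return the problems found in one data-entry form (empty list if none)."""
    problems: List[str] = []
    for field, rule in FIELD_RULES.items():
        value = form.get(field)
        if value is None or value == "":
            if rule["required"]:
                problems.append(f"{field}: missing required value")
            continue
        if not isinstance(value, rule["type"]):
            problems.append(f"{field}: expected {rule['type'].__name__}")
            continue
        if "min" in rule and value < rule["min"]:
            problems.append(f"{field}: below minimum {rule['min']}")
        if "max" in rule and value > rule["max"]:
            problems.append(f"{field}: above maximum {rule['max']}")
        if "allowed" in rule and value not in rule["allowed"]:
            problems.append(f"{field}: value not in allowed set")
    # Fields the rules do not know about are reported rather than stored silently.
    for field in form:
        if field not in FIELD_RULES:
            problems.append(f"{field}: unknown field")
    return problems
```

Rejecting a form at the point of entry, while the collector can still correct it, is cheaper than repairing the stored record later.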

10.3. Data Quality Improvement Methodology for EHRs


A number of methodologies (AIMQ [Lee et al. 2002]; TDQM [Wang 1998]; and CDQM
[Batini and Scannapieco 2006]) have been developed to help information quality prac-
titioners to discover which forms of information quality are of relevance to their stake-
holders, and to help convert the forms into specific quality scores.
A data quality framework is a tool for the assessment of data quality within an
organization [Wang et al., 1996]. In a total data quality management (TDQM) framework,
the “define” component identifies the important DQ dimensions and the corresponding
DQ requirements. The “measure” component produces the DQ metrics. The “analyze”
component identifies the root causes of DQ problems and calculates the impact of
poor-quality information. The “improve” component provides techniques for improving
DQ. Analogous to the TDQM cycle, a data quality improvement methodology for
EHRs in an EHR system has been proposed, as shown in Figure 23. In applying the
framework, we must define the characteristics of the EHR, assess the EHR data quality
requirements, and identify the EHR system for the EHR. In the figure, EHRC stands
for EHR characteristics and EHRQ stands for EHR quality. Figure 23 also depicts the
DQ components.

Fig. 23. Similarity between TDQM cycle and data quality improvements for EHRs. [Figure: (a) a schematic of the TDQM cycle [Wang 1998], with the components Define, Measure, Analyze, and Improve; (b) the data quality improvement methodology for EHRs in an EHR system over a period of time, with the same four components arranged around EHRC (EHR characteristics) and EHRQ (EHR quality).]
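
The four TDQM components can be illustrated with a small Python sketch. It is an assumption-laden toy, not the methodology of Figure 23: completeness is the only dimension, the 0.9 target is arbitrary, and the function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Record = Dict[str, Optional[str]]


def define() -> Dict[str, float]:
    """Define: choose DQ dimensions and targets (here only completeness)."""
    return {"completeness": 0.9}  # arbitrary target, for illustration


def measure(records: List[Record], fields: List[str]) -> Dict[str, float]:
    """Measure: fraction of records with a non-empty value, per field."""
    if not records:
        return {f: 0.0 for f in fields}
    return {
        f: sum(1 for r in records if r.get(f)) / len(records)
        for f in fields
    }


def analyze(metrics: Dict[str, float], target: float) -> List[str]:
    """Analyze: list the fields whose completeness falls below the target."""
    return sorted(f for f, score in metrics.items() if score < target)


def improve(records: List[Record], weak_fields: List[str]) -> List[Tuple[int, str]]:
    """Improve: build a follow-up worklist of (record index, field) gaps."""
    return [
        (i, f)
        for i, r in enumerate(records)
        for f in weak_fields
        if not r.get(f)
    ]


def tdqm_pass(records: List[Record], fields: List[str]) -> List[Tuple[int, str]]:
    """Run one define-measure-analyze-improve pass over the records."""
    target = define()["completeness"]
    weak = analyze(measure(records, fields), target)
    return improve(records, weak)
```

Because the cycle is iterative, the worklist from one pass feeds data collection before the next measurement.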
The EHR team needs to identify key areas for improvement such as the following.
(1) Making the EHRs semantically more interoperable. The new and modified
archetypes are developed as the clinical knowledge is enhanced. Quality improvement
is an iterative cycle [Kerr et al. 2008]. The requirements may continue to change over
time.
(2) Exchange of EHRs among different standards-compliant EHR systems. CEN
13606 has adopted the openEHR two-level modeling approach, known as the “archetype
methodology”. HL7 CDA [HL7 CDA, Release 2] defines what could be considered a
“single document extract”. The openEHR EHR Extract specification is more flexible
and fully archetypable, and is being extended to better accommodate data in 13606
and CDA form. HL7 CDA is approximately a subset of the 13606 EHR Extract,
limited to one version of one composition, with some minor differences. ISO 13606-2
defines ADL as a formal language for expressing archetypes as constraints on the
reference model. Archetypes expressed in this language will be convertible to HL7
refined message information models (R-MIMs) and common message element types
(CMETs). The intention is to harmonize the openEHR archetype concept with HL7 CDA
and HL7 templates.
11. SUMMARY AND CONCLUSIONS
Healthcare activity needs to be automated to bring uniformity and to improve quality
[Øvretveit et al. 2007]. EHRs are life-long health records that can transform medical
practice, making it more efficient and saving money and time [Wang et al. 2003].
Enhancements in EHR architecture have become available that focus on two main
issues: standardization and interoperability. Standardization is being accomplished
through a dual-level modeling approach. In this approach, software development
can proceed separately from domain modeling, and if concept models are
introduced or altered, the software need not be redesigned, coded, tested,
and redeployed. The dual-level model thus enhances the quality of information
systems. Knowledge-level interoperability will be achieved through the establishment
of archetypes. Archetypes are developed through domain knowledge governance, which
resolves the human problem of agreeing on what is contained within the
domain and why it is important.
Tayi and Ballou [1998] consider four dimensions of data quality: accuracy, com-
pleteness, consistency, and timeliness. Archetypes are accurate insofar as they provide
correctness and precision with which the real-world data is represented. They are com-
plete insofar as fine granularity of data is provided (i.e., all relevant data are recorded).
They are consistent insofar as they are the basis for data that satisfies specified con-
straints and business rules. They provide timeliness insofar as versioning is possible,
so the recorded data is up-to-date.
EHR semantic interoperability ensures the necessary data quality and consistency.
It will enable meaningful and reliable use of longitudinal and heterogeneous data for
public health, research, and health service management. The growing size and quality
of the openEHR repository of archetypes means that individual organizations using
the technology have to do less work to be interoperable, while gaining access to the
content models created and used by some of the largest health organizations in the
world, including the Australian government and the NHS in the UK.
Pioneering new archetypes will allow the clinical concepts to be expanded; they
also provide the basis for querying EHR repositories. Querying over EHR data has
to be neutral to EHR systems, programming languages, and system environments.
The query syntax has to be neutral with respect to the reference model, that is, the
common data model of the information being queried. These objectives are met by
the new architectural design. EHRs can help in delivering the right information to
the right person at the right time. Patients can have complete control over access to
and distribution of their health records. openEHR software is low cost because it
is open source, and its robustness makes it easy to maintain, which suits the
economies of developing countries. For these reasons, it has been chosen for study
and adoption by various organizations. Microsoft, Queensland Health (Australia), Bert
Verhees (Netherlands), NexJ Systems (Canada), and national e-health programs (in
Singapore, Sweden, Denmark, and Great Britain) are working on an archetype-based
openEHR approach [Microsoft 2009; MOSS8 2009].
The openEHR approach can, moreover, provide the common basis for ubiquitous
presence of meaningful and computer-processable knowledge and information, and
thus contribute to the usability of clinical systems, improve data quality, and improve
semantic interoperability. To utilize the full potential of interoperable EHR systems,
they have to be accepted by their users, that is the healthcare providers. Graphical
user interfaces that support customization and data validation play a decisive role for
user acceptance and data quality. Further research and study to provide information
that focuses on improving existing tools and algorithms are required.

APPENDIX
Example of ADL Archetype Structure
In the following example, the notion of “patient” is defined in terms of constraints on a
generic model of the concept PARTY (Figure 24).

Fig. 24. Specialization: patient is a PARTY (subpart of Figure 5). [Figure: Patient “is a” PARTY.]

archetype (adl_version=1.4)
    adl-test-party.patient.draft
concept
    [at0000] -- patient
language
    original_language = <[iso_639-1::en]>
definition
    PARTY[at0000] matches {
        details matches { -- details
            address matches {/([a-zA-Z0-9 ]+)*/} -- alphanumeric ok
            identity cardinality matches {1..*} matches {
                PARTY_IDENTITY[at0001] matches { -- demographic details
                    name matches {[local::at0002]} -- patient's name
                    contact matches {[local::at0003]} -- patient's contact
                }
            }
            relationships cardinality matches {0..*; ordered} matches {
                PARTY_RELATIONSHIP[at0004] matches {*} -- patient relationships
            }
        }
    }
ontology
    term_definitions = <
        ["en"] = <
            items = <
                ["at0000"] = <
                    text = <"patient">;
                    description = <"patient's data">
                >
                ["at0001"] = <
                    text = <"Demographic details">;
                    description = <"A patient's demographic details">
                >
                ["at0002"] = <
                    text = <"name">;
                    description = <"A patient's name">
                >
                ["at0003"] = <
                    text = <"contacts">;
                    description = <"A patient's contact">
                >
                ["at0004"] = <
                    text = <"Relationships">;
                    description = <"A patient's relationships, especially family ties.">
                >
            >
        >
    >


ACKNOWLEDGMENTS
The authors received valuable comments from William Rozycki (Center for Language Research at the Uni-
versity of Aizu) for corrections and improvements.

REFERENCES
AQB 2009. Archetype query builder. http://www.oceaninformatics.com/Solutions/ocean-products/Clinical-
Modelling/Ocean-Query-Builder.html. (Accessed 12/09).
AQL 2009. Archetype query language. http://www.openehr.org/wiki/display/spec/Archetype+Query+-
Language+Description. (Accessed 12/09).
Archetype editor tool. 2009. http://wiki.oceaninformatics.com/confluence/display/TTL/Archetype+Editor+-
Releases. (Accessed 12/09).
ATALAG, K., KINGSFORD, D., PATON, C., AND WARREN, J. 2010. Putting health record interoperability standards
to work. J. Health Inf. 5, 1 (e-jhi).
BATINI, C. AND SCANNAPIECO, M. 2006. Data Quality: Concepts, Methodologies, and Techniques. Springer, Berlin.
BEALE, T. 2008. The openEHR archetype model: Archetype object model. In the openEHR release 1.0.2,
openEHR Foundation.
BEALE, T. 2010. OpenEHR to ISO 13606-1, ISO 21090 mapping. http://www.openehr.org/wiki/display/stds/
openEHR+to+ISO+13606-1%2C+ISO+21090+mapping.
BEALE, T. AND FRANKEL, H. 2007. The openEHR reference model: Extract information model. The openEHR
release 1, openEHR Foundation.
BEALE, T. AND HEARD, S. 2005. Archetype definitions and principles. In the openEHR release 1.0.2, openEHR
Foundation.
BEALE, T. AND HEARD, S. 2008a. The openEHR architecture: Architecture overview. In the openEHR release
1.0.2, openEHR Foundation.
BEALE, T. AND HEARD, S. 2008b. The openEHR archetype model-archetype definition language ADL 1.4. In
openEHR release 1.0.2 (issue date: 12/08).
BEALE, T., HEARD, S., KALRA, D., AND LLOYD, D. 2008. The openEHR reference model: EHR information model.
In the openEHR release 1.0.2, openEHR Foundation.
BEALE, T. AND HEARD, S. 2009. The openEHR archetype model: openEHR Templates. In openEHR release 1.0.2
(issue date 4/20/09).
BHALLA, S. AND HASEGAWA, M. 2006. Query interface for ubiquitous access to database resources. In Proceedings
of the 13th International Conference on Management of Data (COMAD).
BISBAL, J. AND BERRY, D. 2009. Archetype alignment: A two-level driven semantic matching approach to
interoperability in the clinical domain. In HEALTHINFO, 216–221.
BLOBEL, B. G. M. E. AND PHAROW, P. 2008. Analysis and evaluation of EHR approaches. In eHealth Beyond the
Horizon: Get It There (MIE’08).
BOTT, O. J. 2004. The electronic health record: Standardization and implementation. In Proceedings of the
2nd OpenECG Workshop.
BRAGA, D., CAMPI, A., AND CERI, S. 2005. XQBE (XQuery By Example): A visual interface to the standard XML
query language. ACM Trans. Datab. Syst. 30, 2, 398–443.
CEN TC/251. European standardization of health informatics. ENV 13606 Electronic Health Record
Communication. http://www.centc251.org/.
CERI, S., COMAI, S., DAMIANI, E., FRATERNALI, P., PARABOSCHI, S., AND TANCA L. 1999. XML-GL: A graphical
language of querying and restructuring XML documents. In Proceedings of the WWW.
CHUNLAN, M., FRANKEL, H., BEALE, T., AND HEARD S. 2007. EHR query language (EQL): A query language for
archetype-based health records. In MEDINFO.
CKM. Clinical Knowledge Manager. http://www.openehr.org/knowledge/. (Accessed 12/09).
EHR Standards. Electronic health records standards. http://en.wikipedia.org/wiki/Electronic_health_record#Standards.
(Accessed 12/09).
EICHELBERG, M., ADEN, T., RIESMEIER, J., DOGAC, A., AND LALECI, G. B. 2005. A survey and analysis of electronic
healthcare record standards. ACM Comput. Surv. 37, 4, 277–315.
EUROREC. 2010. European Record Institute. http://www.eurorec.org/.
GEHR. Good electronic health record project. http://www.gehr.org/.
GENDRON, M. S. AND D’ONOFRIO, M. J. 2001. Data quality in the healthcare industry. Data Quality J. 7, 1.


GÖK, M. 2008. Introducing an openEHR-based electronic health record system in a hospital. Masters thesis,
University of Goettingen.
HL7. Health level 7. www.hl7.org. (Accessed 11/10).
HL7. CDA. Clinical document architecture. Release 2. http://www.hl7.org/v3ballot/html/infrastructure/cda/
cda.htm.
HRISTIDIS, V. 2009. Data quality and integration issues in EHRs. In Information Discovery on Electronic
Health Records, Chapman & Hall, Ch. 4.
IHTSDO 2009. IHTSDO and openEHR collaboration. http://www.openehr.org/292-OE.html?branch=
1&language=1.
ISO 13606-1. 2008. Health informatics: Electronic health record communication. Part 1: RM (1st Ed.).
ISO 13606-2. 2008. Health informatics: Electronic health record communication. Part 2: Archetype inter-
change specification (1st Ed.).
ISO/TC 215 TECHNICAL REPORT. 2003. Electronic health record definition, scope, and context. (2nd. draft,
Aug.).
JAYAPANDIAN, M. AND JAGADISH, H. V. 2009. Automating the design and construction of query forms. IEEE
Trans. Knowl. Data Eng. 21, 10, 1389–1402.
KALRA, D., TAIPURIA, A., FRERIKS, G., MENNERAT, F. AND DEVLIES J. 2008. Management and maintenance policies
for EHR interoperability resources. Q-REC Project IST 027370 3.3. The European Commission, Brussels.
KENNELLY, R.J. 1998. IEEE 1073, Standard for medical device communications. In Proceedings of the IEEE
Systems Readiness Technology Conference (AUTOTESTCON ‘98). IEEE, Los Alamitos, CA, 335–336.
KERR, K.A., NORRIS, T., AND STOCKDALE, R. 2008. The strategic management of data quality in healthcare.
Health Inform. J. 14, 259.
LEE, Y., STRONG, D., KAHN, B., AND WANG, R. 2002. AIMQ: A methodology for information quality assessment.
Inform. Manage. 40, 2, 133–146.
LESLIE, H. AND HEARD S. 2006. Archetypes 101. In Proceedings of the HIC and HINZ. Health Informatics
Society of Australia, 18–23.
LEWIS, G. A., MORRIS, E., SIMANTA, S., AND WRAGE, L. 2008. Why standards are not enough to guarantee end-
to-end interoperability. In Proceedings of the IEEE 7th International Conference on Composition-based
Software Systems. IEEE, Los Alamitos, CA.
MADNICK, S. E., WANG, R. Y., LEE, Y. W., AND ZHU, H. 2009. Overview and framework for data and information
quality research. ACM J. Data Inf. Quality 1, 1, Article 2.
MALDONADO, J. A., MONER, D., TOMÁS, D., ÁNGULO, C., ROBLES, M., AND FERNÁNDEZ, J. T. 2007. Framework
for clinical data standardization based on archetypes. In MEDINFO.
Microsoft Connected Health Framework. http://www.microsoft.com/industry/healthcare/technology/Health-
Frameok.mspx. (Accessed 11/09).
MIETTINEN, M. AND KORHONEN, M. 2008. Information quality in healthcare: Coherence of data compared
between organization’s electronic patient records. In Proceedings of the 21st IEEE International Sym-
posium on Computer-Based Medical Systems. IEEE, Los Alamitos, CA.
MIKKELSEN, G. AND AASLY, J. 2005. Consequences of impaired data quality on information retrieval in electronic
patient records. Int. J. Med. Inf. 74, 5, 387–394.
MOSS8. 2009. Eight Medical Open Source Software Seminar. http://www.openehr.org/293-OE.html?branch=
1&language=1.
NEHTA. 2006. Review of shared electronic health records standards. National E-Health Transition
Authority.
NHIN. 2005. NHIN: Interoperability for the National Health Information Network. IEEE, E-books.
OCEAN INFORMATICS. 2010. http://www.oceaninformatics.com/. (Accessed 1/10).
OpenEHR Community. http://www.openehr.org/. (Accessed 5/09).
ORFANIDIS, L., BAMIDIS, P.D., AND EAGLESTONE, B. 2004. Data quality issues in electronic health records: An
adaptation framework for the Greek health system. Health Inform. J. 10, 1, 23.
ØVRETVEIT, J. 2003. What are the best strategies for ensuring quality in hospitals? In WHO Regional Office
for Europe’s Health Evidence Network (HEN).
ØVRETVEIT, J., SCOTT, T., RUNDALL, G. T., SHORTELL, S. M., AND BROMMELS, M. 2007. Improving quality through
effective implementation of information technology in healthcare. Int. J. Quality Health Care 19, 5,
259–266.
PATRICK, J., LY, R., AND TRURAN, D. 2006. Evaluation of a persistent store for openEHR. In Proceedings of the
HIC and HINZ. Health Informatics Society of Australia, 83–89.


PISHEV, O. 2006. The openEHR advantage. White paper. http://www.oceaninformatics.com/ocean-informatics-
resources/ocean-documentation/Published-Articles/The-iopeniEHR-advantage2.html.
POISSANT, L., PEREIRA, J., TAMBLYN, R., AND KAWASUMI, Y. 2005. The impact of electronic health records on time
efficiency of physicians and nurses: A systematic review. J. Amer. Med. Inform. Assoc. 12, 505–516.
RAHMAN, S. A., BHALLA, S., AND HASHIMOTO, T. 2006. Query-by-object Interface for information requirement
elicitation in M-commerce. Int. J. Hum. Comput. Interact.
SACHDEVA, S. AND BHALLA, S. 2010. Semantic Interoperability in healthcare information for EHR databases.
In Proceedings of the 6th International Workshop, Databases in Networked Information Systems (DNIS).
157–173.
SACHDEVA, S. AND BHALLA, S. 2009. Implementing high-level query language interfaces for archetype-based
electronic health records database. In Proceedings of the International Conference on Management of
Data (COMAD). 235–238.
SILBERSCHATZ, A., KORTH, H. F., AND SUDARSHAN, S. 2010. Database System Concepts, 6th Ed. McGraw Hill,
New York.
SIMONOV, M., SAMMARTINO, L., ANCONA, M., PINI, S., CAZZOLA, W., AND FRASCIO, M. 2005. Information, knowledge
and interoperability for healthcare domain. In Proceedings of the 1st International Conference on Au-
tomated Production of Cross Media Content for Multi-Channel Distribution (AXMEDIS’05). IEEE, Los
Alamitos, CA.
SNOMED. Clinical terms. Systematized nomenclature of medicine. http://www.snomed.org/documents/-
snomed_overview.pdf.
SOKOLOWSKI, R. 1999. Expressing health care objects in XML. In Proceedings of the 8th Workshop on Enabling
Technologies on Infrastructure for Collaborative Enterprises.
SUN, J. 2003. Information requirement elicitation in M-Commerce: An interactive approach to facilitate
information search for mobile users. Comm. ACM 46, 12, 45–47.
TAYI, G.K. AND BALLOU, D.P. 1998. Examining data quality. Comm. ACM 41, 2, 54–57.
The openEHR .Net knowledge tools project. http://www.openehr.org/projects/dotnet.html.
The openEHR Java reference implementation project. http://www.openehr.org/projects/java.html.
THURSTON, L.M. 2006. Flexible and extensible display of archetyped data: The openEHR presentation chal-
lenge. In Proceedings of HIC and HINZ. Health Informatics Society of Australia, 28–36.
WALKER, J., PAN, E., JOHNSTON, D., ADLER-MILSTEIN, J., BATES, D. W., AND MIDDLETON, B. 2005. The value of health
care information exchange and interoperability. Health Affairs.
http://content.healthaffairs.org/cgi/content/abstract/hlthaff.w5.10.
WANG, S. J., MIDDLETON, B., PROSSER, L. A., BARDON, C. G., SPURR, C. D., CARCHIDI, P. J., KITTLER, A. F., GOLDSZER,
R. C., FAIRCHILD, D. G., SUSSMAN, A. J., KUPERMAN, G. J., AND BATES, D. W. 2003. A cost-benefit analysis of
electronic medical records in primary care. Amer. J. Med. 114, 5, 397–403.
WANG, R. Y. 1998. A product perspective on total data quality management. Comm. ACM 41, 2, 58–65.
WANG, R. Y. AND STRONG, D. M. 1996. Beyond accuracy: What data quality means to data consumers. J.
Manage. Inf. Syst. 12, 4, 5–33.
WANG, R. Y., ZAID, M., AND LEE, Y. W. 2001. Data Quality, Kluwer, Amsterdam.
ZLOOF, M. M. 1975. Query-by-example. In Proceedings of AFIPS’75.

Received January 2010; revised December 2010; accepted January 2012
