The document is an introduction to the field of bioinformatics and clinical scientific computing, published by CRC Press in 2023. It covers a wide range of topics including data structures, databases, SQL, data mining, data analysis, network architecture, and web programming. The book is designed for those interested in the intersection of biology and computational science, providing foundational knowledge and practical applications.

Introduction to Bioinformatics and Clinical Scientific Computing 1st Edition

Cover image: spainter_vfx/Shutterstock

First edition published 2023
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742

and by CRC Press
4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN

© 2023 Paul S. Ganney

CRC Press is an imprint of Informa UK Limited

Reasonable efforts have been made to publish reliable data and information, but the author and publisher
cannot assume responsibility for the validity of all materials or the consequences of their use. The authors
and publishers have attempted to trace the copyright holders of all material reproduced in this publication
and apologize to copyright holders if permission to publish in this form has not been obtained. If any
copyright material has not been acknowledged please write and let us know so we may rectify in any
future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter
invented, including photocopying, microfilming, and recording, or in any information storage or retrieval
system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, access www.copyright.com
or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923,
978-750-8400. For works that are not available on CCC please contact mpkbookspermissions@tandf.co.uk

Trademark notice: Product or corporate names may be trademarks or registered trademarks and are used
only for identification and explanation without intent to infringe.

ISBN: 978-1-032-32413-5 (hbk)
ISBN: 978-1-032-32693-1 (pbk)
ISBN: 978-1-003-31624-4 (ebk)
DOI: 10.1201/9781003316244

Typeset in Times
by SPi Technologies India Pvt Ltd (Straive)

Access the companion website: https://www.routledge.com/9781032324135


Contents
Acknowledgements..................................................................................................xiv

Chapter 1 Data Structures...................................................................................... 1


1.1 Introduction................................................................................ 1
1.2 Arrays......................................................................................... 1
1.3 Stack or Heap............................................................................. 8
1.4 Queue........................................................................................ 11
1.5 Linked List................................................................................ 13
1.6 Binary Tree............................................................................... 18
Notes................................................................................................... 19

Chapter 2 Databases............................................................................................. 21
2.1 Introduction.............................................................................. 21
2.2 Terminology............................................................................. 22
2.3 The Goals of Database Design................................................. 23
2.4 Example.................................................................................... 24
2.5 More Terminology.................................................................... 24
2.6 Example – Making the Design More Efficient......................... 26
2.7 Fourth and Fifth Normal Form................................................. 29
2.8 Many-to-Many Relationships................................................... 33
2.9 Distributed Relational Systems and Data Replication.............. 33
2.10 Columnstore and Data Warehousing........................................ 35
2.11 OLAP Cubes............................................................................. 38
2.12 Star Schema.............................................................................. 40
2.13 Database Standards and Standards for Interoperability
and Integration.......................................................................... 41
2.13.1 Database Naming Conventions................................... 42
2.13.2 Data Administration Standards.................................... 43
2.13.3 Database Administration Standards............................ 44
2.13.4 System Administration Standards............................... 44
2.13.5 Database Application Development Standards........... 44
2.13.6 Database Security Standards....................................... 45
2.13.7 Application Migration and Turnover Procedures........ 45
2.13.8 Operational Support Standards.................................... 46
Notes................................................................................................... 46
References........................................................................................... 47

Chapter 3 SQL..................................................................................................... 49
3.1 Introduction.............................................................................. 49
3.2 Common Commands: SELECT, INSERT, UPDATE and
DELETE................................................................................... 49
3.3 Some Useful Commands/Functions......................................... 53


3.4 SELECT Modifiers................................................................... 53
3.4.1 MySQL 8.0.................................................................. 74
3.5 Create/Alter Table..................................................................... 74
3.6 Indexes...................................................................................... 78
3.7 Privileges.................................................................................. 80
3.8 Loading Large Data Sets.......................................................... 84
3.9 Stored Routines........................................................................ 85
3.10 Triggers..................................................................................... 88
3.11 Columnstore............................................................................. 90
3.12 Concurrency Control and Transaction Management................ 91
3.13 Database Performance Tuning.................................................. 95
3.14 Hints and Tips........................................................................... 98
3.14.1 Naming Standards....................................................... 98
3.14.2 Data Types................................................................... 98
3.14.3 In Code........................................................................ 98
3.14.4 Documentation............................................................ 98
3.14.5 Normalization and Referential Integrity...................... 99
3.14.6 Maintenance: Run Periodic Scripts to Find................. 99
3.14.7 Be Good....................................................................... 99
Notes................................................................................................... 99
References......................................................................................... 101

Chapter 4 Data Mining....................................................................................... 103


4.1 Introduction............................................................................ 103
4.2 Pre-Processing........................................................................ 105
4.3 Data Mining............................................................................ 105
4.3.1 Some Data-Mining Methods..................................... 109
4.3.1.1 Decision Trees and Rules........................... 109
4.3.1.2 Nonlinear Regression and
Classification Methods............................... 110
4.3.1.3 Example-Based Methods........................... 111
4.3.1.4 Probabilistic Graphic Dependency
Models....................................................... 112
4.3.1.5 Principal Component Analysis (PCA)....... 113
4.3.1.6 Neural Networks........................................ 116
4.4 Data Mining Models in Healthcare........................................ 118
4.5 Results Validation................................................................... 118
4.6 Software.................................................................................. 119
Notes................................................................................................. 119
References......................................................................................... 119

Chapter 5 Data Analysis and Presentation......................................................... 121


5.1 Introduction............................................................................ 121
5.2 Appropriate Methods and Tools............................................. 121
5.3 Interpretation of Results......................................................... 122


5.4 Presentation of Results........................................................... 123
5.5 Quality Indicators................................................................... 123
5.6 Graphical Presentation............................................................ 124
5.7 Standards................................................................................ 127
5.8 Commercial Software: Excel.................................................. 127
5.8.1 Charts........................................................................ 127
5.9 Blinded with Science.............................................................. 138
Notes................................................................................................. 140
References......................................................................................... 141

Chapter 6 Boolean Algebra................................................................................ 143


6.1 Introduction............................................................................ 143
6.2 Notation.................................................................................. 143
6.3 Truth Tables............................................................................ 143
6.4 Algebraic Rules...................................................................... 144
6.5 Logical Functions................................................................... 146
6.5.1 Functions of One Variable......................................... 146
6.5.2 Functions of Two Variables....................................... 146
6.6 Simplification of Logical Expressions.................................... 147
6.7 A Slight Detour into NAND and NOR................................... 148
6.8 Karnaugh Maps...................................................................... 149
6.9 Using Boolean Algebra in Forming and
Validating Queries.................................................................. 151
6.10 Binary and Masking............................................................... 151
Notes................................................................................................. 152
References......................................................................................... 152

Chapter 7 NoSQL.............................................................................................. 153


7.1 Introduction............................................................................ 153
7.1.1 Strengths.................................................................... 155
7.1.2 Weaknesses................................................................ 155
7.2 Document Storage.................................................................. 155
7.3 GraphDB................................................................................. 158
7.4 Conclusions............................................................................ 161
Notes................................................................................................. 161

Chapter 8 Network Architecture........................................................................ 163


8.1 Introduction............................................................................ 163
8.2 Networking and the Network Environment............................ 163
8.2.1 The Network Packet.................................................. 163
8.2.2 Hardware – Hub, Switch, Router, Firewall............... 164
8.2.3 Network Topologies.................................................. 166
8.3 Cabling Infrastructure............................................................. 170


8.4 IP Addressing and DNS.......................................................... 173
8.4.1 IP Mask..................................................................... 173
8.4.2 Ports........................................................................... 175
8.5 IP Routing Tables................................................................... 176
8.5.1 IP Routing Table Entry Types................................... 176
8.5.2 Route Determination Process.................................... 177
8.5.3 Example Routing Table for Windows 2000.............. 178
8.5.4 Static, Dynamic and Reserved IPs............................ 179
8.5.4.1 Two Devices with the Same
IP Address.................................................. 180
8.5.5 Where Is Your Data?.................................................. 180
8.6 Connecting Medical Devices to the Hospital Network.......... 181
8.6.1 Firewalls.................................................................... 182
8.6.2 Bandwidth................................................................. 183
8.7 Infrastructure.......................................................................... 184
8.8 The OSI 7-layer Model........................................................... 184
8.9 Scalability............................................................................... 186
8.9.1 RIP............................................................................. 187
8.9.2 OSPF......................................................................... 187
8.9.3 Intermediate System to Intermediate
System (IS-IS)........................................................... 188
8.9.4 EIGRP....................................................................... 188
8.10 Web Services: Introduction.................................................... 189
8.11 Web Services: Representational State Transfer (REST)......... 190
8.11.1 Client-Server Architecture......................................... 191
8.11.2 Statelessness.............................................................. 191
8.11.3 Cacheability............................................................... 191
8.11.4 Layered System......................................................... 191
8.11.5 Code on Demand....................................................... 191
8.11.6 Uniform Interface...................................................... 191
8.11.6.1 Resource Identification in Requests......... 192
8.11.6.2 Resource Manipulation through
Representations........................................ 192
8.11.6.3 Self-Descriptive Messages....................... 192
8.11.6.4 Hypermedia as the Engine of
Application State (HATEOAS)................ 192
8.11.7 Relationship between URL and HTTP Methods....... 193
8.12 Web Services: Simple Object Access Protocol (SOAP)......... 194
8.13 Web Services and the Service Web......................................... 197
8.14 SOAP Messages..................................................................... 198
Notes................................................................................................. 200
References......................................................................................... 203

Chapter 9 Storage Services................................................................................ 205


9.1 Introduction............................................................................ 205
9.2 Virtual Environments.............................................................. 205
9.3 Cloud Computing................................................................... 207
9.4 Security and Governance for Cloud Services......................... 208
Notes................................................................................................. 209
References......................................................................................... 209

Chapter 10 Encryption......................................................................................... 211


10.1 Introduction............................................................................ 211
10.2 Encryption.............................................................................. 211
10.2.1 Ciphers and Cryptography........................................ 211
10.2.2 RSA and PGP Encryption......................................... 212
10.2.3 Steganography, Checksums and
Digital Signatures...................................................... 213
Notes................................................................................................. 214
References......................................................................................... 215

Chapter 11 Web Programming............................................................................. 217


11.1 Introduction............................................................................ 217
11.2 Strategies for Web Development............................................ 217
11.2.1 Design Style.............................................................. 218
11.3 HTML..................................................................................... 218
11.3.1 Static HTML............................................................. 219
11.4 Style Sheets – CSS................................................................. 222
11.4.1 The Class Selector..................................................... 224
11.4.2 Applying a Style Sheet.............................................. 225
11.4.3 Multiple Style Sheets................................................ 226
11.5 Dynamic HTML – Forms....................................................... 226
11.6 Dynamic HTML – JavaScript................................................. 227
11.7 Dynamic HTML – CGI.......................................................... 230
11.8 Server- and Client-Side Architecture...................................... 231
11.9 Server Files............................................................................. 232
11.10 Limiting Access...................................................................... 233
11.11 Interfacing with a Database.................................................... 234
11.12 Privacy and Security............................................................... 235
11.12.1 Web Sessions............................................................. 237
11.12.2 Cookies...................................................................... 238
Notes................................................................................................. 239
References......................................................................................... 240

Chapter 12 Data Exchange................................................................................... 241


12.1 Introduction............................................................................ 241
12.2 Parity and Hamming Codes.................................................... 241
12.2.1 Decide on the Number of Bits in the Codeword....... 242
12.2.2 Determine the Bit Positions of the Check Bits.......... 242
12.2.3 Determine Which Parity Bits Check
Which Positions......................................................... 242
12.2.4 Calculate the Values of the Parity Bits...................... 243
12.2.5 Using the Codeword to Correct an Error................... 243
12.3 JSON and XML...................................................................... 244
12.4 DICOM................................................................................... 246
12.4.1 Images as Data.......................................................... 247
12.4.2 Information Entities................................................... 248
12.4.3 Information Object Definitions................................. 249
12.4.4 Attributes................................................................... 251
12.4.4.1 Value Representations................................ 252
12.4.4.2 Sequence Attributes................................... 252
12.4.4.3 Private Attributes....................................... 254
12.4.4.4 Unique Identifiers...................................... 255
12.4.4.5 Attribute Example: Orientation................. 256
12.4.5 Standard Orientations................................................ 258
12.4.6 DICOM Associations................................................ 259
12.4.7 DICOM-RT............................................................... 259
12.5 HL7 (Health Level Seven)...................................................... 261
12.6 Fast Healthcare Interoperability Resources (FHIR)............... 266
Notes................................................................................................. 267
References......................................................................................... 269

Chapter 13 Hospital Information Systems and Interfaces................................... 271


13.1 Introduction............................................................................ 271
13.2 Data Retention........................................................................ 271
13.3 Hospital Information Systems and Interfaces......................... 271
13.4 Equipment Management Database Systems........................... 272
13.5 Device Tracking Systems....................................................... 273
13.6 Interfaces................................................................................ 275
Notes................................................................................................. 275
References......................................................................................... 275

Chapter 14 Backup............................................................................................... 277


14.1 Introduction............................................................................ 277
14.2 Replication.............................................................................. 277
14.3 Archiving................................................................................ 278
14.4 Resilience Using RAID.......................................................... 278
14.5 Business Continuity................................................................ 280
Notes................................................................................................. 280
Reference........................................................................................... 280

Chapter 15 Software Engineering........................................................................ 281


15.1 Introduction............................................................................ 281
15.2 Software.................................................................................. 282
15.2.1 Operating Systems..................................................... 282
15.2.1.1 Microsoft Windows.................................... 282
15.2.1.2 Unix........................................................... 283
15.2.1.3 Linux.......................................................... 283
15.2.1.4 iOS/macOS................................................ 284
15.2.1.5 General....................................................... 284
15.2.1.6 Paradigms.................................................. 285
15.3 The Software Lifecycle.......................................................... 287
15.3.1 Requirements Specification: Gathering and
Analysing User Requirements................................... 288
15.3.2 Software Design........................................................ 291
15.3.3 Coding....................................................................... 292
15.3.4 Testing....................................................................... 294
15.3.4.1 Acceptance Testing.................................... 300
15.3.5 Installation and Maintenance.................................... 301
15.4 Software Lifecycle Models..................................................... 302
15.4.1 Waterfall Model......................................................... 302
15.4.2 Incremental Model/Prototyping Model..................... 302
15.4.3 Spiral Model.............................................................. 304
15.4.4 Agile Methodology................................................... 304
15.5 Overview of Process Models and Their Importance.............. 306
15.5.1 Comparison of Process Models................................. 307
15.5.1.1 Joint Application Development................. 307
15.5.1.2 Assembling Reusable Components........... 308
15.5.1.3 Application Generation.............................. 309
15.6 Systems Design Methods....................................................... 309
15.6.1 Top-Down Example................................................... 311
Notes................................................................................................. 313
References......................................................................................... 314

Chapter 16 Software Quality Assurance.............................................................. 315


16.1 Introduction............................................................................ 315
16.1.1 Attributes................................................................... 315
16.1.2 Configuration Management and
Change Control......................................................... 316
16.1.3 Documentation.......................................................... 316
16.1.4 Hungarian Notation................................................... 317
16.1.5 Comments.................................................................. 318
16.2 Version Control....................................................................... 319


16.3 Software Tools and Automation for Testing........................... 320
16.3.1 Record and Playback................................................. 322
16.3.2 Web Testing............................................................... 322
16.3.3 Database Tests........................................................... 322
16.3.4 Data Functions........................................................... 323
16.3.5 Object Mapping......................................................... 323
16.3.6 Image Testing............................................................ 324
16.3.7 Test/Error Recovery................................................... 324
16.3.8 Object Name Map..................................................... 324
16.3.9 Object Identity Tool................................................... 324
16.3.10 Extensible Language................................................. 325
16.3.11 Environment Support................................................. 325
16.3.12 Integration................................................................. 325
16.4 Standards................................................................................ 326
16.4.1 IEC 601..................................................................... 326
16.4.2 The Medical Devices Directive................................. 327
16.4.3 The Medical Devices Regulations............................. 330
16.4.3.1 Scripts........................................................ 335
16.4.3.2 Brexit......................................................... 336
16.4.4 CE Marking............................................................... 337
16.4.5 Other Standards......................................................... 338
16.4.6 Process Standards...................................................... 339
16.4.6.1 ISO/IEC 62366-1: 2015 Medical
Devices – Part 1: Application of
Usability Engineering to Medical
Devices....................................................... 341
16.4.6.2 ISO 14971:2012 Application of Risk
Management to Medical Devices.............. 341
16.4.6.3 IEC 62304:2006/A1:2015 Medical
Device Software – Lifecycle Processes..... 341
16.4.6.4 ISO 13485: 2016 Medical Devices –
Quality Management Systems –
Requirements for Regulatory
Purposes..................................................... 342
16.4.7 Coding Standards...................................................... 342
16.4.8 Standards and Guidelines Issued by
Professional Bodies................................................... 343
16.5 Market..................................................................................... 343
Notes................................................................................................. 344
References......................................................................................... 346

Chapter 17 Project Management.......................................................................... 349


17.1 Introduction............................................................................ 349
17.2 Starting Off............................................................................. 350
17.3 Keeping It Going – Managing the Project.............................. 352
17.4 Stopping (The Hard Bit)......................................................... 352


17.5 Risk Management................................................................... 356
17.6 Team Management (Personnel and Technical)....................... 356
17.7 Project Planning (Resource and Technical)............................ 357
17.7.1 Quantifying the Resource Requirements:
Labour....................................................................... 359
17.7.2 Constructing a Resource Schedule............................ 360
17.8 Education and Training........................................................... 360
17.9 Cost Estimation...................................................................... 361
17.9.1 Tactical versus Strategic Purchasing Decisions........ 362
Notes................................................................................................. 362
References......................................................................................... 363

Chapter 18 Safety Cases...................................................................................... 365


18.1 Introduction............................................................................ 365
18.2 The Purpose of a Safety Case................................................. 365
18.3 The Structure of a Safety Case............................................... 366
18.3.1 Claims........................................................................ 366
18.3.2 Evidence.................................................................... 366
18.3.3 Argument................................................................... 366
18.3.4 Inference.................................................................... 367
18.3.5 The GSN Diagram..................................................... 368
18.4 Implementation of a Safety Case............................................ 369
18.5 Design for Assessment........................................................... 370
18.6 The Safety Case Lifecycle...................................................... 370
18.7 The Contents of a Safety Case................................................ 370
18.8 Hazard Log............................................................................. 371
18.8.1 The Therac-25 Incident............................................. 372
Notes................................................................................................. 374
References......................................................................................... 374

Chapter 19 Critical Path Analysis........................................................................ 375


19.1 Introduction............................................................................ 375
19.2 Planning Stage........................................................................ 375
19.3 Analysis Stage........................................................................ 377
19.3.1 The Forward Pass...................................................... 377
19.3.2 The Backward Pass................................................... 377
19.3.3 Float........................................................................... 378
19.4 Scheduling.............................................................................. 378
19.5 Control Stage.......................................................................... 379
Note.................................................................................................. 379
Appendix................................................................................................................ 381
List of Abbreviations........................................................................ 381
Index....................................................................................................................... 387
Acknowledgements
This book which you are holding has taken a lot of time to write. There is no way I’d
have been able to accomplish any of this without a lot of help and encouragement.
The first acknowledgement must therefore be to Rachel Ganney who foolishly
offered to turn a set of lecture notes into a book (which we self-published) and then
even more foolishly offered to do a second (longer) set, which are pretty much the
ones you have before you. Without her eye for detail, this book would still be on the
“good idea – maybe one day” pile. She also drew some of the figures, re-wrote some
of the text and generally put up with me when it wasn’t going well. I’m sure there
ought to be a paragraph in the marriage vows about this sort of thing.
I must also thank those with whom I have co-authored articles and book chapters,
which I have adapted for inclusion here: specifically, Sandhya Pisharody, Phil
Cosgriff, Allan Green, Richard Trouncer, David Willis, Patrick Maw and Mark
White. James Moggridge deserves specific thanks for inspiring and proof-reading the
“NoSQL” chapter. I should also thank those who invited me to write for or with
them, which encouraged me greatly. Usman Lula and Azzam Taktak join Richard
from the previous list. Finally, I should thank my students, who have asked difficult
questions (which made me work out the answers, some of which you have here) and
laughed at the jokes. The ones they didn’t laugh at have been retired.

1 Data Structures

1.1 INTRODUCTION
At some point in the construction and development of programs, the question of
what to do with the data¹ is going to arise. Issues such as the order in which data is
required, whether it will change, and whether new items will appear and old ones will
need to be removed determine the shape of the data and, as such, the structure in
which it is best stored.
A data structure defines how data is to be stored and how the individual items of
data are to be related or located, if at all, in a way that is convenient for the problem
at hand and easy to use. To return to an oft-quoted² equation:

Information = Data + Structure

There are many ways of arranging data, some of which we will examine here. The
most important step is to select the model most appropriate to the data’s use: e.g. a
dictionary arranged in order of the number of letters in each word isn’t as useful as
one with the words in alphabetical order, even though it is still a valid sort (and, indeed,
this is how a Scrabble dictionary is arranged).

1.2 ARRAYS
An array is the simplest form of data structure and exists as a part of most (if not all)
high-level programming languages. An array is simply a collection of similar data
items, one after another, referred to by the same identifier. Different items within
the array are referenced by means of a number, called the index, which is the item’s
position in the array. There is no relationship between the value of the data item and
its position in the array, unless one is imposed by the programmer.
A 1-dimensional array is often referred to as a vector and can be thought of as a
simple list, such as in Figure 1.1.
An array’s logical structure (which item follows which) mirrors its physical
structure, which aids its simplicity but reduces its flexibility, as we shall see.
Most high-level languages start array indices at 0, which also points towards the
memory storage model: a 1D array of 32-bit integers reserves 4 bytes per integer, so
a 10-element array such as this example reserves 40 bytes. Element 7 is thus stored
at the base address of the array plus 7 * 4 = 28 bytes. For this reason, programs
written in languages that don’t check whether the programmer has exceeded the array
bounds (e.g. C) will still compile and run; they’ll just overwrite (or read from)
memory that they weren’t supposed to, with unpredictable results.
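
The offset arithmetic above can be sketched in a few lines of Python. This is only an illustration: the names `values` and `byte_offset` are invented here, and the 4-byte item size is the assumption made in the text (Python's `array` module reports the real size for the platform via `itemsize`).

```python
from array import array

# A 10-element integer array standing in for the vector of Figure 1.1
# (the contents are placeholders, not the values in the figure).
values = array("i", range(10))


def byte_offset(index: int, item_size: int) -> int:
    """Distance in bytes from the base address to element `index` (0-based)."""
    return index * item_size


print(values.itemsize)      # bytes per element on this platform (typically 4)
print(byte_offset(7, 4))    # 28, as in the text
print(len(values) * 4)      # 40 bytes reserved for 10 four-byte integers

# Unlike C, Python checks the bounds and refuses the access.
try:
    values[10]
except IndexError as err:
    print("out of bounds:", err)
```

In C the equivalent out-of-range access would simply read whatever happens to lie 40 bytes beyond the base address.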

DOI: 10.1201/9781003316244-1

FIGURE 1.1 A 1D array, or vector.

FIGURE 1.2 A tiny picture of a cat in greyscale, from an original in colour.

An obvious extension to the 1D array is a multi-dimensional one. A 2D array has a
similar structure to the vector above, but with additional columns, forming a table.
This enables related data to be kept together, e.g. a set of prices and discounts, or high
and low values for physiological measurements. Another common usage is in holding
pixel values for an image. For a greyscale image, this would be a 2D array as shown
in Figure 1.3 (taken from Figure 1.2), and for a colour image a 3D array is required as
shown in Figure 1.4 (taken from the colour original of Figure 1.2).
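
As a rough sketch of the difference between the two image layouts, the Python below builds a tiny 2D greyscale array and a 3D colour array from it. The pixel values are invented for illustration and are not those of Figures 1.3 and 1.4; the index order `[row][column][channel]` is likewise an assumption.

```python
# Hypothetical 3 x 4 greyscale image: grey[row][column], 0 = black, 255 = white.
grey = [
    [255, 255, 0, 255],
    [255, 0, 0, 255],
    [0, 0, 0, 0],
]

# A colour image needs a third index: colour[row][column][channel], with
# channels 0, 1, 2 holding R, G, B. Equal R, G and B values give a grey pixel.
colour = [[[value, value, value] for value in row] for row in grey]

rows, columns, channels = len(colour), len(colour[0]), len(colour[0][0])
print(rows, columns, channels)   # 3 4 3
print(grey[1][2])                # one number per greyscale pixel
print(colour[1][2])              # three numbers per colour pixel
```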
Taking our earlier example from Figure 1.1, we can extend the vector into an array
as shown in Figure 1.5.
Note that in this example we have stored different data types in the columns. Most
programming languages won’t allow this, so the data will normally have to be
converted (e.g. to strings), which in this case (a set of addresses) would be appropriate
anyway. Our simple storage calculation no longer holds though, as the strings are of
variable length. This is usually overcome by storing pointers to the data in the array,
with the actual data being stored elsewhere.
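
One way to picture the pointer approach is a table of fixed-size (offset, length) entries that refer into a separate block of bytes. The sketch below is an illustration only: `pack`, `fetch` and the address fields are hypothetical, and real languages implement the idea with memory addresses rather than offsets.

```python
def pack(strings):
    """Store variable-length strings in one byte block.

    Returns the block plus a table of fixed-size (offset, length) entries,
    one per string, which plays the role of the array of pointers.
    """
    blob = bytearray()
    table = []
    for text in strings:
        encoded = text.encode("utf-8")
        table.append((len(blob), len(encoded)))
        blob.extend(encoded)
    return bytes(blob), table


def fetch(blob, table, index):
    """Follow entry `index` of the table to recover the original string."""
    offset, length = table[index]
    return blob[offset:offset + length].decode("utf-8")


# Hypothetical address fields, not the contents of Figure 1.5.
blob, table = pack(["12", "High Street", "Bridlington"])
print(table)                   # [(0, 2), (2, 11), (13, 11)]
print(fetch(blob, table, 2))   # Bridlington
```

Because every table entry is the same size, the simple index-times-size calculation works again for the table, even though the strings themselves vary in length.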

FIGURE 1.3 Rows 1–3 of the greyscale array, the tips of the ears.
Note that 0=black and 255=white.

FIGURE 1.4 Rows 1–3 of the colour picture (the tips of the ears), as R, G, B, showing how
a 3D array is now required. The bottom two rows with a dark grey background are red values,
the light grey background shows the green values and the mid-grey background (top row and
right-hand side) shows the blue values.
Note that 0 = no colour and 255 = maximum colour.

FIGURE 1.5 A 2D array.

So far we have data but little information, as the structure we have only really
allows data to be “thrown in” as it appears. Extracting data from such a structure is a
simple task but not an efficient one. Therefore arrays like this are usually kept in
order, which works well when the data changes little (if at all).³ However, inserting
and deleting data in such a structure necessitates a lot of data movement, either to
create new rows (moving everything below them down) or to remove old ones (moving
all the data below them “up one”).
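
The cost of keeping an array in order can be seen by counting how many items have to move when one is inserted. The following is a minimal sketch with an invented function name (`insert_sorted`) and made-up values; it shifts items one place at a time, as a plain array would require.

```python
def insert_sorted(items, new_item):
    """Insert into an ascending list in place; return how many items moved."""
    items.append(new_item)          # make room for one more element
    position = len(items) - 1
    moved = 0
    # Shift every larger item one place along to open up a gap.
    while position > 0 and items[position - 1] > new_item:
        items[position] = items[position - 1]
        position -= 1
        moved += 1
    items[position] = new_item
    return moved


data = [10, 20, 30, 40]
print(insert_sorted(data, 15))   # 3 items had to move
print(data)                      # [10, 15, 20, 30, 40]
```

Inserting near the front of a long array moves almost every item, which is why this layout suits data that changes little.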
In order to extract information from our example array, we could sort it each time;
e.g. to find all the addresses in Bridlington, we might sort on column “Value 3”, giv-
ing rise to Figure 1.6.

FIGURE 1.6 The previous array sorted on the column “Value 3”.
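
Sorting on a column and then picking out the matching rows might look like the sketch below. The rows are hypothetical stand-ins for Figure 1.5, and treating “Value 3” as the third column (index 2) is an assumption.

```python
from operator import itemgetter

# Hypothetical rows with columns Value 1, Value 2, Value 3.
rows = [
    ("3", "Mill Lane", "Scarborough"),
    ("12", "High Street", "Bridlington"),
    ("7", "Station Road", "Hull"),
    ("1", "Quay Road", "Bridlington"),
]
VALUE_3 = 2  # 0-based index of the column being sorted on

# Sort the whole table on the chosen column, then pick out the matches.
by_town = sorted(rows, key=itemgetter(VALUE_3))
in_bridlington = [row for row in by_town if row[VALUE_3] == "Bridlington"]

for row in by_town:
    print(row)
print(len(in_bridlington))   # 2
```

Re-sorting for every query is simple but repeats a lot of work.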
