Unit 4 HCA
Introduction to Deep Learning – DFF Network – CNN – RNN for Sequences – Biomedical Image
and Signal Analysis – Natural Language Processing and Data Mining for Clinical Data – Mobile
Imaging and Analytics – Clinical Decision Support System.
In traditional machine learning, the effectiveness of an algorithm depends very much on how insightful the programmer is.
Deep learning is a subset of machine learning that is essentially a neural network with three or
more layers. These neural networks attempt to mimic the human brain—albeit far from matching
its ability—allowing systems to “learn” from large amounts of data, cluster it, and make
predictions with incredible accuracy.
Deep learning models are capable of learning to focus on the right features by themselves,
requiring little guidance from the programmer.
Basically, deep learning mimics the way our brain functions, i.e. it learns from experience. As
you know, our brain is made up of billions of neurons that allow us to do amazing things. Even
the brain of a one-year-old child can solve complex problems that are very difficult even for
super-computers, for example:
• recognize the faces of its parents as well as different objects;
• discriminate between different voices and even recognize a particular person by his/her voice;
• draw inferences from the facial gestures of other persons; and many more.
Deep learning uses the concept of artificial neurons that function in a manner similar to the
biological neurons present in our brain. Therefore, we can say that deep learning is a subfield
of machine learning concerned with algorithms inspired by the structure and function of the brain,
called artificial neural networks. Now, let us take an example to understand it. Suppose we want
to build a system that can recognize the faces of different people in an image. If we solve this as a
typical machine learning problem, we must first define facial features such as eyes, nose, ears, etc.
by hand; the system then only learns which of these hand-crafted features are more important for
which person.
Deep learning takes this one step further: deep neural networks automatically find the features
that are important for classification, whereas in machine learning we had to define these features
manually.
The inspiration for deep learning is the way the human brain filters information. Its main
motive is to simulate human-like decision making. Neurons in the brain pass signals to
perform actions. Similarly, artificial neurons connect in a neural network to perform tasks such as
clustering, classification, or regression. The neural network sorts unlabeled data according to
the similarities in the data. That is the idea behind a deep learning algorithm.
a) Input layer
b) Hidden layer
c) Output layer
Input Layer
• The input layer receives the raw input features and passes them on to the hidden layer.
Hidden Layer
• The hidden layers lie between the input and output layers and transform their inputs using weights and activation functions.
• The “deep” in deep learning refers to having more than one hidden layer.
Output Layer:
• The output layer produces the final prediction of the network.
Weight:
The connections between neurons are called weights, which are numerical values. The weights
between neurons determine the learning ability of the neural network. During the learning of an
artificial neural network, the weights between neurons change. Initial weights are set randomly.
Transfer Function
The transfer function translates the input signals to output signals. Four types of transfer
functions are commonly used: unit step (threshold), sigmoid, piecewise linear, and Gaussian.
Sigmoid
The sigmoid function is an S-shaped curve that maps any real-valued input smoothly into the range 0 to 1.
Piecewise Linear
The output is proportional to the input within a limited range and saturates at the minimum or maximum value outside that range.
Gaussian
Gaussian functions are bell-shaped curves that are continuous. The node output (high/low) is
interpreted in terms of class membership (1/0), depending on how close the net input is to a
chosen average (mean) value.
Linear
Like linear regression, a linear activation function transforms the weighted sum of the inputs of
the neuron to an output using a linear function.
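To make these transfer functions concrete, here is a minimal Python/NumPy sketch; the threshold, the linear range, and the Gaussian mean and width are illustrative assumptions rather than values prescribed above.

```python
import numpy as np

# Sketches of the four transfer functions named above; the threshold, linear
# range, and Gaussian parameters are illustrative choices.
def unit_step(x, threshold=0.0):
    # Outputs 1 when the input reaches the threshold, otherwise 0.
    return np.where(x >= threshold, 1.0, 0.0)

def sigmoid(x):
    # Smoothly squashes any real input into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def piecewise_linear(x):
    # Proportional to the input between -1 and 1, saturating at 0 and 1 outside.
    return np.clip(0.5 * (x + 1.0), 0.0, 1.0)

def gaussian(x, mean=0.0, sigma=1.0):
    # Bell-shaped response: high output when the input is close to the mean.
    return np.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))

x = np.linspace(-3.0, 3.0, 7)
for f in (unit_step, sigmoid, piecewise_linear, gaussian):
    print(f.__name__, np.round(f(x), 3))
```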
Activation Function
The activation function decides whether a neuron should be activated or not by calculating the
weighted sum of its inputs and adding a bias to it.
For the hidden layer, the pre-activation and activation are computed as
z(1) = W(1) X + b(1)
a(1) = z(1)
Here, z(1) is the vectorized output of layer 1, W(1) is the vectorized weight matrix assigned to the
neurons of the hidden layer (i.e., w1, w2, w3 and w4), X is the vectorized input features (i.e., i1
and i2), b(1) is the vectorized bias assigned to the neurons of the hidden layer (i.e., b1 and b2),
and a(1) is the vectorized form of the (linear) activation.
• For the output layer, similarly, z(2) = W(2) a(1) + b(2) and a(2) = z(2), i.e., a linear activation
is applied.
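As a concrete illustration of these equations, the following is a minimal NumPy sketch of the forward pass; the layer sizes and the example weights, biases, and inputs are purely illustrative.

```python
import numpy as np

# A minimal sketch of the forward pass described above. The example weights,
# biases, and inputs are illustrative; as in the text, a linear (identity)
# activation is used, but a nonlinearity such as a sigmoid could be swapped in.
def forward(X, W1, b1, W2, b2, activation=lambda z: z):
    z1 = W1 @ X + b1            # z(1) = W(1) X + b(1)
    a1 = activation(z1)         # a(1) = z(1) for a linear activation
    z2 = W2 @ a1 + b2           # z(2) = W(2) a(1) + b(2)
    a2 = z2                     # a(2) = z(2)
    return a2

X = np.array([[0.5], [0.2]])                 # input features i1, i2
W1 = np.array([[0.1, 0.4], [0.3, 0.7]])      # hidden-layer weights w1..w4
b1 = np.array([[0.01], [0.02]])              # hidden-layer biases b1, b2
W2 = np.array([[0.2, 0.6]])                  # output-layer weights
b2 = np.array([[0.0]])                       # output-layer bias
print(forward(X, W1, b1, W2, b2))            # the network's prediction
```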
• In an FFNN we obtain the output of the hidden layer (which serves as input to the next layer) by
applying the activation function; for this, we only need the input vector and the weight matrix.
• The unfolded architecture of an RNN can be altered as per the requirement: for a sentiment
classification task, for example, we can have multiple inputs and a single output,
• while in the case of language generation models we need to have multiple inputs and
multiple outputs. RNNs can also be stacked together for some special use cases.
Types of Recurrent Neural Networks
There are four types of Recurrent Neural Networks:
➢ One to One
➢ One to Many
➢ Many to One
➢ Many to Many
One to One RNN
This type of neural network is known as the vanilla neural network. It is used for general
machine learning problems that have a single input and a single output.
One to Many RNN
This type of neural network has a single input and multiple outputs. An example of this is
image captioning, where one image yields a sequence of words.
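To illustrate how the same recurrent cell is unrolled over a sequence in the architectures listed above, here is a minimal NumPy sketch; all sizes and weights are illustrative, and no training is shown.

```python
import numpy as np

# A minimal recurrent-cell sketch (all sizes and weights are illustrative).
# The same cell is unrolled over the input sequence; returning only the last
# output gives a many-to-one RNN, returning every step gives many-to-many.
def rnn_forward(inputs, Wxh, Whh, Why, bh, by, return_all=False):
    h = np.zeros((Whh.shape[0], 1))            # initial hidden state
    outputs = []
    for x in inputs:                            # one step per sequence element
        h = np.tanh(Wxh @ x + Whh @ h + bh)     # recurrent state update
        outputs.append(Why @ h + by)            # output at this time step
    return outputs if return_all else outputs[-1]

rng = np.random.default_rng(0)
Wxh, Whh, Why = rng.normal(size=(5, 3)), rng.normal(size=(5, 5)), rng.normal(size=(2, 5))
bh, by = np.zeros((5, 1)), np.zeros((2, 1))
sequence = [rng.normal(size=(3, 1)) for _ in range(4)]             # 4 steps, 3 features each
print(rnn_forward(sequence, Wxh, Whh, Why, bh, by).shape)          # many-to-one: (2, 1)
print(len(rnn_forward(sequence, Wxh, Whh, Why, bh, by, True)))     # many-to-many: 4 outputs
```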
Natural language processing (NLP) is the ability of a computer program to understand human
language as it is spoken and written -- referred to as natural language. It is a component of
artificial intelligence (AI).
Electronic health records (EHRs) of patients are major sources of clinical information that are
critical to the improvement of health care processes. An automated approach for retrieving
information from these records is highly challenging due to the complexity involved in
converting clinical text that is available as free text into a structured format. Natural language
processing (NLP) and data mining techniques are capable of processing a large volume of clinical
text (textual patient reports).
The input to an NLP system is the unstructured natural text that is extracted from the patient's
medical record and sent to the report analyzer.
Report Analyzer:
Clinical text differs from biomedical text in the possible use of pseudo-tables, i.e.,
natural text formatted to appear as tables, medical abbreviations, and punctuation in addition
to the natural language. The text is normally dictated and transcribed by a person or by speech
recognition software and is usually available in free-text format. Some clinical texts are even
available in image or graph format, which is unstructured.
As a result, NLP processing techniques are applied to convert the unstructured free-text into a
structured format.
The first and foremost task of the report analyzer is to preprocess the clinical input text by applying
NLP methodologies. The major preprocessing tasks in clinical NLP include text segmentation,
handling of text irregularities, domain-specific abbreviations, and missing punctuation.
Text Analyzer
The text analyzer is the most important module in clinical text processing; it extracts the clinical
information from free text and makes it compatible with database storage. The syntactic and
semantic interpreter component of the text analyzer generates a deeper structure, such as
constituent or dependency tree structures, to capture the clinical information present in the text.
Conversion rules or ML algorithms encode the clinical information from the deep tree
structures. An advantage of the rule-based approach is that the predefined patterns are expert-
curated and highly specific. The database handler and inference rules component generates a
processed form of data from the database storage.
MORPHOLOGICAL ANALYSIS:
Stop word removal – removes unwanted words such as punctuation, articles, etc.
Stemming – the process of reducing a word to its base form; for example, the base form of
“took” is “take”, i.e., the word “took” is derived from “take”.
o Bigram – processes two words at a time (and so on for higher n-grams); from this we can
estimate the probability of a word (see the sketch below).
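A minimal sketch of these morphological preprocessing steps using the NLTK library is shown below; the example sentence is invented, and the stemmer used (Porter) performs suffix stripping rather than full lemmatization.

```python
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.util import bigrams

# One-time downloads: nltk.download("punkt"), nltk.download("stopwords")
text = "The patient took two tablets and reported mild nausea."

tokens = nltk.word_tokenize(text.lower())
# Stop word removal: drop articles, punctuation and other uninformative tokens.
content = [t for t in tokens if t.isalpha() and t not in stopwords.words("english")]
# Stemming: strip suffixes to approximate the base form (e.g. "reported" -> "report").
stems = [PorterStemmer().stem(t) for t in content]
# Bigrams: adjacent word pairs, usable to estimate word probabilities.
print(stems)
print(list(bigrams(stems)))
```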
Research in NLP for the clinical domain makes computers understand free-form clinical text
for automatic extraction of clinical information. The general aims of clinical NLP
include the theoretical investigation of human language, to explore the details of language from a
computer-implementation point of view, and more natural man–machine communication,
aiming at producing a practical automated system. Due to the complex nature of the clinical text,
the analysis is carried out in many phases such as morphological analysis, lexical analysis,
syntactic analysis, semantic analysis, and data encoding.
LEXICAL ANALYSIS:
The words or phrases in the text are mapped to the relevant linguistic information such as
syntactic information, i.e., noun, verb, adverb, etc., and semantic information i.e., disease,
procedure, body part, etc. Lexical analysis is achieved with a special dictionary called a lexicon,
which provides the necessary rules and data for carrying out the linguistic mapping. The
development and maintenance of a lexicon require extensive knowledge engineering effort.
The National Library of Medicine (NLM) maintains the SPECIALIST Lexicon, which provides
comprehensive syntactic information associated with both medical and English terms.
Semantic Analysis
Semantic analysis is used to check whether a sentence is meaningful. It finds the important tokens and
their base words, and the part of speech of each word (the latter is done in lexical analysis). It then
checks whether two words that occur together in a sentence make sense. This is done by mapping the
syntactic structure to objects in the domain.
It determines the words or phrases in the text that are clinically relevant, and extracts their
semantic relations. The natural language semantics consists of two major features:
• the representation of the meaning of a sentence, which allows the possible
manipulations (particularly inference);
• relating these representations to the part of the linguistic model that deals with
structure (grammar or syntax).
The semantic analysis uses the semantic model of the domain, or ontology, to structure and
encode the information from the clinical text. The semantic model is either frame oriented or
based on conceptual graphs. The generated structured output of the semantic analysis is subsequently
used by other automated processes.
SYNTACTIC ANALYSIS:
The word “syntax” refers to the study of formal relationships between words in the text.
Grammatical knowledge and parsing techniques are the key elements used to perform syntactic
analysis. The context free grammar (CFG) is the most common grammar used for syntactic
analysis. CFG is also known by various other terms including phrase structure grammar (PSG)
and definite clause grammar (DCG). The syntactic analysis is done by using two basic parsing
techniques called top-down parsing and bottom-up parsing to assign POS tags (e.g., noun, verb,
adjective, etc.) to the sequence of tokens that form a sentence and to determine the structure of
the sentence through parsing tools.
DATA ENCODING:
The process of mining information from EHR requires coding of data that is achieved either
manually or by using NLP techniques to map free-text entries with an appropriate code. The
coded data is classified and standardized for storage and retrieval purposes in clinical research.
Manual coding is normally facilitated with search engines or pick-up list.
The use of data mining in healthcare is being adopted by organizations with a focus on
optimizing the efficiency and quality of their predictive analytics.
In the healthcare industry specifically, data mining can be used to decrease costs by increasing
efficiencies, improve patient quality of life, and perhaps most importantly, save the lives of more
patients.
Text mining in clinical domain is usually more difficult than general domains (e.g. newswire
reports and scientific literature) because of the high level of noise in both the corpus and training
data for machine learning (ML). Healthcare systems and specifically health record systems
contain both structured and unstructured information as text.
Clinical text mining is a subfield of biomedical NLP that determines the classes of information found in
clinical text that are useful to basic biological scientists and clinicians for providing better health care.
More specifically, it is estimated that over 40% of the data in healthcare record systems contains
text, so-called clinical text, sometimes also called electronic patient record text.
Clinical text contains valuable information about symptoms, diagnoses, treatments, drug use and
adverse (drug) events for the patient that can be utilized to improve healthcare for other patients.
However, clinical text also contains sensitive information such as personal names, telephone
numbers and addresses of the patient and relatives. This information needs to be pseudonymized
before the clinical text can be utilized for secondary use.
The electronically stored details of patients' health records support text mining and data mining
techniques that uncover information on health, disease, and treatment response. A
significant chunk of the information in EHR and CDA documents is text, and extraction of such
information by conventional data mining methods is not possible. The semi-structured and
unstructured data in the clinical text, and even certain categories of test results such as
echocardiograms and radiology reports, can be mined for information by utilizing both data
mining and text mining techniques.
Information extraction
Information extraction (IE) is a specialized field of NLP for extracting predefined types of
information from the natural text. It is defined as the process of discovering and extracting
knowledge from the unstructured text.
IE differs from information retrieval (IR) that is meant to be for identifying and retrieving
relevant documents. In general, IR returns documents and IE returns information or facts.
A typical IE system for the clinical domain is a combination of components such as tokenizer,
sentence boundary detector, POS tagger, morphological analyzer, shallow parser, deep parser
(optional), gazetteer, named entity recognizer, discourse module, template extractor, and template
combiner.
A careful modeling of relevant attributes with templates is required for the performance of high
level components such as discourse module, template extractor, and template combiner. The high
level components always depend on the performance of the low level modules such as POS
tagger, named entity recognizer, etc.
IE for clinical domain is meant for the extraction of information present in the clinical text. The
Linguistic String Project–Medical Language Processor (LSP–MLP), and Medical Language
Extraction and Encoding system (MedLEE) are the commonly adopted systems to extract
UMLS concepts from clinical text.
Preprocessing
The primary source of information in the clinical domain is the clinical text written in natural
language. However, the rich contents of the clinical text are not immediately accessible by the
clinical application systems that require input in a more structured form. An initial module
adopted by various clinical NLP systems to extract information is the preliminary preprocessing
of the unstructured text to make it available for further processing. The most commonly used
preprocessing techniques in clinical NLP are spell checking, word sense disambiguation, POS
tagging, and shallow and deep parsing.
Spell Checking
The rate of misspelling in clinical text is reported to be much higher than in other types of text. In
addition to traditional spell checkers, various research groups have come up with a variety of
methods for spell checking in the clinical domain: UMLS-based spell-checking error correction
tool and morpho-syntactic disambiguation tools.
Word Sense Disambiguation
The process of understanding the sense of a word in a specific context is termed word sense
disambiguation. The supervised ML classifiers and the unsupervised approaches automatically
perform the word sense disambiguation for biomedical terms.
POS Tagging
An important preprocessing step adopted by most NLP systems is POS tagging, which reads
the text and assigns the parts of speech tag to each word or token of the text. POS tagging is the
annotation of words in the text to their appropriate POS tags by considering the related and
adjacent words in a phrase, sentence, and paragraph. POS tagging is the first step in syntactic
analysis and finds its application in IR, IE, word sense disambiguation, etc. POS tags are a set of
word categories based on the role that words may play in the sentence in which they appear. The
most common set contains seven different tags: Article, Noun, Verb, Adjective, Preposition,
Number, and Proper Noun.
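A minimal POS-tagging sketch using NLTK is shown below; the example sentence is invented, and the tagger produces Penn Treebank tags rather than the seven-tag set mentioned above.

```python
import nltk

# One-time downloads: nltk.download("punkt"), nltk.download("averaged_perceptron_tagger")
sentence = "The patient denies chest pain but reports shortness of breath."
tokens = nltk.word_tokenize(sentence)
# Each token is annotated with its part-of-speech tag.
print(nltk.pos_tag(tokens))
# e.g. [('The', 'DT'), ('patient', 'NN'), ('denies', 'VBZ'), ...]
```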
Shallow and Deep Parsing
Parsing is the process of determining the complete syntactic structure of a sentence or a string of
symbols in a language. A parser is a tool that converts an input sentence into an abstract syntax tree,
such as a constituent tree or dependency tree, whose leaves correspond to the words of the
given sentence and whose internal nodes represent grammatical tags such as noun, verb, noun
phrase, verb phrase, etc. Most parsers apply ML approaches such as PCFGs (probabilistic
context-free grammars), as in the Stanford lexical parser [50], and even maximum entropy and
neural network models.
A few parsers even use lexical statistics by considering the words and their POS tags. Such parsers
are prone to overfitting, which requires additional smoothing. An alternative to the
overfitting problem is to apply shallow parsing, which splits the text into nonoverlapping word
sequences or phrases, such that syntactically related words are grouped together. The word
phrase represents the predefined grammatical tags such as noun phrase, verb phrase,
prepositional phrase, adverb phrase, subordinated clause, adjective phrase, conjunction phrase,
and list marker. The benefits of shallow parsing are the speed and robustness of processing.
Parsing is generally useful as a preprocessing step in extracting information from the natural text.
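A minimal shallow-parsing (chunking) sketch using NLTK's RegexpParser is shown below; the noun-phrase grammar and the example sentence are illustrative assumptions.

```python
import nltk

# A simple noun-phrase grammar: optional determiner, any adjectives, one or
# more nouns. Syntactically related words are grouped into non-overlapping chunks.
grammar = "NP: {<DT>?<JJ>*<NN.*>+}"
chunker = nltk.RegexpParser(grammar)

sentence = "The elderly patient received a low dose of aspirin."
tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
tree = chunker.parse(tagged)   # noun phrases become NP subtrees, the rest stays flat
tree.pprint()
```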
Context-Based Extraction
The fundamental step for a clinical NLP system is the recognition of medical words and phrases
because these terms represent the concepts specific to the domain of study and make it possible
to understand the relations between the identified concepts. Even highly sophisticated systems of
clinical NLP include the initial processing of recognizing medical words and phrases prior to the
extraction of information of interest. While IE from the medical and clinical text can be carried
out in many ways, this section explains the five main modules of IE.
Concept Extraction
Extracting concepts (such as drugs, symptoms, and diagnoses) from clinical narratives
constitutes a basic enabling technology to unlock the knowledge within and support more
advanced reasoning applications such as diagnosis explanation, disease progression modeling,
and intelligent analysis of the effectiveness of treatment. The first and foremost module in
clinical NLP following the initial text preprocessing phase is the identification of the boundaries
of the medical terms/phrases and understanding the meaning by mapping the identified
term/phrase to a unique concept identifier in an appropriate ontology. The recognition of clinical
entities can be achieved by a dictionary-based method using the UMLS Metathesaurus, rule-
based approaches, statistical method, and hybrid approaches. The identification and extraction of
entities present in the clinical text largely depends on the understanding of the context. For
example, the recognition of diagnosis and treatment procedures in the clinical text requires the
recognition and understanding of the clinical condition as well as the determination of its
presence or absence. The contextual features related to clinical NLP are negation (absence of a
clinical condition), historicity (the condition had occurred in the recent past and might occur in
the future), and experiencer (the condition related to the patient). While many algorithms are
available for context identification and extraction, it is recommended to detect the degree of
certainty in the context.
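As a toy illustration of the dictionary-based method, the following sketch matches clinical phrases against a small hand-made dictionary of concept identifiers; the entries and CUI-style codes are included only for illustration, whereas a real system would look up the UMLS Metathesaurus.

```python
# A toy dictionary-based concept extractor: phrases are mapped to concept
# identifiers. The entries and CUI-style codes below are for illustration only.
CONCEPT_DICTIONARY = {
    "chest pain": "C0008031",
    "shortness of breath": "C0013404",
    "aspirin": "C0004057",
}

def extract_concepts(text):
    text = text.lower()
    # Return every dictionary phrase found in the note with its identifier.
    return [(phrase, cui) for phrase, cui in CONCEPT_DICTIONARY.items() if phrase in text]

note = "Patient presents with chest pain and was given aspirin."
print(extract_concepts(note))   # [('chest pain', 'C0008031'), ('aspirin', 'C0004057')]
```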
Association Extraction
Clinical text is a rich source of information on patients' conditions and their treatments, with
additional information on potential medication allergies, side effects, and even adverse effects.
Information contained in clinical records is of value for both clinical practice and research;
however, text mining from clinical records, particularly from narrative-style fields (such as
discharge summaries and progress reports), has proven to be an elusive target for clinical Natural
Language Processing (clinical NLP), due in part to the lack of availability of annotated corpora
specific to the task. Yet, the extraction of concepts (such as mentions of problems, treatments,
and tests) and the association between them from clinical narratives constitutes the basic
enabling technology that will unlock the knowledge contained in them and drive more advanced
reasoning applications such as diagnosis explanation, disease progression modeling, and
intelligent analysis of the effectiveness of treatment.
Negation
“Negation” is an important context that plays a critical role in extracting information from the
clinical text. Many NLP systems incorporate a separate module for negation analysis in text
preprocessing. However, negation identification has gained much interest in the NLP research
community in recent years. As a result, explicit negation detection systems such as NegExpander
and Negfinder, a specific system for extracting SNOMED-CT concepts, negation identification
algorithms such as NegEx, which uses regular expressions to identify negation, and a hybrid
approach based on regular expressions and grammatical parsing have been developed by
dedicated research groups. While the NegExpander
program identifies the negation terms and then expands to the related concepts, Negfinder is a
more complex system that uses indexed concepts from UMLS and regular expressions along
with a parser using LALR (look-ahead left-recursive) grammar to identify the negations.
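The following is a heavily simplified, NegEx-style sketch that flags a concept as negated when a trigger term appears shortly before it; the trigger list and window size are illustrative assumptions and do not reproduce the published NegEx algorithm.

```python
import re

# Flag a concept as negated when a negation trigger occurs within a few
# tokens before it. The trigger list and window size are illustrative.
NEGATION_TRIGGERS = r"\b(no|denies|denied|without|negative for|absence of)\b"

def is_negated(sentence, concept, window=5):
    tokens = sentence.lower().split()
    concept_tokens = concept.lower().split()
    for i in range(len(tokens) - len(concept_tokens) + 1):
        if tokens[i:i + len(concept_tokens)] == concept_tokens:
            preceding = " ".join(tokens[max(0, i - window):i])
            return re.search(NEGATION_TRIGGERS, preceding) is not None
    return False

print(is_negated("The patient denies chest pain on exertion.", "chest pain"))  # True
print(is_negated("The patient reports chest pain at rest.", "chest pain"))     # False
```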
Extracting Codes
Extracting codes is a popular approach that uses NLP techniques to extract the codes mapped to
controlled sources from clinical text. The most common codes dealing with diagnoses are the
International Classification of Diseases (ICD) versions 9 and 10 codes. The ICD is designed to
promote international comparability in the collection, processing, classification and presentation
of mortality statistics.
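A minimal sketch of code extraction with a regular expression is shown below; the pattern covers the common ICD-10 shape (a letter, two digits, and an optional subcode) and is illustrative rather than a full validator.

```python
import re

# The pattern covers the common ICD-10 shape: a letter, two digits, and an
# optional dot followed by up to four alphanumeric subcode characters.
ICD10_PATTERN = re.compile(r"\b[A-TV-Z][0-9]{2}(?:\.[0-9A-Z]{1,4})?\b")

note = "Discharge diagnoses: E11.9 type 2 diabetes mellitus, I10 hypertension."
print(ICD10_PATTERN.findall(note))   # ['E11.9', 'I10']
```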
• Preprocessing of texts, such as tokenisation and text segmentation.
•Stemming: Stemming is a natural language processing technique that lowers inflection in words
to their root forms, hence aiding in the preprocessing of text, words, and documents for text
normalization.
•Compound splitting: Dealing with word compounding in statistical machine translation (SMT)
is essential to mitigate the sparse data problems that productive word generation causes. There
are several issues that need to be addressed: splitting compound words into their correct
components (i.e. disambiguating between split points), deciding whether to split a compound
word at all, and, if translating into a compounding language, merging components into a
compound word.
Generally, the same building blocks used for regular texts can also be utilised for clinical text
processing. However, clinical texts contain more noise in the form of incomplete sentences,
misspelled words and non-standard abbreviations that can make the natural language processing
cumbersome.
Applications:
Healthcare-associated infections (HAIs) are also called hospital-associated infections or nosocomial
infections. An important goal in defeating HAIs is to collect statistics by detecting and measuring
the prevalence of HAIs, and also to predict and warn if a particular patient has a high risk of
acquiring an HAI. HAIs can encompass, for example, pneumonia, urinary tract infection, sepsis,
various wound infections, and also norovirus (winter vomiting disease). Two machine learning
algorithms, Support Vector Machine (SVM) and Random Forest (RF) in the Weka toolkit, were
applied on the annotated Stockholm EPR Detect-HAI Corpus.
Adverse drug events (ADEs) are a major public health problem; around 5% of all hospital
admissions in the world are due to ADEs.
All drugs are poisonous in some sense, but given in the correct amount they may cure a disease.
(d) Time-related: becomes apparent some time after the use of the drug.
First of all, ICD-10 diagnosis codes related to adverse drug events that are assigned to the patient
records need to be studied.
Medical terminologies, classification systems and available controlled vocabularies are used in
healthcare to report, administer, classify and explain diseases and treatment, including
medication.
Mobile imaging is the technique of creating visual representations of the interior of a body for
clinical analysis and medical intervention, as well as visual representation of the function of
some organs or tissues.
Introduction:
Mobile technology and smart devices, especially smartphones, allow new and easier ways of
imaging at the patient's bedside and have the potential to be turned into a diagnostic tool that
can be used by professionals as well as lay people. Smartphones usually contain at least one
high-resolution camera that can be used for image formation. However, careful consideration has
to be given when dealing with cameras in general, and with nonscientific cameras specifically.
Many camera parameters are usually reported in public advertisements, but not all of them are
useful. In particular, pixel resolution can be misleading, as the number of pixels itself is not a
measure of quality. Quality is usually measured by the signal-to-noise ratio (SNR).
• Shot noise, which depends on the quality of the sensor and the discretization of the
number of photons. This noise mostly occurs when only a few photons hit the
sensor.
• Transfer noise, which is introduced by connectivity in the sensor. This is usually static
for all images and can be reduced using background subtraction with an image acquired
in complete darkness.
In the case of a camera, the signal is the amount of light captured by the sensor; image
noise is reduced as more photons become available. The most important parameter for the quality of
an optical system is therefore the amount of light accumulated on each pixel. This parameter is
determined by the physical size of a pixel (or the chip size in relation to the number of pixels), as a
larger pixel acquires more light, and by the diameter of the entry lens, which regulates the
amount of light. The size of the entry lens is usually given as the f-stop k (written as 1:k or f/k),
the ratio of the distance from the sensor to the entry lens to the diameter of the entry lens; the lower, the better.
Most modern smartphones have similar optical parameters as regular consumer cameras,
while being built at a far smaller scale.
First integrations of these cameras into clinical routine and research have already shown
manifold applications for mobile technology in medicine. One example is the usage of the
smartphone camera to take pictures of test strips for automatic analysis.
Another example is the use of smartphone cameras to document necrotic skin lesions caused
by the rare disease calciphylaxis in a multicenter clinical registry. Here, special care must be
taken when dealing with multiple different smartphones or lighting conditions due to different
efficiencies in capturing colors.
A color reference has to be used to calibrate the camera colors in a later step. To control
illumination, zoom, and distance, the German company FotoFinder has developed an integrated
lens system that is easily attached to and powered by an iPhone, transforming it into a
dermatoscope.
Besides the integrated camera, additional image formation methods can also be used on smart
devices, either by incorporating special sensors (like ultrasound or ECG) or by connecting
them, wired or wirelessly, to more powerful imaging machines like micro-nuclear magnetic resonance
(micro-NMR) for bedside diagnostics.
Data Visualization
The task of transforming an acquired image dataset into a perceptible form is called
visualization. This is rather simple for most 2D methods like digital photographs, but it becomes
challenging for 3D volumes, in particular if voxels are annotated with several features or monitored
over time (3D+t). In general, all data is displayed by transforming it into a colored 2D representation.
Hence, we need to consider the output devices as well as the definition and value ranges of the
initial data.
Visualization Basics
The human eye is capable of detecting light between 390 and 700 nm wavelengths. Images that
are recorded and displayed within this so-called visible spectrum show the data in “true color.”
But because many modalities like X-ray, ultraviolet, or infrared imaging capture wavelengths
outside the visible spectrum, a modification of the recorded data has to be performed. The
resulting image (e.g., a grayscale image for X-ray) is displayed in “false color.” A special case of
this is the so-called “pseudo color,” which means that the color of an image has been artificially
altered to enhance certain features. Here, a single-channel image and a so-called color map are used to
convert each value of the single channel into a corresponding color.
As an example, the Doppler signal contains information on direction of movement for each
position. This movement can be either positive (towards the detector), negative (away from the
detector), or zero (no movement). To superimpose this information onto morphologic image data
(B-mode), a different color scheme is applied. The zero level would be encoded in black, negative
values in blue, and positive values in red. A larger absolute value of the signal results in a brighter
color.
Output Devices
All data is displayed on a computer screen, where colors are mixed from three basic channels:
red, green, and blue (RGB). This results in a cubic color space. Setting all three colors to the same
value creates different shades of gray. Each color is usually scaled from 0 (dark) to 255 (bright).
This equals a bit depth of 8, meaning that 8 bits in memory are allocated for each color channel,
yielding in total 256³ ≈ 16.7 million possible values. Higher bit-depth color or gray values
are also possible but rarely used, as they are not well supported by computer screens and file
formats.
However, in some cases a higher contrast or distribution of color or gray values is needed, e.g.,
for diagnostics in radiology. Therefore, computer screens in diagnostic radiology support higher
bit depth (e.g., grayscale bit depth of 10), and have a better contrast (e.g., 1400:1 compared to
1000:1 regular) and brightness (e.g., 400 cd/m2 brightness compared to 200 cd/m2 regular) than
regular computer screens.
Printers differ from screens in that the background color of a screen (no color turned on) is black,
while the background color of a printout (paper) is white. Thus, higher values in color for screens
result in brighter colors, while higher amounts of color from a printer result in darker colors.
Therefore, printers usually use cyan, magenta, yellow, and black (CMYK) color space to
compensate for the nonblack background. Black is used as a key ingredient when mixing the
colors to minimize the fluid on the paper.
Mobile Visualization
Recently, visualization and display technology has been dominated by trends in mobile
computing. For example, prior to the introduction of the first retina display with the iPhone 4 in
2010, almost all computer and smartphone displays had a pixel density of about 70–100 pixels
per inch (ppi). Increases in resolution were mostly achieved through larger monitor screens.
However, the introduction of the retina display increased the pixel density above 300 ppi,
improving perceived contrast and also outperforming radiology displays in many other aspects
(e.g., iPhone 4 brightness: 500 cd/m2). Thereby, these new types of screens show great potential
for radiologists.
Additionally, modern smartphones and tablet computers provide a high amount of processing
power (e.g., a 64-bit dual-core 1.3 GHz processor in the iPhone 5s) that can be used for image visualization.
Almost all 2D and surface-rendering visualization techniques can be employed in real time. Real
time means that the result is delivered fast enough to make an impact on the current situation, or,
in terms of visualization of data, so that no delay between action (e.g., zooming) and result
(zoomed image) is perceived. Usually, this requires 15 to 20 frames per second (fps). [The frame
rate is the speed at which those images are shown.]
Volume rendering
Volume rendering of 3D datasets is computationally demanding, so it can be performed on a
server and the rendered view streamed to the mobile client as a video, for example using H.264
video compression, which is standard in mobile communication. On the
other hand, the client captures touches, swipes, and other interactions of the user and sends these
to the server to update the live view. Streaming of video data has the benefit of allowing the user
to use a mobile device while having the computational power of a workstation. The drawback of
this approach is the bandwidth needed to stream images in real time from the server to the
mobile device.
For example, a video with 30 frames per second (fps) and a resolution of 1920 by 1080 pixels
(Full HD/1080p) requires about 1 MB/s of bandwidth. This is not possible through most current
wireless networks like 3G, which is limited to between 350 and 2000 kilobits per second
(kbit/s), depending on country and reception.
Calibration
Calibration is important for distributed visualization on a range of different devices. It means
that the same image is displayed in exactly the same way on all devices, even if background
illumination differs between these devices. For this, an application has been developed that allows
users to calibrate their devices visually on their own. In this application the user is guided
through 8 steps, each showing a visual pattern. In each step, the user has to adjust a slider to
change the visibility of the pattern.
One concern that is often raised when visualizing biomedical images on mobile devices is the
appropriateness for diagnostics. For example, software that displays medical images might have
to undergo investigation by the Food and Drug Administration (FDA) or other local legal
authorities to be cleared for commercial marketing. Smartphones and tablet computers do not
necessarily meet the requirements to undergo these studies. Therefore, the appropriateness and
legitimacy of the device chosen for visualization should always be taken into account when
considering the use of a mobile device for diagnostic or visualization of medical images.
Image Analysis
Image analysis is the task of extracting abstract information or semantics and knowledge from
the raw pixels of image and signal data.
This is the most challenging task in biomedical imaging as it supports researchers and clinicians
in finding clues for disease or certain phenotypes (diagnostics), supports novices and experts in
performing procedures (therapy) and in following up on the outcome, and allows scientists to gain
knowledge from imaging data.
With the growing number of digital imaging devices, automated knowledge extraction becomes
more and more important. The new trend towards mobile and personalized health data
additionally drives the need for automation. For example, many applications for the smartphone-
based investigation of skin cancers already exist, but only a few are actually accurate. Pulse
frequency can be determined accurately and contactlessly by any smartphone simply by filming the
face and determining the very slight periodic changes in skin color, which are usually not
noticeable to humans.
Basically all images from biomedical imaging modalities and especially those from smart phone
cameras are noisy and contain artifacts. Therefore, preprocessing is required before the data can
be used for analysis. Additional preprocessing can also help to prepare the image for certain
analysis tasks, such as edge detection. Most of the preprocessing algorithms are low in
computation time and memory requirements and hence suitable for mobile devices.
Gaussian filter
A Gaussian filter is commonly used to remove noise and recording artifacts from an image by
blurring. The filter consists of a multidimensional Gaussian distribution that is convolved with
the image. For convolution, the center value is replaced with the accumulated weighted values
according to the mask. High frequency noise in the image is thereby reduced.
Convolution of the local region with the Gaussian kernel gives the highest weight to the
center of the local region (38.4624 in the worked example), while the remaining pixels receive less
weight as their distance from the center increases. The weighted values are summed up and the result
is stored in the current pixel.
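A minimal denoising sketch using SciPy's Gaussian filter is shown below; the image and the sigma value are illustrative.

```python
import numpy as np
from scipy import ndimage

# Each pixel is replaced by a Gaussian-weighted average of its neighbourhood,
# which blurs the image and suppresses high-frequency noise.
noisy = np.random.default_rng(0).normal(loc=100.0, scale=20.0, size=(64, 64))
smoothed = ndimage.gaussian_filter(noisy, sigma=1.5)
print(round(noisy.std(), 2), round(smoothed.std(), 2))   # variation drops after smoothing
```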
Median filter
The median filter is also used to reduce noise. For this filter, a sliding window with a fixed size
(here 3x3 pixels) is moved across the image. The center point of the window is replaced by the
median value within the window. For median computation, the image pixel values at the current
mask position (A to I) are sorted, and the center is replaced by the fifth value in the sorted
row. This removes outliers in an otherwise smooth area while maintaining the value of the
majority of the pixels.
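A minimal median-filter sketch using SciPy is shown below; the toy image with a single outlier is illustrative.

```python
import numpy as np
from scipy import ndimage

# A 3x3 window slides over the image; the centre pixel becomes the median of
# the window, which removes single-pixel outliers while keeping smooth areas.
image = np.full((5, 5), 10.0)
image[2, 2] = 255.0                       # one bright outlier ("salt" noise)
filtered = ndimage.median_filter(image, size=3)
print(filtered[2, 2])                     # the outlier is replaced by 10.0
```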
Sobel filter
The Sobel filter is used to enhance edges in the image. For this, an asymmetric filter is
convolved with the image. The mask that is visualized is sensitive to vertical edges, in particular
to vertical edges from black to white. Usually, this mask is turned by 90◦ and the signs are
changed, ending up with a set of eight different masks. All eight masks are applied individually
and, for instance, the maximum is used as a replacement for the center pixel to obtain an edge
map.
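A minimal edge-enhancement sketch using SciPy's Sobel operator is shown below; instead of the eight rotated masks described above, it combines the horizontal and vertical responses into a gradient magnitude, and the synthetic image is illustrative.

```python
import numpy as np
from scipy import ndimage

# Horizontal and vertical Sobel responses are combined into a gradient
# magnitude, which is large only near edges.
image = np.zeros((8, 8))
image[:, 4:] = 1.0                        # a vertical black-to-white edge
gx = ndimage.sobel(image, axis=1)         # responds to vertical edges
gy = ndimage.sobel(image, axis=0)         # responds to horizontal edges
edge_map = np.hypot(gx, gy)
print(edge_map[4, 3:6])                   # strongest response at the edge
```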
Feature Extraction
Features are simplified descriptors of an image or part of an image. Features are used to compare
two images, or find similarities or shared objects between multiple images. Image features can be
either global (describing the image as a whole) or local (describing a part of any size of the
image).
A very basic global image feature is the image histogram. A histogram is a probability
distribution of the pixel/voxel values in the image. For each possible value, the number of
occurrences is counted in the image. This results in a very simplified representation as
information on the intensity is maintained, but all spatial information is lost. Global features,
such as the shape of the histogram, can be used, for instance, to distinguish between classes of
images, e.g., hand and skull radiographs.
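A minimal sketch of the histogram as a global feature is shown below; the random 8-bit image is illustrative.

```python
import numpy as np

# The normalised grey-value histogram of an 8-bit image: intensity information
# is kept, but all spatial information is lost.
image = np.random.default_rng(0).integers(0, 256, size=(64, 64))
hist, _ = np.histogram(image, bins=256, range=(0, 256))
hist = hist / hist.sum()                  # counts -> probability distribution
print(hist.shape, round(hist.sum(), 3))   # (256,) 1.0
```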
Local features describe only a part of the image at a certain spatial position. Most are created in
two separate steps. The first one is feature detection, in which points of interest (POIs) are
localized. The second step is feature description. For each of the detected points, a description of
this position (possibly including some surrounding areas) is created. Since images can be
acquired under different conditions like scale and rotation, certain invariance against these
changes is needed for both detector and descriptor.
Recognizing objects in images is one of the most important problems in computer vision. A
common approach is to first extract the feature descriptions of the objects to be recognized from
reference images, and store such descriptions in a database. When there is a new image, its
feature descriptions are extracted and compared to the object descriptions in the database to see
if the image contains any object we are looking for. In real-life applications, the objects in the
images to be processed can differ from the reference images in many ways:
Orientation
Viewpoint
Illumination
Partially covered
Scale-invariant feature transform (SIFT) is an algorithm for extracting stable feature descriptions
of objects, called keypoints, that are robust to changes in scale, orientation, shear, position, and
illumination.
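A minimal SIFT sketch using OpenCV is shown below (SIFT is included in opencv-python 4.4 and later); the synthetic noise image is illustrative, and in practice the descriptors of two images would be matched, e.g. with cv2.BFMatcher, to recognize shared objects.

```python
import cv2
import numpy as np

# Detect keypoints and compute their 128-dimensional SIFT descriptors on a
# grayscale image; the random textured image is purely illustrative.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(128, 128)).astype(np.uint8)

sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(image, None)
print(len(keypoints), None if descriptors is None else descriptors.shape)
```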
Clinical Decision Support System.
Clinical decision support (CDS) provides clinicians, staff, patients or other individuals with knowledge and
person-specific information, intelligently filtered or presented at appropriate times, to enhance health and health
care. CDS encompasses a variety of tools to enhance decision-making in the clinical workflow. These tools include
computerized alerts and reminders to care providers and patients; clinical guidelines; condition-specific order sets;
focused patient data reports and summaries; documentation templates; diagnostic support, and contextually relevant
reference information, among other tools.