01-2 - Digital Literacy Foundations
Kenneth Ban
Department of Biomedical Informatics &
Department of Biochemistry
Judice Koh
Department of Biomedical Informatics
[email protected]
[email protected]
Roadmap
AI/Machine Learning, Generative AI: How can we use AI/ML for healthcare?
Computational Thinking: How do we use computational thinking to solve problems?
Analog Digital
Analog signals (e.g. audiotape)    Digital signals (e.g. CD)
• Can be compressed
101010111111 → "10" repeated 3x, then "1" repeated 6x
How to represent numbers?
• Computing relies on the representation, storage and
manipulation of data in the form of numbers
• Different number systems are used to represent
numbers in digital computing
Base Digits
10 (Decimal) 0 to 9
2 (Binary) 0 and 1
16 (Hexadecimal) 0 to 9, A to F
How to represent numbers?
Base 10 (Decimal)
• Most commonly used
• Digits: 0 to 9
$345_{10} = 3 \times 10^2 + 4 \times 10^1 + 5 \times 10^0$
How to represent numbers?
Base 2 (Binary)
• Used in digital circuits for computing
• Digits: 0 and 1
$1011_2 = 1 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 1 \times 2^0 = 11_{10}$
How to represent numbers?
Base 16 (Hexadecimal)
• Used to represent 4 binary digits compactly as a single hexadecimal digit ($2^4 = 16$)
• Digits: 0 to 9, A to F (representing 10 to 15)
$1F_{16} = 1 \times 16^1 + F(15) \times 16^0 = 31_{10}$
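The positional expansions for decimal, binary and hexadecimal can be checked with a short Python sketch. The helper name `to_decimal` is hypothetical and only for illustration; Python's built-in `int(text, base)`, `bin()` and `hex()` do the same conversions.

```python
def to_decimal(digits: str, base: int) -> int:
    """Convert a digit string in the given base (2 to 16) to an integer.

    Each step multiplies the running value by the base and adds the next
    digit, which is equivalent to summing digit x base^position.
    """
    symbols = "0123456789ABCDEF"
    value = 0
    for ch in digits.upper():
        d = symbols.index(ch)  # numeric value of this digit (A=10 ... F=15)
        if d >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        value = value * base + d
    return value


print(to_decimal("345", 10))  # 345
print(to_decimal("1011", 2))  # 11
print(to_decimal("1F", 16))   # 31
print(bin(11), hex(31))       # 0b1011 0x1f
```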
Demo
https://www.rapidtables.com/convert/number/hex-dec-bin-converter.html
How to represent digital data?
Snapshot Time-series
Numeric Text Image Audio Video
Digital representation
How do we represent data? (1) discrete
• Unsigned (place values): $2^3$ $2^2$ $2^1$ $2^0$
Text (Unicode)
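As a rough illustration of discrete representation, the sketch below shows the value range of an unsigned integer with a given number of bits, and how text maps to Unicode code points and bytes. The function names `unsigned_range` and `code_points` are made up for this example, and UTF-8 is assumed as the byte encoding.

```python
def unsigned_range(bits: int) -> tuple[int, int]:
    """Smallest and largest value an unsigned integer of this width can hold."""
    return 0, 2**bits - 1


def code_points(text: str) -> list[int]:
    """Unicode code point (a number) for each character in the text."""
    return [ord(ch) for ch in text]


print(unsigned_range(4))    # (0, 15): place values 8 + 4 + 2 + 1
print(code_points("Hi"))    # [72, 105]
print("é".encode("utf-8"))  # b'\xc3\xa9' (one character stored as two bytes)
```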
How do we represent data? (2) continuous
Sampling (spatial)
• We can sample a continuous signal across space in
discrete units (e.g. pixels in an image)
• The density determines how much detail/resolution
is captured (e.g. 24.5 megapixel camera sensor)
[Figure: original image vs. spatially sampled image at different sampling densities]
How do we represent data? (2) continuous
Sampling (time)
• We can sample a continuous signal (e.g. sound) across time
in discrete units depending on the frequency (Hz)
• The frequency determines how much detail/resolution is captured (e.g. 44.1 kHz for audio files)
[Figure: analog signal sampled over time into a digital signal]
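Sampling over time can be sketched as follows, assuming the continuous signal is a pure sine tone. The function name `sample_sine` and its parameters are hypothetical; the point is only that a higher sampling frequency yields more values per second.

```python
import math


def sample_sine(freq_hz: float, sample_rate_hz: float, duration_s: float) -> list[float]:
    """Sample a sine wave of freq_hz at sample_rate_hz for duration_s seconds."""
    n_samples = round(sample_rate_hz * duration_s)
    return [
        math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
        for i in range(n_samples)
    ]


# One second of a 440 Hz tone at two sampling frequencies.
low = sample_sine(440, 8_000, 1.0)
high = sample_sine(440, 44_100, 1.0)
print(len(low), len(high))  # 8000 44100
```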
How do we represent data? (2) continuous
Quantization
• A continuous value (e.g.
• The conversion approximates (quantizes) the signal to the nearest binary value
• The number of bits determines the precision and range of the conversion
[Figure: quantized representation of continuous values; sensor input voltage vs. output signal]
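A minimal sketch of quantization, assuming a uniform converter that maps a voltage range onto evenly spaced levels. The function `quantize` and its parameters are illustrative and not taken from any specific device.

```python
def quantize(voltage: float, v_min: float, v_max: float, bits: int) -> tuple[int, float]:
    """Map a continuous voltage to the nearest of 2**bits evenly spaced levels.

    Returns the integer code and the voltage that code represents.
    """
    levels = 2**bits
    step = (v_max - v_min) / (levels - 1)       # spacing between adjacent levels
    clamped = min(max(voltage, v_min), v_max)   # values outside the range saturate
    code = round((clamped - v_min) / step)
    return code, v_min + code * step


# With 2 bits over 0 to 3 V there are only four levels: 0, 1, 2 and 3 V.
print(quantize(1.2, 0.0, 3.0, 2))  # (1, 1.0)
print(quantize(2.6, 0.0, 3.0, 2))  # (3, 3.0)
```

More bits give more levels over the same range, so the quantized value lands closer to the original.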
How can we check for errors/integrity?
Checksum
• Used for checking errors in data (e.g. during copying/transmission)
• Simple mathematical function to generate a value (checksum)
e.g. addition/division
• Checksum included together with data so that it can be verified at destination
[Figure: source computes checksum; destination computes checksum and verifies]
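The checksum idea can be sketched with a very simple function: add up all byte values and keep the remainder after division by 256. This is an illustrative choice, not the checksum of any particular protocol, and the names `attach_checksum` and `verify` are hypothetical.

```python
def attach_checksum(data: bytes) -> bytes:
    """Append a one-byte checksum (sum of all bytes modulo 256) to the data."""
    return data + bytes([sum(data) % 256])


def verify(message: bytes) -> bool:
    """Recompute the checksum at the destination and compare with the received one."""
    payload, received = message[:-1], message[-1]
    return sum(payload) % 256 == received


message = attach_checksum(b"hello")
print(verify(message))                 # True: data arrived intact
print(verify(b"jello" + message[-1:])) # False: one byte was corrupted
print(verify(b"ehllo" + message[-1:])) # True: swapped bytes go undetected
```

The last line shows why such a simple sum catches accidental errors but is not suitable for detecting deliberate changes.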
How can we check for errors/integrity?
Hashing
• Used for checking integrity of data (e.g. changes due to tampering)
• Uses one-way hash function to generate a fingerprint
‣ Fingerprint is practically unique and any change in data will alter it
‣ Original data cannot be reconstructed from fingerprint
• Original fingerprint is communicated separately and compared with the computed fingerprint at destination for verification
[Figure: data is transmitted from source to destination; a hash is computed at each end and the two fingerprints are compared to verify integrity]
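A short sketch of fingerprinting with a hash function, using SHA-256 from Python's standard `hashlib` module as one possible choice (the slides do not name a specific algorithm). The helper name `fingerprint` is hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 fingerprint of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


original = b"Patient A: 5 mg daily"
tampered = b"Patient A: 50 mg daily"

sent_fingerprint = fingerprint(original)  # communicated separately from the data

# At the destination: recompute and compare.
print(fingerprint(original) == sent_fingerprint)  # True
print(fingerprint(tampered) == sent_fingerprint)  # False
```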
Demo
https://emn178.github.io/online-tools/
How can we compress digital data?
• Information content in data can be measured using
Shannon entropy, which measures the amount
of uncertainty in data
‣ High entropy → less predictable data
‣ Low entropy → more predictable data
• Low entropy data has more redundancy,
making it easier to compress or represent
compactly
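Shannon entropy can be estimated from symbol frequencies. The sketch below computes it in bits per symbol, treating each character as an independent symbol (a simplifying assumption); the function name `shannon_entropy` is illustrative.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Entropy in bits per symbol, based on how often each character occurs."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


print(shannon_entropy("AAAAAAAA"))  # zero: fully predictable
print(shannon_entropy("ABABABAB"))  # 1.0: two equally likely symbols
print(shannon_entropy("ABCDABCD"))  # 2.0: four equally likely symbols
```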
Compress: AAAAAABBBBCCCCC → A6B4C5
Decompress: A6B4C5 → AAAAAABBBBCCCCC
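The compress/decompress example corresponds to run-length encoding, which can be sketched as follows. The sketch assumes the input text contains no digits, so that counts and symbols cannot be confused; the function names are hypothetical.

```python
import re
from itertools import groupby


def rle_compress(text: str) -> str:
    """Replace each run of a repeated character with the character and its count."""
    return "".join(f"{ch}{len(list(run))}" for ch, run in groupby(text))


def rle_decompress(encoded: str) -> str:
    """Expand each (character, count) pair back into a run."""
    pairs = re.findall(r"(\D)(\d+)", encoded)
    return "".join(ch * int(count) for ch, count in pairs)


print(rle_compress("AAAAAABBBBCCCCC"))  # A6B4C5
print(rle_decompress("A6B4C5"))         # AAAAAABBBBCCCCC
```

No information is lost in the round trip, so this is a lossless scheme.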
How can we compress digital data?
Example: Lossy compression
• Typically used to compress image or audio data by removing details that are less noticeable to human perception
• JPEG compression for images
‣ Reduces the colors and blurs/smooths out details
‣ Original image data is not recoverable
‣ Decompression may introduce artifacts
[Figure: CPU, memory, input and output]
How do we communicate through a network?
TCP/IP protocol
• TCP splits data into smaller, numbered packets
• Packets are labeled with source and destination addresses (IP)
• Packets are routed to the destination with the IP protocol
• TCP checks if packets are missing or contain errors, and requests
source to resend packets
• TCP reassembles packets in the correct order at destination
[Figure: source splits data into numbered packets (1, 2); packets are routed to the destination, unpacked and assembled; packets are resent on error or if missing]
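The split, detect-missing and reassemble steps can be mimicked with a toy simulation. This is not real TCP (there are no addresses, acknowledgements or error checks), and all function names are hypothetical.

```python
import random


def split_into_packets(data: bytes, size: int) -> list[tuple[int, bytes]]:
    """Split data into numbered packets of at most `size` bytes."""
    return [
        (seq, data[start:start + size])
        for seq, start in enumerate(range(0, len(data), size))
    ]


def missing_packets(received: list[tuple[int, bytes]], total: int) -> list[int]:
    """Sequence numbers that never arrived and should be requested again."""
    return sorted(set(range(total)) - {seq for seq, _ in received})


def reassemble(received: list[tuple[int, bytes]]) -> bytes:
    """Put packets back in sequence order and join their payloads."""
    return b"".join(payload for _, payload in sorted(received))


packets = split_into_packets(b"HELLO WORLD", 4)
arrived = [p for p in packets if p[0] != 1]   # packet 1 is lost in transit
random.shuffle(arrived)                        # packets may arrive out of order

lost = missing_packets(arrived, len(packets))
print(lost)                                    # [1]
arrived += [packets[seq] for seq in lost]      # source resends the missing packet
print(reassemble(arrived))                     # b'HELLO WORLD'
```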
How do we communicate through a network?
‣ IPv6 (16 bytes = 128 bits): more recent but not widely used yet, covers $2^{128} \approx 3.4 \times 10^{38}$ addresses
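The size of an address space follows directly from the number of bits. A quick check in Python, where the helper `address_space` is a hypothetical name and the standard `ipaddress` module is used only to confirm the figure:

```python
import ipaddress


def address_space(bits: int) -> int:
    """Number of distinct addresses that fit in the given number of bits."""
    return 2**bits


print(address_space(128))           # 340282366920938463463374607431768211456
print(f"{address_space(128):.1e}")  # 3.4e+38

# The whole IPv6 range (::/0) contains the same number of addresses.
print(ipaddress.ip_network("::/0").num_addresses == address_space(128))  # True
```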
How do we communicate through a network?
Allocation of addresses
• Addresses are allocated for efficiency and separation of functions
• For IPv4, the addresses are more limited and they are allocated
into private (LAN), public (WAN) and reserved categories
Private (LAN)    10.0.0.0 - 10.255.255.255      ~ 16 million    Large internal networks
Private (LAN)    172.16.0.0 - 172.31.255.255    ~ 1 million     Medium internal networks
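The private ranges above can be inspected with Python's standard `ipaddress` module. The CIDR forms `10.0.0.0/8` and `172.16.0.0/12` are the usual way of writing these two ranges; the dictionary name `PRIVATE_RANGES` and the sample addresses are only for illustration.

```python
import ipaddress

PRIVATE_RANGES = {
    "Large internal networks": ipaddress.ip_network("10.0.0.0/8"),
    "Medium internal networks": ipaddress.ip_network("172.16.0.0/12"),
}

for label, network in PRIVATE_RANGES.items():
    # First address, last address and total number of addresses in the range.
    print(label, network[0], network[-1], network.num_addresses)

print(ipaddress.ip_address("10.1.2.3").is_private)  # True
print(ipaddress.ip_address("8.8.8.8").is_private)   # False
```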
[Figure: private network connected through a gateway and an Internet service provider (ISP) to the public network (Internet); VPN gateway]
How do we communicate through a network?
Private DNS    Resolves domain names within a private network.    Accessing internal services like hospital portals.    Internal DNS (e.g. 192.168.1.1)
How do we communicate through a network?
[Figure: encryption and decryption using a shared key]
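Shared-key (symmetric) encryption can be illustrated with a toy XOR cipher, where the same key both encrypts and decrypts. This is for teaching only and is not secure; the function name `xor_cipher` and the sample key are hypothetical.

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the key, repeating the key as needed.

    Applying the function twice with the same key restores the original data.
    """
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


shared_key = b"k3y"
plaintext = b"patient record"

ciphertext = xor_cipher(plaintext, shared_key)   # encryption
recovered = xor_cipher(ciphertext, shared_key)   # decryption with the same key

print(ciphertext != plaintext)  # True
print(recovered == plaintext)   # True
```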
Demo
https://www.kerryveenstra.com/cryptosystem.html
How do we secure data? (1) symmetric
Data Encryption Standard (DES)    Legacy standard    1970s    56 bits    Weak, can be cracked
[Figure: Alice and Bob exchange messages that can be intercepted; one key of a public/private pair encrypts and the other decrypts]
Who can decrypt?    Only the recipient with their private key    Anyone with the sender's public key
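The two cases in the row above (encrypting for a recipient versus signing as a sender) can be sketched with textbook RSA on deliberately tiny numbers. The primes 61 and 53 and the exponent 17 are a classic classroom choice and far too small for real use; the helper `apply_key` is hypothetical.

```python
# Toy RSA key pair (insecure: real keys use primes hundreds of digits long).
p, q = 61, 53
n = p * q                    # 3233, part of both keys
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent (modular inverse, Python 3.8+)

public_key = (e, n)
private_key = (d, n)


def apply_key(value: int, key: tuple[int, int]) -> int:
    """Raise value to the key's exponent modulo n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


message = 65

# Encrypt with the public key: only the private key holder can decrypt.
ciphertext = apply_key(message, public_key)
print(apply_key(ciphertext, private_key) == message)  # True

# "Encrypt" (sign) with the private key: anyone with the public key can check it.
signature = apply_key(message, private_key)
print(apply_key(signature, public_key) == message)    # True
```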
[Figure: a trusted certificate authority (CA) has a public and a private CA key; the public key is encrypted (signed) with the private CA key to produce a signed certificate with the public key]
How do we secure transmission of data?
[Figure: http://example.com vs. https://example.com; with HTTPS, data is encrypted with the public server key; option to encrypt]
How do we secure storage of data?
Data storage encryption (symmetric)
Device-level
• All data written to device is encrypted
• Protects data in device if lost/stolen
• Examples: BitLocker (Windows), FileVault (macOS), ...
File-level
• Only selected files are encrypted
• Protects data in encrypted files for selected access
Securing data: recommendations
Storage
• Activate device-level encryption to protect storage devices
containing sensitive data against loss/theft
• Encrypt sensitive files that are shared (e.g. email) and share the
password/key over a different channel to reduce chance of
interception
Communication
• When using the browser, always use HTTPS (https://) protocol
• When using email, consider encryption
• Do not send any sensitive information over an unsecured connection
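In scripts, the "always use HTTPS" recommendation can be enforced with a simple check before any data is sent. The sketch below only inspects the URL scheme and does not make a connection; the function name `uses_https` is hypothetical.

```python
from urllib.parse import urlparse


def uses_https(url: str) -> bool:
    """True if the URL uses the encrypted HTTPS protocol."""
    return urlparse(url).scheme == "https"


print(uses_https("https://example.com"))  # True
print(uses_https("http://example.com"))   # False
```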
Securing data: recommendations
Passwords
• Follow organizational guidelines
• Consider passphrases that have higher entropy and are easier to remember
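Password entropy can be estimated as length × log2(pool size), assuming every character or word is chosen uniformly at random (human-chosen passwords are usually weaker than this estimate). The pool sizes below are assumptions for illustration: 94 printable keyboard characters and a 7,776-word list of the kind used for dice-generated passphrases.

```python
import math


def entropy_bits(pool_size: int, length: int) -> float:
    """Entropy in bits of `length` independent random picks from `pool_size` options."""
    return length * math.log2(pool_size)


password = entropy_bits(94, 8)      # 8 random printable characters
passphrase = entropy_bits(7776, 5)  # 5 random words from a 7,776-word list

print(round(password, 1))    # 52.4
print(round(passphrase, 1))  # 64.6
```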
Are our passwords safe?
https://haveibeenpwned.com/
4. Digital identity
What is identity?
• Identity is how we represent ourselves to others and how others
recognize us
• An identifier is a unique piece of information (e.g. NRIC) that distinguishes us from others
• We have different identifiers in different spheres of life
Organization
  Centralized systems like Single Sign-On (SSO)
  Access to internal resources/services (e.g. hospitals)
  • Data breaches
  • Mismanaged credentials (e.g. sharing, failure to revoke)
Government
  National databases (e.g., GovTech)
  Official government and linked services
  • Data breaches
  • Potential misuse for tracking
  • Misuse of personal data
Securing identity: recommendations
Fake WiFi hotspot
Active
• Content: data that is actively submitted to online services (e.g. personal information, images, comments) is collected as part of the service
• Data may be visible publicly (depending on the privacy setting)
• Genie-in-a-bottle: data (e.g. photos/posts) may be hard to remove once it is made public, as it is archived (https://web.archive.org/)
Passive
• IP tracking: online services can collect data without active participation (e.g. online activities and IP addresses)
Where are digital footprints? (1) online
Agreements
• Platforms (e.g. social media) have agreements that
include clauses to give rights over users' data and
online activities
• Common themes include:
Clause Explanation
Data collection Platforms collect personal info, activity, device data
Data ownership User owns content but platforms can use it
Tracking Permission to track activities across sites
Data sharing Data shared with advertisers and partners
Data retention Platforms may retain data even after account deletion
Liability Platforms not liable for breaches or data misuse
Where are digital footprints? (2) local
[Figure: over a connection between browser and online service, a cookie keeps track of the user (e.g. login, preference, cart)]
Where are digital footprints? (2) local
Browser tracking (fingerprinting)
• Cookies by third parties (e.g. advertising services not directly related to primary service) can be misused to track users without consent
• Third-party cookies can be blocked but online services can still track users by fingerprinting
• Fingerprinting is based on triangulating multiple characteristics (type of browser/computer, IP address) to create a unique profile for tracking without user's consent
[Figure: over a connection, the online service collects characteristics from the browser]
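How several individually harmless characteristics combine into a tracking profile can be sketched by hashing them together. The attribute names and values below are invented for illustration, and real fingerprinting services use many more signals; the function `browser_fingerprint` is hypothetical.

```python
import hashlib
import json


def browser_fingerprint(characteristics: dict) -> str:
    """Combine browser/computer characteristics into one short identifier."""
    canonical = json.dumps(characteristics, sort_keys=True)  # stable ordering
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


visitor = {
    "browser": "Firefox 128",     # hypothetical values
    "os": "Windows 11",
    "screen": "1920x1080",
    "timezone": "Asia/Singapore",
    "ip": "203.0.113.7",          # address from a documentation-only range
}

first_visit = browser_fingerprint(visitor)
second_visit = browser_fingerprint(dict(reversed(list(visitor.items()))))
print(first_visit == second_visit)  # True: same profile, no cookie needed

other = dict(visitor, screen="2560x1440")
print(browser_fingerprint(other) == first_visit)  # False: a different device
```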
Demo
https://trackme.dev/
Why should we care about digital footprints?
Privacy
• Personal data: private information can be exploited
• Patient confidentiality: sharing of sensitive information breaks trust
Security
• Target: personal information makes one vulnerable to phishing, identity theft
• Tracking: open to exploitation
Professionalism
• Online reputation: impacts personal credibility
• Public perception: impacts perception of healthcare students/workers
Ethical/Legal
• Regulation: compliance with privacy laws on patient information
• Ethics: duty to protect sensitive data and minimize harm
Digital footprints: recommendations
Manage tracking
• Do not accept unnecessary cookies
• Use browser extensions to block tracking