Data Representation
CS BY IFFAT ANSARI
Computers use binary digits, also known as bits, to represent information. Each bit can be
in one of two states: "on" or "off," which are typically represented by the digits 1 and 0,
respectively. In hardware, each bit is held by a tiny switch called a transistor. Transistors
are electronic devices that control the flow of electricity. They act as switches that can be
turned on or off by an electrical signal.
Computers contain millions, or even billions, of these transistors organized in intricate
patterns on integrated circuits. These transistors work together to perform calculations, store
data, and execute various tasks.
By combining these simple binary switches, computers can represent and manipulate
complex information such as numbers, text, images, and videos. This binary system forms
the foundation of digital computing and is fundamental to the operation of modern computers.
What are Number systems?
A numeral system (or system of numeration) is a writing system for expressing numbers;
that is, a mathematical notation for representing numbers of a given set, using digits or other
symbols in a consistent manner.
Decimal 10 Yes No
Binary 2 No Yes
Hexadecimal 16 Yes No
In the decimal (base-10) system, the value of a number is found by multiplying each digit by
the corresponding power of 10 and adding the products together. For example, in the number 456:
4 × 10^2 (4 multiplied by 10 raised to the power of 2, which is 100) represents 400.
5 × 10^1 (5 multiplied by 10 raised to the power of 1, which is 10) represents 50.
6 × 10^0 (6 multiplied by 10 raised to the power of 0, which is 1) represents 6.
Adding these together: 400 + 50 + 6 = 456.
This positional system allows us to represent and work with numbers of various magnitudes
and perform arithmetic operations such as addition, subtraction, multiplication, and division
in a straightforward manner.
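The place-value rule can be checked with a short Python sketch. The function name `place_values` and its `base` parameter are illustrative choices, not part of these notes; the same idea works for base 2 or base 16.

```python
def place_values(number, base=10):
    """Split a non-negative integer into (digit, weight) pairs, most significant first."""
    pairs = []
    weight = 1                       # the right-most position is worth base^0 = 1
    while True:
        number, digit = divmod(number, base)
        pairs.append((digit, weight))
        weight *= base               # each position to the left is worth `base` times more
        if number == 0:
            break
    return pairs[::-1]


pairs = place_values(456)
print(pairs)                                            # [(4, 100), (5, 10), (6, 1)]
print(sum(digit * weight for digit, weight in pairs))   # 456
```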
For Example
The number 2020 is interpreted as
= (2020)10
= (1024 + 512 + 256 + 128 + 64 + 32 + 4)10
= (2^10×1 + 2^9×1 + 2^8×1 + 2^7×1 + 2^6×1 + 2^5×1 + 2^4×0 + 2^3×0 + 2^2×1 + 2^1×0 + 2^0×0)10
= (11111100100)2
Binary Number System
The binary number system represents numeric values using only two digits, 0 and 1. Most
computing devices use binary because the two digits map directly onto the two voltage states
of an electronic circuit (like an on/off switch): a 0 input is treated as off and a 1 input
as on. Binary is also known as the base-2 number system.
For example
14 can be written as 1110, 19 can be written as 10011, 50 can be written as 110010.
Hexadecimal Number System
The hexadecimal number system, often referred to as base-16, uses 16 digits to represent
numbers. The first 10 digits are the same as in the decimal system (0 to 9), and the remaining
six digits are represented by the letters A, B, C, D, E, and F. In the hexadecimal system, the
letters A to F are used to represent the decimal numbers 10 to 15, respectively. Here's the
correspondence between decimal and hexadecimal numbers:
0 = 0
1 = 1
2 = 2
3 = 3
4 = 4
5 = 5
6 = 6
7 = 7
8 = 8
9 = 9
10 = A
11 = B
12 = C
13 = D
14 = E
15 = F
When working with hexadecimal numbers, the position of each digit also carries a weight,
just like in the decimal system. However, in hexadecimal, the weights are powers of 16. The
rightmost digit represents the ones place, the next digit to the left represents 16 raised to
the power of 1 (16), the next digit represents 16 raised to the power of 2 (256), and so on.
The hexadecimal system is commonly used in computer science and digital systems to
represent binary numbers more compactly and to specify memory addresses, colors, and
other data structures.
Types of Conversion
Conversion 1
Binary to Decimal/Denary
The conversion from binary to denary/decimal is a relatively straight-forward process. Each
time a 1-value appears in a binary number column, the column value (heading) is added to
a total. Consider the following examples:
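Example 1:
Convert the binary number 11101110 into a denary number.
128  64  32  16   8   4   2   1
  1   1   1   0   1   1   1   0
128 + 64 + 32 + 8 + 4 + 2 = 238
Example 2:
Convert the binary number 11110001011 into a denary number.
(2^10) (2^9) (2^8) (2^7) (2^6) (2^5) (2^4) (2^3) (2^2) (2^1) (2^0)
 1024   512   256   128    64    32    16     8     4     2     1
    1     1     1     1     0     0     0     1     0     1     1
1024 + 512 + 256 + 128 + 8 + 2 + 1 = 1931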
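The column-heading method can be sketched in Python as follows. The function name `binary_to_denary` is a hypothetical helper written for illustration; it walks the bits from right to left and adds the column value wherever a 1 appears.

```python
def binary_to_denary(bits):
    """Add the column value (a power of 2) for every position that holds a 1."""
    total = 0
    column_value = 1                    # the right-most column is worth 1
    for bit in reversed(bits):
        if bit == "1":
            total += column_value
        column_value *= 2               # each column to the left is worth twice as much
    return total


print(binary_to_denary("11101110"))     # 238
print(binary_to_denary("11110001011"))  # 1931
```

Python's built-in `int("11101110", 2)` gives the same answer and can be used to check working.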
Conversion 2
Denary/Decimal to Binary
Method 1
Example 1
Convert denary number 142 into binary:
The denary number 142 is made up of 128 + 8 + 4 + 2 (that is, 142 – 128 = 14; 14 – 8 = 6;
6 – 4 = 2; 2 – 2 = 0). At each stage, subtract the largest possible power of 2 and keep doing
this until 0 is reached. This gives the following 8-bit binary number:
128 64 32 16 8 4 2 1
1 0 0 0 1 1 1 0
Example 2
Convert denary number 59 into binary:
The denary number 59 is made up of 32 + 16 + 8 + 2 + 1 (that is, 59 – 32 = 27;
27 – 16 = 11; 11 – 8 = 3; 3 – 2 = 1, 1 – 1 = 0)
128 64 32 16 8 4 2 1
0 0 1 1 1 0 1 1
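Method 1 can be sketched in Python as shown below. The function name `denary_to_binary_subtract` and the default `width` of 8 bits are assumptions made for this illustration.

```python
def denary_to_binary_subtract(number, width=8):
    """Build a fixed-width binary string by subtracting powers of 2, largest first."""
    bits = ""
    for position in range(width - 1, -1, -1):
        column_value = 2 ** position        # 128, 64, 32, ... for an 8-bit number
        if number >= column_value:
            bits += "1"
            number -= column_value          # subtract the largest power of 2 that fits
        else:
            bits += "0"
    if number != 0:
        raise ValueError("number does not fit in the given width")
    return bits


print(denary_to_binary_subtract(142))   # 10001110
print(denary_to_binary_subtract(59))    # 00111011
```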
Method 2
Example 1
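This method involves successive division by 2. Start with the denary number, 142, and divide
it by 2. Write the result of the division, including the remainder (even if it is 0), under 142
(that is, 142 / 2 = 71 remainder 0); then divide again by 2 (that is, 71 / 2 = 35 remainder 1)
and keep dividing until the result is zero. Finally, write down all the remainders in reverse order:
142 / 2 = 71 remainder 0
 71 / 2 = 35 remainder 1
 35 / 2 = 17 remainder 1
 17 / 2 =  8 remainder 1
  8 / 2 =  4 remainder 0
  4 / 2 =  2 remainder 0
  2 / 2 =  1 remainder 0
  1 / 2 =  0 remainder 1
Reading the remainders from bottom to top gives 10001110.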
Example 2
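Method 2 (successive division by 2) can be sketched in Python like this. The function name `denary_to_binary_divide` is a hypothetical helper; it collects each remainder and then reverses the list.

```python
def denary_to_binary_divide(number):
    """Convert a non-negative integer to binary by repeated division by 2."""
    remainders = []
    while number > 0:
        number, remainder = divmod(number, 2)   # quotient carries on, remainder is the next bit
        remainders.append(str(remainder))
    # The first remainder found is the right-most bit, so read them in reverse order.
    return "".join(reversed(remainders)) or "0"


print(denary_to_binary_divide(142))   # 10001110
print(denary_to_binary_divide(59))    # 111011
```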
Conversion 3
Binary to Hexadecimal Conversion
Converting from binary to hexadecimal is a fairly easy process. Starting from the right and
moving left, split the binary number into groups of 4 bits. If the left-most group has fewer
than 4 bits, pad it with 0s on the left. Then convert each group of 4 bits into the
equivalent hexadecimal digit.
Example 1
1000 1110
8421 8421
8 8+4+2
8 14
8 E
(10001110)2 = (8E)16
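Example 2
When the number has a fractional part, group the bits outward from the binary point: pad the
whole-number part with 0s on the left and the fractional part with 0s on the right.
0011  1011  .  1110
8421  8421     8421
2+1   8+2+1    8+4+2
3     11       14
3     B        E
(111011.111)2 = (3B.E)16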
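The grouping rule can be sketched in Python as follows. The names `group_to_hex` and `binary_to_hex` are hypothetical helpers. As an assumption of this sketch, a binary point is kept in the result: the whole-number part is padded on the left and the fractional part on the right, so 111011.111 comes out as 3B.E.

```python
HEX_DIGITS = "0123456789ABCDEF"


def group_to_hex(bits):
    """Convert a bit string whose length is a multiple of 4, one group of 4 bits at a time."""
    return "".join(HEX_DIGITS[int(bits[i:i + 4], 2)] for i in range(0, len(bits), 4))


def binary_to_hex(binary):
    """Convert a binary string (optionally containing a binary point) to hexadecimal."""
    whole, _, fraction = binary.partition(".")
    whole = whole.zfill(-(-len(whole) // 4) * 4)            # pad to a multiple of 4 on the left
    result = group_to_hex(whole)
    if fraction:
        padded = fraction.ljust(-(-len(fraction) // 4) * 4, "0")   # pad on the right
        result += "." + group_to_hex(padded)
    return result


print(binary_to_hex("10001110"))      # 8E
print(binary_to_hex("111011.111"))    # 3B.E
```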
Conversion 4
Hexadecimal to Binary Conversion
In converting a hexadecimal number into binary, each hex digit is converted into four bits of
binary.
Example 1
4 5 A
8 4 2 1 8 4 2 1 8 4 2 1
0 1 0 0 0 1 0 1 1 0 1 0
So, the resulting binary number is 0100 0101 1010
Example 2
Convert hexadecimal BF08 into binary.
B F 0 8
8 4 2 1 8 4 2 1 8 4 2 1 8 4 2 1
1 0 1 1 1 1 1 1 0 0 0 0 1 0 0 0
So, the resulting binary number is 1011 1111 0000 1000
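The digit-by-digit rule can be sketched in Python as shown below. The function name `hex_to_binary` is a hypothetical helper; it writes each hexadecimal digit as its own group of four bits.

```python
def hex_to_binary(hex_string):
    """Replace each hexadecimal digit with its 4-bit binary group."""
    groups = [format(int(digit, 16), "04b") for digit in hex_string]
    return " ".join(groups)


print(hex_to_binary("45A"))    # 0100 0101 1010
print(hex_to_binary("BF08"))   # 1011 1111 0000 1000
```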
Conversion 5
Hexadecimal to Decimal/Denary Conversion
This conversion uses the heading value of each hexadecimal digit position, that is,
4096, 256, 16, 1 for a four-digit number (the powers of 16).
Take each of the hexadecimal digits and multiply it by its heading value. Add all the
resultant totals together to give the denary number. Remember that the hex digits A-F need
to be first converted to the values 10-15 before carrying out the multiplication.
Example 1
Convert the hexadecimal number A3C2 into denary, working from the right-most digit:
2 x (16^0) = 2 x 1 = 2
C x (16^1) = 12 x 16 = 192
3 x (16^2) = 3 x (16 x 16) = 3 x 256 = 768
A x (16^3) = 10 x (16 x 16 x 16) = 10 x 4096 = 40960
Add the results together: 2 + 192 + 768 + 40960 = 41922
Example 2
7DE is a hex number
7DE = (7 * 16^2) + (13 * 16^1) + (14 * 16^0)
7DE = (7 * 256) + (13 * 16) + (14 * 1)
7DE = 1792 + 208 + 14
7DE = 2014 (in decimal number)
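The heading-value method can be sketched in Python as follows. The function name `hex_to_denary` is a hypothetical helper; it multiplies each digit by its power of 16, starting from the right.

```python
def hex_to_denary(hex_string):
    """Multiply each hexadecimal digit by its heading value (a power of 16) and add."""
    total = 0
    heading = 1                                     # right-most heading is 16^0 = 1
    for digit in reversed(hex_string.upper()):
        value = "0123456789ABCDEF".index(digit)     # A-F become 10-15
        total += value * heading
        heading *= 16
    return total


print(hex_to_denary("A3C2"))   # 41922
print(hex_to_denary("7DE"))    # 2014
```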
Conversion 6
Denary/Decimal to Hexadecimal
Conversion from denary to hexadecimal involves successive division by 16 until the value
“0” is reached.
Example 1
Example 2
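Successive division by 16 can be sketched in Python as shown below. The function name `denary_to_hex` is a hypothetical helper, and the two test values are taken from the hexadecimal-to-denary examples earlier in these notes, run in the opposite direction.

```python
def denary_to_hex(number):
    """Convert a non-negative integer to hexadecimal by repeated division by 16."""
    digits = "0123456789ABCDEF"
    remainders = []
    while number > 0:
        number, remainder = divmod(number, 16)
        remainders.append(digits[remainder])        # remainders 10-15 become A-F
    # The remainders are produced right-most digit first, so reverse them.
    return "".join(reversed(remainders)) or "0"


print(denary_to_hex(2014))    # 7DE
print(denary_to_hex(41922))   # A3C2
```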
Advantages of Hexadecimal
• Compact representation: Hexadecimal numbers can represent a larger range of
values using fewer digits compared to decimal numbers. For example, two
hexadecimal digits can represent 256 different values (from 00 to FF), whereas two
decimal digits can only represent 100 values (from 00 to 99). This compact
representation is particularly useful in computer memory addressing and storage.
• Readability and human-friendly: Hexadecimal numbers are easier for humans to read
and write than long strings of binary digits. Because one hexadecimal digit stands for
four bits, a value such as 1011 1111 0000 1000 shrinks to BF08, which can be quickly
interpreted and communicated between humans, making it easier to analyze and
debug data or code.
• Conversions and operations: Hexadecimal numbers are straight-forward to convert
to binary and vice versa. Each hexadecimal digit corresponds directly to four binary
bits, making it convenient for bitwise operations and digital logic manipulations. This
property is especially advantageous in computer architecture, low-level
programming, and hardware design.
• Application in data science and machine learning: In data science, hexadecimal
numbers can be used to represent colors, where each color component (red, green,
blue) can be represented by two hexadecimal digits. In machine learning and artificial
intelligence, hexadecimal numbers can be used to encode and manipulate data
structures, network addresses, and other numerical representations.
Overall, the hexadecimal number system offers benefits such as compactness, memory
efficiency, easy conversion, readability, and usability in various domains, making it a
valuable tool in computer-related fields and beyond.
Uses of Hexadecimal
• Error codes
• MAC address
• IPv6 address
• HTML color codes
Error Codes
Error codes are often represented in hexadecimal format, particularly in computer systems
and software development. These codes provide information about specific errors that occur
during program execution or system operation. These numbers refer to the memory location
of the error and are usually generated automatically by the computer. The programmer
needs to know how to interpret the hexadecimal error codes.
Memory Dump
A memory dump refers to the procedure of displaying and storing the contents of computer
memory in the event of an application or system failure. It allows software developers and
system administrators to analyze, identify, and resolve issues that caused the failure of the
application or system.
In the realm of computing, a hex dump entails presenting computer data, whether it
originates from RAM, a file, or a storage device, in a hexadecimal format. Hex dumps are
frequently utilized as a debugging or reverse engineering technique to examine and interpret
data at a low-level. This involves viewing the data in a hexadecimal representation on a
screen or on paper.
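A hex dump can be sketched in a few lines of Python. The function name `hex_dump`, the 8-bytes-per-line layout, and the sample data are assumptions chosen for illustration; real tools use their own layouts.

```python
def hex_dump(data, width=8):
    """Return lines showing an offset, the bytes in hexadecimal, and printable ASCII."""
    lines = []
    pad = width * 3 - 1                  # room for `width` two-digit pairs plus spaces
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02X}" for byte in chunk)
        # Non-printing bytes are shown as a dot so the text column stays aligned.
        text_part = "".join(chr(byte) if 32 <= byte < 127 else "." for byte in chunk)
        lines.append(f"{offset:04X}  {hex_part:<{pad}}  {text_part}")
    return lines


for line in hex_dump(b"Data Representation\n"):
    print(line)
```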
Media Access Control (MAC) Address
In computer networking, a MAC address, short for Media Access Control address, is a
distinctive identifier assigned by the manufacturer to a network adapter or network interface
card (NIC). Unlike an IP address, the MAC address is assigned to the hardware and does not
normally change. This means that even if a device's IP address changes, it can still be
identified using its unique MAC address.
A MAC address is composed of six pairs of hexadecimal digits, which corresponds
to 48 bits of binary code used by the computer. The format of a MAC address is typically
represented as NN-NN-NN-DD-DD-DD, where the first half (NN-NN-NN) corresponds to the
manufacturer's identity number, and the second half (DD-DD-DD) represents the device's
serial number. For example, in the MAC address 00:A0:C9:14:C8:29, the prefix 00A0C9
indicates that the manufacturer of the device is Intel Corporation.
The MAC address serves as a unique identifier for a device on a network. It specifically
identifies the network interface controller (NIC) associated with the device. MAC addresses
are often displayed as a series of hexadecimal digits separated by colons.
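Splitting a MAC address into its two halves can be sketched in Python as follows. The function name `split_mac` is a hypothetical helper; it accepts either colons or hyphens as separators and also returns the 48-bit binary form.

```python
def split_mac(mac):
    """Return (manufacturer code, device code, 48-bit binary string) for a MAC address."""
    pairs = mac.replace("-", ":").upper().split(":")
    if len(pairs) != 6 or any(len(pair) != 2 for pair in pairs):
        raise ValueError("expected six pairs of hexadecimal digits")
    bits = "".join(format(int(pair, 16), "08b") for pair in pairs)   # 6 pairs x 8 bits
    return "".join(pairs[:3]), "".join(pairs[3:]), bits


manufacturer, device, bits = split_mac("00:A0:C9:14:C8:29")
print(manufacturer)   # 00A0C9
print(device)         # 14C829
print(len(bits))      # 48
```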
HyperText Markup Language (HTML) color codes
HTML color codes are frequently used to specify the colors displayed on web pages. All
colors can be formed by different combinations of the three primary colors: red,
green, and blue. A color code has the form #RRGGBB, where each pair of hexadecimal digits
(00 to FF) sets the intensity of red, green, or blue. In other
words, different hexadecimal values correspond to different colors. Here are some
examples:
#FF0000 represents the primary color red.
#00FF00 represents the primary color green.
#0000FF represents the primary color blue.
#FF00FF represents the color fuchsia.
#FF8000 represents the color orange.
#B18904 represents a tan color.
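Reading the red, green, and blue intensities out of a color code can be sketched in Python like this. The function name `parse_html_color` is a hypothetical helper, and it only handles the six-digit #RRGGBB form.

```python
def parse_html_color(code):
    """Return the (red, green, blue) intensities, each 0-255, from a #RRGGBB code."""
    digits = code.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected six hexadecimal digits")
    # Each pair of hexadecimal digits is one color component.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


print(parse_html_color("#FF0000"))   # (255, 0, 0)
print(parse_html_color("#FF8000"))   # (255, 128, 0)
print(parse_html_color("#B18904"))   # (177, 137, 4)
```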
IP Address
Every device that connects to a network is assigned an address called the Internet Protocol
(IP) address. IPv4 (Internet Protocol version 4) is a widely used IP version that serves the
purpose of identifying devices on a network using an addressing system. It utilizes a 32-bit
address scheme, which provides a total of 2^32 addresses, equivalent to more than 4 billion
addresses. However, due to the exponential growth of the internet and the increasing
number of devices requiring unique addresses, IPv4 addresses have become scarce. To
address this limitation, IPv6 (Internet Protocol version 6) has been introduced as the most
recent IP version.
IPv6 boasts a significantly larger address space compared to IPv4. It employs a 128-bit
address format, which allows for approximately 340 undecillion unique addresses. An IPv6
address is written as eight groups of four hexadecimal digits separated by colons. This vast
expansion in address space enables the allocation of addresses to an unprecedented
number of devices, ensuring that the growing demand for internet connectivity can be met.
By adopting IPv6, the internet infrastructure becomes capable of accommodating the
expanding network of devices and supporting the evolving requirements of the digital age.
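The two address-space sizes can be worked out directly, and Python's standard `ipaddress` module can show the hexadecimal layout of an IPv6 address. The helper name `address_space` and the sample address `2001:db8::1` (taken from the range reserved for documentation) are choices made for this sketch.

```python
import ipaddress


def address_space(bits):
    """Number of distinct addresses available with the given number of bits."""
    return 2 ** bits


print(address_space(32))     # 4294967296 (IPv4)
print(address_space(128))    # 340282366920938463463374607431768211456 (IPv6)

# An IPv6 address written out in full: eight groups of four hexadecimal digits.
example = ipaddress.IPv6Address("2001:db8::1")
print(example.exploded)      # 2001:0db8:0000:0000:0000:0000:0000:0001
```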
Addition of Binary
Example
Overflow Error
In computing, an overflow error can occur when a calculation is run but the computer is
unable to store the answer correctly. All computers have a predefined range of values they
can represent or store. Overflow errors occur when the execution of a set of instructions
returns a value outside of this range.
The greater the number of bits that can be used to represent a number, the larger
the number that can be stored. For example, a 16-bit register would allow a maximum denary
value of 65,535 to be stored, and a 32-bit register would allow a maximum denary value of
4,294,967,295.
Example
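An overflow in an 8-bit register can be imitated in Python, even though Python's own integers do not overflow. The function name `add_with_overflow` and the sample values are assumptions made for this sketch.

```python
def add_with_overflow(a, b, bits=8):
    """Add two unsigned integers in a register of the given width.

    Returns the value actually stored and a flag saying whether overflow occurred.
    """
    maximum = 2 ** bits - 1              # 255 for 8 bits, 65,535 for 16 bits
    total = a + b
    stored = total & maximum             # only the lowest `bits` bits fit in the register
    return stored, total > maximum


print(add_with_overflow(110, 74))        # (184, False)
print(add_with_overflow(200, 100))       # (44, True): 300 needs a ninth bit
print(2 ** 16 - 1, 2 ** 32 - 1)          # 65535 4294967295
```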
Logical Binary Shift
Binary numbers can be multiplied and divided by powers of two through a process called
shifting. There are two types of binary shift: arithmetic and logical.
Multiplication
To multiply a number, a binary shift moves all the digits in the binary number along to the
left and fills the gaps after the shift with 0:
• to multiply by two, all digits shift one place to the left
• to multiply by four, all digits shift two places to the left
• to multiply by eight, all digits shift three places to the left
Example
Division
To divide a number, a binary shift moves all the digits in the binary number along to the right
and fills the gaps after the shift with 0:
• to divide by two, all digits shift one place to the right
• to divide by four, all digits shift two places to the right
• to divide by eight, all digits shift three places to the right
Example
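Logical shifts in an 8-bit register can be sketched in Python as follows. The function names and sample values are illustrative assumptions. Note that bits shifted out of the register are lost, so the result is only a true multiplication or division when no 1s fall off the end.

```python
def logical_shift_left(value, places, bits=8):
    """Shift left, fill the right with 0s, and discard bits that leave the register."""
    return (value << places) & (2 ** bits - 1)


def logical_shift_right(value, places):
    """Shift right and fill the left with 0s; bits shifted out on the right are lost."""
    return value >> places


print(format(logical_shift_left(0b00010101, 2), "08b"))    # 01010100 (21 x 4 = 84)
print(format(logical_shift_right(0b01100000, 3), "08b"))   # 00001100 (96 / 8 = 12)
print(logical_shift_left(0b11001000, 1))                   # 144, not 400: a 1 was shifted out
```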
Two’s Complement
It is possible to represent negative numbers using the binary number system. To do this we
must make use of a system called 2s complement.
Sign bit
With 2s complement, the first thing that we should note is that the left-most bit (MSB) is
reserved as a sign bit. This means that the left-most bit does not add a positive column value
to the number; instead it signifies positive or negative. If the left-most bit is 1 then the
number is negative. If the left-most bit is 0 then the number is positive.
Converting positive binary into negative binary
Take the positive binary for 1, which (if using a 4-bit system) is 0001.
Step 1 - Change the sign bit to 1:
This would leave us with 1001
Step 2 - Flip the bits. This means changing the 1s to 0s and the 0s to 1s. This should
be applied to all bits except the sign bit: This would leave us with 1110
Step 3 - Add 1 to the binary number 1110:
This would leave us with 1111, and therefore -1 in binary 2s complement is 1111
Converting negative binary into positive binary
Converting a negative 2s complement binary number into a positive one can be achieved by
following these steps.
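Step 1 - Flip the sign bit from a 1 to a 0
Step 2 - Flip all of the remaining bits (every bit except the sign bit), i.e., 0 to 1 and 1 to 0
Step 3 - Add 1
For example, 1111 becomes 0111, then 0000, and finally 0001.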
The result will be the positive binary representation. Once you have this you can then
perform a binary to denary conversion should you require the denary value.
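A common shorthand for both conversions is "invert every bit, then add 1", which gives the same result as changing the sign bit and flipping the other bits separately. The sketch below uses that shorthand rather than the three separate steps; the function names `negate` and `to_denary` are hypothetical helpers.

```python
def negate(bits):
    """Switch a 2s complement number between positive and negative: invert all bits, add 1."""
    width = len(bits)
    inverted = "".join("1" if bit == "0" else "0" for bit in bits)
    value = (int(inverted, 2) + 1) % (2 ** width)   # keep the result within the same width
    return format(value, f"0{width}b")


def to_denary(bits):
    """Read a 2s complement bit string as a signed denary number."""
    value = int(bits, 2)
    if bits[0] == "1":                  # sign bit set: the number is negative
        value -= 2 ** len(bits)
    return value


print(negate("0001"))       # 1111
print(negate("1111"))       # 0001
print(to_denary("1111"))    # -1
print(to_denary("1101"))    # -3
```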
Method 1
Now let’s consider the number 67. One method of finding the binary equivalent to 67 is to
simply put 1s in their correct places:
Method 2
Character Set
A character set encompasses all the characters present in a particular encoding system,
along with their corresponding binary values. When a computer processes text, it converts
it into binary using a character set. A character set includes various types of characters,
such as letters (both uppercase and lowercase), numbers, symbols, and even non-printing
commands like Enter or Delete.
To represent text digitally, each character must have a unique bit-pattern assigned to it.
These bit-patterns consist of combinations of 1s and 0s that represent data within a
computer. Each character is associated with a numeric character code, which is derived
from its bit-pattern. For effective communication and seamless exchange of text between
computers, a standardized character set is essential. This standard defines which character
code is used to represent each character.
Two Types of Character Set:
1. ASCII
2. Unicode
ASCII
The ASCII (American Standard Code for Information Interchange) character set is a 7-bit
encoding system that allows for the representation of 128 different characters. This set
includes all uppercase and lowercase letters, digits, and common punctuation marks found
on most keyboards. ASCII is primarily used for the English language. It also includes
32 control codes (using codes 0 to 31 in decimal or 00 to 1F in hexadecimal) for special
functions.
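Character codes can be looked up in Python with the built-in `ord` function. The helper name `ascii_code` is an assumption of this sketch; it returns the code in denary, hexadecimal, and 7-bit binary.

```python
def ascii_code(character):
    """Return the ASCII code of a character as (denary, hexadecimal, 7-bit binary)."""
    code = ord(character)
    if code > 127:
        raise ValueError("not an ASCII character")
    return code, format(code, "02X"), format(code, "07b")


print(ascii_code("A"))    # (65, '41', '1000001')
print(ascii_code("a"))    # (97, '61', '1100001')
print(ascii_code("\n"))   # (10, '0A', '0001010'), one of the control codes
```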
To accommodate characters from non-English alphabets and include some graphical
characters, an extended version called Extended ASCII was introduced. Extended ASCII
uses 8-bit codes, providing an additional 128 codes (128 to 255 in decimal or 80 to FF in
hexadecimal) on top of the original 0 to 127. However, ASCII has certain limitations. One major disadvantage is
that it does not represent characters from non-Western languages, such as Chinese
characters. Additionally, different operating systems, like DOS and Windows, may use
different characters for some ASCII codes, leading to inconsistencies.
Over time, alternative coding methods have been developed to address these limitations.
Nonetheless, ASCII remains a widely recognized and used character set, particularly in
English-based computer systems.
Unicode
Unicode is a character encoding system that uses between 8 and 32 bits per character,
allowing it to represent characters from languages used worldwide. It is widely adopted
across the internet and supports multiple operating systems, search engines, and internet
browsers. Compared to ASCII, Unicode has a much larger character repertoire, making it
suitable for global communication.
Global companies like Facebook and Google would not rely solely on the ASCII character
set because their users communicate in diverse languages. Unicode, on the other hand, can
encompass all languages and writing systems. While the first 128 characters in Unicode
overlap with the standard ASCII code, Unicode has the capacity to represent more than a
million different characters in total.
The Unicode consortium was established in 1991 with the goal of creating a universal
standard that covers all languages and writing systems. Unicode employs a far larger
coding system than ASCII, using 16-bit or 32-bit encoding for each character. It ensures
unambiguous encoding, meaning that each 16-bit or 32-bit value consistently represents the
same character. Additionally, Unicode reserves a portion of the code for private use, allowing
users to assign codes for their own characters and symbols.
The limitation of ASCII is its inability to represent a wide range of characters found in various
languages, scripts, numbers, and symbols beyond the English alphabet. With the growing
prominence of the World Wide Web, the need for a universal international coding system
became crucial. Unicode emerged as the preferred choice, addressing the demand for a
comprehensive character set. Each Unicode character can be encoded using three different
encoding standards (UTF-8, UTF-16, and UTF-32), with the minimum number of bits used
depending on the encoding standard selected.
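The effect of the encoding standard on the number of bits can be seen with Python's built-in `str.encode`. The helper name `bits_used` and the four sample characters are assumptions of this sketch. The `-be` codec variants are used so that no byte-order mark is added to the count.

```python
def bits_used(character, encoding):
    """Number of bits needed to store one character in the given Unicode encoding."""
    return len(character.encode(encoding)) * 8


# A, e with an acute accent, the euro sign, and an emoji (written as escape codes)
for character in ("A", "\u00e9", "\u20ac", "\U0001F600"):
    sizes = {name: bits_used(character, name)
             for name in ("utf-8", "utf-16-be", "utf-32-be")}
    print(f"U+{ord(character):04X}", sizes)
```

The letter A takes 8 bits in UTF-8 (the same code as ASCII), 16 bits in UTF-16, and 32 bits in UTF-32, while the emoji needs 32 bits in all three.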
Representation of Sound
Soundwaves are vibrations in the air that are perceived by the human ear as sound. Each
sound wave has three characteristics: frequency, wavelength, and amplitude. The amplitude
determines the loudness of the sound.
Properties of Soundwaves
Soundwaves are continuous and analog in nature, which means they vary smoothly.
However, computers can only work with digital data. Therefore, soundwaves need to be
sampled, which involves measuring the amplitude of the sound wave at regular time
intervals. This is done using an analog-to-digital converter (ADC).
Sampling Soundwaves
Sampling is the process of converting analog sound waves into digital data that can be stored
and processed by a computer. By sampling the sound wave at specific time intervals, an
approximate representation of the sound is obtained. The accuracy of the sampled sound
depends on the range of amplitudes used to represent the sound. The number of bits used
to represent each amplitude value is called the sampling resolution or bit depth.
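Sampling can be imitated in Python by measuring a sine wave at regular intervals and rounding each measurement to the nearest available level. This is only a sketch under assumptions: the function name `sample_wave`, the use of a pure sine wave, and the rounding rule are illustrative choices, not how any particular ADC works.

```python
import math


def sample_wave(frequency_hz, sample_rate_hz, duration_s, bit_depth):
    """Sample a sine wave and store each amplitude as an integer level."""
    levels = 2 ** bit_depth                  # e.g. 16 levels for a 4-bit resolution
    samples = []
    for n in range(int(sample_rate_hz * duration_s)):
        t = n / sample_rate_hz               # time of this sample in seconds
        amplitude = math.sin(2 * math.pi * frequency_hz * t)    # between -1 and 1
        # Map the amplitude onto the range 0 .. levels - 1 and round to a whole level.
        samples.append(round((amplitude + 1) / 2 * (levels - 1)))
    return samples


print(sample_wave(frequency_hz=1, sample_rate_hz=8, duration_s=1, bit_depth=4))
```

Raising `bit_depth` gives more levels to choose from, so each stored value sits closer to the true amplitude.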
Sample Rate
The sample rate refers to the number of sound samples taken per second. It is measured in
hertz (Hz), where 1Hz represents one sample per second. A higher sample rate results in a
more faithful representation of the original sound source. However, a higher sample rate
also leads to larger file sizes. Therefore, there is often a trade-off between sound quality and
file size, with commonly used sample rates being around 44.1 kHz for audio files.
The benefits and drawbacks of using a larger sampling resolution when recording sound
Benefits                 Drawbacks
Larger dynamic range     Produces larger file size
Better sound quality     Takes longer to transmit/download music files
Less sound distortion    Requires greater processing power
Representation of images
Bitmap images are made up of pixels, which are small dots that form the image. These
images are stored in computers as a series of binary numbers. Each pixel can be
represented by a binary number, allowing for different colors or shades in the image.
Representation of pixels in bitmap images
The representation of pixels in a bitmap image depends on the color depth. For example, a
black and white image only requires 1 bit per pixel, where 1 represents black and 0
represents white. If each pixel is represented by 2 bits, it can have four colors (00, 01, 10,
11). Similarly, with 3 bits per pixel, there can be eight colors.
Color Depth in Bitmap Images
The color depth refers to the number of bits used to represent the color of each pixel. An 8-bit
color depth allows for 256 colors (2^8 = 256), while modern computers typically have a 24-bit
color depth, enabling over 16 million colors. Increasing the color depth increases the
range of colors that can be represented but also increases the file size.
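The link between color depth and the number of available colors is a power of 2, which a one-line Python function can show. The function name `number_of_colors` is a hypothetical helper.

```python
def number_of_colors(color_depth_bits):
    """Each extra bit doubles the number of colors a pixel can take."""
    return 2 ** color_depth_bits


for depth in (1, 2, 3, 8, 24):
    print(depth, "bits:", number_of_colors(depth), "colors")
```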
Image Resolution and Quality
Image resolution refers to the number of pixels that make up an image. Higher resolution
images have more pixels and therefore more detail. Lower resolution images have fewer
pixels and less detail. The resolution of an image can be adjusted, but reducing the
resolution decreases the image quality and can result in pixelation or fuzziness.
The relationship between resolution, color depth, and image quality impacts the file size of
an image. Increasing the resolution or color depth results in larger file sizes. This affects the
storage capacity of devices, download times from the internet, and transfer times between
devices. However, reducing the resolution can be done to decrease file size with some loss
of quality.
Data Measuring units
Unit             Description
Bit              This is the smallest measurement for data
Nibble           There are 4 bits in a nibble
Byte             There are 8 bits in a byte
Kibibyte (KiB)   There are 1024 bytes in a Kibibyte
Mebibyte (MiB)   There are 1024 Kibibytes in a Mebibyte
Gibibyte (GiB)   There are 1024 Mebibytes in a Gibibyte
Tebibyte (TiB)   There are 1024 Gibibytes in a Tebibyte
Pebibyte (PiB)   There are 1024 Tebibytes in a Pebibyte
When calculating the size of a sound file, remember that a stereo recording has two channels,
so for a stereo sound file you would then multiply the result by two.
Example 1
A photograph is 1024 x 1080 pixels and uses a color depth of 32 bits. How many
photographs of this size would fit onto a memory stick of 64 GiB?
1. Multiply number of pixels in vertical and horizontal directions to find total number
of pixels = (1024 x 1080) = 1 105 920 pixels
2. Now multiply number of pixels by color depth, then divide by 8 to give the number
of bytes = 1 105 920 x 32 = 35 389 440 bits; 35 389 440 / 8 = 4 423 680 bytes
3. 64 GiB = 64 x 1024 x 1024 x 1024 = 68 719 476 736 bytes
4. Finally divide the memory stick size by the file size = 68 719 476 736 / 4 423 680 = 15 534
photos (rounded down).
Example 2
A camera detector has an array of 2048 by 2048 pixels and uses a color depth of 16 bits. Find
the size of an image taken by this camera in MiB.
1. Multiply number of pixels in vertical and horizontal directions to find total number
of pixels = (2048 x 2048) = 4 194 304 pixels
2. Now multiply number of pixels by color depth = 4 194 304 x 16 = 67 108 864 bits
3. Now divide number of bits by 8 to find the number of bytes in the file
= 67 108 864 / 8 = 8 388 608 bytes
4. Now divide by 1024 x 1024 to convert to MiB = 8 388 608 / 1 048 576 = 8 MiB.
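Both worked examples can be checked with a short Python sketch. The function name `image_size_bytes` is a hypothetical helper; it applies the same steps (pixels x color depth, then divide by 8).

```python
def image_size_bytes(width, height, color_depth_bits):
    """File size of an uncompressed bitmap image in bytes."""
    total_bits = width * height * color_depth_bits
    return total_bits // 8                         # 8 bits in a byte


# Example 1: how many 1024 x 1080, 32-bit photographs fit on a 64 GiB memory stick?
photo = image_size_bytes(1024, 1080, 32)
stick = 64 * 1024 ** 3
print(photo)             # 4423680
print(stick // photo)    # 15534

# Example 2: size in MiB of a 2048 x 2048 image with a 16-bit color depth
print(image_size_bytes(2048, 2048, 16) / 1024 ** 2)   # 8.0
```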
Data Compression
Benefits of Compression
• Not as much storage space is needed to store the file
• It will take less time to transmit the file from one device to another
• It will be quicker to upload and download the file
• Not as much bandwidth is needed to transmit the file over the internet
Types of Compression
1. Lossy Compression
2. Lossless Compression
Lossy Compression
Lossy compression is a data compression technique that aims to reduce the file size of data,
such as images or sound, by permanently removing unnecessary or redundant information.
It achieves higher compression ratios compared to lossless compression, but it results in
some loss of data or quality. Lossy compression is commonly used in applications where
the minor loss of quality is acceptable, such as multimedia files.
Lossy Compression in Sound
In sound compression, the goal is to remove data that is imperceptible to the human ear.
This can include sounds that fall outside the audible frequency range or softer sounds that
are masked by louder sounds. Techniques like perceptual audio coding are employed to
identify and remove such imperceptible sounds. The popular MP3 format uses lossy
compression to reduce the file size of audio files by removing redundant or less audible
information. Reducing the sample rate and sample resolution also helps in reducing file size
while compromising some audio quality.
Lossy Compression in Images
In image compression, unnecessary or less distinguishable data is removed to reduce the
file size. This can involve reducing the color depth or reducing the image resolution. The
color depth refers to the number of bits used to represent each color in an image, and
reducing it decreases the number of colors available, resulting in a smaller file size. Image
resolution refers to the number of pixels in an image, and reducing it decreases the level of
detail in the image. The JPEG format is a widely used lossy compression algorithm for
bitmap images, which achieves significant file size reduction while maintaining acceptable
image quality.
Loss of Detail and Quality
Due to the removal of data in lossy compression, there is a loss of detail and quality
compared to the original file. However, the algorithms used in lossy compression aim to
remove data that is less noticeable to human perception. The amount of loss depends on
the compression settings and the specific algorithm used. In some cases, the loss may be
unnoticeable to most users, while in other cases, there may be noticeable degradation in the
compressed file, especially if excessive compression is applied.
Lossless Compression
In lossless compression, the original uncompressed file can be completely reconstructed
without any loss of data. This is important in situations where preserving every detail of the
file is crucial, such as transferring complex spreadsheets or downloading computer
applications.
One example of lossless compression is run-length encoding (RLE). RLE is a technique that
reduces the size of a string of adjacent, identical data. It works by encoding a repeating
string into two values: the first value represents the number of identical data items in the run,
and the second value represents the code of the data item (e.g., ASCII code for keyboard
characters). RLE is effective when there is a long run of repeated units or bits in the data.
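Run-length encoding of a text string can be sketched in Python as follows. The function names `rle_encode` and `rle_decode` and the choice to store each run as a (count, character) pair are assumptions of this sketch; real file formats pack the two values in their own ways.

```python
def rle_encode(text):
    """Encode a string as a list of (run length, character) pairs."""
    runs = []
    for character in text:
        if runs and runs[-1][1] == character:
            runs[-1][0] += 1                    # same character again: extend the run
        else:
            runs.append([1, character])         # different character: start a new run
    return [(count, character) for count, character in runs]


def rle_decode(runs):
    """Rebuild the original string exactly, showing that the method is lossless."""
    return "".join(character * count for count, character in runs)


encoded = rle_encode("aaaaabbbbccd")
print(encoded)                 # [(5, 'a'), (4, 'b'), (2, 'c'), (1, 'd')]
print(rle_decode(encoded))     # aaaaabbbbccd
```

With no repeated characters, every run has length 1 and the encoded form is larger than the original, which is why RLE suits data with long runs.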
It's important to note that while lossless compression techniques preserve all the original
data, they may not achieve the same level of compression as lossy compression techniques.
Lossless compression is typically used in situations where maintaining data integrity is a
priority, even if it means larger file sizes compared to lossy compression.
Key terms
Bit - the basic computing element that is either 0 or 1; the name is formed from the words
binary digit
Binary number system - a number system based on 2, which can only use the values 0 and 1
Hexadecimal number system - a number system based on the value 16, which uses denary
digits 0 to 9 and letters A to F
Error code - an error message generated by the computer
MAC address - standing for Media Access Control, this address (given in hexadecimal)
uniquely identifies a device on a network; it takes the form NN-NN-NN-DD-DD-DD,
where NN-NN-NN is the manufacturer code and DD-DD-DD is the device code
IP address - Internet Protocol address, identified as either IPv4 or IPv6; it gives a unique
address to each device connected to a network, identifying its location
HTML (Hypertext Mark-up Language) - used in the design of web pages; in the context of
this chapter, colors used in web pages are assigned a hexadecimal code based on red, green
and blue colors
Overflow error - the result of carrying out a calculation that produces a value that is too
large for the computer's allocated word size (8-bit, 16-bit, 32-bit, and so on)
Logical shift - an operation that shifts bits to the left or right in a register; any bits shifted
out of a register (left or right) are replaced with zeroes
Two’s complement - a method of representing negative numbers in binary; when applied
to an 8-bit system, the left-most bit (most significant bit) is given the value -128
ASCII code - a character set for all the characters on a standard keyboard and control codes
Character set - a list of characters that have been defined by computer hardware and
software. The character set is necessary so that the computer can understand human
characters
Unicode - a character set which represents all the languages of the world (the first 128
characters are the same as ASCII code)
Sampling resolution - the number of bits used to represent sound amplitude in digital sound
recording (also known as bit depth)
Bit depth - the number of bits used to represent the smallest unit in a sound file
Color depth - the number of bits used to represent the colors of a pixel
Sampling rate - the number of sound samples taken per second in digital sound recording
Pixel - derived from the term ‘picture element’, this is the smallest element used to make up
an image on a display
Image resolution - the number of pixels in the X-Y direction of an
image, for example, 4096 x 3192 pixels
Pixelated (image) - the result of zooming into a bitmap image until the pixel density is
diminished to such a degree that the actual pixels themselves can be seen
Pixel density - number of pixels per square inch
Compression - reduction of the size of a file by removing repeated or redundant pieces of
data; this can be lossy or lossless
Bandwidth - the maximum rate of transfer of data across
a network, measured in kilobits per second (kbps) or megabits per second (Mbps)
Lossy (file compression) - a file compression method in which some of the original data is
removed and cannot be restored during the decompression process, for example, JPEG, MP3
Lossless (file compression) - a file compression method that allows the original file to be
fully restored during the decompression process, for example, run length encoding (RLE)
Audio compression - a method used to reduce the size of a sound file using perceptual
music shaping
MP3 - a lossy file compression method used for music files
MP4 - a lossy file compression method used for multimedia files
JPEG - from Joint Photographic Experts Group; a form of lossy file compression used with
image files which relies on the inability of the human eye to distinguish certain color
changes and hues
Run length encoding (RLE) - a lossless file compression technique used to reduce the size
of text and photo files in particular