Data Representation
Introduction
Data representation is a key concept in computer science. It refers to the way data is
stored, processed, and transmitted in a computer system. At the IGCSE level, students are expected
to understand how computers represent different types of data (e.g., numbers, characters, images)
using binary, hexadecimal, ASCII, Unicode, and various encoding techniques.
1. Binary System
Binary (Base-2) is the fundamental number system used by computers. All data in a computer is
ultimately represented in binary (1s and 0s).
Conversions
• Binary to Decimal:
To convert a binary number to decimal, multiply each bit by 2 raised to the power of its
position (starting from the right with 0).
Example:
Convert 1011 to decimal.
(1 × 2^3) + (0 × 2^2) + (1 × 2^1) + (1 × 2^0) = 8 + 0 + 2 + 1 = 11
• Decimal to Binary:
To convert a decimal number to binary, divide the number by 2 and keep track of the
remainders.
Example:
Convert 13 to binary.
13 ÷ 2 = 6 remainder 1
6 ÷ 2 = 3 remainder 0
3 ÷ 2 = 1 remainder 1
1 ÷ 2 = 0 remainder 1
Reading the remainders from bottom to top gives 1101.
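The two conversions above can be sketched in Python. This is only an illustration of the steps described, not a required method; the function names binary_to_decimal and decimal_to_binary are hypothetical.

```python
def binary_to_decimal(bits: str) -> int:
    """Add up each bit multiplied by 2 to the power of its position (rightmost = 0)."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** position
    return total


def decimal_to_binary(number: int) -> str:
    """Divide by 2 repeatedly and read the remainders from last to first."""
    if number == 0:
        return "0"
    remainders = []
    while number > 0:
        remainders.append(str(number % 2))  # remainder is the next bit
        number //= 2                        # quotient carries on to the next step
    return "".join(reversed(remainders))


print(binary_to_decimal("1011"))  # 11
print(decimal_to_binary(13))      # 1101
```

Python's built-in int("1011", 2) and bin(13) perform the same conversions; the longer versions are shown so that each step matches the method described above.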
2. Hexadecimal System
Hexadecimal (Base-16) is a number system used to represent large binary numbers more compactly.
It uses the digits 0-9 and the letters A-F to represent values from 0 to 15.
• Hexadecimal to Binary:
Each hexadecimal digit can be converted to 4 bits.
Example:
Convert B2 to binary:
o B = 1011 (binary)
o 2 = 0010 (binary)
So B2 in hexadecimal is 1011 0010 in binary.
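The digit-by-digit conversion can be sketched as follows. The function name hex_to_binary is hypothetical, and the space between the 4-bit groups is only there to make the grouping visible.

```python
def hex_to_binary(hex_string: str) -> str:
    """Convert each hexadecimal digit to its own 4-bit binary group."""
    groups = []
    for digit in hex_string:
        value = int(digit, 16)               # e.g. 'B' -> 11, '2' -> 2
        groups.append(format(value, "04b"))  # pad to exactly 4 bits
    return " ".join(groups)


print(hex_to_binary("B2"))  # 1011 0010
```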
3. ASCII
ASCII is a 7-bit character encoding standard used to represent text in computers and other devices.
• Range: ASCII uses 7 bits to represent characters, giving 128 possible values (from 0 to 127).
• Characters: ASCII includes control characters (like newline, tab) and printable characters
(letters, digits, punctuation marks).
Examples:
o 'A' = 65 (1000001 in binary)
o 'a' = 97 (1100001 in binary)
o '0' = 48 (0110000 in binary)
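A short sketch for looking up ASCII codes in Python, using the built-in ord function. The helper name ascii_code_and_bits is hypothetical, and the characters chosen are just sample values.

```python
def ascii_code_and_bits(character: str) -> tuple:
    """Return the ASCII code of a character and its 7-bit binary pattern."""
    code = ord(character)
    if code > 127:
        raise ValueError("not an ASCII character")
    return code, format(code, "07b")


for character in ["A", "a", "0", " "]:
    print(repr(character), *ascii_code_and_bits(character))
# 'A' 65 1000001
# 'a' 97 1100001
# '0' 48 0110000
# ' ' 32 0100000
```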
4. Unicode
Unicode is a more comprehensive character encoding system than ASCII, designed to cover all the
world's writing systems and many special characters.
• Wide Range: Unicode supports over 137,000 characters, allowing the representation of
virtually all characters used in modern languages.
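• Encoding Forms: Unicode can be implemented using different encoding schemes, such as:
o UTF-8: uses 1 to 4 bytes per character and is backward compatible with ASCII.
o UTF-16: uses 2 or 4 bytes per character.
o UTF-32: uses a fixed 4 bytes per character.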
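To see how the choice of encoding scheme affects storage, the sketch below counts the bytes needed for a few characters. It assumes the three common Unicode encoding forms (UTF-8, UTF-16 and UTF-32); the function name encoded_sizes is hypothetical, and the big-endian codec variants are used only so that no byte-order mark is added to the count.

```python
def encoded_sizes(text: str) -> dict:
    """Return the number of bytes the text occupies in each encoding form."""
    return {
        "UTF-8": len(text.encode("utf-8")),
        "UTF-16": len(text.encode("utf-16-be")),
        "UTF-32": len(text.encode("utf-32-be")),
    }


for character in ["A", "é", "€", "😀"]:
    print(character, encoded_sizes(character))
# A {'UTF-8': 1, 'UTF-16': 2, 'UTF-32': 4}
# é {'UTF-8': 2, 'UTF-16': 2, 'UTF-32': 4}
# € {'UTF-8': 3, 'UTF-16': 2, 'UTF-32': 4}
# 😀 {'UTF-8': 4, 'UTF-16': 4, 'UTF-32': 4}
```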
• Byte: 8 bits. A byte can represent a character, such as a letter or number, in ASCII.
The newer convention keeps the older unit names (such as kilobyte), but adopts the mathematical
meaning of the prefix kilo as one thousand, hence 1 kilobyte = 1000 bytes.
6. Floating-Point Representation
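Floating-point numbers are used to represent real numbers (decimals) in computers. They consist of
three parts:
• Sign: a single bit showing whether the number is positive (0) or negative (1).
• Exponent: the power of 2 by which the mantissa is scaled.
• Mantissa: the significant digits of the number.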
Example: Representing the number -6.75 in IEEE 754 format involves breaking the number into
these components, converting them to binary, and following the IEEE 754 encoding scheme.
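The example can be checked with Python's standard struct module. This sketch assumes the 32-bit single-precision form of IEEE 754 (1 sign bit, 8 exponent bits, 23 mantissa bits), which the text above does not specify; the function name float32_fields is hypothetical.

```python
import struct


def float32_fields(value: float) -> tuple:
    """Split a number into the sign, exponent and mantissa bits of a 32-bit float."""
    packed = struct.pack(">f", value)                     # 4 bytes, big-endian
    bits = format(int.from_bytes(packed, "big"), "032b")  # 32-character bit string
    return bits[0], bits[1:9], bits[9:]


sign, exponent, mantissa = float32_fields(-6.75)
print(sign, exponent, mantissa)
# 1 10000001 10110000000000000000000
```

Here -6.75 is -110.11 in binary, or -1.1011 × 2^2. The sign bit is 1 (negative), the exponent is stored as 2 + 127 = 129 (10000001), and the mantissa holds the digits after the leading 1 (1011 followed by zeros).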
7. Data Compression
Data compression reduces the size of data to save space or transmission time. There are two types
of compression:
• Lossless Compression: The original data can be perfectly reconstructed after decompression
(e.g., ZIP, PNG).
• Lossy Compression: Some data is lost during compression, which may lead to a decrease in
quality (e.g., JPEG, MP3).
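Lossless compression can be demonstrated with Python's standard zlib module, which implements the DEFLATE method used inside ZIP and PNG files. This is a sketch only: the function name compress_round_trip and the sample data are hypothetical, and the amount of compression achieved depends on how repetitive the data is.

```python
import zlib


def compress_round_trip(data: bytes) -> tuple:
    """Compress the data, decompress it again, and report sizes and whether it matches."""
    compressed = zlib.compress(data)
    restored = zlib.decompress(compressed)
    return len(data), len(compressed), restored == data


sample = b"AAAAAAAAAABBBBBBBBBB" * 50  # highly repetitive, so it compresses well
original_size, compressed_size, identical = compress_round_trip(sample)
print(original_size, compressed_size, identical)
```

The final value printed is True because the decompressed data is identical to the original, which is what makes the method lossless. A lossy method such as JPEG could not pass this check.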
8. Error Detection and Correction
In data transmission, errors can occur due to noise or interference. Error detection and correction
methods are used to ensure data integrity.
• Parity Bits: An extra bit added to a byte to make the number of 1s either even (even parity)
or odd (odd parity). It helps in detecting single-bit errors.
• Checksums: A value calculated from a data block and sent along with the data. The receiver
recalculates the checksum and compares it to detect errors.
• Hamming Code: An error-correcting code that can detect and correct single-bit errors in
data.
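The parity and checksum ideas above can be sketched as follows. The function names are hypothetical, the example uses 7 data bits plus 1 parity bit, and the checksum shown (the sum of all bytes modulo 256) is just one simple scheme chosen for illustration; real systems may use different calculations.

```python
def add_even_parity(data_bits: str) -> str:
    """Append a parity bit so that the total number of 1s is even."""
    parity = data_bits.count("1") % 2  # 1 only if the count of 1s is currently odd
    return data_bits + str(parity)


def check_even_parity(received_bits: str) -> bool:
    """Return True if the received bits still contain an even number of 1s."""
    return received_bits.count("1") % 2 == 0


def simple_checksum(data: bytes) -> int:
    """Assumed scheme for illustration: add up all byte values, keep the result modulo 256."""
    return sum(data) % 256


sent = add_even_parity("1011001")
print(sent)                           # 10110010
print(check_even_parity(sent))        # True
print(check_even_parity("10110011"))  # False (one bit was flipped)
print(simple_checksum(b"AB"))         # 131
```

Note that a parity check cannot detect an error in which two bits are flipped, because the number of 1s is even again.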
9. Representation of Images and Sound
• Images: Images are stored as pixels. Each pixel can be represented by a combination of
binary values (often using the RGB color model).
o RGB: Each pixel is represented by three values: Red, Green, and Blue. Each color is
typically represented by 8 bits (1 byte), allowing 256 different values per color.
• Sound: Sound is represented as a waveform. The sound is sampled at regular intervals, and
each sample is stored as a binary number.
o Sampling Rate: The number of samples taken per second (measured in Hz).
o Bit Depth: The number of bits used to represent each sample (higher bit depth
means better sound quality).
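The figures above can be turned into rough file-size estimates. The formulas used here (width × height × bits per pixel for images, and sampling rate × bit depth × channels × duration for sound) are standard assumptions for uncompressed data and are not stated in the text; the function names and sample values are hypothetical.

```python
def image_size_bytes(width: int, height: int, bits_per_pixel: int = 24) -> int:
    """Uncompressed image size: one value of bits_per_pixel bits for every pixel."""
    return width * height * bits_per_pixel // 8


def sound_size_bytes(sample_rate_hz: int, bit_depth: int, seconds: int, channels: int = 1) -> int:
    """Uncompressed sound size: one sample of bit_depth bits per channel per sampling interval."""
    return sample_rate_hz * bit_depth * channels * seconds // 8


print(2 ** 24)                                        # 16777216 colors with 8 bits each for R, G, B
print(image_size_bytes(1920, 1080))                   # 6220800
print(sound_size_bytes(44100, 16, 60, channels=2))    # 10584000
```

Raising the sampling rate or bit depth improves quality, but the estimates show that it also increases the amount of data to store.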
Summary
• ASCII and Unicode are encoding standards for representing characters and text.
• Data compression is used to reduce the size of files, with lossless and lossy methods
available.
• Images and sound are represented using binary values based on pixels and samples,
respectively.
Key Terms
• Bit
• Byte
• Binary
• Hexadecimal
• ASCII
• Unicode
• Floating-point
• Data Compression
• Sampling
• Bit Depth
• Parity Bit