
MULTIMEDIA UNIVERSITY OF KENYA (MMU)

Course Unit Code: ICS 2112/ICS 2101


Course Unit Name: Computer Organisation
Instructor: Albert Opondo

LECTURE: Number Systems and Data Representation

INTRODUCTION TO DATA REPRESENTATION

What is Data Representation?

- Definition: Data representation refers to the methods used to store, process,
and transmit information within a computer system. All types of data
(numbers, characters, images, audio, etc.) are ultimately converted into a
format that computers can understand, which is binary (0s and 1s).
- Purpose: Computers operate using electrical signals, and the presence or
absence of voltage (usually represented by 0 and 1) is how they process
information.
Understanding how data is represented is essential for working with computers at the
hardware, software, and application levels.
- Types of Data Representation:
- Numeric Data: Numbers are converted into binary form (the system of 0s and
1s).
- Text Data: Characters are represented using encoding systems like ASCII
(American Standard Code for Information Interchange) or Unicode.
- Multimedia Data: Images, audio, and video are also converted into binary,
often compressed using specific algorithms.

What is Number System?

A number system is a way to represent and express numbers using a consistent set of
symbols and rules. It defines how numbers are represented, manipulated, and understood.
Number systems are foundational in mathematics and computer science, as they are used to
perform calculations, represent data, and express mathematical concepts. Here are some key
components and types of number systems:

Components of a Number System

1. Base: The base (or radix) of a number system determines how many unique digits or
symbols are used to represent numbers. For example:
- In the decimal system (base-10), the digits are 0-9.
- In the binary system (base-2), the digits are 0 and 1.

- In the octal system (base-8), the digits are 0-7.


- In the hexadecimal system (base-16), the digits are 0-9 and A-F.
2. Digits: The symbols used to represent numbers in a particular base. Each position in a
number has a value that is a power of the base.
3. Place Value/ Positional Value: The value of a digit is determined by its position in
the number. For example, in the decimal number 345, the '3' represents 300 (3 × 10^2),
the '4' represents 40 (4 × 10^1), and the '5' represents 5 (5 × 10^0).

Types of Number Systems

1. Decimal (Base-10):
- The most common number system used in everyday life.
- Uses ten digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.
- Example: The number 256 in decimal represents 2×10^2 + 5×10^1 + 6×10^0
2. Binary (Base-2):
- Used primarily in computer systems and digital electronics.
- Uses two digits: 0 and 1.
- Example: The binary number 1011 represents 1×2^3 + 0×2^2 + 1×2^1 + 1×2^0 = 11 in
decimal.
3. Octal (Base-8):
- Uses eight digits: 0, 1, 2, 3, 4, 5, 6, 7.
- Example: The octal number 25 represents 2×8^1 + 5×8^0 = 21 in decimal.
4. Hexadecimal (Base-16):
- Commonly used in computing and programming.
- Uses sixteen symbols: 0-9 and A (10), B (11), C (12), D (13), E (14), F (15).
- Example: The hexadecimal number 1A represents 1×16^1 + 10×16^0 = 26 in
decimal.
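
A minimal Python sketch (an addition to these notes, not part of the original text) that checks the
three worked examples above using the built-in int(text, base), which interprets a string of digits
in the given base:

# Verify the worked examples using Python's built-in base conversion.
examples = [
    ("1011", 2, 11),   # binary 1011    -> 11 in decimal
    ("25", 8, 21),     # octal 25       -> 21 in decimal
    ("1A", 16, 26),    # hexadecimal 1A -> 26 in decimal
]
for text, base, expected in examples:
    value = int(text, base)   # interpret the digits in the given base
    print(f"{text} (base {base}) = {value}")
    assert value == expected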

Importance of Number Systems

- Computational Representation: Different number systems are used in
computing to represent data efficiently. For example, binary is used in
computer memory and processing, while hexadecimal is often used in
programming and debugging.
- Mathematical Operations: Number systems provide the framework for
performing arithmetic operations. Understanding how to convert between
systems is essential for various applications in mathematics and computer
science.
- Data Representation: Various fields, such as cryptography, graphics, and
digital communications, rely on different number systems for data encoding
and transmission.

In summary, a number system is a structured way to represent numbers using a defined set of
symbols and rules. Understanding number systems is crucial for mathematics, computer
science, and various applied fields.
Positional Value System

The positional value system (or place value system) is a method of representing numbers in
which the position of each digit in a number determines its value. This system is fundamental
in many number systems, including the decimal (base-10), binary (base-2), octal (base-8),
and hexadecimal (base-16) systems.

Key Concepts of the Positional Value System

1. Base or Radix:

- The base of a number system indicates how many unique digits or symbols are
used to represent numbers.
- For example, in the decimal system (base-10), the digits are 0 to 9, while in
the binary system (base-2), the digits are 0 and 1.
2. Place Value:
- Each digit in a number has a place value that depends on its position
within the number. The value of each position is a power of the base.
- For instance, in the decimal number 456, the place values are:
- 4×10^2 (hundreds place)
- 5×10^1 (tens place)
- 6×10^0 (ones place)
- The total value is calculated by summing these values: 400+50+6=456
3. Counting System:
- The positional value system allows for compact representation of large
numbers, making it easier to perform arithmetic operations.
- For example, instead of writing the number "one thousand" as "1000," it can
be represented as 10^3 in the positional value system.

Examples of Positional Value Systems

1. Decimal System (Base-10):


- Uses ten digits (0-9).
- Example: The number 742 can be expressed as: 
7×10^2 + 4×10^1 + 2×10^0 = 700 + 40 + 2 = 742.
2. Binary System (Base-2):
- Uses two digits (0 and 1).
- Example: The binary number 1101 can be expressed as:
 1×23+1×22+0×21+1×20=8+4+0+1=13 in decimal.
3. Octal System (Base-8):
- Uses eight digits (0-7).
- Example: The octal number 27 can be expressed as: 
2×8^1 + 7×8^0 = 16 + 7 = 23 in decimal.
4. Hexadecimal System (Base-16):
- Uses sixteen symbols (0-9 and A-F).
- Example: The hexadecimal number 2F can be expressed as: 
2×16^1 + 15×16^0 = 32 + 15 = 47 in decimal.
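
As an illustration of the place-value calculations above, here is a short Python sketch (added to
these notes, not part of the original text) that expands a numeral digit by digit as
digit × base^position:

# Sum digit * base**position, mirroring the place-value expansions above.
DIGITS = "0123456789ABCDEF"

def positional_value(numeral: str, base: int) -> int:
    total = 0
    for position, symbol in enumerate(reversed(numeral.upper())):
        total += DIGITS.index(symbol) * base ** position
    return total

print(positional_value("742", 10))   # 700 + 40 + 2 = 742
print(positional_value("1101", 2))   # 8 + 4 + 0 + 1 = 13
print(positional_value("27", 8))     # 16 + 7 = 23
print(positional_value("2F", 16))    # 32 + 15 = 47
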
Importance of the Positional Value System

- Simplicity: It simplifies the representation of large numbers and allows for
easier arithmetic calculations.
- Efficiency: Positional notation reduces the amount of space required to write
numbers compared to non-positional systems, such as tally marks.
- Universality: The system is used across various cultures and languages,
making it a standard way to represent numbers in mathematics, computing,
and everyday life.

In conclusion, the positional value system is a fundamental concept in mathematics that
assigns value to digits based on their position within a number. This system is essential for
understanding arithmetic operations, number representation, and various applications in
science and engineering.

NUMBER SYSTEMS AND DATA REPRESENTATION

Understanding number systems and data representation is fundamental in computer science
and programming. Different number systems are utilized to represent data in a format that
computers can process and understand. This lecture will cover the following topics:

1. Number Systems
- Decimal Number System
- Binary Number System
- Octal Number System
- Hexadecimal Number System
2. Conversions between Number Systems
- Converting Decimal to Binary, Octal, and Hexadecimal
- Converting Binary, Octal, and Hexadecimal to Decimal
- Conversions between Binary, Octal, and Hexadecimal
- Binary Arithmetic
3. Binary Data Representation
- Fixed-Point and Floating-Point Representation
- Binary Coding Schemes

NUMBER SYSTEMS

Decimal Number System (Base-10)

- The decimal system is the most commonly used number system in everyday life,
consisting of ten digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9.
- Each position in a decimal number represents a power of 10.
- For example, the number 253 can be expressed as: 2×10^2 + 5×10^1 + 3×10^0 = 253

Binary Number System (Base-2)

- The binary system uses only two digits: 0 and 1.


- Each position in a binary number represents a power of 2.
- For example, the binary number 1101 can be expressed as: 1×2^3 + 1×2^2 + 0×2^1 + 1×2^0 = 13

Octal Number System (Base-8)

- The octal system uses eight digits: 0, 1, 2, 3, 4, 5, 6, and 7.


- Each position represents a power of 8.
- For example, the octal number 157 can be expressed as: 1×8^2 + 5×8^1 + 7×8^0 = 111

Hexadecimal Number System (Base-16)

- The hexadecimal system uses sixteen symbols: 0-9 and A-F (where A=10, B=11,
C=12, D=13, E=14, F=15).

- Each position represents a power of 16.


- For example, the hexadecimal number 2F3 can be expressed as:
2×16^2 + 15×16^1 + 3×16^0 = 755

Conversions between Number Systems

Converting Decimal to Other Bases

1. Decimal Integer to Binary, Octal, Hexadecimal

- Decimal to Binary: Use successive division by 2. Record remainders
in reverse (a short Python sketch of this and the fraction method follows this list).
- Example: Convert 13 to binary:
- 13÷2=6 R1
- 6÷2=3 R0
- 3÷2=1 R1
- 1÷2=0 R1
- Binary: 1101
- Decimal to Octal: Use successive division by 8. Record remainders in
reverse.
- Decimal to Hexadecimal: Use successive division by 16. Record
remainders in reverse.

2. Decimal Fraction to Binary, Octal, Hexadecimal

- Decimal to Binary: Multiply the fraction by 2, take the integer part as the next binary
digit, and repeat with the fractional part.
- Decimal to Octal: Multiply the fraction by 8, take the integer part as the next octal
digit, and repeat.
- Decimal to Hexadecimal: Multiply the fraction by 16, take the integer part as the
next hexadecimal digit, and repeat.

3. Converting Decimal Integer.Fraction to Other Bases

- Convert the integer part using the methods above.


- Convert the fractional part using the methods described for fractions.
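
The repeated-division and repeated-multiplication procedures above can be sketched in a few
lines of Python. This is an illustrative addition (the function names are my own), assuming the
integer part is non-negative and the fraction is cut off after a fixed number of digits:

DIGITS = "0123456789ABCDEF"

def integer_to_base(n: int, base: int) -> str:
    # Successive division: divide by the base, collect remainders, read them in reverse.
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, base)
        remainders.append(DIGITS[r])
    return "".join(reversed(remainders))

def fraction_to_base(fraction: float, base: int, digits: int = 8) -> str:
    # Successive multiplication: the integer part of each product is the next digit.
    out = []
    for _ in range(digits):
        fraction *= base
        integer_part = int(fraction)
        out.append(DIGITS[integer_part])
        fraction -= integer_part
        if fraction == 0:
            break
    return "".join(out)

print(integer_to_base(13, 2))      # '1101' (matches the worked example above)
print(integer_to_base(13, 16))     # 'D'
print(fraction_to_base(0.625, 2))  # '101' -> 0.625 decimal = 0.101 binary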

Converting from Other Bases to Decimal

- Binary to Decimal: Multiply each binary digit by its corresponding power of 2.


- Octal to Decimal: Multiply each octal digit by its corresponding power of 8.
- Hexadecimal to Decimal: Multiply each hexadecimal digit by its corresponding
power of 16.

Conversions between Non-Decimal Bases

- Binary to Octal: Group binary digits in sets of three (from right to left) and convert
each group.
- Binary to Hexadecimal: Group binary digits in sets of four and convert each group.
- Octal to Binary: Convert each octal digit directly into three binary digits.
- Hexadecimal to Binary: Convert each hexadecimal digit directly into four binary
digits.
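
The grouping rules above can also be expressed as a short Python sketch (an illustrative
addition; the helper name is my own). It pads the binary string on the left, then converts each
3-bit or 4-bit group into one octal or hexadecimal digit:

def group_binary(bits: str, group_size: int) -> str:
    # Pad on the left so the length is a multiple of the group size.
    padded_length = -(-len(bits) // group_size) * group_size
    padded = bits.zfill(padded_length)
    groups = [padded[i:i + group_size] for i in range(0, padded_length, group_size)]
    return "".join(format(int(g, 2), "X") for g in groups)

print(group_binary("101011", 3))   # '53' -> binary 101011 = octal 53
print(group_binary("101011", 4))   # '2B' -> binary 101011 = hexadecimal 2B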

Revision Questions:

1. Binary to Decimal

Q1: Convert the binary number 101011 to decimal.

Q2: What is the decimal equivalent of 1101.101?

Q3: Convert the binary number 11101001 to decimal.

2. Decimal to Binary

Q1: Convert the decimal number 43 to binary.

Q2: What is the binary equivalent of the decimal number 255?

Q3: Convert the decimal number 126 to binary.

3. Binary to Octal

Q1: Convert the binary number 101011 to octal.

Q2: What is the octal equivalent of 11001011?

Q3: Convert the binary number 11111111 to octal.

4. Octal to Binary

Q1: Convert the octal number 57 to binary.

Q2: What is the binary equivalent of the octal number 325?


Q3: Convert the octal number 704 to binary.

5. Binary to Hexadecimal

Q1: Convert the binary number 11101001 to hexadecimal.

Q2: What is the hexadecimal equivalent of the binary number 1010101011?

Q3: Convert the binary number 110101111 to hexadecimal.

6. Hexadecimal to Binary

Q1: Convert the hexadecimal number A7 to binary.

Q2: What is the binary equivalent of 3C5 in hexadecimal?

Q3: Convert the hexadecimal number FA3 to binary.

7. Decimal to Hexadecimal

Q1: Convert the decimal number 123 to hexadecimal.

Q2: What is the hexadecimal equivalent of 999 in decimal?

Q3: Convert the decimal number 450 to hexadecimal.

8. Hexadecimal to Decimal

Q1: Convert the hexadecimal number 7D to decimal.

Q2: What is the decimal equivalent of the hexadecimal number B2F?

Q3: Convert 1A3 from hexadecimal to decimal.

9. Octal to Decimal

Q1: Convert the octal number 134 to decimal.

Q2: What is the decimal equivalent of the octal number 725?

Q3: Convert 65 in octal to decimal.

10. Decimal to Octal

Q1: Convert the decimal number 82 to octal.

Q2: What is the octal equivalent of the decimal number 190?

Q3: Convert 233 from decimal to octal.


11. Hexadecimal to Octal

Q1: Convert the hexadecimal number 2A to octal.

Q2: What is the octal equivalent of the hexadecimal number 1F3?

Q3: Convert 7B in hexadecimal to octal.

12. Octal to Hexadecimal

Q1: Convert the octal number 175 to hexadecimal.

Q2: What is the hexadecimal equivalent of the octal number 605?

Q3: Convert 370 from octal to hexadecimal.

Binary Arithmetic

Binary arithmetic relies on a set of basic rules for addition, subtraction, multiplication, and
division, which are quite similar to those in decimal but adapted for the binary system (only 0
and 1 are used).

1. Binary Addition

Binary addition follows simple rules, with "carrying" occurring when the sum is 2 or more:

0+0=0

0+1=1

1+0=1

1 + 1 = 10 (write 0, carry 1; note in the decimal system 1+1=2, and 2 in binary is 10)

1 + 1 + 1 = 11 (write 1, carry 1; note in the decimal system 1+1+1=3, and 3 in binary is 11)

Example: Adding 1011 and 1101

111 (Carry)
1011
+ 1101
----------
11000

Step-by-step:

1. Add the rightmost column: 1+1=0 (write down 0, carry 1)


2. Next column: 1+1+0=0 (write down 0, carry 1)
3. Next column: 1+0+1=0 (write down 0, carry 1)
4. Leftmost column: 1+1+1=1 (write down 1, carry 1)
5. Write down the final carry: 1

Final Result: 11000 (which is 11+13=24 in decimal)
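
The column-by-column procedure above can be written as a small Python function. This is a
sketch added for illustration (not from the original notes), working on bit strings:

def add_binary(a: str, b: str) -> str:
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry, result = 0, []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        result.append(str(total % 2))   # the digit written in this column
        carry = total // 2              # the carry into the next column
    if carry:
        result.append("1")
    return "".join(reversed(result))

print(add_binary("1011", "1101"))   # '11000' (11 + 13 = 24)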

2. Binary Subtraction

Binary subtraction is similar to decimal subtraction but uses "borrowing," where a "10" (2 in
decimal) is borrowed from the next higher bit when needed:

0-0=0

1-0=1

1-1=0

0 - 1 = 1 (with a borrow from the next higher bit)

Step-by-Step Binary Subtraction (1010 - 0111)

We’ll use the rules of binary subtraction with borrowing, which works similarly to decimal
subtraction but is adapted for binary.

1. Write the numbers in columns with the larger number on top:

1010

-0111

2. Start from the rightmost column:

 0 - 1: Since 0 is less than 1, we need to borrow 1 from the next column to the left. The
0 in the rightmost column becomes 10 (2 in decimal), and the 1 in the second column
becomes 0, so the rightmost column calculation becomes 10 - 1 = 1.
 Result for this column: 1

3. Move to the second column from the right:

 After lending to the first column, we now have 0 - 1 in the second column as well.
 Again, we borrow 1. Since the third column is 0, the borrow chains through it and takes
the 1 from the leftmost column. The calculation becomes 10 - 1 = 1.
 Result for this column: 1

4. Move to the third column from the right:

 This column took part in the chained borrow: it received 10 from the leftmost column
and then lent 1 to the second column, leaving 1.
 The calculation becomes 1 - 1 = 0.
 Result for this column: 0

5. Leftmost column:

 This column gave up its 1 during the chained borrow, so we now have 0 - 0 = 0.
 Result for this column: 0

6. Combine the results of each column:

1010

-0111

--------

0011

So, 1010 - 0111 = 0011 in binary (which is equal to 10 - 7 = 3 in decimal).
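
The borrowing procedure can likewise be sketched in Python (an illustrative addition to the
notes, assuming the first operand is not smaller than the second):

def subtract_binary(a: str, b: str) -> str:
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    borrow, result = 0, []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        diff = int(bit_a) - int(bit_b) - borrow
        if diff < 0:
            diff += 2      # borrow a "10" (2 in decimal) from the next column
            borrow = 1
        else:
            borrow = 0
        result.append(str(diff))
    return "".join(reversed(result))

print(subtract_binary("1010", "0111"))   # '0011' (10 - 7 = 3)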

3. Binary Multiplication

Binary multiplication is straightforward, as it involves shifting and adding:

0×0=0
0×1=0

1×0=0

1×1=1

Binary multiplication is similar to decimal multiplication but easier since each digit is either 0
or 1:

- Multiply each bit of the first number by each bit of the second number.
- Shift left for each position of the digit in the multiplier (similar to multiplying by
powers of ten in decimal).

Example: Multiplying 101 and 11

101
x 11
---------
101 (101 * 1)
+ 1010 (101 * 1, shifted left by 1)
---------
1111

Step-by-step:

1. Multiply the first bit of the bottom number by the top number: 101×1=101
2. Multiply the second bit of the bottom number by the top number and shift left by one
position: 101×1=101
3. Add the two results together:

101
+ 1010
--------
1111

Final Result: 1111 (which is 5*3=15 in decimal)
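
The shift-and-add idea can be sketched in Python as follows (an illustrative addition; shifting
left by one position corresponds to multiplying by 2):

def multiply_binary(a: str, b: str) -> str:
    total = 0
    for shift, bit in enumerate(reversed(b)):
        if bit == "1":
            total += int(a, 2) << shift   # add a copy of a, shifted left by 'shift' places
    return format(total, "b")

print(multiply_binary("101", "11"))   # '1111' (5 * 3 = 15)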

4. Binary Division

Binary division is similar to long division in decimal:

1. Compare the divisor with the dividend.


2. Subtract the divisor from the dividend as many times as possible until what's left is
less than the divisor.

Example: Binary Division of 101010÷110


We’ll follow a process similar to long division in decimal.

Setup: Write 101010 as the dividend and 110 as the divisor.

Align the divisor under the leftmost bits of the dividend that can be divided:

 We first compare the divisor with the first three bits of the dividend, 101, and check if
110 fits into 101.
 110 is larger than 101, so it doesn’t fit. Place a 0 in the quotient.

Shift right by one bit and try again:

 Now we consider 1010 (the first four bits of the dividend).
 110 does fit into 1010. Place a 1 in the quotient.
 Subtract 110 from 1010:

1010

- 110

-------

100

Bring down the next bit in the dividend:

 After subtraction, we have 100, and the next bit in 101010 is 1, making it 1001.

Divide 110 into 1001:

 110 fits into 1001. Place a 1 in the quotient.
 Subtract 110 from 1001:

1001
- 110
-------
011

Bring down the last bit in the dividend:

 After subtraction, we have 011, and the last bit in 101010 is 0, making it 110.

Final Division Step:

 110 fits into 110 exactly. Place a 1 in the quotient.
 Subtract 110 from 110, which gives 000.

Quotient and Remainder


The binary quotient is 111, and the remainder is 000.

Final Answer

101010 in decimal is 42

110 in decimal is 6

So, 42 divided by 6 in binary is 111 (which is 7 in decimal).
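
The long-division steps above (bring a bit down, subtract the divisor whenever it fits) can be
sketched in Python like this (an illustrative addition, not part of the original notes):

def divide_binary(dividend: str, divisor: str) -> tuple:
    d = int(divisor, 2)
    remainder, quotient = 0, []
    for bit in dividend:
        remainder = (remainder << 1) | int(bit)   # bring down the next bit
        if remainder >= d:
            remainder -= d
            quotient.append("1")
        else:
            quotient.append("0")
    return "".join(quotient).lstrip("0") or "0", format(remainder, "b")

print(divide_binary("101010", "110"))   # ('111', '0') -> 42 / 6 = 7 remainder 0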

Summary of Operations

 Binary Addition: Align and add, carrying over when sums exceed 1.
 Binary Subtraction: Borrow when necessary.
 Binary Multiplication: Multiply like decimal, shifting for each bit.
 Binary Division: Use long division methods, subtracting the divisor repeatedly.

These methods are essential for performing arithmetic operations in digital electronics and
programming, as all data in computers is ultimately represented in binary form.

Revision Questions:

Binary Addition

1. Question 1: What is the result of adding the binary numbers 1101 and 1011?
2. Question 2: Add the binary numbers 0110 and 1110 and explain any carries involved.
3. Question 3: Calculate the sum of the binary numbers 10101 and 00111.

Binary Subtraction

1. Question 1: What is the result of subtracting 1010 from 1100 in binary?


2. Question 2: Subtract the binary number 0111 from 1000. Describe any borrowing
that occurs.
3. Question 3: Find the result of 1101 minus 0101 in binary.

Binary Multiplication

1. Question 1: What is the product of the binary numbers 101 and 11?
2. Question 2: Multiply the binary numbers 110 and 101. Show your working steps.
3. Question 3: Calculate the product of 111 and 10 in binary.

Binary Division

1. Question 1: Divide the binary number 1100 by 10 and state the quotient and
remainder.

2. Question 2: What is the result of dividing 1011 by 11 in binary?


3. Question 3: Divide 10010 by 10 and provide the quotient and remainder.

Signed and Unsigned Numbers in Binary Representation

In binary systems, numbers are represented using bits (binary digits), and depending on how
these bits are interpreted, they can represent either positive only numbers (unsigned) or both
positive and negative numbers (signed).

Unsigned Numbers

- Definition: Unsigned numbers use all the bits to represent non-negative values,
meaning they only represent positive integers and zero.
- Range: For an n-bit unsigned number, the range is from 0 to 2^n - 1.
- Example: An unsigned 4-bit number can represent values from 0 to 15:
- 0000 in binary = 0 in decimal
- 1111 in binary = 15 in decimal

Use Cases:

- Unsigned numbers are used when you know that negative values are not needed,
such as in addressing memory locations or representing counts of objects.

Signed Numbers

- Definition: Signed numbers allow representation of both positive and negative values.
This is typically done using a method called two's complement, which simplifies
binary arithmetic operations like addition and subtraction.
- Range: For an n-bit signed number, the range is from -2^(n-1) to 2^(n-1) - 1.
- Example: A signed 4-bit number can represent values from -8 to 7:
- 1000 in binary = -8 in decimal (two's complement representation of negative values)
- 0111 in binary = 7 in decimal

Two's Complement Representation:

- Two's complement is the most widely used method for representing signed integers
in binary. It simplifies the hardware required for arithmetic operations by making
negative numbers easier to work with.

Complement of Binary Numbers

In binary, a complement is used to change the sign of a number or to prepare it for certain
arithmetic operations. There are two types of complements commonly used: one's
complement and two's complement.

One's Complement

- Definition: The one's complement of a binary number is formed by inverting all the
bits (changing 0s to 1s and 1s to 0s).
- How to calculate: Simply flip all the bits of the number.
Example:

- For a 4-bit number, 1010 (10 in decimal) has a one's complement of 0101.

Limitations:

- One's complement has two representations for zero (0000 for +0 and 1111 for -0),
which can lead to complications in arithmetic.

Two's Complement

- Definition: The two's complement of a binary number is formed by inverting all the
bits (one’s complement) and then adding 1 to the least significant bit.

For an n-bit signed number in two's complement, the range is from −2^(n−1) to 2^(n−1) − 1.

Here's why this range works:

 Negative Range: The most significant bit (MSB) in two's complement represents the
sign (0 for positive, 1 for negative). With n bits, the smallest possible value, when
the MSB is set to 1 and all other bits are 0, is −2^(n−1).
 Positive Range: The largest value is obtained when the MSB is 0, and all other bits
are set to 1, which gives 2^(n−1) − 1.

Example

For an 8-bit signed integer (n=8):

 Minimum value: −2^(8−1) = −2^7 = −128
 Maximum value: 2^(8−1) − 1 = 2^7 − 1 = 127

Thus, an 8-bit two's complement integer ranges from −128 to 127.

- How to calculate:
1. Take the one's complement of the number.
2. Add 1 to the result.

Example:

- For a 4-bit number 1010 (which represents 10 in decimal when read as unsigned):
1. First, find the one's complement: 0101.
2. Then, add 1 to get the two's complement: 0101 + 0001 = 0110.
- The two's complement of 1010 is 0110, which is +6 in decimal; equivalently, the bit
pattern 1010 itself represents -6 when interpreted as a signed 4-bit two's complement number.
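
A small Python sketch of the "invert, then add 1" rule for a fixed-width bit pattern (an
illustrative addition to the notes):

def twos_complement(bits: str) -> str:
    width = len(bits)
    ones = "".join("1" if b == "0" else "0" for b in bits)   # one's complement
    value = (int(ones, 2) + 1) % (2 ** width)                # add 1, keep the fixed width
    return format(value, "0{}b".format(width))

print(twos_complement("1010"))   # '0110' (negating the 4-bit pattern 1010)
print(twos_complement("0011"))   # '1101' (-3 in 4-bit two's complement)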

Advantages:
- Two's complement simplifies arithmetic operations and is widely used in computer
systems.
- There is only one representation for zero, which eliminates the confusion seen in
one's complement.
- Subtraction can be done by simply adding the two's complement of a number (i.e., no
separate subtraction operation is needed).

Use Case:

- Two's complement is the standard method for representing signed integers in modern
computers due to its efficiency in arithmetic operations.

Example of Two's Complement Addition and Subtraction

Addition:

Let’s add two numbers using two’s complement.

Example: Add 5 and -3 using 4-bit binary numbers.

1. Represent 5 in binary (4 bits): 0101.
2. Represent -3 in two’s complement (4 bits): 3 in binary is 0011.
- The one’s complement of 0011 is 1100.
- Adding 1 gives the two’s complement: 1101 (this is -3).
3. Add 0101 (5) and 1101 (-3): 0101+1101=10010
4. Discard the extra bit (since we’re using 4-bit representation), the result is 0010, which
is 2 in decimal.

Subtraction:

Subtraction in two's complement can be performed by adding the two's complement of the
number to be subtracted.

Example: Subtract 3 from 7.

1. 7 in binary (4 bits) is 0111.
2. 3 in binary (4 bits) is 0011. Its two's complement is 1101.
3. Add 0111 (7) and 1101 (-3): 0111+1101=10100
4. Discard the carry, and the result is 0100, which is 4 in decimal.
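
Both examples can be reproduced in Python by adding the bit patterns and keeping only the
low 4 bits, which plays the role of discarding the carry (an illustrative sketch, not part of the
original notes):

WIDTH = 4
MASK = (1 << WIDTH) - 1              # 0b1111 keeps only the low 4 bits

def to_signed(value: int) -> int:
    # Reinterpret a 4-bit pattern as a signed two's complement number.
    return value - (1 << WIDTH) if value & (1 << (WIDTH - 1)) else value

minus_three = (-3) & MASK            # 0b1101, the two's complement of 3

result = (0b0101 + minus_three) & MASK
print(format(result, "04b"), to_signed(result))   # 0010 2 -> 5 + (-3) = 2

result = (0b0111 + minus_three) & MASK
print(format(result, "04b"), to_signed(result))   # 0100 4 -> 7 - 3 = 4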

Summary

 Unsigned Numbers represent only positive values (e.g., memory addresses).


 Signed Numbers use two's complement to represent both positive and negative
values efficiently.
 One's Complement is a way to represent negative numbers by flipping the bits but
has limitations (like two representations of zero).
 Two's Complement simplifies binary arithmetic by allowing addition and subtraction
with minimal complexity and is the most common method for representing signed
integers in modern computers.

Both one’s and two’s complement methods are essential for computer systems to handle
arithmetic operations efficiently, especially when dealing with negative values.

Performing arithmetic operations in binary follows similar principles to decimal arithmetic,
but it requires some adjustments due to the binary system's base-2 nature.

Sign Magnitude

The sign-magnitude method is a way to represent signed binary numbers. In this system, the
most significant bit (MSB) represents the sign of the number: a 0 indicates a positive value,
and a 1 indicates a negative value. The remaining bits represent the magnitude (absolute
value) of the number in binary.

Key Points of Sign-Magnitude Representation

1. MSB as Sign Bit:
 The leftmost bit (MSB) is the sign bit. 0 means positive, 1 means negative.
 For example, in an 8-bit system:
 00000101 represents +5.
 10000101 represents -5.
2. Magnitude:
 The remaining bits represent the magnitude of the number.
 Magnitude is calculated as if the sign bit were not there (i.e., using standard
binary interpretation).
3. Range:
 For an n-bit number, the range of representable values is:
 Negative values: from -(2^(n-1) - 1) to -1
 Positive values: from +0 to +(2^(n-1) - 1)
4. Drawbacks:
 Dual Representations of Zero: In sign-magnitude representation, both
00000000 and 10000000 represent zero, creating a positive and a negative
zero.
 Complexity in Arithmetic: Addition and subtraction can be more complex
with sign-magnitude than with two's complement, as you need to account for
signs separately.

Example

For a 4-bit sign-magnitude system:

 0110 represents +6
 1110 represents -6

This approach is simple but less commonly used for computations, as other systems like
two's complement offer advantages in handling arithmetic operations directly.
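
A brief Python sketch of sign-magnitude encoding and decoding (added for illustration; the
function names are my own):

def to_sign_magnitude(value: int, width: int = 8) -> str:
    sign = "1" if value < 0 else "0"                          # MSB is the sign bit
    return sign + format(abs(value), "0{}b".format(width - 1))

def from_sign_magnitude(bits: str) -> int:
    magnitude = int(bits[1:], 2)                              # remaining bits are the magnitude
    return -magnitude if bits[0] == "1" else magnitude

print(to_sign_magnitude(5))             # '00000101'
print(to_sign_magnitude(-5))            # '10000101'
print(from_sign_magnitude("10000101"))  # -5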

Revision Questions:

Sign Magnitude

1. Question 1: What is the sign-magnitude representation of the decimal number -5
using 8 bits?
2. Question 2: If the sign-magnitude representation of a number is 10001010, what is
the decimal equivalent of this binary number? Is it positive or negative?
3. Question 3: Explain how to convert the decimal number 3 into sign-magnitude format
using 8 bits. What would the sign bit be?

1's Complement

1. Question 1: What is the 1's complement of the binary number 1010?


2. Question 2: If the 1's complement representation of a binary number is 1101, what is
the equivalent signed decimal value?
3. Question 3: Describe the process of converting the decimal number -7 into its 1's
complement representation using 8 bits.

2's Complement

1. Question 1: Calculate the 2's complement of the binary number 0110.


2. Question 2: If the 2's complement representation of a number is 11111001, what is
the signed decimal equivalent?
3. Question 3: Explain how to convert the decimal number -12 into its 2's complement
representation using 8 bits. What steps do you need to follow?

Overflow in the Binary Number System

Overflow in the binary number system occurs when a calculation produces a result that is
too large to be represented with the fixed number of bits allocated for a particular binary
value. This is a common issue in computing, particularly when adding or subtracting binary
numbers in fixed-length registers or memory locations. Overflow typically arises in both
unsigned and signed binary number representations (e.g., two's complement).

Types of Overflow:

1. Unsigned Binary Overflow:

- In an unsigned binary system, all bits represent non-negative values (0 or positive).
Overflow occurs when the result of an arithmetic operation exceeds the maximum
value that can be represented by the available bits.
- For example, if a system uses 4 bits to represent numbers, the largest unsigned
number is 1111 (15 in decimal). Adding 1111 + 0001 would result in 10000,
which requires 5 bits. Since the system is limited to 4 bits, the result would overflow,
and only the lower 4 bits would be stored (0000), which gives an incorrect result of
0.

2. Signed Binary Overflow (Two's Complement):

- In signed binary numbers using two's complement, the leftmost bit represents the sign
of the number (0 for positive, 1 for negative). The remaining bits represent the
magnitude of the number.
- Overflow in two's complement occurs when the result of an operation is too large (or
too small) to be represented within the given bit width, resulting in an incorrect sign.
- Example (4-bit system):
 0111 represents +7, and 0001 represents +1. Adding them gives
1000, which represents -8, not +8.

- Overflow in signed numbers can be detected by checking the carry into and carry
out of the most significant bit (MSB). If these carries are different, overflow has
occurred.

Conditions for Overflow in Signed Numbers:

 Addition: Overflow occurs if:
 Two positive numbers produce a negative result.
 Two negative numbers produce a positive result.
 Subtraction: Overflow occurs if:
 Subtracting a negative number from a positive one produces a negative result.
 Subtracting a positive number from a negative one produces a positive result.

Detecting Overflow:

 In unsigned binary addition, overflow occurs if there is a carry out from the MSB.
 In signed addition, overflow is detected by comparing the sign of the operands and
the result. Specifically, overflow occurs if:
 The sum of two positive numbers results in a negative number.
 The sum of two negative numbers results in a positive number.

Practical Example:

Suppose we have an 8-bit unsigned number system (0 to 255) and try to add 240 and 20:

- 240 in binary: 11110000
- 20 in binary: 00010100
- Sum: 11110000 + 00010100 = 100000100 (260 in decimal)

This sum requires 9 bits, but since we only have 8 bits, the MSB (1) will be lost, resulting in
00000100 (4 in decimal), which is an incorrect result due to overflow.
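
The detection rules above can be sketched in Python for a chosen register width (an
illustrative addition; the helper names are my own):

def unsigned_add(a: int, b: int, bits: int):
    mask = (1 << bits) - 1
    total = a + b
    # Overflow if there is a carry out of the most significant bit.
    return total & mask, total > mask

def signed_add(a: int, b: int, bits: int):
    mask = (1 << bits) - 1
    raw = (a + b) & mask
    result = raw - (1 << bits) if raw & (1 << (bits - 1)) else raw
    # Overflow if the operands share a sign but the result's sign differs.
    overflow = (a >= 0) == (b >= 0) and (result >= 0) != (a >= 0)
    return result, overflow

print(unsigned_add(240, 20, 8))   # (4, True)  -> 240 + 20 = 260 does not fit in 8 bits
print(signed_add(7, 1, 4))        # (-8, True) -> +7 + +1 overflows a 4-bit signed register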

Overflow is an important concept in computer systems, especially in fixed-width binary
number representations used in microprocessors, where arithmetic operations are often
performed within limited registers or memory locations.

Underflow in the Binary Number System

In the binary number system, underflow occurs when a calculation produces a result smaller
than the smallest value that the system can represent with the given precision. This is
particularly relevant in floating-point arithmetic where a result’s absolute magnitude (its size
regardless of sign) is too close to zero for the available number of bits.

1. Definition and Causes

- Underflow usually happens when dealing with floating-point numbers, especially in
systems that use normalized representations (like IEEE 754 format). In these systems,
numbers are represented with a base (binary) and an exponent. For example, if a system
can represent only numbers down to 2^-127, any value smaller than that is too "tiny" and
causes underflow.
- A simple example of binary underflow might be attempting to subtract two very close
floating-point values where the result is smaller than the smallest representable number
in the system.

2. Impact of Underflow

- When underflow occurs, the system may round the result to zero (or, in some systems, to
a denormalized small value if denormalization is supported).
- This can lead to loss of precision in calculations, potentially causing inaccuracies in
scientific computing, machine learning, and other fields where precise values are critical.

3. Underflow in Fixed-Point and Integer Calculations

- While underflow is mainly associated with floating-point arithmetic, it can also occur in
integer and fixed-point representations if a number rounds down to zero or loses
accuracy due to a limited range of representation, though this is less common than in
floating-point systems.

4. Handling Underflow

- Many systems mitigate underflow by using denormalized numbers (tiny values
represented with leading zeros in the mantissa rather than normalized), which allow for a
"graceful" approach towards zero.
- Error checking and precision management can also help avoid underflow when small
numbers are critical in calculations, ensuring meaningful results.
Underflow is often compared to overflow but is distinguished by its impact on very small, rather
than large, numbers. The IEEE 754 standard handles underflow cases more predictably,
allowing systems to represent values just above zero without rounding abruptly to zero.

Example:

Imagine you are working in a computer system that represents floating-point numbers using a
limited range of exponents, such as 2^-126 (roughly 1.2×10^-38), the smallest normalized value
in IEEE 754 single-precision format. In this system:

1. Setup of Example Calculation:
- Say we want to compute the product of two very small numbers:
1.0×10^-40 × 1.0×10^-40 = 1.0×10^-80
- This result is smaller than 2^-126, the smallest normalized value in single-
precision floating-point.
2. Underflow Occurs:
- Since 10^-80 is smaller than 2^-126, the system cannot represent this result. As a
result, it rounds down to zero, causing an underflow.
3. Result:
- The system outputs zero instead of 1.0×10^-80, leading to a loss of precision. This
can be problematic in fields like scientific computing where small differences
matter.
4. Explanation:
- This underflow is similar to subtracting very close numbers where the difference
is below the minimum representable precision.
- Due to underflow, the result is approximated as zero when the difference is too
small to fit within the system’s smallest representable value.

This underflow issue is common in financial calculations, scientific modelling, and simulations
where precise, very small numbers are necessary. Techniques like denormalization (which
allows representing values just above zero with leading zeros) or error handling can help
manage the effects of underflow.
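
Python floats are IEEE 754 double-precision values, so the same effect can be demonstrated
directly (an illustrative sketch; the limits differ from the single-precision figures above):

import sys

tiny = 1.0e-200
product = tiny * tiny          # mathematically 1.0e-400
print(product)                 # 0.0 -> the result has underflowed to zero
print(sys.float_info.min)      # smallest normalized positive double (about 2.2e-308)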

UNITS OF DATA STORAGE

In computing, data is stored and measured in specific units. Understanding these units
helps in comprehending how computers process, store, and manage information. This
section explores the fundamental and higher units of data storage and how they are
managed.

1. Bits, Nibble, and Bytes: Basic Units of Storage

Bit

- Definition: A bit is the most basic unit of data storage in computing. It stands
for binary digit and can hold one of two values: 0 or 1. The word "bit" is
derived from the combination of two words: "binary" and "digit." It
represents the smallest unit of data in computing and digital communications.
American mathematician and computer scientist John Tukey introduced the
term "bit" in 1946. He used it in his work on information theory, specifically
in a paper discussing the organization of data for statistical analysis. The term
quickly became popular in computer science and information theory,
representing the basic unit of information in binary systems.
- 0 represents an "off" state.
- 1 represents an "on" state.
- Binary System: Computers use the binary system (base-2) to store and
process data, where everything is represented as combinations of 0s and 1s.
- History: The encoding of data by discrete bits was used in the punched cards
invented by Basile Bouchon and Jean-Baptiste Falcon (1732), developed by
Joseph Marie Jacquard (1804), and later adopted by Semyon Korsakov,
Charles Babbage, Herman Hollerith, and early computer manufacturers like
IBM. A variant of that idea was the perforated paper tape. In all those systems,
the medium (card or tape) conceptually carried an array of hole positions; each
position could be either punched through or not, thus carrying one bit of
information. The encoding of text by bits was also used in Morse code (1844)
and early digital communications machines such as teletypes and stock ticker
machines (1870).

Ralph Hartley suggested the use of a logarithmic measure of information in 1928.


Claude E. Shannon first used the word "bit" in his seminal 1948 paper "A
Mathematical Theory of Communication". He attributed its origin to John W. Tukey,
who had written a Bell Labs memo on 9 January 1947 in which he contracted "binary
information digit" to simply "bit".

Nibble

- Definition: A nibble consists of 4 bits.


- Since a nibble is 4 bits, it can represent 16 different values (from 0000 to
1111 in binary, which is 0 to 15 in decimal).
- Nibbles are rarely used as standalone units but are important in byte structure.

Byte

- Definition: A byte consists of 8 bits and is the fundamental unit of storage in


computing.
- One byte can represent 256 different values (2^8), ranging from 0 to 255.

- Characters: One byte typically stores a single character (like a letter or


number) in text encoding systems such as ASCII.
- Example: The letter 'A' in ASCII is represented by the byte 01000001 (or 65
in decimal).

Comparison: Bits, Nibble, and Bytes


Unit Size in Bits Range (in decimal) Use
Bit 1 bit 0 or 1 Fundamental unit of binary data
Nibble 4 bits 0 - 15 Part of byte storage
Byte 8 bits 0 - 255 Standard unit for representing characters

In data representation, various terms like bit, byte, nibble, and others refer to specific units
of information and the number of bits they hold. Here's a detailed explanation of each term:

1. Bit (Binary Digit)

 Definition: The smallest unit of data in computing.


 Size: 1 bit.
 Usage: Represents a single binary value, either 0 or 1. It is the fundamental building
block of all data representation.

2. Nibble

 Definition: A group of 4 bits.


 Size: 4 bits.
 Usage: Typically used to represent a single hexadecimal digit (ranging from 0 to F in
hexadecimal).

3. Byte

 Definition: A group of 8 bits.


 Size: 8 bits.
 Usage: It is the most commonly used unit in data storage. A byte can represent 256
different values (2^8), and it is often used to represent a single character (such as 'A'
or '1') in text data.

4. Character

 Definition: Typically refers to an alphanumeric symbol or control character.


 Size: 8 bits (1 byte) for most modern encodings, but can vary.
 Usage: In character encodings like ASCII, one character is represented by one byte.
In Unicode, a character may require more than one byte, depending on the specific
encoding (e.g., UTF-8, UTF-16).

5. Halfword

 Definition: A data unit that is half the size of a typical "word."

 Size: Commonly 16 bits (2 bytes).


 Usage: Used in systems where the natural data size is larger than a byte, but smaller
than a full word.

6. Word

 Definition: A unit of data that the CPU can process in a single operation.
 Size: Typically 16 bits (2 bytes) or 32 bits (4 bytes), depending on the computer
architecture.
 Usage: The term "word" can vary in size depending on the system architecture. In
older systems, it is usually 16 bits; in modern systems, it’s often 32 bits.

7. Double Word (DWord)

 Definition: A unit of data that is twice the size of a word.


 Size: 32 bits (4 bytes) in a 16-bit system or 64 bits (8 bytes) in a 32-bit system.
 Usage: Used for storing larger values than a standard word and commonly
encountered in many programming languages and hardware architectures.

8. Quad Word (QWord)

 Definition: A unit of data that is four times the size of a word.


 Size: 64 bits (8 bytes) in a 16-bit system or 128 bits (16 bytes) in a 32-bit system.
 Usage: Quad words are used in applications requiring very large integers or high
precision, such as scientific computing or cryptography.

Summary Table:
Term Number of Bits Number of Bytes
Bit 1 bit -
Nibble 4 bits 0.5 byte
Byte 8 bits 1 byte
Character 8 bits 1 byte (in ASCII)
Halfword 16 bits 2 bytes
Word 16-32 bits 2-4 bytes
DWord 32-64 bits 4-8 bytes
QWord 64-128 bits 8-16 bytes
In modern systems, particularly 64-bit architectures, "words" and "double words" may refer
to larger values. These terms are flexible and often depend on the specific system architecture
being used.

2. Kilobytes, Megabytes, Gigabytes, Terabytes: Measuring Data Size

Data storage is typically measured in larger units than bits and bytes, especially when dealing
with files, programs, and storage devices. As data size increases, so do the units used to
measure it.

Kilobytes (KB)

- Definition: A kilobyte (KB) consists of 1024 bytes (not 1000 due to the
binary system used in computing, where 1 KB = 2^10 bytes).
- Common Use: Kilobytes are used to measure small files, such as simple text
documents or small icons.
- Example: A plain text document with approximately 1000 characters would
be roughly 1 KB.

Megabytes (MB)

- Definition: A megabyte (MB) is equal to 1024 kilobytes (1 MB = 1024 KB =
1,048,576 bytes).
- Common Use: Megabytes measure medium-sized files, such as images, small
videos, and music files.
- Example: A standard MP3 song might be around 4-5 MB in size.

Gigabytes (GB)

- Definition: A gigabyte (GB) is equal to 1024 megabytes (1 GB = 1024 MB =
1,073,741,824 bytes).
- Common Use: Gigabytes measure larger data sizes, such as videos,
high-quality images, or software programs.
- Example: A Full HD movie might take up around 1-2 GB.

Terabytes (TB)

- Definition: A terabyte (TB) is equal to 1024 gigabytes (1 TB = 1024 GB =
1,099,511,627,776 bytes).
- Common Use: Terabytes are used to measure the storage capacity of modern
hard drives, cloud storage, and server data centers.
- Example: A modern external hard drive might have a storage capacity of 1-5
TB.

Comparison of Units
Unit Size in Bytes Use
Kilobyte 1024 bytes Text files, small documents
Megabyte 1024 KB = 1,048,576 bytes Images, songs, medium-sized files
Gigabyte 1024 MB = 1,073,741,824 bytes Videos, applications, large files
Terabyte 1024 GB = 1,099,511,627,776 bytes Large storage devices, cloud storage
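
A small Python sketch of these binary units in practice, repeatedly dividing a raw byte count
by 1024 (added for illustration; the helper name is my own):

def human_readable(num_bytes: float) -> str:
    for unit in ("bytes", "KB", "MB", "GB", "TB"):
        if num_bytes < 1024 or unit == "TB":
            return "{:.2f} {}".format(num_bytes, unit)
        num_bytes /= 1024          # 1 KB = 1024 bytes, 1 MB = 1024 KB, and so on

print(human_readable(1500000))        # '1.43 MB'
print(human_readable(2 * 1024 ** 3))  # '2.00 GB'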

Binary Coding Schemes

Binary-Coded Decimal (BCD)

- Represents each digit of a decimal number with its binary equivalent. For example,
the decimal number 45 would be represented as 0100 0101 in BCD.
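
A one-line Python sketch of BCD encoding, where each decimal digit becomes its own 4-bit
group (an illustrative addition to the notes):

def to_bcd(number: int) -> str:
    return " ".join(format(int(digit), "04b") for digit in str(number))

print(to_bcd(45))    # '0100 0101' (matches the example above)
print(to_bcd(907))   # '1001 0000 0111'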

Extended Binary Coded Decimal Interchange Code (EBCDIC)

- An 8-bit character encoding used primarily on IBM mainframe systems. It represents
alphanumeric characters in binary.

American Standard Code for Information Interchange (ASCII)

- A 7-bit character encoding standard that represents characters and control codes using
numbers. For example, the letter "A" is represented as 65 in decimal or 01000001 in
binary.

Unicode

- A universal character encoding standard that allows for the representation of text in
most of the world's writing systems. It uses variable-length encoding, often in UTF-8
format, which can accommodate a vast array of characters from different languages.

These foundational concepts of number systems, data conversions, binary arithmetic, and
data representation techniques are essential for understanding how data is stored, processed,
and manipulated in computer systems. Mastery of these concepts provides a solid base for
more advanced studies in computer science and data analysis.

DATA TYPES AND STORAGE

In computing, data types define how data is stored, represented, and processed by
computers. Each data type has specific storage requirements and ways of representation.
Understanding data types is crucial for handling different kinds of data, such as text,
numbers, images, and multimedia content.

1. Characters and ASCII/Unicode: Representation of Text

Characters

- Definition: Characters are individual letters, numbers, punctuation marks, and


symbols that make up text.
- Characters are stored in a computer as numeric codes, allowing them to be represented
in binary form.

ASCII (American Standard Code for Information Interchange)

- ASCII is a character encoding standard that uses 7-bit binary codes to
represent characters.
- It can represent 128 characters (2^7), including letters (A-Z, a-z), digits (0-9),
punctuation marks, and control characters (like newline or backspace).
- Example: The letter ‘A’ is represented by 65 in decimal or 01000001 in
binary.
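
In Python, ord() returns a character's numeric code and chr() converts a code back to the
character; formatting the code with '08b' shows the 8-bit pattern stored for the byte (an
illustrative sketch added to these notes):

for ch in ("A", "a", "0"):
    code = ord(ch)
    print(ch, code, format(code, "08b"))
# A 65 01000001
# a 97 01100001
# 0 48 00110000
print(chr(65))   # 'A'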
ASCII TABLE

History and Evolution of ASCII:

The American Standard Code for Information Interchange (ASCII) has a rich history
that began in the early 1960s. It was developed as a standardized way to represent text and
control characters in computers and communication equipment. Here's an overview of its
history and evolution:

1. Early Beginnings (1960s)

ASCII was created in 1963 by a committee led by Robert W. Bemer, a computer scientist
known as the "father of ASCII." Prior to this, various computers and communication systems
used their own proprietary character sets, which made interoperability difficult. The aim of
ASCII was to provide a standard that all computers could use for encoding characters.

 Standardization by ANSI: The American National Standards Institute (ANSI)
adopted ASCII as the standard, creating a 7-bit character encoding scheme that could
represent 128 characters, including letters, numbers, punctuation, and control codes.
 Adoption: ASCII was quickly adopted because of its simplicity and compatibility
with telecommunication systems, especially teletype machines.

2. ASCII Structure

ASCII consists of 128 characters:


 Printable Characters (33-126): Includes uppercase and lowercase letters (A-Z, a-z),
digits (0-9), and common punctuation marks like periods, commas, and brackets.
 Control Characters (0-31 and 127): These are non-printable characters used to
control devices like printers or terminals (e.g., Line Feed, Carriage Return).

3. Key Milestones in ASCII's Evolution

 1967 Update: A few minor changes were made to ASCII in 1967, adding the
lowercase letters and making the assignment of certain punctuation marks more
consistent.
 1970s-1980s: ASCII became the standard for computers and communication systems,
particularly with the rise of personal computers. Early systems like the PDP-11 and
Apple II relied heavily on ASCII.
 8-bit Expansion (Extended ASCII): While ASCII was a 7-bit code, many systems
started using an 8-bit byte in the 1980s. This allowed for Extended ASCII, where the
additional 128 codes (128-255) could represent graphical characters, accented letters,
and more symbols. However, this extension was not standardized and varied by region
and language.

4. Modern Usage and Decline

ASCII remains foundational to character encoding in computing but has largely been
supplanted by more modern encodings like Unicode, which can represent a far larger range
of characters (including international scripts and symbols) to accommodate the globalized
use of computers. Unicode incorporates ASCII as its first 128 characters for backward
compatibility, ensuring that any text encoded in ASCII will be readable in a Unicode system.

Pros of ASCII

- Simplicity: ASCII's straightforward design made it easy to implement in early
computer systems.
- Interoperability: It helped standardize communication between different systems,
enabling more widespread adoption of computing technology.
- Efficiency: ASCII's 7-bit format was efficient for the storage and transmission of text
data in early systems.

Cons of ASCII

- Limited Range: ASCII's character set is limited to 128 symbols, which is insufficient
for representing non-English languages or special symbols.
- Incompatibility with Multilingual Data: ASCII couldn't accommodate languages
with accented characters, non-Latin alphabets, or the need for broader character
representation, leading to the rise of alternatives like Unicode.

Conclusion: ASCII laid the foundation for text encoding in the digital world, but its
limitations prompted the development of more comprehensive systems like Unicode, which
can handle the demands of modern computing. Nonetheless, ASCII remains a key part of
computing history, and its simplicity still finds use in many applications today.
For more information on ASCII and its influence on modern computing, you can explore
resources such as ASCII Wikipedia.

Unicode

- Unicode is a more comprehensive character encoding system that supports
over 143,000 characters from various writing systems worldwide.
- Unicode uses different encoding formats such as UTF-8, UTF-16, and
UTF-32, which store characters using 8-bit, 16-bit, or 32-bit code units. UTF stands for
Unicode Transformation Format.
- UTF-8 is the most common encoding on the web, where:
- Basic characters like those in ASCII are stored using 1 byte.
- Complex characters, such as those from other languages (Chinese, Arabic),
require more bytes.
- Example: The character ‘A’ is 65 in both ASCII and Unicode, but the
character ‘€’ (Euro sign) is represented as 8364 in Unicode.
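
A short Python sketch (added for illustration) showing how the same characters are stored
under different Unicode encodings; 'A' needs one UTF-8 byte while '€' needs three:

for ch in ("A", "€"):
    print(ch, ord(ch),
          ch.encode("utf-8"),       # 1 byte for 'A', 3 bytes for '€'
          ch.encode("utf-16-be"),   # 2 bytes each (both are in the Basic Multilingual Plane)
          ch.encode("utf-32-be"))   # always 4 bytes per character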

History and Evolution of Unicode:

Unicode is a comprehensive character encoding system developed to handle the
representation of text for virtually every language, symbol, and script in the world. Its history
and evolution stem from the limitations of earlier encoding systems, such as ASCII, that were
unable to accommodate the growing need for a universal character set. Here's an overview of
its history:

1. The Need for Unicode (Late 1980s)

By the late 1980s, the limitations of the ASCII encoding system and various extended
character sets became more apparent. ASCII, which was a 7-bit encoding, could represent
only 128 characters. Extended versions (like ISO 8859 and others) could encode more, but
they were still limited in terms of supporting multiple languages and special characters.
Different languages, such as Chinese, Japanese, and Arabic, required their own encoding
standards, creating compatibility issues when exchanging text across systems and languages.

The rise of global computing and the internet created a pressing need for a unified system that
could handle multiple scripts without ambiguity. This is where Unicode comes in.

2. The Creation of Unicode (1987-1991)

The Unicode project began in 1987 when Joe Becker from Xerox, along with other engineers
from Apple and Xerox, started to address the need for a more comprehensive encoding
system. The goal was to create a universal character set that could support multiple
languages, symbols, and characters without requiring multiple incompatible encodings.

- Unicode Consortium Formation: The Unicode Consortium was formed in 1991,
with the goal of developing and promoting the Unicode Standard. This non-profit
organization continues to oversee Unicode's development to this day.
- Unicode 1.0 Release (1991): The first version of the Unicode Standard (Unicode 1.0)
was released in October 1991, defining characters for many world languages,
including Latin, Greek, Cyrillic, Hebrew, and Arabic scripts.

3. Evolution and Growth of Unicode (1990s to Present)

Unicode's early versions started small, but the encoding quickly expanded to accommodate
more scripts and languages. Over time, Unicode went from being just a useful tool to
becoming the global standard for character representation in software and the web.

- UTF-8 Encoding (1993): UTF-8, a variable-width encoding system for Unicode, was
developed in 1993 and became widely adopted due to its backward compatibility with
ASCII. It allowed systems to handle a wide variety of characters without drastically
increasing storage requirements for common English texts.
- Unicode and the Web: The growing importance of Unicode became particularly
evident in the mid-1990s, as the World Wide Web expanded. With websites serving
users from across the globe, it was necessary to represent all languages without
corruption or encoding errors.
- Unicode Standardization: Unicode quickly became the preferred encoding for most
major software platforms, including Microsoft Windows, macOS, and Unix-based
systems like Linux. It was also adopted as the underlying character encoding for
HTML, XML, and most programming languages.
- Emojis and Unicode: In the 2010s, Unicode gained even more visibility with its role
in standardizing emojis. These small pictographic characters became widely used
across mobile platforms, and Unicode made sure they were consistently represented
across devices and operating systems. Emojis are now part of Unicode updates, with
new ones being added each year.

4. Structure of Unicode

Unicode was initially designed as a 16-bit encoding, which provided space for 65,536 unique
characters. However, as more scripts were added, it became clear that even 16 bits were not
enough, especially with the addition of thousands of rare characters, ancient scripts, and
modern symbols like emojis. Today, Unicode uses 21 bits, which allows for over a million
possible characters (though fewer than 150,000 have been assigned).

Unicode consists of several encoding forms:

 UTF-8: A variable-length encoding that uses 1 to 4 bytes per character. It is the most
commonly used form, especially on the web.
 UTF-16: A variable-length encoding using 2 or 4 bytes per character.
 UTF-32: A fixed-length encoding that uses 4 bytes per character, but is less
commonly used due to its inefficient use of space.

UTF (Unicode Transformation Format) refers to a set of encoding schemes used to
represent Unicode characters in digital text. The goal of UTF encodings is to enable
the representation of every character from all writing systems, symbols, and control
codes defined in the Unicode standard. Here’s an overview of different UTF
encodings:

a. UTF-8

 Definition: UTF-8 is a variable-length character encoding that uses 1 to 4 bytes to
represent each character.

 Encoding: It’s backwards-compatible with ASCII (the first 128 characters are
identical), and can encode all possible Unicode characters by using additional bytes as
needed.
 Usage: UTF-8 is the most widely used encoding on the web today due to its efficiency
and compatibility.
 Pros:
 Efficient for representing ASCII characters (1 byte).
 Widely adopted across platforms and the web.
 Cons:
 Variable-length encoding means that more memory is used for non-ASCII
characters.

b. UTF-16

 Definition: UTF-16 is another variable-length encoding that uses either 2 or 4 bytes to
represent characters.
 Encoding: Most common characters (Basic Multilingual Plane) are represented in 2
bytes, while less common characters (like emojis) require 4 bytes.
 Usage: UTF-16 is commonly used in environments where text primarily includes
characters outside the ASCII range, such as in Windows operating systems.
 Pros:
 More efficient for languages that use a lot of non-ASCII characters (e.g.,
Chinese, Japanese).
 Directly supports a wide range of Unicode characters.
 Cons:
 Takes up more memory when handling primarily ASCII text.
 Not as compact or web-friendly as UTF-8.

c. UTF-32

 Definition: UTF-32 uses a fixed-length encoding where every character is represented
by exactly 4 bytes.
 Encoding: All Unicode characters are stored in the same length (4 bytes).
 Usage: UTF-32 is used in specific applications requiring constant-length encoding,
where performance is prioritized over memory efficiency.
 Pros:
 Fixed-length encoding simplifies text processing.
 Direct access to any character in a string (since all characters are 4 bytes).
 Cons:
 Consumes a large amount of memory, even for small text files.
 Less efficient for general use compared to UTF-8 or UTF-16.

Key Differences between UTF Encodings:

 UTF-8: Efficient for ASCII-heavy text, uses 1-4 bytes.


 UTF-16: More efficient for languages with many non-ASCII characters, uses 2 or 4
bytes.
 UTF-32: Simple but memory-heavy, always uses 4 bytes.

Each UTF encoding is designed for specific use cases depending on memory efficiency, text
complexity, and performance requirements. UTF-8 is generally preferred for web and
international applications, while UTF-16 or UTF-32 may be used in specialized systems like
certain operating systems or databases.

5. Modern-Day Adoption

Today, Unicode is the de facto standard for text encoding on most platforms, including:

- Operating Systems: Windows, macOS, Linux, and other systems fully support
Unicode.
- Programming Languages: Most modern programming languages (Python, Java, C#,
JavaScript) have built-in support for Unicode.
- Web: HTML5, CSS, XML, and JSON all use Unicode as their default character
encoding.

Pros of Unicode

- Universal Character Support: Unicode can represent almost every character in
every language, making it a truly global standard.
- Cross-Platform Compatibility: Text encoded in Unicode can be shared across
different operating systems and applications without corruption.
- Efficient with UTF-8: UTF-8 encoding is space-efficient for texts that are primarily
in ASCII but can still represent all Unicode characters when needed.
- Standardization: Unicode's standardization has simplified the development of
internationalized software and websites.

Cons of Unicode

- Storage Overhead: Some forms of Unicode (such as UTF-16 or UTF-32) use more
space than older encodings, which can be inefficient for storage or processing,
especially in memory-limited systems.
- Complexity in Processing: Handling variable-length encodings like UTF-8 can make
text processing more complex compared to fixed-width systems.
- Backward Compatibility: Transitioning older systems that used regional encodings
to Unicode can be challenging and requires reworking the codebases and data.

Conclusion: Unicode has revolutionized text encoding by providing a single, unified
standard for representing characters from nearly every language and script. From its early
days in the 1990s to its current status as a global standard for text and symbols, Unicode has
played a vital role in enabling communication and software development across linguistic
and cultural boundaries.
For more in-depth details, the official Unicode Consortium website provides extensive
documentation and updates on the Unicode Standard.

Comparison between ASCII and Unicode

Feature ASCII Unicode
Bit Length 7-bit (128 characters) Variable (UTF-8: 8-32 bits, etc.)
Languages English and limited symbols Supports nearly all world languages
Encoding Fixed length (1 byte per character) Variable length (1-4 bytes)

2. Numbers (Integer and Floating-point): How Numbers are Stored

Integers

- Definition: Integers are whole numbers (both positive and negative) without
decimal points (e.g., -10, 0, 25).
- Storage: Integers are typically stored using two's complement for both
positive and negative numbers. The number of bits used determines the range
of values.
- A 32-bit integer can store values between -2,147,483,648 to 2,147,483,647.
- A 64-bit integer can store larger numbers between
-9,223,372,036,854,775,808 and 9,223,372,036,854,775,807.
- Two's Complement:
- Negative numbers are stored by inverting the binary representation of their
absolute value and adding 1.
- Example:
 +5 in binary (8 bits): 00000101
 -5 in two’s complement (8 bits): 11111011
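
The same patterns can be produced in Python with int.to_bytes(), which stores signed
integers in two's complement (an illustrative sketch added to the notes):

plus_five = (5).to_bytes(1, "big", signed=True)
minus_five = (-5).to_bytes(1, "big", signed=True)
print(format(plus_five[0], "08b"))    # 00000101
print(format(minus_five[0], "08b"))   # 11111011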

3. Booleans: Storing True/False Values

Booleans

- Definition: A boolean represents a truth value, which can either be True or
False.
- Storage: Boolean values are typically stored as 1 bit, where:
- 1 represents True.
- 0 represents False.

Boolean Operations

- Boolean logic is fundamental in computing, especially in decision-making
processes.
- Common boolean operations include:
- AND: Both values must be true for the result to be true.
- OR: If either value is true, the result is true.
- NOT: Inverts the truth value (True becomes False, False becomes True).

Application of Booleans

- Booleans are widely used in programming for conditional statements, loops,
and decision-making processes.
- Example: In Python, the expression 5 > 3 returns True, which is internally
stored as 1.
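
A short Python sketch of these boolean operations (added for illustration); note that True
and False behave as the integers 1 and 0:

a, b = True, False
print(a and b)     # False (AND: both values must be true)
print(a or b)      # True  (OR: at least one value is true)
print(not a)       # False (NOT: inverts the truth value)
print(int(5 > 3))  # 1     -> True is stored as 1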
