Low level programming Lecture2

This document is a lecture outline for CS 107 at Stanford University, focusing on integer representations, bits, and bytes. It covers topics such as numerical bases, binary and hexadecimal representations, data sizes, and integer overflow in C. Additionally, it includes information about assignments, lab signups, and humorous binary anecdotes.

Uploaded by

Ashish Dhiwar

CS 107

Lecture 2: Integer
Representations and
Bits / Bytes

Computer Systems
Summer 2025
Stanford University

Computer Science Department

Reading:
Reader: Bits and Bytes
Textbook: Chapter 2.2

This document is copyright (C) Stanford Computer Science, Adam Keppler, and Olayinka Adekola licensed under Creative Commons Attribution 2.5 License. All rights reserved.
Based on slides created by Joel Ramirez, Nick Troccoli, Chris Gregg
Some Binary Humor (It is Either Funny or Not)

If you get an 11/100 on a CS test, but you claim it should be counted as a 'C', they'll probably decide you deserve the upgrade. - https://xkcd.com/953/
2
Assignment 0: Unix!
Assignment page: https://web.stanford.edu/class/cs107/assign0/
Assignment already released, due Friday, 6/27

Late submissions accepted till Sunday 6/29

3
Lab Signup
https://web.stanford.edu/class/archive/cs/cs107/cs107.1258/cgi-bin/lab_preferences

Labs will begin tomorrow. Please make sure to fill out the preference form.

4
Today's Topics
• Numerical Bases
• Binary, Bits, & Bytes
• Octal & Hexadecimal Bases
• ASCII & Characters

• Integer Representations
• Unsigned Numbers
• Signed Numbers
• Two’s Complement
• Two’s Complement Overflow
• Signed vs Unsigned Number Casting in C
• Signed and Unsigned Comparisons

• Data Sizes & The sizeof Operator


• Min and Max Integer Values
• Truncating Integers
• More on Extending the Bit representation of Numbers
• Addressing and Byte Ordering
• Boolean Algebra
5
Combinations of Bits Can Encode Anything

We can encode anything we want with bits, e.g., the ASCII character set.

6
Number Representations
• Unsigned Integers: positive integers and 0 (e.g. 0, 1, 2, … 99999…)
• Signed Integers: negative, positive, and 0 integers (e.g. …-2, -1, 0, 1, … 9999…)
• Floating Point Numbers: real numbers (e.g. 0.1, -12.2, 1.5 x 10^12)

Look up IEEE floating point if you’re interested ☺!

7
Data Sizes

On the myth computers (and most 64-bit computers today), an int is made up of 32 bits, or four 8-bit bytes. NOTE: the C language does not mandate exact sizes. See Figure 2.3 in your textbook.

8
Data Sizes

There are guarantees on the lower bounds for type sizes, but you should expect that the myth machines will have the numbers in the 64-bit column.

9
Data Sizes

The sizes of the fixed-width types int32_t (4 bytes) and int64_t (8 bytes), declared in <stdint.h>, are guaranteed.

10
Data Sizes
C allows a variety of ways to
order keywords to define a type.
The following all have the same
meaning:

unsigned long
unsigned long int
long unsigned
long unsigned int

11
Transitioning To Larger Datatypes

• Early 2000s: most computers were 32-bit. This means that pointers were 4 bytes (32 bits).
• 32-bit pointers store a memory address from 0 to 2^32 - 1, equaling 2^32 bytes of addressable memory. This equals 4 Gigabytes, meaning that 32-bit computers could have at most 4GB of memory (RAM)!
• Because of this, computers transitioned to 64-bit. This means that datatypes were enlarged; pointers in programs were now 64 bits.
• 64-bit pointers store a memory address from 0 to 2^64 - 1, equaling 2^64 bytes of addressable memory. This equals 16 Exabytes, meaning that 64-bit computers could have at most 1024*1024*1024*16 GB of memory (RAM)!
12
Addressing and Byte Ordering

On the myth machines, pointers are 64 bits long, meaning that a program can "address" up to 2^64 bytes of memory, because each byte is individually addressable.

This is a lot of memory! It is 16 exabytes, or 1.84 x 10^19 bytes. Older, 32-bit machines could only address 2^32 bytes, or 4 Gigabytes.

64-bit machines can address 4 billion times more memory than 32-bit machines...

Machines will not need to address more than 2^64 bytes of memory for a long, long time.

13
Overflow
• If you exceed the maximum value of your bit representation, you wrap around (overflow) back to the smallest value. With 4 bits:

0b1111 + 0b1 = 0b0000

• If you go below the minimum value of your bit representation, you wrap around (overflow) back to the largest value. With 4 bits:

0b0000 - 0b1 = 0b1111

14
Overflow in Unsigned Addition
When integer operations overflow in C, the runtime does not produce an error:

#include <stdio.h>
#include <stdlib.h>
#include <limits.h> // for UINT_MAX

int main() {
    unsigned int a = UINT_MAX;
    unsigned int b = 1;
    unsigned int c = a + b;
    printf("a = %u\n", a);
    printf("b = %u\n", b);
    printf("a + b = %u\n", c);
    return 0;
}

$ ./unsigned_overflow
a = 4294967295
b = 1
a + b = 0

Technically, unsigned integers in C don't overflow, they just wrap. You need to be aware of the size of your numbers. Here is one way to test if an addition will fail:

// for addition
#include <limits.h>
unsigned int a = <something>;
unsigned int x = <something>;
if (a > UINT_MAX - x) /* `a + x` would overflow */;
15
Unsigned Integers
For non-negative (unsigned) integers, there is a 1-to-1 relationship between the decimal representation of a number and its binary representation. If you have a 4-bit number, there are 16 possible combinations, and the unsigned numbers go from 0 to 15:
0b0000 = 0 0b0001 = 1 0b0010 = 2 0b0011 = 3
0b0100 = 4 0b0101 = 5 0b0110 = 6 0b0111 = 7
0b1000 = 8 0b1001 = 9 0b1010 = 10 0b1011 = 11
0b1100 = 12 0b1101 = 13 0b1110 = 14 0b1111 = 15

The range of an unsigned number is 0 → 2^w - 1, where w is the number of bits in our integer. For example, a 32-bit unsigned int can represent numbers from 0 to 2^32 - 1, or 0 to 4,294,967,295.
16
Unsigned Integers

17
Computers use a limited number of bits for numbers
#include <stdio.h>
#include <stdlib.h>

int main() {
    int a = 200;
    int b = 300;
    int c = 400;
    int d = 500;
    int answer = a * b * c * d;
    printf("%d\n", answer);
    return 0;
}

200 * 300 * 400 * 500 = 12,000,000,000

$ gcc -g -O0 mult-test.c -o mult-test
$ ./mult-test
-884901888
$
18
Computers use a limited number of bits for numbers
#include <stdio.h>
#include <stdlib.h>

int main() {
    int a = 200;
    int b = 300;
    int c = 400;
    int d = 500;
    int answer = a * b * c * d;
    printf("%d\n", answer);
    return 0;
}

Recall that in base 10, you can represent:
10 numbers with one digit (0 - 9),
100 numbers with two digits (00 - 99),
1000 numbers with three digits (000 - 999)
I.e., with n digits, you can represent up to 10^n numbers.

In base 2, you can represent:
2 numbers with one digit (0 - 1)
4 numbers with two digits (00 - 11)
8 numbers with three digits (000 - 111)
I.e., with n digits, you can represent up to 2^n numbers.

The C int type is a "32-bit" number, meaning it uses 32 binary digits. That means we can represent up to 2^32 numbers.
19
Computers use a limited number of bits for numbers
#include <stdio.h>
#include <stdlib.h>

int main() {
    int a = 200;
    int b = 300;
    int c = 400;
    int d = 500;

    int answer = a * b * c * d;
    printf("%d\n", answer);
    return 0;
}

2^32 = 4,294,967,296
200 * 300 * 400 * 500 = 12,000,000,000

Turns out it is worse -- ints are signed, meaning that the largest positive number is
(2^32 / 2) - 1 = 2^31 - 1 = 2,147,483,647

$ gcc -g -O0 mult-test.c -o mult-test
$ ./mult-test
-884901888
$
20
Computers use a limited number of bits for numbers
#include <stdio.h>
#include <stdlib.h>

int main() {
    int a = 200;
    int b = 300;
    int c = 400;
    int d = 500;

    int answer = a * b * c * d;
    printf("%d\n", answer);
    return 0;
}

The good news: all of the following produce the same (wrong) answer:

(500 * 400) * (300 * 200)
((500 * 400) * 300) * 200
((200 * 500) * 300) * 400
400 * (200 * (300 * 500))

$ gcc -g -O0 mult-test.c -o mult-test
$ ./mult-test
-884901888
$
21
Let's look at a different program
#include <stdio.h>
#include <stdlib.h>

int main() {
    float a = 3.14;
    float b = 1e20;

    printf("(3.14 + 1e20) - 1e20 = %f\n", (a + b) - b);
    printf("3.14 + (1e20 - 1e20) = %f\n", a + (b - b));

    return 0;
}

$ gcc -g -Og -std=gnu99 float-mult-test.c -o float-mult-test
$ ./float-mult-test
(3.14 + 1e20) - 1e20 = 0.000000
3.14 + (1e20 - 1e20) = 3.140000
$

This is a bigger problem: here the answer depends on the order of the operations!
22
Information Storage

In C, everything can be thought of as a block of 8 bits called a "byte"

25
Byte Range
Because a byte is made up of 8 bits, we can represent the range of a byte as
follows:

00000000 to 11111111

This range is 0 to 255 in decimal.

But, neither binary nor decimal is particularly convenient to write out bytes
(binary is too long, and decimal isn't numerically friendly for byte
representation)

So, we use "hexadecimal," (base 16).

26
Hexadecimal
• When working with bits, we often have large numbers with 32 or 64 bits.
• Rather than writing out every bit, we’ll represent bits in base-16; this is called hexadecimal.

0110 1010 0011
0-15 0-15 0-15

27
Hexadecimal
• Hexadecimal is base-16, so we need digits for 0-15. How do we do this?

Digit: 0 1 2 3 4 5 6 7 8 9 a  b  c  d  e  f
Value: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

28
Hexadecimal
Hexadecimal has 16 digits, so we augment our normal 0-9 digits with six
more digits: A, B, C, D, E, and F.

Figure 2.2 in the textbook shows the hex digits and their binary and decimal
values:

29
Hexadecimal
• When working with bits, we often have large numbers with 32 or 64 bits.
• Rather than writing out every bit, we’ll represent bits in base-16; this is called hexadecimal.

6    A    3
0-15 0-15 0-15

Each is a base-16 digit!


30
Hexadecimal
• We distinguish hexadecimal numbers by prefixing them with 0x, and binary numbers with 0b. The 0x prefix works in standard C; the 0b prefix is a GCC/Clang extension in older C standards and became standard in C23.
• E.g. 0xf5 is 0b11110101

0x f 5
1111 0101

31
Practice: Hexadecimal to Binary
What is 0x173A in binary?

Hexadecimal 1 7 3 A
Binary 0001 0111 0011 1010

32
Practice: Binary to Hexadecimal
What is 0b1111001010 in hexadecimal? (Hint: start from the right)

Binary 11 1100 1010


Hexadecimal 3 C A

33
Hexadecimal
Convert: 0b1111001010110110110011 to hexadecimal.

(start from the right)

0b1111001010110110110011
is hexadecimal 3CADB3

34
Decimal to Hexadecimal
To convert from decimal to hexadecimal, repeatedly divide the number by 16. The remainders, read from last to first, make up the digits of the hex number. For example, 165 / 16 = 10 remainder 5, and 10 / 16 = 0 remainder 10 (a), so 165 is 0xa5.

41
Hexadecimal to Decimal
To convert from hexadecimal to decimal, multiply each of the hexadecimal digits by the appropriate power of 16 and add the results. For example, 0xa5 = 10 * 16^1 + 5 * 16^0 = 165.

42
Hexadecimal: It’s funky but concise
• Let’s take a byte (8 bits):

165          Base-10: Human-readable, but cannot easily interpret on/off bits
0b10100101   Base-2: Yes, computers use this, but not human-readable
0xa5         Base-16: Easy to convert to Base-2, more “portable” as a human-readable format

(fun fact: a half-byte is called a nibble or nybble)

43
Let the computer do it!
Honestly, hex to decimal and vice versa are easy to let the computer
handle. You can either use a search engine (Google does this
automatically), or you can use a python one-liner:

44
Let the computer do it!

You can also use Python to convert to and from binary:

(but you should memorize this as it is easy and you will use it frequently)

(also might show up in an offline exam )

46
How to Represent A Signed Value
A signed integer is a negative, 0, or positive
integer.

How can we represent both negative and


positive numbers in binary?

47
Signed Integers
• A signed integer is a negative integer, 0, or a positive integer.
• Problem: How can we represent negative and positive numbers in binary?

Idea: let’s reserve the most


significant bit to store the sign.

48
Sign Magnitude Representation

0110
positive 6

1011
negative 3
49
Sign Magnitude Representation

0000
positive 0

1000
negative 0
50
Sign Magnitude Representation
1 000 = -0 0 000 = 0
1 001 = -1 0 001 = 1
1 010 = -2 0 010 = 2
1 011 = -3 0 011 = 3
1 100 = -4 0 100 = 4
1 101 = -5 0 101 = 5
1 110 = -6 0 110 = 6
1 111 = -7 0 111 = 7

• We’ve only represented 15 of our 16 available numbers!

51
Sign Magnitude Representation
• Pro: easy to represent, and easy to convert to/from decimal.
• Con: +-0 is not intuitive
• Con: two bit patterns encode zero, so we waste a pattern that could represent another number
• Con: arithmetic is tricky: we need to find the sign, then maybe subtract (borrow and carry, etc.), then maybe change the sign. This complicates the hardware support for something as fundamental as addition.

Can we do better?

52
Now Let's Try a Better Approach!

53
A Better Idea
• Ideally, binary addition would just work regardless of whether the number is
positive or negative.

0101
+????

0000
54
A Better Idea
• Ideally, binary addition would just work regardless of whether the number is
positive or negative.

0101
+1011

0000
55
A Better Idea
• Ideally, binary addition would just work regardless of whether the number is
positive or negative.

0011
+????

0000
56
A Better Idea
• Ideally, binary addition would just work regardless of whether the number is
positive or negative.

0011
+1101

0000
57
A Better Idea
• Ideally, binary addition would just work regardless of whether the number is
positive or negative.

0000
+????

0000
58
A Better Idea
• Ideally, binary addition would just work regardless of whether the number is
positive or negative.

0000
+0000

0000
59
A Better Idea
Decimal  Positive  Negative      Decimal  Positive             Negative
0        0000      0000          8        1000                 1000
1        0001      1111          9        1001 (same as -7!)   NA
2        0010      1110          10       1010 (same as -6!)   NA
3        0011      1101          11       1011 (same as -5!)   NA
4        0100      1100          12       1100 (same as -4!)   NA
5        0101      1011          13       1101 (same as -3!)   NA
6        0110      1010          14       1110 (same as -2!)   NA
7        0111      1001          15       1111 (same as -1!)   NA


60
There Seems to Be a Pattern Here…

  0101     0011     0000
+ 1011   + 1101   + 0000
------   ------   ------
  0000     0000     0000

• The negative number is the positive number inverted, plus one!
61
There Seems to Be a Pattern Here…

A binary number plus its inverse is all 1s. Add 1 to this to carry over all 1s and get 0!

0101 1111
+1010 +0001

1111 0000
62
Another Trick
• To find the negative equivalent of a number, work right-to-left and copy down all digits up to and including the first 1. Then, invert the rest of the digits.

100100
+??????

000000
63
Another Trick
• To find the negative equivalent of a number, work right-to-left and copy down all digits up to and including the first 1. Then, invert the rest of the digits.

100100
+???100

000000
64
Another Trick
• To find the negative equivalent of a number, work right-to-left and copy down all digits up to and including the first 1. Then, invert the rest of the digits.

100100
+ 011100

000000
65
Two’s Complement

66
Two’s Complement
• In two’s complement, we represent a positive number as itself, and its negative equivalent as the two’s complement of itself.
• The two’s complement of a number is the binary digits inverted, plus 1.
• This works to convert from positive to negative, and back from negative to positive!

67
Two’s Complement
• Con: more difficult to represent, and
difficult to convert to/from decimal and
between positive and negative.
• Pro: only 1 representation for 0!
• Pro: all bits are used to represent as
many numbers as possible
• Pro: the most significant bit still indicates
the sign of a number.
• Pro: addition works for any combination
of positive and negative!

68
Two’s Complement
• Adding two numbers is just…adding! There is no special case needed for
negatives. E.g. what is 2 + -5?

0010 2

+1011 -5

1101 -3

69
Two’s Complement
• Subtracting two numbers is just performing the two’s complement on one of
them and then adding. E.g. 4 – 5 = -1.

0100 4 0100 4

- 0101 5 +1011 -5

1111 -1

70
How to Read Two’s Complement #s
• Give the most significant bit a negative weight (multiply its place value by -1) and give all the other bits their normal positive weights

Bit:      1     1     1     0
Weight:  -2^3   2^2   2^1   2^0

= 1*-8 + 1*4 + 1*2 + 0*1 = -2

71
How to Read Two’s Complement #s
• Give the most significant bit a negative weight (multiply its place value by -1) and give all the other bits their normal positive weights

Bit:      0     1     1     0
Weight:  -2^3   2^2   2^1   2^0

= 0*-8 + 1*4 + 1*2 + 0*1 = 6

72
Practice: Two’s Complement
What are the negative or positive equivalents of the numbers below?
a) -4 (1100)
b) 7 (0111)
c) 3 (0011)

73
Practice: Two’s Complement
What are the negative or positive equivalents of the numbers below?
a) -4 (1100) -> 4 (0100)
b) 7 (0111) -> -7 (1001)
c) 3 (0011) -> -3 (1101)

74
Some Extra Slides for Review

75
Two's Complement
In practice, a negative number in two's complement is obtained by inverting all the bits of its positive counterpart*, and then adding 1, or: -x = ~x + 1
Example: The number 2 is represented as normal in binary: 0010

-2 is represented by inverting the bits, and adding 1:

0010 ☞ 1101

  1101
+    1
  1110

*Inverting all the bits of a number is its "one's complement"
76


Two's Complement
To convert a negative number to a positive number, perform the same steps!
Example: The number -5 is represented in two's complement as: 1011

5 is represented by inverting the bits, and adding 1:

1011 ☞ 0100

  0100
+    1
  0101

Shortcut: start from the right, and write down digits until you get to a 1 (inclusive):
   1
Now invert all the rest of the digits:
0101
77
Two's Complement: Neat Properties
There are a number of useful properties
associated with two's complement
numbers:

1. There is only one zero (yay!)


2. The highest order bit (left-most) is 1
for negative, 0 for positive (so it is
easy to tell if a number is negative)
3. Adding two numbers is just…adding!
Example:
2 + -5 = -3

0010 ☞ 2
+1011 ☞ -5
1101 ☞ -3 decimal (wow!)
78
Two's Complement: Neat Properties
More useful properties:

4. Subtracting two numbers is simply performing the two's complement on one of them and then adding.
Example:
4 - 5 = -1

0100 ☞ 4, 0101 ☞ 5
Find the two's complement of 5: 1011
add:
  0100 ☞ 4
+ 1011 ☞ -5
  1111 ☞ -1 decimal
79
Two's Complement: Neat Properties
More useful properties:

5. Multiplication of two's complement works just by multiplying (throw away overflow digits).

Example: -2 * -3 = 6

       1110 ☞ -2
     x 1101 ☞ -3
       ----
       1110
      0000
     1110
  + 1110
   --------
   10110110 ☞ keep the low 4 bits: 0110 ☞ 6
80
Practice
Convert the following 4-bit numbers
from positive to negative, or from
negative to positive using two's
complement notation:

a. -4 (1100) ☞

b. 7 (0111) ☞

c. 3 (0011) ☞

d. -8 (1000) ☞

81
Practice
Convert the following 4-bit numbers
from positive to negative, or from
negative to positive using two's
complement notation:

a. -4 (1100) ☞ 0100

b. 7 (0111) ☞ 1001

c. 3 (0011) ☞ 1101

d. -8 (1000) ☞ 1000 (?! If you look at the chart, +8 cannot be represented in two's complement with 4 bits!)
82
Practice
Convert the following 8-bit numbers
from positive to negative, or from
negative to positive using two's
complement notation:

a. -4 (11111100) ☞ 00000100

b. 27 (00011011) ☞ 11100101

c. -127 (10000001) ☞ 01111111

d. 1 (00000001) ☞ 11111111

83
History: Two’s complement
• Two’s Complement was first proposed by John von Neumann in First Draft of a Report on the EDVAC (1945)
• That same year, he also invented the merge sort algorithm
• Many early computers used sign-magnitude or one’s complement
  8-bit one’s complement: +7 is 0b0000 0111, -7 is 0b1111 1000
• The System/360, developed by IBM in 1964, was widely popular (had 1024KB memory) and established two’s complement as the dominant binary representation of integers

[Figures: EDSAC (1949); System/360 (1964)]
84
Casting Between Signed and Unsigned
Converting between two numbers in C can happen explicitly (using a parenthesized cast), or implicitly (without a cast):

explicit:
int tx, ty;
unsigned ux, uy;
…
tx = (int) ux;
uy = (unsigned) ty;

implicit:
int tx, ty;
unsigned ux, uy;
…
tx = ux; // cast to signed
uy = ty; // cast to unsigned

When casting between signed and unsigned types of the same size, the underlying bits do not change. The only difference is that the bits are now interpreted according to the new type.
NOTE: Converting a signed number to unsigned preserves the bits, not the number!
85
Casting Between Signed and Unsigned
When casting between signed and unsigned types of the same size, the underlying bits do not change. The only difference is that the bits are now interpreted according to the new type.
You cannot convert a signed number to its unsigned counterpart using a cast!

// test_cast.c
#include <stdio.h>
#include <stdlib.h>

int main() {
    int v = -12345;
    unsigned int uv = (unsigned int) v;

    printf("v = %d, uv = %u\n", v, uv);

    return 0;
}

$ ./test_cast
v = -12345, uv = 4294954951

Signed -> Unsigned: -12345 goes to 4294954951, not 12345.

86
IMPORTANT NOTE
• Because types are just about how we read memory, it is important to note that casting does not change the values or bits, only the meaning that we expect them to have

• BEWARE: Expectations are like assumptions; they can be violated or incorrect

87
Casting Between Signed and Unsigned
printf has three 32-bit integer format specifiers:

%d : signed 32-bit int
%u : unsigned 32-bit int
%x : hex 32-bit int

As long as the value is a 32-bit type, printf will treat it according to the formatter it is applying:

int x = -1;
unsigned u = 3000000000; // 3 billion

printf("x = %u = %d\n", x, x);
printf("u = %u = %d\n", u, u);

$ ./test_printf
x = 4294967295 = -1
u = 3000000000 = -1294967296

88
Signed vs Unsigned Number Wheels

89
Comparison between signed and unsigned integers
When a C expression has combinations of signed and unsigned variables, you
need to be careful!

If an operation is performed that has both a signed and an unsigned value, C implicitly casts the signed argument to unsigned and performs the operation assuming both numbers are non-negative. Let's take a look…

Expression Type Evaluation


0 == 0U
-1 < 0
-1 < 0U
2147483647 > -2147483647 - 1
2147483647U > -2147483647 - 1
2147483647 > (int)2147483648U
-1 > -2
(unsigned)-1 > -2
90
Comparison between signed and unsigned integers
When a C expression has combinations of signed and unsigned variables, you
need to be careful!

If an operation is performed that has both a signed and an unsigned value, C implicitly casts the signed argument to unsigned and performs the operation assuming both numbers are non-negative. Let's take a look…

Expression                        Type      Evaluation
0 == 0U                           Unsigned  1
-1 < 0                            Signed    1
-1 < 0U                           Unsigned  0
2147483647 > -2147483647 - 1      Signed    1
2147483647U > -2147483647 - 1     Unsigned  0
2147483647 > (int)2147483648U     Signed    1
-1 > -2                           Signed    1
(unsigned)-1 > -2                 Unsigned  1
91
Note: In C, 0 is false and everything else is true. When C produces a boolean value, it always chooses 1 to represent true.
Comparison between signed and unsigned integers
Let's try some more…a bit more abstractly.
int s1, s2, s3, s4;
unsigned int u1, u2, u3, u4;
What is the value of this
expression?

u1 > s3

92
Comparison between signed and unsigned integers
Let's try some more…a bit more abstractly.
int s1, s2, s3, s4;
unsigned int u1, u2, u3, u4;
Which of the following statements are true? (assume that variables are set to values that place them in the spots shown)

u1 > s3 : true

93
Overflow
• What is happening here? Assume 4-bit numbers.
  0b1101
+ 0b0100

[Figure: 4-bit number wheel with the values 0 through 15 arranged in a circle]

94
Overflow
• What is happening here? Assume 4-bit numbers.
  0b1101
+ 0b0100

[Figure: 4-bit number wheel with the values 0 through 15 arranged in a circle]

Signed: -3 + 4 = 1 (no overflow)
Unsigned: 13 + 4 = 1 (overflow)

95
Limits and Comparisons
1. What is the…
        Largest unsigned?   Largest signed?   Smallest signed?
char
int

2. Will the following char comparisons evaluate to true or false?

i. -7 < 4         iii. (char) 130 > 4
ii. -7 < 4U       iv. (char) -132 > 2

96
Limits and Comparisons
1. What is the…
        Largest unsigned?        Largest signed?          Smallest signed?
char    2^8 - 1 = 255            2^7 - 1 = 127            -2^7 = -128
int     2^32 - 1 = 4294967295    2^31 - 1 = 2147483647    -2^31 = -2147483648

These are available as UCHAR_MAX, INT_MIN, INT_MAX, etc. in the <limits.h> header.
97
Limits and Comparisons
2. Will the following char comparisons evaluate to true or false?
i. -7 < 4 true         iii. (char) 130 > 4 false
ii. -7 < 4U false      iv. (char) -132 > 2 true

By default, numeric constants in C are signed ints, unless they are suffixed with u (unsigned) or L (long).
98
The sizeof Operator
size_t sizeof(type);

// Example
long int_size_bytes = sizeof(int); // 4
long short_size_bytes = sizeof(short); // 2
long char_size_bytes = sizeof(char); // 1

sizeof takes a type (or an expression) as its operand and yields the size of that type, in bytes, as a value of the unsigned type size_t.

99
The sizeof Operator
As we have seen, integer types are limited by the number of bits they hold. On the 64-bit myth machines, we can use the sizeof operator to find how many bytes each type uses:

#include <stdio.h>
#include <stddef.h> // for size_t

int main() {
    printf("sizeof(char): %d\n", (int) sizeof(char));
    printf("sizeof(short): %d\n", (int) sizeof(short));
    printf("sizeof(int): %d\n", (int) sizeof(int));
    printf("sizeof(unsigned int): %d\n", (int) sizeof(unsigned int));
    printf("sizeof(long): %d\n", (int) sizeof(long));
    printf("sizeof(long long): %d\n", (int) sizeof(long long));
    printf("sizeof(size_t): %d\n", (int) sizeof(size_t));
    printf("sizeof(void *): %d\n", (int) sizeof(void *));
    return 0;
}

$ ./sizeof
sizeof(char): 1
sizeof(short): 2
sizeof(int): 4
sizeof(unsigned int): 4
sizeof(long): 8
sizeof(long long): 8
sizeof(size_t): 8
sizeof(void *): 8

Type     Width in bytes   Width in bits
char     1                8
short    2                16
int      4                32
long     8                64
void *   8                64
100
MIN and MAX values for integers
Because we now know how bit patterns for integers work, we can figure out the maximum and minimum values, designated by INT_MAX, UINT_MAX, INT_MIN, (etc.), which are defined in limits.h

Type            Width (bytes)  Width (bits)  Min in hex (name)            Max in hex (name)
char            1              8             80 (CHAR_MIN)                7F (CHAR_MAX)
unsigned char   1              8             0                            FF (UCHAR_MAX)
short           2              16            8000 (SHRT_MIN)              7FFF (SHRT_MAX)
unsigned short  2              16            0                            FFFF (USHRT_MAX)
int             4              32            80000000 (INT_MIN)           7FFFFFFF (INT_MAX)
unsigned int    4              32            0                            FFFFFFFF (UINT_MAX)
long            8              64            8000000000000000 (LONG_MIN)  7FFFFFFFFFFFFFFF (LONG_MAX)
unsigned long   8              64            0                            FFFFFFFFFFFFFFFF (ULONG_MAX)
101
Min and Max Integer Values
• You can also find constants in the standard library that define the max and min for each type on that machine (architecture)

• Visit <limits.h> or <stdint.h> and look for macros like:


INT_MIN
INT_MAX
UINT_MAX
LONG_MIN
LONG_MAX
ULONG_MAX

102
Expanding Bit Representations
• Sometimes, we want to convert between two integers of different sizes (e.g. short to int, or int to long).
• We might not be able to convert from a bigger data type to a smaller data type, but we do want to always be able to convert from a smaller data type to a bigger data type.
• For unsigned values, we can add leading zeros to the representation (“zero extension”)
• For signed values, we can repeat the sign of the value for new digits (“sign extension”)
• Note: when doing <, >, <=, >= comparison between different size types, the smaller type is promoted to the larger type.
103
Expanding the bit representation of a number
For signed values, we want the number to remain the same, just with more
bits. In this case, we perform a "sign extension" by repeating the sign of the
value for the new digits. E.g.,
short s = 4;
// short is a 16-bit format, so s = 0000 0000 0000 0100b

int i = s;
// conversion to 32-bit int, so i = 0000 0000 0000 0000 0000 0000 0000 0100b
— or —

short s = -4;
// short is a 16-bit format, so s = 1111 1111 1111 1100b

int i = s;
// conversion to 32-bit int, so i = 1111 1111 1111 1111 1111 1111 1111 1100b

Converting from a smaller type to a larger type is also often called promotion, i.e., the number was promoted from short to int
104
Sign-extension Example
// show_bytes() defined on pg. 45, Bryant and O'Halloran
int main() {
    short sx = -12345;        // -12345
    unsigned short usx = sx;  // 53191
    int x = sx;               // -12345
    unsigned ux = usx;        // 53191
    printf("sx = %d:\t", sx);
    show_bytes((byte_pointer) &sx, sizeof(short));
    printf("usx = %u:\t", usx);
    show_bytes((byte_pointer) &usx, sizeof(unsigned short));
    printf("x = %d:\t", x);
    show_bytes((byte_pointer) &x, sizeof(int));
    printf("ux = %u:\t", ux);
    show_bytes((byte_pointer) &ux, sizeof(unsigned));
    return 0;
}

$ ./sign_extension
sx = -12345: c7 cf
usx = 53191: c7 cf
x = -12345: c7 cf ff ff
ux = 53191: c7 cf 00 00

(careful: this was printed on the little-endian myth machines!)

105
Truncating Numbers: Signed
What if we want to reduce the number of bits that a number holds? E.g.

int x = 53191;         // 53191
short sx = (short) x;  // -12345
int y = sx;

This is a form of overflow! We have altered the value of the number. Be careful!

We don't have enough bits in the short to store the value we have in the int, so strange values occur.

What is y above? We are converting a short to an int, so we sign-extend, and we get -12345!
1100 1111 1100 0111 becomes
1111 1111 1111 1111 1100 1111 1100 0111
Play around here: http://www.convertforfree.com/twos-complement-calculator/
106
Truncating Numbers: Signed
If the number does fit into the smaller representation in the current form, it will convert just fine.

int x = -3;            // -3
short sx = (short) x;  // -3
int y = sx;            // -3

x: 1111 1111 1111 1111 1111 1111 1111 1101 becomes
sx: 1111 1111 1111 1101

Play around here: http://www.convertforfree.com/twos-complement-calculator/
107


Truncating Numbers: Unsigned
We can also lose information with unsigned numbers:

unsigned int x = 128000;
unsigned short sx = (unsigned short) x;
unsigned int y = sx;

Bit representation for x = 128000 (32-bit unsigned int):

0000 0000 0000 0001 1111 0100 0000 0000

Truncated unsigned short sx:

1111 0100 0000 0000

which equals 62464 decimal.

Converting back to an unsigned int, y = 62464
108


Overflow In Practice: PSY
Signed overflow wraps around to the negative numbers:

YouTube fell into this trap: their view counter was a signed, 32-bit int. They fixed it after it was noticed, but for a while, the view count for Gangnam Style (the first video with over INT_MAX number of views) was negative.

YouTube: “We never thought a video would be watched in numbers greater than a 32-bit integer (=2,147,483,647 views), but that was before we met PSY. "Gangnam Style" has been viewed so many times we had to upgrade to a 64-bit integer (9,223,372,036,854,775,808)!”
“We saw this coming a couple months ago and updated our systems to prepare for it”
109
Overflow in Signed Addition
In the news on January 5, 2022 (!):

https://arstechnica.com/gadgets/2022/01/google-fixes-nightmare-android-bug-that-stopped-user-from-calling-911/

110
Overflow in Signed Addition
Signed overflow wraps around to the negative numbers.

#include <stdio.h>
#include <stdlib.h>
#include <limits.h> // for INT_MAX

int main() {
    int a = INT_MAX;
    int b = 1;
    int c = a + b;

    printf("a = %d\n", a);
    printf("b = %d\n", b);
    printf("a + b = %d\n", c);
    return 0;
}

$ ./signed_overflow
a = 2147483647
b = 1
a + b = -2147483648

Technically, signed integers in C produce undefined behavior when they overflow. On two's complement machines (virtually all machines these days), it typically wraps around predictably, but the language does not promise this. You can test to see if your addition will be correct:

// for addition
#include <limits.h>
int a = <something>;
int x = <something>;
if ((x > 0) && (a > INT_MAX - x)) /* `a + x` would overflow */;
if ((x < 0) && (a < INT_MIN - x)) /* `a + x` would underflow */;
111
Overflow

At which points can overflow occur for signed and unsigned int? (assume binary values shown are all 32 bits)

[Figure: number wheel of 32-bit patterns running from 000…000, 000…001, 000…010, 000…011, … through 011…101, 011…110, 011…111, then 100…000, 100…001, 100…010, … up to 111…100, 111…101, 111…110, 111…111. Point X is marked in the upper part of the wheel and point Y in the lower part.]

A. Signed and unsigned can both overflow at points X and Y
B. Signed can overflow only at X, unsigned only at Y
C. Signed can overflow only at Y, unsigned only at X
D. Signed can overflow at X and Y, unsigned only at X
E. Other

112
Overflow In Practice: Gandhi
• In the game “Civilization”, each
civilization leader had an
“aggression” rating. Gandhi was
meant to be peaceful, and had a
score of 1.
• If you adopted “democracy”, all
players’ aggression reduced by 2.
Gandhi’s went from 1 to 255!
• Gandhi then became a big fan of
nuclear weapons.
https://kotaku.com/why-gandhi-is-such-an-asshole-in-civilization-1653818245

113
Overflow In Practice: Games

[Figures: the impossible Pacman Level 256; Super Mario Bros (NES), losing all extra lives if you exceed 127]
114
Overflow In Practice: Timestamps
Many systems store timestamps as the number of seconds since Jan. 1, 1970 in a signed 32-bit integer.
• Problem: the latest timestamp that can be represented this way is 3:14:07 UTC on Jan. 19, 2038!

• Casino erroneous slot machine payout ($42,949,672.76) due to overflow


• Reported vulnerability CVE-2019-3857 in libssh2 may allow a hacker to
remotely execute code
• Apple CoreGraphics overflow bug exploited via iMessage, used in known
spyware

115
Overflow in Practice
• Pacman Level 256
• Make sure to reboot Boeing Dreamliners every 248 days
• Comair/Delta airline had to cancel thousands of flights days before Christmas
– they exceeded 32,767 crew changes (limit of short)
• Many operating systems may have issues storing timestamp values beginning
on Jan 19, 2038
• Donkey Kong Kill Screen

116
