Week_02_bits_and_bytes
Lecture 2:
Representing text and numbers in
computers
COMP 1238
Introduction to Data
Management
Monday, Sept 9
Starting at 1:05
◎ AtKlass: HPYX
◎ AtKlass reg: E2CJ
4
Computer Memory
and Storage
The “bit” – when we are limited to only two
states
Self portrait of Samuel Morse
1812
7
The “bit”
◎ The tiniest piece of information in
computing and communication
◎ Can hold one of two possible
values
○ 0/1, true/false, on/off
◎ Allegedly, “bit” is short for “Binary
Digit”
◎ What are the two “values” in
Morse code on the pic?
8
CD under a microscope
9
Bytes – groups of bits
◎ 1 bit is of limited use – so we group bits into “bytes”
◎ Groups of 6, 8 and 9 bits were common in the early days
◎ Now everyone uses bytes consisting of 8 bits
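To see why grouping bits helps, here is a small Python sketch (not from the slides; the helper name `count_values` is made up for illustration) that counts how many distinct patterns a group of bits can form:

```python
from itertools import product

def count_values(n_bits: int) -> int:
    """Count the distinct patterns that n_bits bits can form."""
    # Each bit is 0 or 1, so list every combination and count them.
    return sum(1 for _ in product("01", repeat=n_bits))

print(count_values(1))  # 2 patterns: 0 and 1
print(count_values(8))  # 256 patterns in one 8-bit byte
```

Every extra bit doubles the number of patterns, which is why one bit alone is of limited use.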
10
Punched Card
12
Volatile and Non-Volatile
memory
◎ Volatile = gets erased when electricity is gone
◎ Non-volatile memory is used for long term storage
○ Hard drive, SD card, Flash drive
○ CD / DVD discs, Tape, punch cards, clay tablets …
◎ Volatile memory is used as short-term working
memory; it’s usually much faster but much more
expensive
13
RAM vs Storage
◎ Long term memory - called storage or disk
○ Usually slower and optimized to read or write large
chunks of data at a time
◎ Volatile working memory - usually called RAM
○ RAM = Random Access Memory – it’s fast for
reading from any location
◎ How much memory does your phone have?
14
RAM Storage /
Disk
15
16
How many bytes is a Gigabyte?
17
Refresher on exponentiation
◎ Powers of 10: ◎ Powers of 2:
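As a quick refresher, the first few powers can be listed with a short Python sketch (the variable names are only for illustration):

```python
# Powers of 10 and powers of 2 side by side.
powers_of_10 = [10 ** n for n in range(4)]   # 1, 10, 100, 1000
powers_of_2 = [2 ** n for n in range(11)]    # 1, 2, 4, ... up to 1024

print(powers_of_10)
print(powers_of_2)
```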
18
Units of measurement - GB vs GiB
◎ Kilo, Mega, Giga prefixes like usual
◎ 1 Megabyte = 8 MegaBits
◎ But kilo = 1000 or 1024 = 2^10 ?
◎ Both are used, difference may be
important in some cases
◎ Convention that nobody really
remembers:
○ MB, GB … for powers of 10
○ MiB, GiB … for powers of 2 like
1024
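The gap between the two conventions can be worked out with a rough Python sketch. The names `GB` and `GiB` here are just variables following the convention above:

```python
GB = 10 ** 9    # gigabyte: a power of 10
GiB = 2 ** 30   # gibibyte: a power of 2 (1024 * 1024 * 1024)

difference = GiB - GB
print(GB)
print(GiB)
print(f"{difference / GB:.1%}")  # relative gap between the two
```

At the giga scale the two meanings differ by about 7%, which is when the difference may start to matter.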
19
Representing text & numbers with bits
We can choose arbitrarily which
sequences of bits represent
which characters. That’s what
Morse code does, for example
21
22
But the engineers decided to
interpret bits as numbers in
“binary” notation
◎ They had good reasons – easier to build hardware that
does math on binary numbers than Morse digits
23
Dive into binary + disclaimer
◎ This section about binary and hex might be a bit heavy
◎ Don’t worry if you don’t get it, many people don’t
◎ You’ll hear about it on other courses in more detail
24
Why binary and hex are so
confusing
10=2
numbers -
explained by
ancient
Babylonians
Number vs
Representation of a number
◎ When talking about different representations,
the language we use is confusing, like
○ “Ten in binary is two”
○ “Ten in hex is sixteen”
◎ The number of fingers on this hand does not
change if I decide to represent it in one system
or another - , V, “five”, 0b101
◎ But to communicate a number, we must
represent it somehow – the word “five” is a
representation
27
Ambiguity in words and
symbols is the main source of
confusion here
◎ For some representations we have only one meaning,
no ambiguity:
○ The word “five”, the digit 5
○ All the digits and spoken numbers up to 9 have
only one meaning
◎ But when we say “ten” do we mean:
○ The number of fingers on two hands
or
○ The digits 1 and 0 written together as “10”
28
Positional numeral systems
◎ 345 = 3*100 + 4*10 + 5
◎ “Positional” because the meaning
of a digit depends on position
◎ The base of the system, also called the “radix”,
is how many digits it uses
◎ We use 10 digits, 0 to 9,
so it’s a base-10 system
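The expansion of 345 can be sketched in Python. The function name `expand` is made up for this example, and it assumes the digits are given as a string:

```python
def expand(digits: str, base: int = 10) -> list[int]:
    """Return the value contributed by each digit, left to right."""
    terms = []
    # Index 0 is the rightmost digit, so walk the digits from the right.
    for index, digit in enumerate(reversed(digits)):
        terms.append(int(digit, base) * base ** index)
    return list(reversed(terms))

terms = expand("345")
print(terms)       # [300, 40, 5]
print(sum(terms))  # 345
```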
30
Common options for the radix
“Positional” because the meaning of a digit depends on position – e.g. hundreds vs thousands.
Base = radix
(how many digits it uses)   Fancy name   Digits
2                           Binary       0, 1
32
Rules to follow to avoid
confusion
◎ Whenever you use words like “ten” or “twelve”
○ always use them to represent the number they
mean in everyday language
○ not the string of digits “10” or “12”
◎ When you want to say the binary number 10 – spell out
the digits “binary one zero” as if you are talking over a
radio
33
Converting a binary number
Let’s convert 0b1010 to decimal
1    0    1    0      digits
3    2    1    0      index
2^3  2^2  2^1  2^0    2^index
8    4    2    1      2^index = ?
1    0    1    0      digits again
8    0    2    0      digit * 2^index
8 + 0 + 2 + 0 = 10, so 0b1010 is ten
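The same steps can be written as a short Python sketch. The function name `binary_to_decimal` is an illustration, not something from the lecture:

```python
def binary_to_decimal(bits: str) -> int:
    """Sum digit * 2^index, with index 0 at the rightmost digit."""
    total = 0
    for index, digit in enumerate(reversed(bits)):
        total += int(digit) * 2 ** index
    return total

print(binary_to_decimal("1010"))  # 8 + 0 + 2 + 0 = 10
print(int("1010", 2))             # Python's built-in conversion agrees
```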
34
Converting a hex number
Let’s convert 0x2F to decimal
Digit “F” stands for 15
0 0 2 F digits
3 2 1 0 Index
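Hex works the same way with 16 in place of 2. A possible Python sketch, where `HEX_DIGITS` and `hex_to_decimal` are names chosen only for this example:

```python
HEX_DIGITS = "0123456789ABCDEF"  # position in this string = digit value

def hex_to_decimal(text: str) -> int:
    """Sum digit * 16^index, with index 0 at the rightmost digit."""
    total = 0
    for index, digit in enumerate(reversed(text.upper())):
        total += HEX_DIGITS.index(digit) * 16 ** index
    return total

print(hex_to_decimal("2F"))  # 2 * 16 + 15 * 1 = 47
print(0x2F)                  # Python reads the 0x prefix directly
```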
35
Exercise time
36
Two-byte integer
The “Least Significant Bit” on the right represents
The “Most Significant Bit” on the left represents
Two-byte integer can store numbers from 0 to
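One way to check your answers is a small Python sketch. It assumes an unsigned integer made of two 8-bit bytes, and the variable names are illustrative:

```python
BITS = 16  # two bytes of 8 bits each

least_significant = 2 ** 0          # value of the rightmost bit
most_significant = 2 ** (BITS - 1)  # value of the leftmost bit
largest = 2 ** BITS - 1             # all 16 bits set to 1

print(least_significant, most_significant, largest)
print(int("1" * BITS, 2) == largest)  # sixteen ones give the largest value
```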
37
Memory as a series of bytes
38
We’ve seen positive integer
numbers in binary.
39
Signed numbers (positive and
negative)
◎ We could use the first bit for the sign, but the most
commonly used system is one where the first bit stands for
negative 2^N, where N is the index of that bit
◎ a 4-bit signed number 1010 would be
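This scheme is commonly known as two's complement. A minimal Python sketch, with `signed_value` as a made-up helper name, decodes a bit string this way:

```python
def signed_value(bits: str) -> int:
    """Decode bits where the first (leftmost) bit has a negative weight."""
    top_index = len(bits) - 1
    total = -int(bits[0]) * 2 ** top_index    # first bit: negative 2^index
    for index, digit in enumerate(reversed(bits[1:])):
        total += int(digit) * 2 ** index      # remaining bits: as usual
    return total

print(signed_value("1010"))  # -8 + 2 = -6
print(signed_value("0101"))  # 4 + 1 = 5
```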
40
Fractional numbers
◎ How would you represent a fractional number like
○ 3.1415
○ 0.0000000042
○ 9.99
41
Fixed point notation
◎ For prices if we assume we can only use
whole cents
○ Price can be 3.45 but not 3.455
◎ Write all the digits, but remember that the
dot separates the last two of them
◎ This is equivalent to storing the prices in
cents in this case
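Storing prices in cents can be sketched in Python as below. The helper names are hypothetical, and the sketch assumes non-negative prices written with exactly two digits after the dot:

```python
def to_cents(price: str) -> int:
    """Turn a price like "3.45" into a whole number of cents."""
    dollars, cents = price.split(".")
    return int(dollars) * 100 + int(cents)

def to_price(cents: int) -> str:
    """Put the dot back before the last two digits."""
    return f"{cents // 100}.{cents % 100:02d}"

total = to_cents("3.45") + to_cents("9.99")
print(total)            # 1344
print(to_price(total))  # 13.44
```

All the arithmetic happens on whole numbers; the dot only appears when the price is displayed.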
42
Floating point notation
1. Write out the digits: 47988
2. Say where the point is: 4 positions from the end
3. Store those two numbers and use them to represent
the fractional number
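The three steps can be sketched in Python with the standard `decimal` module. This is a decimal simplification that follows the steps above; real hardware floating point uses binary digits and powers of 2 instead:

```python
from decimal import Decimal

digits = 47988   # step 1: the digits
positions = 4    # step 2: the point is 4 positions from the end

# Step 3: the two stored numbers together represent the value.
value = Decimal(digits) / Decimal(10) ** positions
print(value)  # 4.7988

# The same digits with a different position give a different number.
print(Decimal(digits) / Decimal(10) ** 2)  # 479.88
```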
43
How big can the numbers be?
Some early computers could only address 64KB of
memory because the address was stored as a two-byte
integer, so they could only count to 65,535
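The arithmetic behind the 64KB limit, as a short Python sketch (variable names are illustrative, and KB here means 1024 bytes):

```python
ADDRESS_BITS = 16  # a two-byte address

largest_address = 2 ** ADDRESS_BITS - 1   # highest number it can count to
addressable_bytes = 2 ** ADDRESS_BITS     # addresses 0 .. 65,535

print(largest_address)
print(addressable_bytes // 1024)  # size in units of 1024 bytes
```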
45
Bits to Text
46
ASCII encoding
◎ ASCII = American Standard Code for Information
Interchange
◎ A table of characters and “control codes” to
standardize communication
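Python's built-in `ord` and `chr` expose this table directly, so a small sketch can show the mapping in both directions:

```python
text = "Hi!"

codes = [ord(ch) for ch in text]          # character -> number
print(codes)                               # [72, 105, 33]
print([format(c, "08b") for c in codes])   # the same numbers as 8 bits
print("".join(chr(c) for c in codes))      # number -> character

print(ord("\n"))  # 10, the "line feed" control code
```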
49
Extended ASCII
50
Text Mode
51
1960s computer
rooms had no
screens
Then screens appeared, but before we
could control each pixel on the screen
we used “Text Mode”
◎ First screens were a direct replacement for printers
◎ “Text Mode” divides the screen into a grid, usually 80
columns by 25 lines, putting a character in each cell
◎ Low-level hardware would light up the right pixels to
display each character
53
54
Old airport billboards were
also a kind of text-only
display
55
Why text mode?
The Apple II was released in 1977
with 4KB of RAM and only
removable floppy drives for storage
(140KB per disk)
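A rough Python sketch of the memory argument follows. It reuses the 80 by 25 grid mentioned earlier rather than the Apple II's actual screen layout, and the pixel screen is a hypothetical comparison, not a figure from the lecture:

```python
COLUMNS, LINES = 80, 25   # text-mode grid mentioned earlier
RAM_BYTES = 4 * 1024      # 4KB of RAM

text_screen_bytes = COLUMNS * LINES   # one byte per character cell
print(text_screen_bytes)               # 2000

# Hypothetical pixel screen for comparison: 8x8 pixels per character cell,
# 1 bit per pixel (an assumption, not a figure from the lecture).
pixel_screen_bytes = (COLUMNS * 8) * (LINES * 8) // 8
print(pixel_screen_bytes)              # 16000

print(text_screen_bytes <= RAM_BYTES)   # True: the text screen fits
print(pixel_screen_bytes <= RAM_BYTES)  # False: the pixel screen does not
```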
59
“Code pages” for other
languages
◎ People started using the 128-255 range for a variety of
characters from other languages
○ Latin1 – the most common (often default) code page
○ Windows Code Pages CP-125x
◎ It was a mess, avoid anything other than Latin1 if possible.
Use Unicode UTF-8 for other languages, we will talk about
it later
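The mess is easy to reproduce in Python: the same byte turns into a different character depending on which code page is assumed. The byte value below is just an example, and `cp1251` is one member of the Windows CP-125x family:

```python
raw = bytes([0xE9])  # a single byte with value 233, in the 128-255 range

print(raw.decode("latin-1"))  # é
print(raw.decode("cp1251"))   # й (Windows Cyrillic code page)
```

Nothing in the byte itself says which reading is the intended one.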
60
encoding
for Cyrillic
letters
61
Files are sequences of bytes
◎ We decide what the bytes mean and how to interpret
them
◎ Text files are files where each byte is interpreted as a
character (or a special symbol like new-line)
○ * We will talk about Unicode later – some
characters may take more than one byte
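A short Python sketch of what a text file holds. It builds the bytes in memory rather than writing a real file, and the sample text is arbitrary:

```python
text = "Hi\nthere"

data = text.encode("ascii")   # the bytes that would be stored in the file
print(list(data))             # [72, 105, 10, 116, 104, 101, 114, 101]
print(len(text), len(data))   # one byte per character here

# With UTF-8, some characters take more than one byte.
print(len("é"), len("é".encode("utf-8")))  # 1 character, 2 bytes
```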
62
Files
63
Why files?
◎ Named chunks of space on a storage device
◎ Simplest way to “organize” large amounts of data
64
Each byte can be shown as a
character for any file, not only text
files
65
Binary files – files that look ugly
if displayed as text
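As an illustration, here is a sketch that shows the first eight bytes of a PNG image as characters. Latin1 is used only because it assigns a character to every byte value:

```python
# First 8 bytes of every PNG image file (the PNG signature).
png_start = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

# Latin1 maps every byte value to some character, so decoding never fails.
as_text = png_start.decode("latin-1")
print(repr(as_text))  # mostly unprintable characters
print(as_text[1:4])   # PNG
```

Only three of the eight bytes happen to look like readable text.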
66
Questions?
67
Links
68
DRAFTS
69
Decimal Binary Hex Babylonian
70
Start with counting states of N bits
◎ TBD
71
Rules that help with understanding
72
74
◎ In decimal system we use daily (also called base 10)
◎ In binary
75
Representing Numbers
Positional Number Systems
◎ TBD
○ Use some system like Babylonian to make it
distinct from the systems we are playing with
○ <<|| is always
○ Phrases like “12 hex is 18 decimal” are probably the biggest
source of confusion when learning number systems like hex. Don’t
use phrases like this!
○ Twelve is a number: we have twelve phalanges on the 4 fingers of
the hand, and there are twelve of them no matter which system you use
to represent the number
○ Always spell out the digits
○ Prefix with 0x, 0b, 0d or 0o (zero o), and pronounce as “Octal one two”
76
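Python accepts three of these prefixes in number literals (it has no `0d` prefix, since plain digits are already decimal). A small sketch:

```python
print(0b1010)  # binary one zero one zero -> ten
print(0o12)    # octal one two -> ten
print(0x12)    # hex one two -> eighteen

# The same number shown in different representations.
twelve = 12
print(bin(twelve), oct(twelve), hex(twelve))  # 0b1100 0o14 0xc
```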