01 Intro Bits Bytes
algorithms (recapitulation),
bits, strings
Jesper Larsson
Course, teaching
• Me: Jesper Larsson, teaching at MAU since 2014, research
background in string algorithms and data compression
• You?
• Programming
• Prove that:
- A specific algorithm does what it’s supposed to
- A specific algorithm has a certain “speed”
- There can be no algorithm for the problem “faster”
than a certain “speed”
How many times do you have to turn the crank?
Charles Babbage’s analytical engine, “programmed” by Ada Lovelace
Time complexity of algorithm
• T(N) = a measure for the time it takes to run the program
on an input of size N
Ex 1. ⅙N³ + 20N + 16 ~ ⅙N³ is Θ(N³)
Ex 2. ⅙N³ + 100N^(4/3) + 56 ~ ⅙N³ = proportional to N³
Ex 3. ⅙N³ − ½N² + ⅓N ~ ⅙N³ = cubic
Formally
f(N) is O(g(N)) means: ∃ constants N₀ and c > 0 so that if N > N₀, then |f(N)| < c·g(N)
f(N) is Ω(g(N)) means: ∃ constants N₀ and c > 0 so that if N > N₀, then |f(N)| > c·g(N)
f(N) is Θ(g(N)) means: f(N) is both O(g(N)) and Ω(g(N))
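The tilde notation in the examples can be checked numerically. The following is a minimal sketch, assuming Ex 3 as the running-time function; the names `f` and `ratio_to_leading_term` are made up for this illustration. It shows the ratio between f(N) and its leading term ⅙N³ approaching 1 as N grows.

```python
from fractions import Fraction

def f(n):
    """Ex 3: N^3/6 - N^2/2 + N/3, computed exactly with fractions."""
    n = Fraction(n)
    return n**3 / 6 - n**2 / 2 + n / 3

def ratio_to_leading_term(n):
    """f(N) divided by its leading term N^3/6."""
    return f(n) / (Fraction(n)**3 / 6)

# The ratio tends to 1, which is what f(N) ~ N^3/6 means.
for n in (10, 100, 1000, 10000):
    print(n, float(ratio_to_leading_term(n)))
```

Because the ratio stays between two positive constants for large N, the same function is also Θ(N³).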
Time for sorting with
pairwise compares
• Lower bound: ?
• Optimal algorithm: ?
• N values a₁ to a_N. Assume they are all different (a case we need to manage).
• N! = N · (N−1) · (N−2) · … · 3 · 2 · 1 different orderings
• Tree with compares (internal nodes), with orderings as leaves:
[Figure: decision tree for three values. The root compares a<b; its yes and no branches lead to the compares b<c and a<c, followed by further yes/no branches down to the leaves.]
tree height = number of compares in the worst case
• Stirling's approximation: log₂ N! ~ N log₂ N
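The counting argument can be turned into numbers. A binary tree of height h has at most 2^h leaves, so a tree with N! leaves needs height at least ⌈log₂ N!⌉. The sketch below (the function name is hypothetical) computes that bound exactly with integer arithmetic and prints it next to N log₂ N.

```python
import math

def min_compares_lower_bound(n):
    """Smallest h with 2**h >= n!, i.e. ceil(log2(n!)), computed exactly."""
    # For an integer m >= 1, (m - 1).bit_length() equals ceil(log2(m)).
    return (math.factorial(n) - 1).bit_length()

for n in (3, 4, 5, 10, 100, 1000):
    print(n, min_compares_lower_bound(n), round(n * math.log2(n)))
```

For N = 3 the bound is 3 compares, which matches a decision tree of height 3 for three values.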
στρατός (formerly known as Michelangelo_MI), At the end of the track https://flic.kr/p/4wMSNh
Sound
• A sequence of amplitude values in binary
representation
Pictures
• Bit-mapped
PDF, Postscript, …
7-bit ASCII
International (e.g.
Scandinavian) characters
• Replace some glyphs [ ] \ { }
• Use 8 bits: Latin-1 (ISO 8859-1)
• Replace for new chars (€ etc.): Latin-9
(ISO 8859-15)
• Microsoft variant: Windows-1252
• Unicode multibyte: UTF-8 de-facto standard?
UTF-8
No of bits we need   Bytes in UTF-8 value   Byte 1     Byte 2     Byte 3     Byte 4
 7                   1                      0xxxxxxx
11                   2                      110xxxxx   10xxxxxx
16                   3                      1110xxxx   10xxxxxx   10xxxxxx
21                   4                      11110xxx   10xxxxxx   10xxxxxx   10xxxxxx
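As an illustration of the standard UTF-8 byte layout, here is a minimal sketch of an encoder for a single code point. The function name `utf8_encode` is made up for this example, and the sketch assumes a valid code point (it does not reject surrogates or values above 0x10FFFF). In practice Python's built-in `str.encode("utf-8")` does this work.

```python
def utf8_encode(code_point):
    """Encode one Unicode code point (an int) as UTF-8 bytes."""
    if code_point < 0x80:        # fits in 7 bits: 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:       # fits in 11 bits: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:     # fits in 16 bits: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # fits in 21 bits: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])

# Compare with the built-in encoder for characters needing 1 to 4 bytes.
for ch in "A\u00e5\u20ac\U0001F600":
    print(hex(ord(ch)), utf8_encode(ord(ch)).hex(), ch.encode("utf-8").hex())
```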
Binary number
representation
decimal:
782 = 2·10⁰
+ 8·10¹
+ 7·10²
binary:
1100001110 = 0·2⁰
+ 1·2¹
+ 1·2²
+ 1·2³
+ 0·2⁴
+ 0·2⁵
+ 0·2⁶
+ 0·2⁷
+ 1·2⁸
+ 1·2⁹
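The two expansions follow the same rule: each digit is multiplied by the base raised to the digit's position, counted from the right starting at 0. A small sketch (the helper name `digits_value` is hypothetical) that evaluates a digit string this way in any base up to 36:

```python
def digits_value(digits, base):
    """Sum of digit * base**position, with position 0 at the rightmost digit."""
    return sum(int(d, base) * base**i
               for i, d in enumerate(reversed(digits)))

print(digits_value("782", 10))         # 782
print(digits_value("1100001110", 2))   # 782
print(format(782, "b"))                # 1100001110
```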
“8 Questions” for unsigned
numbers (octets or “bytes”)
00000000: 0
00000001: 1
00000010: 2
00000011: 3
00000100: 4
⋮
01111110: 126
01111111: 127
10000000: 128
10000001: 129
10000010: 130
⋮
11111110: 254
11111111: 255
Floating-point representation
0.250244140625:
sign 0: positive
exponent: 01111101 = 125, subtract 127 (exponent bias): −2
mantissa: 1·2⁰ (implicit first 1 bit)
+ 0·2⁻¹
+ 0·2⁻²
+ 0·2⁻³
⋮
+ 0·2⁻⁹
+ 1·2⁻¹⁰
+ 0·2⁻¹¹
⋮
= 1.0009765625
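The fields of the example can be extracted with Python's `struct` module. This is a sketch assuming the 32-bit IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 fraction bits) and handling only normalized numbers; the function names are invented for the illustration.

```python
import struct

def decode_float32(x):
    """Return (sign, biased exponent, 23-bit fraction field) of x as a float32."""
    bits, = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction

def value_from_fields(sign, exponent, fraction):
    """Rebuild the value of a normalized float32 from its three fields."""
    mantissa = 1 + fraction / 2**23      # implicit first 1 bit
    return (-1)**sign * mantissa * 2.0**(exponent - 127)

sign, exponent, fraction = decode_float32(0.250244140625)
print(sign, format(exponent, "08b"), format(fraction, "023b"))
print(1 + fraction / 2**23)                        # 1.0009765625
print(value_from_fields(sign, exponent, fraction)) # 0.250244140625
```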
• or | ∨ +
• xor ^ ⊻ ⊕
• not ~ ¬ ¯
• <<
• >>, >>>
a b a&b
0 0 0
0 1 0
1 0 0
1 1 1
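A short sketch of the operators on small values, written in Python. One assumption to note: Python has no `>>>` operator, because its integers are unbounded. For non-negative values `>>` behaves the same way, and the hypothetical helper `unsigned_shift_right_32` below emulates `>>>` on a 32-bit value by masking first.

```python
def unsigned_shift_right_32(x, n):
    """Emulate a 32-bit unsigned right shift (>>> in Java or JavaScript)."""
    return (x & 0xFFFFFFFF) >> n

# Truth table for and, or, xor on single bits.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, a & b, a | b, a ^ b)

a, b = 0b1100, 0b1010
print(format(a & b, "04b"))     # 1000
print(format(a | b, "04b"))     # 1110
print(format(a ^ b, "04b"))     # 0110
print(format(~a & 0xF, "04b"))  # 0011 (not, kept to 4 bits)
print(a << 1, a >> 2)           # 24 3
print(-8 >> 1, unsigned_shift_right_32(-8, 1))  # -4 2147483644
```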
carry C
A
B
Each row is an operation with three input bits and two output bits
outᵢ = Aᵢ ^ Bᵢ ^ Cᵢ
Cᵢ₊₁ = Aᵢ & Bᵢ | Cᵢ & (Aᵢ ^ Bᵢ)
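The two formulas describe a full adder, and chaining one per bit position gives addition of whole numbers. A minimal sketch follows, with made-up function names and an assumed default width of 8 bits.

```python
def full_adder(a, b, c):
    """Three input bits in, (sum bit, carry-out bit) out."""
    out = a ^ b ^ c
    carry = (a & b) | (c & (a ^ b))
    return out, carry

def add_bits(x, y, width=8):
    """Add two unsigned numbers bit by bit; returns (result, final carry)."""
    result, carry = 0, 0
    for i in range(width):
        a = (x >> i) & 1
        b = (y >> i) & 1
        out, carry = full_adder(a, b, carry)
        result |= out << i
    return result, carry

print(add_bits(92, 100))    # (192, 0)
print(add_bits(230, 154))   # (128, 1), since 230 + 154 = 384 = 256 + 128
```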
Groups of bits
• Groups of 3: octal
• Groups of 4: hex[adecimal]
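Grouping can be tried out directly. The sketch below uses a hypothetical helper `group_bits` that splits a non-negative number into groups of a given number of bits, most significant group first; groups of 3 give the octal digits and groups of 4 the hex digits.

```python
def group_bits(n, size):
    """Split n into groups of `size` bits, most significant group first."""
    mask = (1 << size) - 1
    groups = []
    while True:
        groups.append(n & mask)
        n >>= size
        if n == 0:
            break
    return groups[::-1]

n = 2022
print(format(n, "b"), format(n, "o"), format(n, "x"))  # 11111100110 3746 7e6
print(group_bits(n, 3))   # [3, 7, 4, 6]
print(group_bits(n, 4))   # [7, 14, 6]
print(group_bits(n, 8))   # [7, 230]
```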
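• 2022 * 666
= 1346652
• 111|11100110 * 10|10011010
= 10100|10001100|01011100
Split 2022 into bytes (666 splits the same way into 2 and 154):
2022 & 0xff = 230
2022 >>> 8 = 7
Low byte times low byte:
230*154 & 0xff = 92
230*154 >>> 8 = 138 (carry)
Low byte of 666 times high byte of 2022, plus carry:
154*7 + 138 & 0xff = 192
154*7 + 138 >>> 8 = 4
High byte of 666 times low byte of 2022:
2*230 & 0xff = 204
2*230 >>> 8 = 1 (carry)
High byte times high byte, plus carry:
2*7 + 1 = 15
Add the partial products:
192 + 204 & 0xff = 140
192 + 204 >>> 8 = 1 (carry)
1+4+15 = 20
        7  230
×       2  154
    4  192   92
   15  204
   20  140   92
= 666·2022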
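The byte-by-byte multiplication can be sketched as schoolbook multiplication in base 256. The function names are made up for this example, and the digits are stored least significant byte first, which is an assumption of the sketch. Unlike the worked example, the loop adds each partial product into the result immediately instead of summing the rows at the end; the final bytes are the same.

```python
def to_bytes_le(n):
    """Split a non-negative integer into bytes, least significant first."""
    digits = []
    while True:
        digits.append(n & 0xFF)
        n >>= 8
        if n == 0:
            break
    return digits

def multiply_bytes(x, y):
    """Schoolbook multiplication of two little-endian byte lists."""
    result = [0] * (len(x) + len(y))
    for j, yd in enumerate(y):
        carry = 0
        for i, xd in enumerate(x):
            t = xd * yd + result[i + j] + carry
            result[i + j] = t & 0xFF   # low byte stays in this position
            carry = t >> 8             # high byte moves to the next position
        result[j + len(x)] = carry
    return result

product = multiply_bytes(to_bytes_le(2022), to_bytes_le(666))
print(product[::-1])                               # [0, 20, 140, 92]
print(int.from_bytes(bytes(product), "little"))    # 1346652
```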
Endianness
Next lecture!