A PROGRAMMER'S GUIDE TO COMPUTER SCIENCE
VOLUME II
WILLIAM M SPRINGER II PHD
Illustrated by
BRIT A SPRINGER
Edited by
NICKOLAS R ALLGOOD
JAXSON MEDIA
A Programmer's Guide to Computer Science, Volume II By William M.
Springer II
All rights reserved. No portion of this book may be reproduced in any form
without permission from the publisher, except as permitted by U.S.
copyright law. Visit the author's website at http://www.whatwilliamsaid.com/books/.
ISBNs:
978-1-951204-04-4 (paperback)
978-1-951204-05-1 (hardcover)
978-1-951204-06-8 (ebook)
Cover design and images without source cited by Brit Springer at Moonlight
Designs Studio. Typesetting done by the author in LaTeX. Technical editing
by Nicholas R Allgood. Copy editing by Margo Simon.
CONTENTS
Introduction
Afterword
Notes
INTRODUCTION
Overview
The various sections of this volume are fairly independent and can be
read in any order. They intentionally scratch the surface of a variety of areas.
The final section, Advanced Topics, covers the more esoteric subjects
referenced in earlier chapters.
Example
A pair of twin primes is two prime numbers that are two apart
(e.g., 5 and 7, 11 and 13, 17 and 19). The twin prime conjecture 2
states that there are infinitely many twin primes. This seems
reasonable, as we know there are an infinite number of primes, 3 but
not obvious, since primes don't follow any known pattern.
At the time of this writing, the largest known twin primes,
2996863034895 × 2^1290000 ± 1, are 388,342 digits long.
The great number of twin primes found strongly suggests that they
will continue forever - but that isn't a proof!
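For readers who want to experiment, a minimal C# sketch (using simple trial division and an arbitrary bound of 100, both choices made just for illustration) that lists the twin prime pairs below that bound might look like this:

using System;

class TwinPrimes
{
    // Trial division is sufficient for the small bound used here.
    static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int d = 2; d * d <= n; d++)
            if (n % d == 0) return false;
        return true;
    }

    static void Main()
    {
        const int Limit = 100; // illustrative bound, not part of the text
        for (int p = 2; p + 2 <= Limit; p++)
            if (IsPrime(p) && IsPrime(p + 2))
                Console.WriteLine($"({p}, {p + 2})"); // prints (3, 5), (5, 7), (11, 13), ...
    }
}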
Axiom
A statement that is accepted without proof; the correctness of a proof depends on
the correctness of the axioms it is built on.
Conjecture
Something that is suspected to be true but has not yet been proven. When a
conjecture is proven, it becomes a theorem. 4
Corollary
A statement that immediately follows from another theorem or definition.
Often this is simply a restriction of the theorem to some special case.
Hypothesis
A conjecture that is assumed to be true, although it has not been proven.
Sometimes a hypothesis will be taken as true and used to develop
conditional proofs of other conjectures.
Lemma
A minor result that is proved as part of a proof of a larger theorem. If the
desired proof is the peak of Mt. Everest, a lemma is a stop along the climb;
proving it hopefully gets you that much closer to where you want to be. 5
Proof
A mathematical proof is a series of logical steps which show the conclusion
is guaranteed from the stated assumptions. Often the abbreviation QED 6 or
a square is used to denote the end of the proof.
Theorem
A statement that has been proven to be true using axioms and/or previously
proven theorems. When a conjecture is proven, it becomes a theorem.
CHAPTER 18: PROOF TECHNIQUES
Example
The four-color theorem a states that no planar graph requires
more than four colors. Appel and Haken proved the theorem
exhaustively by demonstrating that a minimal counterexample must
contain one of 1,936 possible configurations, each of which was
checked by computer.
a
See Section 4.7 on Graph Coloring.
Example
An irrational number is a real number that cannot be written as
the ratio of two integers. Suppose we wish to prove that √2 is
irrational. Assume it's actually rational; then √2 = a/b for some
smallest integers a and b. Rearranging, we get b√2 = a. Squaring
both sides, we get 2b^2 = a^2.
Since a^2 is twice b^2, a must be even, so we can set it equal to
double a third number: a = 2c. So now we have 2b^2 = (2c)^2 = 4c^2.
Now b^2 = 2c^2, so b is even, contradicting the assumption that a and b
were the smallest integers whose quotient is √2. 3 Therefore our
initial assumption was false, and √2 is not a rational number.
18.3.1 Example
Imagine that we have a chessboard with one square removed. For example,
such a 2 x 2 chessboard would look like this:
Example
Suppose we want to prove that if k is an irrational number, √k
must also be an irrational number.
The contrapositive of this is that if √k is a rational number, k
must also be a rational number.
Suppose √k is rational; then it is the ratio of two integers: √k =
a/b. Squaring both sides, we get k = a^2/b^2. This equation shows that k
is also the ratio of two integers and thus is rational.
This chain of reasoning proves that if √k is rational, then k must
also be rational, and by contrapositive, if k is irrational, then √k
must also be irrational.
CHAPTER 19: CERTIFICATES
Beware of bugs in the above code; I have only proved it correct, not
tried it.
- Donald Knuth 1
The difficulty is that while the algorithm may be correct, we can't be sure
that the implementation of that algorithm doesn't contain errors. We would
like to have a way to check that the program is actually returning the correct
answer.
A proof of correctness is known as a certificate: something we can easily
check to confirm that the answer is correct.
For example, consider a program which determines whether or not a
number is prime. If the answer is no, a certificate would be a set of integers
greater than 1 whose product is the number we are testing.
Our goal is that checking a certificate will be faster and simpler (have a
lower asymptotic runtime and less complexity) than the original problem, so
that we can be more certain the program verifying the certificate does not
contain errors. We say that a certificate is strong if the verification algorithm
has a better time bound than the original problem, and weak if it does not;
in practice, algorithms often have a strong rejection certificate and a weak
acceptance certificate, or vice versa. 2
Example
Checking whether a graph is bipartite takes O(n+m) time. If it is
bipartite, an acceptance certificate is a two-coloring of the graph,
which also requires O(n+m) time to verify, making it a weak
certificate. If it is not, a rejection certificate (an odd cycle) can be
verified in O(n) time and thus is a strong certificate.
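As a rough illustration, a verifier for the acceptance certificate might look like the following C# sketch. The adjacency-list representation and the color array are assumptions made for the example, not part of the text; the point is that the check touches each edge once, giving O(n+m) time.

using System;
using System.Collections.Generic;

class BipartiteCertificate
{
    // Verifies a two-coloring certificate: every edge must join
    // differently colored vertices. Runs in O(n + m) time.
    static bool VerifyTwoColoring(List<int>[] adjacency, int[] color)
    {
        for (int u = 0; u < adjacency.Length; u++)
            foreach (int v in adjacency[u])
                if (color[u] == color[v])
                    return false; // certificate rejected
        return true; // every edge checked
    }

    static void Main()
    {
        // A 4-cycle 0-1-2-3-0, which is bipartite.
        var adj = new List<int>[4];
        for (int i = 0; i < 4; i++) adj[i] = new List<int>();
        void AddEdge(int a, int b) { adj[a].Add(b); adj[b].Add(a); }
        AddEdge(0, 1); AddEdge(1, 2); AddEdge(2, 3); AddEdge(3, 0);

        int[] coloring = { 0, 1, 0, 1 };
        Console.WriteLine(VerifyTwoColoring(adj, coloring)); // True
    }
}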
PART IX - SECURITY AND
PRIVACY
CHAPTER 20: INTRO TO SECURITY
20.1 Confidentiality
Consider the following information:
20.2 Integrity
Integrity refers to the trustworthiness of the data, and encompasses two
areas: data integrity (whether the information is correct) and origin integrity
(whether we know where the information came from).
We need to ensure that the data is accurate, which means having
mechanisms in place to prevent unauthorized changes or detect when an
unauthorized change occurs. 3 We also want to know where the data comes
from, as this reflects on its trustworthiness: medical advice has a higher
degree of credibility if it comes from a doctor rather than a receptionist.
Integrity mechanisms can ensure both that we know who sent a message and
that we can prove it; we discuss this more in Chapter 22.
20.3 Availability
If an attacker cannot compromise a system, he may still be able to make it
unavailable for use. These attempts can range from simply trying to make a
service unavailable to anyone (for example, launching a denial of service
attack against a popular website) to causing delays that enable a second
attack (causing a failover from a secure server to a compromised one).
20.4 Goals
The various security mechanisms have three goals:
21.2 Terminology
Traditionally, we have a message which is being sent from Alice to Bob. A
third party, Eve, would like to eavesdrop on that message.
Alice and Bob have some information which allows them to communicate
securely. If they are using a symmetric algorithm, then the key is a shared
secret. Alice and Bob each have a copy of the key, which is used for both
encryption and decryption. Symmetric key cryptography requires that the
participants have a way to securely agree on a key before using it to transmit
messages.
This required key exchange is the primary disadvantage of symmetric
algorithms, as it requires that two parties wishing to communicate arrange
to meet in advance or find some already existing method of secure
communication. This problem can be avoided using quantum key
distribution, in which properties of quantum mechanics are used to
distribute keys securely.
Asymmetric algorithms, often known as public key algorithms, use
different keys for encryption and decryption. A message which is encrypted
with the public key can be decrypted with the private key.
In some cases, the reverse is also true: the person with the private key
can encrypt a message and anyone with the public key can decrypt it. In this
case, the private key can be used for authentication: the owner of the key can
sign a document and anyone with the public key can verify that signature. 3
The Math
Let f be a function such that f (x) = y. If f is a good trapdoor
function, then computing y from x is easy, but computing x from y is
impractical without the key.
22.2 RSA
RSA 4 is one of the most commonly used algorithms for public key
cryptography.
In the RSA algorithm, Alice chooses two large primes p and q of similar
lengths and computes n=pq. This is a one-way function because multiplying
two integers is easy, but factoring the result into its component primes is
not.
The Math
Why does this work? Fermat's little theorem states that if p is a
prime number and a is an integer, a^p ≡ a (mod p). The ≡ symbol
means congruent; the two sides are equivalent with respect to the
modulus. For example, 3 and 15 are congruent mod 12, which is why
the third hour (3am) and the 15th hour (3pm) are in the same position
on a 12-hour clock.
In this case, the message m is being raised to the power of e and
then to the power of d, mod n. Raising a value to one exponent and
then to a second exponent is equivalent to raising the original value
to the product of the two exponents, so after running both the
encryption and decryption we have m^(ed) mod n. Alice chooses e and d
such that this will leave us with m, the original message.
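A toy numeric check can be run in C# with BigInteger.ModPow. The parameters below (p = 61, q = 53, e = 17, d = 2753) are the small, insecure textbook values, used here only to show the round trip; real RSA uses much larger primes and proper padding.

using System;
using System.Numerics;

class ToyRsa
{
    static void Main()
    {
        // Small textbook parameters -- far too small to be secure.
        BigInteger p = 61, q = 53;
        BigInteger n = p * q;          // 3233
        BigInteger e = 17, d = 2753;   // chosen so that e*d = 1 (mod lcm(p-1, q-1))

        BigInteger m = 65;                                  // the message
        BigInteger c = BigInteger.ModPow(m, e, n);          // encrypt: m^e mod n
        BigInteger recovered = BigInteger.ModPow(c, d, n);  // decrypt: c^d mod n

        Console.WriteLine($"ciphertext = {c}, recovered = {recovered}"); // recovered == 65
    }
}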
Passwords have been used to determine access for thousands of years. In the
internet age, however, passwords often do not provide sufficient security, for
several reasons.
Weak passwords - Given a system that allows unlimited password
tries, an attacker can simply try every possible password until arriving at the
correct one. This can be an exhaustive key search (trying literally every legal
combination) or a dictionary attack (trying everything in a list of common
passwords).
Reused passwords - Even a strong password can be discovered. If an
attacker gains access to a password that is associated with the same
username or email across multiple sites, the attacker now has access to each
of those sites.
Adding length and complexity requirements for passwords helps mitigate
the first problem, but increases the odds that the user will reuse the
password (who wants to memorize multiple long, complicated passwords?),
store it somewhere that an attacker could potentially access, 1 or risk
forgetting it. Password managers (which only require that the user
remember one master password while generating and storing a strong
password for each website) are one solution, although they then represent a
single point of attack to gain access to all of a user's accounts. 2
Other options for increasing password security include:
Forced password change - Until recently, Microsoft recommended
that companies force employees to change their passwords frequently, under
the assumption that shared passwords might eventually leak. These frequent
changes exacerbate the issues discussed above (users who are required to
change passwords frequently tend to choose passwords that are easy to
remember) and are no longer recommended. 3 Passwords that have not been
stolen do not need to be changed, and passwords that you suspect have been
stolen should be changed immediately.
Banned password lists - In addition to minimum length
requirements, a user's password can be compared to a list of common
passwords and disallowed if it is found.
Password hashing - A secure website will not allow you to retrieve a
lost password, because it never actually saves the password. Instead, it
combines the password with a salt (a random string, unique to each user), 4
hashes the resulting value, 5 and saves the hashed value. Thus, even if the
password database is compromised, the attacker would still need to
individually crack each password; a sketch of this appears after this list.
Multi-factor authentication - There are three factors commonly used
for authentication: something you know, something you have, and
something you are. Typical examples are a password for logging in to a
website, a smartcard for accessing a building at work, and a fingerprint for
unlocking a phone. Using a chip-and-PIN credit card requires two of these:
something you have (the card itself) and something you know (the PIN). 6
Many websites are now offering two-factor authentication, 7 which requires
both your password and a second, temporary access code that shows you
have access to a phone number or device associated with the account.
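A minimal C# sketch of salted password hashing, using the standard library's PBKDF2 implementation (Rfc2898DeriveBytes); the iteration count, salt length, and output length are illustrative choices, not recommendations from the text.

using System;
using System.Security.Cryptography;

class PasswordHashing
{
    // Returns the salt and the derived hash; both are stored, the password is not.
    static (byte[] Salt, byte[] Hash) HashPassword(string password)
    {
        byte[] salt = new byte[16];
        using (var rng = RandomNumberGenerator.Create())
            rng.GetBytes(salt); // a random salt, unique per user

        using var pbkdf2 = new Rfc2898DeriveBytes(password, salt, 100_000, HashAlgorithmName.SHA256);
        return (salt, pbkdf2.GetBytes(32));
    }

    static void Main()
    {
        var (salt, hash) = HashPassword("correct horse battery staple");
        Console.WriteLine(Convert.ToBase64String(hash));
    }
}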
PART X - HARDWARE AND
SOFTWARE
CHAPTER 24: HARDWARE
ABSTRACTIONS
Figure 24.1: An illustration of a hard drive platter. Each concentric circle [A] is
called a track; each pie slice [B] is a geometrical sector. The intersection of a track
and a geometrical sector is a track sector [C] and is the smallest section that can
be read or written. A block [D] is the smallest logical amount of disk space that
will actually be used by the operating system (for example, to store a file). Public
domain image from Wikimedia Commons.
Down in the weeds
All flash memory uses transistors to store data, but those
transistors may be arranged in different ways. The two main types,
NOR and NAND, are named after the logic gates they mimic. The
different arrangements of transistors lead to different physical
properties, as described below.
NOR (Not OR) flash memory provides random access, high-speed
reads, high reliability, and the ability to read or write single bytes.
NAND (Not AND) flash has smaller cells, which results in much
higher write and erase speeds, as well as lower costs. However,
NAND flash is accessed in pages or blocks rather than bytes. As
NAND flash uses an indirect interface, it is more complicated to
access, and the presence of bad blocks requires additional error-
correcting functionality that isn't necessary for NOR flash.
In practice, NOR flash is used for code storage and execution,
such as the firmware on cell phones, while NAND flash is used for
data storage, such as memory sticks and solid state drives.
24.3 Memory
The primary detail of interest to programmers discussing computer memory
(aside from why there isn't more of it) will be how it is allocated.
Information in an executing program is stored on either the (control) stack
or the heap. In either case, the item is stored in memory (with a caveat noted
below); the stack and heap are simply the data structures we use to keep
track of it.
When a program is running and calls a function, all of that function's
variables go on the stack. When the function exits, all of its variables are
removed from the stack and that memory can be reused. The stack also
holds pointers to function calls to allow execution to return to the correct
location.
Because a stack is a simple last-in, first-out data structure (as discussed
in Section 2.4), allocating and reclaiming memory is more efficient than
when using a heap. 5
In a multi-threaded application, each thread gets its own stack. As the
stack tends to hold a (relatively) small number of items and the memory in
the stack is accessed frequently, values in the stack are likely to be cached.
Additionally, research shows that a small stack cache (separate from the
main cache) leads to significant performance improvements. 6
Only primitives and references are placed on the stack; objects are always
placed in the heap. 7 Whereas memory access for the stack is strictly
regulated, any element in the heap can be accessed at any time; there is no
ordering of the various objects. The heap and the stack are both stored in
memory; often the stack grows downward from the highest-available
memory address, while the heap grows upward from the lowest-available
address.
Unlike the stack, memory on the heap is dynamically allocated and can
be released at any time. 8 The heap is less efficient to access, and memory
must be tracked rather than being automatically reclaimed, but it can hold
objects of variable size, and objects can persist across stack levels.
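The details vary by language and runtime, but a rough C# illustration of the split described above: local value types typically live on the stack, while objects created with new live on the heap and are reached through references.

using System;

class Point { public int X, Y; } // a reference type; instances live on the heap

class StackVsHeap
{
    static void Main()
    {
        int count = 42;         // a primitive local; typically stored on the stack
        Point p = new Point();  // the Point object is allocated on the heap...
        p.X = 1;                // ...and p is a reference to it, held on the stack
        Console.WriteLine($"{count} {p.X}");
    }   // when Main returns, count and p disappear with the stack frame;
        // the Point object becomes garbage and is reclaimed later by the GC
}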
24.4 Cache
While RAM is much faster than disk, it's still not able to provide data to the
CPU immediately. One reason for this is the speed-of-light limit: light (and
thus, data) can only travel so far in a given amount of time. 9 In a vacuum,
light travels at approximately one foot per nanosecond. 10 This means that if
a CPU is operating at 3GHz, the absolute furthest a signal could travel in one
clock cycle is four inches, round trip. As a result, if the memory is located
more than two inches away from the CPU, data cannot be retrieved within
one clock cycle, even if the memory responds instantly.
Of course, memory does not actually respond instantly. Another factor is
how long it takes for a memory module to access a particular location. 11
Even with very fast RAM, the latency means there will be a multiple-
nanosecond delay between when information is requested and when it is
available.
We handle both of these problems by locating a small amount of very fast
memory on the die itself. 12 As this memory is very close to the processor, we
minimize the speed-of-light delay, and because we use a small amount of
very fast memory, we minimize the lag as well. Only so much memory will fit
on the die, and the fast memory used is expensive, 13 so we use it to cache
frequently-used data. This on-die memory is the L1 cache; there may also be
an L2 cache that is close to the processor but not on the die
itself. 14
24.5 Registers
While computers attempt to use cache memory to hold data for the near
future, registers are used to hold data and instructions for right now.
Registers are the memory that the processor actually works with directly,
and are naturally made up of the memory that is closest to the CPU and has
the fastest response time. In Section 25.1, we'll see an example of a machine
code instruction that specifies the register where a value should be placed.
A CPU will generally have both general-purpose registers (which store
temporary data and can be accessed by user programs) and specific-use
registers, such as the accumulator, program counter, and instruction
register. 15
CHAPTER 25: SOFTWARE
ABSTRACTIONS
Example - C#
C# defines the bitwise operators for the int, uint, long, and ulong
types. Other types are converted to int (and the bitwise operation
returns an int). For any variable num of an appropriate type, num <<
x would left-shift num by x bits. For example, if num=3 (0000 0011),
then num << 4 would be 3 × 2^4 = 48 (0011 0000). The right-shift
operator does an arithmetic shift if the operand is signed and a
logical shift if it is not.
Example - Masking
Binary operators are often used for masking. Suppose an object
has a number of boolean properties. We can combine them into one
variable by assigning each to be a power of two and ORing them
together:
PropertyOne = 2^0 = 1 << 0 = 1
PropertyTwo = 2^1 = 1 << 1 = 2
PropertyThree = 2^2 = 1 << 2 = 4
PropertyFour = 2^3 = 1 << 3 = 8
If an object has properties one, two, and four, the flag variable will
be 0001 | 0010 | 1000 = 1011.
Later on, we want to know whether the object has properties three
and four.
Property three: (0100 & 1011) = 0000, so the property is not set.
Property four: (1000 & 1011) = 1000, which is nonzero, so the property is set.
In code, we would have defined the properties as constants, and
would just write the following 5
if ((PropertyThree & flagVar) !=0)
if ((PropertyFour & flagVar) != 0)
If we want to check multiple properties at once, we can OR all of
them together into a new sumVar and AND it with flagVar. If the
result is sumVar, every boolean returned true.
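A complete C# version of this pattern might use a [Flags] enum; the property names below are simply the ones from the example above, and the specific values are the powers of two already listed.

using System;

[Flags]
enum Properties
{
    None = 0,
    PropertyOne = 1 << 0,   // 0001
    PropertyTwo = 1 << 1,   // 0010
    PropertyThree = 1 << 2, // 0100
    PropertyFour = 1 << 3   // 1000
}

class Masking
{
    static void Main()
    {
        // The object has properties one, two, and four: 0001 | 0010 | 1000 = 1011.
        Properties flagVar = Properties.PropertyOne | Properties.PropertyTwo | Properties.PropertyFour;

        Console.WriteLine((flagVar & Properties.PropertyThree) != 0); // False
        Console.WriteLine((flagVar & Properties.PropertyFour) != 0);  // True

        // Checking several properties at once: AND with their OR and compare.
        Properties sumVar = Properties.PropertyOne | Properties.PropertyFour;
        Console.WriteLine((flagVar & sumVar) == sumVar);              // True
    }
}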
26.4 Exclusive OR
The logical exclusive OR (XOR or ^) returns true if exactly one of its
operands is true; the bitwise version sets each bit of the result to 1 if exactly
one of the corresponding operand bits is 1.
A consequence of this is that XOR is reversible; applying the operation
twice will return the original value. For example, consider a plaintext
10100101 that we wish to encrypt using a one-time pad 6 10110111.
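Working through the numbers in a small C# sketch (with the plaintext and pad hard-coded from the example): the ciphertext is 00010010, and XORing it with the same pad restores the original 10100101.

using System;

class OneTimePad
{
    static void Main()
    {
        byte plaintext = 0b1010_0101;
        byte pad = 0b1011_0111;

        byte ciphertext = (byte)(plaintext ^ pad);  // 0001 0010
        byte recovered = (byte)(ciphertext ^ pad);  // 1010 0101 again

        Console.WriteLine(Convert.ToString(ciphertext, 2).PadLeft(8, '0')); // 00010010
        Console.WriteLine(Convert.ToString(recovered, 2).PadLeft(8, '0'));  // 10100101
    }
}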
27.1.1 Multitasking
A multitasking system works by time-sharing, where the processing
capability is divided into time slices and those slices are assigned to the
various processes.
Modern computers generally use preemptive multitasking, where the
kernel will interrupt the currently-executing process when its time slice is up
or when a higher priority task needs the processor. As the operating system
controls the allocation of time slices, each process can be guaranteed to
eventually receive CPU time. However, any given process may lose control at
any time, and it is possible for several processes using shared resources to
deadlock, in which case no deadlocked process can finish its execution
because each one is waiting on a resource held by another process in the
group.
Example
The usual example is the dining philosophers problem, in which
five silent (non-communicating) philosophers sit around a table
eating spaghetti, with one fork between each pair of philosophers.
Each philosopher either eats or thinks; if he eats, he takes the forks
from both sides. A philosopher with only one fork will not eat, but
will wait for the second fork to become available; thus, if each
philosopher picks up the fork on his left, all will starve while waiting
for the second fork to be released.
Lock the shared data prior to accessing it with either thread and
thus ensure that it does not change while in use (see the sketch after this list). 7
Make the entire unit of work (both the conditional and the body of
the if statement) an atomic operation, meaning that it cannot be
interrupted. 8
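A minimal C# sketch of the first option, assuming (purely for illustration) that the shared data is a counter that must not drop below zero: both the conditional and the update happen while holding the lock, so no other thread can change the value between the check and the decrement.

using System;
using System.Threading;

class SharedCounter
{
    private readonly object _gate = new object();
    private int _count = 1;

    public bool TryTake()
    {
        lock (_gate)
        {
            if (_count > 0)
            {
                _count--;
                return true;
            }
            return false;
        }
    }

    static void Main()
    {
        var counter = new SharedCounter();
        var t1 = new Thread(() => Console.WriteLine(counter.TryTake()));
        var t2 = new Thread(() => Console.WriteLine(counter.TryTake()));
        t1.Start(); t2.Start();
        t1.Join(); t2.Join(); // exactly one thread prints True
    }
}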
How do we ensure that the process does not access any memory
outside of the allowed block?
If we do not have a contiguous 4 GB block of RAM, how do we
create one?
If the process is suspended and its RAM is used by another process,
do we need to wait until that specific block of memory is available
again before the process can resume?
Often a process will be working with a small set of its available pages. 12 If
this entire working set is in memory, the process can execute quickly despite
having only a small number of pages loaded. However, if only part of the
working set is loaded at a time, the result will be frequent page faults as the
process continuously finds that the page it needs is not in memory
(thrashing). To avoid this, we may allow a process to run only when its
entire working set can be loaded into memory. 13
If there is insufficient memory available to hold the current working sets
of all running processes, then some processes should be swapped out to
avoid thrashing.
27.3 I/O
An important function of the operating system is providing a uniform view
of I/O devices. From the standpoint of a process, it should not matter
whether data is read from memory, from disk, or from the network, or
whether it is written to the screen, to a file, or to a printer. The operating
system provides an interface that allows the process to simply read from
standard input or write to standard output.
From an operating system perspective, I/O tends to be extremely slow
compared to processor speed. 14 For a simple system that does only one
thing at a time, it may be sufficient for the processor to request I/O and then
busy wait 15 until the appropriate device has processed the data. In most
cases, we prefer that the processor do a context switch and come back to the
current process when I/O is complete. One way to switch back is to have the
hardware interrupt the CPU whenever it is ready for the next character, but
this wastes a significant amount of time, as interrupts are expensive. A
better option is the use of a direct memory access (DMA) controller. The
CPU simply initiates the transfer, hands it off to the DMA controller, then is
interrupted only when the transfer is complete.
27.4 Security
In Chapter 20, we discussed the three elements of computer security:
confidentiality, integrity, and availability. In terms of operating systems, we
can consider it the system's job to ensure that data belonging to a user or
process is not accessed by another user or process without permission, that
it is not altered unexpectedly, and that it is available on request.
In the sections above, we discussed ensuring that a process is able to
access only the memory locations allocated to it by the operating system and
is not able to monopolize the CPU. As a general rule, we wish to ensure that
each process accesses only those resources it is permitted to use, in ways
that it is permitted to use them.
This is done by requiring that processes access resources by sending a
request to the operating system, rather than taking control of the hardware
directly. The operating system must then verify that the request is
appropriate.
This verification can be done with access control lists (ACLs), which
identify the processes that can use a resource and in what ways, or with
capability lists, which determine the access granted to a given process.
In order to ensure that processes cannot bypass the access control
measures, it is crucial to avoid the operating system itself being corrupted.
In trusted computing systems, we create a trusted computing base (TCB)
consisting of those components (both hardware and software) that are
responsible for maintaining security.
The TCB is kept as small as possible to make it possible to verify
correctness; all requests for system resources must go through the reference
monitor, which acts as a barrier between the trusted and untrusted parts of
the system.
More generally, computer security works based on the principle of least
privilege: every actor (be that a process, user, or program) should have the
amount of access required for its purpose and no more. Thus, the kernel 16 is
kept as small as possible and is isolated from everything else on the system.
Programs execute as user-level processes, which do not have permission
to access resources directly. When access is needed, the process makes a
system call, interrupting the kernel. After verifying that the process has
access to the requested resource, the kernel performs the desired access.
CHAPTER 28: DISTRIBUTED SYSTEMS
Scaling Google
When Google started in 1998, they had four computers and
several hundred gigabytes of storage space. They now run more than
two million servers in multiple data centers around the world.
28.1 The fallacies of distributed
computing
Distributed computing brings its own challenges. L. Peter Deutsch and
others 2 have created a list of false assumptions that new distributed systems
programmers often make: 3
28.2 Communication
The various processes that make up a distributed system will communicate
with each other by passing messages. Any two processes could be located on
the same machine or on two servers thousands of miles apart. The system
designers must make decisions that include:
Should we require that messages be acknowledged, and if so, how
long should we wait for the acknowledgement?
Do we want to receive each message at least once, or at most once?
How do we respond if some elements of the system become
unavailable?
How do we handle the addition or removal of servers?
Simply put, a distributed system must take into account what will
happen when messages cannot be delivered in a timely, reliable, or secure
manner, or can't be delivered at all.
Customer orders are located in an Orders table, which contains all of the
information for the order. Rather than putting customer information in the
Orders table (which would result in a lot of duplicate information, especially
if one customer places many orders), each row has a customer ID that
matches the appropriate ID from the Customers table. This is a foreign key -
the same ID might be used for many entries in the Orders table, but must
map to exactly one entry in the Customers table.
A correctly designed relational database promotes accuracy by avoiding
duplication, as in the Customer example above. We store a piece of
information only in the appropriate table and link to it as needed.
Common relational database management systems (RDBMSs) include
Microsoft SQL Server, Oracle, and MySQL. Like most other relational
database systems, they use SQL (structured query language), which is the
standard language for querying relational databases.
SQL is a declarative language: rather than defining how to do something,
you declare what you want to have happen and the computer determines
how to accomplish the task.
Example
Suppose I need to see all orders placed between March 15 and
April 1 of last year. I request that information, and then the computer
determines how to actually search the Orders table to find
what I'm looking for.
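In SQL, that request might look something like the query below. The OrderDate column name and the literal dates are assumptions made for the example; note that the query says nothing about how the database should scan or index the table.

SELECT *
FROM Orders
WHERE OrderDate BETWEEN '2023-03-15' AND '2023-04-01';  -- "last year" written out literally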
The master theorem can be used for algorithms where the runtime can
be written in the form
T(n) = a · T(n/b) + f(n)
where T(n) is the total runtime and the other parameters are as
described above.
We apply one of three cases, depending on the size of f (n).
Math alert
When you see log_b a, it means the logarithm of a to base b.
When b = 2, this is the binary logarithm, abbreviated lg. For
example, log_2 8 = 3.
Other common bases are 10 (the common logarithm, log) and e
(the natural logarithm, ln).
1. If f(n) is asymptotically less 1 than n^(log_b a), then the runtime for the
algorithm is Θ(n^(log_b a)).
2. If f(n) = Θ(n^(log_b a)), then the total runtime is Θ(n^(log_b a) · log n).
3. If f(n) is asymptotically greater than n^(log_b a), meaning that the total
time to solve all of the subproblems is less than the time required to
combine those solutions, then the runtime is Θ(f(n)).
Here we see that the amount of work required for breaking up the problem
and then recombining the subproblems, compared to the amount of work
required to actually solve each subproblem, determines the overall runtime
of the algorithm. 2
We can draw the recursion as a tree with depth log_b n, with a^i nodes at
depth i. Then there are a^(log_b n) = n^(log_b a) leaves. We compare the number of
leaves to the amount of work done at each level, and the larger of the two
values determines the solution.
In the first case, the amount of work done in each step is bounded from
above by n^(log_b a). In the extreme case, imagine that f(n) is negligible, so the
runtime is asymptotically equal to the number of leaves: n^(log_b a).
In the third case, the amount of work done at each step is bounded from
below by n^(log_b a); here f(n) overwhelms the number of leaves, and the total
runtime is Θ(f(n)).
In the second case, neither the number of leaves nor the polynomial f(n)
dominates the other (they are asymptotically equal) and the runtime is the
amount of work done at each level times the number of levels in the tree, or
Θ(n^(log_b a) · log n).
Example: Mergesort
Recall from Section 8.3.2 that mergesort works by dividing an
array into two smaller arrays, then recursively sorting those arrays.
When we divide up or recombine arrays, we don't do any special
processing, so this takes O(n) time. At each step we divide the array
in two, so there are O(lg n) steps. Multiplying, we get a total runtime
of O(n lg n).
If we apply the master theorem, then f(n) = Θ(n). This is case
two above (where a and b are both 2), so the runtime of mergesort is
Θ(n lg n).
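Writing the recurrence out explicitly for mergesort (with the dividing and merging cost taken as Θ(n), as above):

T(n) = 2 · T(n/2) + Θ(n)
a = 2, b = 2, so n^(log_b a) = n^(log_2 2) = n
f(n) = Θ(n) = Θ(n^(log_b a)), which is case two, so T(n) = Θ(n^(log_b a) · log n) = Θ(n lg n)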
CHAPTER 33: AMORTIZED RUNTIME
Suppose you have an empty array of size n, and wish to insert values into it.
The first n inserts take O(1) time each. On insert n+1, the array is full and
must be resized, which requires copying all of the values over to a new,
larger array, in time O(n).
Any given insert operation takes as much as O(n) time, but this is only
the case when the array must be resized; otherwise, inserting takes constant
time. What is the average time required for each insert operation?
If we increase the array size by a constant amount each time, then the
number of elements that eventually need to be copied over will dominate the
constant, and operations will take, on average, O(n) time. 1
However, if we double the size of the array each time, then the total time
to add n items is O(n) (n insertions plus fewer than 2n element copies), and
each insert takes an average of O(1) time.
The insight here is that often we don't actually care how long any given
operation takes; what's important is the total time required over all
operations. When each expensive operation can be paired with many cheap
operations, we can amortize the cost of the expensive operation over all of
the cheap ones. Thus, even though any given operation may take O(n) time,
the amortized cost may be much lower. In the array example above, simply
looking at the worst-case runtime for a given step would lead us to conclude
that a series of n inserts would be O(n^2), but the amortized analysis shows
that (provided we double the number of elements when resizing the array) it
will actually be only O(n).
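One way to see the arithmetic is to count the copies directly. The following C# sketch (with an arbitrary number of inserts chosen for illustration) grows an array by doubling and tallies how many elements are moved; the total stays below 2n, so the cost per insert stays constant.

using System;

class DoublingArray
{
    static void Main()
    {
        const int inserts = 1_000_000;
        int capacity = 1, size = 0;
        long totalCopies = 0;
        int[] data = new int[capacity];

        for (int i = 0; i < inserts; i++)
        {
            if (size == capacity)
            {
                capacity *= 2;                    // double rather than grow by a constant
                var bigger = new int[capacity];
                Array.Copy(data, bigger, size);
                totalCopies += size;              // count the elements moved
                data = bigger;
            }
            data[size++] = i;
        }

        // Total copies stay below 2n, so the amortized cost per insert is O(1).
        Console.WriteLine($"copies = {totalCopies}, copies per insert = {(double)totalCopies / inserts:F2}");
    }
}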
CHAPTER 34: SPLAY TREES
In Section 5.1, we introduced binary search trees, which provide Θ(lg n)
runtime for common operations provided that the height of the tree is kept
to O(lg n). A splay tree 1 is a self-optimizing binary search tree: it rearranges
itself so that recently accessed nodes will be moved close to the root,
allowing quicker access, while maintaining an average height of O(lg n).
Thus, frequently used items are readily available, while all items can still be
accessed in O(lg n) time in the average case.
34.1 Concepts
Whenever we access a node x of the splay tree, we perform a splay operation
(a series of tree rotations) to move it to the root. When the tree has become
unbalanced, finding a node that is lower down in the tree may take O(n)
time, but the tree is then rebalanced by the splay step. The end result is that
all basic operations are performed in O(lg n) amortized time.
Aside
A tree rotation changes the structure of a binary tree (updating
which nodes are the children of which other nodes) without altering
the order of the elements.
If a is the left child of b, and we do a tree rotation so that b is the
right child of a, we haven't changed that node b is larger than node a;
doing an inorder traversal of the tree before and after the rotation
will return the same result.
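A minimal C# sketch of the rotation described above (a right rotation at node b, using a hypothetical Node class introduced only for this example):

using System;

class Node
{
    public int Key;
    public Node Left, Right;
    public Node(int key) { Key = key; }
}

class Rotation
{
    // Rotate right at b: its left child a becomes the new subtree root,
    // b becomes a's right child, and a's old right subtree becomes b's left subtree.
    // An inorder traversal returns the same sequence before and after.
    static Node RotateRight(Node b)
    {
        Node a = b.Left;
        b.Left = a.Right;
        a.Right = b;
        return a; // the caller reattaches a where b used to hang
    }

    static void Main()
    {
        var b = new Node(2) { Left = new Node(1) };
        Node root = RotateRight(b);
        Console.WriteLine($"{root.Key} -> right child {root.Right.Key}"); // 1 -> right child 2
    }
}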
Until x is the root, we use the above factors to choose between three
possible tree rotations.
34.2 Zig
When p is the root, we rotate the edge between x and p.
Figure 34.1: After performing the Zig operation, x has become the root.
34.3 Zig-zig
When p is not the root and p and x are both right children or both left
children, we rotate the edge between p and g and then rotate the edge
between x and p.
Figure 34.2: After performing the Zig-zig operation, x has replaced its
grandparent.
34.4 Zig-zag
When neither of the above cases apply, we rotate on the edge between p and
x and then on the resulting edge between x and g.
Figure 34.3: After performing the Zig-zag operation, x has again replaced its
grandparent.
CHAPTER 35: TREAPS
Split: This operation splits the treap into two smaller treaps, one
with keys less than x and one with keys greater than x. To do this,
insert a node with key x and maximum priority. After it is rotated to
the top, simply delete it; its former children will be the two desired
treaps.
Join: To join two ordered treaps (treaps where every key in one
treap is less than every key in the second treap), create a new node
to be the parent of the roots of those treaps. Assign it a legal key
(one which is larger than the key of its left child and smaller than
the key of its right child) and minimum priority. Then rotate it to
its proper location (as a leaf) and delete it, leaving behind the
desired treap. 3
Aside
An intractable problem generally has an algorithm (brute force
search) that provides a solution, but the algorithm is too inefficient to
be practical due to the number of possibilities to be checked.
Combinatorial explosion refers to the rapid growth in the number of
possibilities as the problem size increases. Consider a game of chess.
White has 20 legal options for his first move, and so does black.
Thus, there are 400 legal board states after each player has taken his
first move; after the second set of moves, the total number of
possibilities runs into six figures.
36.2 Subfields of AI
Moravec's paradox is that computers are often very good at things that
humans are bad at, but helpless at things that even a small child finds
simple. The most challenging areas in artificial intelligence are often tasks
for which humans have unconscious competence: recognizing faces,
catching a ball, understanding language. Many of these problems have now
spawned their own subfields of AI, some of which were mentioned above.
They include (but are not limited to):
Computer vision
This field attempts to teach a computer to recognize objects in images or
videos.
Planning
The computer decides what actions to take to reach a desired goal.
Robotics
Robots must be able to navigate and respond to unexpected events in the
real world.
Speech processing
This area includes both recognizing and generating speech.
36.3 Examples
AI has become common in everyday life. A few examples that many readers
will be familiar with:
Terminology
A classical computer is simply a computer that is not a quantum
computer. It relies only on classical mechanics, while a quantum
computer requires features of quantum mechanics.
37.1 Physics
Quantum physics deals with how things behave at the subatomic level.
Several concepts will be important in quantum computing:
superposition
This is the ability of a quantum system to be in more than one state at the
same time. A qubit may be in a superposition of 0 and 1; it is both 0 and 1
(with some probability of each) until it is measured.
entanglement
When two or more particles are entangled, the quantum state of each
particle becomes correlated with that of the others. Measuring the state
(spin, polarization, position, momentum) of one particle affects the others,
even if they are separated by a great distance. 1
quantum measurement
When the state of a quantum system is measured, the quantum state
collapses into a single classical state. Each qubit is now either a 0 or a 1,
rather than existing in a superposition of states.