Computer Science Field Guide - Student Version
Version 2.12.2, Print Edition
The Computer Science Field Guide uses a Creative Commons (CC BY-NC-SA 4.0) license.
1. Introduction
Watch the video online at https://www.youtube.com/embed/v5yeq5u2RMI?rel=0
I'm glad you asked! Put simply, computer science is about tools and techniques
for designing and building applications that are very fast, have great interfaces,
are reliable, secure, helpful --- even fun.
A lot of people confuse computer science with programming. It has been said
that "computer science is no more about computers than astronomy is about
telescopes" (Mike Fellows). Programming is the tool that computer scientists
use to bring great ideas to life, but just knowing how to give programmed
instructions to a computer isn't enough to create software that delights and
empowers people.
For example, computers can perform billions of operations every second, and
yet people often complain that they are too slow. Humans can perceive delays
of about one tenth of a second, and if your program takes longer than that to
respond it will be regarded as sluggish, jerky or frustrating. You've got well
under a second to delight the user! If you are searching millions of items of
data, or displaying millions of pixels (megapixels), you can't afford to do things
the wrong way, and you can't just tell your users that they should buy a faster
computer ... they'll probably just buy someone else's faster software instead!
Here's some advice from Fred Wilson, who has invested in many high profile
tech companies:
First and foremost, we believe that speed is more than a feature. Speed is
the most important feature. If your application is slow, people won't use it. I
see this more with mainstream users than I do with power users. I think that
power users sometimes have a bit of sympathetic eye to the challenges of
building really fast web apps, and maybe they're willing to live with it, but
when I look at my wife and kids, they're my mainstream view of the world. If
something is slow, they're just gone. ... speed is more than a feature. It's a
requirement.
A key theme in computer science is working out how to make things run fast,
especially if you want to be able to sell your software to the large market of
people using old-generation smartphones, or run it in a data centre where you
pay by the minute for computing time. You can't just tell your customers to buy
a faster device --- you need to deliver efficient software.
(This book has many interactives like this. If the calculators don't work properly,
you may need to use a more recent browser. The interactive material in this
book works in most recent browsers; Google Chrome is a particularly safe bet.)
The second calculator above is slower, and that can be frustrating. But it has a
fancier interface --- buttons expand when you point to them to highlight what
you're doing. Does this make it easier to use? Did you have problems because
the "C" and "=" keys are so close?
How interfaces work is a core part of computer science. The aesthetics ---
images and layout --- are important, but what's much more crucial is the
psychology of how people interact. For example, suppose the "OK" and
"Cancel" buttons in dialogue boxes were occasionally reversed. You would
always need to check carefully before clicking on one of them, instead of using
the instinctive moves you've made countless times before. There are some
very simple principles based on how people think and behave that you can take
advantage of to design systems that people love.
Making software that can scale up is another important theme. Imagine you've
built a web interface and have attracted thousands of customers. Everything
goes well until your site goes viral overnight, and you suddenly have millions of
customers. If the system becomes bogged down, people will become frustrated
waiting for a response, and tomorrow you will have no customers --- they’ll all
have moved on to someone else's system. But if your programs are designed
so they can scale up to work with such large amounts of data your main
problem will be dealing with offers to buy your company!
Some of these problems can be solved by buying more equipment, but that can
be an expensive and wasteful option (not just for cost, but because of the
impact on the environment, including the wasted power used to do the
processing inefficiently). With mobile computing it's even more important to
keep things lean and efficient --- heavy duty programs chew up valuable
battery life, and processing and memory must be used sparingly as these affect
the size, weight and even heat dissipation of devices.
If your system is successful and becomes really popular, pretty soon people will
be trying to hack into it to steal valuable customer data or passwords. How can
you design systems so that you know they are secure from such attacks and
your customers can trust you with their personal information or business
transactions?
All these questions and more are addressed by the field of computer science.
The purpose of this guide is to introduce you to those ideas so that you have a
better idea of whether this field is for you. It is aimed at high-school level, and
is intended to bring you to the point where you have a good overview of the
field, and are well prepared for further in-depth study to become an expert.
We've broken computer science up into a whole lot of topics that you'll often
find in curricula around the world, such as algorithms, human-computer
interaction, compression, cryptography, computer graphics, and artificial
intelligence. The reality is that all these topics interact, so be on the lookout for
the connections.
This guide isn't a list of facts for you to memorise, or to copy and paste into
projects! It is mainly a guide to things you can do --- experiences that will
engage you with the topics. In fact, we won't go through all the topics in great
detail, but will give you references to websites and books that explain things
thoroughly. The idea of this guide is to give you enough background to
understand the topics, and to do something meaningful with them.
1.3. Programming
And what about programming? You can get through this whole guide without
doing any programming, although we'll suggest exercises. Ultimately, however,
all the concepts here are reflected in programs that people write. If you want to
learn programming there are many excellent courses available. It takes time
and practice, and is well worth doing in parallel with working through the topics
in this guide. There are a number of free online systems and books that you
can use to teach yourself programming. A list of options for all ages learning to
program is available at www.code.org, where there is also a popular video of
some well-known high-fliers in computing that is good to show classes.
Each chapter begins with a section about the "big picture" --- why the topic is
useful for understanding and designing computer systems, and what can be
achieved using the main ideas in the chapter. You'll then be introduced to key
ideas and applications of the topic through examples, and wherever possible
we'll have interactive activities that enable you to work with the ideas first
hand. Sometimes these will be simplified versions of the full-sized problems that computer scientists need to deal with --- our intention is for you to actually
interact with the ideas, not just read about them. Make sure you give them a
go!
We finish each chapter by talking about the "whole story," giving hints about
parts of the topic that we omitted because we didn't want to make the chapter
too overwhelming. There will be pointers for further reading, but be warned
that some of it might be quite deep, and require advanced math or
programming skills.
If you are doing this for formal study, you'll end up having to do some sort of
assessment. The curriculum guides provide ideas for projects and activities that
could be used for this.
Production of the guide was partially funded by a generous grant from Google
Inc., and supported by the University of Canterbury. Of course, we welcome
donations to support further work on the guide.
There are also some excellent general web sites about Computer Science,
many of which we've referenced in other chapters:
• Computer Science For Fun --- a very readable collection of short articles
about practical applications of topics in computer science
• Babbage's bag is an excellent collection of technical articles on many
topics in computing.
• CS Bytes has up-to-date articles about applications of computer science.
• Thriving in our digital world has some excellent information and interactive
material on topics from computer science.
• The Virginia tech online interactive modules for teaching computer science
cover a range of relevant topics.
• CS animated has interactive activities on computer science.
• CS for All
2. Algorithms
Watch the video online at https://www.youtube.com/embed/FOwCCvHEfY0
If you have read through the Introduction chapter you may remember that the
speed of an application on a computer makes a big difference to a human using
it. If an application you create is too slow, people will get frustrated with it and
won't use it. It doesn't matter if your software is amazing, if it takes too long
they will simply give up and try something else!
Often you can get away with describing a process just using some sort of
informal instructions using natural language; for example, an informal
instruction in a non computing context might be "please get me a glass of
water". A human can understand what this means and can figure out how to
accomplish this task by thinking, but a computer would have no idea how to do
this!
Algorithms are often expressed using a loosely defined format called pseudocode, which matches a programming language fairly closely, but leaves out details that could easily be added later by a programmer. Pseudocode doesn't have strict rules about the sorts of commands you can use, but it's halfway between an informal instruction and a specific computer program.
With the high score problem, the algorithm might be written in pseudocode like this:
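if the table is empty
    display that there is no high score yet
otherwise:
    put the first score in the table into a variable called highest_so_far
    for each of the remaining scores in the table:
        if the score is higher than highest_so_far:
            put that score into highest_so_far
    display the value in highest_so_far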
Algorithms are more precise than informal instructions, and don't require any insight to follow. They are still not precise enough for a computer to follow in the form they are written, but they are precise enough for a human to know exactly what you mean, so they can then work out how to implement your algorithm, either by doing it themselves or by writing a computer program to do it. The other important thing about this level of precision is that we can often make a good estimate of how fast the algorithm will be. For the high score problem above, if the score table gets twice as big, the algorithm will take about twice as long. If the table could be very big (perhaps we're tracking millions of games and serving up the high score many times each second), that might already be enough to tell us that we need a better algorithm to track high scores, regardless of which language it's going to be programmed in. On the other hand, if the table only ever has 10 scores in it, then we know that the program is only going to do a few dozen operations, and is bound to be really fast even on a slow computer.
The most precise way of giving a set of instructions is in the form of a program, which is a specific implementation of an algorithm, written in a specific programming language, with a very specific result for any particular input. This is the most precise of the three descriptions, and it is the one that computers are able to follow and understand.
For the example with getting a drink, we might program a robot to do that; it
would be written in some programming language that the robot's computer can
run, and would tell the robot exactly how to retrieve a glass of water and bring
it back to the person who asked for the water.
With the high-score problem, it would be written in a particular language; even
in a particular language there are lots of choices about how to write it, but
here's one particular way of working out a high score (don't worry too much
about the detail of the program if the language isn't familiar; the main point is
that you could give it to a computer that runs Python, and it would follow the
instructions exactly):
def find_high_score(scores):
    if len(scores) == 0:
        print("No high score, table is empty")
        return -1
    else:
        highest_so_far = scores[0]
        for score in scores[1:]:
            if score > highest_so_far:
                highest_so_far = score
        return highest_so_far
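For example, calling the function on a small made-up score table gives back the highest score in it:
print(find_high_score([10, 60, 35, 44]))
# This prints 60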
But here's another program that implements exactly the same algorithm, this
time in the Scratch language.
Both of the above programs are the same algorithm. In this chapter we'll look in
more detail about what an algorithm is, and why they are such a fundamental
idea in computer science. Because algorithms exist even if they aren't turned into programs, we won't need to look at programs at all for this topic, unless you
particularly want to.
For example, one way of expressing the cost of the high score algorithm above
would be to observe that for a table of 10 values, it does about 10 sets of
operations to find the best score, whereas for a table of 20 scores, it would do
about twice as many operations. In general the number of operations for a
table of n items will be proportional to n. Not all algorithms take double the
time for double the input; some take a lot more than double, while others take
a lot less. That's worth knowing in advance because we usually need our
programs to scale up well; in the case of the high scores, if you're running a
game that suddenly becomes popular, you want to know in advance that the
high score algorithm will be fast enough if you get more scores to check.
The formal term for working out the cost of an algorithm is algorithm
analysis, and we often refer to the cost as the algorithm's complexity. The
most common complexity is the "time complexity" (a rough idea of how
long it takes to run), but often the "space complexity" is of interest - how
much memory or disk space will the algorithm use up when it's running?
The amount of time a program which performs the algorithm takes to complete
may seem like the simplest cost we could look at, but this can actually be
affected by a lot of different things, like the speed of the computer being used,
or the programming language the program has been written in. This means
that if the time the program takes to complete is used to measure the cost of
an algorithm it is important to use the same program and the same computer
(or another computer with the same speed) for testing the algorithm with
different numbers of inputs.
This algorithm relies on a correct search algorithm in the first step. If the search
algorithm incorrectly chose a random person, the algorithm for assigning
animals as pets would also be incorrect.
As you will see in this chapter with searching and sorting, there are multiple correct algorithms for the same problem. It is often worth knowing more than one correct algorithm because there are tradeoffs in simplicity, algorithm cost, and assumptions about inputs.
2.2. Searching
Searching through collections of data is something computers have to do all the
time. It happens every time you type in a search on Google, or when you type
in a file name to search for on your computer. Computers deal with such huge
amounts of data that we need fast algorithms to help us find information
quickly.
You may have noticed that the numbers on the monsters and pets in the game
were in a random order, which meant that finding the pet was basically luck!
You might have found it on your first try, or if you were less lucky you might
have had to look inside almost all the presents before you found it. This might
not seem like such a bad thing since you had enough lives to look under all the
boxes, but imagine if there had been 1,000 boxes, or worse 1,000,000! It would
have taken far too long to look through all the boxes and the pet might have
never been found.
Now this next game is slightly different. You have fewer lives, which makes things
a bit more challenging, but this time the numbers inside the boxes will be in
order. The monsters, or maybe the pet, with the smallest number is in the
present on the far left, and the one with the largest number is in the present on
the far right. Let's see if you can collect all the pets without running out of
lives...
Use the interactive online at http://www.csfieldguide.org.nz/en/interactives/searching-
algorithms/index.html?level=3
Now that you have played through the whole game (and hopefully found all of
the lost pets!) you may have noticed that even though you had fewer lives in the
second part of the game, and lots of presents to search through, you were still
able to find the pet. Why was this possible?
It was possible because in the second game the numbers were sorted into order, which allows a much better strategy than checking boxes at random. In the first game, the best you could do was a Linear Search:
• Check the first item in the list; if it is the one you are looking for, you are done.
• If it isn't, move on and check the next item.
• Continue checking items until you find the one you are searching for.
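In Python, Linear Search might be sketched like this (the function name and the use of -1 to mean "not found" are our own choices):
def linear_search(items, target):
    # Check each item in turn until we find the target
    for position in range(len(items)):
        if items[position] == target:
            return position
    # The target wasn't anywhere in the list
    return -1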
If you used this algorithm you might get lucky and find what you are looking for
on your first go, but if you were really unlucky you might have to look through
everything in your list before you found the right object! For a list of 10 items
this means on average you would only have to look at 5 items to find what you
were looking for, but for a list of 10000 you would have to look through on
average 5000.
If you watched the video at the beginning of the chapter you might be
thinking that what you did in the present searching game sounds more like
Bozo Search than Linear Search, but actually Bozo Search is even sillier
than this! If you were doing a Bozo Search then after unwrapping a present
and finding a monster inside, you would wrap the present back up and try
another one at random! This means you might end up checking the same
present again and again and again and you might never find the pet, even
with a small number of presents!
If you used a Binary Search on each of the levels then you would have always
had enough lives to find the pet! Informally, the Binary Search algorithm is as
follows:
• Look at the item in the centre of the list and compare it to what you are
searching for
• If it is what you are looking for then you are done.
• If it is larger than the item you are looking for then you can ignore all the
items in the list which are larger than that item (if the list is from smallest
to largest this means you can ignore all the items to the right of the centre
item).
• If it is smaller then you can ignore all the items in the list which are
smaller than that centre item.
• Now repeat the algorithm on the remaining half of the list, checking the
middle of the list and choosing one of the halves, until you find the item
you are searching for.
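A Python sketch of Binary Search on a list sorted from smallest to largest might look like this (again, the details are our own):
def binary_search(sorted_items, target):
    low = 0
    high = len(sorted_items) - 1
    while low <= high:
        # Look at the item in the centre of the remaining range
        middle = (low + high) // 2
        if sorted_items[middle] == target:
            return middle
        elif sorted_items[middle] > target:
            # Ignore everything above the centre item
            high = middle - 1
        else:
            # Ignore everything below the centre item
            low = middle + 1
    return -1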
Binary Search is a very powerful algorithm. If you had 1000 presents to search through, it would take at most 10 checks for Binary Search to find something, while Linear Search would take at most 1000 checks. But if you doubled the number of presents to search through, how would this change the number of checks made by Binary Search and Linear Search?
Spoiler: How does doubling the number of boxes affect the number of
checks required?
The answer to the above question is that the maximum number of checks
for Linear Search would double, but the maximum number for Binary
Search would only increase by one.
It is important to remember that you can only perform a Binary Search if the
items you are searching through are sorted into order. This makes the sorting
algorithms we will look at next even more important because without sorting
algorithms we wouldn't be able to use Binary Search to quickly look through
data!
The following files will run linear and binary search in various languages;
you can use them to generate random lists of values and measure how
long they take to find a given value. Your project is to measure the amount
of time taken as the number of items (n) increases; try drawing a graph
showing this.
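As a starting point, here's a minimal Python sketch of this kind of experiment, assuming the linear_search and binary_search functions sketched above:
import random
import time

def time_search(search_function, n):
    # Build a sorted list of n values and pick a random value to look for
    items = list(range(n))
    target = random.choice(items)
    start = time.perf_counter()
    search_function(items, target)
    return time.perf_counter() - start

# Watch how each algorithm scales as n doubles
for n in [125000, 250000, 500000, 1000000]:
    print(n, time_search(linear_search, n), time_search(binary_search, n))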
2.3. Sorting
Sorting is another very important area of algorithms. Computers often have to
sort large amounts of data into order based on some attribute of that data,
such as sorting a list of files by their name or size, or emails by the date they
were received, or a customer list according to people's names. Most of the time
this is done to make searching easier. For example, you might have a large amount of data where each piece of data is someone's name and their phone number. If you want to search for someone by name it would help to first have the data sorted alphabetically according to everyone's names, but if you
then wanted to search for a phone number it would be more useful to have the
data sorted according to people's phone numbers.
Like searching there are many different sorting algorithms, but some take much
longer than others. In this section you will be introduced to two slower
algorithms and one much better one.
2.3.1. Scales Interactive
Throughout this section you can use the sorting interactive to test out the algorithms we talk about. When you're using it, make sure you take note of the number of comparisons at the bottom of the screen. Each time you compare two boxes the algorithm is making one comparison, so the total number of comparisons you make with each algorithm is the cost of that algorithm for the 8 boxes.
Use the scales to compare the boxes (you can only compare two boxes at a
time) and then arrange them along the bottom of the screen. Arrange them so
that the lightest box is on the far left and the heaviest is on the far right. Once
you think they are in order click 'Test order'.
If the interactive does not run properly on your computer you can use a set of
physical balance scales instead; just make sure you can only tell if one box is
heavier than the other, not their exact weight (so not digital scales that show
the exact weight).
After finding the lightest box, simply repeat the process with the remaining boxes until you find the second lightest; place that to the side alongside the lightest box. If you keep repeating this process you will eventually have placed every box in order. Try sorting the whole group of boxes in the scales interactive into order using this method, and count how many comparisons you have to make.
Tip: Start by moving all the boxes to the right of the screen and then once you
have found the lightest box place it to the far right (if you want to find the
heaviest first instead then move them all to the left).
If you record how many comparisons you had to make each time to find the
next lightest box you might notice a pattern (hint: finding the lightest should
take 7 comparisons, and then finding the second lightest should take 6
comparisons…). If you can see the pattern then how many comparisons do you
think it would take to then sort 9 boxes into order? What about 20? If you knew
how many comparisons it would take to sort 1000 boxes, then how many more
comparisons would it take to sort 1001 instead?
This algorithm is called Selection sort, because each time you look through the
list you are 'selecting' the next lightest box and putting it into the correct
position. If you go back to the algorithms racing interactive at the top of the
page you might now be able to watch the selection sort list and understand
what it is doing at each step.
• Find the smallest item in the list and place it to one side. This will be your
sorted list.
• Next find the smallest item in the remaining list, remove it and place it
into your sorted list beside the item you previously put to the side.
• Repeat this process until all items have been selected and moved into
their correct position in the sorted list.
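Here's what Selection Sort might look like in Python (a sketch, sorting a list in place rather than moving boxes to one side):
def selection_sort(items):
    for i in range(len(items) - 1):
        # Find the smallest item in the unsorted remainder of the list
        smallest = i
        for j in range(i + 1, len(items)):
            if items[j] < items[smallest]:
                smallest = j
        # Swap it into its correct position
        items[i], items[smallest] = items[smallest], items[i]
    return items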
You can swap the word 'smallest' for 'largest' and the algorithm will still work; as long as you are consistent, it doesn't matter whether you look for the smallest or the largest item each time.
Try this with the scales interactive. Start by moving all the boxes to one side of the screen; this is your original, unsorted group. Now choose a box at random and place it on the other side of the screen; this is the start of your sorted group.
To insert another box into the sorted group, compare it to the box that is
already in the sorted group and then arrange these two boxes in the correct
order. Then to add the next box compare it to these boxes (depending on the
weight of the box you might only have to compare it to one!) and then arrange
these three boxes in the correct order. Continue inserting boxes until the sorted
list is complete. Don't forget to count how many comparisons you had to make!
This algorithm is called Insertion Sort. If you're not quite sure if you've got the
idea of the algorithm yet then have a look at this animation from Wikipedia.
• Take an item from your unsorted list and place it to the side, this will be
your sorted list.
• One by one, take each item from the unsorted list and insert it into the
correct position in the sorted list.
• Do this until all items have been sorted.
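A Python sketch of Insertion Sort that follows these steps directly might be (the names are our own):
def insertion_sort(items):
    sorted_items = []
    for item in items:
        # Work out where the item belongs in the sorted list...
        position = 0
        while position < len(sorted_items) and sorted_items[position] < item:
            position += 1
        # ...and insert it there
        sorted_items.insert(position, item)
    return sorted_items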
People often perform this when they physically sort items. It can also be a very
useful algorithm to use if you already have a sorted set of data and want to add
a new piece of data into the set. For example, if you owned a library and purchased a new book, you wouldn't do a Selection Sort on the entire library just to place this new book; you would simply insert the new book in its correct place.
2.3.4. Quicksort
Insertion and Selection Sort may seem like logical ways to sort things into
order, but they both take far too many comparisons when they are used for
large amounts of data. Remember computers often have to search through
HUGE amounts of data, so even if they use a good searching algorithm like
Binary Search to look through their data, if they use a bad sorting algorithm to
first sort that data into order then finding anything will take far too long!
Now apply this process to each of the two groups of boxes (the lighter ones,
then the heavier ones). Keep on doing this until they are all sorted. The boxes
should then be in sorted order!
It might be worth trying this algorithm out a few times and counting the
number of comparisons you perform each time. This is because sometimes you
might be unlucky and happen to pick the heaviest, or the lightest box first. On
the other hand you might be very lucky and choose the middle box to compare
everything to first. Depending on this the number of comparisons you perform
will change.
• Choose an item from the list and compare every other item in the list to
this (this item is often called the pivot).
• Place all the items that are greater than it into one subgroup and all the
items that are smaller into another subgroup. Place the pivot item in
between these two subgroups.
• Choose a subgroup and repeat this process. Eventually each subgroup will
contain only one item and at this stage the items will be in sorted order.
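In Python, Quicksort can be sketched recursively like this (choosing the first item as the pivot, which is just one of many possible choices):
def quicksort(items):
    # A group of one item (or none) is already in sorted order
    if len(items) <= 1:
        return items
    pivot = items[0]
    smaller = [item for item in items[1:] if item <= pivot]
    larger = [item for item in items[1:] if item > pivot]
    # Sort each subgroup, and place the pivot in between
    return quicksort(smaller) + [pivot] + quicksort(larger)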
The following files will run selection sort and quicksort in various
languages; you can use them to generate random lists of values and
measure how long they take to be sorted. Note how long these take for
various amounts of input (n), and show it in a table or graph. You should
notice that the time taken by Quicksort is quite different to that taken by
selection sort.
There are dozens of sorting algorithms that have been invented; most of the
ones that are used in practice are based on quicksort and/or mergesort. These,
and many others, can be seen in this intriguing animated video.
2.4.1. Sequencing
Sequencing is the technique of deciding the order instructions are executed to
produce the correct result. Imagine that we have the following instructions (A,
B, C) to make a loaf of bread:
A. Combine ingredients
B. If ingredients contain yeast, allow to sit at room temperature for 1 hour
C. Bake for 30 minutes
2.4.3. Iteration
Iteration allows an algorithm to repeat instructions. In its simplest form we
might specify the exact number of times. For example, here is an algorithm to
bake 2 loaves of bread:
1. Repeat 2 times:
1. Combine ingredients
2. If ingredients contain yeast, allow to sit at room temperature for 1
hour
3. Bake for 30 minutes
This algorithm clearly works but it would take at least 3 hours to complete! If
we had to make 20 loaves we would probably want to design a better
algorithm. We could measure the size of the mixing bowl, how many loaves fit
on the table to rise, and how many loaves we could bake at the same time in
the oven. Our algorithm might then look like:
1. Repeat 10 times:
1. Combine ingredients for 2 loaves
2. Split dough into 2 bread pans
3. If ingredients contain yeast, allow to sit at room temperature for 1
hour
4. Bake bread pans in the same oven for 30 minutes
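If the steps were written in Python, the loop might be sketched like this (the print statements stand in for the real baking steps, and contains_yeast is an assumption about the recipe):
contains_yeast = True
for batch in range(10):
    print("Combine ingredients for 2 loaves")
    print("Split dough into 2 bread pans")
    if contains_yeast:
        print("Allow to sit at room temperature for 1 hour")
    print("Bake bread pans in the same oven for 30 minutes")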
Astute observers will note that this algorithm is still inefficient because the
rising table and oven are not used at the same time. Designing algorithms that
take advantage of parallelism is an important advanced topic in computer
science.
We can connect the algorithm for baking bread in the previous section to this
algorithm to create a new algorithm that makes croutons from scratch. If we
required other ingredients for our recipe, we could connect multiple algorithms
to build very complex algorithms.
Often when we have multiple algorithms that solve a problem there are
advantages of each algorithm for specific cases. Hybrid algorithms take parts of
multiple algorithms and combine them to gain the advantages of both original
algorithms. For example, Timsort is one of the fastest known sorting algorithms
in practice and it uses parts of insertion sort and merge sort. Insertion sort is
used on very small sequences to take advantage of its speed for already or
partially ordered sequences. Merge sort is used to merge these small
sequences into larger ones to take advantage of the better upper bound on
algorithm cost for large data sets.
The algorithms introduced in this chapter aren't even necessarily the best for
any situation; there are several other common ways of searching (e.g. hashing
and search trees) and sorting (e.g. mergesort), and a computer scientist needs
to know them, and be able to apply and fine tune the right one to a given
situation.
In this chapter we'll look at what happens when you write and run a program,
and how this affects the way that you distribute the program for others to use.
We start with an optional subsection on what programming is, for those who
have never programmed before and want an idea about what a program is.
Examples of very simple programs in Python are provided, and these can be
run and modified slightly. Working through this section should give you
sufficient knowledge for the rest of this chapter to make sense; we won't teach
you how to program, but you will get to go through the process that
programmers use to get a program to run. Feel free to skip this section if you already know a bit about programming.
print("**********************************************")
print("**********************************************")
print("** Welcome to computer programming, Student **")
print("**********************************************")
print("**********************************************")
This program is written in a language called Python, and when the program
runs, it will print the following text to the screen
**********************************************
**********************************************
** Welcome to computer programming, Student **
**********************************************
**********************************************
Try changing the program so that it says your name instead of Student. When
you think you have it right, try running the program again to see. Make sure
you don’t remove the double quotes or the parentheses (round brackets) in the
program by mistake. What happens if you spell "programming" wrong? Does
the computer correct it? If you are completely stuck, ask your teacher for help
before going any further.
Hopefully you figured out how to make the program print your name. You can
also change the asterisks (*) to other symbols. What happens if you do remove
one of the double quotes or one of the parentheses? Try it!
If you change a critical symbol in the program you will probably find that the
Python interpreter gives an error message. In the online Python interpreter
linked to above, it says “ParseError: bad input on line 1”, although different
interpreters will express the error in different ways. If you have trouble fixing
the error again, just copy the program back into Python from above.
Programming languages can do much more than print out text though. The
following program is able to print out multiples of a number. Try running the
program.
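print("The first 5 multiples of 3:")
for i in range(5):
    print(i*3)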
The first line is a print statement, like those you saw earlier, which just tells the system to put the message on the screen. The second line is a loop, which says to repeat the lines after it 5 times. Each time it loops, the value of i changes: the first time i is 0, then 1, then 2, then 3, and finally 4. It may seem weird that it goes from 0 to 4 rather than 1 to 5, but programmers tend to like counting from 0 as it makes some things work out a bit simpler. The third line says to print the current value of i multiplied by 3 (because we want multiples of 3). Note that there are no double quotes around i*3 in the last print statement, as quotes are only used when we want to print something out literally as text. If we did put them in, this program would print the text "i*3" out 5 times instead of the values we want!
Note that the # symbol tells the computer that it should ignore the line, as it is
a comment for the programmer.
Try changing the recipients or the letter. Look carefully at all the symbols that
were used to include the recipient's name in the letter.
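The next example converts miles into kilometers. Here is one way such a program might look in Python (the exact wording of the messages is our own choice):
print("This program converts miles into kilometers")
number_of_miles = int(input("Number of miles: "))
if number_of_miles < 0:
    print("Error: the number of miles cannot be less than 0")
else:
    number_of_kilometers = number_of_miles * 1.609
    print(number_of_kilometers)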
The first line is a print statement (which you should be very familiar with by now!). The second line asks the user for a number of miles, which is converted from input text (called a string) to an integer. The third line uses an if statement to check whether the number entered was less than 0, so that it can print an error if it is. Otherwise, if the number was ok, the program jumps into the else section (the error is not printed because the if was not true), calculates the number of kilometers (a mile is about 1.609 kilometers), stores it into a variable called number_of_kilometers for later reference, and then the last line prints it out. Again, we don’t have quotes around number_of_kilometers in the last line as we want to print out the value that is stored in the number_of_kilometers variable. If this doesn’t make sense, don’t worry. You aren’t expected to know how to program for this chapter; this introduction is only intended to give you some idea of what a program is and the things it can do.
If you are keen, you could modify this program to calculate something else,
such as pounds to kilograms or Fahrenheit to Celsius. It may be best to use an
installed Python interpreter on your computer rather than the web version, as
the web version can give very unhelpful error messages when your program
has a mistake in it (although all interpreters give terrible error messages at
least sometimes!)
Programs can do many more things, such as having a graphical user interface
(like most computer programs you will be familiar with), being able to print
graphics onto a screen, or being able to write to and read from files on the
computer in order to save information between each time you run the program.
3.1.2. Where are we going?
When you ran the programs, it might have seemed quite magical that the
computer was able to instantly give you the output. Behind the scenes
however, the computer was running your example programs through another
program in order to convert them into a form that it could make sense of and
then run.
Firstly, you might be wondering why we need languages such as Python, and
why we can’t give computers instructions in English. If we typed into the
computer “Okay computer, print me the first 5 multiples of 3”, there's no
reason that it would be able to understand. For starters, it would not know what
a “multiple” is. And it would not even know how to go about this task.
Computers cannot be told what every word means, and they cannot know how
to accomplish every possible task. Understanding human language is a very
difficult task for a computer, as you will find out in the Artificial Intelligence
chapter. Unlike humans who have an understanding of the world, and see
meaning, computers are only able to follow the precise instructions you give
them. Therefore, we need constrained and unambiguous languages that the computer can “understand”. These can be used to give the computer instructions, like those in the previous section.
It isn’t this simple though: a computer cannot run instructions given directly in these languages. At the lowest level, a computer has to use physical hardware
to run the instructions. Arithmetic such as addition, subtraction, multiplication,
and division, or simple comparisons such as less than, greater than, or equal to
are done on numbers represented in binary by putting electricity through
physical computer chips containing transistors. The output is also a number
represented in binary. Building a fast and cheap circuit to do simple arithmetic
such as this isn't that hard, but the kind of instructions that people want to give
computers (like "print the following sentence", or "repeat the following 100
times") are much harder to build circuitry for.
The electronics in computers uses circuitry that mainly just works with two
values (represented as high and low voltages) to make it reliable and fast.
This system is called binary, and is often written on paper using zeroes and
ones. There's a lot more about binary in the data representation chapter,
and it's worth having a quick look at the first section of that now if you
haven't come across binary before.
The conversion from a high level to a low level language can involve compiling,
which replaces the high level instructions with machine code instructions that
can then be run, or it can be done by interpreting, where each instruction is
converted and followed one by one, as the program is run. In reality, a lot of
languages use a mixture of these, sometimes compiling a program to an
intermediate language, then interpreting it (Java does this). The language we
looked at earlier, Python, is an interpreted language. Other languages such as
C++ are compiled. We will talk more about compiling and interpreting later.
We will start with looking at low level languages and how computers actually
carry out the instructions in them, then we will look at some other
programming languages that programmers use to give instructions to
computers, and then finally we will talk about how we convert programs that
were written by humans in a high level language into a low level language that
the computer can carry out.
The instructions are quite different to the ones you will have seen before in
high level languages. For example, the following program is written in a
machine language called MIPS, which is used on some embedded computer
systems. We will use MIPS in examples throughout this chapter.
It starts by adding 2 numbers (that have been put in registers $t0 and $t1) and
printing out the result. It then prints “Hello World!” Don’t worry, we aren’t
about to make you learn how to actually program in this language! And if you
don’t really understand the program, that’s also fine because many software
engineers wouldn’t either! (We are showing it to you to help you to appreciate
high level languages!)
.data
str: .asciiz "\nHello World!\n"
# You can change what is between the quotes if you like
.text
.globl main
main:
# Do the addition
# For this, we first need to put the values
# to add into registers ($t0 and $t1)
# You can change the 30 below to another value
li $t0, 30
# You can change the 20 below to another value
li $t1, 20
# Add the values in $t0 and $t1, putting the result in $t2
add $t2, $t0, $t1
# Print the result (syscall 1 prints the integer in $a0)
li $v0, 1
move $a0, $t2
syscall
# Print the "Hello World!" string (syscall 4 prints the string at $a0)
li $v0, 4
la $a0, str
syscall
# Exit nicely
li $v0, 0
jr $ra
You can run this program using a MIPS emulator using this interactive:
Copy and paste the output in the “Assembler Output” box into the box in this
simulator interactive:
Use the interactive online at http://www.csfieldguide.org.nz/en/interactives/mips-simulator/
index.php
Once you have got the program working, try changing the values that are
added. The comments tell you where these numbers that can be changed are.
You should also be able to change the string (text) that is printed without too much trouble. As a challenge, can you make it so that it subtracts rather
than adds the numbers? Clue: instruction names are always very short.
Unfortunately you won’t be able to make it multiply or divide using this
simulator as this is not currently supported. Remember that to rerun the
program after changing it, you will have to follow both steps 1 and 2 again.
You may be wondering why you have to carry out both these steps. Because computers work in 1s and 0s, the instructions simply need to be converted into hexadecimal. Hexadecimal is a shorthand notation for binary numbers; for example, the eight binary digits 1010 0011 can be written as the two hexadecimal digits A3. Don’t muddle this process with compiling or interpreting! Unlike these, it is much simpler: in general, each instruction from the source code ends up being one line in the hexadecimal.
One thing you might have noticed while reading over the possible instructions
is that there is no loop instruction in MIPS. Using several instructions though, it
actually is possible to write a loop using this simple language. Have another
read of the paragraph that describes the various instructions in MIPS. Do you
have any ideas on how to solve this problem? It requires being quite creative!
The jumping to a line, and jumping to a line if a condition is met can be used to
make loops! A very simple program we could write that requires a loop is one
that counts down from five and then says “Go!!!!” once it gets down to one. In
Python we can easily write this program in three lines.
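for i in range(5, 0, -1):
    print(i)
print("GO!!!!")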
But in MIPS, it isn’t that straightforward. We need to put values into registers,
and we need to build the loop out of jump statements. Firstly, how can we
design the loop?
And the full MIPS program for this is as follows. You can go away and change it.
# Define the data strings
.data
go_str: .asciiz "GO!!!!!\n"
new_line: .asciiz "\n"
.text
# Where should we start?
.globl main
main:
# Put our starting value 5 into register $t0. We will update it as we go
li $t0, 5
# Put our stopping value 0 into register $t1
li $t1, 0
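# The label for the start of the loop
start_loop:
# If the counter in $t0 has reached the stopping value in $t1, jump out of the loop
beq $t0, $t1, end_loop
# These three lines print the current value of the counter in $t0
li $v0, 1
move $a0, $t0
syscall
# These three lines print a new line character
li $v0, 4
la $a0, new_line
syscall
# Decrease the counter by 1
addi $t0, $t0, -1
# Jump back up to the start of the loop
j start_loop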
# This is the end loop label that we jump to once the loop has finished
end_loop:
# These three lines print the “GO!!!!” string
li $v0, 4
la $a0, go_str
syscall
# And these 2 lines make the program exit nicely
li $v0, 0
jr $ra
Can you change the Python program so that it counts down from 10? What
about so it stops at 5? (You might have to try a couple of times, as it is
somewhat counter-intuitive. Remember that when i is the stopping number, it
stops there and does not run the loop for that value!). And what about
decrementing by 2 instead of 1? And changing the string (text) that is printed
at the end?
You probably found the Python program not too difficult to modify. See if you
can make these same changes to the MIPS program.
If that was too easy for you, can you make both programs print out “GO!!!!”
twice instead of once? (you don’t have to use a loop for that). And if THAT was
too easy, what about making each program print out “GO!!!!” 10 times?
Because repeating a line in a program 10 times without a loop would be terrible
programming practice, you’d need to use a loop for this task.
More than likely, you’re rather confused at this point and unable to modify the
MIPS program with all these suggested changes. And if you do have an
additional loop in your MIPS program correctly printing “GO!!!” 10 times, then
you are well on your way to being a good programmer!
So, what was the point of all this? These low level instructions may seem tedious and a bit silly, but the computer is able to run them directly on hardware because of their simplicity. A programmer who knows the language can write a program in it, and the computer can run that program directly without any further processing. As you have probably realised though, it is extremely time consuming to program this way. Moving values in and out of registers, implementing loops with jump and branch statements, and printing strings and integers using a three line pattern (one that you’d probably never have guessed was for printing had we not told you) leaves many more opportunities for bugs in the program. Not to mention, the resulting programs are extremely difficult to read and understand.
Because computers cannot directly run the instructions in the languages that
programmers like, high level programming languages by themselves are not
enough. The solution to this problem of different needs is to use a compiler or
interpreter that is able to convert a program in the high level programming
language that the programmer used into the machine code that the computer
is able to understand.
These days, few programmers program directly in these languages. In the early
days of computers, programs written directly in machine language tended to be
faster than those compiled from high level languages. This was because
compilers weren’t very good at minimising the number of machine language
instructions, referred to as optimizing, and people trained to write in machine
code were better at it. These days however, compilers have been made a lot
smarter, and can optimize code far better than most people can. Writing a
program directly in machine code may result in a program that is less
optimized than one that was compiled from a high level language. Don’t put in
your report that low level languages are faster!
This isn’t the full story; the MIPS machine code described here is an example of a Reduced Instruction Set Computer (RISC) architecture. Many computers these days use a Complex Instruction Set Computer (CISC) architecture, which means that the chips can be a little more clever and can do more in a single step. This is well beyond the scope of this book though; understanding the kinds of things RISC machine code can do, and the differences between MIPS and high level languages, is fine at this level, and fine for most computer scientists and software engineers.
Karen 12 12 14 18 17
James 9 7 1
Ben 19 17 19 13
Lisa 9 1 3 0
Amalia 20 20 19 15 18
Cameron 19 15 12 9 3
She realises she needs to know the average (assuming 5 quizzes) that each
student scored, and with many other things to do does not want to spend much
time on this task. Using Python, she can very quickly generate the data she
needs in less than 10 lines of code.
Note that understanding the details of this code is irrelevant to this chapter,
particularly if you aren’t yet a programmer. Just read the comments (the things
that start with a “#”) if you don’t understand, so that you can get a vague idea
of how the problem was approached.
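Her script might look something like the following sketch (the output file name and the exact details are our own choices):
# Read each student's line of scores, and write out the averages
scores_file = open("scores.txt")
averages_file = open("averages.txt", "w")
for line in scores_file:
    # The first word on the line is the name; the rest are the quiz scores
    values = line.split()
    name = values[0]
    # Convert the raw text scores into a list of numbers
    scores = [int(score) for score in values[1:]]
    # Average over the 5 quizzes (a missed quiz counts as 0)
    averages_file.write(name + " " + str(sum(scores) / 5) + "\n")
scores_file.close()
averages_file.close()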
This will generate a file that contains each student’s name followed by the
result of adding their scores and dividing the sum by 5. You can try the code if
you have Python installed on your computer (it won’t work on the online
interpreter, because it needs access to a file system). Just put the raw data into
a file called “scores.txt” in the same format it was displayed above. As long as
it is in the same directory as the source code file you make for the code, it will
work.
This problem could of course be solved in any language, but some languages
make it far simpler than others. Standard software engineering languages such as Java, which we talk about shortly, do not offer such straightforward file processing. Java requires the programmer to specify what to do if opening the file fails, in order to prevent the program from crashing. Python does not require the programmer to do this, although it does have the option to handle a failed file opening should the programmer wish to. Both these approaches have
advantages in different situations. For the teacher writing a quick script to
process the quiz results, it does not matter if the program crashes so it is ideal
to not waste time writing code to deal with it. For a large software system that
many people use, crashes are inconvenient and a security risk. Forcing all
programmers working on that system to handle this potential crash correctly
could prevent a lot of trouble later on, which is where Java’s approach helps.
In addition to straightforward file handling, Python did not require the code to be put inside a class or function, and it provided some very useful built-in functions for solving the problem, such as the one that found the sum of a list, and the line of code that converted the raw line of text into a list of numbers (using a very commonly used pattern).
This same program written in Java would require at least twice as many lines of
code.
There are many other scripting languages in addition to Python, such as Perl,
Bash, and Ruby.
3.3.2. Scratch
Scratch is a programming language used to teach people how to program. A
drag and drop interface is used so that new programmers don’t have to worry
so much about syntax, and programs written in Scratch are centered around
controlling cartoon characters or other sprites on the screen.
And this is the output that will be displayed when the green flag is clicked:
Scratch can be used for simple calculations, and for creating games and animations. However, it doesn't have all the capabilities of other languages.
Other educational languages include Alice and Logo. Alice also uses drag and
drop, but in a 3D environment. Logo is a very old general purpose language
based on Lisp. It is not used much anymore, but it was famous for having a
turtle with a pen that could draw on the screen, much like Scratch. The design
of Scratch was partially influenced by Logo. These languages are not used
beyond educational purposes, as they are slow and inefficient.
3.3.3. Java
Java is a popular general purpose software engineering language. It is used to
build large software systems involving possibly hundreds or even thousands of
software engineers. Unlike Python, it forces programmers to say how certain
errors should be handled, and it forces them to state what type of data their
variables are intended to hold, e.g. int (i.e. a number with no decimal places),
or String (some text data). Python does not require types to be stated like this.
All these features help to reduce the number of bugs in the code. Additionally,
they can make it easier for other programmers to read the code, as they can
easily see what type each variable is intended to hold. (Figuring this out in a Python program written by somebody else can be challenging at times, making it very difficult to modify their code without breaking it!)
This is the Java code for solving the same problem that we looked at in Python;
generating a file of averages.
import java.io.*;
import java.util.*;
While the code is longer, it ensures that the program doesn’t crash if something
goes wrong. It says to try opening and reading the file, and if an error occurs,
then it should catch that error and print out an error message to tell the user.
The alternative (such as in Python) would be to just let the program crash, preventing the rest of it from running. Regardless of whether or not an error occurs, the "I am finished!" line will be printed, because the error was safely “caught”. Python is able to do error handling like this, but it is up to the programmer to do it. Java will not even compile the code if this isn’t done! This prevents programmers from forgetting or just being lazy.
There are many other general software engineering languages, such as C# and
C++. Python is sometimes used for making large software systems, although it is generally not considered an ideal language for this role.
3.3.4. JavaScript
• Interpreted in a web browser
• Similar language: Actionscript (Flash)
Note that this section will be completed in a future version of the field guide.
For now, you should refer to the Wikipedia page for more information.
3.3.5. C
• Low level language with the syntax of a high level language
• Used commonly for programming operating systems, and embedded
systems
• Programs written in C tend to be very fast (because it is designed in a way
that makes it easy to compile it optimally into machine code)
• Bug prone due to the low level details. Best not used in situations where it
is unnecessary
• Related languages: C++ (somewhat)
Note that this section will be completed in a future version of the field guide.
For now, you should refer to the Wikipedia page for more information.
3.3.6. Matlab
• Used for writing programs that involve advanced math (calculus, linear
algebra, etc.)
• Not freely available
• Related languages: Mathematica, Maple
Note that this section will be completed in a future version of the field guide.
For now, you should refer to the Wikipedia page for more information.
You could even make your own programming language if you wanted to!
Since the computer hardware can only run programs in a low level language
(machine code), the programming system has to make it possible for your
Python instructions to be executed using only machine language. There are two
broad ways to do this: interpreting and compiling.
The main difference is that a compiler is a program that converts your program
to machine language, which is then run on the computer. An interpreter is a
program that reads your program line by line, works out what those instructions
are, and does them immediately.
There are advantages to both approaches, and each one suits some languages
better than others. In reality, most modern languages use a mixture of
compiling and interpreting. For example, most Java programs are compiled to
an "intermediate language" called ByteCode, which is closer to machine code
than Java. The ByteCode is then executed by an interpreter.
If your program is to be distributed for widespread use, you will usually want it
to be in machine code because it will run faster, the user doesn't have to have
an interpreter for your particular language installed, and when someone
downloads the machine code, they aren't getting a copy of your original high-
level program. Languages where this happens include C#, Objective C (used for
programming iOS devices), Java, and C.
Interpreted programs have the advantage that they can be easier to program
because you can test them quickly, trace what is happening in them more
easily, and even sometimes type in single instructions to see what they do,
without having to go through the whole compilation process. For this reason
they are widely used for introductory languages (for example, Scratch and Alice
are interpreted), and also for simple programs such as scripts that perform
simple tasks, as they can be written and tested quickly (for example, languages
like PHP, Ruby and Python are used in these situations).
The languages we have discussed in this chapter are ones that you are likely to
come across in introductory programming, but there are some completely
different styles of languages that have very important applications. There is an
approach to programming called Functional programming where all operations
are formulated as mathematical functions. Common languages that use
functional techniques include Lisp, Scheme, Haskell, Clojure and F#; even some
conventional languages (such as Python) include ideas from functional
programming. A pure functional programming style eliminates a problem called
side effects, and without this problem it can be easier to make sure a program
does exactly what it is intended to do. Another important type of programming
is logic programming, where a program can be thought of as a set of rules
stating what it should do, rather than instructions on how to do it. The most
well-known logic programming language is Prolog.
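For example, even in Python you can write in a functional style, applying a function to every value in a list without any side effects (a small illustration of the idea, not an example from the guide):
numbers = [1, 2, 3, 4]
squares = list(map(lambda x: x * x, numbers))
print(squares)  # prints [1, 4, 9, 16]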
Computers are becoming hundreds of times more powerful every decade, yet
there is one important component of the computer system that hasn't changed
significantly in performance since the first computers were developed in the
1940s: the human. For a computer system to work really well, it needs to be
designed by people who understand the human part of the system well.
In this chapter we'll look at what typically makes good and bad interfaces. The
idea is to make you sensitive to the main issues so that you can critique
existing interfaces, and begin to think about how you might design good
interfaces.
Try out the following interactive task, and get some friends to try it:
Did anyone get a wrong answer to the question even though they thought they
got it right? You may have noticed that the "Even" and "Odd" buttons
sometimes swap. Inconsistency is almost always a really bad thing in an
interface, as it can easily fool the user into making an error.
A lot of people might blame themselves for such errors, but basic psychology
says that this is a natural error to make, and a good system should protect
users from such errors (for example, by allowing them to be undone).
Designing good interfaces is very difficult. Just when you think you've got a
clever idea, you'll find that a whole group of people struggle to figure out how
to use it, or it backfires in some situation. Even worse, some computer
developers think that their users are dummies and that any interface problems
are the user’s fault and not the developer’s. But the real problem is that the
developer knows the system really well, whereas the user just wants to get
their job done without having to spend a lot of time learning the software – if
the software is too hard to use and they have a choice, they’ll just find
something else that’s easier. Good interfaces are worth a lot in the market!
There are many ways to evaluate and fine tune interfaces, and in this chapter
we'll look at some of these. One important principle is that one of the worst
people to evaluate an interface is the person who designed and programmed it.
They know all the details of how it works, they've probably been thinking about
it for weeks, they know the bits that you're not supposed to touch and the
options that shouldn't be selected, and of course they have a vested interest in
finding out what is right with it rather than what is wrong. It's also important
that the interface should be evaluated by someone who is going to be a typical
user; if you get a 12-year-old to evaluate a retirement planning system they
may not know what the user will be interested in; and if you get a teacher to try
out a system that students will use, they may know what the answers are and
what the correct process is.
Often interfaces are evaluated by getting typical users to try them out, and
carefully noting any problems they have. There are companies that do nothing
but these kinds of user tests – they will be given a prototype product, and pay
groups of people to try it out. A report on the product is then produced and
given to the people who are working on it. This is an expensive process, but it
makes the product a lot better, and may well give it a huge advantage over its
competitors. Having it evaluated by a separate company means that you avoid
any bias from people in your own company who want to prove (even
subconsciously) that they've done a good job designing it, rather than uncover
any niggling problems in the software that will annoy users.
Think about the kinds of considerations you would have to make for the
following user groups.
• Senior citizens
• Gamers
• Casual users
• Foreign visitors
Spoiler: Some possible answers: Don't open until you've thought about it!
• Senior citizens: use large print, have few features to learn, don't rely
so much on memory, allow for poor eyesight and less agile physically
(e.g. large buttons help), don't assume previous experience with
computers
• Gamers: use previous experience with typical game interfaces,
expecting challenges, probably running on a high end machine
• Casual users: interface needs to be very easy to learn, perhaps based
on widely used systems, needs clear guidance
• Foreign visitors: use simple language and meaningful images/icons
The interface is the only part of a program that the user sees (that's the
definition of an interface!), so if the interface doesn't work for them, then the
program doesn't work.
It's important to think through all the parts of a task when discussing an
interface, as it's the small steps in a task that make all the difference between
using the interface in a real situation, compared with a demonstration of some
features of the device.
It's very important to think about the whole context when describing a
task. As an exercise, can you provide an example of a real task, including
context, for a real person for each of the following:
Discuss your answers with a classmate or a friend. This should help you to
evaluate your own answer, and consider other possible examples.
Computer systems often make people feel dumb - in fact, there are lots of
"dummies" books available, such as "iPad for dummies" or "The Complete
Idiot's Guide to Microsoft Windows 8". These books sell millions of copies,
yet chances are the people who buy them are actually quite intelligent ---
it's just that the interfaces can make people so frustrated that they feel like
a dummy. The truth is that if an interface makes lots of people feel like an
idiot, chances are the real problem is with the interface and not the user. In
the past there has been a culture where the balance of power was with the
programmers who designed a system, and they could afford to blame the
users for any problems. However, now users have a large choice of
systems, and there are competitors ready to offer a better interface, so if a
programmer continually blames their users for problems, chances are it's
the programmer who is the complete idiot! If you hear people using
derogatory terms such as luser, PEBKAC, or ID-10T error (Idiot error), they
may be humorous, but they usually show a disregard for the importance of
getting an interface right, and are a sign that the system is badly designed.
For this project, try sending an email from both a computer and a mobile
phone. Take note of all the steps required from when you start using the
device until the email is sent.
You will probably notice quite a few differences between the two interfaces.
Keep your notes for later, as you can further analyse them once you have
read through more of this chapter.
• 87 people were killed when Air Inter Flight 148 crashed due to the
pilots entering "33" to get a 3.3 degree descent angle, but the same
interface was used to enter the descent rate, which the autopilot
interpreted as 3,300 feet per minute. This interface problem is called a
"mode error" (described later). There is more information here.
• 13 people died and many more were injured when the pilots of Varig
Flight 254 entered an incorrect heading. The flight plan had specified
a heading of 0270, which the captain interpreted and entered into the
flight computer as 270 degrees. What it actually meant was 027.0
degrees. This confusion came about due to the format of headings
and the position of the decimal point on flight plans being changed
without him knowing. Unfortunately, the co-pilot mindlessly copied the
captain's heading instead of reading it off the flight plan like he was
supposed to. The plane then cruised on autopilot for a few hours.
Unfortunately, confirmation bias got the better of the pilots who were
convinced they were near their destination, when in fact they were
hundreds of miles away. The plane ran out of fuel and crash landed in
the Amazon Jungle. Designing aircraft systems which work for humans
is a big challenge, and is a part of the wider area of human factors
research.
• A bank employee accidentally gave a customer a loan of $10 million
instead of $100,000. The customer withdrew most of the money and
fled to Asia, the bank lost millions of dollars in the process, and the
teller concerned would have had a traumatic time just because of a
typing error. The error was due to the employee typing in two extra
zeroes, apparently because some interfaces automatically put in the
decimal point (you could type 524 to enter $5.24), and others didn't.
This error can be explained in terms of a lack of consistency in the
interface, causing a mode error.
• A 43-year old woman suffered respiratory arrest after a nurse
accidentally entered 5 instead of 0.5 for a dose rate for morphine. The
interface should have made it difficult to make an error by a factor of
10. There is a paper about it, and an article about the interface
problem. Similar problems can occur in any control system where the
operator types in a value; a better interface would force the operator
to press an "up" and "down" button, so big changes take a lot of work
(this is an example of an "off by one error", where one extra digit is
missed or added, and also relates to the principle of commensurate
effort).
In all these cases the fault could be blamed on the user (the pilots, the
bank teller and the nurse) for making a mistake, but a well designed
interface that doesn't cause serious consequences from mistakes that
humans can easily make would be much better.
There are many elements that can be considered in usability, and we will
mention a few that you are likely to come across when evaluating everyday
interfaces. Bear in mind that the interface might not be on a conventional
computer – any digital device, such as an alarm clock, air conditioning remote
control, microwave oven, or burglar alarm, can suffer from usability problems.
4.3.1. Consistency
A "golden rule" of usability is consistency. If a system keeps changing on you,
it's going to be frustrating to use. Earlier we had the example of an "Even"/"Odd"
button pair that occasionally swapped places. A positive example is the
consistent use of "control-C" and "control-V" in many different programs to copy
and paste text or images. This also helps learnability: once you have learned
copy and paste in one program, you know how to use it in many others.
Imagine if every program used different menu commands and keystrokes for
this!
A related issue is the Mode error, where the behaviour of an action depends on
what mode you are in. A simple example is having the caps lock key down
(particularly for entering a password, where you can't see the effect of the
mode). A classic example is in Excel spreadsheets, where the effect of clicking
on a cell depends on the mode: sometimes it selects the cell, and other times it
puts the name of the cell you clicked on into another cell. Modes are considered
bad practice in interface design because they can easily cause the user to
make the wrong action, and should be avoided if possible.
The following interactive lets you find out how fast "instant" is for you. As you
click on each cell, there will sometimes be a random delay before it comes up;
other cells won't have a delay. Click on each cell, and if it seems to respond
instantly, leave it as it is. However, if you perceive that there is a small delay
before the image comes up, click it again (which makes the cell green). Just
make a quick, gut-level decision the first time you click each cell - don't
overthink it. The delay may be very short, but only make the cell green if you
are fairly sure you noticed a delay.
Once you have clicked on all the cells, click on "View statistics" to see how long
the delays were compared with your perception. 100 ms (100 milliseconds) is
one tenth of a second; for most people this is where they are likely to start
perceiving a delay; anything shorter (particularly around 50 ms) is very hard to
notice. Longer delays (for example, 350 ms, which is over a third of a second)
are very easy to notice.
The point of this is that any interface element (such as a button or checkbox)
that takes more than 100 ms to respond is likely to be perceived by the user as
not working, and they may even click it again. In the case of a checkbox, this
may lead to it staying off (from the two clicks), making the user think that it's
not working. Try clicking the following checkbox just enough times to make it
show as selected.
So, as you evaluate interfaces, bear in mind that even very small delays can
make a system hard to use.
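As a rough sketch of how you might check this threshold yourself, the
following Python snippet times a simulated button action and flags any
response slower than 100 ms (the handle_click function is a made-up stand-in
for real interface work):

import random
import time

def handle_click():
    # Simulate an interface action that sometimes responds slowly.
    time.sleep(random.choice([0.02, 0.05, 0.35]))

start = time.perf_counter()
handle_click()
elapsed_ms = (time.perf_counter() - start) * 1000
if elapsed_ms > 100:
    print(f"{elapsed_ms:.0f} ms - users will perceive this delay")
else:
    print(f"{elapsed_ms:.0f} ms - feels instant to most users")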
The following video is of an experiment that was done with virtual reality
goggles to simulate Internet lag in real life situations. It has English captions,
but the most interesting part is what is happening in the action.
There's some more information about "time limits" for interfaces in this article
by Jakob Nielsen.
Another place that the layout of an interface changes quickly is when a tablet
or smartphone is rotated. Some devices rearrange the icons for the new
orientation, which loses the spatial layout, while others keep them the same
(except they may not look right in the new rotation). Try a few different devices
and see which ones change the layout when rotated.
There are a number of other situations where the layout can change
suddenly for the user, and create confusion. Here are some examples:
• The layout may change if a data projector is plugged in and the screen
resolution changes (which is particularly frustrating because the user
may well be about to present to an audience, and they can't find an
icon, with the added awkwardness that lots of people are waiting).
• If you upgrade to a different size device (such as a larger monitor or
different smartphone) you may have to re-learn where everything is.
• Layouts often change with new versions of software (which is one
reason that upgrading every time a new version comes out may not
be the best plan).
• Using the same software on a different operating system may have
subtly different layout (e.g. if someone who uses the Chrome browser
all the time on Windows starts using Chrome under MacOS). This can
be particularly frustrating because the location of common controls
(close/maximise window, and even the control key on the keyboard) is
different, which frustrates the user's spatial memory.
• The Microsoft Word "ribbon" was particularly frustrating for users
when it came out for several of the reasons already mentioned -- the
position of each item was quite different to the previous versions.
• Adaptive interfaces can also be a problem; it might seem like a good
idea to gradually change a menu in a program so that the frequently
used items are near the top, or unused items are hidden, but this can
lead to a frustrating treasure hunt for the user as they can't rely on
their spatial memory to find things.
Associated with spatial memory is our muscle memory, which helps us to locate
items without having to look carefully. With some practice, you can probably
select a common button with a mouse just by moving your hand the same
distance that you always have, rather than having to look carefully. Working
with a new keyboard can mean having to re-learn the muscle memory that you
have developed, and so may slow you down a bit or cause you to press the
wrong keys.
4.3.5. Missing the button
One common human error that an interface needs to take account of is the off
by one error, where the user accidentally clicks or types on an item next to the
one they intended. For example, if the "save" menu item is next to a "delete"
menu item, that is risky because one small slip could cause the user to erase a
file instead of saving it. A similar issue occurs on keyboards; for example,
control-W might close just one window in a web browser, and control-Q might
close the entire web-browser, so choosing these two adjacent keys is a
problem. Of course, this can be fixed by either checking if the user quits, or by
having all the windows saved so that the user just needs to open the browser
again to get their work back. This can also occur in web forms, where there is a
reset button next to the submit button, and the off-by-one error causes the user
to lose all the data they just entered.
4.3.7. In summary
These are just a few ideas from HCI that will help you to be aware of the kinds
of issues that interfaces can have. In the following project you can observe
these kinds of problems firsthand by watching someone else use an interface,
noting any problems they have. It's much easier to observe someone else than
do this yourself, partly because it's hard to concentrate on the interface and
take notes at the same time, and partly because you might already know the
interface and have learned to overcome some of the less usable features.
In a think aloud protocol, you observe someone else using the interface
that you want to evaluate, and encourage them to explain what they're
thinking at each step. You'll take notes on what they say, and you can
reflect on that afterwards to evaluate the interface (it can be helpful to
record the session if possible).
For example, if someone setting an alarm clock says "I'm pressing the up
button until I get to 7am - oh bother, it stopped at 7:09, now I have to go
right around again", that gives some insight into how the interface might
get in the way of the users completing a task efficiently.
The project will be more interesting if your helper isn't completely familiar
with the system, or if it's a system that people often find confusing or
frustrating. Your writeup could be used to make decisions about improving
the system in the future.
The task could be things like setting the time on a clock, finding a recently
dialled number on an unfamiliar phone, or choosing a TV program to
record.
To do the evaluation, you should give the device to your helper, explain the
task to them, and ask them to explain what they are thinking at each step.
Your helper may not be used to doing that, so you can prompt them as they
go with questions like:
If they get the hang of "thinking aloud", just keep quiet and take notes on
what they say.
It's very important not to criticise or intimidate the helper! If they make a
mistake, try to figure out how the interface made them do the wrong thing,
rather than blaming them. Any mistakes they make are going to be
valuable for your project! If they get everything right, it won't be very
interesting.
Once you've noted what happened, go over it, looking for explanations for
where the user had difficulty. The examples earlier in the chapter will help
you to be sensitive to how interfaces can frustrate the user, and find ways
that they could be improved.
The goal of the cognitive walkthrough is to identify if the user can see what
to do at each step, and especially to notice if there is anything that is
confusing or ambiguous (like which button to press next), and to find out if
they're confident that the right thing happened.
The task only needs to have about 3 or 4 steps (e.g. button presses), as
you'll be asking three questions at each step and taking notes on their
responses, so it could take a while. You should know how to do the task
yourself as we'll be focussing on the few steps needed to accomplish the
task; if the user goes off track, you can put them back on task rather than
observe them trying to recover from an HCI problem that shouldn't have
been there in the first place. The task might be something like recording a
10-second video on a mobile phone, deleting a text message, or setting a
microwave oven to reheat food for 45 seconds.
• do you know what to try to do at this step? Then have them look at
the interface, and ask:
• can you see how to do it? Then have them take the action they
suggested, and ask:
• are you able to tell that you did the right thing?
If their decisions go off track, you can reset the interface, and start over,
explaining what to do for the step they took wrong if necessary (but noting
that this wasn't obvious to them --- it will be a point to consider for
improving the interface.)
Once the first action has been completed, repeat this with the next action
required (it might be pressing a button or adjusting a control). Once again,
ask the three questions above in the process.
In practice the second question (can you see how to do it?) is usually split
into two: do they notice the control at all, and if so, do they realise that it's
the one that is needed? For this exercise we'll simplify it to just one
question.
There are various sets of heuristics that people have proposed for evaluating
interfaces, but a Danish researcher called Jakob Nielsen has come up with a set
of 10 heuristics that have become very widely used, and we will describe them
in this section. If you encounter a usability problem in an interface, it is almost
certainly breaking one of these heuristics, and possibly a few of them. It's not
easy to design a system that doesn't break any of the heuristics, and
sometimes you wouldn't want to follow them strictly – that's why they are
called heuristics, and not rules.
This heuristic states that a user should be able to see what the device is doing
(the system's status), at all times. This varies from the user being able to tell if
the device is turned on or off, to a range of actions. A classic example is the
"caps lock" key, which may not clearly show if it is on, and when typing a
password the user might not know why it is being rejected; a positive example
of this is when a password entry box warns you that the caps lock key is on.
There are many tasks that users ask computers to do that require some time,
including copying documents, downloading files, and loading video games. In
this situation, one of the most common ways to keep a user informed of the
task is the progress bar.
There are some other important delay periods in interface evaluation: a delay
of around 1 second is where natural dialogues start to get awkward, and
around 10 seconds puts a lot of load on the user to remember what they were
doing. Nielsen has an article about the importance of these time periods. If you
want to test these ideas, try having a conversation with someone where you
wait 3 seconds before each response; or put random 10 second delays in when
you're working on a task!
The language, colours and notation in an interface should match the user's
world, and while this seems obvious and sensible, it's often something that is
overlooked. Take for example the following two buttons – can you see what is
confusing about them?
The following interface is from a bank system for paying another person.
Suppose you get an email asking you to pay someone $1699.50 for a used car;
try entering "$1699.50" into the box.
The notation "$1699.50" is a common way to express a dollar amount, but this
system forces you to follow its own conventions (probably to make things
easier for the programmer who wrote the system).
Try to find some other amounts that should be valid, but that this system
rejects. Ideally, the system should be flexible about the text it accepts, to
prevent errors.
The dialogue also rejects commas in the input (e.g. "1,000"), even though
they are a very useful way to read dollar amounts: it's hard to tell
1000000 and 100000 apart at a glance, yet the difference could be huge! It
also doesn't allow a space before or after the number, even though a
number copied and pasted from an email might look perfectly alright with
one. A less lazy programmer would allow for these situations; the current
version is probably using a simple number conversion system that saves
having to do extra programming...
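Being flexible doesn't take much extra programming. Here is a sketch in
Python of a more forgiving approach that strips the dollar sign, commas,
and surrounding spaces before converting (parse_amount is our own
illustrative helper, not part of any real banking system):

# A forgiving money parser: accepts "$1,699.50", " 1699.50 ", "1699.5", etc.
def parse_amount(text):
    cleaned = text.strip().lstrip("$").replace(",", "")
    return round(float(cleaned), 2)

print(parse_amount("$1,699.50"))   # 1699.5
print(parse_amount(" 1,699.50 "))  # 1699.5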
4.4.3. User control and freedom
Users often choose system functions by mistake and will need a clearly marked
"emergency exit" to leave the unwanted state without having to go through an
extended dialogue. Support undo and redo.
It is very frustrating to make a mistake and not be able to get out of it. It is
particularly bad if one small action can wipe a lot of work that can't be
recovered. The reset button on some web forms is infamous for this – it is often
next to the submit button, and you can wipe all your data with an off-by-one
error.
A common way to provide user freedom is an "undo" feature, which means that
not only can mistakes be fixed easily, but the user is encouraged to
experiment, trying out features of the interface secure in the knowledge that
they can just "undo" to get back to how things were, instead of worrying that
they'll end up in a state that they can't fix. If "redo" is also available, they can
flick back and forth, deciding which is best. (In fact, redo is really an undo for
undo!)
Here's an example of a button that doesn't provide user control; if you press it,
you'll lose this whole page and have to find your way back (we warned you!)
Sometimes the interface can force the user into doing something they don't
want to do. For example, it is quite common for operating systems or programs
to perform updates automatically that require a restart. Sometimes the
interface doesn't give the user the opportunity to cancel or delay this, and
the system restarts nevertheless. This is bad if it happens when the user is
just about to give a
presentation.
Another common form of this problem is not being able to quit a system. A
positive example is the "home" button on smartphones, which almost always
stops the current app that is in use.
A lack of consistency is often the reason behind people not liking a new system.
It is particularly noticeable between Mac and Windows users; someone who has
only used one system can find the other very frustrating to use because so
many things are different (consider the window controls for a start, which are in
a different place and have different icons). An experienced user of one interface
will think that it is "obvious", and can't understand why the other person finds it
frustrating, which can lead to discussions of religious fervour on which interface
is best. Similar problems can occur when a radically different version of an
operating system comes out (such as Windows 8); a lot of the learning that has
been done on the previous system needs to be undone, and the lack of
consistency (i.e. losing prior learning) is frustrating.
A computer program shouldn't make it easy for people to make serious errors.
An example of error prevention found in many programs is a menu item on a
toolbar or dropdown being 'greyed out' or deactivated. It stops the user from
using a function that shouldn't be used in that situation, like trying to copy
when nothing is selected. A good program would also inform the user why an
item is not available (for example in a tooltip).
Below is a date picker; can you see what errors can be produced with it?
The date picker allows the user to choose invalid dates, such as Feb 30, or
Nov 31. The three-menu date picker is hard to get right, because each
menu item limits what can be in the others, but any can be changed. For
example, you might pick 29 Feb 2008 (a valid date), then change the year
to 2009 (not valid), then back to 2008. When 2009 was chosen the day
number would need to change to 28 to prevent errors, but if that was just
an accident and the user changes back to 2008, the number has now
changed, and might not be noticed. It's preferable to use a more
sophisticated date picker that shows a calendar, so the user can only click
on valid dates (many websites will offer this). Date picking systems usually
provide a rich example for exploring interface issues!
A related problem with dates is when a user needs to pick a start and end date
(for example, booking flights or a hotel room); the system should prevent a
date prior to the first date being selected for the second date.
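Very little code is needed to prevent both kinds of error. As a sketch,
Python's standard datetime module refuses to construct an invalid date, and
dates can be compared directly to catch an end date that is before the start
date:

from datetime import date

# Constructing an invalid date fails immediately, so "Feb 30" can't happen.
try:
    picked = date(2009, 2, 29)   # 2009 is not a leap year
except ValueError as error:
    print("Invalid date:", error)

# Comparing dates directly catches a return date before the departure date.
start, end = date(2024, 3, 10), date(2024, 3, 5)
if end < start:
    print("The second date cannot be before the first date")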
Here's a menu system that offers several choices:
Any time a dialogue box comes up that says you weren't allowed to do a
certain action, it's frustrating because the system has failed to prevent an error.
Of course, it may be difficult to do that because the error can depend on so
many user choices, but it is ideal that the system doesn't offer something that
it can't do.
And here's another example, this time with a computer science slant: the
following calculator has a binary mode, where it does calculations on binary
numbers. The trouble is that in this mode you can still type in decimal digits,
which gives an error when you do the calculation. A user could easily not notice
that it's in binary mode, and the error message isn't particularly helpful!
4.4.6. Recognition rather than recall
Minimize the user's memory load by making objects, actions, and options
visible. The user should not have to remember information from one part of the
dialogue to another. Instructions for use of the system should be visible or
easily retrievable whenever appropriate.
Humans are generally very good at recognising items, but computers are good
at remembering them accurately. A good example of this is a menu system; if
you click on the "Edit" menu in a piece of software, it will remind you of all the
editing tasks available, and you can choose the appropriate one easily. If
instead you had to type in a command from memory, that would put more load
on the user. In general it's good for the computer to "remember" details, and
the user to be presented with options rather than having to remember them.
The exception is a system that is used all the time by an expert who knows all
the options; in this case entering commands directly can sometimes be more
flexible and faster than having to select from a list.
For example, when you type in a place name in an online map, the system
might start suggesting names based on what you're typing, and probably
adapted to focus on your location or past searches. The following image is from
Google maps, which suggests the name of the place you may be trying to type
(in this case, the user has only had to type 4 letters, and the system saves
them from having to recall the correct spelling of
"Taumatawhakatangihangakoauauotamateapokaiwhenuakitanatahu" because
they can then select it.) A similar feature in web browsers saves users from
having to remember the exact details of a URL that they have used in the past;
a system that required you to type in place names exactly before you could
search for them could get rather frustrating.
4.4.7. Flexibility and efficiency of use
Accelerators -- unseen by the novice user -- may often speed up the interaction
for the expert user such that the system can cater to both inexperienced and
experienced users. Allow users to tailor frequent actions.
When someone is using software every day, they will soon have common
sequences of operations they do (such as "Open the file, find the next blank
space, type in a record of what just happened"). It's good to offer ways to make
this quick to do, such as "macros", which do a sequence of actions from a
single keystroke.
An important area of research in HCI is working out how to make shortcuts easy
to learn. You don't want them to get in the way for beginners, but you don't
want frequent users to be unaware of them either. A simple way of doing this is
having keystroke equivalents in a menu (an accelerator); the menu displayed
here shows that shift-command-O will open a new project, so the user can learn
this sequence if they are using the command frequently.
The following site identified some of the "scariest" interfaces around, some
of which are great examples of not having minimalist design: OK/Cancel
scariest interface.
Cartoonist Roz Chast illustrates how scary a remote control can be with her
cartoon "How Grandma sees the remote".
It’s not hard to find error messages that don’t really tell you what’s wrong! The
most common examples are messages like "Misc error", "Error number -2431",
or "Error in one of the input values". These force the user to go on a debugging
mission to find out what went wrong, which could be anything from a
disconnected cable or unfixable compatibility issue, to a missing digit in a
number.
A variant of unhelpful error messages is one that gives two alternatives, such
as "File may not exist, or it may already be in use". A better message would
save the user having to figure out which of these is the problem.
A positive example can be found in some alarm clocks, such as one on an
Android smartphone that confirms how far away the alarm is when it is set.
If the alarm time is simply shown as "9:00", a user in a country that uses
12-hour time might mistake this for 9pm, and the alarm would go off at the
wrong time; a confirmation that the alarm will go off in, say, 10 hours
makes such a mix-up obvious.
There are many other ideas from psychology, physiology, sociology and even
anthropology that HCI experts must draw on. Things that come into play include:
• Mental models: how someone believes a system works compared with how it
actually works (these are almost never the same, e.g. double clicking on an
icon that only needs to be single clicked).
• Fitts's law: how long it takes to point to objects on a screen (such as
clicking on a small button).
• The Hick-Hyman law: how long it takes to make a decision between multiple
choices (such as from a menu).
• Miller's law: the number of items a person can think about at once.
• Affordances: how properties of an object help us to perform actions on
them.
• Interaction design (IxD): creating digital devices that work for the people
who will use the product.
• The NASA TLX (Task Load Index): rating the perceived workload that a task
puts on a user.
There are many more laws, observations and guidelines about designing
interfaces that take account of human behaviour and how the human body
functions.
• The cs4fn website has a lot of articles and activities on Human Computer
Interaction, such as problems around reporting interface problems,
cultural issues in interface design, and The importance of Sushi.
The idea that everything stored and transmitted in our digital world is stored
using just two values might seem somewhat fantastic, but here's an exercise
that will give you a little experience using just black and white cards to
represent numbers. In the following interactive, click on the last card (on the
right) to reveal that it has one dot on it. Now click on the previous card, which
should have two dots on it. Before clicking on the next one, how many dots do
you predict it will have? Carry on clicking on each card moving left, trying to
guess how many dots each has.
The challenge for you now is to find a way to have exactly 22 dots showing (the
answer is in the spoiler below). Now try making up other numbers of dots, such
as 11, 29 and 19. Is there any number that can't be represented? To test this,
try counting up from 0.
You may have noticed that each card shows twice as many dots as the one
to its right. This is an important pattern in data representation on
computers.
The number 22 requires the cards to be "white, black, white, white, black",
11 is "black, white, black, white, white", 29 is "white, white, white, black,
white", and 19 is "white, black, black, black, white".
You should have found that any number from 0 to 31 can be represented with 5
cards. Each of the numbers could be communicated using just two words: black
and white. For example, 22 dots is "white, black, white, white, black". Or you
could decode "black, black, white, white, white" to the number 7. This is the
basis of data representation - anything that can have two different states can
represent anything on a digital device.
When we write what is stored in a computer on paper, we normally use “0” for
one of the states, and “1” for the other state. For example, a piece of computer
memory could have the following voltages:
low low high low high high high high low high low low
We could allocate “0” to “low”, and “1” to “high” and write this sequence
down as:
0 0 1 0 1 1 1 1 0 1 0 0
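A single line of Python is enough to perform this mapping from physical
states to digits (the list of states here is just the sequence above):

# Map each physical state to a digit: low becomes 0, high becomes 1.
states = "low low high low high high high high low high low low"
bits = "".join("1" if s == "high" else "0" for s in states.split())
print(bits)   # 001011110100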
While this notation is used extensively, and you may often hear the data being
referred to as being “0’s and 1’s”, it is important to remember that a computer
does not literally store 0’s and 1’s; it has no way of doing this. Instead, it
uses physical mechanisms such as high and low voltage, north or south magnetic
polarity, and light or dark materials.
Every file you save, every picture you make, every download, every digital
recording, every web page is just a whole lot of bits. These binary digits are
what make digital technology digital! And the nature of these digits unlocks a
powerful world of storing and sharing a wealth of information and
entertainment.
Computer scientists don't spend a lot of time reading bits themselves, but
knowing how they are stored is really important because it affects the amount
of space that data will use, the amount of time it takes to send the data to a
friend (as data that takes more space takes longer to send!) and the quality of
what is being stored. You may have come across things like "24-bit colour",
"128-bit encryption", "32-bit IPv4 addresses" or "8-bit ASCII". Understanding
what the bits are doing enables you to work out how much space will be
required to get high-quality colour, hard-to-crack secret codes, a unique ID for
every device in the world, or text that uses more characters than the usual
English alphabet.
This chapter is about some of the different methods that computers use to code
different kinds of information in patterns of these bits, and how this affects the
cost and quality of what we do on the computer, or even if something is
feasible at all.
When working through the material in this section, a good way to draw
braille on paper without having to actually make raised dots is to draw a
rectangle with 6 small circles in it, and to colour in the circles that are
raised, and not colour in the ones that aren’t raised.
Let's work out how many different patterns can be made using the 6 dots in a
Braille character. If Braille used only 2 dots, there would be 4 patterns, and
with 3 dots there would be 8 patterns.
You may have noticed that there are twice as many patterns with 3 dots as
there are with 2 dots. It turns out that every time you add an extra dot, that
gives twice as many patterns, so with 4 dots there are 16 patterns, 5 dots has
32 patterns, and 6 dots has 64 patterns. Can you come up with an explanation
as to why this doubling of the number of patterns occurs?
Spoiler: Why does adding one more dot double the number of possible
patterns?
The reason that the number of patterns doubles with each extra dot is that
with, say, 3 dots you have 8 patterns, so with 4 dots you can use all the 3-
dot patterns with the 4th dot flat, and all of them with it raised. This gives
16 4-dot patterns. And then, you can do the same with one more dot to
bring it up to 5 dots. This process can be repeated infinitely.
So, Braille, with its 6 dots, can make 64 patterns. That's enough for all the
letters of the alphabet, and other symbols too, such as digits and punctuation.
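You can confirm the doubling pattern with a quick calculation; in Python,
2 ** n gives the number of patterns for n dots:

# Each extra dot doubles the number of patterns.
for dots in range(1, 7):
    print(dots, "dots:", 2 ** dots, "patterns")   # 2, 4, 8, 16, 32, 64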
Digital devices almost always use two values (binary) for similar reasons:
computer disks and memory can be made cheaper and smaller if they only
need to be able to distinguish between two extreme values (such as a high and
low voltage), rather than fine-grained distinctions between very subtle
differences in voltages. Using ten digits (like we do in our everyday decimal
counting system) would require reliably telling apart ten different voltage
levels, which is far more error-prone.
Why are digital systems so hung up on only using two digits? After all,
couldn't you do all the same things with a 10-digit system?
5.3. Numbers
In this section, we will look at how computers represent numbers. To begin with,
we'll revise how the base-10 number system that we use every day works, and
then look at binary, which is base-2. After that, we'll look at some other
characteristics of numbers that computers must deal with, such as negative
numbers and numbers with decimal points.
In decimal, the value of each digit in a number depends on its place in the
number. For example, in $123, the 3 represents $3, whereas the 1 represents
$100. Each place value in a number is worth 10 times more than the place
value to its right, i.e. there are the “ones”, the “tens”, the “hundreds”, the
“thousands” the “ten thousands”, the “hundred thousands”, the “millions”, and
so on. Also, there are 10 different digits (0,1,2,3,4,5,6,7,8,9) that can be at
each of those place values.
If you were only able to use one digit to represent a number, then the largest
number would be 9. After that, you need a second digit, which goes to the left,
giving you the next ten numbers (10, 11, 12... 19). It's because we have 10
digits that each one is worth 10 times as much as the one to its right.
All this probably sounds really obvious, but it is worth thinking about
consciously, because binary numbers have the same properties.
Binary works in a very similar way to decimal, even though it might not initially
seem that way. Because there are only 2 digits, this means that each digit is 2
times the value of the one immediately to the right.
The interactive below illustrates how this binary number system represents
numbers. Have a play around with it to see what patterns you can see.
What is the largest number you can make with the interactive? What is the
smallest? Is there any integer value in between the biggest and the smallest
that you can’t make? Are there any numbers with more than one
representation? Why or why not?
You have probably noticed from the interactive that when set to 1, the leftmost
bit (the “most significant bit”) adds 32 to the total, the next adds 16, and then
the rest add 8, 4, 2, and 1 respectively. When set to 0, a bit does not add
anything to the total. So the idea is to make numbers by adding some or all of
32, 16, 8, 4, 2, and 1 together, and each of those numbers can only be included
once.
Choose a number less than 61 (perhaps your house number, your age, a
friend's age, or the day of the month you were born on), set all the binary digits
to zero, and then start with the left-most digit (32), trying out if it should be
zero or one. See if you can find a method for converting the number without
too much trial and error. Try different numbers until you find a quick way of
doing this.
Can you figure out the binary representation for 23 without using the
interactive? What about 4, 0, and 32? Check all your answers using the
interactive to verify that they are correct.
Using your new knowledge of the binary number system, can you figure
out a way to count to higher than 10 using your 10 fingers? What is the
highest number you can represent using your 10 fingers? What if you
included your 10 toes as well (so you have 20 fingers and toes to count
with)?
The interactive used exactly 6 bits. In practice, we can use as many or as few
bits as we need, just like we do with decimal. For example, with 5 bits, the
place values would be 16, 8, 4, 2 and 1, so the largest value is 11111 in binary,
or 31 in decimal. Representing 14 with 5 bits would give 01110.
The answers are (spaces are added to make the answers easier to read,
but are not required): 23 is 01 0111, 4 is 00 0100, 0 is 00 0000, and
32 is 10 0000.
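If you want to check conversions like these yourself, Python can display a
number in binary with a chosen number of bits (a quick verification aid,
using the built-in format and int functions):

print(format(14, '05b'))   # 01110 - 14 as a 5-bit binary number
print(format(31, '05b'))   # 11111 - the largest 5-bit value
print(int('01110', 2))     # 14 - converting back to decimal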
An important concept with binary numbers is the range of values that can be
represented using a given number of bits. When we have 8 bits the binary
numbers start to get useful --- they can represent values from 0 to 255, so it is
enough to store someone's age, the day of the month, and so on.
Groups of 8 bits are so useful that they have their own name: a byte.
Computer memory and disk space are usually divided up into bytes, and
bigger values are stored using more than one byte. For example, two bytes
(16 bits) are enough to store numbers from 0 to 65,535. Four bytes (32
bits) can store numbers up to 4,294,967,295. You can check these numbers
by working out the place values of the bits. Every bit that's added will
double the range of the number.
In practice, computers store numbers with either 16, 32, or 64 bits. This is
because these are whole numbers of bytes (a byte is 8 bits), which makes it
easier for computers to know where each number starts and stops.
Luckily it's possible to use binary notation for birthday candles --- each
candle is either lit or not lit. For example, if you are 18, the binary notation
is 10010, and you need 5 candles (with only two of them lit).
It's a lot smarter to use binary notation on candles for birthdays as you get older,
as you don't need as many candles.
Writing out long binary numbers is tedious --- for example, suppose you need to
copy down the 16-bit number 0101001110010001. A widely used shortcut is to
break the number up into 4-bit groups (in this case, 0101 0011 1001 0001),
and then write down the digit that each group represents (giving 5391). There's
just one small problem: each group of 4 bits can go up to 1111, which is 15,
and the digits only go up to 9.
The solution is simple: we introduce symbols for the digits from 1010 (10) to
1111 (15), which are just the letters A to F. So, for example, the 16-bit binary
number 1011 1000 1110 0001 can be written more concisely as B8E1. The "B"
represents the binary 1011, which is the decimal number 11, and the E
represents binary 1110, which is decimal 14.
Because we now have 16 digits, this representation is base 16, and known as
hexadecimal (or hex for short). Converting between binary and hexadecimal is
very simple, and that's why hexadecimal is a very common way of writing down
large binary numbers.
Here's a full table of all the 4-bit numbers and their hexadecimal digit
equivalent:
Binary Hex
0000 0
0001 1
0010 2
0011 3
0100 4
0101 5
0110 6
0111 7
1000 8
1001 9
1010 A
1011 B
1100 C
1101 D
1110 E
1111 F
For example, the largest 8-bit binary number is 11111111. This can be written
as FF in hexadecimal. Both of those representations mean 255 in our
conventional decimal system (you can check that by converting the binary
number to decimal).
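Python understands all three notations, using the 0b prefix for binary and 0x
for hexadecimal, so you can check conversions like these yourself:

print(hex(0b1011100011100001))   # 0xb8e1
print(0xFF)                      # 255
print(format(255, '08b'))        # 11111111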
Which notation you use will depend on the situation; binary numbers represent
what is actually stored, but can be confusing to read and write; hexadecimal
numbers are a good shorthand for the binary; and decimal numbers are used if
you're trying to understand the meaning of the number or doing normal math.
All three are widely used in computer science.
Some of the things that we might think of as numbers, such as the telephone
number (03) 555-1234, aren't actually stored as numbers, as they contain
important characters (like dashes and spaces) as well as the leading 0 which
would be lost if it was stored as a number (the above number would come out
as 35551234, which isn't quite right). These are stored as text, which is
discussed in the next section.
On the other hand, things that don't look like a number (such as "30 January
2014") are often stored using a value that is converted to a format that is
meaningful to the reader (try typing two dates into Excel, and then subtract
one from the other --- the result is a useful number). In the underlying
representation, a number is used. Program code is used to translate the
underlying representation into a meaningful date on the user interface.
The difference between two dates in Excel is the number of days between
them; the date itself (as in many systems) is stored as the amount of time
elapsed since a fixed date (such as 1 January 1900). You can test this by
typing a date like "1 January 1850" --- chances are that it won't be
formatted as a normal date. Likewise, a date sufficiently in the future may
behave strangely due to the limited number of bits available to store the
date.
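Python's date type works the same way, and makes the underlying number easy
to inspect; its toordinal method counts days from a fixed starting point
(1 January of year 1, rather than 1900):

from datetime import date

# Subtracting two dates gives the number of days between them.
print((date(2014, 1, 30) - date(2014, 1, 1)).days)   # 29

# Internally, each date is just a count of days from a fixed starting point.
print(date(2014, 1, 30).toordinal())                 # 735263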
Numbers are used to store things as diverse as dates, student marks, prices,
statistics, scientific readings, sizes and dimensions of graphics.
Any system that stores numbers needs to make a compromise between the
number of bits allocated to store the number, and the range of values that can
be stored.
In some systems (like the Java and C programming languages and databases)
it's possible to specify how accurately numbers should be stored; in others it is
fixed in advance (such as in spreadsheets).
Some are able to work with arbitrarily large numbers by increasing the space
used to store them as necessary (e.g. integers in the Python programming
language). However, it is likely that these are still working with a multiple of 32
bits (e.g. 64 bits, 96 bits, 128 bits, 160 bits, etc). Once the number is too big to
fit in 32 bits, the computer would reallocate it to have up to 64 bits.
In some programming languages there isn't a check for when a number gets
too big (overflows). For example, if you have an 8-bit number using two's
complement, then 01111111 is the largest number (127), and if you add one
without checking, it will change to 10000000, which happens to be the number
-128. This can cause serious problems if not checked for, and is behind a
variant of the Y2K problem, called the Year 2038 problem, involving a 32-bit
number overflowing for dates on Tuesday, 19 January 2038.
On tiny computers, such as those embedded inside your car, washing machine,
or a tiny sensor that is barely larger than a grain of sand, we might need to
specify more precisely how big a number needs to be. While computers prefer
to work with chunks of 32 bits, we could write a program (as an example for an
earthquake sensor) that knows the first 7 bits are the latitude, the next 7 bits
are the longitude, the next 10 bits are the depth, and the last 8 bits are the
amount of force.
Even on standard computers, it is important to think carefully about the
number of bits you will need. For example, if you have a field in your database
that could be either "0", "1", "2", or "3" (perhaps representing the four bases
that can occur in a DNA sequence), and you used a 64 bit number for every
one, that will add up as your database grows. If you have 10,000,000 items in
your database, you will have wasted 62 bits for each one (only 2 bits are needed
to represent the 4 numbers in the example), a total of 620,000,000 bits, which
is around 74 MB. If you are doing this a lot in your database, that will really add
up -- human DNA has about 3 billion base pairs in it, so it's incredibly wasteful
to use more than 2 bits for each one.
And for applications such as Google Maps, which are storing an astronomical
amount of data, wasting space is not an option at all!
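As a sketch of how the saving works, here is how DNA bases could be packed
two bits at a time into a single Python integer (the coding A=00, C=01, G=10,
T=11 is an arbitrary choice made for this example):

# Pack a DNA sequence using 2 bits per base instead of a whole byte each.
CODES = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}

def pack(sequence):
    packed = 0
    for base in sequence:
        packed = (packed << 2) | CODES[base]   # shift left, append 2 new bits
    return packed

print(bin(pack("GATTACA")))   # 0b10001111000100 - 14 bits for 7 bases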
It is really useful to know roughly how many bits you will need to represent
a certain value. Have a think about the following scenarios, and choose the
best number of bits out of the options given. You want to ensure that the
largest possible number will fit within the number of bits, but you also want
to ensure that you are not wasting space.
• a) 1 bit
• b) 4 bits
• c) 8 bits
• d) 32 bits
• a) 16 bits
• b) 32 bits
• c) 64 bits
• d) 128 bits
• a) 16 bits
• b) 32 bits
• c) 64 bits
• d) 128 bits
• a) 16 bits
• b) 32 bits
• c) 64 bits
• d) 128 bits
We will look at two possible approaches: adding a simple sign bit, much like we
do for decimal, and then a more useful system called Two's Complement.
For example, if we wanted to represent the number 41 using 7 bits along with
an additional bit that is the sign bit (to give a total of 8 bits), we would
represent it by 00101001. The first bit is a 0, meaning the number is positive,
then the remaining 7 bits give 41, meaning the number is +41. If we wanted to
make -59, this would be 10111011. The first bit is a 1, meaning the number is
negative, and then the remaining 7 bits represent 59, meaning the number is
-59.
Using 8 bits as described above (one for the sign, and 7 for the actual
number), what would be the binary representations for 1, -1, -8, 34, -37,
-88, and 102?
The spaces are not necessary, but are added to make reading the binary
numbers easier
• 1 is 0000 0001
• -1 is 1000 0001
• -8 is 1000 1000
• 34 is 0010 0010
• -37 is 1010 0101
• -88 is 1101 1000
• 102 is 0110 0110
Going the other way is just as easy. If we have the binary number 10010111,
we know it is negative because the first digit is a 1. The number part is the next
7 bits 0010111, which is 23. This means the number is -23.
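This decoding rule is simple enough to express in a few lines of Python
(from_sign_magnitude is our own helper, written for illustration):

# Decode an 8-bit sign-and-magnitude pattern: the first bit gives the sign,
# and the remaining 7 bits give the size of the number.
def from_sign_magnitude(bits):
    magnitude = int(bits[1:], 2)
    return -magnitude if bits[0] == "1" else magnitude

print(from_sign_magnitude("10010111"))   # -23
print(from_sign_magnitude("00101001"))   # 41

Try using the same process by hand on the following patterns: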
• 00010011
• 10000110
• 10100011
• 01111111
• 11111111
• 00010011 is 19
• 10000110 is -6
• 10100011 is -35
• 01111111 is 127
• 11111111 is -127
But what about 10000000? That converts to -0. And 00000000 is +0. Since -0
and +0 are both just 0, it is very strange to have two different representations
for the same number.
This is one of the reasons that we don't use a simple sign bit in practice.
Instead, computers usually use a more sophisticated representation for
negative binary numbers called Two's Complement.
Representing positive numbers is the same as the method you have already
learnt. Using 8 bits, the leftmost bit is a zero and the other 7 bits are the usual
binary representation of the number; for example, 1 would be 00000001, and
65 would be 01000001.
1. Convert the number to binary (don't use a sign bit, and pretend it is a
positive number).
2. Invert all the digits (i.e. change 0's to 1's and 1's to 0's).
3. Add 1 to the result (Adding 1 is easy in binary; you could do it by
converting to decimal first, but think carefully about what happens when a
binary number is incremented by 1 by trying a few; there are more hints in
the panel below).
The rule for adding one to a binary number is pretty simple, so we'll let you
figure it out for yourself. First, if a binary number ends with a 0 (e.g.
1101010), how would the number change if you replace the last 0 with a 1?
Now, if it ends with 01, how much would it increase if you change the 01 to
10? What about ending with 011? 011111?
The method for adding is so simple that it's easy to build computer
hardware to do it very quickly.
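The three steps translate directly into a short Python sketch
(to_twos_complement is our own helper name; format and int are built-in
conversion functions):

# Convert a number to its 8-bit Two's Complement representation.
def to_twos_complement(n, bits=8):
    if n >= 0:
        return format(n, f'0{bits}b')
    positive = format(-n, f'0{bits}b')        # step 1: binary of the positive number
    inverted = ''.join('1' if b == '0' else '0'
                       for b in positive)     # step 2: invert all the digits
    return format(int(inverted, 2) + 1,
                  f'0{bits}b')                # step 3: add 1 to the result

print(to_twos_complement(7))    # 00000111
print(to_twos_complement(-7))   # 11111001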
What are the 8-bit Two's Complement representations for the following
numbers?
1. 19
2. -19
3. 107
4. -107
5. -92
In order to reverse the process, we need to know whether the number we are
looking at is positive or negative. For positive numbers, we can simply convert
the binary number back to decimal. But for negative numbers, we first need to
convert it back to a normal binary number.
So, if the number starts with a 1, use the following process to convert the
number back to a negative decimal number.
1. Subtract 1 from the number.
2. Invert all the digits (i.e. change 0's to 1's and 1's to 0's).
3. Convert the result to decimal.
4. Put a negative sign in front of it.
Try converting these Two's Complement numbers back to decimal:
1. 00001100
2. 10001100
3. 10111111
1. 12
2. 10001100 -> (-1) 10001011 -> (inverted) 01110100 -> (to decimal)
116 -> (negative sign added) -116
3. 10111111 -> (-1) 10111110 -> (inverted) 01000001 -> (to decimal)
65 -> (negative sign added) -65
While it might initially seem that there is no bit allocated as the sign bit, the
left-most bit behaves like one. With 8 bits, you can still only make 256 possible
patterns of 0's and 1's. If you attempted to use 8 bits to represent positive
numbers up to 255, and negative numbers down to -255, you would quickly
realise that some numbers were mapped onto the same pattern of bits.
Obviously, this will make it impossible to know what number is actually being
represented!
With 64 bits:
• Range without negative numbers: 0 to 18,446,744,073,709,551,615
• Range with Two's Complement: −9,223,372,036,854,775,808 to
9,223,372,036,854,775,807
You've probably learnt about column addition. For example, the following
column addition would be used to do 128 + 255.
1 (carries)
128
+255
----
383
When you go to add 5 + 8, the result is higher than 9, so you put the 3 in the
one's column, and carry the 1 to the 10's column. Binary addition works in
exactly the same way.
111 (carries)
11001110
+00001111
---------
11011101
Remember that the digits can be only 1 or 0. So you will need to carry a 1 to
the next column if the total you get for a column is (decimal) 2 or 3.
With negative numbers using sign bits like we did before, this does not work. If
you wanted to add +11 (01011) and -7 (10111), you would expect to get an
answer of +4 (00100).
11111 (carries)
01011
+10111
100010
The result, 100010, reads as -2 in the sign-bit representation, not the +4 we
expected.
One way we could solve the problem is to use column subtraction instead. But
this would require giving the computer a hardware circuit which could do this.
Luckily this is unnecessary, because addition with negative numbers works
automatically using Two's Complement!
For the above addition (+11 + -7), we can start by converting the numbers to
their 5-bit Two's Complement form. Because 01011 (+11) is a positive
number, it does not need to be changed. But for the negative number, 00111
(-7) (sign bit from before removed as we don't use it for Two's Complement),
we need to invert the digits and then add 1, giving 11001.
01011
+11001
------
100100
Any extra bits to the left (beyond what we are using, in this case 5 bits) have
been truncated. This leaves 00100, which is 4, like we were expecting.
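You can mimic this truncation in Python with a bit mask: the & operation
below keeps only the lowest 5 bits and discards the extra carry, which is
essentially what the hardware does automatically:

# Add the 5-bit Two's Complement patterns for +11 and -7.
total = 0b01011 + 0b11001        # gives 0b100100, one bit too wide
truncated = total & 0b11111      # keep only the lowest 5 bits
print(format(truncated, '05b'))  # 00100, which is 4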
We can also use this for subtraction. If we are subtracting a positive number
from a positive number, we would need to convert the number we are
subtracting to a negative number. Then we should add the two numbers. This is
the same as for decimal numbers, for example 5 - 2 = 3 is the same as 5 + (-2)
= 3.
For larger numbers (such as subtracting the two 3-digit numbers 255 -
128), the complement is the number that adds up to the next power of 10
i.e. 1000-128 = 872. Check that adding 872 to 255 produces (almost) the
same result as subtracting 128.
Working out complements in binary is way easier because there are only
two digits to work with, but working them out in decimal may help you to
understand what is going on.
Two's Complement is widely used, because it only has one representation for
zero, and it allows positive numbers and negative numbers to be treated in the
same way, and addition and subtraction to be treated as one operation.
There are other systems such as "One's Complement" and "Excess-k", but
Two's Complement is by far the most widely used in practice.
5.4. Text
There are several different ways in which computers use bits to store text. In
this section, we will look at some common ones and then look at the pros and
cons of each representation.
5.4.1. ASCII
We saw earlier that 64 unique patterns can be made using 6 dots in Braille. A
dot corresponds to a bit, because both dots and bits have 2 different possible
values.
Each pattern in ASCII is usually stored in 8 bits rather than 7, with one
wasted bit: the left-most bit in each 8-bit pattern is a 0, meaning there are
still only 128 possible patterns. Where possible, we prefer to deal with full
bytes (8 bits) on a computer; this is why ASCII has an extra wasted bit.
Here is a table that shows the patterns of bits that ASCII uses for each of the
characters.
For example, the letter c (lower-case) in the table has the pattern “01100011”
(the 0 at the front is just extra padding to make it up to 8 bits). The letter o has
the pattern “01101111”. You could write a word out using this code, and if you
give it to someone else, they should be able to decode it exactly.
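A quick way to experiment with this is to let Python look up the pattern for
each character; this sketch prints the 8-bit ASCII patterns for a word:

word = "code"
# ord gives the character's number; format it as 8 binary digits.
print(' '.join(format(ord(ch), '08b') for ch in word))
# 01100011 01101111 01100100 01100101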
The name "ASCII" stands for "American Standard Code for Information
Interchange", which was a particular way of assigning bit patterns to the
characters on a keyboard. The ASCII system even includes "characters" for
ringing a bell (useful for getting attention on old telegraph systems),
deleting the previous character (kind of an early "undo"), and "end of
transmission" (to let the receiver know that the message was finished).
These days those characters are rarely used, but the codes for them still
exist (they are the missing patterns in the table above). Nowadays ASCII
has been supplanted by a code called "UTF-8", which happens to be the
same as ASCII if the extra left-hand bit is a 0, but opens up a huge range of
characters if the left-hand bit is a 1.
Note that the text "358" is treated as 3 characters in ASCII, which may be
confusing, as the text "358" is different to the number 358! You may have
encountered this distinction in a spreadsheet e.g. if a cell starts with an
inverted comma in Excel, it is treated as text rather than a number. One
place this comes up is with phone numbers; if you type 027555555 into a
spreadsheet as a number, it will come up as 27555555, but as text the 0
can be displayed. In fact, phone numbers aren't really just numbers: a
leading zero can be important, and they can contain other characters -- for
example, +64 3 555 1234 extn. 1234.
English text can easily be represented using ASCII, but what about languages
such as Chinese where there are thousands of different characters?
Unsurprisingly, the 128 patterns aren’t nearly enough to represent such
languages. Because of this, ASCII is not so useful in practice, and is no longer
used widely. In the next sections, we will look at Unicode and its
representations. These solve the problem of being unable to represent non-
English characters.
There are several other codes that were popular before ASCII, including the
Baudot code and EBCDIC. A widely used variant of the Baudot code was the
"Murray code", named after New Zealand born inventor Donald Murray.
One of Murray's significant improvements was to introduce the idea of
"control characters", such as the carriage return (new line). The "control"
key still exists on modern keyboards.
The most widely used Unicode encoding schemes are called UTF-8, UTF-16, and
UTF-32; you may have seen these names in email headers or describing a text
file. Some of the Unicode encoding schemes are fixed length, and some are
variable length. Fixed length means that each character is represented
using the same number of bits. Variable length means that some characters
are represented with fewer bits than others. Variable length schemes can save
space, because the most commonly used characters can be represented with
fewer bits than the uncommonly used characters. Of course, what might
be the most commonly used character in English is not necessarily the most
commonly used character in Japanese. You may be wondering why we need so
many encoding schemes for Unicode. It turns out that some are better for
English language text, and some are better for Asian language text.
The remainder of the text representation section will look at some of these
Unicode encoding schemes so that you understand how to use them, and why
some of them are better than others in certain situations.
5.4.3. UTF-32
UTF-32 is a fixed length Unicode encoding scheme. The representation for
each character is simply its number converted to a 32 bit binary number.
Leading zeroes are used if there are not enough bits (just like how you can
represent 254 as a 4 digit decimal number -- 0254). 32 bits is a nice round
number on a computer, often referred to as a word (which is a bit confusing,
since we can use UTF-32 characters to represent English words!)
The following interactive will allow you to convert a Unicode character to its
UTF-32 representation. The Unicode character's number is also displayed. The
bits are simply the binary number form of the character number.
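If you don't have the interactive handy, a one-line Python sketch shows the
same thing, padding the character's number out to 32 bits:

ch = 'a'
# The character's Unicode number, padded with leading zeroes to 32 bits.
print(format(ord(ch), '032b'))
# 00000000000000000000000001100001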
ASCII actually took the same approach. Each ASCII character has a number
between 0 and 127, and the representation for the character is its number
converted to an 8 bit binary number. ASCII is also a fixed length encoding
scheme -- every character in ASCII is represented using 8 bits.
In practice, UTF-32 is rarely used -- you can see that it's pretty wasteful of
space. UTF-8 and UTF-16 are both variable length encoding schemes, and very
widely used. We will look at them next.
1. What is the largest number that can be represented with 32 bits? (In
both decimal and binary).
2. The largest number in Unicode that has a character assigned to it is
not actually the largest possible 32 bit number -- it is 00000000
00010000 11111111 11111111. What is this number in decimal?
3. Most numbers that can be made using 32 bits do not have a Unicode
character attached to them -- there is a lot of wasted space. There are
good reasons for this, but if you had a shorter number that could
represent any character, what is the minimum number of bits you
would need, given that there are currently around 120,000 Unicode
characters?
3. You can represent all current characters with 17 bits. With 16 bits you
can only make 65,536 different patterns, which is not enough. If we go up
to 17 bits, that gives 131,072 patterns, which is larger than 120,000.
Therefore, we need 17 bits.
5.4.4. UTF-8
UTF-8 is a variable length encoding scheme for Unicode. Characters with a
lower Unicode number require fewer bits for their representation than those
with a higher Unicode number. UTF-8 representations contain either 8, 16, 24,
or 32 bits. Remembering that a byte is 8 bits, these are 1, 2, 3, and 4 bytes.
For example, the letter "H" has Unicode number 72, so its UTF-8
representation is the single byte 01001000.
The following interactive will allow you to convert a Unicode character to its
UTF-8 representation. The Unicode character's number is also displayed.
1. Look up the Unicode number of the character you want to represent.
2. Convert the Unicode number into a binary number.
3. Count how many bits are in the binary number, and choose the correct
pattern to use, based on how many bits there were. Step 4 will explain
how to use the pattern.
4. Replace the x's in the pattern with the bits of the binary number you
converted in step 2. If there are more x's than bits, replace extra left-most
x's with 0's.
For example, if you wanted to find out the representation for 貓 (cat in
Chinese), the steps you would take would be as follows.
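Following those steps: 貓 has Unicode number 35987, which is
1000110010010011 in binary (16 bits), so the three-byte pattern 1110xxxx
10xxxxxx 10xxxxxx is used, giving 11101000 10110010 10010011. You can check
this with a short Python sketch, since Python's encoder applies the pattern
for us:

ch = '貓'
print(format(ord(ch), 'b'))      # 1000110010010011 (Unicode number 35987)
print(' '.join(format(byte, '08b') for byte in ch.encode('utf-8')))
# 11101000 10110010 10010011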
5.4.5. UTF-16
Just like UTF-8, UTF-16 is a variable length encoding scheme for Unicode.
Because it is far more complex than UTF-8, we won't explain how it works here.
However, the following interactive will allow you to represent text with UTF-16.
Try putting some text that is in English and some text that is in Japanese into it.
Compare the representations to what you get with UTF-8.
The following table summarises what we have said so far about each
representation.

Representation   Variable or Fixed   Bits per Character      Real world Usage
ASCII            Fixed Length        8 bits                  No longer widely used
UTF-8            Variable Length     8, 16, 24, or 32 bits   Very widely used
UTF-16           Variable Length     16 or 32 bits           Widely used
UTF-32           Fixed Length        32 bits                 Rarely used
In order to compare and evaluate them, we need to decide what it means for a
representation to be "good". Two useful criteria are:
1. Being able to represent all the characters we may want to use.
2. Using as little space as possible to represent a given piece of text.
We know that UTF-8, UTF-16, and UTF-32 can represent all characters, but
ASCII can only represent English. Therefore, ASCII fails the first criterion.
But for the second criterion, it isn't so simple.
The following interactive will allow you to find out the length of pieces of text
using UTF-8, UTF-16, or UTF-32. Find some samples of English text and Asian
text (forums or a translation site are a good place to look), and see how long
your various samples are when encoded with each of the three representations.
Copy paste or type text into the box.
As a general rule, UTF-8 is better for English text, and UTF-16 is better for Asian
text. UTF-32 always requires 32 bits for each character, so is unpopular in
practice.
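You can also test this rule with a short Python sketch that counts the bits
each encoding needs (the sample strings are just examples; the -be codec
names avoid counting a byte order marker):

for text in ["Hello", "こんにちは"]:
    for encoding in ["utf-8", "utf-16-be", "utf-32-be"]:
        bits = len(text.encode(encoding)) * 8
        print(text, encoding, bits, "bits")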
If you only wanted to represent the 26 letters of the alphabet, and weren’t
worried about upper-case or lower-case, you could get away with using just 5
bits, which allows for up to 32 different patterns.
You might have exchanged notes which used 1 for "a", 2 for "b", 3 for "c", all
the way up to 26 for "z". We can convert those numbers into 5 digit binary
numbers. In fact, you will also get the same 5 bits for each letter by looking at
the last 5 bits for it in the ASCII table (and it doesn't matter whether you look at
the upper case or the lower case letter).
Represent the word "water" with bits using this system. Check the below panel
once you think you have it.
Spoiler
w: 10111
a: 00001
t: 10100
e: 00101
r: 10010
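You can check this with a Python sketch; conveniently, keeping just the last
5 bits of each letter's ASCII code gives these numbers directly:

word = "water"
# ord(ch) & 0b11111 keeps only the last 5 bits of the ASCII code.
print(' '.join(format(ord(ch) & 0b11111, '05b') for ch in word))
# 10111 00001 10100 00101 10010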
For printing, printers commonly use three slightly different primary colours:
cyan, magenta, and yellow (CMY). All the colours on a printed document are
made by mixing these primary colours.
Both these kinds of mixing are called "subtractive mixing", because they start
with a white canvas or paper, and "subtract" colour from it. The interactive
below allows you to experiment with CMY in case you are not familiar with it,
or if you just like mixing colours.
Computer screens and related devices also rely on mixing three colours, except
they need a different set of primary colours because they are additive, starting
with a black screen and adding colour to it. For additive colour on computers,
the colours red, green and blue (RGB) are used. Each pixel on a screen is
typically made up of three tiny "lights"; one red, one green, and one blue. By
increasing and decreasing the amount of light coming out of each of these
three, all the different colours can be made. The following interactive allows
you to play around with RGB.
Use the interactive online at http://www.csfieldguide.org.nz/en/interactives/rgb-mixer/
index.html
See what colours you can make with the RGB interactive. Can you make black,
white, shades of grey, yellow, orange, and purple?
Having all the sliders at zero produces black, and having them all at
maximum produces white; if they are all set to the same value somewhere in
between, you get a shade of grey (i.e. between black and white).
Yellow is not what you might expect - it's made from red and green, with no
blue.
There's a very good reason that we mix three primary colours to specify
the colour of a pixel. The human eye has millions of light sensors in it, and
the ones that detect colour are called "cones". There are three different
kinds of cones, which detect red, blue, and green light respectively. Colours
are perceived by the amount of red, blue, and green light in them.
Computer screen pixels take advantage of this by releasing the amounts of
red, blue, and green light that will be perceived as the desired colour by
your eyes. So when you see "purple", it's really the red and blue cones in
your eyes being stimulated, and your brain converts that to a perceived
colour. Scientists are still working out exactly how we perceive colour, but
the representations used on computers seem to be good enough to give the
impression of looking at real images.
For more information about RGB displays, see RGB on Wikipedia; for more
information about the eye sensing the three colours, see Cone cell and
trichromacy on Wikipedia.
The word pixel is short for "picture element". On computer screens and
printers an image is almost always displayed using a grid of pixels, each
one set to the required colour. A pixel is typically a fraction of a millimeter
across, and images can be made up of millions of pixels (one megapixel is
a million pixels), so you can't usually see the individual pixels. Photographs
commonly have several megapixels in them.
It's not unusual for computer screens to have millions of pixels on them,
and the computer needs to represent a colour for each one of those pixels.
With 256 possible values for each of the three primary colours (don't forget to
count 0!), that gives 256 x 256 x 256 = 16,777,216 possible colours -- more
than the human eye can detect!
Think back to the binary numbers section. What is special about the
number 255, which is the maximum colour value?
We'll cover the answer later in this section if you are still not sure!
The following interactive allows you to zoom in on an image to see the pixels
that are used to represent it. Each pixel is a solid colour square, and the
computer needs to store the colour for each pixel. If you zoom in far enough,
the interactive will show you the red-green-blue values for each pixel. You can
pick a pixel and put the values on the slider above - it should come out as the
same colour as the pixel.
5.5.3.1. How many bits will we need for each colour in the
image?
With 256 different possible values for the amount of each primary colour, this
means 8 bits would be needed to represent the number.
The smallest number that can be represented using 8 bits is 00000000 -- which
is 0. And the largest number that can be represented using 8 bits is 11111111
-- which is 255.
Because there are three primary colours, each of which will need 8 bits to
represent each of its 256 different possible values, we need 24 bits in total to
represent a colour.
So, how many colours are there in total with 24 bits? We know that there are
256 possible values each colour can take, so the easiest way of calculating it
is: 256 x 256 x 256 = 16,777,216 colours.
Because 24 bits are required, this representation is called 24 bit colour. 24 bit
colour is sometimes referred to in settings as "True Color" (because it is more
accurate than the human eye can see). On Apple systems, it is called "Millions
of colours".
For example, suppose you have the colour that has red = 145, green = 50, and
blue = 123 that you would like to represent with bits. If you put these values
into the interactive, you will get the colour below.
Start by converting each of the three numbers into binary, using 8 bits for each.
• red = 10010001,
• green = 00110010,
• blue = 01111011.
Putting these values together gives 100100010011001001111011, which is the
bit representation for the colour above.
There are no spaces between the three numbers, as this is a pattern of bits
rather than actually being three binary numbers, and computers don’t have
any such concept of a space between bit patterns anyway --- everything must
be a 0 or a 1. You could write it with spaces to make it easier to read, and to
represent the idea that they are likely to be stored in 3 8-bit bytes, but inside
the computer memory there is just a sequence of high and low voltages, so
even writing 0 and 1 is an arbitrary notation.
Also, all leading and trailing 0’s on each part are kept --- without them, it would
be representing a shorter number. Since there are 256 different possible values
for each primary colour, the final representation must be 24 bits long.
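A Python sketch of this conversion, padding each value to exactly 8 bits and
joining them, looks like this:

red, green, blue = 145, 50, 123
pattern = format(red, '08b') + format(green, '08b') + format(blue, '08b')
print(pattern)        # 100100010011001001111011
print(len(pattern))   # always 24 bits, thanks to the leading zeroes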
"Black and white" images usually have more than two colours in them;
typically 256 shades of grey, represented with 8 bits.
The computer won’t ever convert the number into decimal, as it works with the
binary directly --- most of the process that takes the bits and makes the right
pixels appear is typically done by a graphics card or a printer. We just started
with decimal, because it is easier for humans to understand. The main point
about knowing this representation is to understand the trade-off that is being
made between the accuracy of colour (which should ideally be beyond human
perception) and the amount of storage (bits) needed (which should be as little
as possible).
When writing HTML code, you often need to specify colours for text,
backgrounds, and so on. One way of doing this is to specify the colour
name, for example “red”, “blue”, “purple”, or “gold”. For some purposes,
this is okay.
However, the use of names limits the number of colours you can represent
and the shade might not be exactly the one you wanted. A better way is to
specify the 24 bit colour directly. Because 24 binary digits are hard to read,
colours in HTML use hexadecimal codes as a quick way to write the 24
bits, for example #00FF9E. The hash sign means that it should be
interpreted as a hexadecimal representation, and since each hexadecimal
digit corresponds to 4 bits, the 6 digits represent 24 bits of colour
information.
This "hex triplet" format is used in HTML pages to specify colours for things
like the background of the page, the text, and the colour of links. It is also
used in CSS, SVG, and other applications.
This can be broken up into groups of 4 bits: 1001 0001 0011 0010 0111
1011.
And now, each of these groups of 4 bits will need to be represented with a
hexadecimal digit.
• 1001 -> 9
• 0001 -> 1
• 0011 -> 3
• 0010 -> 2
• 0111 -> 7
• 1011 -> B
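Reading the six digits in order gives the hex triplet #91327B. In Python, a
sketch of the whole conversion is a single formatted string, since each pair of
hex digits is just the value written in base 16:

red, green, blue = 145, 50, 123
print(f"#{red:02X}{green:02X}{blue:02X}")   # #91327B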
Understanding how these hexadecimal colour codes are derived also allows
you to change them slightly without having to refer back the colour table,
when the colour isn’t exactly the one you want. Remember that in the 24
bit color code, the first 8 bits specify the amount of red (so this is the first 2
digits of the hexadecimal code), the next 8 bits specify the amount of
green (the next 2 digits of the hexadecimal code), and the last 8 bits
specify the amount of blue (the last 2 digits of the hexadecimal code). To
increase the amount of any one of these colours, you can change the
appropriate hexadecimal letters.
For example, #000000 has zero for red, green and blue, so setting a higher
value to the middle two digits (such as #004300) will add some green to
the colour.
You can use this HTML page to experiment with hexadecimal colours. Just
enter a colour in the space below:
It should be possible to get a perfect match using 24 bit colour. But what about
8 bits?
One common 8 bit colour scheme uses 3 bits to specify the amount of red (8
possible values), 3 bits to specify the amount of green (again 8 possible
values), and 2 bits to specify the amount of blue (4 possible values). This
gives a total of 8 bits
(hence the name), which can be used to make 256 different bit patterns, and
thus can represent 256 different colours.
You may be wondering why blue is represented with fewer bits than red and
green. This is because the human eye is the least sensitive to blue, and
therefore it is the least important colour in the representation. The
representation uses 8 bits rather than 9 bits because it's easiest for computers
to work with full bytes.
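One way to build such an 8 bit value is to keep only the most significant bits
of each 8-bit primary colour value; this Python sketch packs them into the
3-3-2 layout described above:

def to_8_bit_colour(red, green, blue):
    # Keep the top 3 bits of red and green and the top 2 bits of blue,
    # then pack them side by side into one byte.
    return (red >> 5) << 5 | (green >> 5) << 2 | (blue >> 6)

print(format(to_8_bit_colour(145, 50, 123), '08b'))   # 10000101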
Using this scheme to represent all the pixels of an image takes one third of the
number of bits required for 24-bit colour, but it is not as good at showing
smooth changes of colours or subtle shades, because there are only 256
possible colors for each pixel. This is one of the big tradeoffs in data
representation: do you allocate less space (fewer bits), or do you want higher
quality?
You probably noticed that 8-bit colour looks particularly bad for faces, where we
are used to seeing subtle skin tones. Even the 16-bit colour is noticeably worse
for faces.
In other cases, the 16-bit images are almost as good as 24-bit images unless
you look really carefully. They also use two-thirds (16/24) of the space that they
would with 24-bit colour. For images that will need to be downloaded on 3G
devices where internet is expensive, this is worth thinking about carefully.
One other interesting thing to think about is whether or not we’d want
more than 24 bit colour. It turns out that the human eye can only
differentiate around 10 million colours, so the ~ 16 million provided by 24
bit colour is already beyond what our eyes can distinguish. However, if the
image were to be processed by some software that enhances the contrast,
it may turn out that 24-bit colour isn't sufficient. Choosing the
representation isn't simple!
8 bit colour is not used much anymore, although it can still be helpful in
situations such as accessing a computer desktop remotely on a slow internet
connection, as the image of the desktop can instead be sent using 8 bit colour
instead of 24 bit colour. Even though this may make the desktop look a bit
strange, it doesn't stop you from getting your work done; seeing your desktop
in full 24 bit colour would not be much help if the connection were too slow to
use!
In some countries, mobile internet data is very expensive. Every megabyte that
is saved will be a cost saving. There are also some situations where colour
doesn’t matter at all, for example diagrams, and black and white printed
images.
Image compression methods such as JPEG, GIF, and PNG make much more
clever compromises to reduce the space that an image takes, without making it
look so bad, including choosing a better palette of colours to use rather than
just using the simple representation discussed
above. However, compression methods require a lot more processing, and
images need to be decoded to the representations discussed in this chapter
before they can be displayed.
The ideas in this present chapter more commonly come up when designing
systems (such as graphics interfaces) and working with high-quality images
(such as RAW photographs), and typically the goal is to choose the best
representation possible without wasting too much space.
Have a look at the Compression Chapter to find out more!
Before reading this section, you should have an understanding of low level
languages (see the section on Machine Code in the Programming
Languages chapter).
In the above machine code program, li and add are considered to be operations
to "load an integer" and "add two integers" respectively. $t0, $t1, and $a0 are
register operands and represent a place to store values inside of the machine.
10 and 20 are literal operands and allow instructions to represent the exact
integer values 10 and 20. If we were using a 32-bit operating system we might
encode the above instructions with each instruction broken into 4 8-bit pieces
as follows:
Our operation will always be determined by the first 8 bits of the 32-bit
instruction. In this example machine code, 00001000 means li and
00001010 means add. For the li operation, the bits in Op1 are interpreted to be
a storage place, allowing 00000000 to represent $t0. Similarly the bits in Op1
for the add instruction represent $a0. Can you figure out what the bits in Op3
for each instruction represent?
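A sketch of this packing in Python is below. The opcode values come from this
example; the assumption that li's literal goes in Op3 (with Op2 unused) is just
one plausible layout for this made-up encoding, not a real machine's
instruction set:

LI, ADD = 0b00001000, 0b00001010   # the example opcodes for li and add

def encode(op, op1, op2, op3):
    # Pack four 8-bit pieces into one 32-bit instruction, opcode first.
    return op << 24 | op1 << 16 | op2 << 8 | op3

# li $t0, 10 -- assuming $t0 is register 0 and the literal 10 goes in Op3.
print(format(encode(LI, 0b00000000, 0, 10), '032b'))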
Using bits to represent both the program instructions and data forms such as
text, numbers, and images allows entire computer programs to be represented
in the same binary format. This allows programs to be stored on disks, in
memory, and transferred over the internet as easily as data.
This site has more complex activities with binary numbers, including fractions,
multiplication and division.
The three main reasons that we use more complex representations of binary
data are:
• Compression: this reduces the amount of space the data needs (for
example, coding an audio file using MP3 compression can reduce the size
of an audio file to well under 10% of its original size).
• Encryption: this changes the representation of data so that you need to
have a "key" to unlock the message (for example, whenever your browser
uses "https" instead of "http" to communicate with a website, encryption
is being used to make sure that anyone eavesdropping on the connection
can't make any sense of the information).
• Error Control: this adds extra information to your data so that if there are
minor failures in the storage device or transmission, it is possible to detect
that the data has been corrupted, and even reconstruct the information
(for example, bar codes on products have an extra digit added to them so
that if the bar code is scanned incorrectly in a checkout, it makes a
warning sound instead of charging you for the wrong product).
Often all three of these are applied to the same data; for example, if you take a
photo on a smartphone it is usually compressed using JPEG, stored in the
phone's memory with error correction, and uploaded to the web through a
wireless connection using an encryption protocol to prevent other people
nearby getting a copy of the photo.
Without these forms of coding, digital devices would be very slow, have limited
capacity, be unreliable, and be unable to keep your information private.
The idea of encoding data to make the representation more compact, robust or
secure is centuries old, but the solid theory needed to support codes in the
information age was developed in the 1940s --- not surprisingly considering
that technology played such an important role in World War II, where efficiency,
reliability and secrecy were all very important. One of the most celebrated
researchers in this area was Claude Shannon, who developed the field of
"information theory", which is all about how data can be represented effectively
(Shannon was also a juggler, unicyclist, and inventor of fanciful machines).
Curiosity: Entropy
You can explore the idea of entropy further using an Unplugged activity
called Twenty Guesses, and an online game for guessing sentences.
Common forms of compression that are currently in use include JPEG (used for
photos), MP3 (used for audio), MPEG (used for videos including DVDs), and ZIP
(for many kinds of data). For example, the JPEG method reduces photos to a
tenth or smaller of their original size, which means that a camera can store 10
times as many photos, and images on the web can be downloaded 10 times
faster.
So what's the catch? Well, there can be an issue with the quality of the data –
for example, a highly compressed JPEG image doesn't look as sharp as an
image that hasn't been compressed. Also, it takes processing time to compress
and decompress the data. In most cases, the tradeoff is worth it, but not
always.
In this chapter we'll look at how compression might be done, what the benefits
are, and the costs associated with using compressed data that need to be
considered when deciding whether or not to compress data.
We'll start with a simple example – Run Length Encoding – which gives some
insight into the benefits and the issues around compression.
7.2. Run Length Encoding
Watch the video online at https://www.youtube.com/embed/uaV2RuAJTjQ?rel=0
Run length encoding (RLE) is a technique that isn't so widely used these days,
but it's a great way to get a feel for some of the issues around using
compression.
One very simple way a computer can store this image in binary is by using a
format where '0' means white and '1' means black (this is a "bit map", because
we've mapped the pixels onto the values of bits). Using this method, the above
image would be represented in the following way:
011000010000110
100000111000001
000001111100000
000011111110000
000111111111000
001111101111100
011111000111110
111110000011111
011111000111110
001111101111100
000111111111000
000011111110000
000001111100000
100000111000001
011000010000110
P1
15 15
0 1 1 0 0 0 0 1 0 0 0 0 1 1 0
1 0 0 0 0 0 1 1 1 0 0 0 0 0 1
0 0 0 0 0 1 1 1 1 1 0 0 0 0 0
0 0 0 0 1 1 1 1 1 1 1 0 0 0 0
0 0 0 1 1 1 1 1 1 1 1 1 0 0 0
0 0 1 1 1 1 1 0 1 1 1 1 1 0 0
0 1 1 1 1 1 0 0 0 1 1 1 1 1 0
1 1 1 1 1 0 0 0 0 0 1 1 1 1 1
0 1 1 1 1 1 0 0 0 1 1 1 1 1 0
0 0 1 1 1 1 1 0 1 1 1 1 1 0 0
0 0 0 1 1 1 1 1 1 1 1 1 0 0 0
0 0 0 0 1 1 1 1 1 1 1 0 0 0 0
0 0 0 0 0 1 1 1 1 1 0 0 0 0 0
1 0 0 0 0 0 1 1 1 0 0 0 0 0 1
0 1 1 0 0 0 0 1 0 0 0 0 1 1 0
The first two lines are the header. The first line specifies the format of the
file (P1 means that the file contains ASCII zeroes and ones). The second
line specifies the width and then the height of the image in pixels. This
allows the computer to know the size and dimensions of the image, even if
the newline characters separating the rows in the file were missing. The
rest of the data is the image, just like above. If you wanted to, you could
copy and paste this representation (including the header) into a text file,
and save it with the file extension .pbm. If you have a program on your
computer able to open PBM files, you could then view the image with it.
You could even write a program to output these files, and then display
them as images.
Because the digits are represented using ASCII in this format, it isn't very
efficient, but it is useful if you want to read what's inside the file. There are
variations of this format that pack the pixels into bits instead of characters,
and variations that can be used for grey scale and colour images. More
information about this format is available on Wikipedia.
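If you want to try writing such a program, here is a Python sketch that
outputs a PBM file (only the first three rows of the image are included to
keep it short, so the header gives a height of 3; adding all 15 rows and a
height of 15 gives the full image):

rows = [
    "011000010000110",
    "100000111000001",
    "000001111100000",
]

with open("image.pbm", "w") as f:
    f.write("P1\n")                   # format: ASCII zeroes and ones
    f.write(f"15 {len(rows)}\n")      # width, then height, in pixels
    for row in rows:
        f.write(' '.join(row) + '\n')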
The key question in compression is whether or not we can represent the same
image using fewer bits, but still be able to reconstruct the original image.
It turns out we can. There are many ways of going about it, but in this section
we are focussing on a method called run length encoding.
Imagine that you had to read the bits above out to someone who was copying
them down... after a while you might say things like "five zeroes" instead of
"zero zero zero zero zero". Is the basic idea behind run length encoding (RLE),
which is used to save space for storing digital images. In run length encoding,
we replace each row with numbers that say how many consecutive pixels are
the same colour, always starting with the number of white pixels . For example,
the first row in the image above contains one white, two black, four white, one
black, four white, two black, and one white pixel.
011000010000110
1, 2, 4, 1, 4, 2, 1
For the second row, because we need to say what the number of white pixels is
before we say the number of black, we need to explicitly say there are zero at
the start of the row.
100000111000001
0, 1, 5, 3, 5, 1
You might ask why we need to say the number of white pixels first, which in this
case was zero. The reason is that if we didn't have a clear rule about which to
start with, the computer would have no way of knowing which colour was which
when it displays the image represented in this form!
The third row contains five whites, five blacks, five whites.
000001111100000
5, 5, 5
That means we get the following representation for the first three rows.
1, 2, 4, 1, 4, 2, 1
0, 1, 5, 3, 5, 1
5, 5, 5
You can work out what the other rows would be following this same system.
4, 7, 4
3, 9, 3
2, 5, 1, 5, 2
1, 5, 3, 5, 1
0, 5, 5, 5
1, 5, 3, 5, 1
2, 5, 1, 5, 2
3, 9, 3
4, 7, 4
5, 5, 5
0, 1, 5, 3, 5, 1
1, 2, 4, 1, 4, 2, 1
4, 11, 3
4, 9, 2, 1, 2
4, 9, 2, 1, 2
4, 11, 3
4, 9, 5
4, 9, 5
5, 7, 6
0, 17, 1
1, 15, 2
What is the image of? How many pixels were there in the original image? How
many numbers were used to represent those pixels?
In the original representation, 225 digits (ones and zeroes) were required to
represent the image. Count up the number of commas and digits (but not
spaces or newlines, ignore those) in the new representation. This is the number
of characters required to represent the image with the new representation (to
ensure you are on the right track, the first 3 rows that were given to you
contain 29 characters).
Assuming you got the new image representation correct, and counted correctly,
you should have found there are 121 characters in the new image (double
check if your number differs). This means that the new representation only
requires around 54% as many characters to represent (calculated using
121/225). This is a significant reduction in the amount of space required to
store the image --- it's about half the size. The new representation is a
compressed form of the old one.
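If you'd like to check your run length codes automatically, here is a Python
sketch of the encoding rule, always starting with the count of white (0)
pixels:

def run_length_encode(row):
    counts = []
    current, run = '0', 0     # start by counting white pixels
    for pixel in row:
        if pixel == current:
            run += 1
        else:
            counts.append(run)
            current, run = pixel, 1
    counts.append(run)
    return counts

print(run_length_encode("011000010000110"))   # [1, 2, 4, 1, 4, 2, 1]
print(run_length_encode("100000111000001"))   # [0, 1, 5, 3, 5, 1]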
In practice this method (with some extra tricks) can be used to compress
images to about 15% of their original size. In real systems, the original
image only uses one bit for every pixel to store the black and white values
(not one character, which we used for our calculations). However, the run
length numbers are also stored much more efficiently, again using bit
patterns that take very little space to represent the numbers. The bit
patterns used are usually based on a technique called Huffman coding, but
that is beyond what we want to get into here.
7.2.3. Where is Run Length Encoding used in
practice?
The main place that black and white scanned images are used now is on fax
machines, which use this approach to compression. One reason that it works so
well with scanned pages is that the number of consecutive white pixels is huge. In
fact, there will be entire scanned lines that are nothing but white pixels. A
typical fax page is 200 pixels across or more, so replacing 200 bits with one
number is a big saving. The number itself can take a few bits to represent, and
in some places on the scanned page only a few consecutive pixels are replaced
with a number, but overall the saving is significant. In fact, fax machines would
take 7 times longer to send pages if they didn't use compression.
Now that you know how run length encoding works, you can come up with
and compress your own black and white image, as well as uncompress an
image that somebody else has given you.
Start by making your own picture with ones and zeroes. (Make sure it is
rectangular – all the rows should have the same length.) You can either
draw this on paper or prepare it on a computer (using a fixed width font,
otherwise it can become really frustrating and confusing!) In order to make
it easier, you could start by working out what you want your image to be
on grid paper (such as that from a math exercise book) by shading in
squares to represent the black ones, and leaving them blank to represent
the white ones. Once you have done that, you could then write out the
zeroes and ones for the image.
Work out the compressed representation of your image using run length
coding, i.e. the run lengths separated by commas form that was explained
above.
Now give a copy of the compressed representation (the run length codes,
not the original uncompressed representation) to a friend or classmate,
along with an explanation of how it is compressed. Ask them to try and
draw the image on some grid paper. Once they are done, check their
conversion against your original.
Imagining that you and your friend are both computers, by doing this you
have shown that images using these systems of representations can be
compressed on one computer, and decompressed on another, as long as
you have standards that you've agreed on (e.g. that every line begins with
a white pixel). It is very important for compression algorithms to follow
standards so that a file compressed on one computer can be
decompressed on another; for example, songs often follow the "mp3"
standard so that when they are downloaded they can be played on a
variety of devices.
But working out the optimal code for each symbol is harder than it might seem
- in fact, no-one could work out an algorithm to compute the best code until a
student called David Huffman did it in 1951, and his achievement was
impressive enough that he was allowed to pass his course without sitting the
final exam.
But let's start with a very simple textual example. This example language uses
only 4 different characters, and yet is incredibly important to us: it's the
language used to represent DNA, which is made up of sequences of four
characters A, C, G and T. For example, the 4.6 million characters representing
an E.coli DNA sequence happens to start with:
agcttttcattct
a: 00
c: 01
g: 10
t: 11
The 13 characters above would be written using 26 bits as follows - notice that
we don't need gaps between the codes for each character.
00100111111111010011110111
But we can do better than this. In the short sample text above the letter "t" is
more common than the other letters ("t" occurs 7 times, "c" 3 times, "a" twice,
and "g" just once). If we give a shorter code to "t" then 54% of the time (7 out
of 13 characters) we'd be using less space. For example, we could use the
codes:
a: 010
c: 00
g: 011
t: 1
0100110011110001011001
This new code can still be decoded even though the lengths are different. For
example, try to decode the following bits using the code we were just using.
The main thing is to start at the first bit on the left, and match up the codes
from left to right:
111001
The sequence of bits 111001 decodes to "tttct". Starting at the left, the
first bit is a 1, which only starts a "t". There are two more of these, and
then we encounter a 0. This could start any of the other three characters,
but because it is followed by another 0, it can only represent "c". This
leaves a 1 at the end, which is a "t".
But is the code above the best possible code for these characters? (As it
happens, this one is optimal for this case.) And how can we be sure the codes
can be decoded? For example, if we just reduced the length for "t" like this:
a: 00, c: 01, g: 10, t: 1 -- try decoding the message "11001".
For example, the code we used above (and repeated here) corresponds to the
tree shown below.
a: 010
c: 00
g: 011
t: 1
To decode something using this structure (e.g. the code
0100110011110001011001 above), start at the top, and choose a branch
based on each successive bit in the coded file. The first bit is a 0, so we follow the
left branch, then the 1 branch, then the 0 branch, which leads us to the letter
"a". After each letter is decoded, we start again at the top. The next few bits
are 011..., and following these labels from the start takes us to "g", and so on.
The tree makes it very easy to decode any input, and there's never any
confusion about which branch to follow, and therefore which letter to decode
each time.
The shape of the tree will depend on how common each symbol is. In the
example above, "t" is very common, so it is near the start of the tree, whereas
"a" and "g" are three branches along the tree (each branch corresponds to a
bit).
Huffman's algorithm for building the tree would work like this.
First, we count how often each character occurs (or we can work out its
probability):
a: 2 times
c: 3 times
g: 1 time
t: 7 times
We build the tree from the bottom by finding the two characters that have the
smallest counts ("a" and "g" in this example). These are made to be a branch at
the bottom of the tree, and at the top of the branch we write the sum of their
two values (2+1, which is 3). The branches are labelled with a 0 and 1
respectively (it doesn't matter which way around you do it).
We then forget about the counts for the two characters just combined, but we
use the combined total to repeat the same step: the counts to choose from are
3 (for the combined total), 3 (for "c"), and 7 (for "t"), so we combine the two
smallest values (3 and 3) to make a new branch:
This leaves just two counts to consider (6 and 7), so these are combined to
form the final tree:
You can then read off the codes for each character by following the 0 and 1
labels from top to bottom, or you could use the tree directly for coding.
If you look at other textbooks about Huffman coding, you might find English
text used as an example, where letters like "e" and "t" get shorter codes while
"z" and "q" get longer ones. As long as the codes are calculated using
Huffman's method of combining the two smallest values, you'll end up with the
optimal code.
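Huffman's combining process is short enough to sketch in Python. This sketch
keeps, alongside each count, the codes built so far for the symbols under that
part of the tree, and adds a 0 or 1 to the front of them each time two
branches are combined:

import heapq
from collections import Counter

def huffman_codes(text):
    # One heap entry per symbol: (count, tiebreaker, codes built so far).
    heap = [(count, i, {symbol: ''})
            for i, (symbol, count) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Take the two smallest counts and combine them into one branch.
        count0, _, codes0 = heapq.heappop(heap)
        count1, _, codes1 = heapq.heappop(heap)
        merged = {symbol: '0' + code for symbol, code in codes0.items()}
        merged.update({symbol: '1' + code for symbol, code in codes1.items()})
        heapq.heappush(heap, (count0 + count1, next_id, merged))
        next_id += 1
    return heap[0][2]

print(huffman_codes("agcttttcattct"))
# Gives codes with the same lengths as above (t: 1 bit, c: 2 bits,
# a and g: 3 bits each), though some 0/1 labels may be swapped.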
Huffman trees aren't built manually - in fact, a Huffman tree is built every
time you take a photo as a JPG, or zip a file, or record a video. You can generate
your own Huffman Trees using the interactive below. Try some different texts,
such as one with only two different characters; one where all the characters are
equally likely; and one where one character is way more likely than the others.
In practice Huffman's code isn't usually applied to letters, but to things like the
lengths of run length codes (some lengths will be more common than others),
or the match length of a pointer for a Ziv-Lempel code (again, some lengths will
be more common than others), or the parameters in a JPEG or MP3 file. By
using a Huffman code instead of a simple binary code, these methods get just a
little more compression for the data.
As an experiment, try calculating a Huffman code for the four letters a, b, c and
d, for each of the following: "abcddcbaaabbccddcbdaabcd" (every letter is
equally likely), and "abaacbaabbbbaabbaacdadcd" ("b" is much more common).
The first one will use two bits for each character; since there are 24
characters in total, it will use 48 bits in total to represent all of the
characters.
In contrast, the second tree uses just 1 bit for the character "a", 2 bits for
"b", and 3 bits for both "c" and "d". Since "a" occurs 10 times, "b" 8 times,
and "c" and "d" both occur 3 times, that's a total of 10x1 + 8x2 + 3x3 +
3x3 = 44 bits. That's an average of about 1.83 bits for each character, compared
with 2 bits for each character if you used a simple code or were assuming
that they are all equally likely.
This shows how it is taking advantage of one character being more likely
than another. With more text a Huffman code can usually get even better
compression than this.
The examples above used letters of the alphabet, but notice that we
referred to them as "symbols". That's because the value being coded could
be all sorts of things: it might be the colour of a pixel, a sample value from
a sound file, or even a reading such as the status of a thermostat.
As an extreme example, here's a Huffman tree for a dice roll. You'd expect
all 6 values to be equally likely, but because of the nature of the tree, some
values get shorter codes than others. You can work out the average
number of bits used to record each dice roll, since 2/6 of the time it will be
2 bits, and 4/6 of the time it will be 3 bits. The average is 2/6 x 2 + 4/6 x 3,
which is 2.67 bits per roll.
Another thing to note from this is that there are some arbitrary choices in
how the tree was made (e.g. the 4 value might have been given 2 bits and
the 6 value might have been given 3 bits), but the average number of bits
will be the same.
7.4. Lossy vs Lossless compression
Because the compressed representation of the image can be converted back to
the original representation exactly, giving the same image when read by a
computer, this compression algorithm is called lossless: none of the data was
lost in compressing the image, and as a result the compression can be undone
exactly.
Not all compression algorithms are lossless though. In some types of files, in
particular photos, sound, and videos, we are willing to sacrifice a little bit of the
quality (i.e. lose a little of the data representing the image) if it allows us to
make the file size a lot smaller. For downloading very large files such as movies,
this can be essential to ensure the file size is not so big that it is infeasible to
download! These compression methods are called lossy. If some of the data is
lost, it is impossible to convert the file back to exactly the original form when
lossy compression was used, but the person viewing the movie or listening to
the music may not mind the lower quality if the files are smaller. Later in this
chapter, we will investigate the effects some lossy compression algorithms
have on images and sound.
Interestingly, it turns out that any lossless compression algorithm will have
cases where the compressed version of the file is larger than the uncompressed
version! Computer scientists have even proven this to be the case, meaning it
is impossible for anybody to ever come up with a lossless compression
algorithm that makes all possible files smaller. In most cases this isn’t an issue
though, as a good lossless compression algorithm will tend to give the best
compression on common patterns of data, and the worst compression on ones
that are highly unlikely to occur.
What is the image with the best compression (i.e. an image that has a size
that is a very small percentage of the original) that you can come up with?
This is the best case performance for this compression algorithm.
What about the worst compression? Can you find an image that actually
has a larger compressed representation? (Don’t forget the commas in the
version we used!) This is the worst case performance for this compression
algorithm.
The best case above is when the image is entirely white (only one number
is used per line). The worst case is when every pixel is alternating black
and white, so there's one number for every pixel. In fact, in this case the
size of the compressed file is likely to be a little larger than the original one
because the numbers are likely to take more than one bit to store. Real
systems don't represent the data exactly as we've discussed here, but the
issues are the same.
In the worst case (with alternating black and white pixels) the run length
encoding method will result in a file that's larger than the original! As noted
above, every lossless compression method that makes at least one file
smaller must also have some files that it makes larger --- it's not
mathematically possible to have a method that always makes files smaller
unless the method is lossy. As a trivial example, suppose someone claims
to have a compression method that will convert any 3-bit file into a 2-bit
file. How many different 3-bit files are there? (There are 8.) How many
different 2-bit files are there? (There are 4.) Can you see the problem?
We've got 8 possible files that we might want to compress, but only 4 ways
to represent them. So some of them will have identical representations,
and can't be decoded exactly.
Over the years there have been several frauds based on claims of a
lossless compression method that will compress every file that it is given.
This can only be true if the method is lossy (loses information); all lossless
methods must expand some files. It would be nice if all files could be
compressed without loss; you could compress a huge file, then apply
compression to the compressed file, and make it smaller again, repeating
this until it was only one byte --- or one bit! Unfortunately, this isn't
possible.
In the data representation section we looked at how the size of an image file
can be reduced by using fewer bits to describe the colour of each pixel.
However, image compression methods such as JPEG take advantage of patterns
in the image to reduce the space needed to represent it, without impacting the
image unnecessarily.
The following three images show the difference between reducing bit depth and
using a specialised image compression system. The left hand image is the
original, which was 24 bits per pixel. The middle image has been compressed to
one third of the original size using JPEG; while it is a "lossy" version of the
original, the difference is unlikely to be perceptible. The right hand one has had
the number of colours reduced to 256, so there are 8 bits per pixel instead of
24, which means it is also stored in a third of the original size. Even though it
has lost just as many bits, the information removed has had much more impact
on how it looks. This is the advantage of JPEG: it removes information in the
image that doesn't have so much impact on the perceived quality. Furthermore,
with JPEG, you can choose the tradeoff between quality and file size.
Reducing the number of bits (the colour depth) is sufficiently crude that we
don't really regard it as a compression method, but just a low quality
representation. Image compression methods like JPEG, GIF and PNG are
designed to take advantage of the patterns in an image to get a good reduction
in file size without losing more quality than necessary.
For example, the following image shows a zoomed in view of the pixels that are
part of the detail around an eye from the above (high quality) image.
Notice that the colours in adjacent pixels are often very similar, even in this
part of the picture that has a lot of detail. For example, the pixels shown in the
red box below just change gradually from very dark to very light.
Run-length encoding wouldn't work in this situation. You could use a variation
that specifies a pixel's colour, and then says how many of the following pixels
are the same colour, but although most adjacent pixels are nearly the same,
the chances of them being identical are very low, and there would be almost no
runs of identical colours.
But there is a way to take advantage of the gradually changing colours. For the
pixels in the red box above, you could generate an approximate version of
those colours by specifying just the first and last one, and getting the computer
to calculate the ones in between assuming that the colour changes gradually
between them. Instead of storing 5 pixel values, only 2 are needed, yet
someone viewing it probably wouldn't notice any difference. This would be
lossy because you can't reproduce the original exactly, but it would be good
enough for a lot of purposes, and save a lot of space.
The process of guessing the colours of pixels between two that are known
is an example of interpolation. A linear interpolation assumes that the
values increase at a constant rate between the two given values; for
example, for the five pixels above, suppose the first pixel has a blue colour
value of 124, and the last one has a blue value of 136, then a linear
interpolation would guess that the blue values for the ones in between are
127, 130 and 133, and this would save storing them. In practice, a more
complex approach is used to guess what the pixels are, but linear
interpolation gives the idea of what's going on.
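A linear interpolation is simple to sketch in Python; for the example above it
recovers the three middle values exactly:

def interpolate(first, last, count):
    # Assume the values rise at a constant rate from first to last.
    step = (last - first) / (count - 1)
    return [round(first + step * i) for i in range(count)]

print(interpolate(124, 136, 5))   # [124, 127, 130, 133, 136]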
The JPEG system, which is widely used for photos, uses a more sophisticated
version of this idea. Instead of taking a 5 by 1 run of pixels as we did above, it
works with 8 by 8 blocks of pixels. And instead of estimating the values with a
linear function, it uses combinations of cosine waves.
A cosine wave comes from the trig function that is often used for
calculating the sides of a triangle. If you plot the cosine value from 0 to 180
degrees, you get a smooth curve going from 1 to -1. Variations of this plot
can be used to approximate the value of pixels, going from one colour to
another. If you add in a higher frequency cosine wave, you can produce
interesting shapes. In theory, any pattern of pixels can be created by
adding together different cosine waves!
The following graph shows the values of a cosine wave and a higher frequency
cosine wave, for angles ranging from 0 to 180 degrees.
JPEGs (and MP3) are based on the idea that you can add together lots of
sine or cosine waves to create any waveform that you want. Converting a
waveform for a block of pixels or sample of music into a sum of simple
waves can be done using a technique called a Fourier transform, and is a
widely used idea in signal processing.
You can experiment with adding sine waves together to generate other
shapes using the spreadsheet provided. In this spreadsheet, the yellow
region on the first sheet allows you to choose which sine waves to add. Try
setting the 4 sine waves to frequencies that are 3, 9, 15, and 21 times the
fundamental frequency respectively (the "fundamental" is the lowest
frequency.) Now set the "amplitude" (equivalent to volume level) of the
four to 0.5, 0.25, 0.125 and 0.0625 respectively (each is half of the
previous one). This should produce the following four sine waves:
When the above four waves are added together, they interfere with each
other, and produce a shape that has sharper transitions:
In fact, if you were to continue the pattern with more than four sine waves,
this shape would become a "square wave", which is one that suddenly goes
to the maximum value, and then suddenly to the minimum. The one shown
above is bumpy because we've only used 4 sine waves to describe it.
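You can reproduce the same experiment without a spreadsheet; this Python
sketch adds together the four sine waves with the frequencies and amplitudes
suggested above (plotting the samples shows the bumpy, square-ish shape):

import math

# (frequency multiple, amplitude) for each of the four sine waves.
waves = [(3, 0.5), (9, 0.25), (15, 0.125), (21, 0.0625)]

def combined(t):
    # Add together the value of every sine wave at time t.
    return sum(a * math.sin(2 * math.pi * f * t) for f, a in waves)

samples = [combined(t / 100) for t in range(100)]   # sample t from 0 to 1
print(min(samples), max(samples))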
This is exactly what is going on in JPEG if you compress a black and white
image. The "colour" of pixels as you go across the image will either be 0
(black) or full intensity (white), but JPEG will approximate it with a small
number of cosine waves (which have basically the same properties as sine
waves.) This gives the "overshoot" that you see in the image above; in a
JPEG image, this comes out as bright and dark patches surrounding the
sudden change of colour, like here:
The cosine waves used for JPEG images are based on a "Discrete Cosine
Transform". The "Discrete" means that the waveform is digital – it is the
opposite of continuous, where any value can occur. In a JPEG wave, there
are only 8 x 8 values (for the block being coded), and each of those values
can have a limited range of numbers (binary integers), rather than any
value at all.
There is no optimal quantisation table for every image, but many camera or
image processing companies have worked to develop very good quantisation
tables. As a result, many are kept secret. Some companies have also developed
software to analyse images and select the most appropriate quantisation table
for the particular image. For example, for an image with text in it, high
frequency detail is important, so the quantisation table should have lower
values in the bottom right so more detail is kept. Of course, this will also result
in the image size remaining relatively large. Lossy compression is all about
compromise!
The figure below shows an image before and after it has had quantisation
applied to it.
Before Quantisation:
After Quantisation:
Notice how the images look very similar, even though the second one has
many zero coefficients. The differences we can see will be barely visible when
the image is viewed at its original size.
We still have 64 numbers even with the many zeros, so how do we save space
when storing the zeros? You will notice that the zeros are bunched towards the
bottom right. This means if we list the coefficients in a zig-zag, starting from
the top left corner, we will end up with many zeros in a row. Instead of writing
20 zeros we can store the fact that there are 20 zeros using a method of run-
length encoding very similar to the one discussed earlier in this chapter.
And finally, the numbers that we are left with are converted to bits using
Huffman coding, so that more common values take less space and vice versa.
All those things happen every time you take a photo and save it as a JPEG file,
and it happens to every 8 by 8 block of pixels. When you display the image, the
software needs to reverse the process, adding all the basis functions together
for each block - and there will be hundreds of thousands of blocks for each
image.
For this reason, JPEG is used for photos and natural images, but other
techniques (such as GIF and PNG, which we will look at in another section) work
better for artificial images like this one.
The following interactive allows you to explore this idea. The empty boxes have
been replaced with a reference to the text occurring earlier. You can click on a
box to see where the reference is, and you can type the referenced characters
in to decode the text. What happens if a reference is pointing to another
reference? As long as you decode them from first to last, the information will be
available before you need it.
You can also enter your own text by clicking on the "Text" tab. You could paste
in some text of your own to see how many characters can be replaced with
references.
The references are actually two numbers: the first says how many characters to
count back to where the previous phrase starts, and the second says how long
the referenced phrase is. Each reference typically takes about the space of one
or two characters, so the system makes a saving as long as two characters are
replaced. The options in the interactive above allow you to require the replaced
length to be at least two, to avoid replacing a single character with a reference.
Of course, all characters count, not just letters of the alphabet, so the system
can also refer back to the white spaces between words. In fact, some of the
most common sequences are things like a full stop followed by a space.
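To get a feel for how a compressor finds these references, here is a toy
Python sketch (a real Ziv-Lempel implementation is far more efficient, but the
idea is the same): at each position it looks for the longest match in the text
seen so far, and outputs either a (distance back, length) pair or a literal
character:

def compress(text, min_length=2):
    # Output a mix of literal characters and (distance back, length)
    # references to text that has already been seen.
    output = []
    position = 0
    while position < len(text):
        best_start, best_length = 0, 0
        for start in range(position):
            length = 0
            while (position + length < len(text)
                   and text[start + length] == text[position + length]):
                length += 1
            if length > best_length:
                best_start, best_length = start, length
        if best_length >= min_length:
            output.append((position - best_start, best_length))
            position += best_length
        else:
            output.append(text[position])
            position += 1
    return output

print(compress("a rose is a rose is a rose"))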
This approach also works very well for black and white images, since
sequences like "10 white pixels" are likely to have occurred before. Here are
some of the bits from the example earlier in this chapter; you can paste them
into the interactive above to see how many pointers are needed to represent it.
011000010000110
100000111000001
000001111100000
000011111110000
000111111111000
001111101111100
011111000111110
111110000011111
In fact, this is essentially what happens with GIF and PNG images; the pixel
values are compressed using the Ziv-Lempel algorithm, which works well if you
have lots of consecutive pixels the same colour. But it works very poorly with
photographs, where pixel patterns are very unlikely to be repeated.
Curiosity: ZL or LZ compression?
The method is named after Jacob Ziv and Abraham Lempel, who published it, so
strictly speaking it ought to be called "ZL" compression; nevertheless it is
almost universally referred to as "LZ" (or "Lempel-Ziv") compression.
The name "mp3" isn't very self explanatory because the "mp" stands for
"moving picture", and the 3 is from version 1, but mp3 files are used for
music!
The full name of the standard that it comes from is MPEG, and the missing
"EG" stands for "experts group", which was a consortium of companies and
researchers that got together to agree on a standard so that people could
easily play the same videos on different brands of equipment (so, for
example, you could play the same DVD on any brand of DVD player). The
very first version of their standards (called MPEG-1) had three methods of
storing the sound track (layer 1, 2 and 3). One of those methods (MPEG-1
layer 3) became very popular for compressing music, and was abbreviated
to MP3.
The MPEG-1 standard isn't used much for video now (for example, DVDs
and TV mainly use MPEG-2), but it remains very important for audio coding.
The next MPEG version is MPEG-4 (MPEG-3 was redundant before it became
a standard). MPEG-4 offers higher quality video, and is commonly used for
digital video files, streaming media, Blu-Ray discs and some broadcast TV.
The AAC audio compression method, used by Apple among others, is also
from the MPEG-4 standard. On computers, MPEG-4 Part 14 is commonly
used for video, and it's often abbreviated as "MP4."
So there you have it: MP3 stands for "MPEG-1 layer 3", and MP4 stands for
"MPEG-4 part 14".
Most other audio compression methods use a similar approach to the MP3
method, although some offer better quality for the same amount of storage (or
less storage for the same quality). We won't go into exactly how this works, but
the general idea is to break the sound down into bands of different frequencies,
and then represent each of those bands by adding together the values of a
simple formula (the sum of cosine waves, to be precise).
There is some more detail about how MP3 coding works on the cs4fn site, and
also in an article on the I Programmer site.
Other audio compression systems that you might come across include AAC,
ALAC, Ogg Vorbis, and WMA. Each of these has various advantages over others,
and some are more compatible or open than others.
The main questions with compressed audio are how small the file can be made,
and how good the quality sounds to the human ear. (There is also the question of
how long it takes to encode the file, which might affect how useful the system
is.) The tradeoff between quality and size of audio files can depend on the
situation you're in: if you are jogging and listening to music then the quality
may not matter so much, and it's useful to reduce the space needed to store it.
On the other hand, someone listening to a recording at home on a good sound
system might not mind the music taking up more storage space, as long
as the quality is high.
Compress each of your recordings using a variety of methods, making sure that
each compressed file is created from a high quality original. Make a table
showing how long it took to process each recording, the size of the compressed
file, and some evaluation of the quality of the sound compared with the
original. Discuss the tradeoffs involved – do you need much bigger files to store
good quality sound? Is there a limit to how small you can make a file and still
have it sounding ok? Do some methods work better for speech than others?
Does a 2 minute recording of silence take more space than a 1 minute
recording of silence? Does a 1 minute recording of music use more space than
a minute of silence?
Questions like "what is the most compression that can be achieved" are
addressed by the field of information theory. There is an activity on information
theory on the CS Unplugged site, and there is a fun activity that illustrates
information theory. Based on this theory, it seems that English text can't be
compressed to less than about 12% of its original size at the very best. Images,
sound and video can get much better compression because they can use lossy
compression, and don't have to reproduce the original data exactly.
A big issue with encryption systems is people who want to break into them and
decrypt messages without the key (which is some secret value or setting that
can be used to unlock an encrypted file). Some systems that were used many
years ago were discovered to be insecure because of attacks, so could no
longer be used. It is possible that somebody will find an effective way of
breaking into the widespread systems we use these days, which would cause a
lot of problems.
Like all technologies, encryption can be used for good and bad purposes. A
human rights organisation might use encryption to secretly send photographs
of human rights abuse to the media, while drug traffickers might use it to avoid
having their plans read by investigators. Understanding how encryption works
and what is possible can help to make informed decisions around things like
freedom of speech, human rights, tracking criminal activity, personal privacy,
identity theft, online banking and payments, and the safety of systems that
might be taken over if they were "hacked into".
There are various words that can be used to refer to trying to get the
plaintext from a ciphertext, including decipher, decrypt, crack, and
cryptanalysis. Often the process of trying to break cryptography is referred
to as an "attack". The term "hack" is also sometimes used, but it has other
connotations, and is only used informally.
Once you have figured out what the text says, make a table with the letters of
the alphabet in order and then write the letter they are represented with in the
cipher text. You should notice an interesting pattern.
Given how easily broken this cipher is, you probably don't want your bank
details encrypted with it. In practice, far stronger ciphers are used, although for
now we are going to look a little bit further at Caesar Cipher, because it is a
great introduction to the many ideas in encryption.
For this example, we say the key is 10 because keys in Caesar Cipher are a
number between 1 and 25 (think carefully about why we wouldn't want a key of
26!), which specifies how far the alphabet should be rotated. If instead we used a
key of 8, the conversion table would be as follows.
In a Caesar Cipher, the key represents how many places the alphabet
should be rotated. In the examples above, we used keys of "8" and "10".
More generally though, a key is simply a value that is required to do the
math for the encryption and decryption. While Caesar Cipher only has 25
possible keys, real encryption systems have an incomprehensibly large
number of possible keys, and preferably use keys which contain hundreds
or even thousands of binary digits. Having a huge number of different
possible keys is important, because it would take a computer less than a
second to try all 25 Caesar Cipher keys.
Because we know that the key is 6, we can subtract 6 places off each character
in the ciphertext. For example, the letter 6 places before "Z" is "T", 6 places
before "N" is "H", and 6 places before "K" is "E". From this, we know that the
first word must be "THE". Going through the entire ciphertext in this way, we
can eventually get the plaintext of:
The interactive above can do this process for you. Just put the ciphertext into
the box on the right, enter the key, and tell it to decrypt. You should ensure you
understand how to encrypt messages yourself though!
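If you'd like to see the process as a program, here is a minimal Python sketch of the Caesar cipher for uppercase text (a negative key shifts backwards, which is the same as decrypting):

    # Rotate each letter of an uppercase message by the key.
    def caesar(text, key):
        result = ""
        for ch in text:
            if ch.isalpha():
                result += chr((ord(ch) - ord("A") + key) % 26 + ord("A"))
            else:
                result += ch                    # leave spaces and punctuation alone
        return result

    print(caesar("ZNK", -6))                    # THE: shifting back 6 places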
Challenge 1
Challenge 2
We would start by working out that the letter that is 7 places ahead of "H" is "O", 7
places ahead of "O" is "V", and 7 places ahead of "W" is "D". This means that
the first word of the plaintext encrypts to "OVD" in the ciphertext. Going
through the entire plaintext in this way, we can eventually get the
ciphertext of:
Challenge 1
Challenge 2
So far, we have considered one way of cracking Caesar cipher: using patterns
in the text. By looking for patterns such as one letter words, other short words,
double letter patterns, apostrophe positions, and knowing rules such as all
words must contain at least one of a, e, i, o, u, or y (excluding some acronyms
and words written in txt language of course), cracking Caesar Cipher by looking
for patterns is easy. Any good cryptosystem should not be able to be analysed
in this way, i.e. it should be semantically secure.
There are many other ways of cracking Caesar cipher which we will look at in
this section. Understanding various common attacks on ciphers is important
when looking at sophisticated cryptosystems which are used in practice.
The following interactive will help you analyse a piece of text by counting up
the letter frequencies. You can paste in some text to see which are the most
common (and least common) characters.
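Counting frequencies is also easy to do in a few lines of Python, using the built-in Counter; here is a sketch using the first challenge ciphertext from below:

    from collections import Counter

    ciphertext = ("WTGT XH PCDIWTG BTHHPVT IWPI NDJ HWDJAS WPKT CD "
                  "IGDJQAT QGTPZXCV LXIW ATIITG UGTFJTCRN PCPANHXH")
    frequencies = Counter(ch for ch in ciphertext if ch.isalpha())
    print(frequencies.most_common(3))           # T turns out to be the most common
    # If we guess that T stands for E, the key is the distance from E to T:
    print((ord("T") - ord("E")) % 26)           # 15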
The following text has been coded using a Caesar cipher. To try to make sense
of it, paste it into the statistical analyser above.
As the message says, long messages contain a lot of statistical clues. Very
short messages (e.g. only a few words) are unlikely to have obvious statistical
trends. Very long messages (e.g. entire books) will almost always have "E" as
the most common letter. Wikipedia has a list of letter frequencies, which you
might find useful.
Put the ciphertext into the above frequency analyser, guess what the key is
(using the method explained above), and then try using that key with the
ciphertext in the interactive above. Try to guess the key with as few
guesses as you can!
Challenge 1
WTGT XH PCDIWTG BTHHPVT IWPI NDJ HWDJAS WPKT CD IGDJQAT QGTPZXCV
LXIW ATIITG UGTFJTCRN PCPANHXH
Challenge 2
OCDN ODHZ OCZ HZNNVBZ XJIOVDIN GJON JA OCZ GZOOZM O, RCDXC DN OCZ
NZXJIY HJNO XJHHJI GZOOZM DI OCZ VGKCVWZO
Challenge 3
Although in almost all English texts the letter E is the most common letter,
it isn't always. For example, the 1939 novel Gadsby by Ernest Vincent
Wright doesn't contain a single letter E (this is called a lipogram).
Furthermore, the text you're attacking may not be English. During World
War 1 and 2, the US military had many Native American Code talkers
translate messages into their own language, which provided a strong layer
of security at the time.
A slightly stronger cipher than the Caesar cipher is the Vigenere cipher,
which is created by using multiple Caesar ciphers, where there is a key
phrase (e.g. "acb"), and each letter in the key gives the offset (in the
example this would be 1, 3, 2). These offsets are repeated to give the
offset for encoding each character in the plaintext.
Attacking the Vigenere cipher by trying every possible key is hard because
there are a lot more possible keys than for the Caesar cipher, but a
statistical attack can work quite quickly. The Vigenere cipher is known as a
polyalphabetic substitution cipher, since it uses multiple substitution
rules.
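As a sketch (assuming, as in the example above, that the key letters a, b, c, ... give offsets 1, 2, 3, ...), the Vigenere cipher can be written in a few lines of Python:

    # Encrypt uppercase text with a repeating key of lowercase letters.
    def vigenere(plaintext, key):
        result = ""
        for i, ch in enumerate(plaintext):
            offset = ord(key[i % len(key)]) - ord("a") + 1
            result += chr((ord(ch) - ord("A") + offset) % 26 + ord("A"))
        return result

    print(vigenere("HELLO", "acb"))             # IHNMR, using offsets 1, 3, 2, 1, 3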
Even if you did not know the key used a simple rotation (not all substitution
ciphers are), you have learnt that A->H, B->I, M->T, X->E, and K->R. This goes
a long way towards deciphering the message. Filling in the letters you know,
you would get:
By using the other tricks above, there are a very limited number of possibilities
for the remaining letters. Have a go at figuring it out.
For this reason, it is essential for any good cryptosystem to not be breakable,
even if the attacker has pieces of plaintext along with their corresponding
ciphertext to work with. For this, the cryptosystem should give different
ciphertext each time the same plaintext message is encrypted. It may initially
sound impossible to achieve this, but there are several clever techniques
used by real cryptosystems.
This increases the number of possible keys, and thus reduces the risk of a
brute force attack. A can be substituted for any of the 26 letters in the
alphabet, B can then be substituted for any of the 25 remaining letters (26
minus the letter already substituted for A), C can then be substituted for
any of the 24 remaining letters…
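Multiplying all those choices together gives 26 factorial possible keys, which Python can compute exactly:

    import math
    print(math.factorial(26))   # 403291461126605635584000000, about 4 x 10^26 keys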
EIJUDJQJYEKI
These days encryption keys are normally numbers that are 128 bits or longer.
You could calculate how long it would take to try out every possible 128 bit
number if a computer could test a million every second (including testing if
each decoded text contains English words). It will eventually crack the
message, but after the amount of time it would take, it's unlikely to be useful
anymore – and the user of the key has probably changed it!
In fact, if we analyse it, a 128 bit key at 1,000,000 per second would take
10,790,283,070,000,000,000,000,000 years to test. Of course, it might find
something in the first year, but the chances of that are ridiculously low, and it
would be more realistic to hope to win the top prize in Lotto three times
consecutively (and you'd probably get more money). On average, it will take
around half that amount, i.e. a bit more than
5,000,000,000,000,000,000,000,000 years. Even if you get a really fast
computer that can check one trillion keys a second (rather unrealistic in
practice), it would still take around 5,000,000,000,000,000,000 years. Even if
you could get one million of those computers (even more unrealistic in
practice), it would still take around 5,000,000,000,000 years.
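If you'd rather check the arithmetic with a few lines of Python (using the same assumptions as above: a 128 bit key and a million keys tested per second), here is a sketch:

    keys = 2 ** 128
    rate = 1_000_000                            # keys tested per second
    seconds_per_year = 60 * 60 * 24 * 365
    years = keys / rate / seconds_per_year
    print(f"{years:.3e}")                       # about 1.08e+25 years to try every key
    print(f"{years / 2:.3e}")                   # about 5.4e+24 years on average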
And even if you did have the hardware that was considered above, then people
would start using bigger keys. Every bit added to the key will double the
number of years required to guess it. Just adding an extra 15 or 20 bits to the
key in the above example will safely push the time required back to well
beyond the expected life span of the Earth and Sun! This is how real
cryptosystems protect themselves from brute force attacks. Cryptography
relies a lot on low probabilities of success.
The calculator below can handle really big numbers. You can double check our
calculations above if you want! Also, work out what would happen if the key
size was doubled (i.e. 256 bits), or if a 1024 or 2048 bit key (common these
days) was used.
Brute force attacks try out every possible key, and the number of possible
keys grows exponentially as the key gets longer. As we saw above, no
modern computer system could try out all possible 128 bit key values in a
useful amount of time, and even if it were possible, adding just one more
bit would double how long it would take.
This guide has a whole chapter about tractability, where you can explore
these issues further.
The main terminology you should be familiar with now is that a plaintext is
encrypted by a cipher to create a ciphertext using an encryption key.
Someone without the encryption key who wants to attack the cipher could
try various approaches, including a brute force attack (trying out all
possible keys), a frequency analysis attack (looking for statistical patterns),
and a known plaintext attack (matching some known text with the cipher to
work out the key).
If you were given an example of a simple cipher being used, you should be
able to talk about it using the proper terminology.
There are good solutions to these problems that are regularly used --- in fact,
you probably use them online already, possibly without even knowing! We'll
begin by looking at systems that allow people to decode secret messages
without even having to be sent the key!
Remember that Alice and Bob might be in different countries, and can only
communicate through the internet. This also rules out Alice simply passing Bob
the key in person.
So assuming that someone can observe all the bits being transmitted and
received from your computer is pretty reasonable.
Distributing keys physically is very expensive, and up to the 1970s large sums
of money were spent physically sending keys internationally. Systems like this
are called symmetric encryption, because Alice and Bob both need an identical
copy of the key. The breakthrough was the realisation that you could make a
system that used different keys for encoding and decoding. We will look further
at this in the next section.
Additionally, there's a video illustrating how public key systems work using
a padlock analogy.
Watch the video online at https://www.youtube.com/watch?v=a72fHRr6MRU
It's like giving out padlocks to all your friends, so anyone can lock a box and
send it to you, but if you have the only (private) key, then you are the only
person who can open the boxes. Once your friend locks a box, even they can't
unlock it. It's really easy to distribute the padlocks. Public keys are the same –
you can make them completely public – often people put them on their website
or attach them to all emails they send. That's quite different to having to hire a
security firm to deliver them to your colleagues.
Public key encryption is very heavily used for online commerce (such as
internet banking and credit card payment) because your computer can set up a
connection with the business or bank automatically using a public key system
without you having to get together in advance to set up a key. Public key
systems are generally slower than symmetric systems, so the public key
system is often used to then send a new key for a symmetric system just once
per session, and the symmetric key can be used from then on with a faster
symmetric encryption system.
A very popular public key system is RSA. For this section on public key systems,
we will use RSA as an example.
To ensure you understand, try encrypting a short message with your public
key. In the next section, there is an interactive that you can then use to
decrypt the message with your private key.
Despite even your enemies knowing your public key (as you publicly
announced it), they cannot use it to decrypt your messages which were
encrypted using the public key. You are the only one who can decrypt
messages, as that requires the private key, which hopefully only you have
access to.
Note that this interactive’s implementation of RSA is just for demonstrating the
concepts here and is not quite the same as the implementations used in live
encryption systems.
If you were asked to multiply the following two big prime numbers, you
might find it a bit tiring to do by hand (although it is definitely achievable),
and you could get an answer in a fraction of a second using a computer.
9739493281774982987432737457439209893878938489723948984873298423989898398696987090
3498372473234549852367394893403202898485093868948989658677273900243088492048950834
If on the other hand you were asked which two prime numbers were
multiplied to get the following big number, you’d have a lot more trouble!
(If you do find the answer, let us know! We’d be very interested to hear
about it!)
3944604857329435839271430640488525351249090163937027434471421629606310815805347209
So why is it that despite these two problems being similar, one of them is
“easy” and the other one is “hard”? Well, it comes down to the algorithms
we have to solve each of the problems.
You have probably done long multiplication in school by making one line for
each digit in the second number and then adding all the rows together. We
can analyse the speed of this algorithm, much like we did in the algorithms
chapter for sorting and searching. Assuming that each of the two numbers
has the same number of digits, which we will call n (“Number of digits”),
we need to write n rows. For each of those n rows, we will need to do
around n multiplications. That gives us n x n = n^2 little multiplications. We need
to add the n rows together at the end as well, but that doesn’t take long so
let's ignore that part. We have determined that the number of small
multiplications needed to multiply two big numbers is approximately the
square of the number of digits. So for two numbers with 1000 digits, that’s
1,000,000 little multiplication operations. A computer can do that in less
than a second! If you know about Big-O notation, this is an O(n^2) algorithm,
where n is the number of digits. Note that some slightly better algorithms
have been designed, but this estimate is good enough for our purposes.
For the second problem, we’d need an algorithm that could find the two
numbers that were multiplied together. You might initially say, why can’t
we just reverse the multiplication? The reverse of multiplication is division,
so can’t we just divide to get the two numbers? It’s a good idea, but it
won’t work. For division we need to know the big number, and one of the
small numbers we want to divide into it, and that will give us the other
small number. But in this case, we only know the big number. So it isn’t a
straightforward long division problem at all! It turns out that there is no
known fast algorithm to solve the problem. One way is to just try dividing
by every number that is less than the number (well, we only need to go up
to the square root, but that doesn’t help much!) There are still billions of
billions of billions of numbers we need to check. Even a computer that
could check 1 billion possibilities a second isn’t going to help us much with
this! If you know about Big-O notation, this is an O(10^n) algorithm, where n
is the number of digits -- even small numbers of digits are just too much to
deal with! There are slightly better solutions, but none of them shave off
enough time to actually be useful for problems of the size of the one
above!
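Here is a minimal sketch of the trial division idea just described; it works fine for small numbers, but for numbers with hundreds of digits the loop would run longer than the life of the sun:

    # Find a factor by trying every divisor up to the square root.
    def find_factor(n):
        divisor = 2
        while divisor * divisor <= n:
            if n % divisor == 0:
                return divisor, n // divisor
            divisor += 1
        return None                             # no factor found: n is prime

    print(find_factor(91))                      # (7, 13)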
Curiosity: Encrypting with the private key instead of the public key ---
Digital Signatures!
In order to encrypt a message, the public key is used. In order to decrypt it,
the corresponding private key must be used. But what would happen if the
message was encrypted using the private key? Could you then decrypt it
with the public key?
Initially this might sound like a pointless thing to do --- why would you
encrypt a message that can be decrypted using a key that everybody in
the world can access!?! It turns out that indeed, encrypting a message with
the private key and then decrypting it with the public key works, and it has
a very useful application.
The only person who is able to encrypt the message using the private key
is the person who owns the private key. The public key will only decrypt the
message if the private key that was used to encrypt it actually is the public
key’s corresponding private key. If the message can’t be decrypted, then it
could not have been encrypted with that private key. This allows the sender
to prove that the message actually is from them, and is known as a digital
signature.
You could check that someone is the authentic private key holder by giving
them a phrase to encrypt with their private key. You then decrypt it with
the public key to check that they were able to encrypt the phrase you gave
them.
This has the same function as a physical signature, but is more reliable
because it is essentially impossible to forge. Some email systems use this
so that you can be sure an email came from the person who claims to be
sending it.
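You can demonstrate the idea with a toy example. The numbers below come from a well-known textbook RSA example with the tiny primes p = 61 and q = 53; real keys use primes hundreds of digits long, so treat this purely as an illustration:

    # Public values (n, e) and the matching private value d.
    n, e, d = 3233, 17, 2753

    message = 65                                # a small number standing in for a message
    signature = pow(message, d, n)              # "encrypt" with the private key
    print(pow(signature, e, n) == message)      # True: anyone can verify with the public key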
• When a user logs in, it must be possible to check that they have entered
the correct password.
• Even if the database is leaked, and the attacker has huge amounts of
computing power...
• The database should not give away obvious information, such as
password lengths, users who chose the same passwords, letter
frequencies of the passwords, or patterns in the passwords.
• At the very least, users should have several days or weeks to be able to
change their password before the attacker cracks it. Ideally, it should
not be possible for them to ever recover the passwords.
• There should be no way of recovering a forgotten password. If the user
forgets their password, it must be reset. Even system administrators
should not have access to a user's password.
Most login systems have a limit to how many times you can guess a password.
This protects all but the poorest passwords from being guessed through a well
designed login form. Suspicious login detection by checking IP address and
country of origin is also becoming more common. However, none of these
application enforced protections are of any use once the attacker has a copy of
the database and can throw as much computational power at it as they want,
without the restrictions the application enforces. This is often referred to as
"offline" attacking, because the attacker can attack the database in their own
time.
In this section, we will look at a few widely used algorithms for secure password
storage. We will then look at a couple of case studies where large databases
were leaked. Secure password storage comes down to using clever encryption
algorithms and techniques, and ensuring users choose effective passwords.
Learning about password storage might also help you to understand the
importance of choosing good passwords and not using the same password
across multiple important sites.
• Each time a specific password is hashed, it should give the same hash.
• Given a specific hash, it should be impossible to efficiently compute what
the original password was.
In the Caesar Cipher section, we talked briefly about brute force attacks.
Brute force attack in that context meant trying every possible key until the
correct one was found.
The following interactive allows you to hash words, such as passwords (but
please don't put your real password into it, as you should never enter your
password on random sites). If you were to enter a well chosen password (e.g. a
random string of numbers and letters), and it was of sufficient length, you could
safely put the hash on a public website, and nobody would be able to
determine what your actual password was.
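You can also compute hashes yourself using Python's standard hashlib module; for example:

    import hashlib

    password = "not my real password"
    print(hashlib.sha256(password.encode("utf-8")).hexdigest())
    # The same input always gives the same 64-hex-digit output, but there is
    # no practical way to work backwards from the output to the input.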
For example, the following database table shows four users of a fictional
system, and the hashes of their passwords. You could determine their
passwords by putting various possibilities through SHA-256 and checking
whether or not the output is equivalent to any of the hashes in the
database.
It might initially sound like we have the perfect system. But unfortunately,
there is still a big problem. You can find rainbow tables online, which are
precomputed lists of common passwords with the values they hash to. In fact, it
isn't too difficult to generate rainbow tables containing all passwords up to a
certain size (this is one reason why using long passwords is strongly
recommended!). This problem can be avoided by choosing a password that isn't
a common word or combination of words.
Hashing is a good start, but we need to further improve our system so that if
two users choose the same password, their hash is not the same, while still
ensuring that it is possible to check whether or not a user has entered the
correct password. The next idea, salting, addresses this issue.
When we said that if the hashed password matches the one in the
database, then the user has to have entered the correct password, we were
not telling the full truth. Mathematically, we know that there have to be
passwords which would hash to the same value. This is because the length
of the output hash has a maximum length, whereas the password length
(or other data being hashed) could be much larger. Therefore, there are
more possible inputs than outputs, so some inputs must have the same
output. When two different inputs have the same output hash, we call it a
collision.
Currently, nobody knows of two unique passwords which hash to the same
value with SHA-256. There is no known mathematical way of finding
collisions, other than hashing many values and then trying to find a pair
which has the same hash. The probability of finding one in this way is
believed to be in the order of 1 in a trillion trillion trillion trillion trillion. With
current computing power, nobody can come even close to this without it
taking longer than the life of the sun and possibly the universe.
Some old algorithms, such as MD5 and SHA-1 were discovered to not be as
immune to finding collisions as was initially thought. It is possible that
there are ways of finding collisions more efficiently than by luck. Therefore,
their use is now discouraged for applications where collisions could cause
problems.
For password storage, collisions aren't really an issue anyway. Chances are,
the password the user selected is somewhat predictable (e.g. a word out of
a dictionary, with slight modifications), and an attacker is far more likely to
guess the original password than one that happens to hash to the same
value as it.
But hashing is used for more than just password storage. It is also used for
digital signatures, which must be unique. For those applications, it is
important to ensure collisions cannot be found.
So now when a user registers, a long random salt value is generated, added to
the end of their password, and the combined password and salt is hashed. The
plaintext salt is stored next to the hash.
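As a rough sketch of the idea in Python (real systems prefer slow, purpose-built functions such as bcrypt or scrypt rather than a single fast SHA-256):

    import hashlib, os

    def hash_new_password(password):
        salt = os.urandom(16).hex()             # a long random salt, stored as text
        hashed = hashlib.sha256((password + salt).encode("utf-8")).hexdigest()
        return salt, hashed                     # both values go into the database

    def check_password(password, salt, stored_hash):
        attempt = hashlib.sha256((password + salt).encode("utf-8")).hexdigest()
        return attempt == stored_hash

    salt, stored = hash_new_password("correct horse battery staple")
    print(check_password("correct horse battery staple", salt, stored))   # True

Now two users who choose the same password end up with different hashes, because their salts differ.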
A common brute force attack is a dictionary attack. This is where the attacker
writes a simple program that goes through a long list of dictionary words, other
common passwords, and all combinations of characters under a certain length.
For each entry in the list, the program adds the salt to the entry and then
hashes to see if it matches the hash they are trying to determine the password
for. Good hardware can check millions, or even billions, of entries a second.
Many user passwords can be recovered in less than a second using a dictionary
attack.
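A dictionary attack on a single salted hash is only a few lines of code; this sketch (with a deliberately tiny made-up word list) shows why predictable passwords are so vulnerable:

    import hashlib

    def dictionary_attack(stored_hash, salt, wordlist):
        for word in wordlist:
            if hashlib.sha256((word + salt).encode("utf-8")).hexdigest() == stored_hash:
                return word
        return None

    salt = "abc123"
    stored = hashlib.sha256(("letmein" + salt).encode("utf-8")).hexdigest()
    print(dictionary_attack(stored, salt, ["password", "123456", "letmein"]))   # letmein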
Unfortunately for end users, many companies keep database leaks very quiet
as it is a huge embarrassment that could cripple the company. Sometimes the
company doesn't know its database was leaked, or has suspicions that it was
but for PR reasons they choose to deny it. In the best case, they might require
you to pick a new password, giving a vague excuse. For this reason, it is
important to use different passwords on every site to ensure that the attacker
does not break into accounts you own on other sites. There are quite possibly
passwords of yours that you think nobody knows, but somewhere in the world
an attacker has recovered it from a database they broke into.
While in theory, encrypting the salts sounds like a good way to add further
security, it isn't as great in practice. We couldn't use a one way hash function
(as we need the salt to check the password), so instead would have to use one
of the encryption methods we looked at earlier which use a secret key to
unlock. This secret key would have to be accessible by the program that checks
password (else it can't get the salts it needs to check passwords!), and we have
to assume the attacker could get hold of that as well. The best security
against offline brute force attacks is good user passwords.
This is why websites have a minimum password length, and often require a mix
of lowercase, uppercase, symbols, and numbers. There are 96 standard
characters you can use in a password: 26 upper case letters, 26 lower case
letters, 10 digits, and 34 symbols. If the password you choose is completely
random (e.g. no words or patterns), then each character you add makes your
password 96 times more difficult to guess. A truly random password of between
8 and 16 characters can provide a very high level of security. Ideally, this is
the kind of password you should be using (and make sure you are using a
different password for each site!).
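You can get a feel for these numbers with a quick calculation; the guessing rate below is just an assumption for illustration:

    rate = 1_000_000_000                        # assume a billion guesses per second
    seconds_per_year = 60 * 60 * 24 * 365
    for length in (8, 12, 16):
        combinations = 96 ** length
        years = combinations / rate / seconds_per_year
        print(length, f"{years:.2e} years to try every password")

Each extra character multiplies the attacker's work by 96, which is why length matters so much.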
Unfortunately though, these requirements don't work well for getting users to
pick good passwords. Attackers know the tricks users use to make passwords
that meet the restrictions, but can be remembered. For example, P@$$w0rd
contains 8 characters (a commonly used minimum), and contains a mix of
different types of characters. But attackers know that users like to replace S's
with $'s, mix o and 0, replace i with !, etc. In fact, they can just add these tricks
into their list they use for dictionary attacks! For websites that require
passwords to have at least one digit, the result is even worse. Many users pick
a standard English word and then add a single digit to the end of it. This again
is easy work for a dictionary attack to crack!
As this xkcd comic points out, most password advice doesn't make a lot of
sense.
You might not know what some of the words mean. In simple terms, what it is
saying is that there are far fewer possible modifications of a common dictionary
word than there are possible selections of four random words from the 2000 most
common dictionary words. Note that the estimates are based on trying to guess through
a login system. With a leaked database, the attacker can test billions of
passwords a second rather than just a few thousand.
There are many aspects to computer security beyond encryption. For example,
access control (such as password systems and security on smart cards) is
crucial to keeping a system secure. Another major problem is writing secure
software that doesn't leave ways for a user to get access to information that
they shouldn't (such as typing a database command into a website query and
having the system accidentally run it, or overflowing a buffer with a long input,
which could accidentally replace parts of the program). Also, systems need to
be protected from "denial of service" (DOS) attacks, where they get so
overloaded with requests (e.g. to view a web site) that the server can't cope,
and legitimate users get very slow response from the system, or it might even
fail completely.
For other kinds of attacks relating to computer security, see the Wikipedia entry
on hackers.
There's a dark cloud hanging over the security of all current encryption
methods: Quantum computing. Quantum computing is in its infancy, but if this
approach to computing is successful, it has the potential to run very fast
algorithms for attacking our most secure encryption systems (for example, it
could be used to factorise numbers very quickly). In fact, the quantum
algorithms have already been invented, but we don't know if quantum
computers can be built to run them. Such computers aren't likely to appear
overnight, and if they do become possible, they will also open the possibility for
new encryption algorithms. This is yet another mystery in computer science
where we don't know what the future holds, and where there could be major
changes in the future. But we'll need very capable computer scientists around
at the time to deal with these sorts of changes!
On the positive side, quantum information transfer protocols exist and are used
in practice (using specialised equipment to generate quantum bits); these
provide what is in theory a perfect encryption system, and don't depend on an
attacker being unable to solve a particular computational problem. Because of
the need for specialised equipment, they are only used in high security
environments such as banking.
Curiosity: Steganography
Two fun uses of steganography that you can try to decode yourself are a
film about ciphers that contains hidden ciphers (called "The Thomas Beale
Cipher"), and an activity that has five-bit text codes hidden in music.
The encryption methods used these days rely on fairly advanced maths; for this
reason books about encryption tend to either be beyond high school level, or
else are about codes that aren't actually used in practice.
There are lots of intriguing stories around encryption, including its use in
wartime and for spying, e.g.:
• How I Discovered World War II's Greatest Spy and Other Stories of
Intelligence and Code (David Kahn)
• Information hiding
• Cryptographic protocols
War in the fifth domain looks at how encryption and security are key to our
defence against a new kind of war.
The parity magic trick (in the video above) enables the magician to detect
which card out of dozens has been flipped over while they weren't looking. The
magic in the trick is actually computer science, using the same kind of
technique that computers use to detect and correct errors in data. We will talk
about how it works in the next section.
The same thing is happening to data stored on computers --- while you (or the
computer) is looking away, some of it might accidentally change because of a
minor fault. When the computer reads the data, you don't want it to just use
the incorrect values. At the least you want it to detect that something has gone
wrong, and ideally it should do what the magician did, and put it right.
This chapter is about guarding against errors in data in its many different forms
--- data stored on a hard drive, on a CD, on a floppy disk, on a solid state drive
(such as that inside a cellphone, camera, or mp3 player), data currently in RAM
(particularly on servers where the data correctness is critical), data going
between the RAM and hard drive or between an external hard drive and the
internal hard drive, data currently being processed in the processor, or data
going over a wired or wireless network such as from your computer to a server
on the other side of the world. It even includes data such as the barcodes
printed on products or the number on your credit card.
If we don't detect that data has been changed by some physical problem (such
as small scratch on a CD, or a failing circuit in a flash drive), the information will
just be used with incorrect values. A very poorly written banking system could
potentially result in your bank balance being changed if just one of the bits in a
number was changed by a cosmic ray affecting a value in the computer's
memory! If the barcode on the packet of chips you buy from the shop is
scanned incorrectly, you might be charged for shampoo instead. If you transfer
a music file from your laptop to your mp3 player and a few of the bits were
transferred incorrectly, the mp3 player might play annoying glitches in the
music. Error control codes guard against all these things, so that (most of the
time) things just work without you having to worry about such errors.
There are several ways that data can be changed accidentally. Networks that
have a lot of "noise" on them (caused by poor quality wiring, electrical
interference, or interference from other networks in the case of wireless) can
corrupt the bits being transmitted. The
bits on disks are very very small, and imperfections in the surface can
eventually cause some of the storage to fail. The surfaces on compact disks
and DVDs are exposed, and can easily be damaged by storage (e.g. in heat or
humidity) and handling (e.g. scratches or dust). Errors can also occur when
numbers are typed in, such as entering the bank account number that a payment
should go into, or the number of a container that is being loaded onto a ship. A
barcode on a product might be slightly scratched or have a black mark on it, or
perhaps the package is bent, or the barcode can't be read properly because the
scanner was waved over it too fast. Bits getting changed on permanent storage (such
as hard drives, optical disks, and solid state drives) is sometimes referred to as
data rot, and the Wikipedia page on bit rot has a list of more ways that these
errors can occur.
Nobody wants a computer that is unreliable and won’t do what it's supposed to
do because of bits being changed! So, how can we deal with these problems?
Error control coding is concerned with detecting when these errors occur, and if
practical and possible, correcting the data to what it is supposed to be.
Some error control schemes have error correction built into them, such as the
parity method that was briefly introduced at the beginning of this section. You
might not understand yet how the parity trick worked, but after the card was
flipped, the magician detected which card was flipped, and was able to correct
it.
Other error control schemes, such as those that deal with sending data from a
server overseas to your computer, send the data in very small pieces called
packets (the network protocols chapter talks about this) and each packet has
error detection information added to it.
Error detection is also used on barcode numbers on products you buy, as well
as the unique ISBN (International Standard Book Number) that all books have,
and even the 16 digit number on a credit card. If any of these numbers are
typed or scanned incorrectly, there's a good chance that the error will be
detected, and the user can be asked to re-enter the data.
By the end of this chapter, you should understand the basic idea of error
control coding, the reasons that we require it, the differences between
algorithms that can detect errors and those that can both detect and correct
errors, and some of the ways that error control coding is used, in particular
parity (focussing on the parity magic trick) and the check digits used to ensure
book numbers, barcode numbers, and credit card numbers are entered
correctly.
A magician asks an observer to lay out a square grid of two-sided cards, and
the magician then says they are going to make it a bit harder, and add an extra
row and column to the square. The magician then faces the other way while the
observer flips over one card. The magician turns back around again, and tells
the observer which card was flipped!
The question now is, how did the magician know which card had been flipped
without seeing the card being flipped, or memorising the layout?! The short
answer is error control coding. Let's look more closely at that…
Once you think you have this correct, you should tell the computer to flip a
card. An animation will appear for a few seconds, and then the cards will
reappear with one card flipped (all the rest will be the same as before). Your
task is to identify the flipped card. You should be able to do this without having
memorised the layout. Remember the pattern you made with the extra cards
you added? That's the key to figuring it out. Once you think you have identified
the card, click it to see whether or not you were right. The interactive will guide
you through these instructions. If you are completely stuck identifying the
flipped card, a hint follows the interactive, although you should try and figure it
out for yourself first. Make sure you add the extra cards correctly; the computer
won’t tell you if you get them wrong, and you probably won’t be able to identify
the flipped card if the extra cards aren't chosen correctly.
Remember how you made it so that each row and column had an even number of
black cards? When a card is flipped, this results in the row and column that the card
was in having an odd number of black cards. So all you need to do is to identify
the row and column that have an odd number of black and white cards, and the
card that is at the intersection of them must be the one that was flipped!
The extra cards you added are called parity bits. Parity simply means whether a
number is even or odd (the word comes from the same root as "pair"). By
adding the extra cards in a way that ensured an even number of black cards in
each row and column, you made it so that the rows and columns had what is
called even parity.
When a card was flipped, this simulated an error being made in your data (such
as a piece of dust landing on a bit stored on a CD, or a cosmic ray changing a
bit stored on a hard disk, or electrical interference changing a bit being sent
over a network cable). Because you knew that each row and column was
supposed to have an even number of black and white cards in it, you could tell
that there was an error from the fact that there was a column and row that had
an odd number of black cards in it. This means that the algorithm is able to
detect errors, i.e. it has error detection. The specific card that had been
flipped was at the intersection of the row and column that had an odd number
of black cards and white cards in them, and because you were able to identify
exactly which card was flipped, you were able to correct the error, i.e. the
algorithm has error correction.
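The whole detect-and-correct process is simple enough to express in a few lines of Python; this sketch represents white cards as 0 and black cards as 1, and assumes the grid (with parity bits included) started with even parity in every row and column:

    # Find the single flipped bit in a grid with even row/column parity.
    def find_flipped_bit(grid):
        bad_rows = [r for r in range(len(grid)) if sum(grid[r]) % 2 == 1]
        bad_cols = [c for c in range(len(grid[0]))
                    if sum(row[c] for row in grid) % 2 == 1]
        if bad_rows and bad_cols:
            return bad_rows[0], bad_cols[0]     # the flip is at the intersection
        return None                             # no single-bit error detected

    grid = [[1, 0, 1],
            [0, 1, 1],
            [1, 1, 0]]                          # every row and column has even parity
    grid[1][2] ^= 1                             # flip one "card"
    print(find_flipped_bit(grid))               # (1, 2)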
If you had not added the parity bits, you would have had no way of even
knowing an error had occurred, unless you had memorised the entire layout of
cards! And what if more than one bit had been flipped? We'll consider this later.
Now that you have learnt how the parity trick works, you might like to try it
with a physical set of cards like the busker in the video, or you could use
any objects with two distinct sides, such as coins or cups. You could use
playing cards, but the markings can be distracting, and cards with two
colours are easiest (you can make them by cutting up sheets of card with
the two colours on, or single coloured card with a scribble or sticker on one
side).
You can find details and lots of ideas relating to the trick online.
It would take some practice to be able to add the extra cards, and identify
the flipped card without the observer noticing that you are thinking hard
about it. With practice you should be able to do it while having a casual
conversation. Once you master it, you've got a great trick for parties, or
even for busking.
To make it more showy, you can pretend that you are mind reading the
person, waving your hands over the cards. A particularly impressive variation
is to have an assistant come in to the room after the card has been flipped;
even though they haven't seen any of the setup, they will still be able to
detect the error.
It would be ideal to have some physical parity cards at this point that you can
lay out in front of you and play around with to explore the questions raised.
An error control coding algorithm can often detect errors more easily than it
can correct them. Errors involving multiple bits can sometimes even go
undetected. What if the computer (or your friend if you were being a magician
with actual parity cards) had been sneaky and turned over two cards instead of
one? You could start by getting a friend or classmate to actually do this. Repeat
it a few times. Are you always able to correct the errors, or do you get it wrong?
Remember that to detect errors using this algorithm, you know that if one or
more rows and/or columns has an odd number of black cards in it, then there
must be at least one error. In order to correct errors you have to be able
to pinpoint the specific card(s) that were flipped.
Are you always able to detect when an error has occurred if 2 cards have been
flipped? Why? Are you ever able to correct the error? What about with 3 cards?
It turns out that you can always detect an error when 2 cards have been flipped
(i.e. a 2-bit error), but the system can't correct more than a 1-bit error. When
two cards are flipped, there will be at least two choices for flipping two cards to
make the parity correct, and you won't know which is the correct one. With a 3-
bit error (3 cards flipped), it will always be possible to detect that there is an
error (an odd number of black bits in at least one row or column), but again,
correction isn't possible. With 4 cards being flipped, it's possible (but not likely)
that an error can go undetected.
There is actually a way to flip 4 cards so that the error goes completely
undetected by the algorithm. Can you find a way of doing this?
With more parity cards, we can detect and possibly correct more errors. Let's
explore a very simple system with minimal parity cards. We can have a 7x7
grid of data with just one parity card. That parity card makes it so that there is
an even number of black cards in the entire layout (including the parity card).
How can you use this to detect errors? Are you ever able to correct errors in
this system? In what situations do errors go undetected (think about what happens
when you have multiple errors, i.e. more than one card flipped)?
With only one extra card for parity checking, a single bit error can be detected
(the total number of black cards will become odd), but a 2-bit error won't be
detected because the number of black cards will be even again. A 3-bit error
will be detected, but in general the system isn't very reliable.
So going back to the actual parity trick that has the 7 by 7 grid, and 15 parity
cards to make it 8 by 8, it is interesting to note that only 1 extra card was
needed to detect that an error had occurred, but an extra 15 cards were
needed to be able to correct the error. In terms of the cost of an algorithm, it
costs a lot more space to be able to correct errors than it does to be able to
simply detect them!
What happens when you use grids of different sizes? The grid doesn’t have to
have an even number of black cards and an even number of white cards, it just
happens that whenever you have an even-sized grid with the parity
bits added (e.g. the 8x8 we have mostly used in this section) and you have an
even number of black cards, you will also have to have an even number of
whites, which makes it a bit easier to keep track of.
Try a 6x6 grid with parity cards to make it 7x7. The parity cards simply need to
make each row and column have an even number of black cards (in this case
there will always be an odd number of white cards in each row and column).
The error detection is then looking for rows and columns that have an odd
number of black cards in them (but an even number of white cards).
Interestingly, the grid doesn’t even have to be a square! You could use 4x7 and
it would work!
There's also no limit on the size. You could create a 10x10 grid (100 cards), and
still be able to detect which card has been flipped over. Larger grids make for
an even more impressive magic trick.
9.3. Check digits on barcodes and other numbers
You probably wouldn’t be very happy if you bought a book online by entering
the ISBN (International Standard Book Number), and the wrong book was sent
to you, or if a few days after you ordered it, you got an email saying that the
credit card number you entered was not yours, but was instead one that was
one digit different and another credit card holder had complained about a false
charge. Or if you went to the shop to buy a can of drink and the scanner read it
as being a more expensive product. Sometimes, the scanner won’t even read
the barcode at all, and the checkout operator has to manually enter the
number into the computer --- but if they don't enter it exactly as it is on the
barcode you could end up being charged for the wrong product. These are all
examples of situations that error control coding can help prevent.
Barcode numbers, credit card numbers, bank account numbers, ISBNs, national
health and social security numbers, shipping labels (serial shipping container
codes, or SSCC) and tax numbers all have error control coding in them to help
reduce the chance of errors. The last digit in each of these numbers is a check
digit, which is obtained doing a special calculation on all the other digits in the
number. If for example you enter your credit card number into a web form to
buy something, it will calculate what the 16th digit should be, using the first 15
digits and the special calculation (there are 16 digits in a credit card number). If
the 16th digit that it expected is not the one you entered, it can tell that there
was an error made when the number was entered and will notify you that the
credit card number is not valid.
In this section we will initially look at one of the most commonly used
barcode number formats, used on most products you buy from supermarkets
and other shops. We will then have a look at credit card numbers. You
don’t have to understand why the calculations work so well (this is advanced
math, and isn’t important for understanding the overall ideas), and while it is
good for you to know what the calculation is, it is not essential. So if math is
challenging and worrying for you, don’t panic too much because what we are
looking at in this section isn’t near as difficult as it might initially appear!
The last digit of these numbers is calculated from the first 12. This is very
closely related to the parity bit that we looked at above, where the last bit of a
row is "calculated" from the preceding ones. With a GTIN-13 code, we want to
be able to detect if one of the digits might have been entered incorrectly.
The following interactive checks GTIN-13 barcodes. Enter the first 12 digits of a
barcode number into the interactive, and it will tell you what the last digit
should be! You could start by using the barcode number “9 300675 036009”.
What happens if you make a mistake when you type in the 12 digits (try
changing one digit)? Does that enable you to detect that a mistake was made?
Have a look for another product that has a barcode on it, such as a food item
from your lunch, or a stationery item. Note that some barcodes are a little
different --- make sure the barcodes that you are using have 13 digits (although
you might like to go and find out how the check digit works on some of the
other ones). Can the interactive always determine whether or not you typed
the barcode correctly?
One of the following product numbers has one incorrect digit. Can you tell
which of the products has had its number typed incorrectly?
• 9 400550 619775
• 9 400559 001014
• 9 300617 013199
If you were scanning the above barcodes in a supermarket, the incorrect one
will need to be rescanned, and the system can tell that it's a wrong number
without even having to look it up. Typically that would be caused by the bar
code itself being damaged (e.g. some ice on a frozen product making it read
incorrectly). If an error is detected, the scanner will usually make a warning
sound to alert the operator.
You could try swapping barcode numbers with a classmate, but before giving
them the number toss a coin, and if it's heads, change one digit of the barcode
before you give it to them. Can they determine that they've been given an
erroneous barcode?
If one of the digits is incorrect, this calculation for the check digit will produce a
different value to the checksum, and this signals an error. So single digit errors will
always be detected, but what if two digits change --- will that always detect the
error?
What if the error is in the checksum itself but not in the other digits - will that
be detected?
The last two will be picked up from the expected length of the number; for
example, a GTIN-13 has 13 digits, so if 12 or 14 were entered, the computer
immediately knows this is not right. The first two depend on the check digit in
order to be detected. Interestingly, all one digit errors will be detected by
common checksum systems, and most transpositions will be detected (can you
find examples of transpositions that aren’t detected, using the interactive
above?)
There are also some less common errors that people make.
Experiment further with the interactive. What errors are picked up? What errors
can you find that are not? Are the really common errors nearly always picked
up? Can you find any situations that they are not? Try to find examples of errors
that are detected and errors that are not for as many of the different types of
errors as you can.
• Multiply every second digit (starting with the second digit) by 3, and every
other digit by 1 (so they stay the same).
• Add up all the multiplied numbers to obtain the sum.
• The check digit is whatever number would have to be added to the sum in
order to bring it up to a multiple of 10 (i.e. the last digit of the sum should
be 0). Or more formally, take the last digit of the sum and if it is 0, the
check digit is 0. Otherwise, subtract the last digit from 10 to obtain the
check digit.
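Here is the same calculation written as a short Python sketch, which you can check against the interactive below:

    # Compute the GTIN-13 check digit from the first 12 digits.
    def gtin13_check_digit(first12):
        total = 0
        for position, digit in enumerate(first12):
            multiplier = 3 if position % 2 == 1 else 1   # every second digit x3
            total += int(digit) * multiplier
        return (10 - total % 10) % 10

    print(gtin13_check_digit("930067503600"))   # 9, matching 9 300675 036009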
The following interactive can be used to do the calculations for you. To make
sure you understand the process, you need to do some of the steps yourself;
this interactive can be used for a wide range of check digit systems. To check a
GTIN-13 number, enter the first 12 digits where it says "Enter the number
here". The multipliers for GTIN-13 can be entered as "131313131313" (each
alternating digit multiplied by 3). This will give you the products of each of the
12 digits being multiplied by the corresponding number. You should calculate
the total of the products (the numbers in the boxes) yourself and type it in.
Then get the remainder when divided by 10. To work out the checksum, you
should calculate the digit needed to make this number up to 10 (for example, if
the remainder is 8, the check digit is 2). If the remainder is 0, then the check
digit is also 0.
Try this with some other bar codes. Now observe what happens to the
calculation when a digit is changed, or two are swapped.
The algorithm to check whether or not a barcode number was correctly entered
is very similar. You could just calculate the check digit and compare it with the
13th digit, but a simpler way is to enter all 13 digits.
Multiply every second digit (starting with the second digit) by 3, and every
other digit by 1. This includes the 13th digit, which is multiplied by 1. Add up all
the multiplied numbers to obtain the total. If the last digit of the sum is a 0, the
number was entered correctly. (That's the same as the remainder when divided
by 10 being 0).
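A sketch of that checking algorithm in Python might look like this (the names are illustrative):

```python
def gtin13_is_valid(barcode):
    """Return True if all 13 digits pass the checksum test."""
    total = sum(int(digit) * (3 if position % 2 == 1 else 1)
                for position, digit in enumerate(barcode))
    # Correctly entered numbers always produce a multiple of 10.
    return total % 10 == 0

print(gtin13_is_valid("9300675032247"))  # True
print(gtin13_is_valid("9300675032241"))  # False: the last digit was mistyped
```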
For 13-digit barcodes, a quick way to add up a checksum that can be done
in your head (with some practice) is to separate the numbers to be
multiplied by 3, add them up, and then multiply by 3. For the example
above (9300675032247) the two groups are 9+0+6+5+3+2+7 = 32 and 3
+0+7+0+2+4= 16. So we add 32 + 16x3, which gives the total of 80
including the check digit.
To make it even easier, for each of the additions you only need to note the
last digit, as the other digits will never affect the final result. For example,
the first addition above begins with 9+0+6, so you can say that it adds up
to 5 (rather than 15) and still get the same result. The next digit (5) takes
the sum to 0, and so on. This also means that you can group digits that add
to 10 (like 1 and 9, or 5 and 5), and ignore them. For example, in the
second group, 3+0+7 at the start adds up to 0, and the only sum that
counts is 2+4, giving 6 as the total.
Finally, even the multiplication will be ok if you just take the last digit. In
the example above, that means we end up working out 6x3 giving 8 (not
18); the original was 16x3 giving 48, but it's only the final digit, 8, that
matters.
All these shortcuts can make it very easy to track the sum in your head.
Let's look at some smaller examples with 5 digits (4 normal digits and a
check digit), as the same ideas will apply to the 13 digit numbers. Take the
number 89548: the multiplied digits are 8x1, 9x3, 5x1 and 4x3, and the last
digit (8) is the check digit, since 8+27+5+12=52 needs another 8 to reach
60, a multiple of 10.
The first thing we should observe is that only the ones column (last digit) of
each number added has any impact on the check digit. 8+27+5+12=52,
and 8+7+5+2=22 (only looking at the last digit of each number we are
adding). Both these end in a 2, and therefore need 8 to bring them up to
the nearest multiple of 10. You might be able to see why this is if you
consider the “2” and “1” that were cut from the tens column: they are
equal to 20+10=30, a multiple of 10. Subtracting them only affects the
tens column and beyond. This is always the case, and therefore we can
simplify the problem by only adding the ones column of each number to
the sum. (This can also be used as a shortcut to calculate the checksum in
your head).
Next, let's look at why changing one digit in the number to another digit
will always be detected with this algorithm. Each digit will contribute a
number between 0 and 9 to the sum (remember we only care about the
ones column now). As long as changing the digit will result in it contributing
a different amount to the sum, it becomes impossible for it to still sum to a
multiple of 10. Remember that each digit is either multiplied by 1 or 3
before its ones column is added to the sum. The following list shows what
each digit contributes when it is multiplied by 3 (digit -> contribution):
• 1 -> 3
• 2 -> 6
• 3 -> 9
• 4 -> 2 (from 12)
• 5 -> 5 (from 15)
• 6 -> 8 (from 18)
• 7 -> 1 (from 21)
• 8 -> 4 (from 24)
• 9 -> 7 (from 27)
• 0 -> 0
If you look at the right hand column, you should see that no number is
repeated. This means that no two digits contribute the same amount to
the sum when they are multiplied by 3, so changing a single digit (whether
it is multiplied by 1 or by 3) always changes the total, and the error will be
detected.
Seeing why the algorithm is able to protect against most swap errors is
much more challenging.
If two digits are next to one another, one of them must be multiplied by 1,
and the other by 3. If they are swapped, then this is reversed. For example,
if the number 89548 from earlier is changed to 85948, then (5x3)
+(9x1)=24 is being added to the total instead of (9x3)+(5x1)=32. Because
24 and 32 have different values in their ones columns, the amount
contributed to the total is different, and therefore the error will be
detected.
But are there any cases where the totals will have the same values in their
ones columns? Another way of looking at the problem is to take a pair of
rows from the table above, for example:
• 8 -> 4
• 2 -> 6
Remember that the first column is how much will be contributed to the
total for digits being multiplied by 1, and the second column is for those
being multiplied by 3. Because adjacent digits are each multiplied by a
different amount (one by 3 and the other by 1), the numbers diagonal to
each other in the chosen pair will be added.
If for example the first 2 digits in a number are “28”, then we will add 2
+4=6 to the sum. If they are then reversed, we will add 8+6=14, which is
equivalent to 4 as again, the “10” part does not affect the sum. 8+6 and 2
+4 are the diagonals of the pair!
• 8 -> 4
• 2 -> 6
So the question now is, can you see any pairs where the diagonals would
add up to the same value? There are some! Any two digits that differ by 5
will do it: for example, for the pair 8 -> 4 and 3 -> 9, the diagonals are
8+9=17 and 3+4=7, which both end in 7, so swapping adjacent digits “83”
to “38” will not be detected.
A related error is where a repeated pair of digits is mistyped as a different
repeated pair (for example, “44” entered as “99”). When two identical
numbers are side by side, one is multiplied by 3 and the other by 1, so the
amount contributed to the total is the sum of the number's row in the
above table. For example, 2 has the row “2 -> 6”. This means that 2+6=8
will be contributed to the sum as a result of the pair “22”.
If any rows add up to the same number, this could be a problem. Where the
sum was over 10, the tens column has been removed.
Some of the rows add up to the same number! Because both 4 and 9 add
up to 6, the error will not be detected if “44” changes to “99” in a number!
Rows that do not add up to the same number will be detected. From the example above, if 22
changes to 88, this will be detected because 22’s total is 8, and 88’s total
is 2.
Another error that people sometimes make is the jump transposition error.
This is where two digits that have one digit in between them are swapped,
for example, 812 to 218.
A pair of numbers that are two apart like this will always be multiplied by
the same amount as each other, either 1 or 3. This means that the change
in position of the numbers does not affect what they are multiplied by, and
therefore what they contribute to the sum. So this kind of error will never
be detected.
Although the numbers from this interactive are random, their check digits
are calculated using the appropriate method, so you can use them as
examples for your project. Actually, not all of them will be correct, so one of
your challenges is to figure out which are ok!
Your project is to choose a check digit system other than the 13-digit barcodes, and
research how it is calculated (they all use slightly different multipliers). You
should demonstrate how they work (using the following interactive if you
want), and find out which of the numbers generated are incorrect.
ISBN-10 is particularly effective, and you could also look into why that is.
In this chapter we'll explore a range of these intelligent systems. Inevitably this
will mean dealing with ethical and philosophical issues too --- do we really want
machines to take over some of our jobs? Can we trust them? Might it all go too
far one day? What do we really mean by a computer being intelligent? While we
won't address these questions directly in this chapter, gaining some technical
knowledge about AI will enable you to make more informed decisions about the
deeper issues.
You may come across chatterbots online for serious uses (such as giving help
on a booking website). Sometimes it's hard to tell if you're getting an
automated response. We'll start off by looking at some very simple chatterbots
that are only designed as an experiment rather than to offer serious advice.
Eliza is a system that was intended to get people thinking about AI, and
you should not use it for actual therapy. You should never
enter personal information into a computer that you wouldn’t want
anybody else reading, as you can never be certain that the web site isn’t
going to store the information or pass it on to someone. So you don’t want
to be telling Eliza the kinds of things a person would say in a therapy
session --- just make stuff up when talking to Eliza! For the same reason, do
not tell any chatterbot other personal information such as your full name,
date of birth, or address.
Go to the link just below and have a conversation with Eliza (keeping in mind
that she is supposed to be a Rogerian Psychotherapist, so will respond like
one).
Do you think Eliza is as smart as a human? Would you believe she was a human
if somebody told you she was?
You will probably agree that while Eliza sounds somewhat intelligent at times,
she gives very vague and general replies, forgets earlier parts of the
conversation, and says some things that sound very strange! As an example,
here is a conversation with Eliza:
Now go back and have another conversation with Eliza. There are many other
examples of Eliza saying stuff that makes no sense such as the above dialogue.
How many can you find? In addition, how does Eliza respond when you do the
following things?
Try having a conversation with another chatterbot --- Alice. Note that she knows
you as "judge". You might work out why she calls you this once you have read
the sections on the Turing Test below!
Does Alice sound more intelligent than Eliza? Does she sound as intelligent as a
human, or can you trick her into saying things that make no sense? Try using
the same tricks you used to fool Eliza and see if they fool Alice as well (look
back at the list of bullet points in the Eliza section just above).
Like Eliza, Alice also uses hard coded rules to know what to say, although she
has a lot more of these rules and a more sophisticated algorithm for choosing
which rules should be used to determine what to say. If you want to know more,
just ask Alice how she works, and she’ll tell you!
To use the rules, the chatterbot analyses your last sentence (and possibly
earlier sentences also) by matching the sentence against the patterns in its
rules. If the sentence matches the rule pattern, it "activates" the rule. The rule
will specify a reply. Sometimes the reply is just a template for a sentence, and
the chatterbot analyses the text you typed to extract words that can be
inserted into the template. That’s why the replies sometimes use part of what
you typed.
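To make the idea of patterns and templates concrete, here is a toy sketch in Python. This is not how Eliza or Alice are actually implemented; the rules and wording are made up for illustration.

```python
import re

# Each rule is a pattern plus a reply template; \1 re-inserts the matched text.
rules = [
    (re.compile(r"i feel (.*)", re.IGNORECASE), r"Why do you feel \1?"),
    (re.compile(r"i am (.*)", re.IGNORECASE), r"How long have you been \1?"),
]

def reply(sentence):
    for pattern, template in rules:
        match = pattern.match(sentence)
        if match:
            # The rule is "activated": fill the template with the user's words.
            return match.expand(template)
    return "Tell me more."  # fallback when no rule matches

print(reply("I feel happy when it is sunny"))
# Why do you feel happy when it is sunny?
```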
The quality of these rules determines how intelligent the chatterbot will sound.
While Alice is undoubtedly more intelligent than Eliza, Alice still has her
limitations. Sometimes she even admits to those limitations!
Judge: What will happen in the year 2000
Alice: I cannot predict the future.
Judge: That is the past
Alice: Yes it is one of the past.
Judge: Then why did you say you can't predict the future?
Alice: It was a preprogrammed response completely determined by your
statements.
I suppose I was a bit mean for tricking her and saying "what will". Her rules
assume a human wouldn’t mislead her like that.
There are other chatterbots that are able to learn their rules from the humans
they have conversations with. By looking at how a human responds to various
dialogues, the chatterbot attempts to learn how it should respond in various
situations. The idea is that if it responds in similar ways to what a human does,
then perhaps it will sound like a human. Most of these chatterbots aim to have
very general conversations, i.e. they aren't restricted to one domain in the way
that Eliza the therapist is.
Please note that the following exercise involves interacting with one of
these chatterbots. Because the chatterbot has learnt from humans, it will
quite possibly have been taught to say things that you may find highly
offensive. While we have tried to choose chatterbots that mostly say things
that aren’t going to offend, it is impossible to guarantee this, so use your
discretion with them; you can skip this section and still cover the main
concepts of this chapter. Because Eliza and Alice don’t learn from humans,
they won’t say offensive things unless you do first!
And again, don’t tell the chatterbots your personal details (such as your full
name, date of birth, address, or any other information you wouldn’t be
happy sharing with everybody). Just make stuff up where necessary. A
chatterbot that learns from people quite possibly will pass on what you say
to other people in an attempt to sound intelligent to them!
These warnings will make more sense once you’ve learnt how these
chatterbots work.
Unlike Eliza and Alice, whose rules of what to say were determined by
programmers, Cleverbot learns rules based on what people say. For example,
when Cleverbot says "hi" to a person, it keeps track of all the different
responses that people make to that, such as "hi", "hello!", "hey ya", "sup!". A
rule is made that says that if somebody says hi to you, then the things that
people have commonly said in response to Cleverbot saying hi are appropriate
things to say in response to "hi". In turn, when Cleverbot says something like
"sup!" or "hello!", it will look at how humans respond to that in order to learn
appropriate response for those. And then it will learn responses for those
responses. This allows Cleverbot to built up an increasingly large database.
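A heavily simplified sketch of that learning idea might look like the following Python fragment. Cleverbot's real system is far more sophisticated; this only illustrates the principle of recording and reusing observed replies.

```python
import random
from collections import defaultdict

# For each thing the bot has said, record the replies humans gave to it.
replies_seen = defaultdict(list)

def learn(bot_said, human_replied):
    replies_seen[bot_said].append(human_replied)

def respond(human_said):
    # If humans have replied to this phrase before, imitate one of them.
    if replies_seen[human_said]:
        return random.choice(replies_seen[human_said])
    return "hi"  # say something to start gathering replies

learn("hi", "hey ya")
learn("hi", "sup!")
print(respond("hi"))  # randomly "hey ya" or "sup!"
```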
Check out the short film "Do You Love Me" (~3 mins), the result when Chris
R Wilson collaborated with Cleverbot to write a movie script.
If you have a device that runs Apple iOS (for example an iPhone), have a look
at the Siri chatterbot in the device’s help system. Siri is an example of a
chatterbot that has the job of helping a human, unlike most chatterbots which
simply have the purpose of web entertainment. It also has voice recognition, so
you can talk to it rather than just typing to it.
A very famous computer scientist, Alan Turing, answered this question back in
1950, before the first chatterbots even existed! Alan Turing had an
extraordinary vision of the future, and knew that coming up with computers
that were intelligent would become a big thing, and that we would need a way
to know when we have succeeded in creating a truly intelligent computer. He
proposed a simple test: an interrogator holds a text conversation with both a
human and a computer, without being told which is which; if the interrogator
cannot reliably tell them apart, the computer passes.
This test proposed by Turing eventually became very famous and got the name
"The Turing Test". One of the motivations for writing chatterbots is to try and
make one that passes the Turing Test. Unfortunately, making a chatterbot that
successfully passes the Turing Test hasn't yet been achieved, and whether or
not it is even possible is still an open question in computer science, along with
many other questions in artificial intelligence that you will encounter later in
this chapter.
Other forms of the Turing Test exist as well. Action games sometimes have
computer controlled characters that fight your own character, in place of a
second human controlled character. A variation of the Turing Test can be used
to determine whether or not the computer controlled player seems to have
human intelligence by getting an interrogator to play against both the
computer character and the human character, and to see whether or not they
can tell them apart.
This section will involve you actually carrying out the Turing Test. Read this
entire section carefully (and the previous section if you haven’t done so
already) before you start, and make sure you understand it all before
starting.
Carrying out the Turing Test is carrying out an experiment, just like carrying
out experiments in chemistry classes. And just like the chemistry
experiments, carrying out the Turing Test requires being careful to follow
instructions correctly, and controlling factors that could potentially affect
the results but aren’t part of what is being tested. You should keep this in
mind while you are carrying out this project.
Choose a chatterbot from the list on Wikipedia (see the above chatterbots
section), or possibly use Alice or Cleverbot (Eliza isn’t recommended for
this). You will be taking the role of the interrogator, but will need another
person to act as the "human". For this it is recommended you choose a
person in your class who you don’t know very well. Do not choose your
best friend in the class, because you will know their responses to questions
too well, so will be able to identify them from the chatterbot based on their
personality rather than the quality of the chatterbot.
In addition to the chatterbot and your classmate to act as the human, you
will need access to a room with a computer with internet (this could just be
the computer classroom), another room outside it (a hallway would be
fine), pieces of paper, 2 pens, and a coin or a dice.
The chatterbot should be loaded on the computer to be used, and your
classmate should be in the same room with the computer. You should be
outside that room. As the interrogator, you will first have a conversation
with either your classmate or the computer, and then a conversation with
the other one. You should not know which order you will speak to them; to
determine which you speak to first your classmate should use the dice or
the coin to randomly decide (and shouldn’t tell you).
A problem is that it will take longer for the conversation between you and
the chatterbot than between you and the classmate, because of the need
for your classmate to type what you say to the chatterbot. You wouldn’t
want to make it obvious which was the computer and which was the human
due to that factor! To deal with this, you could intentionally delay for a
while before each reply so that they all take exactly one minute.
You can ask whatever you like, although keep in mind that you should be
assuming you don’t know your classmate already, so don’t refer to
common knowledge about previous things that happened such as in class
or in the weekend in what you ask your classmate or the chatterbot. This
would be unfair to the chatterbot since it can’t possibly know about those
things. Remember you’re evaluating the chatterbot on its ability to display
human intelligence, not on what it doesn’t know about.
Once both conversations are complete, you as the interrogator have to say
which was your classmate, and which was the chatterbot. Your classmate
should tell you whether or not you were correct.
These are some questions you can consider after you have finished
carrying out the Turing Test:
• How were you able to tell which was the chatterbot and which was
your classmate?
• Were there any questions you asked that were "unfair" --- that
depended on knowledge your classmate might have but no-one
(computer or person) from another place could possibly have?
• Which gave it away more: the content of the answers, or the way in
which the content was expressed?
10.3. The whole story!
In this chapter so far, we have only talked about one application of AI. AI
contains many more exciting applications, such as computers that are able to
play board games against humans, computers that are able to learn, and
computers that are able to control robots that are autonomously exploring an
environment too dangerous for humans to enter.
In this chapter we're going to look at problems where it's easy to tell the
computer what to do --- by writing a program --- but the computer can’t do
what we want because it takes far too long: millions of centuries, perhaps. Not
much good buying a faster computer either: if it were a hundred times faster it
would still take millions of years; even one a million times faster would take
hundreds of years. That's what you call a hard problem --- one where it takes
far longer than the lifetime of the fastest computer imaginable to come up with
a solution!
The area of tractability explores problems and algorithms that can take an
impossible amount of computation to solve except perhaps for very small
examples of the problem.
We'll define what we mean by tractable later on, but put very crudely, a
tractable problem is one which we can write programs for that finish in a
reasonable amount of time, and an intractable problem is one that will
generally end up taking way too long.
Knowing when a problem you are trying to solve is one of these hard problems
is very important. Otherwise it is easy to waste huge amounts of time trying to
invent a clever program to solve it, and never getting anywhere. A computer
scientist needs to be able to recognise a problem as an intractable problem, so
that they can use other approaches. A very common approach is to give up on
getting a perfect answer, and instead just aim for an approximately correct
answer. There are a variety of techniques for getting good approximate
answers to hard problems; a way of getting an answer that isn't guaranteed to
give the exact correct answer is sometimes referred to as a heuristic.
The following interactive has a program that solves the travelling salesman
problem (finding the shortest route that visits every city on a map) for however
many cities you want to select, by trying out all possible routes and recording
the best so far. You can get a feel for what an intractable problem looks like by
seeing how long the interactive takes to solve the problem for different size
maps. Try generating a map with about 5 cities, and press "Start" to solve the
problem.
Now try it for 10 cities (twice as many). Does it take twice as long? How about
twice as many again (20 cities)? What about 50 cities? Can you guess how long
it would take? You're starting to get a feel for what it means for a problem to be
intractable.
Of course, for some situations, intractable problems are a good thing. In
particular, most security and cryptography algorithms are based on intractable
problems; the codes could be broken, but it would take billions of years and so
would be futile. In fact, if anyone ever finds a fast algorithm for solving such
problems, a lot of computer security systems would stop being secure
overnight! So one of the jobs of computer scientists is to be confident that such
solutions don't exist!
In this chapter we will look at the TSP and other problems for which no
tractable solutions are known, problems that would take computers millions of
centuries to solve. And we will encounter what is surely the greatest mystery in
computer science today: that no-one knows whether there's a more efficient
way of solving these problems! It may be just that no-one has come up with a
good way yet, or it may be that there is no good way. We don't know which.
But let's start with a familiar problem that we can actually solve.
Having a rough idea of the complexity of a problem helps you to estimate how
long it's likely to take. For example, if you write a program and run it with a
simple input, but it doesn't finish after 10 minutes, should you quit, or is it
about to finish? It's better if you can estimate the number of steps it needs to
make, and then extrapolate from the time it takes other programs to find
related solutions.
We won't use precise notation for asymptotic complexity (which says which
parts of speed calculations you can safely ignore), but we will make rough
estimates of the number of operations that an algorithm will go through.
There's no need to get too hung up on precision since computer scientists
are comfortable with a simple characterisation that gives a ballpark
indication of speed.
For example, consider using selection sort to put a list of n values into
increasing order. (This is explained in the chapter on algorithms). Suppose
someone tells you that it takes 30 seconds to sort a thousand items. Does
that sound like a good algorithm? For a start, you'd probably want to know
what sort of computer it was running on - if it's a supercomputer then
that's not so good; if it's a tiny low-power device like a smartphone then
maybe it's ok.
Also, a single data point doesn't tell you how well the system will work with
larger problems. If the selection sort algorithm above was given 10
thousand items to sort, it would probably take about 50 minutes (3000
seconds) --- that's 100 times as long to process 10 times as much input.
These data points for a particular computer are useful for getting an idea of
the performance (that is, complexity) of the algorithm, but they don't give
a clear picture. It turns out that we can work out exactly how many steps
the selection sort algorithm will take for n items: it will require about
$n(n-1)/2$ operations, or in expanded form, $n^2/2 - n/2$ operations. This
formula applies regardless of the kind of computer it's running on, and while
it doesn't tell us the time that will be taken, it can help us to work out if it's
going to be reasonable.
From the above formula we can see why it gets bad for large values of n:
the number of steps taken increases with the square of the size of the
input. Putting in a value of 1 thousand for n tells us that it will use
1,000,000/2 - 1,000/2 steps, which is 499,500 steps.
Notice that the second part (1,000/2) makes little difference to the
calculation. If we just use the $n^2/2$ part of the formula, the estimate will
be out by only 0.1%, and quite frankly, the user won't notice if it takes 20
seconds or 19.98 seconds. That's the point of asymptotic complexity --- we
only need to focus on the most significant part of the formula, which
contains $n^2$.
If you've studied algorithms, you will have learnt that some sorting algorithms,
such as mergesort and quicksort, are inherently faster than other algorithms,
such as insertion sort, selection sort, or bubble sort. It's obviously better to use
the faster ones. The first two have a complexity of $n \log n$ time (that is, the
number of steps that they take is roughly proportional to $n \log n$), whereas the
last three have complexity of $n^2$. Generally the consequence of using the wrong
sorting algorithm will be that a user has to wait many minutes (or perhaps
hours) rather than a few seconds or minutes.
For example, if you are sorting the numbers 45, 21 and 84, then every possible
order they can be put in (that is, all permutations) would be listed as:
45, 21, 84
45, 84, 21
21, 45, 84
21, 84, 45
84, 21, 45
84, 45, 21
Going through the above list, the only line that is in order is 21, 45, 84, so
that's the solution. It's a very inefficient approach, but it will help to illustrate
what we mean by tractability.
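Here's what permutation sort might look like as a short Python sketch, using the standard library to generate the orderings (you would never use this in real software!):

```python
from itertools import permutations

def permutation_sort(values):
    """Try every possible ordering until one is found that is in order."""
    for ordering in permutations(values):
        if all(ordering[i] <= ordering[i + 1] for i in range(len(ordering) - 1)):
            return list(ordering)

print(permutation_sort([45, 21, 84]))  # [21, 45, 84]
```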
In order to understand how this works, and the implications, choose four
different words (in the example below we have used colours) and list all the
possible orderings of the four words. Each word should appear exactly once in
each ordering. You can either do this yourself, or use an online permutation
generator such as JavaScriptPermutations or Text Mechanic.
For example if you’d picked red, blue, green, and yellow, the first few orderings
could be:
red, blue, green, yellow
red, blue, yellow, green
red, green, blue, yellow
red, green, yellow, blue
Once your list of permutations is complete, search down the list for the one
that has the words sorted in alphabetical order. The process you have just
completed is using permutation sort to sort the words.
Now add another word. How many possible orderings will there be with 5
words? What about with only 2 and 3 words --- how many orderings are there
for those? If you gave up on writing out all the orderings with 5 words, can you
now figure out how many there might be? Can you find a pattern? How many
do you think there might be for 10 words? (You don’t have to write them all out!
).
If you didn’t find the pattern for the number of orderings, think about using
factorials. For 3 words, there are $3! = 6$ (“3 factorial”) orderings. For 5 words, there
are $5! = 120$ orderings. Check the jargon buster below if you don’t know what a
“factorial” is, or if you have forgotten!
Factorials are very easy to calculate; just multiply together all the integers
from the number down to 1. For example, to calculate $5!$ you would simply
multiply: 5 x 4 x 3 x 2 x 1 = 120. For $8!$ you would simply multiply 8 x 7 x 6
x 5 x 4 x 3 x 2 x 1 = 40,320.
For factorials of larger numbers, most desktop calculators won't work so well;
for example, 100! has 158 digits. You can use the calculator below to work with
huge numbers (especially when using factorials and exponents).
Try calculating 100! using this calculator --- that's the number of different
routes that a travelling salesman might take to visit 100 places (not counting
the starting place). With this calculator you can copy and paste the result back
into the input if you want to do further calculations on the number. If you are
doing these calculations for a report, you should also copy each step of the
calculation into your report to show how you got the result.
There are other big number calculators available online, such as the Big
Integer Calculator, or you could look for one to download for a desktop
machine or smartphone.
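If you know a programming language with arbitrary-precision integers, you can also compute these numbers directly; for example, Python's standard library can do it in one line:

```python
import math

big = math.factorial(100)   # 100!, the number of routes for 100 places
print(big)                  # prints every digit of the result
print(len(str(big)))        # 158 digits, as mentioned above
```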
By now, you should be very aware of the point that is being made. Permutation
sort is so inefficient that sorting 100 numbers with it takes so long that it is
essentially impossible. Trying to use permutation sort with a non-trivial number
of values simply won’t work. While selection sort is a lot slower than quick sort
or merge sort, it wouldn’t be impossible for Facebook to use selection sort to
sort their list of 1 billion users. It would take a lot longer than quick sort would,
but it would be doable. Permutation sort on the other hand would be impossible
to use!
But the problem of sorting items into order is not intractable - even though the
permutation sort algorithm is intractable, there are lots of other efficient and
not-so-efficient algorithms that you could use to solve a sorting problem in a
reasonable amount of time: quick sort, merge sort, selection sort, even bubble
sort! However, there are some problems in which the ONLY known algorithm is
one of these intractable ones. Problems in this category are known as
intractable problems.
The Towers of Hanoi is one problem where we know for sure that it will take
exponential time. There are many intractable problems where this isn't the
case --- we don't have tractable solutions for them, but we don't know for
sure if they don't exist. Plus this isn't a real problem --- it's just a game
(although there is a backup system based on it). But it is a nice example of
an exponential time algorithm, where adding one disk will double the
number of steps required to produce a solution.
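You can check the doubling for yourself with a small recursive sketch in Python:

```python
def hanoi_moves(disks):
    """Minimum number of moves for Towers of Hanoi with this many disks."""
    if disks == 0:
        return 0
    # Move the n-1 smaller disks aside, move the biggest, move them back.
    return 2 * hanoi_moves(disks - 1) + 1

for n in range(1, 6):
    print(n, hanoi_moves(n))  # 1, 3, 7, 15, 31 ...
```

The total works out to $2^n - 1$ moves, which is why each extra disk roughly doubles the work.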
11.3. Tractability
There's a very simple rule that computer scientists use to decide if an algorithm
is tractable or not, based on the complexity (estimated number of steps) of the
algorithm. Essentially, if the algorithm takes an exponential amount of time or
worse for an input of size n, it is labelled as intractable. This simple rule is a bit
crude, but it's widely used and provides useful guidance. (Note that a factorial
amount of time, n!, is intractable because it's bigger than an exponential
function.)
To see what this means, let's consider how long various algorithms might take
to run. The following interactive will do the calculations for you to estimate how
long an algorithm might take to run. You can choose if the running time is
exponential (that is, $2^n$, which is the time required for the Towers of Hanoi
problem with n disks), or factorial (that is, $n!$, which is the time required for
checking all possible routes a travelling salesman would make to visit n places
other than the starting point). You can use the interactive below to calculate
the time.
For example, try choosing the factorial time for the TSP, and put in 20 for the
value of n (i.e. this is to check all possible travelling salesman visits to 20
places). Press the return or tab key to update the calculation. The calculator will
show a large number of seconds that the program will take to run; you can
change the units to years to see how long this would be.
So far the calculation assumes that the computer would only do 1 operation per
second; try changing to a million (1,000,000) operations per second, which is
more realistic, and see how long that would take.
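The sums behind this estimate are easy to reproduce; here's a sketch of the calculation for 20 places at a million operations per second:

```python
import math

routes = math.factorial(20)              # possible orderings of 20 places
seconds = routes / 1_000_000             # at a million routes checked per second
years = seconds / (60 * 60 * 24 * 365)
print(round(years))                      # roughly 77,000 years
```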
The interactive above estimates the amount of time taken for various
algorithms to run given n values to be processed. Let's assume that we have a
very fast computer, faster than any that exist. Try putting in the assumption
that the computer can do a million million (1,000,000,000,000) steps per
second. Does the problem become solvable in a reasonable amount of time
now? But what if you add just two more locations to the problem (i.e. n=22
instead of n=20)?
Now, consider an algorithm that has a complexity of $n^2$ (there are lots that take
roughly this number of steps, including selection sort which was mentioned
earlier). Type in a value of 1,000,000 for n to see how long it might take to sort
a million items on a single processor (keep the number of steps per second at
1,000,000,000,000, but set the number of processors to just 1) --- it should
show that it will only take about 1 second on our hypothetical very fast
machine. Now put in 10 million for n --- although it's sorting a list 10 times as
big, it takes more than 10 times as long, and will now take a matter of minutes
rather than seconds. At what value of n does the amount of time become out of
the question --- that is, how large would the problem need to be for it to take
years to finish? Is anyone ever likely to be sorting this many values --- for
example, what if for some reason you were sorting the name of every person in
the world, or every base in the human genome?
What about an algorithm with complexity of $n^3$? What's the largest size input
that it can process in a reasonable amount of time?
Now try the same when the number of steps is $2^n$, but start with a value of 10
for n, then try 30, 40, 50 and so on. You'll probably find that for an input of
about 70 items it will take an unreasonable amount of time. Is it much worse
for 80 items?
Now try increasing the number of operations per second to 10 times as many.
Does this help to solve bigger problems?
Trying out these figures you will likely have encountered the barrier between
"tractable" and "intractable" problems. Algorithms that take $n^2$, or even $n^3$
time to solve a problem (such as sorting a list) aren't amazing, but at least with
a fast enough computer and for the size of inputs we might reasonably
encounter, we have a chance of running them within a human lifetime, and
these are regarded as tractable. However, for algorithms that take $2^n$ or
more steps, the amount of time taken can end up as billions of years even for
fairly small problems, and using computers that are a thousand times faster still
doesn't help to solve much bigger problems. Such problems are regarded as
intractable. Mathematically, the boundary between tractable and intractable is
between a polynomial number of steps (polynomials are formulas made up of
$n$, $n^2$, $n^3$ and so on), and an exponential number of steps ($2^n$, $3^n$,
$4^n$, and so on).
The two formulas $n^2$ and $2^n$ look very similar, but they are really massively
different, and can mean a difference between a few seconds and many
millennia for the program to finish. The whole point of this chapter is to develop
an awareness that there are many problems that we have tractable algorithms
for, but there are also many that we haven't found any tractable algorithms for.
It's very important to know about these, since it will be futile to try to write
programs that are intractable, unless you are only going to be processing very
small problems.
Essentially any algorithm that tries out all combinations of the input will
inevitably be intractable because the number of combinations is likely to be
exponential or factorial. Thus an important point is that it's usually not going to
work to design a system that just tries out all possible solutions to see which is
the best.
What about Moore's law, which says that computing power is increasing
exponentially? Perhaps that means that if we wait a while, computers will be
able to solve problems that are currently intractable? Unfortunately this
argument is wrong; intractable problems are also exponential, and so the rate
of improvement due to Moore's law means that it will only allow for slightly
larger intractable problems to be solved. For example, if computing speed is
doubling every 18 months (an optimistic view of Moore's law), and we have an
intractable problem that takes $2^n$ operations to solve (many take longer than
this), then in 18 months we will be able to solve a problem that's just one item
bigger. For example, if you can solve an exponential time problem for 50 items
(50 countries on a map to colour, 50 cities for a salesman to tour, or 50 rings
on a Towers of Hanoi problem) in 24 hours, then in 18 months you can expect
to buy a computer that could solve it for 51 items at best! And in 20 years
(around 13 doublings of speed) you're likely to be able to get a computer that
could solve for only around 63 items in one
day. You're going to have to be more than patient if you want Moore's law to
help out here --- you have to be prepared to wait for decades for a small
improvement!
Researchers have spent a lot of time trying to find efficient solutions to the
Travelling Salesman Problem, yet have been unable to find a tractable
algorithm for solving it. As you learnt in the previous section, intractable
algorithms are very slow, to the point of being impossible to use. As the only
known algorithms for solving the TSP exactly are intractable ones, TSP is known
as an intractable problem.
Current algorithms for finding the optimal TSP solution aren't a lot better than
simply trying out all possible paths through the map (as in the interactive at
the start of this chapter). The number of possible paths gets out of hand; it's an
intractable approach. In the project below you'll be estimating how long it
would take.
While TSP was originally identified as being the problem that sales people face
when driving to several different locations and wanting to visit them in the
order that leads to the shortest route (less petrol usage), the same problem
applies to many other situations as well. Courier and delivery companies have
variants of this problem --- often with extra constraints such as limits on how
long a driver can work for, or allowing for left-hand turns being faster than
right-hand ones (in NZ at least!).
Since these problems are important for real companies, it is not reasonable to
simply give up and say there is no solution. Instead, when confronted with an
intractable problem, computer scientists look for algorithms that produce
approximate solutions --- solutions that are not perfectly correct or optimal, but
are hopefully close enough to be useful. By relaxing the requirement that the
solution has to be perfectly correct, it is often possible to come up with
tractable algorithms that will find good enough solutions in a reasonable time.
This kind of algorithm is called a heuristic - it uses rules of thumb to suggest
good choices and build up a solution made of pretty good choices.
There are software companies that work on trying to make better and better
approximate algorithms for guiding vehicles by GPS for delivery routes.
Companies that write better algorithms can charge a lot of money if their
routes are faster, because of all the fuel and time savings that can be made.
An interesting thing with intractability is that you can have two very similar
problems, with one being intractable and the other being tractable. For
example, finding the shortest route between two points (like a GPS device
usually does) is a tractable problem, yet finding the shortest route around
multiple points (the TSP) isn't. By the way, finding the longest path between
two points (without going along any route twice) is also intractable, even
though finding the shortest path is tractable!
This project is based around a scenario where there is a cray fisher who has
around 18 craypots that have been laid out in open water. Each day the
fisher uses a boat to go between the craypots and check each one for
crayfish.
The cray fisher has started wondering what the shortest route to take to
check all the craypots would be, and has asked you for your help. Because
every few weeks the craypots need to be moved around, the fisher would
prefer a general way of solving the problem, rather than a solution to a
single layout of craypots. Therefore, your investigations must consider
more than one possible layout of craypots, and the layouts investigated
should have the craypots placed randomly, i.e. not in lines, patterns, or
geometric shapes.
When asked to generate a random map of craypots, get a pile of coins (or
counters), one for each craypot you need, and scatter them onto an
A4 piece of paper. If any land on top of each other, place them beside one
another so that they are touching but not overlapping. One by one, remove
the coins, making a dot on the paper in the centre of where each coin was.
Number each of the dots. Each dot represents one craypot that the cray
fisher has to check. You should label the top left corner of the paper as
being the boat dock, where the cray fisher stores the boat.
Generate a map with a small number of craypots (7 or 8) using the method
above. Using your intuition, find the shortest path between the craypots.
Then generate a second map with a larger number of craypots.
Now on this new map, try to use your intuition to find the shortest path
between the craypots. Don’t spend more than 5 minutes on this task; you
don’t need to include the solution in your report. Why was this task very
challenging? Can you be sure you have an optimal solution?
Unless your locations were laid out in a circle or oval, you probably found it
very challenging to find the shortest route. A computer would find it even
harder, as you could at least take advantage of your visual search and
intuition to make the task easier. A computer could only consider two
locations at a time, whereas you can look at more than two. But even for
you, the problem would have been challenging! Even if you measured the
distance between each location and put lines between them and drew it on
the map so that you didn’t have to judge distances between locations in
your head, it’d still be very challenging for you to figure out!
How many possible routes are there for the larger example you have
generated? How is this related to permutation sort, and factorials? How
long would it take to calculate the shortest route in your map, assuming
the computer can check 1 billion (1,000,000,000) possible routes per
second? (i.e. it can check one route per nanosecond) What can you
conclude about the cost of this algorithm? Would this be a good way for the
cray fisher to decide which path to take?
Make sure you show all your mathematical working in your answers to the
above questions!
You should be able to tell that this problem is equivalent to the TSP, and
therefore it is intractable. How can you tell? What is the equivalent to a
town in this scenario? What is the equivalent to a road?
Since we know that this craypot problem is an example of the TSP, and that
there is no known tractable algorithm for the TSP, we know there is no
tractable algorithm for the craypot problem either. Although there are
slightly better algorithms than the one we used above, they are still
intractable and with enough craypots, it would be impossible to work out a
new route before the cray fisher has to move the pots again!
There are several ways of approaching this. Some are better than others in
general, and some are better than others with certain layouts. One of the
more obvious approximate algorithms is to start from the boat dock in the
top left corner of your map and go to the nearest craypot. From there,
you should go to the nearest craypot from that one, and repeatedly go
to the nearest craypot that hasn’t yet been checked. This approach is
known as a greedy heuristic algorithm, as it always makes the decision that
looks the best at the current time, rather than making a not-so-good
decision now to try and get a bigger payoff later. You will understand why
this doesn’t necessarily lead to the optimal solution after completing the
following exercises.
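In code, the greedy heuristic described above might look like this sketch (Python, with craypots as (x, y) coordinate pairs; all the names are illustrative):

```python
import math

def greedy_route(dock, craypots):
    """Always travel to the nearest craypot that hasn't been checked yet."""
    route = [dock]
    unchecked = list(craypots)
    while unchecked:
        here = route[-1]
        nearest = min(unchecked, key=lambda pot: math.dist(here, pot))
        route.append(nearest)
        unchecked.remove(nearest)
    return route

pots = [(3, 4), (1, 1), (6, 2), (5, 5)]
print(greedy_route((0, 0), pots))
# [(0, 0), (1, 1), (3, 4), (5, 5), (6, 2)]
```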
On a copy of each of your 2 maps you generated, draw lines between the
craypots to show the route you would find following the greedy algorithm
(you should have made more than one copy of each of the maps!)
For your map with the smaller number of craypots (7 or 8), compare your
optimal solution and your approximate solution. Are they the same? Or
different? If they are the same, would they be the same in all cases? Show
a map where they would be different (you can choose where to place the
craypots yourself, just use as many craypots as you need to illustrate the
point).
For your larger map, show why you don’t have an optimal solution. The
best way of doing this is to show a route that is similar to, but shorter than
the approximate solution. The shorter solution you find doesn’t have to be
the optimal solution, it just has to be shorter than the one identified by the
approximate algorithm (Talk to your teacher if you can’t find a shorter route
and they will advise on whether or not you should generate a new map).
You will need to show a map that has a greedy route and a shorter route
marked on it. Explain the technique you used to show there was a shorter
solution. Remember that it doesn’t matter how much shorter the new
solution you identify is, just as long as it is at least slightly shorter than the
approximate solution --- you are just showing that the approximate solution
couldn’t possibly be the optimal solution by showing that there is a shorter
solution than the approximate solution.
Why would it be important to the cray fisher to find a short route between
the craypots, as opposed to just visiting them in a random order? Discuss
other problems that are equivalent to TSP that real world companies
encounter every day. Why is it important to these companies to find good
solutions to TSP? Estimate how much money a courier company might be
wasting over a year if their delivery routes were 10% worse than the
optimal. How many different locations/towns/etc might their TSP solutions
have to be able to handle?
Find a craypot layout that results in the greedy algorithm finding what
seems to be a really inefficient route. Why is it inefficient? Don’t worry about
trying to find an actual worst case, just find a case that seems to be quite
bad. What is a general pattern that seems to make this greedy algorithm
inefficient?
There is no known tractable solution to the bin packing problem (fitting items
of different sizes into as few fixed-size bins as possible), which means that we
currently only have exponential-time algorithms that work out the minimum number
of bins needed. However, there are a number of heuristics that can very quickly
give a non-optimal solution. One of these is the first fit algorithm. This
algorithm uses the following steps:
1. Select an item.
2. Place it in the first bin that has space available for it. If there are no bins
with space, add a new bin.
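As a minimal sketch of these two steps in Python (the bin capacity and item sizes are made up for the example):

```python
def first_fit(items, capacity):
    """Place each item into the first bin with room, adding bins as needed."""
    bins = []
    for item in items:
        for contents in bins:
            if sum(contents) + item <= capacity:
                contents.append(item)
                break
        else:
            bins.append([item])  # no existing bin had space, so open a new one
    return bins

print(first_fit([4, 8, 1, 4, 2, 1], capacity=10))
# [[4, 1, 4, 1], [8, 2]]
```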
Try out these algorithms or your own ideas using this interactive:
Use the interactive online at http://www.csfieldguide.org.nz/en/interactives/bin-packing/
index.html
• http://www.cs4fn.org/algorithms/tiles.php
• http://www.cs4fn.org/algorithms/uncomputable.php
• http://www.cs4fn.org/algorithms/haltingproblem.php
It's good to know about these issues, to avoid getting stuck writing impossible
programs. It's also a fascinating area of research with opportunities to make a
discovery that could change the world of computing, as well as contribute to
our understanding of what can and can't be computed.
Movie and gaming companies can't always just use existing software to make
the next great thing --- they need computer scientists to come up with better
graphics techniques to make something that's never been seen before. The
creative possibilities are endless!
In this chapter we'll look at some of the basic techniques that are used to
create computer graphics. These will give you an idea of the techniques that
are used in graphics programming, although it's just the beginning of what's
possible.
For this chapter we are using a system called WebGL which can render 3D
graphics in your browser. If your browser is up to date everything should be
fine. If you have issues, or if the performance is poor, there is information here
about how to get it going.
Let's start with some simple but common calculations that are needed in
graphics programming. The following interactive shows a cube with symbols on
each face. You can move it around using what's called a transform, which
simply adjusts where it is placed in space. Try typing in 3D coordinates into this
interactive to find each code.
There are several transformations that are used in computer graphics, but the
most common ones are translation (moving the object), rotation (spinning it)
and scaling (changing its size). They come up often in graphics because they
are applied not only to objects, but to things like the positions of the camera
and lighting sources.
In this section you can apply transformations to various images. We'll start by
making the changes manually, one point at a time, but we'll move up to a quick
shortcut method that uses a matrix to do the work for you. We'll start by
looking at how these work in two dimensions - it's a bit easier to think about
than three dimensions.
The following interactive shows an arrow, and on the left you can see a list of
the points that correspond to its 7 corners (written as what are usually
referred to as Cartesian coordinates). The arrow is on a grid, where the centre
point is the "zero" point.
Points are specified using two numbers, x and y, usually written as (x,y). The x
value is how far the point is to the right of the centre and the y value is how far
above the centre it is. For example, the first point in the list is the tip at (0,4),
which means it's 0 units to the right of the centre (i.e. at the centre), and 4
units above it. Which point does the last pair (3,1) correspond to? What does it
mean if a coordinate has a negative x value?
The transform you did in the above interactive is called a translation --- it
translates the arrow around the grid. This kind of transform is used in graphics
to specify where an object should be placed in a scene, but it has many other
uses, such as making an animated object move along a path, or specifying the
position of the imaginary camera (viewpoint).
In the following interactive, try to get the blue arrow to match up with the red
one. It will require a mixture of scaling and translation.
Next, see what happens if you swap the x and y value for each coordinate.
Typing all these coordinates by hand is inefficient. Luckily there's a much better
way of achieving all this. Read on!
To double the size of an image you can use the matrix

$\begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$

where the top left value just means multiply all the x values by 2, and the
bottom right value means multiply all the y values by 2.
At this stage you may want to have the interactive open in a separate window
so that you can read the text below and work on the interactive at the same
time.
For the rightmost point (starting at (3,1)), the matrix multiplication for scaling by
a factor of 2 is:

$\begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} 3 \\ 1 \end{pmatrix} = \begin{pmatrix} 6 \\ 2 \end{pmatrix}$

This gives us a new position of (6,2) for the rightmost point, which matches the
previous interactive after applying the scaling matrix! This same matrix
multiplication is applied to each of the seven points on the arrow.
For the rightmost point (starting at (3,1)), the matrix multiplication for scaling by
a factor of 3 is:

$\begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 3 \\ 1 \end{pmatrix} = \begin{pmatrix} 9 \\ 3 \end{pmatrix}$

For the rightmost point (starting at (3,1)), the matrix multiplication for scaling by
a factor of 0.2 is:

$\begin{pmatrix} 0.2 & 0 \\ 0 & 0.2 \end{pmatrix} \begin{pmatrix} 3 \\ 1 \end{pmatrix} = \begin{pmatrix} 0.6 \\ 0.2 \end{pmatrix}$
By now you might be starting to see a recurring pattern in our matrix
multiplication for scaling. To scale by a factor of s, we can apply the general
rule:

$\begin{pmatrix} s & 0 \\ 0 & s \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} sx \\ sy \end{pmatrix}$
Pick 2 or 3 more points on the arrow (include some with negative x and y
values) and try to do the matrix multiplication for scaling each factor above
(2, 3 and 0.2). You'll know if you got the correct answer because it should
match the scaled arrow in the interactive!
A simple way of looking at the matrix is that the top row determines the
transformed x value, simply by saying how much of the original x value and y
value contribute to the new x value. So in the matrix:

$\begin{pmatrix} 2 & 0 \\ 0 & 4 \end{pmatrix}$
the top row just means that the new x value is 2 lots of the original x, and none
of the original y, which is why all the x values double. The second row
determines the y value: in the above example, it means that the new y value
uses none of the original x, but 4 times the original y value. If you try this
matrix, you should find that the location of all the x points is doubled, and the
location of all the y points is multiplied by 4.
Where it gets interesting is when you use a little of each value; try the following
matrix:

$\begin{pmatrix} 0.7 & 0.7 \\ -0.7 & 0.7 \end{pmatrix}$
Now the x value of each coordinate is a mixture of 0.7 of the original x, and 0.7
of the original y. This is called a rotation.
In general, to rotate an image by a given angle you need to use the sine
(abbreviated sin) and cosine (abbreviated cos) functions from trigonometry. You
can use the interactive below to calculate values for the sin and cos functions.
To rotate the image anticlockwise by $\theta$ degrees, you'll need the following values
in the matrix, which rely on trig functions:

$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$
The order in which translation and scaling happen makes a difference. Try the
following challenge!
In the above interactive, you'll have noticed that scaling is affected by how far
the object is from the centre. If you want to scale around a fixed point in the
object (so it expands where it is), then an easy way is to translate it back to the
centre (also called the origin), scale it, and then translate it back to where it
was. The following interactive allows you to move the arrow, then scale it, and
move it back.
The same problem comes up with rotation. The following interactive allows you
to use a translation first to make the rotation more predictable.
Now that you've had a bit of practice with translation, scaling and rotation, try
out these two challenges that combine all three:
These combined transformations are common, and they might seem like a lot
of work because each matrix has to be applied to every point in an object. Our
arrows only had 7 points, but complex images can have thousands or even
millions of points in them. Fortunately we can combine all the matrix operations
in advance to give just one operation to apply to each point.
It's a bit complicated, but this calculation is only done once to work out the
combined transformation, and it gives you a single matrix that will provide two
transforms in one operation.
As a simple example, consider what happens when you scale by 2 and then
rotate by 45 degrees. The two matrices to multiply work out like this (using 0.7
as an approximation of sin 45° and cos 45°):

$\begin{pmatrix} 0.7 & -0.7 \\ 0.7 & 0.7 \end{pmatrix} \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} = \begin{pmatrix} 1.4 & -1.4 \\ 1.4 & 1.4 \end{pmatrix}$
You can put the matrix we just calculated into the following interactive to see if
it does indeed scale by 2 and rotate 45 degrees. Also try making up your own
combination of transforms to see if they give the result you expect.
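If you'd like to check a combination like this yourself, a matrix library will do the arithmetic. Here's a sketch using NumPy (assuming it is installed; the 0.7 and 1.4 values are the same rounded approximations used above):

```python
import numpy as np

scale = np.array([[2.0, 0.0],
                  [0.0, 2.0]])
rotate45 = np.array([[0.7, -0.7],
                     [0.7,  0.7]])   # anticlockwise by 45 degrees (approx.)

combined = rotate45 @ scale           # one matrix that does both transforms
print(combined)                       # [[ 1.4 -1.4]
                                      #  [ 1.4  1.4]]

point = np.array([3.0, 1.0])
# Applying the combined matrix once gives the same result as scaling
# the point and then rotating it.
print(combined @ point)               # [2.8 5.6]
```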
The project below gives the chance to explore combining matrices, and has an
interactive that will calculate the multiplied matrices for you.
12.2.3. 3D transforms
So far we've just done the transforms in two dimensions. To do this in 3D, we
need a z coordinate as well, which is the depth of the object into the screen. A
matrix for operating on 3D points is 3 by 3. For example, the 3D matrix for
doubling the size of an object is as follows; it multiplies each of the x, y and z
values of a point by 2.

$\begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix}$
The above image mesh has 3644 points in it, and your matrix was applied to
each one of them to work out the new image.
The next interactive allows you to do translation (using a vector). Use it to get
used to translating in the three dimensions (don't worry about using matrices
this time.)
Now try the following matrix on the image above. This is rotating around the
z-axis (a line going into the screen); that is, it's just moving the image around
in the 2D plane. It's really the same as the rotation we used previously, as the
last line (0, 0, 1) just keeps the z point the same:

$\begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}$

Then try the following matrix, which rotates around the x-axis (notice that the x
value always stays the same because of the 1,0,0 in the first line):

$\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}$
In the above examples, when you have several matrices being applied to every
point in the image, a lot of time can be saved by converting the series of
matrices and transforms to just one formula that does all of the transforms in
one go. The following interactive can do those calculations for you.
For example, in the following interactive, type in the matrix for doubling the
size of an object (put the number 2 instead of 1 on the main diagonal values),
then add another matrix that triples the size of the image (3 on the main
diagonal). The interactive shows a matrix on the right that combines the two ---
does it look right?
Project: 3D transforms
For this project, you will demonstrate what you've learned in the section
above by explaining a 3D transformation of a few objects. You should take
screenshots of each step to illustrate the process for your report.
Because you can't save your work in the interactives, keep notes and
screen shots as you go along. These will be useful for your report, and also
can be used if you need to start over again.
Introduce your project with a few examples of 3D images, and how they are
used (perhaps from movies or scenes that other people have created).
Describe any innovations in the particular image (e.g. computer generated
movies usually push the boundaries of what was previously possible, so
discuss what boundaries were moved by a particular movie, and who wrote
the programs to achieve the new effects). One way to confirm that a movie
is innovative in this area is if it has won an award for the graphics software.
Give simple examples of translation, scaling and rotation using your scene.
You should include multiple transforms applied to one object, and show
how they can be used to position an object.
Note that this project can be very time consuming, because these are
powerful systems and there is quite a bit of detail to get right even for a
simple operation.
In 3D graphics shapes are often stored using lines and curves that mark out the
edges of tiny flat surfaces (usually triangles), each of which is so small that you
can't see it unless you zoom right in.
The lines and circles that specify an object are usually given using numbers (for
example, a line between a given starting and finishing position or a circle with a
given centre and radius). From this a graphics program must calculate which
pixels on the screen should be coloured in to represent the line or circle, or it
may just need to work out where the line is without drawing it.
For example, here's a grid of pixels with 5 lines shown magnified. The vertical
line would have been specified as going from pixel (2,9) to (2,16) --- that is,
starting 2 across and 9 up, and finishing 2 across and 16 up. Of course, this is
only a small part of a screen, as screens normally measure 1000 by 1000 pixels
or more; even a smartphone display is hundreds of pixels high and wide.
These are things that are easy to do with pencil and paper using a ruler and
compass, but on a computer the calculations need to be done for every pixel,
and if you use the wrong method then it will take too long and the image will be
displayed slowly or a live animation will appear jerky. In this section we will look
into some very simple but clever algorithms that enable a computer to do
these calculations very quickly.
On the following grid, try to draw these straight lines by filling in pixels in the
grid:
Drawing a horizontal, vertical or diagonal line like the ones above is easy; it's
the ones at different angles that require some calculation.
Without using a ruler, can you draw a straight line from A to B on the following
grid by colouring in pixels?
Once you have finished drawing your line, try checking it with a ruler. Place the
ruler so that it goes from the centre of A to the centre of B. Does it cross all of
the pixels that you have coloured?
A straight line through the pixels can be described by the formula y = mx + b,
where m is the slope of the line and b is where it crosses the y axis. For
example, choosing m = 2 and b = 3 means that the line would go through
the points (0,3), (1,5), (2,7), (3,9) and so on. This line goes up 2 pixels for every
one across (m = 2), and crosses the y axis 3 pixels up (b = 3).
You should experiment with drawing graphs for various values of m and b (for
example, start with b = 0 and try a few different values of m) by putting in the
values. What angle are these lines at?
The formula y = mx + b can be used to work out which pixels should be coloured
in for a line that goes between (x1, y1) and (x2, y2). What are (x1, y1) and
(x2, y2) for the points A and B on the grid below?
See if you can work out the m and b values for a line from A to B, or you can
calculate them using the following formulas:
m = (y2 - y1) / (x2 - x1)
b = y1 - m * x1
Now draw the same line as in the previous section (between A and B) using the
formula y = mx + b to calculate y for each value of x from x1 to x2 (you will
need to round y to the nearest integer to work out which pixel to colour in). If
the formulas have been applied correctly, the y value should range from y1 to
y2.
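Here's a minimal Python sketch of this method; the end points are made-up values for illustration:

# Drawing a line using the y = m*x + b formula
x1, y1 = 2, 3                  # example start point
x2, y2 = 12, 7                 # example end point

m = (y2 - y1) / (x2 - x1)      # slope of the line
b = y1 - m * x1                # where the line crosses the y axis

for x in range(x1, x2 + 1):
    y = round(m * x + b)       # a multiplication and an addition for every pixel
    print("colour in pixel", (x, y))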
Once you have completed the line, check it with a ruler. How does it compare
to your first attempt?
Now consider the number of calculations that are needed to work out each
point. It won't seem like many, but remember that a computer might be
calculating hundreds of points on thousands of lines in a complicated image.
Although this formula works fine, it's too slow to generate the complex graphics
needed for good animations and games. In the next section we will explore a
method that greatly speeds this up.
To draw a line from (x1, y1) to (x2, y2), first calculate three values:
A = 2 × (y2 - y1), B = A - 2 × (x2 - x1), and the initial decision value
P = A - (x2 - x1). To draw the line, fill the starting pixel, and then for every
position along the x axis:
• if P is less than 0, draw the new pixel on the same line as the last pixel,
and add A to P.
• if P was 0 or greater, draw the new pixel one line higher than the last
pixel, and add B to P.
• repeat this decision until we reach the end of the line.
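Here's the same idea as a Python sketch, for lines that rise gently from left to right (steeper lines need the x and y roles swapped):

# Bresenham's line algorithm --- only integer additions inside the loop
x1, y1 = 2, 3                  # example start point
x2, y2 = 12, 7                 # example end point

A = 2 * (y2 - y1)
B = A - 2 * (x2 - x1)
P = A - (x2 - x1)              # the initial decision value

y = y1
print("colour in pixel", (x1, y1))    # fill the starting pixel
for x in range(x1 + 1, x2 + 1):
    if P < 0:
        P = P + A              # stay on the same line
    else:
        y = y + 1              # move one line higher
        P = P + B
    print("colour in pixel", (x, y))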
Without using a ruler, use Bresenham's Line Algorithm to draw a straight line
from A to B:
Once you have completed the line, check it with a ruler. How does it compare
to the previous attempts?
Now use Bresenham's algorithm to draw the line. Check that it gives the same
points as you would have chosen using a ruler, or using the formula y = mx + b.
How many arithmetic calculations (multiplications and additions) were needed
for Bresenham's algorithm? How many would have been needed if you used the
y = mx + b formula? Which is faster (bear in mind that adding is a lot faster
than multiplying for most computers)?
12.3.5. Circles
As well as straight lines, another common shape that computers often need to
draw are circles. An algorithm similar to Bresenham's line drawing algorithm,
called the Midpoint Circle Algorithm, has been developed for drawing a circle
efficiently.
A circle is defined by a centre point and a radius; every point on the circle is
exactly the radius distance from the centre.
Follow the rules to draw a circle on the grid, using (cx, cy) as the centre of the
circle, and r as the radius. Notice that it will only draw the start of the circle and
then it stops because x is greater than y!
When x becomes greater than y, one eighth (an octant) of the circle is drawn.
The remainder of the circle can be drawn by reflecting the octant that you
already have (you can think of this as repeating the pattern of steps you just
did in reverse). You should reflect pixels along the x and y axes, so that each line
of reflection crosses the middle of the centre pixel of the circle. Half of the
circle is now drawn, the left and the right half. To add the remainder of the
circle, another line of reflection must be used. Can you work out which line of
reflection is needed to complete the circle?
A quadrant is a quarter of an area; the four quadrants that cover the whole
area are marked off by a vertical and horizontal line that cross. An octant is
one eighth of an area, and the 8 octants are marked off by 4 lines that
intersect at one point (vertical, horizontal, and two diagonal lines).
To complete the circle, you need to reflect along the diagonal. The line of
reflection should have a slope of 1 or -1, and should cross through the middle of
the centre pixel of the circle.
By the way, this kind of algorithm can be adapted to draw ellipses, but it has to
draw a whole quadrant because you don't have octant symmetry in an ellipse.
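One way to write the whole circle-drawing process in Python is sketched below: it follows the decision rules for one octant, and reflects each computed point into the other seven octants as it goes (cx, cy and r are made-up example values):

# Midpoint circle algorithm
cx, cy, r = 20, 20, 10         # example centre and radius

x, y = 0, r
d = 1 - r                      # the decision value
while x <= y:                  # stop once x is greater than y
    # reflect the point (x, y) into all eight octants
    for px, py in [(x, y), (y, x), (-x, y), (-y, x),
                   (x, -y), (y, -x), (-x, -y), (-y, -x)]:
        print("colour in pixel", (cx + px, cy + py))
    if d < 0:
        d = d + 2 * x + 3          # midpoint was inside the circle: keep y
    else:
        d = d + 2 * (x - y) + 5    # midpoint was outside: bring y in by one
        y = y - 1
    x = x + 1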
Computer scientists have found fast algorithms for drawing other shapes too,
which means that the image appears quickly, and graphics can display quickly
on relatively slow hardware --- for example, a smartphone needs to do these
calculations all the time to display images, and reducing the amount of
calculations can extend its battery life, as well as make it appear faster.
As usual, things aren't quite as simple as shown here. For example, consider a
horizontal line that goes from (0,0) to (10,0), which has 11 pixels. Now compare
it with a 45 degree line that goes from (0,0) to (10,10). It still has 11 pixels, but
the line is longer (about 41% longer to be precise). This means that the line
would appear thinner or fainter on a screen, and extra work needs to be done
(mainly anti-aliasing) to make the line look ok. We've only just begun to explore
how techniques in graphics are needed to quickly render high quality images.
You can estimate how long each operation takes on your computer by
running a program that does thousands of each operation, and timing how
long it takes for each. From this you can estimate the total time taken by
each of the two methods. A good measurement for these is how many lines
(of your chosen length) your computer could calculate per second.
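For example, here's one rough way to time the basic operations in Python, using the standard timeit module (the loop count is arbitrary, and an interpreted language adds overhead of its own, so treat the numbers as a rough guide only):

import timeit

adds = timeit.timeit("y = a + b", setup="a, b = 3.1, 4.2", number=10000000)
muls = timeit.timeit("y = a * b", setup="a, b = 3.1, 4.2", number=10000000)
print("10 million additions took", adds, "seconds")
print("10 million multiplications took", muls, "seconds")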
If you're evaluating how fast circle drawing is, you can compare the
number of addition and multiplication operations with those required by the
Pythagoras formula that is the basis for the simple equation of a circle. For
this calculation, the line from the centre of the circle to a particular pixel on
the edge is the hypotenuse of a right-angled triangle, and the other two
sides are a horizontal line from the centre and a vertical line up to the point
that we're wanting to locate. You'll need to calculate the y value for each x
value; the length of the hypotenuse is always equal to the radius.
Line and circle drawing are just the beginning of computer graphics; other
techniques needed to render a scene quickly and realistically include:
• lighting (so that virtual lights can be added to the scene, which then
creates shading and depth),
• texture mapping (to simulate realistic materials like grass, skin, wood,
water, and so on),
• anti-aliasing (to avoid jagged edges and strange artifacts when images are
rendered digitally),
• projection (working out how to map the 3D objects in a scene onto a 2D
screen that the viewer is using),
• hidden object removal (working out which parts of an image can't be seen
by the viewer),
• photo-realistic rendering (making the image as realistic as possible), as
well as deliberately un-realistic simulations, such as "painterly rendering"
(making an image look like it was done with brush strokes), and
• efficiently simulating real-world phenomena like fire, waves, human
movement, and so on.
Matrix operations are used for many things other than computer graphics,
including computer vision, engineering simulations, and solving complex
equations. Although GPUs were developed for computer graphics, they are
often used as processors in their own right because they are so fast at such
calculations.
The idea of homogeneous coordinates was developed 100 years before the
first working computer existed, and it's almost 200 years later that
Möbius's work is being used on millions of computers to render fast
graphics. An animation of a Möbius strip therefore uses two of his ideas,
bringing things full circle, so to speak.
With increases in computer power, the decrease in the size of computers and
progressively more advanced algorithms, computer vision has a growing range
of applications. While it is commonly used in fields like healthcare, security and
manufacturing, we are finding more and more uses for it in our everyday life,
too.
Having a small portable device that can "see" and translate characters makes a
big difference for travellers. Note that the translation given is only for the
second part of the phrase (the last two characters). The first part says “please
don’t”, so it could be misleading if you think it’s translating the whole phrase!
Recognising Chinese characters may not work perfectly every time, though.
Here is a warning sign:
My phone has been able to translate the “careful” and “steep” characters, but
it hasn’t recognised the last character in the line. Why do you think that might
be?
Giving users more information through computer vision is only one part of the
story. Capturing information from the real world allows computers to assist us in
other ways too. In some places, computer vision is already being used to help
car drivers to avoid collisions on the road, warning them when other cars are
too close or there are other hazards on the road ahead. Combining computer
vision with map software, people have now built cars that can drive to a
destination without needing a human driver to steer them. A wheelchair
guidance system can take advantage of vision to avoid bumping into doors,
making it much easier for someone with limited mobility to operate.
Human eyes have a very sensitive area in the centre of their field of vision
called the fovea. Objects that we are looking at directly are in sharp detail,
while our peripheral vision is quite poor. We have separate sets of cone cells in
the retina for sensing red, green and blue (RGB) light, but we also have special
rod cells that are sensitive to light levels, allowing us to perceive a wide
dynamic range of bright and dark colours. The retina has a blind spot (a place
where all the nerves bundle together to send signals to the brain through the
optic nerve), but most of the time we don’t notice it because we have two eyes
with overlapping fields of view, and we can move them around very quickly.
Digital cameras have uniform sensitivity to light across their whole field of
vision. Light intensity and colour are picked up by RGB sensor elements on a
silicon chip, but they aren’t as good at capturing a wide range of light levels as
our eyes are. Typically, a modern digital camera can automatically tune its
exposure to either bright or dark scenes, but it might lose some detail (e.g.
when it is tuned for dark exposure, any bright objects might just look like white
blobs).
It is important to understand that neither a human eye nor a digital camera ---
even a very expensive one --- can perfectly capture all of the information in the
scene in front of it. Electronic engineers and computer scientists are constantly
doing research to improve the quality of the images they capture, and the
speed at which they can record and process them.
13.3. Noise
One challenge when using digital cameras is something called noise. That’s
when individual pixels in the image appear brighter or darker than they should
be, due to interference in the electronic circuits inside the camera. It’s more of
a problem when light levels are dark, and the camera tries to boost the
exposure of the image so that you can see more. You can see this if you take a
digital photo in low light, and the camera uses a high ASA/ISO setting to
capture as much light as possible. Because the sensor has been made very
sensitive to light, it is also more sensitive to random interference, and gives
photos a "grainy" effect.
Noise mainly appears as random changes to pixels. For example, the following
image has "salt and pepper" noise.
Having noise in an image can make it harder to recognise what's in the image,
so an important step in computer vision is reducing the effect of noise in an
image. There are well-understood techniques for this, but they have to be
careful that they don’t discard useful information in the process. In each case,
the technique has to make an educated guess about the image to predict which
of the pixels that it sees are supposed to be there, and which aren’t.
Since a camera image captures the levels of red, green and blue light
separately for each pixel, a computer vision system can save a lot of
processing time in some operations by combining all three channels into a
single “grayscale” image, which just represents light intensities for each pixel.
This helps to reduce the level of noise in the image. Can you tell why, and
about how much less noise there might be? (As an experiment, you could take
a photo in low light --- can you see small patches on it caused by noise? Now
use photo editing software to change it to black and white --- does that reduce
the effect of the noise?)
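If you have Python handy, you can simulate the experiment instead of taking photos. This sketch (using the numpy library, with an arbitrary noise level) fakes a plain grey image with independent noise in each colour channel, then averages the channels:

import numpy as np

rng = np.random.default_rng()
noise_level = 10.0
# a flat grey 100x100 photo with random noise added to each colour channel
rgb = 128 + rng.normal(0, noise_level, size=(100, 100, 3))

grey = rgb.mean(axis=2)        # average the red, green and blue values
print("colour channel noise:", rgb[:, :, 0].std())
print("greyscale noise:", grey.std())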
Rather than just considering the red, green and blue values of each pixel
individually, most noise-reduction techniques look at other pixels in a region, to
predict what the value in the middle of that neighbourhood ought to be.
A mean filter assumes that pixels nearby will be similar to each other, and
takes the average (i.e. the mean) of all pixels within a square around the centre
pixel. The wider the square, the more pixels there are to choose from, so a very
wide mean filter tends to cause a lot of blurring, especially around areas of fine
detail and edges where bright and dark pixels are next to each other.
A median filter takes a different approach. It collects all the same values that
the mean filter does, but then sorts them and takes the middle (i.e. the median
) value. This helps with the edges that the mean filter had problems with, as it
will choose either a bright or a dark value (whichever is most common), but
won’t give you a value between the two. In a region where pixels are mostly the
same value, a single bright or dark pixel will be ignored. However, numerically
sorting all of the neighbouring pixels can be quite time-consuming!
A Gaussian blur is another common technique, which assumes that the closest
pixels are going to be the most similar, and pixels that are farther away will be
less similar. It works a lot like the mean filter above, but is statistically weighted
according to a normal distribution.
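The following Python sketch shows the first two filters using the numpy library; it uses a 3 by 3 neighbourhood and, to keep it short, leaves the border pixels untouched:

import numpy as np

def mean_filter(image, size=3):
    # replace each pixel with the mean of the size x size square around it
    h, w = image.shape
    half = size // 2
    out = image.copy()
    for row in range(half, h - half):
        for col in range(half, w - half):
            region = image[row-half:row+half+1, col-half:col+half+1]
            out[row, col] = region.mean()
    return out

def median_filter(image, size=3):
    # replace each pixel with the median of the square around it
    h, w = image.shape
    half = size // 2
    out = image.copy()
    for row in range(half, h - half):
        for col in range(half, w - half):
            region = image[row-half:row+half+1, col-half:col+half+1]
            out[row, col] = np.median(region)   # the (slow) sorting step
    return out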
13.3.1. Activity: noise reduction filters
Open the noise reduction filtering interactive below and experiment with the
settings.
For this activity, investigate the different kinds of noise reduction filter and their
settings (grid size, type of blur) and determine:
• how well they cope with different levels of noise (you can set this in the
interactive or upload your own noisy photos).
• how much time it takes to do the necessary processing (the interactive
shows the number of pixels per second that it can process)
• how they affect the quality of the underlying image (try a variety of
images, including photos from your own camera)
You can take screenshots of the image to show the effects in your writeup. You
can discuss the tradeoffs that need to be made to reduce noise.
13.4. Thresholding
Another technique that is sometimes useful in computer vision is thresholding.
This is something which is relatively simple to implement, but can be quite
useful.
If you have a greyscale image, and you want to detect regions that are darker
or lighter, it could be useful to simply transform any pixel above a certain level
of lightness to pure white, and any other pixel to pure black. In the Data
Representation chapter we talked about how pixels are all represented as sets
of numbers between 0 and 255 which represent red, green and blue values. We
can change a pixel to greyscale by simply taking the average of all three of
these values and setting all three values to be this average.
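Both steps are only a few lines in Python with the numpy library (the level of 128 here is an arbitrary choice):

import numpy as np

def threshold(rgb_image, level=128):
    # average the red, green and blue values to get a greyscale image
    grey = rgb_image.mean(axis=2)
    # pixels lighter than the level become white, everything else black
    return np.where(grey > level, 255, 0)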
Once we have done this, we can apply a threshold to the image to decide
whether certain pixels are in certain regions. The following interactive allows
you to do this to an image you upload. Try using different thresholds on
different images and observing the results.
The next interactive lets you do the same thing, but on the original colour
image. You can set up more complicated statements as your threshold, such as
"Red > 127 AND Green < 127 OR Blue > 63". Does applying these complex
thresholds gives you more flexibility in finding specific regions of colour?
Thresholding on its own isn't a very powerful tool, but it can be very useful
when combined with other techniques as we shall see later.
There is some information about How facial recognition works that you can read
as background, and some more information at i-programmer.info.
There are some relevant articles on the cs4fn website that also provide some
general material on computer vision.
Try using face recognition on this website to see how well the Haar face
recognition system can track a face in the image. What prevents it from
tracking a face? Is it affected if you cover one eye or wear a hat? How much
can the image change before it isn't recognised as a face? Is it possible to get it
to incorrectly recognise something that isn't a face?
For example, here's a photo where you might want to recognise individual
objects:
And here's a version that has been processed by an edge detection algorithm:
Notice that the grain on the table above has affected the quality; some pre-
processing to filter that would have helped!
There are a few commonly used convolutional kernels that people have come
up with for finding edges. After you've had a go at coming up with some of your
own, have a look at the Prewitt operator, the Roberts cross and Sobel operator
on wikipedia. Try these out in the interactive. What results do you get from
each of these?
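As a sketch of what the interactive is doing, here's how you could apply the Sobel kernels yourself in Python, using the scipy library's 2D convolution:

import numpy as np
from scipy.signal import convolve2d

sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])    # responds to vertical edges
sobel_y = sobel_x.T                 # responds to horizontal edges

def sobel_edges(grey_image):
    gx = convolve2d(grey_image, sobel_x, mode="same")
    gy = convolve2d(grey_image, sobel_y, mode="same")
    return np.sqrt(gx**2 + gy**2)   # combined edge strength at each pixel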
There are a number of good edge detection algorithms out there, but one of the
more famous ones is the Canny edge detection algorithm. This is a widely used
algorithm in computer vision, developed in 1986 by John F. Canny. You can read
more about Canny edge detection on Wikipedia.
You could extend the techniques used in the above interactive by adding a few
more processing stages. If you were to apply a Gaussian filter to the image first,
then do some work to favour edges that were connected to other edges, then
you would be on your way to implementing the Canny edge detector.
• Can the detector find all edges in the image? If there are some missing,
why might this be?
• Are there any false edge detections? Why did the system think that they
were edges?
• Does the lighting on the scene affect the quality of edge detection?
• Does the system find the boundary between two colours? How similar can
the colours be and still have the edge detected?
• How fast can the system process the input? Does this change with more
complex kernels?
• How well does the system deal with an image with text on it?
x = (a+b)*(c+d)
x = (a+b)*c+d)
When you try to compile or run the program, the computer will tell you that
there's an error. If it's really helpful, it might even suggest where the error is,
but it won't run the program until you fix it.
This might seem annoying, but in fact by enforcing precision and attention to
detail it helps pinpoint mistakes before they become bugs in the program that
go undetected until someone using it complains that it's not working correctly.
Whenever you get errors like this, you're dealing with a formal language.
Formal languages specify strict rules such as "all parentheses must be
balanced", "all commands in the program must be keywords selected from a
small set", or "the date must contain three numbers separated by dashes".
Formal languages aren't just used for programming languages --- they're used
anywhere the format of some input is tightly specified, such as typing an email
address into a web form.
In all these cases, the commands that you have typed (whether in Python,
Scratch, Snap!, C, Pascal, Basic, C#, HTML, or XML) are being read by a
computer program. (That's right... Python is a program that reads in Python
programs.) In fact, the compiler for a programming language is often written in
its own language. Most C compilers are written in C --- which begs the question,
who wrote the first C compiler (and what if it had bugs)?! Computer Scientists
have discovered good ways to write programs that process other programs,
and a key ingredient is that you have to specify what is allowed in a program
very precisely. That's where formal languages come in.
Many of the concepts we'll look at in this chapter are used in a variety of other
situations: checking input to a web page; analysing user interfaces; searching
text, particularly with “wild cards” that can match any sequence of characters;
creating logic circuits; specifying communication protocols; and designing
embedded systems. Some advanced concepts in formal languages are even
used to explore the limits of what can be computed.
Once you're familiar with the idea of formal languages, you'll possess a
powerful tool for cutting complex systems down to size using an easily
specified format.
That's a pretty simple search (though the results may have surprised you!). But
now we introduce the wildcard code, which in this case is "." --- this is a widely
used convention in formal languages. This matches any character at all. So now
you can do a search like
tim.b
and you will get any words that have both "tim" and "b" with a single character
--- any character --- in between. Are there any words that match "tim..b"?
"tim...b"? You can specify any number of occurrences of a symbol by putting a
'*' after it (again a widely used convention), so:
tim.*b
will match any words where "tim" is followed by "b", separated by any number
of characters --- including none.
x.*y.*z
• Can you find words that contain your name, or your initials?
• What about words containing the letters from your name in the correct
order?
• Are there any words that contain all the vowels in order (a, e, i, o, u)?
The code you've used above is a part of a formal language called a "regular
expression". Computer programs that accept typed input use regular
expressions for checking items like dates, credit card numbers and product
codes. They’re used extensively by programming language compilers and
interpreters to make sense of the text that a programmer types in. We'll look at
them in more detail in the section on regular expressions.
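For example, in Python (one language among many) the built-in re module runs the same kinds of searches you tried above:

import re

words = ["timber", "timbre", "time bomb", "tomcat"]
for word in words:
    if re.search(r"tim.*b", word):
        print(word, "matches")    # timber, timbre and time bomb all match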
Next we examine a simple system for reading input called a finite state
automaton, which --- as we'll find out later --- is closely related to regular
expressions. Later we'll explore the idea of grammars, another kind of formal
language that can deal with more complicated forms of input.
Here's a map of a commuter train system for the town of Trainsylvania. The
trouble is, it doesn't show where the trains go --- all you know is that there are
two trains from each station, the A-train and the B-train. The inhabitants of
Trainsylvania don't seem to mind this --- it's quite fun choosing trains at each
station, and after a while you usually find yourself arriving where you intended.
You can travel around Trainsylvania yourself using the following interactive.
You're starting at the City Mall station, and you need to find your way to
Suburbopolis. At each station you can choose either the A-train or the B-train ---
press the button to find out where it will take you. But, like the residents of
Trainsylvania, you'll probably want to start drawing a map of the railway,
because later you might be asked to find your way somewhere else. If you want
a template to draw on, you can print one out from here.
Did you find a sequence of trains to get from City Mall to Suburbopolis? You can
test it by typing the sequence of trains in the following interactive. For
example, if you took the A-train, then the B-train, then an A-train, type in ABA.
Can you find a sequence that takes you from City Mall to Suburbopolis? Can
you find another sequence, perhaps a longer one? Suppose you wanted to take
a really long route ... can you find a sequence of 12 hops that would get you
there? 20 hops?
Here's another map. It's for a different city, and the stations only have
numbers, not names (but you can name them if you want).
Suppose you're starting at station 1, and need to get to station 3 (it has a
double circle to show that's where you're headed).
The map that we use here, with circles and arrows, is actually a powerful idea
from computer science called a Finite State Automaton, or FSA for short. Being
comfortable with such structures is a useful skill for computer scientists.
The name finite state automaton (FSA) might seem strange, but each word
is quite simple. "Finite" just means that there is a limited number of states
(such as train stations) in the map. The "state" is just another name for
the train stations we were using. "Automaton" is an old word meaning a
machine that acts on its own, following simple rules (such as the cuckoo in
a cuckoo clock).
An FSA isn't all that useful for train maps, but the notation is used for many
other purposes, from checking input to computer programs to controlling the
behaviour of an interface. You may have come across it when you dial a
telephone number and get a message saying "Press 1 for this … Press 2 for that
… Press 3 to talk to a human operator." Your key presses are inputs to a finite
state automaton at the other end of the phone line. The dialogue can be quite
simple, or very complex. Sometimes you are taken round in circles because
there is a peculiar loop in the finite-state automaton. If this occurs, it is an error
in the design of the system --- and it can be extremely frustrating for the caller!
Another example is the remote control for an air conditioning unit. It might
have half a dozen main buttons, and pressing them changes the mode of
operation (e.g. heating, cooling, automatic). To get to the mode you want you
have to press just the right sequence, and if you press one too many buttons,
it's like getting to the train station you wanted but accidentally hopping on one
more train. It might be a long journey back, and you may end up exploring all
sorts of modes to get there! If there's a manual for the controller, it may well
contain a diagram that looks like a Finite State Automaton. If there isn't a
manual, you may find yourself wanting to draw a map, just as for the trains
above, so that you can understand it better.
The map that we used above uses a standard notation. Here's a smaller one:
Notice that this map has routes that go straight back to where they started! For
example, if you start at 1 and take route "b", you immediately end up back at
1. This might seem pointless, but it can be quite useful. Each of the "train
stations" is called a state, which is a general term that just represents where
you are after some sequence of inputs or decisions. What it actually means
depends on what the FSA is being used for. States could represent a mode of
operation (like fast, medium, or slow when selecting a washing machine spin
cycle), or the state of a lock or alarm (on, off, exit mode), or many other things.
We’ll see more examples soon.
One of the states has a double circle. By convention, this marks a "final" or
"accepting" state, and if we end up there we've achieved some goal. There's
also a "start" state --- that's the one with an arrow coming from nowhere.
Usually the idea is to find a sequence of inputs that gets you from the start
state to a final state. In the example above, the shortest input to get to state 2
is "a", but you can also get there with "aa", or "aba", or "baaaaa". People say
that these inputs are "accepted" because they get you from the start state to
the final state --- it doesn't have to be the shortest route.
What state would you end up in if the input was the letter "a" repeated 100
times?
Of course, not all inputs get you to state 2. For example, "aab" or even just "b"
aren't accepted by this simple system. Can you characterise which inputs are
accepted?
Here's an interactive that follows the rules of the FSA above. You can use it to
test different inputs.
Use the interactive online at http://www.csfieldguide.org.nz/en/interactives/fsa-box/
index.html?config=example-1
Here's another FSA, which looks similar to the last one but behaves quite
differently. You can test it in the interactive below.
Work out which of the following inputs it accepts. Remember to start in state 1
each time!
• "aaa"
• "abb"
• "aaaa"
• "bababab"
• "babababa"
• the letter "a" repeated 100 times
• the letter "a" repeated 1001 times
• the letter "b" a million times, then an "a", then another million of the letter
"b"
To keep things precise, we'll define four further technical terms. One is the
alphabet, which is just a list of all possible inputs that might happen. In the last
couple of examples the alphabet has consisted of the two letters "a" and "b",
but for an FSA that is processing text typed into a computer, the alphabet will
have to include every letter on the keyboard.
The connections between states are called transitions, since they are about
changing state. The sequence of characters that we input into the FSA is often
called a string (it's just a string of letters), and the set of all strings that can be
accepted by a particular FSA is called its language. For the FSA in the last
example, its language includes the strings "a", "aaa", "bab", "ababab", and lots
more, because these are accepted by it. However, it does not include the
strings "bb" or “aa”.
The language of many FSAs is big. In fact, the languages of the ones we've just
looked at are infinite. You could go on all day listing patterns that they accept. There's no
limit to the length of the strings they can accept.
That's good, because many real-life FSAs have to deal with "infinite" input. For
example, the diagram below shows the FSA for the spin speed on a washing
machine, where each press of the spin button changes the setting.
It would be frustrating if you could only change the spin setting 50 times, and
then it stopped accepting input ever again. If you want, you could switch from
fast to slow spin by pressing the spin button 3002 times. Or 2 times would do.
Or 2 million times (try it if you're not convinced).
Sometimes you'll see an FSA referred to as a Finite State Machine, or FSM, and
there are other closely related systems with similar names. We'll mention some
later in the chapter.
Now there's something we have to get out of the way before going further. If
we're talking about which strings of inputs will get you into a particular state,
and the system starts in that state, then the empty string --- that is, a string
without any letters at all --- is one of the solutions! For example, here's a simple
finite state automaton with just one input (button a) that represents a strange
kind of light switch. The reset button isn't part of the FSA; it’s just a way of
letting you return to the starting state. See if you can figure out which patterns
of input will turn the light on:
Have you worked out which sequences of button presses turn on the light? Now
think about the shortest sequence from a reset that can turn it on.
Since it's already on when it has been reset, the shortest sequence is zero
button presses. It's hard to write that down (although you could use ""), so we
have a symbol especially for it, which is the Greek letter epsilon: ε. You'll come
across ε quite often with formal languages.
It can be a bit confusing. For example, the language (that is, the list of all
accepted inputs) of the FSA above includes "aaa", "aaaaaa", and ε. If you try
telling someone that "nothing" will make the light come on, that could be
confusing --- it might mean that you could never turn the light on --- so it's
handy being able to say that the empty string (or ε) will turn the light on. There
are different kinds of "nothing", and we need to be precise about which one we
mean!
Here's the FSA for the strange light switch. You can tell that ε is part of the
language because the start state is also a final state (in fact, it's the only final
state). Actually, the switch isn't all that strange --- data projectors often require
two presses of the power button, to avoid accidentally turning them off.
And by the way, the language of the three-state FSA above is infinitely large
because it is the set of all strings that contain the letter "a" in multiples of 3,
which is {ε, aaa, aaaaaa, aaaaaaaaa, ...}. That's pretty impressive for such a
small machine.
While we're looking at extremes, here's another FSA to consider. It uses "a" and
"b" as its alphabet.
As soon as you get the third character you end up in state 4, which is called a
trap state because you can't get out. If this was the map for the commuter train
system we had at the start of this section it would cause problems, because
eventually everyone would end up in the trap state, and you'd have serious
overcrowding. But it can be useful in other situations --- especially if there's an
error in the input, so no matter what else comes up, you don't want to go
ahead.
For the example above, the language of the FSA is any mixture of "a"s and
"b"s, but only two characters at most. Don't forget that the empty string is also
accepted. It's a very small language; the only strings in it are: {ε, a, b, aa, ab,
ba, bb}.
It's fairly clear what it will accept: strings like "ab", "abab", "abababababab",
and, of course, ε. But there are some missing transitions: if you are in state 1
and get a "b" there's nowhere to go. If an input cannot be accepted, it will be
rejected, as in this case. We could have put in a trap state to make this clear:
But things can get out of hand. What if there are more letters in the alphabet?
We'd need something like this:
So, instead, we just say that any unspecified transition causes the input to be
rejected (that is, it behaves as though it goes into a trap state). In other words,
it's fine to use the simple version above, with just two transitions.
Now that we've got the terminology sorted out, let’s explore some applications
of this simple but powerful "machine" called the Finite State Automaton.
With such gadgets, FSAs can be used by designers to plan what will happen for
every input in every situation, but they can also be used to analyse the
interface of a device. If the FSA that describes a device is really complicated,
it's a warning that the interface is likely to be hard to understand. For example,
here's an FSA for a microwave oven. It reveals that, for example, you can't get
from power2 to power1 without going through timer1. Restrictions like this will
be very frustrating for a user. For example, if they try to set power1 it won't
work until they've set timer1 first. Once you know this sequence it's easy, but
the designer should think about whether it's necessary to force the user into
that sort of sequence. These sorts of issues become clear when you look at the
FSA. But we're straying into the area of Human-Computer Interaction! This isn't
surprising because most areas of computer science end up relating to each
other --- but this isn't a major application of FSAs, so let's get back to more
common uses.
As we shall see in the next section, one of the most valuable uses of the FSA in
computer science is for checking input to computers, whether it's a value typed
into a dialogue box, a program given to a compiler, or some search text to be
found in a large document. There are also data compression methods that use
FSAs to capture patterns in the data being compressed, and variants of FSA are
used to simulate large computer systems to see how best to configure them
before spending money on actually building them.
What's the biggest FSA in the world, one that lots of people use every day?
It's the World-Wide Web. Each web page is like a state, and the links on
that page are the transitions between them. Back in the year 2000 the web
had a billion pages. In 2008 Google Inc. declared they had found a trillion
different web page addresses. That’s a lot. A book with a billion pages
would be 50 km thick. With a trillion pages, its thickness would exceed the
circumference of the earth.
A good starting point is to think of the shortest string that is needed for a
particular description. For example, suppose you need an FSA that accepts all
strings that contain an even number of the letter "b". The shortest such string
is ε, which means that the starting state must also be a final state, so you can
start by drawing this:
If instead you had to design an FSA where the shortest accepted string is "aba",
you would need a sequence of 4 states like this:
Then you need to think what happens next. For example, if we are accepting
strings with an even number of "b"s, a single "b" would have to take you from
the start state to a non-accepting state:
But another "b" would make an even number, so that's acceptable. And for any
more input the result would be the same even if all the text to that point hadn't
happened, so you can return to the start state:
Usually you can find a "meaning" for a state. In this example, being in state 1
means that so far you've seen an even number of "b"s, and state 2 means that
the number so far has been odd.
Now we need to think about missing transitions from each state. So far there's
nothing for an "a" out of state 1. Thinking about state 1, an "a" doesn't affect
the number of "b"s seen, and so we should remain in state 1:
Get some practice doing this yourself! Here are instructions for two different
programs that allow you to enter and test FSAs.
14.3.2.1. Exorciser
This section shows how to use some educational software called "Exorciser".
(The next section introduces an alternative called JFLAP which is a bit harder to
use.) Exorciser has facilities for doing advanced exercises in formal languages;
but here we'll use just the simplest ones.
When you run it, select "Constructing Finite Automata" (the first menu item);
click the "Beginners" link when you want a new exercise. The challenge in each
FSA exercise is the part after the | in the braces (i.e., curly brackets). For
example, in the diagram below you are being asked to draw an FSA that
accepts an input string w if "w has length at least 3". You should draw and test
your answer, although initially you may find it helpful to just click on "Solve
exercise" to get a solution, and then follow strings around the solution to see
how it works. That’s what we did to make the diagram below.
To draw an FSA in the Exorciser system, right-click anywhere on the empty
space and you'll get a menu of options for adding and deleting states, choosing
the alphabet, and so on. To make a transition, drag from the outside circle of
one state to another (or out and back to the state for a loop). You can right-click
on states and transitions to change them. The notation "a|b" means that a
transition will be taken on "a" or "b" (it's equivalent to two parallel transitions).
If your FSA doesn't solve their challenge, you'll get a hint in the form of a string
that your FSA deals with incorrectly, so you can gradually fix it until it works. If
you're stuck, click “Solve exercise”. You can also track input as you type it:
right-click to choose that option. See the SwissEduc website for more
instructions.
The section after next gives some examples to try. If you're doing this for a
report, keep copies of the automata and tests that you do. Right-click on the
image for a "Save As" option, or else take screenshots of the images.
14.3.2.2. JFLAP
Another widely used system for experimenting with FSAs is a program called
JFLAP (download it from http://jflap.org). You can use it as an alternative for
Exorciser if necessary. You'll need to follow instructions carefully as it has many
more features than you'll need, and it can be hard to get back to where you
started.
Here's how to build an FSA using JFLAP. As an example, we'll use the following
FSA:
If you need to change something, you can delete things with the delete tool
(the skull icon). Alternatively, select the arrow tool and double-click on a
transition label to edit it, or right-click on a state. You can drag states around
using the arrow tool.
To watch your FSA process some input, use the "Input" menu (at the top),
choose "Step with closure", type in a short string such as "abaa", and click
"OK". Then at the bottom of the window you can trace the string one character
at a time by pressing "Step", which highlights the current state as it steps
through the string. If you step right through the string and end up in a final
(accepting) state, the panel will come up green. To return to the Editor window,
go to the “File” menu and select “Dismiss Tab”.
You can run multiple tests in one go. From the "Input" menu choose "Multiple
Run", and type your tests into the table, or load them from a text file.
You can even do tests with the empty string by leaving a blank line in the table,
which you can do by pressing the "Enter Lambda" button.
There are some FSA examples in the next section. If you're doing this for a
report, keep copies of the automata and tests that you do (JFLAP's "File" menu
has a "Save Image As..." option for taking snapshots of your work; alternatively
you can save an FSA that you've created in a file to open later).
• strings that start with the letter "a" (e.g. "aa", "abaaa", and "abbbb").
• strings that end with the letter "a" (e.g. "aa", "abaaa", and "bbbba").
• strings that have an even number of the letter "a" (e.g. "aa", "abaaa",
"bbbb"; and don't forget the empty string ε).
• strings that have an odd number of the letter "a" (e.g. "a", "baaa",
"bbbab", but not ε).
• strings where the number of "a"s in the input is a multiple of three (e.g.
"aabaaaa", "bababab").
• strings where every time an "a" appears in the input, it is followed by a "b"
(e.g. "abb", "bbababbbabab", "bab").
• strings that end with "ab"
• strings that start with "ab" and end with "ba", and only have "b" in the
middle (e.g. "abba", "abbbbba")
For the FSA(s) that you construct, check that they accept valid input, but also
make sure they reject invalid input.
Here are some more sequences of characters that you can construct FSAs to
detect. The input alphabet is more than just "a" and "b", but you don't need to
put in a transition for every possible character in every state, because an FSA
can automatically reject an input if it uses a character that you haven't given a
transition for. Try doing two or three of these:
• the names for international standard paper sizes (A1 to A10, B1 to B10,
and so on)
• a valid three-letter month name (Jan, Feb, Mar, etc.)
• a valid month number (1, 2, ... 12)
• a valid weekday name (Monday, Tuesday, ...)
A classic example of an FSA is an old-school vending machine that only takes a
few kinds of coins. Suppose you have a machine that only takes 5 and 10 cent
pieces, and you need to insert 30 cents to get it to work. The alphabet of the
machine is the 5 and 10 cent coins, which we call F and T for short. For example,
TTT would be putting in 3 ten cent coins, which would be accepted. TFFT would
also be accepted, but TFFF wouldn't. Can you design an FSA that accepts the
input when 30 cents or more is put into the machine? You can make up your
own version for different denominations of coins and required total.
If you've worked with binary numbers, see if you can figure out what this FSA
does. Try each binary number as input: 0, 1, 10, 11, 100, 101, 110, etc.
Can you work out what it means if the FSA finishes in state q1? State q2?
There are lots of systems around that use FSAs. You could choose a system,
explain how it can be represented with an FSA, and show examples of
sequences of input that it deals with. Examples are:
• Board games. Simple board games are often just an FSA, where the
next move is determined by some input (e.g. a number given by
rolling dice), and the final state means that you have completed the
game --- so the first person to the final state wins. Most games are too
complex to draw a full FSA for, but a simple game like snakes and
ladders could be used as an example. What are some sequences of
dice throws that will get you to the end of the game? What are some
sequences that don't?!
• Simple devices with a few buttons often have states that you can
identify. For example, a remote control for a car alarm might have two
buttons, and what happens to the car depends on the order in which
you press them and the current state of the car (whether it is alarmed
or not). For devices that automatically turn on or off after a period of
time, you may have to include an input such as "waited for 30
seconds". Other devices to consider are digital watches (with states
like "showing time", "showing date", "showing stopwatch", "stopwatch
is running"), the power and eject buttons on a CD player, channel
selection on a TV remote (just the numbers), setting a clock, storing
presets on a car radio, and burglar alarm control panels.
We've already had a taste of regular expressions in the getting started section.
They are just a simple way to search for things in the input, or to specify what
kind of input will be accepted as legitimate. For example, many web scripting
programs use them to check input for patterns like dates, email addresses and
URLs. They've become so popular that they're now built into most programming
languages.
You might already have a suspicion that regular expressions are related to finite
state automata. And you'd be right, because it turns out that every regular
expression has a Finite State Automaton that can check for matches, and every
Finite State Automaton can be converted to a regular expression that shows
exactly what it does (and doesn’t) match. Regular expressions are usually
easier for humans to read. For machines, a computer program can convert any
regular expression to an FSA, and then the computer can follow very simple
rules to check the input.
The simplest kind of matching is just entering some text to match. Open the
interactive below and type the text "cat" into the box labeled "Regular
expression":
If you've only typed the 3 characters "cat", then it should find 6 matches.
Now try typing a dot (full stop or period) as the fourth character: "cat.". In a
regular expression, "." can match any single character. Try adding more dots
before and after "cat". How about "cat.s" or "cat..n"?
What do you get if you search for " ... " (three dots with a space before and
after)?
Now try searching for "ic.". The "." matches any letter, so if you really wanted a
full stop, you need to write it like this "ic\." --- use this search to find "ic" at the
end of a sentence.
Another special symbol is "\d", which matches any digit. Try matching 2, 3 or 4
digits in a row (for example, two digits in a row is "\d\d").
To choose from a small set of characters, try "[ua]ff". Either of the characters in
the square brackets will match. Try writing a regular expression that will match
"fat", "sat" and "mat", but not "cat".
Another useful shortcut is being able to match repeated letters. There are four
common rules:
• "a*" matches zero or more repeats of the letter "a"
• "a+" matches one or more repeats of "a"
• "a?" matches zero or one occurrence of "a" (that is, an optional "a")
• "a{5}" matches exactly 5 repeats of "a"; in general, "a{n}" matches exactly
n of them
If you want to choose between options, the vertical bar is useful. Try the
following, and work out what they match. You can type extra text into the test
string area if you want to experiment:
was|that|hat
was|t?hat
th(at|e) cat
[Tt]h(at|e) [fc]at
(ff)+
f(ff)+
Notice the use of brackets to group parts of the regular expression. It's useful if
you want the "+" or "*" to apply to more than one character.
Click here for another challenge: you should try to write a short regular
expression to match the first two words, but not the last three:
Of course, regular expressions are mainly used for more serious purposes. Click
on the following interactive to get some new text to search:
Use the interactive online at http://www.csfieldguide.org.nz/en/interactives/regular-
expression-search/index.html?text=Contact%20me%20at%20spam%40mymail.com%20or%
20on%20555-1234%250AFFE962%250ADetails%3A%20fred%40cheapmail.org.nz%20%
2803%29%20987-6543%250ALooking%20forward%20to%2021%20Oct%202015%250AGood
%20old%205%20Nov%201955%250ABack%20in%202%20Sep%201885%20is%20the%
20earliest%20date%250AABC123%250ALet%27s%20buy%20another%202%20Mac%
209012%20systems%20%40%20%242000%20each.
The following regular expression will find common New Zealand number plates
in the sample text, but can you find a shorter version using the {n} notation?
[A-Z][A-Z][A-Z]\d\d\d
How about an expression to find the dates in the text? Here's one option, but
it's not perfect:
\d [A-Z][a-z][a-z] \d\d\d\d
What about phone numbers? You'll need to think about what variations of
phone numbers are common! How about finding email addresses?
The particular form of regular expression that we've been using is for the Ruby
programming language (a popular language for web site development),
although it's very similar to regular expressions used in other languages
including Java, JavaScript, PHP, Python, and Microsoft's .NET Framework. Even
some spreadsheets have regular expression matching facilities.
But regular expressions have their limits --- for example, you won't be able to
create one that can match palindromes (words and phrases that are the same
backwards as forwards, such as "kayak", "rotator" and "hannah"), and you can't
use one to detect strings that consist of n repeats of the letter "a" followed by n
repeats of the letter "b". For those sort of patterns you need a more powerful
system called a grammar (see the section on Grammars). But nevertheless,
regular expressions are very useful for a lot of common pattern matching
requirements.
14.4.1. Regular expressions and FSAs
There's a direct relationship between regular expressions and FSAs. For
example, consider the following regex, which matches strings that begin with
an even number of the letter "a" and end with an even number of the letter "b":
(aa)+(bb)+
Now look at how the following FSA works on these strings --- you could try
"aabb", "aaaabb", "aaaaaabbbb", and also see what happens for strings like
"aaabb", "aa", "aabbb", and so on.
You may have noticed that q2 is a "trap state". We can achieve the same effect
with the following FSA, where all the transitions to the trap state have been
removed --- the FSA can reject the input as soon as a non-existent transition is
needed.
Like an FSA, each regular expression represents a language, which is just the
set of all strings that match the regular expression. In the example above, the
shortest string in the language is "aabb", then there's "aaaabb" and "aabbbb",
and of course an infinite number more. There's also an infinite number of
strings that aren't in this language, like "a", "aaa", "aaaaaa" and so on.
In the above example, the FSA is a really easy way to check for the regular
expression --- you can write a very fast and small program to implement it (in
fact, it's a good exercise: you typically have an array or list with an entry for
each state, and each entry tells you which state to go to next on each
character, plus whether or not it's a final state. At each step the program just
looks up which state to go to next.)
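Here's a sketch of that exercise in Python for the (aa)+(bb)+ FSA above. The state numbers are made up for this example; any input character that has no entry in the table is rejected immediately, just like a missing transition:

# table-driven FSA for the regular expression (aa)+(bb)+
transitions = {
    1: {"a": 2},            # start state
    2: {"a": 3},
    3: {"a": 2, "b": 4},    # an even number of a's seen so far
    4: {"b": 5},
    5: {"b": 4},            # an even number of b's seen: accepting
}
accepting = {5}

def accepts(string):
    state = 1
    for ch in string:
        if ch not in transitions[state]:
            return False    # no transition: reject the input
        state = transitions[state][ch]
    return state in accepting

for s in ["aabb", "aaaabb", "aabbbb", "aaabb", "aa", "abb"]:
    print(s, accepts(s))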
Here are some ideas for regular expressions for you to try to create. You
can check them using the Regular Expression Searcher as we did earlier,
but you'll need to make up your own text to check your regular expression.
When testing your expressions, make sure that they not only accept correct
strings, but reject ones that don't match, even if there's just one character
missing.
You may find it easier to have one test match string per line in "Your test
string". You can force your regular expression to match a whole line by
putting "^" (start of line) before the regular expression, and "$" (end of
line) after it. For example, "^a+$" only matches lines that have nothing but
"a"s on them.
For this project you will make up a regular expression, convert it to an FSA,
and demonstrate how some strings are processed.
There's one trick you'll need to know: the software we're using doesn't
have all the notations we've been using above, which are common in
programming languages, but not used so much in pure formal language
theory. In fact, the only ones available are: "|" (for alternatives), "*" (for
zero or more repeats), and brackets (for grouping).
Having only these three notations isn't too much of a problem, as you can
get all the other notations using them. For example, "a+" is the same as
"aa*", and "\d" is "0|1|2|3|4|5|6|7|8|9". It's a bit more tedious, but we'll
mainly use exercises that only use a few characters.
Converting with Exorciser
Use this section if you're using Exorciser; we recommend Exorciser for this
project, but if you're using JFLAP then skip to Converting with JFLAP
below.
Exorciser is very simple. In fact, unless you change the default settings, it
can only convert regular expressions using two characters: "a" and "b". But
even that's enough (in fact, in theory any input can be represented with
two characters --- that's what binary numbers are about!)
On the plus side, Exorciser has the empty string symbol available --- if you
type "e" it will be converted to . So, for example, "(a| )" means an
optional "a" in the input.
As a warmup, try:
aabb
then click on "solve exercise" (this is a shortcut --- the software is intended
for students to create their own FSA, but that's beyond what we're doing in
this chapter).
To test your FSA, right-click on the background and choose "Track input".
Now try some more complex regular expressions, such as the following. For
each one, type it in, click on "solve exercise", and then track some sample
inputs to see how it accepts and rejects different strings.
aa*b
a(bb)*
(bba*)*
(b*a)*a
Your project report should show the regular expressions, explain what kind
of strings they match, show the corresponding FSAs, show the sequence of
states that some sample test strings would go through, and you should
explain how the components of the FSA correspond to the parts of the regular
expression using examples.
If you're using JFLAP for your project, you can have almost any character as
input. The main exceptions are "*", "+" (confusingly, the "+" is used
instead of "|" for alternatives), and "!" (which is the empty string --- in the
preferences you can choose whether it is shown as λ or ε).
The JFLAP software can work with all sorts of formal languages, so you'll
need to ignore a lot of the options that it offers! This section will guide you
through exactly what to do.
There are some details about the format that JFLAP uses for regular
expressions in the following tutorial --- just read the "Definition" and
"Creating a regular expression" sections.
http://www.jflap.org/tutorial/regular/index.html
For example, enter the following regular expression and convert it to an FSA:
ab*b
Multiple runs are good for showing lots of tests on your regular expression:
Now you should come up with your own regular expressions that test out
interesting patterns, and generate FSA's for them. In JFLAP you can create
FSAs for some of regular expressions we used earlier, such as (simple)
dates, email addresses or URLs.
Your project report should show the regular expressions, explain what kind
of strings they match, show the corresponding FSAs, show the sequence of
states that some sample test strings would go through, and you should
explain how the components of the FSA correspond to the parts of the
regular expression using examples.
Here are some more ideas that you could use to investigate regular
expressions:
• Advanced: The free tools lex and flex are able to take specifications
for regular expressions and create programs that parse input
according to the rules. They are commonly used as a front end to a
compiler, and the input is a program that is being compiled. You could
investigate these tools and demonstrate a simple implementation (a rough
Python stand-in is sketched below).
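As a rough taste of what such a tool does, here is a sketch in plain Python
(using the re module rather than lex itself); it turns a small list of token
patterns into a tokenizer. The token names and patterns are made up for
illustration:

import re

# Token patterns, tried in order (a rough stand-in for a lex specification)
TOKENS = [
    ("NUMBER", r"\d+"),
    ("NAME",   r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=]"),
    ("SKIP",   r"\s+"),
]
MASTER = re.compile("|".join("(?P<%s>%s)" % pair for pair in TOKENS))

def tokenize(text):
    # Yield (token_name, matched_text) pairs, skipping whitespace
    for match in MASTER.finditer(text):
        if match.lastgroup != "SKIP":
            yield (match.lastgroup, match.group())

print(list(tokenize("price = 2 + 30")))
# [('NAME', 'price'), ('OP', '='), ('NUMBER', '2'), ('OP', '+'), ('NUMBER', '30')]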
In this section [when it is finished!] we'll look at the kind of grammars that are
widely used in computer science. They are very powerful because they allow a
complicated system (like a compiler or a format like HTML) to be specified in a
very concise way, and there are programs that will automatically take the
grammar and build the system for you. The grammars for conventional
programming languages are a bit too unwieldy to use as initial examples (they
usually take a few pages to write out), so we're going to work with some small
examples here, including parts of the grammars for programming languages.
(Note that these will make more sense when the previous introduction to
grammars has been completed!)
• Use examples to show the parse tree for a correct and
incorrect program fragment, or to show a sequence of grammar
productions to construct a correct program fragment
• Explore the grammar for balanced parentheses: S -> SS, S -> (S), S -> ()
(see the sketch below)
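One way to explore this grammar is to apply the productions mechanically and
see which strings they generate. Here is a small Python sketch that does this;
the function name and the number of rounds are just choices for illustration:

# Generate strings from the grammar S -> SS | (S) | () by repeatedly
# applying the productions to strings we have already derived.
def derivable(rounds):
    strings = {"()"}                     # S -> ()
    for _ in range(rounds):
        new = set(strings)
        for s in strings:
            new.add("(" + s + ")")       # S -> (S)
            for t in strings:
                new.add(s + t)           # S -> SS
        strings = new
    return strings

print(sorted(derivable(2), key=len))
# every string printed is a balanced string of parentheses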
The JFLAP program also has a feature for rendering "L-systems" (https://
en.wikipedia.org/wiki/L-system), which are another way to use grammars to
create structured images. You'll need to read about how they work in the
JFLAP tutorial (http://www.jflap.org/tutorial/index.html), and there's a more
detailed tutorial at http://www.cs.duke.edu/csed/pltl/exercises/lessons/20/L-
system.zip. There are some sample files to get you inspired (the ones
starting "ex10..." at http://www.cs.duke.edu/csed/jflap/jflapbook/files/);
these can produce surprisingly intricate, plant-like images.
There's also an online system for generating images with L-systems: http://
www.kevs3d.co.uk/dev/lsystems/
• Another formal grammar to explore is the BNF grammar for the ABC music
format; you could analyse a simple piece of music in terms of this grammar.
These sites have more information:
• http://abc.sourceforge.net/
• https://meta.wikimedia.org/wiki/Music_markup
• http://www.emergentmusics.org/theory/15-implementation
Technically, the kind of finite state automaton (FSA) that we used in the Finite
state automata section is known as a Deterministic Finite Automaton
(DFA), because the decision about which transition to take is unambiguous at
each step. Sometimes it's referred to as a Finite State Acceptor because it
accepts and rejects input depending on whether it gets to the final state. There
are all sorts of variants that we didn't mention, including the Mealy and Moore
machines (which produce an output for each transition taken or state
reached), the nested state machine (where each state can be an FSA itself), the
non-deterministic finite automaton (which can have the same label on more than
one transition out of a state), and the lambda-NFA (which can include
transitions on the empty string, ε). Believe it or not, all these variations are
essentially equivalent, and you can convert from one to the other. They are
used in a wide range of practical situations to design systems for processing
input.
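To make this concrete, here is a minimal DFA simulator sketched in Python;
the transition table is a hand-built machine for the regular expression
"aa*b", and the state numbers are arbitrary choices for illustration:

# States: 1 is the start state, 3 is the only accepting state.
TRANSITIONS = {
    (1, "a"): 2,    # first "a"
    (2, "a"): 2,    # any further "a"s
    (2, "b"): 3,    # the final "b"
}
ACCEPTING = {3}

def accepts(string):
    state = 1
    for ch in string:
        if (state, ch) not in TRANSITIONS:
            return False                 # no transition: reject immediately
        state = TRANSITIONS[(state, ch)]
    return state in ACCEPTING

for s in ["ab", "aaab", "b", "abb"]:
    print(s, accepts(s))                 # True, True, False, False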
However, there are also more complex models of computation such as the
push-down automaton (PDA) which is able to follow the rules of context-free
grammars, and the most general model of computation which is called a Turing
machine. These models are increasingly complicated and abstract, and
structures like the Turing machine aren't used as physical devices (except for
fun), but instead as a tool for reasoning about the limits on what can be
computed. In fact, in principle every digital computer is a kind of limited Turing
machine, so whatever limits we find for a Turing machine give us limits for
everyday computation.
The Turing machine is named after Alan Turing, who worked on these concepts
in the early 20th century (that's the same person from whom we got the Turing
test in AI, which is something quite different --- Turing's work comes up in many
areas of computer science!) If you want to investigate the idea of a Turing
machine and you like chocolate, there's an activity on the cs4fn site that gives
examples of how it works. The Kara programming environment also has a
demonstration of Turing machines.
This chapter looked at two main kinds of formal language: the regular
expression (RE) and the context-free grammar (CFG). These typify the kinds of
languages that are widely used in compilers and file processing systems.
Regular expressions are good for finding simple patterns in a file, like
identifiers, keywords and numbers in a program, or tags in an HTML file, or
dates and URLs in a web form. Context-free grammars are good when you have
nested structures, for example, when an expression is made up of other
expressions, or when an "if" statement includes a block of statements, which in
turn could be "if" statements, ad infinitum. There are more powerful forms of
grammars that exist, the most common being context-sensitive grammars and
unrestricted grammars, which allow you to have more than one non-terminal on
the left hand side of a production; for example, you could have
xAy → aBb,
which is more flexible but a lot harder to work with. The relationships between
the main kinds of grammars were described by the linguist Noam Chomsky, and
this classification is often called the Chomsky Hierarchy after him.
There are many tools available that will read in the specification for a language
and produce another program to parse the language; some common ones are
called "Lex" and "Flex" (both perform lexical analysis of regular expressions),
"Yacc" ("yet another compiler compiler") and "Bison" (an improved version of
Yacc). These systems make it relatively easy to make up your own
programming language and construct a compiler for it, although they do
demand quite a range of skills to get the whole thing working!
So we've barely got started on what can be done with formal languages, but
the intention of this chapter is to give you a taste of the kind of structures that
computer scientists work with, and the powerful tools that have been created
to make it possible to work with infinitely complex systems using small
descriptions.
14.7. Further reading
Some of the material in this chapter was inspired by http://www.ccs3.lanl.gov/
mega-math/workbk/machine/malearn.html
14.7.1. Books
Textbooks on formal languages will have considerably more advanced material
and more mathematical rigour than could be expected at High School level, but
for students who really want to read more, a popular book is "Introduction to
the Theory of Computation" by Michael Sipser.
Regular expressions and their relationship with FSAs are explained well in the
book "Algorithms" by Robert Sedgewick.
15. Network Communication Protocols
Activities on the internet vary a lot too (email, Skype, video streaming, music,
gaming, browsing, chatting), and so do the protocols used to achieve these.
These collections of protocols form the topic of Networking Communication
Protocols and this chapter will introduce you to some of them, what problems
they solve, and what you can do to experience these protocols first hand. Let’s
start by talking about the one you’re using if you're viewing this page on the
web.
You: “Can I have a can of soda please?” Shop Keeper: “Sure, here’s a can of
soda”
HTTP uses a request/response pattern for solving the problem of reliable
communication between client and server. The "ask" is known as a request
and the reply is known as a response. Both requests and responses can also
have other data or resources sent along with them.
This is happening all the time when you're browsing the web; every web page
you look at is delivered using the HyperText Transfer Protocol. Going back to the
shop analogy, consider the same example, this time with the extra resources
shown between asterisk (*) characters.
You: “Can I have a can of soda please?” *You hand the shop keeper $2*
Shop Keeper: “Sure, here’s a can of soda” *Also hands you a receipt and
your change*
There are nine types of requests that HTTP supports, and these are outlined
below.
A GET request returns the thing you're asking for (for a web page, the text
that makes it up). Like above: you ask for a can of soda, you get a can of
soda.
A HEAD request returns a description of what you'd get if you did a GET
request, without the content itself. It's like this:
You: “Can I have a can of soda please?” Shop Keeper: “Sure, here’s the can
of soda you’d get” *Holds up a can of soda*
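If you'd like to see the difference between GET and HEAD for yourself, here
is a short sketch using Python's standard http.client module (example.com is
just a placeholder host):

import http.client

for method in ["HEAD", "GET"]:
    connection = http.client.HTTPSConnection("example.com")
    connection.request(method, "/")
    response = connection.getresponse()
    body = response.read()               # HEAD responses have an empty body
    print(method, response.status, response.reason, "body bytes:", len(body))
    connection.close()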
What’s neat about HTTP is that it allows you to also modify the contents of the
server. Say you’re now also a representative for the soda company, and you’d
like to re-stock some shelves.
A POST request allows you to send information in the other direction, adding
new data to the server. A PUT request goes further: it allows you to replace a
resource on the server with one you supply. These requests identify resources
using what is called a Uniform Resource Identifier or URI. A URI is a unique
code or number for a resource. Confused? Let's go back to the shop:
Sales Rep: “I’d like to replace this dented can of soda with barcode number
123-111-221 with this one, that isn’t dented” Shop Keeper: “Sure, that has
now been replaced”
A DELETE request removes a resource from the server:
Sales Rep: "We are no longer selling 'Lemonade with Extra Vegetables', no
one likes it! Please remove them!" Shop Keeper: "Okay, they are gone".
Some other request types (HTTP methods) exist too, but they are less used;
these are TRACE, OPTIONS, CONNECT and PATCH. You can find out more about
these on your own if you're interested.
In HTTP, the first line of the response is called the status line and has a numeric
status code such as 404 and a text-based reason phrase such as "Not Found".
The most common code is 200, which means the request was successful ("OK").
HTTP status codes are divided into five groups, each named by its purpose and a
number: Informational (1XX), Successful (2XX), Redirection (3XX), Client Error
(4XX) and Server Error (5XX). There are many status codes for representing
different error and success cases. There's even a joke "418: I'm a teapot"
status, which you can see at https://www.google.com/teapot
So what’s actually happening? Well, let’s find out. Open a new tab in your
browser and open the homepage of the CS Field Guide here. If you’re in a
Chrome or Safari browser, press Ctrl + Shift + I in Windows or Command +
Option + I on a Mac to bring up the web inspector. Select the Network tab.
Refresh the page. What you're seeing now is a list of HTTP requests your
browser is making to the server to load the page you're currently viewing. Near
the top you’ll see a request to index.html. Click that and you’ll see details of
the Headers, Preview, Response, Cookies and Timing. Ignore those last two for
now.
Let's look at the request headers now: click "view source" to see the original
request.
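The exact lines depend on your browser and the site, but a request looks
something like this (an illustrative example, not the exact request you'll see):

GET /index.html HTTP/1.1
Host: www.example.com
Accept: text/html
User-Agent: Mozilla/5.0

The first line gives the method (GET), the resource (/index.html) and the
protocol version; the rest are the request headers. The server then sends back
a response, which starts with a status line followed by the response headers,
like these: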
HTTP/1.1 200 OK
Date: Sun, 11 May 2014 03:52:56 GMT
Server: Apache/2.2.15 (Red Hat)
Accept-Ranges: bytes
Content-Length: 3947
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8
Vary: Accept-Encoding, User-Agent
Content-Encoding: gzip
Go ahead and try this same process on a few other websites too.
Tim Berners-Lee is credited with creating HTTP in 1989. You can read more
about him here.
The neat thing about IRC is that users can use commands to interact with the
server, client or other users. For example, /DIE will tell the server to shut
down (although it will only work if you are the administrator!), and /ADMIN
will tell you who the administrator is.
Whilst IRC may be new to you, the concept of a group conversation online or a
chat room may not be. There really isn’t any difference. Groups exist in the
form of channels. A server hosts many channels, and you can choose which
one to join.
To get started with IRC, first you should get a client. A client is a program that
lets you connect. Ask your teacher about which one to use. For this chapter,
we’ll use the freenode web client. Check with your teacher about which channel
to join, as they may have set one up for you.
Try a few things while you’re in there. Look at this list of commands and try to
use some of them. What response do you get? Does this make sense?
Try a one-on-one conversation with a friend. If they use commands, do you see
them? How about the other way around?
15.4.1. TCP
TCP (Transmission Control Protocol) is one of the most important protocols
on the internet. It breaks large messages up into packets. What is a packet? A
packet is a segment of data that, when combined with other packets, makes up
a complete message (something like an HTTP request, an email, an IRC message
or a file like a picture or song being downloaded). For the rest of the
section, we'll
look at how these are used to load an image from a website.
So computer A takes the file and breaks it into packets. It then sends
the packets over the internet and computer B reassembles them and gives
them back to you as the image, which is demonstrated in this video.
So why don’t the packets all just go from computer A to computer B just fine?
Ha! That’d be nice. Unfortunately it’s not that simple. Through various means,
there are some problems that can affect packets. These problems are:
• Packet loss
• Packet delay (packets arrive out of order)
• Packet corruption (the packet gets changed on the way)
So, if we didn't try to fix these, the image wouldn't load properly: bits would
be missing or corrupted, or computer B might not even recognise what it is!
So, TCP is a protocol that solves these issues. To introduce you to TCP, play the
game below, called Packet attack. In the game, you are the problems (loss,
delay, corruption) and as you move through the levels, pay attention to how
the computer tries to combat them. Good luck trying to stop the messages
getting through correctly!
Packet Attack is a direct analogy for TCP and it is intended to be an interactive
simulation of it. The Packet Creatures are TCP segments, which move between
two computers. The yellow/gray zone is the unreliable channel, susceptible to
unreliability. That unreliability is you, the user playing the game. Remember
the key problems of this topic at the transport level: delays, corruption and
lost packets. These are the attacks: delay, corrupt, kill. Solutions
come in the form of TCP mechanisms, which are added level by level. Like
in TCP, the game supports packet ordering, checksums (shields), ACKs and
NACKs (return creatures) and timeouts.
You can also create your own levels in Packet Attack. We’ve put a level
builder in the projects section below so that you can experiment with
different reliabilities or combinations of defenses.
Adjust the trues, falses and numbers to set different abilities. Raising the
numbers will effectively equate to a less reliable communication channel.
Adding in more abilities (by setting shields etc. to true) will make for a
harder level to beat.
Let’s talk about what you saw in that game. What did the levels do to solve the
issues of packet loss, delay (reordering) and corruption? TCP has several
mechanisms for dealing with packet troubles. The first is the checksum: each
packet carries a value computed from its contents, so the receiving computer
can tell if a packet was corrupted on the way. Acknowledgements (ACKs) and
negative acknowledgements (NACKs) are short replies that tell the sender which
packets arrived intact and which need to be resent.
Next is ordering. Since a computer can't look at data and order it like we can
(like when we do a jigsaw puzzle or play Scrabble™) they need a way to
“stitch” the packets back together. As we saw in Packet Attack, if you delayed a
message that didn’t have ordering, the message may look like “HELOLWOLRD”.
So, TCP puts a number on each packet (called a sequence number) which
signifies its order. With this, it can put them back together again. It’s a bit like
when you print out a few pages from a printer and you see "Page 2 of 11" on
the bottom. Now, if packets do become out of order, TCP will wait for all of the
packets to arrive and then put the message together.
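Here is a toy sketch of that idea in Python: number the packets, shuffle them
to simulate the network delaying some of them, then sort them back into order
(the message and the packet format are made up for illustration):

import random

message = "HELLO WORLD"
packets = [(number, letter) for number, letter in enumerate(message)]
random.shuffle(packets)                  # simulate out-of-order arrival

arrived = "".join(letter for number, letter in packets)
reassembled = "".join(letter for number, letter in sorted(packets))
print("as arrived: ", arrived)           # scrambled
print("reassembled:", reassembled)       # always "HELLO WORLD"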
But does a computer send it again if it doesn’t hear back? Yes. It’s called a
timeout and it’s the final line of defense in TCP. If a computer doesn’t get an
ACK or a NACK back, after a certain time it will just resend the packet. It’s a bit
like when you’re tuning out in class, and the teacher keeps repeating your
name until you answer. Maybe that’s been you… woops. Sometimes, an ACK
might get lost, so the packet is resent after a timeout, but that’s OK, as TCP can
recognise duplicates and ignore them.
So that’s TCP. A protocol that puts accurate data transmission before efficiency
and speed in a network. It uses timeouts, checksums, acks and nacks and
many packets to deliver a message reliably. However, what if we don’t need all
the packets? Can we get the overall picture faster? Read on…
15.4.2. UDP
UDP (User Datagram Protocol) is a protocol for sending packets that does not
guarantee delivery. UDP doesn't guard against lost packets, duplicate
packets or out-of-order packets. It just gets the bulk of the data there when it
can. Checksums are used for data integrity though, so packets do have some
protection. It's still a protocol because it has a formal packet structure. The
packets still include the destination and origin as well as the size of the packet.
So why would we even use such an unreliable protocol? We do, but not for anything too
important. Files, messages, emails, web pages and other text based items use
TCP, but things like streaming music, video, VOIP and so on use UDP. Maybe
you’ve had a call on Skype that has been poor quality. Maybe the video flickers
or the sound drops for a split second. That’s packets being lost. However, you
of course get the overall picture and the conversation you’re having is
successful.
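For the curious, here is what a bare-bones UDP sender looks like in Python:
there is no connection set-up and no acknowledgement, so nothing guarantees
the datagram arrives (the address and port are placeholders):

import socket

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello over UDP", ("127.0.0.1", 9999))   # fire and forget
sender.close()

# A matching receiver would do something like:
#   receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   receiver.bind(("127.0.0.1", 9999))
#   data, address = receiver.recvfrom(1024)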
So does a programmer have to think about all of these protocols whenever they
build something? No, that would be absurd. As a web developer, I don't want to
worry about
anything other than making my music player easy to use and fast. I don’t want
to worry about UDP and I don’t want to worry about ethernet or cables. It’s
already done, I can assume it’s taken care of. And it is.
Internet protocols exist in layers. We have four such layers in the computer
science internet model: the Application Layer at the top, followed by the
Transport, Internet and Link Layers. The top two layers are discussed above in
detail; the bottom two we won't focus on.
At each layer, data is made up of the previous layers’ whole unit of data, and
then headers are added and passed down. At the bottom layer, the Link layer, a
footer is added also. Below is an example of what a UDP packet looks like when
it’s packaged up for transport.
Footers and headers are basically packet meta-data: information about the
information. Like a letterhead or a footnote, they're not part of the content,
but they are on the page. Headers and footers exist on packets to store
data; headers come before the data and footers after it.
You can think of these protocols as a game of pass the parcel. When a message
is sent in HTTP, it is wrapped in a TCP header, which is then wrapped in an IPv6
header, which is then wrapped in an Ethernet header and footer and sent over
ethernet. At the other end, it's unwrapped again from an Ethernet frame, back
to an IP packet, then a TCP segment, then an HTTP request.
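Here is a toy picture of that wrapping in Python. The field names, addresses
and port numbers are made up, and real headers are compact binary data rather
than dictionaries:

http_request = "GET /index.html HTTP/1.1"

tcp_segment = {"src_port": 51000, "dst_port": 80, "seq": 1,
               "data": http_request}
ip_packet = {"src_ip": "192.0.2.1", "dst_ip": "203.0.113.5",
             "data": tcp_segment}
ethernet_frame = {"header": "MAC addresses etc.",
                  "data": ip_packet,
                  "footer": "checksum"}

# Unwrapping at the destination reverses the process, layer by layer:
print(ethernet_frame["data"]["data"]["data"])   # the original HTTP request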
The name packet is a generic term for a unit of data. In the application
layer, units of data are called data or requests; in the transport layer,
datagrams or segments; in the network/IP layer, packets; and in the link
layer, frames. Each layer has its own name for a unit of data (segment,
packet, frame, request etc.), however the more generic "packet" is often
used instead, regardless of layer.
This system is neat because each layer can assume that the layers above and
below have guaranteed something about the information, and each layer (and
protocol in use at that layer) has a stand-alone role. So if you’re making a
website you just have to program website code, and not worry about code to
make the site work over wifi as well as ethernet. A similar system is used in
the postal system... You don't put the courier's truck number on the front of the
envelope! That’s taken care of by the post company, which then uses a system
to sort the mail and assign it to drivers, and then drivers to trucks, and then
drivers to routes… none of which you need to worry about when you send or
receive a letter or use a courier.
The OSI model is different from the TCP/IP model of the internet that
computer scientists use when approaching protocol design. OSI is probably
mentioned in the networking standards, but this guide uses the computer
science approach because it is simpler; the main idea, layers of abstraction,
is the important thing to get across. You can read more about the differences
here.
As you can see, a packet is divided into four main parts, addresses (source,
destination), numbers (sequence number, ACK number if it’s an
acknowledgement), flags (urgent, checksum) in the header, then the actual
data. At each level, a segment becomes the data for the next data unit, and
that again gets its own header.
TCP and UDP packets include a number saying how big they are, which means
that a packet can in principle be as big as you like. Can you think of any
advantages of having small packets? How about big ones? Think about the ratio
of actual data to overhead information (such as the header and footer).
15.6.1. Videos
There and back again: a packet's tale
16. Software Engineering
In 1996, the Ariane 5 rocket of the European Space Agency was launched for
its first test flight: countdown, ignition, flame and smoke, soaring rocket... then
BANG! Lots of little pieces scattered through the South American rainforest.
Investigators had to piece together what happened, and finally tracked down a
tiny, seemingly irrelevant bug: a piece of software on board the rocket which
was not even needed had reported an error and started a self-destruct sequence.
Thankfully, no one was on board but the failure still caused about US$370m
damage.
In extreme cases, software bugs can endanger lives. This happened in the
1980s, for example, when a radiation therapy machine caused the deaths of 3
patients by giving 100 times the intended dose of radiation. And in 1979, a US
army computer almost started a nuclear war, when it misinterpreted a
simulation of the Soviet Union launching a missile as the real thing! (If you are
interested in other software failures, CS4FN lists the most spectacular ones!)
Our society today is so reliant on software that we can’t even imagine life
without it anymore. In many ways, software has made our lives easier: we write
emails, chat with friends on Facebook, play computer games and search for
information on Google. Heaps of software is hidden behind the scenes too so
we don’t even know we’re using it, for example in cars, traffic lights, TVs,
washing machines, Japanese toilets, and hearing aids. We've become so used
to having software, we expect it to work at all times!
So why doesn’t it? Why do we get bugs in the first place? As it turns out, writing
software is incredibly difficult. Software isn’t a physical product, so we can’t just
look at it to see if it’s correct. On top of that, most of the software you use
every day is huge and extremely complex. Windows Vista is rumoured to have
around 50 million lines of code; Mac OS X has 86 million. If we printed Vista out
on paper, we would get an 88 m high stack! That's as high as a 22 storey
building or the Statue of Liberty in New York! If you wanted to read through
Vista and try to understand how it works, you can expect to get through about
120 lines per hour, so it would take you 417,000 hours or 47 ½ years! (And
that’s just to read through it, not write it.)
Software engineering is all about how we can create software despite this
enormous size and complexity and hopefully get a working product in the end.
It was first introduced as a topic of computer science in the 1960s during the
so-called "software crisis", when people realised that the capability of hardware
was increasing at incredible speeds while our ability to develop software was
staying pretty much the same.
As the name software engineering suggests, we are taking ideas and processes
from other engineering disciplines (such as building bridges or computer
hardware) and applying them to software. Having a structured process in place
for developing software turns out to be hugely important because it allows us
to manage the size and complexity of software. As a result of advances in
software engineering, there are many success stories of large and complex
software products that work well and contain few bugs. Think, for example, of
Google who have huge projects (Google search, Gmail, …) and thousands of
engineers working on them but somehow still manage to create software that
does what it should.
Since the 1960s, software engineering has become a very important part of
computer science, so much so that today programmers are rarely called
programmers, but software engineers. That’s because making software is much
more than just programming. There are a huge number of jobs for software
engineers, and demand for skilled workers continues to grow. The great thing
about being a software engineer is that you get to work in large teams to
produce products that will impact the lives of millions of people! Although you
might think that software engineers would have to be very smart and a bit
geeky, communication and teamwork skills are actually more important;
software engineers have to be able to work in teams and communicate with
their teammates. The ability to work well with humans is at least as important
as the ability to work with computers.
As software becomes larger, the teams working on it have grown, and good
communication skills have become even more important than in the past.
Furthermore, computer systems are only useful if they make things better for
humans, so developers need to be good at understanding the users they are
developing software for. In fact, as computers become smaller and cheaper
(following Moore's law), we've gone from having shared computers that
humans have to queue up to use, to having multiple digital devices for each
person, and it's the devices that have to wait until the human is ready. In a
digital system, the human is the most important part!
Believe it or not, Moore's law didn't just last for 10 years but is still true
nearly 50 years later (although a slowdown is predicted in the next couple
of years). This means that computers today are over 30 million times
faster than in 1965! (In 2015 it was 50 years since 1965, which means that
Moore's law predicts that processing power has doubled about 25 times;
2^25 is 33,554,432, so if a computer could run one instruction per second in
1965, it can now run 33,554,432 instructions per second.) It also means that
if you buy a computer
today, you might regret it in two years time when new computers will be
twice as fast. Moore’s law also relates to other improvements in digital
devices, such as processing power in cellphones and the number of pixels
in digital cameras.
The exact numbers above will depend on exactly what you're describing,
but the main point is that the processing power is increasing exponentially
— exponential growth doesn't mean just getting a lot faster, but getting
unbelievably faster; nothing in human history has ever grown this quickly!
To illustrate this in reverse, the time taken to open an app on a smartphone
might be half a second today, but a 1965 smartphone would have taken
over a year to open the same app (and the phone would probably have
been the size of a football field). It's no wonder that smartphones weren't
popular in the 1960s.
Although software engineering has come a long way in the last decades, writing
software is still difficult today. As a user, you only see the programs that were
completed, not those that failed. In 2009, just under a third of all software
projects succeeded, while almost a quarter failed outright or were cancelled
before the software could be delivered. The remaining projects were either
delivered late, were over budget or lacked functionality. A famous recent
project failure was the software for the baggage handling system at the new
airport in Denver. The system turned out to be more complex than engineers
had expected; in the end, the entire airport was ready but had to wait for 16
months before it could be opened because the software for the baggage
system was not working. Apparently, the airport lost $1 million every day
during these 16 months!
While successful projects are desirable, there is a lot that can be learnt
from failures! Here are some sites that provide further material on this if
you are interested.
Sometimes we are making software for ourselves; in that case, we can just
decide what the software should do. (But be careful: even if you think you know
what you want the software to do when you start developing it, you will
probably find that by the end of the project you will have a very different view
of what it should do. The problem is that before you have the software, you
can’t really predict how you will use it when it’s finished. For example, the
people making smart phones and software for smart phones probably didn’t
anticipate how many people would want to use their smart phones as torches!)
In many cases, we build software for other people. You might make a website
for your aunt’s clothing shop or write software to help your friends with their
maths homework. A software company might create software for a local council
or a GP’s practice. Google and Microsoft make software used by millions of
people around the world. Either way, whether you’re writing a program for your
friends or for millions of people, you first have to find out from your customers
what they actually need the software to do.
We call anyone who has an interest in the software a stakeholder. These are the
people that you need to talk to during the analysis part of your project to find
out what they need the software to do.
Imagine that you are making a phone app that allows students to preorder food
from the school cafeteria. They can use the app to request the food in the
morning and then just go and pick up the food at lunch time. The idea is that
this should help streamline the serving of food and reduce queues in the
cafeteria. Obvious stakeholders for your project are the students (who will be
using the phone app) and cafeteria staff (who will be receiving requests
through the app). Less obvious (and indirect) stakeholders include parents (“I
have to buy my child a smartphone so they can use this app?”), school admin
(“No phones should be used during school time!”) and school IT support who
will have to deal with all the students who can’t figure out how to work the app
or connect to the network. Different stakeholders might have very different
ideas about what the app should do.
To find out what our stakeholders want the software to do, we usually interview
them. We ask them questions to find functional and non-functional
requirements for the software. Functional requirements are things the software
needs to do. For example, your phone app needs to allow students to choose
the food they want to order. It should then send the order to the cafeteria,
along with the student’s name so that they can be easily identified when
picking up the food.
Non-functional requirements, on the other hand, don’t tell us what the software
needs to do but how it needs to do it. How efficient does it need to be? How
reliable? What sort of computer (or phone) does it need to run on? How easy to
use should it be?
So we first figure out who our stakeholders are and then we go to interview
them to find the requirements for the software. That doesn’t sound too hard,
right? Unfortunately, it’s the communication with the customer that often turns
out to be most difficult.
The first problem is that customers and software engineers often don’t speak
the same language. Of course, we don’t mean to say that they don’t both
speak English, but software engineers tend to use technical language, while
customers use language specific to their work. For example, doctors might use
a lot of scary medical terms that you don’t understand.
Imagine that a customer asks you to develop a scoring system for the
(fictional) sport of Whacky-Flob. The customer tells you “It’s really simple. You
just need to record the foo-whacks, but not the bar-whacks, unless the Flob is
circulating”. After this description, you’re probably pretty confused because you
don’t know anything about the sport of Whacky-Flob and don’t know the
specific terms used. (What on earth are foo-whacks???) To get started, you
should attend a few games of Whacky-Flob and observe how the game and the
scoring works. This way, you’ll be able to have a much better conversation with
the customer since you have some knowledge about the problem domain.
(Incidentally, this is one of the cool things about being a software engineer: you
get exposure to all kinds of different, exciting problem domains. One project
might be tracking grizzly bears, the next one might be identifying cyber
terrorists or making a car drive itself.)
You should also never assume that a customer is familiar with technical terms
that you might think everyone should know, such as JPEG, database or maybe
even operating system. Something like “The metaclass subclass hierarchy was
constrained to be parallel to the subclass hierarchy of the classes which are
their instances” might make some sense to a software engineer, but a
customer will just look at you very confused! One of the authors once took part
in a customer interview where the stakeholder was asked if they want to use
the system through a browser. Unfortunately, the customer had no idea what a
browser was. Sometimes, customers may not want to admit that they have no
idea what you’re talking about and just say “Yes” to whatever you suggest.
Remember, it’s up to you to make sure you and your customer understand
each other and that you get useful responses from your customer during the
interview!
Even if you manage to communicate with a customer, you might find that they
don’t really know what they want the software to do or can’t express it. They
might say they want “software to improve their business” or to “make their
work more efficient" but that's not very specific. (There's a great Dilbert
cartoon which illustrates this point!) When you show them the software you
have built, they can usually tell you if that’s what they wanted or what they like
and don’t like about it. For that reason, it’s a good idea to build little prototypes
while you’re developing your system and keep showing them to customers to
get feedback from them.
You’ll often find that customers have a specific process that they follow already
and want the software to fit in with that. We were once involved in a project
being done by university students for a library. Their staff used to write down
information about borrowed items three times on a paper form, cut up the form
and send the pieces to different places as records. When the students
interviewed them, they asked for a screen in the program where they could
enter the information three times as well (even though in a computer system
there really isn’t much point in that)!
Customers are usually experts in their field and are therefore likely to leave out
information that they think is obvious, but may not be obvious to you. Other
times, they do not really understand what can and cannot be done with
computers and may not mention something because they do not realise that it
is possible to do with a computer. Again, it’s up to you to get this information
from them and make sure that they tell you what you need to know.
Curiosity: Easy for computers and hard for humans vs hard for computers
and easy for humans
The curiosity above refers to an xkcd comic about image recognition; its
rollover text (you will need to actually view it on xkcd's website) is worth
reading too. Image recognition is a problem that initially seemed
straightforward, probably because humans find it easy.
Interestingly, there are many problems that computers find easy, but
humans find challenging, such as multiplying large numbers. Conversely,
there are many other problems that computers find hard, yet humans find
easy, such as recognizing that the thing in a photo is, for example, a cat.
If you have multiple stakeholders, you can get conflicting viewpoints. For
example, when you talk to the cafeteria people about your food-ordering app,
they may suggest that every student should only be able to order food up to a
value of $10. In this way, they can avoid prank orders. When you talk to a
teacher, they agree with this suggestion because they are worried about
bullying. They don’t want one student to get pressured into ordering food for
lots of other students. But the students tell you that they want to be able to
order food for their friends. In their view, $10 isn’t even enough for one
student.
What do you do about these conflicting points of view? Situations like this can
be difficult to handle, depending on the situation, the stakeholders and the
software you are making. In this case, you need the support from the cafeteria
and the teachers for your software to work, but maybe you could negotiate a
slightly higher order limit of $20 to try to keep all your stakeholders happy.
Finally, even if you get everything right during the analysis stage of your
project, talk to all the stakeholders and find all the requirements for the
software, requirements can change while you’re making the software. Big
software projects can take years to complete. Imagine how much changes in
the technology world in a year! While you’re working on the project, new
hardware (phones, computers, tablets, …) could come out or a competitor
might release software very similar to what you’re making. Your software itself
might change the situation: once the software is delivered, the customer will
try working with it and may realise it isn’t what they really wanted. So you
should never take the requirements for your software to be set in stone. Ideally,
you should keep talking to customers regularly throughout the project and
always be ready for changes in requirements!
For this project, you need to find someone for whom you could develop
software. This could be someone from your family or a friend. They might,
for example, need software to manage information about their business’
customers or their squash club might want software to schedule squash
tournaments or help with the timetabling of practices. (For this project, you
won’t actually be making the software, just looking at the requirements; if
the project is small enough for you to program on your own, it's probably
not big enough to be a good example for software engineering!)
Once you’ve found a project, start by identifying and describing the
stakeholders for your project. (This project will work best if you have at
least two different stakeholders.) Try to find all the stakeholders,
remembering that some of them might only have an indirect interest in
your software. For example, if you are making a database to store
customer information, the customers whose information is being stored
have some interest in your software even though they never use it directly;
for example, they will want the software to be secure so that their data
cannot be stolen. Write up a description about each stakeholder, giving as
much background detail as possible. Who are they? What interest do they
have in the software? How much technical knowledge do they have? …
Interview one of the stakeholders to find out what they want the software
to do. Write up the requirements for your software, giving some detail
about each requirement. Try to distinguish between functional and non-
functional requirements. Make sure you find out from your stakeholder
which things are most important to them. This way you can give each
requirement a priority (for example high, medium, low), so that if you
would actually build the software you could start with the most important
features.
For the other stakeholders, try to imagine what their requirements would
be. In particular, try to figure out how the requirements would differ from
the other stakeholders. It’s possible that two stakeholders have the same
requirements, but in that case maybe they have different priorities. See if
you can list any potential disagreements or conflicts between your
stakeholders. If so, how would you go about resolving them?
Software design is all about managing the complexity of software and making sure that the
software we create has a good structure. Before we start writing any code, we
design the structure of our software in the design phase of the project. When
you talk about software design, many people will think that you’re talking about
designing what the software will look like. Here, we’re actually going to look at
designing the internal structure of software.
So how can we design software in a way that it doesn’t end up hugely complex
and impossible to understand? Here, we give you an introduction to two
important approaches: subdivision and abstraction. Those are pretty scary
words, but as you’ll see soon, the concepts behind them are surprisingly
simple.
You can probably already guess what subdivision means: We break the software
into many smaller parts that can be built independently. Each smaller part may
again be broken into even smaller parts and so on. As we saw in the
introduction, a lot of software is so large and complex that a single person
cannot understand it all; we can deal much more easily with smaller parts.
Large software is developed by large teams so different people can work on
different parts and develop them in parallel, independently of each other. For
example, for your cafeteria project, you might work on developing the database
that records what food the cafeteria sells and how much each item costs, while
your friend works on the actual phone app that students will use to order food.
Once we have developed all the different parts, all we need to do is make them
communicate with each other. If the different parts have been designed well,
this is relatively easy. Each part has a so-called interface which other parts can
use to communicate with it. For example, your part of the cafeteria project
should provide a way for another part to find out what food is offered and how
much each item costs. This way, your friend who is working on the phone app
for students can simply send a request to your part and get this information.
Your friend shouldn’t need to know exactly how your part of the system works;
they should just be able to send off a request and trust that the answer they
get from your part is correct. This way, each person working on the project only
needs to understand how their own part of the software works.
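As a tiny illustration, here is what such an interface might look like in
Python; the function names and the menu data are invented for the example:

MENU = {"sandwich": 4.50, "juice": 2.00}

def get_menu():
    # The interface hides how the menu is actually stored.
    return list(MENU)

def price_of(item):
    # Returns the price of an item, or None if it isn't sold.
    return MENU.get(item)

# The phone app only calls the interface; it never touches MENU directly:
for item in get_menu():
    print(item, price_of(item))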
Let’s talk about the second concept, abstraction. Have you ever thought about
why you can drive a car without knowing how its engine works? Or how you can
use a computer without knowing much about hardware? Maybe you know what
a processor and a hard drive is but could you build your own computer? Could
your parents? We don’t need to know exactly how computers or cars work
internally to be able to use them thanks to abstraction!
If we look more closely at a computer, we can see that it actually has a number
of layers of abstraction. Right at the bottom, we have the hardware, including
the processor, RAM, hard disk and various complicated looking circuit boards,
cables and plugs.
When you boot your computer, you start running the operating system. The
operating system is in charge of communicating with the hardware, usually
through special driver software. Once you’ve started your computer, you can
run programs, for example your browser. The browser actually doesn’t
communicate with the hardware directly but always goes through the operating
system.
Finally, you’re the top layer of the system. You use the program but you will
(hopefully) never have to interact with the more complicated parts of the
operating system such as driver software, let alone the hardware. In this way,
you can use the computer without ever having to worry about these things.
The computer can be broken down into multiple layers, starting with the user, then the
programs, then the operating system, then finally the hardware.
We call a system like this a layered system. You can have any number of layers
you want but each layer can only communicate with the one directly below it.
The operating system can directly access the hardware but a program running
on the computer can't. You can use programs but hopefully will never have to
access the hardware or the more complex parts of the operating system such
as drivers. This again reduces the complexity of the system because each layer
only needs to know about the layer directly below it, not any others.
Each layer in the system needs to provide an interface so that the layer above
it can communicate with it. For example, a processor provides a set of
instructions to the operating system; the operating system provides commands
to programs to create or delete files on the hard drive; a program provides
buttons and commands so that you can interact with it.
One layer knows nothing about the internal workings of the layer below; it only
needs to know how to use the layer’s interface. In this way, the complexity of
lower layers is completely hidden, or abstracted. Each layer represents a higher
level of abstraction.
We can have the same “layered” approach inside a single program. For
example, websites are often designed as so-called three-tier systems with three
layers: a database layer, a logic layer and a presentation layer. The database
layer usually consists of a database with the data that the website needs. For
example, Facebook has a huge database where it keeps information about its
users. For each user, it stores information about who their friends are, what
they have posted on their wall, what photos they have added, and so on. The
logic layer processes the data that it gets from the database. Facebook’s logic
layer, for example, will decide which posts to show on your “Home” feed, which
people to suggest as new friends, etc. Finally, the presentation layer gets
information from the logic layer which it displays. Usually, the presentation
layer doesn’t do much processing on the information it gets but simply creates
the HTML pages that you see.
Facebook can be broken down into a three-tier system, comprising the
presentation layer, then the logic layer, then finally the data layer.
(The story goes that a military flight simulator reused the code for enemy
soldiers to model herds of kangaroos.) Once the program was finished, they
demonstrated it to some pilots. One
of the pilots decided to fly the helicopter close to a herd of kangaroos to
see what would happen. The kangaroos scattered to take cover when the
helicopter approached (so far so good) but then, to the pilot’s extreme
surprise, pulled out their guns and missile launchers and fired at the
helicopter. It seemed the programmer had forgotten to remove that part of
the code from the original simulator.
Think back to the requirements you found in the analysis project described
above. In this project, we will look at how to design the software.
Start by thinking about how the software you are trying to build can be
broken up into smaller parts. Maybe there is a database or a user interface
or a website? For example, imagine you are writing software to control a
robot. The robot needs to use its sensors to follow a black line on the
ground until it reaches a target. The software for your robot should have a
part that interacts with the sensors to get information about what they
“see”. It should then pass this information to another part, which analyses
the data and decides where to move next. Finally, you should have a part
of the software which interacts with the robot’s wheels to make it move in
a given direction.
Try to break down your software into as many parts as possible (remember,
small components are much easier to build!) but don't go too far: each
part should perform a sensible task and be relatively independent from the
rest of the system.
For each part that you have identified, write a brief description about what
it does. Then think about how the parts would interact. For each part, ask
yourself which other parts it needs to communicate with directly. Maybe a
diagram could help visualise this?
16.4. Testing: Did we Build the
Right Thing and Does it Work?
We’ve decided what our software should do (analysis) and designed its internal
structure (design), and the system has been programmed according to the
design. Now, of course, we have to test it to make sure it works correctly.
When we test software, we try lots of different inputs and see what outputs or
behaviour the software produces. If the output is incorrect, we have found a
bug.
The problem with testing is that it can only show the presence of errors, not
their absence! If you get an incorrect output from the program, you know that
you have found a bug. But if you get a correct output, can you really conclude
that the program is correct? Not really. The software might work in this
particular case but you cannot assume that it will work in other cases. No
matter how thoroughly you test a program, you can never really be 100% sure
that it’s correct. In theory, you would have to test every possible input to your
system, but that’s not usually possible. Imagine testing Google for everything
that people could search for! But even if we can’t test everything, we can try as
many different test cases as possible and hopefully at least decrease the
probability of bugs.
As with design, we can’t possibly deal with the entire software at once, so we
again just look at smaller pieces, testing one of them at a time. We call this
approach unit testing. A unit test is usually done by a separate program which
runs the tests on the program that you're writing. That way you can run the
tests as often as you like --- perhaps once a day, or even every time there is a
change to the program.
It's not unusual to write a unit test program before you write the actual
program. It might seem like wasted work to have to write two programs instead
of one, but being able to have your system tested carefully any time you make
a change greatly improves the reliability of your final product, and can save a
lot of time trying to find bugs in the overall system, since you have some
assurance that each unit is working correctly.
Once all the separate pieces have been tested thoroughly, we can test the
whole system to check if all the different parts work together correctly. This is
called integration testing. Some testing can be automated while other testing
needs to be done manually by the software engineer.
If I give you a part of the software to test, how would you start? Which test
inputs would you use? How many different test cases would you need? When
would you feel reasonably sure that it all works correctly?
There are two basic approaches you can take, which we call black-box testing
and white-box testing. With black-box testing, you simply treat the program as
a black box and pretend you don’t know how it’s structured and how it works
internally. You give it test inputs, get outputs and see if the program acts as you
expected.
But how do you select useful test inputs? There are usually so many different
ones to choose from. For example, imagine you are asked to test a program
that takes a whole number and outputs its successor, the next larger number
(e.g. give it 3 and you get 4, give it -10 and you get -9, etc). You can’t try the
program for all numbers so which ones do you try?
You observe that many numbers are similar and if the program works for one of
them it’s probably safe to assume it works for other similar numbers. For
example, if the program works as you expect when you give it the number 3,
it’s probably a waste of time to also try 4, 5, 6 and so on; they are just so
similar to 3.
This is the concept of equivalence classes. Some inputs are so similar, you
should only pick one or two and if the software works correctly for them you
assume that it works for all other similar inputs. In the case of our successor
program above, there are two big equivalence classes, positive numbers and
negative numbers. You might also argue that zero is its own equivalence class,
since it is neither positive nor negative.
For testing, we pick a couple of inputs from each equivalence class. The inputs
at the boundary of equivalence classes are usually particularly interesting.
Here, we should definitely test -1 (this should output 0), 0 (this should output 1)
and 1 (this should output 2). We should also try another negative and positive
number not from the boundary, such as -48 and 57. Finally, it can be interesting
to try some very large numbers, so maybe we’ll take -2,338,678 and
10,462,873. We have only tested 7 different inputs, but these inputs will
probably cover most of the interesting behaviour of our software and should
reveal most bugs.
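Here is how those choices might look as unit tests in Python, using the
standard unittest module (the successor function and the test names are just
for illustration):

import unittest

def successor(n):
    return n + 1

class TestSuccessor(unittest.TestCase):
    def test_boundaries(self):
        self.assertEqual(successor(-1), 0)
        self.assertEqual(successor(0), 1)
        self.assertEqual(successor(1), 2)

    def test_typical_values(self):
        self.assertEqual(successor(-48), -47)
        self.assertEqual(successor(57), 58)

    def test_large_values(self):
        self.assertEqual(successor(-2338678), -2338677)
        self.assertEqual(successor(10462873), 10462874)

if __name__ == "__main__":
    unittest.main()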
Of course, you might also want to try some invalid inputs, for example “hello”
(a word) or “1,234” (a number with a comma in it) or “1.234” (a number with a
decimal point). Often, test cases like these can get programs to behave in a
very strange way or maybe even crash because the programmer hasn’t
considered that the program might be given invalid inputs. Remember that
especially human users can give you all sorts of weird inputs, for example if
they misunderstand how the program should be used. In case of an invalid
input, you probably want the program to tell the user that the input is invalid;
you definitely don’t want it to crash!
Black-box testing is easy to do but not always enough because sometimes
finding the different equivalence classes can be difficult if you don’t know the
internal structure of the program. When we do white-box testing, we look at the
code we are testing and come up with test cases that will execute as many
different lines of code as possible. If we execute each line at least once, we
should be able to discover a lot of bugs. We call this approach code coverage
and aim for 100% coverage, so that each line of code is run at least once. In
reality, even 100% code coverage won’t necessarily find all bugs though,
because one line of code might work differently depending on inputs and
values of variables in a program. Still, it’s a pretty good start.
Unit testing is very useful for finding bugs. It helps us find out if the program
works as we intended. Another important question during testing is if the
software does what the customer wanted (Did we build the right thing?).
Acceptance testing means showing your program to your stakeholders and
getting feedback about what they like or don’t like. Any mistakes that we made
in the analysis stage of the project will probably show up during acceptance
testing. If we misunderstood the customer during the interview, our unit tests
might pass (i.e. the software does what we thought it should) but we may still
have an unhappy customer.
For this project, choose a small program such as a Windows desktop app or
an Apple dashboard widget. Pick something that you find particularly
interesting or useful (such as a timer, dictionary or calculator). Start by
reading the description of the program to find out what it does before you
try it out.
Next, think about a stakeholder for this software. Who would use it and
why? Briefly write down some background information about the
stakeholder (as in the analysis project) and their main requirements. Note
which requirements would be most important to them and why.
Now, you can go ahead and install the program and play around with it. Try
to imagine that you are the stakeholder that you described above. Put
yourself in this person’s shoes. How would they feel about this program?
Does it meet your requirements? What important features are missing? Try
to see if you can find any particular problems or bugs in the program. (Tip:
sometimes giving programs unexpected input, for example a word when
they were expecting a number, can cause some interesting behaviour.)
Write up a brief acceptance test report about what you found. Try to link
back to the requirements that you wrote down earlier, noting which have
been met (or maybe partially met) and which haven’t. Do you think that
overall the stakeholder would be happy with the software? Do you think
that they would be likely to use it? Which features would you tell the
software developers to implement next?
The obvious answer would be to start with analysis to figure out what we want
to build, then design the structure of the software, implement everything and
finally test the software. This is the simplest software process, called the
waterfall process.
The waterfall process is borrowed from other kinds of engineering. If we want to
build a bridge, we go through the same phases of analysis, design,
implementation and testing: we decide what sort of bridge we need (How long
should it be? How wide? How much load should it be able to support?), design
the bridge, build it and finally test it before we open it to the public. It’s been
done that way for many decades and works very well, for bridges at least.
We call this process the waterfall process because once you “jump” from one
phase of the project to the next, you can’t go back up to the previous one. In
reality, a little bit of backtracking is allowed to fix problems from previous
project phases but such backtracking is usually the exception. If during the
testing phase of the project you suddenly find a problem with the requirements,
you certainly won’t be allowed to go back and rewrite them.
An advantage of the waterfall process is that it’s very simple and easy to follow.
At any point in the project, it’s very clear what stage of the project you are at.
This also helps with planning: if you’re in the testing stage you know you’re
quite far into the project and should finish soon. For these reasons, the
waterfall process is very popular with managers who like to feel in control of
where the project is and where it’s heading.
Your manager and customer will probably frequently ask you how much
longer the project is going to take and when you will finally have the
finished program. Unfortunately, it’s really difficult to know how much
longer a project is going to take. According to Hofstadter’s law, “It always
takes longer than you expect, even when you take into account
Hofstadter's Law.” Learning to make good estimates is an important part of
software engineering.
Because it’s just so nice and simple, the waterfall process is still in many
software engineering textbooks and is widely used in industry. The only
problem with this is that the waterfall process just does not work for most
software projects.
So why does the waterfall process not work for software when it clearly works
very well for other engineering products like bridges (after all, most bridges
seem to hold up pretty well...)? First of all, we need to remember that software
is very different from bridges. It is far more complex. Understanding the plans
for a single bridge and how it works might be possible for one person but the
same is not true for software. We cannot easily look at software as a whole
(other than the code) to see its structure. It is not physical and thus does not
follow the laws of physics. Since software is so different from other engineering
products, there really is no reason why the same process should necessarily
work for both.
To understand why the waterfall process doesn’t work, think back to our section
about analysis and remember how hard it is to find the right requirements for
software. Even if you manage to communicate with the customers and resolve
conflicts between the stakeholders, the requirements could still change while
you’re developing the software. Therefore, it is very unlikely that you will get
the complete and correct requirements for the software at the start of your
project.
If you make mistakes during the analysis phase, most of them are usually found
in the testing stage of the project, particularly when you show the customer
your software during acceptance testing. At this point, the waterfall process
doesn’t allow you to go back and fix the problems you find. Similarly, you can’t
change the requirements halfway through the process. Once the analysis phase
of the project is finished, the waterfall process “freezes” the requirements. At
the end of your project, you will end up with software that hopefully fulfills
those requirements, but it is unlikely that those will be the correct
requirements.
You end up having to tell the customer that they got what they asked for, not
what they needed. If they've hired you, they'll be annoyed; if it's software that
you're selling (such as a smartphone app), people just won't bother buying it.
You can also get things wrong at other points in the project. For example, you
might realise while you’re writing the code that the design you came up with
doesn’t really work. But the waterfall process tells you that you have to stick
with it anyway and make it work somehow.
So if the waterfall process doesn’t work, what can we do instead? Most modern
software development processes are based on the concept of iteration. We do a
bit of analysis, followed by some design, some programming and some testing.
(We call this one iteration.) This gives us a rather rough prototype of what the
system will look like. We can play around with the prototype, show it to
customers and see what works and what doesn’t. Then, we do the whole thing
again. We refine our requirements and do some more design, programming and
testing to make our prototype better (another iteration). Over time, the
prototype grows into the final system, getting closer and closer to what we
want. Methodologies based on this idea are often referred to as agile --- they
can easily adapt as changes become apparent.
The advantage with this approach is that if you make a mistake, you will find it
soon (probably when you show the prototype to the customer the next time)
and have the opportunity to fix it. The same is true if requirements change
suddenly; you are flexible and can respond to changes quickly. You also get a
lot of feedback from the customers as they slowly figure out what they need.
There are a number of different software processes that use iteration (we call
them iterative processes); a famous one is the spiral model. Although the
details of the different processes vary, they all use the same iteration structure
and tend to work very well for software.
Apart from the question of what we do at what point of the project, another
interesting question addressed by software processes is how much time we
should spend on the different project phases. You might think that the biggest
part of a software project is programming, but in a typical project, programming
usually takes up only about 20% of the total time! 40% is spent on analysis and
design and another 40% on testing. This shows that software engineering is so
much more than programming.
Once you’ve finished developing your program and given it to the customer,
the main part of the software project is over. Still, it’s important that you don’t
just stop working on it. The next part of the project, which can often go on for
years, is called maintenance. During this phase you fix bugs, provide customer
support and maybe add new features that customers need.
Imagine that your project is running late and your customer is getting
impatient. Your first instinct might be to ask some of your friends if they
can help out so that you have more people working on the project. Brooks’
law, however, suggests that that is exactly the wrong thing to do!
Brooks’ law states that “adding manpower to a late software project makes
it later.” This might seem counterintuitive at first because you would
assume that more people would get more work done. However, the
overhead of getting new people started on the project (getting them to
understand what you are trying to build, your design, the existing code,
and so on) and of managing and coordinating the larger development team
actually makes things slower rather than faster in the short term.
The waterfall process is simple and commonly used but doesn’t really work
in practice. In this activity, you’ll get to see why. First, you will create a
design which you then pass on to another group. They have to implement
your design exactly and are not allowed to make any changes, even if it
doesn’t work!
You need a deck of cards and at least 6 people. Start by dividing up into
groups of about 3-4 people. You need to have at least 2 groups. Each group
should grab two chairs and put them about 30cm apart. The challenge is to
build a bridge between the two chairs using only the deck of cards!
Before you get to build an actual bridge, you need to think about how you
are going to make a bridge out of cards. Discuss with your team members
how you think this could work and write up a short description of your idea.
Include a diagram to make your description understandable for others.
Now exchange your design with another group. Use the deck of cards to try
to build your bridge to the exact specification of the other group. You may
not alter their design in any way (you are following the waterfall process
here!). As frustrating as this can be (especially if you know how to fix the
design), if it doesn’t work, it doesn’t work!
If you managed to build the bridge, congratulations to you and the group
that managed to write up such a good specification! If you didn’t, you now
have a chance to talk to the other group and give them feedback about the
design. Tell them about what problems you had and what worked or didn’t
work. The other group will tell you about the problems they had with your
design!
Now, take your design back and improve it, using what you just learnt
about building bridges out of cards and what the other group told you. You
can experiment with cards as you go, and keep changing the design as you
learn about what works and what doesn't (this is an agile approach, which
we are going to be looking at further shortly). Keep iterating (developing
ideas) until you get something that works.
Which of these two approaches worked best --- designing everything first,
or doing it in the agile way?
In this activity, you will develop a language for navigating around your
school. Imagine that you need to describe to your friend how to get to a
particular classroom. This language will help you give a precise description
that your friend can easily follow.
First, figure out what your language has to do (i.e. find the requirements).
Will your language be for the entire school or only a small part? How exact
will the descriptions be? How long will the descriptions be? How easy will
they be to follow for someone who does / doesn’t know your language?
How easy will it be to learn? …
Finally, test the language using another student. Don’t tell them where
they’re going, just give them instructions and see if they follow them
correctly. Try out different cases until you are sure that your language
works and that you have all the commands that you need. If you find any
problems, go back and fix them and try again!
Note down how much time each of the different phases of the project take
you. When you have finished, discuss how much time you spent on each
phase and compare with other students. Which phase was the hardest?
Which took the longest? Do you think you had more time for some of the
phases? What problems did you encounter? What would you do differently
next time around?
Divide up into pairs, with one creator and one builder in each pair. Each
person needs a set of at least 10 coloured building blocks (e.g. lego
blocks). Make sure that each pair has a matching set of blocks or this
activity won’t work!
The two people in each pair should not be able to see each other but need
to be able to hear each other to communicate. Put up a screen between
the people in each pair or make them face in opposite directions. Now, the
creator builds something with their blocks. The more creative you are the
more interesting this activity will be!
When the creator has finished building, it’s the builder's turn. His or her
aim is to build an exact replica of the creator's structure (but obviously
without knowing what it looks like). The creator should describe exactly
what they need to do with the blocks. For example, the creator could say
“Put the small red block on the big blue block” or “Stand two long blue
blocks up vertically with a one block spacing between them, and then
balance a red block on top of them”. But the creator should not describe
the building as a whole (“Make a doorframe.”).
When the builder thinks they are done, compare what you built! How
precise was your communication? Which parts were difficult to describe for
the creator / unclear for the builder? Switch roles so that you get to
experience both sides!
Agile processes include lots of interesting principles that are quite different
from standard software development. We look at the most interesting ones
here. If you want to find out more, have a look at Agile Academy on YouTube
which has lots of videos about interesting agile practices! There’s also another
video here that explains the differences between agile software development
and the waterfall process.
16.6.1. Pair-programming
Programming is done in pairs with one person coding while the other person
watches and looks for bugs and special cases that the other might have
missed. It’s simply about catching small errors before they become bugs. After
all, 4 eyes see more than 2.
You might think that pair-programming is not very efficient and that it would be
more productive to have programmers working separately; that way, they can
write more code more quickly, right? Pair-programming is about reducing
errors. Testing, finding and fixing bugs is hard; trying not to create them in the
first place is easier. As a result, pair-programming has actually been shown to
be more efficient than everyone programming by themselves!
16.6.2. YAGNI
YAGNI stands for “You ain’t gonna need it” and tells developers to keep things
simple and only design and implement the things that you know you are really
going to need. It can be tempting to think that in the future you might need
feature x and so you may as well already create it now. But remember that
requirements are likely to change so chances are that you won’t need it after
all.
16.6.4. Refactoring
There are many different ways to design and program a system. YAGNI tells you
to start by doing the simplest thing that’s possible. As the project develops, you
might have to change the original, simple design. This is called refactoring.
Refactoring only works on software because it is “soft” and flexible. The same
concept does not really work for physical engineering products. Imagine that
when building a bridge, for example, you started off by doing the simplest
possible thing (putting a plank over the river) and then continually refactored
the bridge to get the final product.
16.6.7. Test-driven development
Before you write a piece of code, you should write a test for the code that you
are about to write. This forces you to think about exactly what you’re trying to
do and what special cases there are. Of course, if you try to run the test, it will
fail (since the functionality it is testing does not yet exist). When you have a
failing test, you can then write code to make the test pass.
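A minimal sketch of this test-first style in Python (the function add() is
hypothetical, chosen only to keep the example short):

    # Step 1: write the test first. Running it now fails, because add()
    # doesn't exist yet.
    def test_add():
        assert add(2, 3) == 5
        assert add(-1, 1) == 0  # thinking about special cases up front

    # Step 2: write just enough code to make the test pass.
    def add(a, b):
        return a + b

    test_add()  # passes silently once add() behaves correctly

The failing test pins down exactly what "done" means before you start coding.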
16.6.8. Courage
“Courage” might seem like an odd concept in the context of software
development. In agile processes, things change all the time and therefore
programmers need to have the courage to make changes to the code as
needed, fix the problems that need to be fixed, correct the design where
needed, throw away code that doesn’t work, and so on. This might not seem
like a big deal, but it can actually be quite scary to change code, particularly if
the code is complicated or has been written by a different person. Unit tests
really help by giving you courage: you’ll feel more confident to change the code
if you have tests that you can run to check your work later.
This project will provide insight into a real software engineering process,
but you'll need to find a software engineer who is prepared to be
interviewed about their work. It will be ideal if the person works in a
medium to large size company, and they need to be part of a software
engineering team (i.e. not a lone programmer).
The project revolves around interviewing the person about the process they
went through for some software development they did recently. They may
be reluctant to talk about company processes, in which case it may help to
assure them that you will keep their information confidential (your project
should only be viewed by you and those involved in supervising and
marking it; you should state its confidential nature clearly at the start so
that it doesn't get published later).
You need to do substantial preparation for the interview. Find out about the
kind of software that the company makes. Read up about software
engineering (in this chapter) so that you know the main terminology and
techniques.
Now prepare a list of questions for the interviewee. These should find out
what kind of software development processes they use, what aspects your
interviewee works on, and what the good and bad points are of the
process, asking for examples to illustrate this.
You should take extensive notes during the interview (and record it if the
person doesn't mind).
You then need to write up what you have learned, describing the process,
discussing the techniques used, illustrating it with examples, and
evaluating how well the process works.
Demonstrate understanding of
searching and sorting algorithms
The key with this standard is to be able to explain how an algorithm works
(usually best done through personalised examples), and evaluate algorithms in
terms of how the time they take increases as the size of the input increases.
For merit and excellence, students need to compare two contrasting algorithms
for both searching and sorting. (Students sometimes mix up algorithms for
searching with those for sorting; it doesn't make sense to compare the speed of
a searching algorithm with the speed of a sorting algorithm as they are
achieving different things, and it's important to be aware of what the problem is
that each algorithm solves.)
The material that you need to understand these ideas is in the chapter on
Algorithms. Start with the "Big picture" introduction to the chapter, which
explains the idea of the "cost" of an algorithm (and what an algorithm is!)
The two searching algorithms that you should learn about in order to compare
them are Sequential Search (sometimes referred to as Linear Search) and
Binary Search; these are covered in the searching algorithms section. These are
strongly contrasting algorithms and will work well for a comparison, and they
are based on a simple data structure (the array or list). Other algorithms (yet to
be added to the Field Guide) that keen students might want to explore are
based on Hash Tables (these are usually faster than Binary Search and are
commonly used in practice, but have more "moving parts" to understand and
explain) and Search Trees (which again aren't covered in the Field Guide at this
stage, but are widely used in practice). However, Sequential Search and Binary
Search provide an excellent contrasting introduction to the issues surrounding
algorithms for searching, and nicely match the expectations of this standard.
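For reference, here is a minimal sketch of both algorithms in Python (our own
code, not taken from the field guide); the comments note the costs that make
the contrast so strong:

    def sequential_search(items, target):
        # Check every item in turn: up to len(items) comparisons.
        for index, item in enumerate(items):
            if item == target:
                return index
        return -1  # not found

    def binary_search(sorted_items, target):
        # Halve the search range each step: about log2(len) comparisons,
        # but only works if the list is already sorted.
        low, high = 0, len(sorted_items) - 1
        while low <= high:
            middle = (low + high) // 2
            if sorted_items[middle] == target:
                return middle
            elif sorted_items[middle] < target:
                low = middle + 1
            else:
                high = middle - 1
        return -1  # not found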
Students will also need to explore one or two sorting algorithms. There are
three main sorting algorithms described within the sorting algorithms section:
Selection Sort, Insertion Sort and Quicksort. Selection and Insertion Sort are
both slow algorithms that are easier to understand and implement, but have
limited use in practice because they are unnecessarily slow for typical data;
Quicksort is one of the faster methods known, and is a good contrast to the
previous two. For the achievement standard only two algorithms are needed;
Selection and Quicksort provide a good contrast, otherwise Insertion and
Quicksort could be used. It would not be a good idea to compare Insertion sort
with Selection sort; the differences between these are more subtle and would
require more careful evaluation. Another common sorting algorithm that is
widely used in practice because it is very fast is Mergesort; this isn't currently
covered in the Field Guide, but keen students could investigate it instead of
Quicksort.
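To show the contrast in code, here is a minimal Python sketch of both
algorithms (our own implementations; this Quicksort uses the first item as the
pivot for simplicity, which real implementations avoid):

    def selection_sort(items):
        # Repeatedly pick the smallest remaining item: about n*n/2 comparisons.
        items = list(items)
        for i in range(len(items)):
            smallest = i
            for j in range(i + 1, len(items)):
                if items[j] < items[smallest]:
                    smallest = j
            items[i], items[smallest] = items[smallest], items[i]
        return items

    def quicksort(items):
        # Partition around a pivot: about n*log2(n) comparisons on typical data.
        if len(items) <= 1:
            return list(items)
        pivot, rest = items[0], items[1:]
        smaller = [x for x in rest if x < pivot]
        larger = [x for x in rest if x >= pivot]
        return quicksort(smaller) + [pivot] + quicksort(larger)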
Assessment resources for this standard can be found on the TKI website here.
The documents here mention the CS Field Guide where it supports that
assessment.
AS91886 (1.10)
Demonstrate understanding of
human computer interaction
Human computer interaction is about evaluating and designing interfaces on
digital devices. It involves a considerable element of psychology, as it is about
how humans perceive and interact with interfaces. A common misunderstanding
is that this standard might be about building an interface; it is much easier
to do this assessment by critically reviewing an existing interface, as the worst person to
evaluate an interface is the person who designed it. The challenge for many
students is to step back from the device and think about what challenges and
confusion it might cause for the user, which requires a good level of sympathy
for a "typical" user!
The first section of the chapter introduces the big ideas around HCI, and the
second section sensitises students to the issues around users and tasks. Some
general issues and approaches are raised in the interface usability section, and
then the idea of usability heuristics is introduced. These heuristics are the key
to addressing the standard, but the preliminary understanding is needed to
avoid just treating them in isolation.
A trap with this standard is to describe the interface based on its specifications
(screen size, camera resolution, battery life, or even claims of being "user
friendly" in the advertising). That is why the "Think aloud protocol" and
"Cognitive walkthrough" are important, as they help the evaluator to see an
interface through a user's eyes. The heuristics provide a solid, internationally
used framework to make well-founded observations about the interface.
The scope of the interface that is evaluated should be kept narrow. A simple
app like a stopwatch or alarm clock, or other "widget", will usually have more
than enough to evaluate, as it isn't so much about the number of buttons on
display, but all the states that the interface can get into, and what common
sequences of actions are for users. Avoid large systems such as an entire word
processor (just the font choice dialogue in a word processor would be enough)
or an operating system (such as Windows vs MacOS).
Where the standard calls for Heuristics to be used, you should use the 10 listed
in usability heuristics section, which were originally published by Jakob Nielsen.
(Other sets of heuristics exist, but these ones are widely used and fairly
straightforward to apply.)
AS91887 (1.11)
Demonstrate understanding of
compression coding for a chosen
media type
Data compression is widely used to reduce the size of files. This standard
focuses on the technical elements of how the algorithms work to achieve
compression, and is an extension of earlier progress outcomes in
Computational Thinking that look at how data is represented on computers.
Simple (uncompressed) representations may be familiar to students who have
studied DT previously; compression uses more complex binary representations
to reduce the space. While the quality of a compressed image or sound is
relevant, a common mistake is for students to focus more on this than the
actual algorithm that is applied to the data to change its representation and
make it smaller.
The general idea of changing the coding of data from a simple representation is
introduced in the short chapter on "coding". Compression is covered in its own
chapter.
The achieved level of the standard covers "lossless" methods. A simple lossless
method is run-length coding. This is suitable for the "achieved" level of the
standard, but the merit criteria require an evaluation, which is challenging
without a detailed understanding of how it is implemented in practice. The
excellence criteria include "real-world" applications, and although run-length
coding is used on fax machines, these are becoming uncommon, so might not
be accepted as a "real world" application. Run length coding is also used as
part of JPEG compression, but the way it is used is part of a complex
combination of codes, and explaining this well would involve understanding the
Discrete Cosine Transform, quantisation, and the Huffman code.
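As an illustration of the basic idea, a few lines of Python can run-length
encode a row of black-and-white pixels (a simplified sketch that just records
run lengths; real schemes such as the fax standard add further conventions,
like always starting with a white run):

    def run_length_encode(row):
        # Turn "0001100" into [3, 2, 2]: the length of each run of equal pixels.
        if not row:
            return []
        runs = []
        count = 1
        for previous, current in zip(row, row[1:]):
            if current == previous:
                count += 1
            else:
                runs.append(count)
                count = 1
        runs.append(count)
        return runs

    print(run_length_encode("011000011"))  # prints [1, 2, 4, 2]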
The next section on JPEG covers the common lossy method for compressing
photographs. JPEG is good to use as an example of a lossy method, and human
perception can be demonstrated by saving images at various levels of
compression from an image/photo editing program that allows you to choose
the quality of the file being saved - students can compare the quality of the
image with the amount of space it takes to store, and explain this in terms of
the general principle that JPEG uses. Good files to use to evaluate JPEG
compression include images containing human faces, an image that is all one
colour, an image that is very random (such as a photo of a very detailed
surface), and a black-and-white image. Results should be displayed in a well
organised table showing any trends found in the experiment, and images
should be shown to demonstrate the quality. When showing images, think
about how the examiner will see them, and consider zooming in on issues
rather than relying on it coming out in a (badly) printed report. Explaining the
method in detail requires an understanding of the DCT, quantisation and
Huffman codes. These can look rather mathematical to some students, but the
maths used is the cosine function (which is a basic trig concept) and rounding
(for quantisation), so it isn't beyond what some year 11 students will have
studied. The interactives in the chapter provide the ability to do personalised
explanations, since your own photo can be loaded, and the DCT parameters for
it can be calculated online, and explained.
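For students comfortable with a little code, the cosine calculation can be
demonstrated in a few lines of Python (a one-dimensional sketch of the DCT
idea only; it ignores the scaling factors and the 8x8 two-dimensional
transform that a real JPEG encoder uses):

    import math

    def dct_1d(samples):
        # Each coefficient measures how strongly a cosine wave of one
        # particular frequency is present in the samples.
        n = len(samples)
        return [round(sum(s * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                          for i, s in enumerate(samples)), 1)
                for k in range(n)]

    # A smooth run of brightness values: nearly all the "energy" lands in the
    # first few (low-frequency) coefficients, which is what JPEG exploits.
    print(dct_1d([52, 55, 61, 66, 70, 61, 64, 73]))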
The section on Huffman coding explains a method that is used as part of nearly
every compression method. It has been used in the past as a lossless
compression method in its own right, but it isn't used on its own in practice
because LZ coding works better. It could be used for the achieved requirement
of the standard ("showing how a lossless compression method works"), and in
principle it enables "real-world" compression methods, but showing how it fits
into them will require some care (e.g. LZ coding uses Huffman coding to
represent the pointers, and JPEG uses Huffman coding to represent the run-
length coded quantised values). As with everything in these standards,
students should make sure they understand the broad implications of what
they are writing about, and not just latch on to one aspect of an idea, as this
can leave markers wondering if they really understand the topic!
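A compact Python sketch of the idea is given below (our own code; it builds
the code table only, whereas a real compressor must also store the tree and
pack the resulting bits):

    import heapq
    from collections import Counter

    def huffman_codes(text):
        # Frequent characters end up with shorter bit patterns.
        # Heap entries: [frequency, tie-breaker, symbol or (left, right) subtree].
        heap = [[freq, i, char]
                for i, (char, freq) in enumerate(Counter(text).items())]
        heapq.heapify(heap)
        counter = len(heap)
        while len(heap) > 1:  # repeatedly merge the two rarest subtrees
            left = heapq.heappop(heap)
            right = heapq.heappop(heap)
            heapq.heappush(heap, [left[0] + right[0], counter,
                                  (left[2], right[2])])
            counter += 1
        codes = {}
        def walk(node, prefix):
            if isinstance(node, tuple):  # internal node: 0 = left, 1 = right
                walk(node[0], prefix + "0")
                walk(node[1], prefix + "1")
            else:                        # leaf: record this character's code
                codes[node] = prefix or "0"
        walk(heap[0][2], "")
        return codes

    print(huffman_codes("mississippi"))
    # e.g. {'s': '0', 'm': '100', 'p': '101', 'i': '11'}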
1.44 Assessment Guide
This document provides a brief introduction to teachers on the Computer
Science Field Guide assessment guides for NCEA Achievement standard
AS91074 (1.44).
Topics
1.44 has bullet points for the following three topics in computer science:
• Algorithms
• Programming Languages
• Human Computer Interaction
Each of these topics has a chapter in the Computer Science Field Guide.
The advice below applies to the whole standard; advice specific to each topic is
available through the main NCEA index.
For students who are weak at math, searching algorithms is probably the better
choice. Sorting algorithms requires either being good at understanding trends
from data in a table or understanding how to read trends from a graph in order
to achieve merit or excellence, whereas the cost of searching algorithms can
easily be seen by students carrying out the algorithms themselves.
Sorting algorithms provide a slightly richer range of possibilities, including more
ways to demonstrate how they work in a student's report, and intriguing new
approaches to a common and easily described task that may not have been
obvious.
Guidance is given for achieved, merit, and excellence for both sorting
algorithms and searching algorithms.
Order of Topics
The three topics can be completed in any order, although the first bullet point
in each level (comparing algorithms, programs, and informal instructions) is
probably best left until both algorithms and programming languages have been
completed, since those topics can provide examples to illustrate the points in
the first bullet points.
Covering Human Computer Interaction first may make the Algorithms topic
more relevant to students. In many cases, a not so good algorithm will take a
second to run, whereas a better algorithm will take less than a tenth of a
second. This is very significant in terms of a good user interface, so covering
HCI first will make students more aware of issues like this.
In their report, it is important that the three topics are kept separate. Order
does not matter, but the student should have three or four main headings (it is
up to them whether or not they put the two parts of algorithms together), and
keep all the material under the relevant headings. This will make it far easier
for the marker to find the evidence they are looking for.
If the teacher provides too many headings or leading questions for students to
structure their work, this can reduce the opportunity for the report to reflect a
personal understanding.
Report Length
It is important to note that the page limit given by NZQA is a limit - not a
target. The markers prefer reports that are short and to the point. The
requirements of the standard can easily be met within the limit.
The page limit for 1.44 is now 10 pages to cover the three topics. A possible
breakdown that leaves one additional page is:
• Algorithms: 4 pages
• Programming Languages: 2 pages
• Human Computer Interaction: 3 pages
The assessment guides for the specific topics provide further guidance on how
to stay within these limits. Students should be mindful of the recommended
limits while they are working on their reports, in order to avoid having to delete
work they put a lot of effort into.
If using examples, don't use ones taken from the Field Guide or other sources -
students should make up their own. For sorting and searching, they should
actually carry out the balance scales activity using either the field guide
interactives or physical balance scales. For HCI, students can choose an
interface to evaluate themselves.
General Advice
In 2012 we did a study that looked at 151 student submissions for 1.44 from
2011, the first year 1.44 was offered. The lessons learnt are
still relevant, particularly for teachers teaching the standard for the first time. A
WIPSCE paper was written presenting our findings of how well students
approached the standard and our recommendations for avoiding pitfalls. Our
key findings are reflected in the teacher guides, although reading the entire
paper would be worthwhile.
The paper was Bell, T., Newton, H., Andreae, P., & Robins, A. (2012). The
introduction of Computer Science to NZ High Schools --- an analysis of student
work. In M. Knobelsdorf & R. Romeike (Eds.), The 7th Workshop in Primary and
Secondary Computing Education (WiPSCE 2012). Hamburg, Germany. The
paper is available here.
Algorithms (1.44) -
Searching Algorithms
This is a guide for students attempting the Algorithms topic of digital
technologies achievement standard 1.44 (AS91074). If you follow this guide,
then you do not need to follow the sorting algorithms one.
In order to fully cover the standard, you will also need to have done projects
covering the topics of Programming Languages and Human Computer
Interaction in the standard, and included these in your report.
Overview
The topic of Algorithms has the following bullet points in achievement standard
1.44, which this guide covers. This guide separates them into two categories.
Merit: “explaining how algorithms are distinct from related concepts such as
programs and informal instructions”
As with all externally assessed Digital Technology reports, you should base your
explanations around personalised examples.
Project
This project involves understanding linear search and binary search.
Ensure you have tried both of the box searching interactives, which are in the
part of the field guide which you read. For one of them you had to use
linear search, and for the other you had to use binary search.
Take a screenshot of a completed search using the binary search interactive (if
you get lucky and find the target within 2 clicks, keep restarting until it takes at
least 3, so that you have something sufficient to show in your report). Show on your
screenshot which boxes you opened, and put how many boxes you opened. The
number of boxes you opened is the cost of the algorithm for this particular
problem. Include your screenshot in your report.
Describe (in your own words with 1 - 3 sentences) the overall process you
carried out to search through the boxes. Try and make your explanation
general, e.g. if you gave the instructions to somebody who needs to know how
to search 100 boxes, or 500 boxes, the instructions would be meaningful.
You also need to show the kinds of steps that can be in an algorithm, such as
iterative, conditional, and sequential. If you don’t know what these terms
mean, go have another look at the field guide. Get a Scratch program (or
another language if you are fairly confident with understanding the language)
that implements binary search. Take a screenshot of it, or a large part of it (you
want to ensure that the screenshot takes up no more than half a page in the
report, but is still readable) and open it in a drawing program such as paint.
Add arrows and notes showing a part of the algorithm that is sequential, part
that is conditional, and part that is iterative.
Merit/ Excellence
It should be obvious from your initial investigation that binary search is far
better than linear search! You might still ask, though: why not just use a faster
computer? To explore this possibility, you are now going to analyse what
happens with a huge amount of data. More specifically, you are going to
answer the following question: "How do linear search and binary search
compare when the amount of data to search is doubled?"
Start by picking a really large number (e.g. in the billions, or even bigger - this
is the amount of data that large online companies such as Google or Facebook
have to search). Imagine you have this number of boxes that you have to
search. Also, imagine that you then have two times that number of boxes, four
times that number of boxes, eight times that number of boxes, and sixteen
times that number of boxes.
Now, using those 5 different amounts of boxes, you are going to determine how
many boxes would have to be looked at on average to find a target for linear
search and binary search. You will then also calculate the amount of time you
could expect it to take, using the average number of boxes to be looked at and
an estimate of how many boxes a really fast computer could check per second.
As you do the various calculations, you should add them into a table, such as
the one below. This will be a part of your report.
Values for n         Average for      Average for      Expected Time for   Expected Time for
                     Linear Search    Binary Search    Linear Search       Binary Search
Chosen number        ???              ???              ???                 ???
Chosen number x 2    ???              ???              ???                 ???
Chosen number x 4    ???              ???              ???                 ???
Chosen number x 8    ???              ???              ???                 ???
Chosen number x 16   ???              ???              ???                 ???
Rather than actually carrying out the searching (the interactive is not big
enough!), you are going to calculate the expected averages. Computer
scientists call this analysing an algorithm, and often it is better to work out how
long an algorithm can be expected to take before waiting years for it to run and
wondering if it will ever complete. Remember that you can use the big number
calculator and the time calculator in the field guide to help you with the math.
If you are really keen, you could make a spreadsheet to do the calculations and
graph trends.
Hint for estimating linear search: Remember that in the worst case, you would
have to look at every box (if the target turned out to be the last one), and on
average you'll have to check half of them. Therefore, to calculate the average
number of boxes that linear search would have to look at, just halve the total
number of boxes.
Hint for estimating binary search: Remember that with each box you look at,
you are able to throw away half (give or take 1) of the boxes. Therefore, to
calculate the average number of boxes that binary search would have to look
at, repeatedly divide the number by 2 until it gets down to 1. However many
times you divide by 2 is the average cost for binary search. Don’t worry if your
answer isn’t perfect; it’s okay to be within 3 or so of the correct answer. If,
while halving your number, it never gets down to exactly 1 (e.g. it gets down
to 1.43 and then 0.715), your answer will be near enough: as long as you have
halved your number repeatedly until it gets to 1 or below, your answer will be
accurate enough.
Now that you have calculated the average number of boxes for each algorithm,
you can calculate how long it would take on a high end computer for each
algorithm with each problem size. Assume that the computer can look at 1
billion boxes per second. Don’t worry about being too accurate (e.g. just round
to the nearest millisecond (1/1000 of a second), second, minute, hour, day,
month, or year). Some of the values will be a tiny fraction of a millisecond. For
those, just write something like "Less than 1 millisecond". You can get the
number by dividing the expected number of boxes to check by 1 billion.
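If you would like to check your arithmetic, a short Python sketch can compute
every cell of the table (the starting number and the speed of 1 billion checks
per second are just the assumptions described above):

    import math

    BOXES_PER_SECOND = 1_000_000_000  # assumed speed of a fast computer

    n = 10_000_000_000  # put your own chosen starting number here
    for multiplier in [1, 2, 4, 8, 16]:
        boxes = n * multiplier
        linear_average = boxes / 2  # on average, check half the boxes
        binary_average = math.ceil(math.log2(boxes))  # halvings to reach 1 box
        print(boxes, "boxes:",
              linear_average / BOXES_PER_SECOND, "s (linear) vs",
              binary_average / BOXES_PER_SECOND, "s (binary)")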
You should notice some obvious trends in your table. Explain these trends, and
in particular explain how the amount of time each algorithm takes changes as
the problem size doubles. Does it have a significant impact on the amount of
time the algorithm will take to run? Remember the original question you were
asked to investigate: "How do linear search and binary search compare when
the amount of data to search is doubled?".
Using your findings to guide you, discuss one of the following scenarios:
• Imagine that you are a data analyst with the task of searching these
boxes, and in order to do your work you need to search for many pieces of
data each day. What would happen if you were trying to use linear search
instead of binary search?
• Imagine that you have a web server that has to search a large amount of
data and then return a response to a user in a web browser (for example
searching for a person on Facebook). A general rule of computer systems
is that if they take longer than 1/10 of a second (100 milliseconds) to
return an answer, the delay will be noticeable to a human. How do binary
search and linear search compare when it comes to ensuring there is not a
noticeable delay?
Writing the part of your report for the other
algorithms bullet points
Achieved/ Merit/ Excellence
We recommend doing this part after you have done programming languages.
All three levels (A/M/E) are cleanly subsumed by the E requirement, so you
should try to do that, i.e. “comparing and contrasting the concepts of
algorithms, programs, and informal instructions”. You should refer to examples
you used in your report or include additional examples (e.g. a program used as
an example in the programming languages topic, or an algorithm describing
the searching process, etc). If you are confused, have another look at the field
guide. You should only need to write a few sentences to address this
requirement.
A rough guide to how much space each part should take in your report:
• ½ page: Screenshot of you carrying out binary search with the interactive.
(Achieved)
• ¼ page: Explanation of your binary search screenshot. (Achieved)
• ¼ page: General instructions for carrying out binary search. (Achieved)
• ½ page: Your example of the iterative, conditional, and sequential steps
that can be in an algorithm (Merit)
• 1 ½ pages: Your investigation and data collected for merit/ excellence.
Including results and discussion (Merit/ Excellence)
• ¼ page to ½ page: Explanation of the difference between algorithms,
programs, and informal instructions (Achieved/ Merit/ Excellence)
These are maximums, not targets!
For the topic of searching algorithms you probably won’t need this much space
(sorting algorithms tends to require more space).
Note that if you go over 4 pages for Algorithms, then you may have to use
fewer pages for one of the other two topics, which could be problematic. No
other material should be included for Algorithms.
Algorithms (1.44) -
Sorting Algorithms
This is a guide for students attempting the Algorithms topic of digital
technologies achievement standard 1.44 (AS91074). If you follow this guide,
then you do not need to follow the searching algorithms one.
In order to fully cover the standard, you will also need to have done projects
covering the topics of Programming Languages and Human Computer
Interaction in the standard, and included these in your report.
Overview
The topic of Algorithms has the following bullet points in achievement standard
1.44, which this guide covers. This guide separates them into two categories.
Merit: “explaining how algorithms are distinct from related concepts such as
programs and informal instructions”
As with all externally assessed Digital Technology reports, you should base your
explanations around personalised examples.
Note that 2.2 is not necessary for this project, as 2.2 focuses on searching
algorithms, whereas this project focuses on sorting algorithms.
Project
This project involves understanding how selection sort works and the types of
steps that can be in it and other algorithms, and then comparing the cost of
selection sort to the cost of quicksort.
Carry out selection sort on a small amount of data. You can do this either using
the balance scale interactive in the field guide (recommended), a physical set
of balance scales if your school has them (normal scales that show the exact
weights are unsuitable), or as a trace you did using pencil and paper (not
recommended). Count how many comparisons you made to sort the items.
Take screenshots/ photos of you using the interactive or balance scales to do
the sorting. Three or four pictures would be ideal (i.e. one showing the initial
state of the scales and weights, one or two in the middle where you are
comparing weights, and one at the end where all the weights are sorted). Use a
drawing program to draw on each of the pictures and show which weights have
been sorted so far, and which have not. Put on the screenshots how many
comparisons have been made so far in the sorting process. Write a short
explanation of what is happening in the images. Make sure you include the total
number of comparisons that was needed to sort the items in your report.
Describe (in your own words with a few sentences) the overall process you
carried out to sort the weights or numbers. Try and make your explanation
general, e.g. if you gave the instructions to somebody who needs to know how
to sort 100 numbers, or 500 numbers, the instructions would be meaningful.
You also need to show the kinds of steps that can be in an algorithm, such as
iterative, conditional, and sequential. If you don’t know what these terms
mean, go have another look at the field guide. Get a Scratch program (or
another language if you are fairly confident with understanding the language)
that implements selection sort. Take a screenshot of it, or a large part of it (you
want to ensure that the screenshot takes up no more than half a page in the
report, but is still readable) and open it in a drawing program such as paint.
Add arrows and notes showing a part of the algorithm that is sequential, part
that is conditional, and part that is iterative.
Merit
Remember that some algorithms are a lot faster than others, especially as the
size of the problem gets bigger. It isn’t necessarily the case that if you try to
sort twice as many items then it will take twice as long. As a quick warm up
investigation to give you some idea of this, try the following.
Get an implementation of selection sort (there are some linked to at the end of
the chapter in the field guide). Start by choosing a number between 10 and 20.
How many comparisons does it take to sort that many randomly generated
numbers with your chosen algorithm? Now, try sorting twice as many numbers.
How many comparisons did it take now? Does it take twice as many? Now, try
sorting 10 times as many numbers. Does it take 10 times as many
comparisons? How many more times the original problem size’s number of
comparisons does it actually take? Hopefully you are starting to see a trend
here.
If you aren’t attempting excellence, include the numbers you got from the
warm up investigation, along with an explanation of the trend you found. If you
are attempting excellence, you should do the warm up investigation as it will
help you (and will only take a few minutes), but you don’t need to write about
it.
Excellence
The best way of visualising the data you have just collected is to make a graph
(e.g. using Excel). Your graph should have 2 lines; one for quicksort and one for
selection sort, showing how the number of comparisons increases as the size of
the problem goes up. Make sure you label the graph well. A simple way of
making the graph is to use a scatter plot and put in lines connecting the dots
(make sure the data for the graph is in increasing order, with the smallest
problem sizes first and largest last, so that the line gets drawn properly). Ask your
teacher for guidance if you are having difficulty with Excel.
Look at your graph. Does the rate of increase for the two algorithms seem to be
quite different? Discuss what your graph shows. If you aren’t sure what to
include in the discussion of your findings, you could consider the following
questions.
• What happens to the number of comparisons when you double how many
numbers you are sorting with quicksort? What about when you sort 10
times as many numbers? How is this different to when you used selection
sort at the start?
• What is the largest problem you can solve within a few seconds using
selection sort? What about with quicksort?
• If you had a database with 1 million people in it and you needed to sort
them by age, which of the two algorithms would you choose? Why? What
would happen if you chose the other algorithm?
We recommend doing this part after you have done programming languages.
All three levels (A/M/E) are subsumed by the E requirement, so you should try
to do that, i.e. “comparing and contrasting the concepts of algorithms,
programs, and informal instructions”. You should refer to examples you used in
your report or include additional examples (e.g. a program used as an example
in the programming languages topic, or an algorithm describing the sorting
process, etc). If you are confused, have another look at the field guide. You
should only need to write a few sentences to address this requirement.
Note that if you go over 4 pages for Algorithms, then you may have to use
fewer pages for one of the other two topics, which could be problematic. No
other material should be included for Algorithms.
Human Computer
Interaction (1.44)
This is a guide for students attempting Human Computer Interaction in digital
technologies achievement standard 1.44 (AS91074).
In order to fully cover the standard, you will also need to have done projects
covering the topics of Algorithms and Programming Languages, and included
these in your report.
Overview
Human Computer Interaction has the following bullet points in achievement
standard 1.44, which this guide covers.
Achieved: “describing the role of a user interface and factors that contribute
to its usability”
As with all externally assessed reports, you should base your explanations
around personalised examples, so that the marker can be confident that your
report is your own work.
Start by reading both of these introduction sections. They will give you a
general overview of what Human Computer Interaction is all about.
What’s the Big Picture?
Then read one (or both if you're keen) of these sections on usability, in order to
understand the kinds of things you will be looking for in your usability
evaluation.
Interface Usability
Usability Heuristics
Project
In this project, you will carry out a usability evaluation by observing a helper
carry out a specific task on an interface you have chosen. While you could
theoretically do the task yourself and write down where you had difficulty, it is
surprisingly challenging to notice, and be objective about, usability issues you
are facing yourself.
Choosing an interface
It is essential that the interface you choose is one that your helper is not
already familiar with.
Because you will need to compare related interfaces for excellence, make sure
you choose an interface for which you will also be able to find a second related
interface to compare with (e.g. two different alarm clocks or two different flight
booking systems). The second interface should also be one that your helper is
not familiar with (otherwise they may be biased towards the one they are
familiar with).
Some possible pairs of interfaces you could use are listed below (although
remember that this list is far from exhaustive):
• Online booking systems for two different airlines (e.g. Air NZ vs Jetstar).
• Two different friends' cell phones.
• Two different email clients you have never used before (don’t forget about
the many webmail clients. Even signing up for webmail addresses could
prove to be challenging in some cases).
• Two different microwaves. Cheap microwaves are notorious for being
inconsistent and illogical to use. [Note that running a microwave with
nothing in it will damage it! You would be best to put something inside it
while you are experimenting with its interface. Water in a microwave safe
glass is fine]
• Two different apps/programs for setting an alarm (many exist). You could
choose ones that go on a phone or on your computer, or one of each. A
physical alarm clock would be good.
• Two different drawing programs you have never used before.
Note that an interface you (or your helper) designed yourself is unsuitable
because you will know how it works in great detail.
Some possible tasks you could ask your helper to carry out are:
• Setting an alarm that will ring at 4:25am tomorrow to catch an early flight
(or for a more sophisticated interface, at 7:25am on Monday, Tuesday,
Wednesday, and Friday, i.e. all weekdays except Thursday, when perhaps
you have to get up at 6:30am for a really early meeting).
• Sending a text to a friend that says “What are you doing at 3pm today?
Want to go for coffee? :-)” (Symbols are good to include in the message, as
these can be challenging to find on some interfaces).
• Changing a phone background to a photo you found online.
• Heating some food or water in a microwave for 1 minute, 20 seconds.
• Booking the cheapest flight that will arrive before 11 AM in Auckland from
Christchurch, on the next Saturday (stop once you get to the part that
asks for payment details!).
• Drawing a smiley face with a drawing program, and putting your name below
the smiley face.
The two projects in the HCI chapter ("Think aloud protocol" and "Cognitive
walkthrough") provide detailed procedures of how to do an evaluation using a
widely used approach. We recommend choosing one of these (depending on
the kind of interface).
Whichever approach you take, tell your helper what the task is, and give them
one of your chosen interfaces so that they can carry out the task. While they
are carrying out the task, you should be observing and keeping notes on the
steps they take, paying particular attention to any points at which they are
confused, select an incorrect option (or menu), have to use trial and error (e.g.
they know the setting they want is probably in one of three menus, but have to
check all three), encounter something they didn't expect, waste time following
a dead end, or know what to do thanks to useful prompts on the interface (e.g.
meaningful icons or naming). Ideally, they will be verbalising their thought
process while attempting the task (as in the think-aloud protocol), although
keep in mind that some people find this challenging to do.
If you have more than one task for the interface, repeat the above process for
each task. Also, if you have a second interface and are aiming for Excellence,
repeat the process with the other interface and the same tasks. Remember to
keep thorough notes on the entire process. You will need them to write a report
that describes and explains what you've observed.
Achieved/ Merit
You should write your introduction before you do the usability evaluation. By
initially thinking about what you would expect to be able to do on the
interface(s) for your task(s), you will be in a better position at the end to evaluate
whether or not the interface(s) lived up to your expectations.
Now think back to sections 3.3 and/or 3.4 of the book and look over the notes
you took during the usability evaluation for your chosen interface(s). Explain
the negative characteristics of the interface(s) which caused your helper
difficulties. Also explain the positive characteristics of the interface(s) which
made it easier for your helper. Be sure to briefly describe the context of each
characteristic (e.g. what was the user trying to accomplish at the time? What
were they expecting to see happen?). If you have two interfaces, then write up
two examples for each interface. If you have one interface, then write up three
or four examples for it.
Remember that short and concise paragraphs are far better than long winded
rambling. You might have noticed ten different usability issues, but instead of
writing about all of them, it is better to pick the three or four that are most
likely to come up and are the most serious, and explain them well.
Excellence
In order to meet the excellence criteria, you need to "discuss how different
factors of a user interface contribute to its usability by comparing and
contrasting related interfaces". Therefore, you now need to discuss how the
usability of the two interfaces compares. What was different between the two
interfaces? Which interface did your helper find the easiest to use? Which did
they prefer using? Why? If you were designing an interface that could be used
for the same task, but was better than both the interfaces you investigated,
which ideas would you take from each interface? Which ideas would you stay
away from?
Keep in mind that interface design is really challenging to get completely right,
and even the best interfaces still have usability issues. In addition, there are
often trade-offs (e.g. not all features can be listed on the outermost menu). This
is why companies such as Apple, Samsung, and Google put so much money
into interface design. The implication of this for you, the one writing a report
about usability, is that there are not necessarily any "right" answers. Therefore,
you should just focus on explaining and justifying the points you make, and not
worrying about whether or not your views are what the marker "wants" to see.
The key to this topic is writing succinctly. Be careful to not ramble. You might
not be able to include everything you wanted to; this is okay. Just prioritise and
focus on the most interesting 2 or 3 issues for each interface. It is quite
possible to use 2 or fewer pages (including images) in this topic, and in fact
some students have done an amazing job with just one page!
Note that if you go over 3 or 4 pages for Human Computer Interaction, then
you may have to use fewer pages for one of the other two topics, which could
be problematic.
Programming Languages (1.44)
In order to fully cover the standard, you will also need to have done projects
covering the topics of Algorithms and Human Computer Interaction, and
included these in your report.
Overview
Programming Languages has the following bullet points in achievement
standard 1.44, which this guide covers. Note that merit is split into two bullet
points.
Excellence: “comparing and contrasting high level and low level (or machine)
languages, and explaining different ways in which programs in a high level
programming language are translated into a machine language”
As with all externally assessed reports, you should base your explanations
around personalised examples.
Reading from the Computer
Science Field Guide
You should read and work through the interactives and activities in the
following sections of the CS Field Guide in order to prepare yourself for the
assessed project.
4.1 - What’s the Big Picture? (and an introduction to what programming is,
intended for those of you with limited programming experience)
4.2 - Machine Code (the low level programming examples and activities)
4.4 - How does the Computer Process your Program? (Compilers and
Interpreters)
It is very important that you actually do the activities in 4.2 (and 4.1 if you
don’t know much about programming).
Project
This project consists of three main components. The first involves making a
couple of example programs. The second involves investigating the differences between
high level and low level languages using your examples, and then the third
involves investigating the different ways that high level languages can be
converted to low level languages.
Briefly explain what each of the programs does (ideally you should have run
them). For example, does it add numbers, or does it print some output? What output do
your programs give? You do not need to explain how it does it (i.e. no need to
explain what each statement in the program does). The purpose of this is to
show the marker that you do know what your example does, as opposed to just
copying code you know nothing about.
What are the main differences you see between the high level language and
the low level language? Why would a human not want to program in the
language shown in your low level programming language example? What made
modifying the low level programs in the field guide challenging? Given that a
human probably doesn’t want to program in a low level language, why do we
need low level programming languages at all? What is their purpose?
When you wrote your high level program (or modified an existing program),
what features of the language made this easier compared to when you
attempted to modify the low level program? Why are there many different high
level programming languages?
If you have a compiler for the language your high level program example is
written in, how would you use it to allow the computer to run your program?
(Even if your language is an interpreted one, such as Python, just explain what
would happen if you had a compiler for it, as technically a compiler can be
written for any language, and there are in fact compilers for Python). What is
the purpose of the compiler?
Excellence
What about an interpreter? How does the interpreter’s function differ from a
compiler in the way interpreted programs and compiled programs are run?
Which is more commonly used?
Here are some ideas for comparing compilers and interpreters: One way to
consider the difference is to explain what happens if a program is transferred
from one computer to another. Does it still run on the other computer? Does
someone else need the same compiler or interpreter to run your software? Can
you type in each line of a program and have it executed as you type it, or does
the whole program have to be available before it can be run?
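If you know a little Python, the rough sketch below may help you see the difference. The toy language, its two commands, and both functions are made up purely for illustration (real compilers and interpreters are far more complex): interpret runs each line as soon as it reads it, while compile_to_python translates the whole program up front into another language that can then be run on its own.

```python
# A toy language: each line is either "SET name number" or "PRINT name".

def interpret(program):
    # An interpreter: execute each line the moment it is read.
    variables = {}
    for line in program.splitlines():
        parts = line.split()
        if parts[0] == "SET":
            variables[parts[1]] = int(parts[2])
        elif parts[0] == "PRINT":
            print(variables[parts[1]])

def compile_to_python(program):
    # A "compiler": translate the whole program into Python source code,
    # which can be run later, even without this translator present.
    compiled_lines = []
    for line in program.splitlines():
        parts = line.split()
        if parts[0] == "SET":
            compiled_lines.append(f"{parts[1]} = {int(parts[2])}")
        elif parts[0] == "PRINT":
            compiled_lines.append(f"print({parts[1]})")
    return "\n".join(compiled_lines)

source = "SET x 4\nSET y 38\nPRINT x"
interpret(source)                   # runs line by line, immediately
translated = compile_to_python(source)
print(translated)                   # the whole program, translated up front
exec(translated)                    # the translated program runs on its own
```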
• ½ page: Example of low level program (If it is larger than this, you've
probably put too much or not resized it as well as you could have)
• ½ page: Example of high level program (If it is larger than this, you've
probably put too much or not resized it as well as you could have)
• ½ page: High Level and Low Level languages discussion
• ½ page: Compilers and Interpreters discussion
Note that if you go over 2 pages for Programming Languages, then you may
have to use fewer pages for one of the other two topics, which could be
problematic. No other material should be included for Programming Languages.
2.44 Assessment Guide
This document provides a brief introduction to teachers on the Computer
Science Field Guide assessment guides for NCEA Achievement standard
AS91371 (2.44).
While there has previously been a recommendation that the same device be
used for all aspects of this standard, this can limit the range of observations
that students can make (for example, some interfaces that are good to
evaluate don't make it easy to find out how they represent data or might not
use an aspect of encoding). If a student chooses to use a common theme then
that is fine, but if their choice of device or theme doesn't have the richness or
transparency to see how all aspects work, it is better to use different devices or
examples for different aspects of the standard. Also note that the examples do
not have to be related to a "device", for example it is fine to evaluate the
interface of an interactive website for Human Computer Interaction or look at
check digits on credit cards for Error Control Coding.
Topics
2.44 has bullet points for the following topics in computer science.
Each of these topics has a chapter in the Computer Science Field Guide, which
this assessment guide is based on.
There are multiple assessment guides for representing data and the encoding
topics, of which students need to do a subset. The following explanation
outlines what students should cover.
Representing Data using Bits
Students should choose two data types. To get achieved, they should give
examples for both their data types of the data type being represented using
bits. To get merit, they should show two different representations using bits for
each data type, and then compare and contrast them. This topic does not have
excellence requirements. For this reason, students going for excellence should
put more time into the discussions for encoding and human computer
interaction than representing data using bits.
The following table shows common types of data that students could choose.
For achieved, they should choose two rows in the table, and do what is in the
achieved column for their chosen rows. For merit they should satisfy the
achieved criteria, and additionally choose one data type in the merit column for
each of their chosen rows, to compare with the ones from the achieved column.
Note that data types and representations currently covered in the field guide
are in italics. Binary numbers are a prerequisite for colours, and are
recommended for all students. Students who struggle with binary numbers
should just aim to represent a few numbers in binary (e.g. their age, birthday,
etc) and then move onto representing text.
For example: for achieved, a student might choose the Characters/ Text and
Binary Numbers rows, and therefore show examples of ASCII and Positive
Numbers in their report. Another student who hopes to get at least merit in the
standard might pick the same two rows, and therefore will cover ASCII and
Positive Numbers, but they will also show examples of Unicode and Floating
Point Numbers (they could have picked any of the many suggestions in that
row). They will then compare Unicode and ASCII, and then Floating Point
Numbers and Positive Numbers. They should not do comparisons across data
types (e.g. they should not compare a text representation with a number
representation, as that does not make sense to do).
Most of the data types are based on binary numbers. Therefore, all students
will need to learn how to represent whole numbers in binary before writing their
report. However, they do not have to choose Binary Numbers as one of their
two topics. They can just learn to represent whole numbers in binary, and then
move on to using those numbers to represent Images/ Colours or Sound. Most
students will find those topics much more interesting to evaluate, and easier to
satisfy the merit criteria with - they can actually hear and see the varying
Sound and Image qualities that using fewer or more bits leads to, and therefore
include this in their discussions.
Generally, students who are only aiming for achieved are best off picking
binary numbers (Positive Numbers) and text (ASCII), as these are the most
straightforward data representations.
Another issue is that hexadecimal is not a good example for students to use as
a different representation of data, as it is simply a shorthand for binary. Writing
a number as 01111010 (binary) or 7A (hexadecimal) represents exactly the
same bits stored on a computer with exactly the same meaning; the latter is
easier for humans to read and write, but both are 8-bit representations that
have the same range of values. It is a useful shorthand, but shouldn't be used
as a second representation for a type of data, or as a different type of data.
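If you want to check this for yourself, the short Python snippet below uses the values from the paragraph above to show that the binary and hexadecimal forms name exactly the same stored value:

```python
# 01111010 (binary) and 7A (hexadecimal) are the same 8-bit value.
bits = "01111010"
print(int(bits, 2))                  # 122
print(int("7A", 16))                 # 122 -- exactly the same value
print(format(int(bits, 2), "02X"))   # back to hexadecimal: 7A
```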
Encoding
Students need to describe each of the three encoding topics in order to get
achieved, and additionally they need to do a more in-depth project on one of
the three topics in order to get merit or excellence.
Students should choose a subset of the provided projects that cover one of the
following options. The first three options are for students aiming for merit/
excellence. The fourth option is for students just aiming to get achieved.
                       Option 1          Option 2          Option 3          Option 4
Compression            Up to Excellence  Only Achieved     Only Achieved     Only Achieved
Encryption             Only Achieved     Up to Excellence  Only Achieved     Only Achieved
Error Control Coding   Only Achieved     Only Achieved     Up to Excellence  Only Achieved
Note that some assessment guides provide projects that cover only achieved,
and others go to excellence. Students should choose appropriate assessment
guides based on the option they have chosen. It is best to do one topic to the
excellence level and to focus on doing a really good job, as opposed to doing a
not so good job on two or three.
At the excellence level students are required to evaluate "a widely used system
for compression coding, error control coding, or encryption". The guides discuss
some widely used systems, but it is worth noting that only one system has to
be considered (e.g. JPEG is a widely used compression system, so evaluating
JPEG would be sufficient; an alternative would be checksums used in bar
codes). The evaluation would need to involve a comparison with not using the
system, so for JPEG it might be with a RAW or BMP file; for bar codes, it would
be to consider what would be different if a check digit isn't used. In some cases,
it might make sense to compare the chosen widely used system with a
mediocre alternative (that isn't widely used). One example where this would
work is comparing the RSA cryptosystem (widely used) with the Caesar Cipher (no
longer used in practice).
Human Computer Interaction
In particular, students only need to discuss one interface. They should not
discuss a second one or attempt to do comparisons, as this implies they did not
read the requirements of the standard.
Reciting well known examples of heuristics (e.g. you turn off the Windows
computer by clicking the "Start" button) is not a good idea, because it is poorly
personalised.
In their report, it is important that the material for each topic is kept together. It
does not matter what order "Representing Data using Bits", "Encoding", and
"Human Computer Interaction" are presented in the report, as long as there is
one main heading for each, with all the relevant material below it. The three
encoding topics should all be under "Encoding", and separated into three
subheadings of "Encryption", "Compression", and "Error Control Coding". It
would also be best to have two subheadings under "Representing Data using
Bits" - one for each of the two chosen data types. Following a logical structure
such as this ensures the marker can easily find their way around the report,
and will not overlook anything. This also helps to show that the student
understands the similarities and differences between the various topics.
Report Length
It is important to note that the page limit given by NZQA is not a target. The
markers prefer reports that are short and to the point. Additionally, students
and teachers have complained that there is too much work in computer science
reports. In many cases, this seems to be students and teachers doing far more
than what was needed for the standard.
Note that the breakdown assumes the student is aiming for excellence - those
only going for achieved will need less. Also note the wide recommended range
of 7 pages to 13 pages. It is the top excellence students who will be able to
cover the standard in just 7 pages. There is no gain in trying to bulk the length
of the report. A report that satisfies the requirements for achieved in 5 pages
may be more likely to get an achieved than one that is 14 pages bulked out
with irrelevant material and copied-and-pasted content. Including unnecessary material implies
a lack of understanding.
It is also worth noting that some of the best examples we have seen of
students doing the Human Computer Interaction were done in under 1 page.
These high achieving students were able to concisely cover the requirements
by writing 3 or 4 paragraphs following a pattern along the lines of: "A problem I/
my observer found with the device was ... I would fix this by ...". Teachers should
keep this in mind, as we've seen cases where the teacher thought the student
had not written enough and therefore couldn't possibly be at the excellence
level, when in fact it is beyond the minimum requirement for excellence!
Note that this is the only guide you need to follow to satisfy the data
representation requirements up to achieved. It covers two different data types -
numbers and text.
In order to fully cover the standard, you will also need to have done projects
covering the three encoding topics up to the achieved level (error control
coding, encryption, and compression), and a project covering human computer
interaction, and include these in your report.
Overview
The topic of Data Representation has the following achieved bullet point in
achievement standard 2.44, which this guide covers.
As with all externally assessed reports, you should base your explanations
around personalised examples.
Read all of these sections, as they give the necessary introduction of the topic
What’s the Big Picture?
Project
Start this section by writing an introduction to the topic of data representation.
Describe what a "bit" is, and why computers use bits to represent data. This
introduction only needs to be a couple of sentences - you are just showing the
marker that you understand what a "bit" is, and how "bits" are used to
represent data. It must be in your own words, based on what you understood in
class (e.g. do not paraphrase a definition).
Next, you are going to show an example of how the day and month (number
form) of your birthday are represented in binary. If you don't want to include
your real birthday in your report, you can make one up, but it must not be the
same birthday that somebody else in your class is using. For each of the two
numbers, show the working you use to get the binary representation, the final
binary representation using 1's and 0's, and a description of what you did to get
the binary representation.
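If you would like to check your working, here is a rough Python sketch of the repeated-division method (the day 23 and month 7 are made-up values; substitute your own):

```python
# Convert a (made-up) birthday -- day 23, month 7 -- to binary by
# repeatedly dividing by 2 and recording the remainders.
def to_binary(n):
    bits = ""
    while n > 0:
        bits = str(n % 2) + bits   # the remainder becomes the next bit
        n = n // 2
    return bits or "0"

for number in [23, 7]:
    print(number, "in binary is", to_binary(number))
    # 23 -> 10111, 7 -> 111
```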
Finally you are going to show one other type of representation - the ASCII
representation, which is used to represent strings of text. You will do this by
showing your first name and favourite food in ASCII. Find an ASCII table (there is
one in the field guide). Now, use the table to convert both the words to ASCII.
Briefly describe how you used the ASCII table to convert these words to ASCII.
You should not include the ASCII table in your report.
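You can also check your hand conversion with a short sketch like the one below ("Kiri" and "kiwifruit" are stand-ins for your own name and favourite food). It prints each character's ASCII code as 7 bits, which is the same lookup you did with the table:

```python
# Look up the ASCII code for each character with ord(), then show it as
# 7-bit binary -- the same conversion you would do by hand with the table.
for word in ["Kiri", "kiwifruit"]:
    codes = [format(ord(c), "07b") for c in word]
    print(word, "->", " ".join(codes))
```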
Remember that you only need to do one of the three encoding topics
(compression, encryption, and error control coding) to the excellence level. If
you are either not interested in getting more than achieved, or are doing either
encryption or error control coding to the excellence level, then this is the right
guide for you. If you were wanting to do compression up to the excellence level,
then you should select the alternate excellence guide instead.
In order to fully cover the standard, you will also need to have done projects
covering the topics of encryption and error control coding to at least the
achieved level (with one of them to the excellence level if you are attempting
to get more than achieved), and projects covering the topics of representing
data using bits and human computer interaction, and include these in your
report.
Overview
Encoding has the following achieved bullet points in achievement standard 2.44
which this guide covers.
As with all externally assessed reports, you should base your explanations
around personalised examples.
Reading from the Computer
Science Field Guide
You should read and work through the interactives in the following sections of
the CS Field Guide in order to prepare yourself for the assessed project.
Read all of these sections, as they give the necessary introduction of the topics
Project
Start this section by writing an introduction to the topic of compression. Briefly
explain what compression is, what it is used for, and what kinds of problems
would exist if there was no such thing as compression. This introduction only
needs to be a few sentences - you are just showing the marker that you
understand the bigger picture of what compression is, and some of the typical
uses of it.
Count how many characters are needed to represent your image in its original
form (i.e. how many squares does it contain?). Count how many characters
were used in your Run Length Encoding representation. Don’t forget to include
the commas! (check the field guide or read the teacher note if you don't
understand why we say you must count the commas). How well did Run Length
Encoding compress your image? You might choose to use the Run Length
Encoding interactive.
Briefly describe what Run Length Encoding is, relating back to the example you
have just put in your report. Regardless of whether you use the interactive or
calculate the run length encoding by hand, describe how you arrived at your
answer for a couple of lines in the image. The marker needs to be able to know
that you understand the example, and what it is of.
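If you want to double-check your counts, here is a rough Python sketch of Run Length Encoding for one row of a black-and-white image. It assumes the convention used in the field guide, where each row's counts start with the white pixels, and the example row is made up:

```python
def run_length_encode(row):
    # Encode a row of "0" (white) and "1" (black) pixels as run lengths.
    # Runs alternate starting with white; if the row starts with black,
    # the first run length is recorded as 0.
    runs = []
    current = "0"
    count = 0
    for pixel in row:
        if pixel == current:
            count += 1
        else:
            runs.append(count)
            current = pixel
            count = 1
    runs.append(count)
    return ",".join(str(r) for r in runs)

row = "011000011000"
encoded = run_length_encode(row)
print(encoded)                       # 1,2,4,2,3
print(len(row), "characters before,", len(encoded), "after (commas included)")
```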
Remember that you only need to do one of the three encoding topics
(compression, encryption, and error control coding) to the excellence level. If
you are either not interested in getting more than achieved, or are doing either
encryption or compression to the excellence level, then this is the right guide
for you. If you were wanting to do error control coding up to the excellence
level, then you should select the alternate excellence guide instead.
In order to fully cover the standard, you will also need to have done projects
covering the topics of encryption and compression to at least the achieved
level (with one of them to the excellence level if you are attempting to get
more than achieved), and projects covering the topics of representing data
using bits and human computer interaction, and include these in your report.
Overview
Encoding has the following bullet points in achievement standard 2.44 which
this guide covers.
As with all externally assessed reports, you should base your explanations
around personalised examples.
Reading from the Computer
Science Field Guide
You should read and work through the interactives in the following sections of
the CS Field Guide in order to prepare yourself for the assessed project.
Read all of these sections, as they give the necessary introduction of the topics
Project
Start this section by writing an introduction to the topic of error control coding.
Briefly explain what error control coding is, what it is used for, and what kinds
of problems would exist if there was no such thing as error control coding.
Briefly describe what a check digit is, and how it fits into the larger topic of
error control coding. This introduction only needs to be a few sentences - you
are just showing the marker that you understand the bigger picture of what
error control coding is, and some of the typical uses of it.
Now you are going to show an example of error control coding in action to
include in your report. Get the packaging for a food you like, ensuring it has a
barcode on it. Take a photo of the barcode and include it in your report, with a
small caption saying what the product is and that you are going to use it to
investigate check digits.
Enter the barcode number into the interactive in the field guide, and take a
screenshot of the interactive showing that the barcode number is valid. Change
one digit of the barcode number in the interactive and show that the interactive
now says it is invalid.
Briefly describe how the barcode number checker interactive was able to
determine whether or not the barcode number was valid.
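The check works because of how the check digit is calculated. Below is a rough Python sketch of the standard EAN-13 rule (digits are weighted alternately by 1 and 3, and the total must be a multiple of 10); the barcode number shown is a made-up but valid example, and you should test your own product's number instead:

```python
# Check a 13-digit barcode number (EAN-13): weight the digits alternately
# by 1 and 3; the total, including the check digit, must end in 0.
def is_valid_ean13(barcode):
    digits = [int(d) for d in barcode]
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return total % 10 == 0

print(is_valid_ean13("9300675036009"))  # a valid example number -> True
print(is_valid_ean13("9300675036008"))  # one digit changed -> False
```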
Remember that you only need to do one of the three encoding topics
(compression, encryption, and error control coding) to the excellence level. If
you are either not interested in getting more than achieved, or are doing either
encryption or compression to the excellence level, then this is the right guide
for you. If you were wanting to do error control coding up to the excellence
level, then you should select the excellence guide that uses check digits
instead. Note that we do not currently have an excellence guide for parity.
In order to fully cover the standard, you will also need to have done projects
covering the topics of encryption and compression to at least the achieved
level (with one of them to the excellence level if you are attempting to get
more than achieved), and projects covering the topics of representing data
using bits and human computer interaction, and include these in your report.
Overview
Encoding has the following bullet points in achievement standard 2.44 which
this guide covers.
As with all externally assessed reports, you should base your explanations
around personalised examples.
Reading from the Computer
Science Field Guide
You should read and work through the interactives in the following sections of
the CS Field Guide in order to prepare yourself for the assessed project.
Read all of these sections, as they give the necessary introduction of the topics
Project
Start this section by writing an introduction to the topic of error control coding.
Briefly explain what error control coding is, what it is used for, and what kinds
of problems would exist if there was no such thing as error control coding.
Briefly describe what parity bits are, and how they fit into the larger topic of
error control coding. This introduction only needs to be a few sentences - you
are just showing the marker that you understand the bigger picture of what
error control coding is, and some of the typical uses of it.
You will need to choose somebody else (e.g. a classmate, or even somebody at
home) to be your helper. You take the role of the magician, and they take the
role of laying out the initial grid (before you add parity bits) and flipping a card
while you are not looking.
Carry out the parity trick with your helper, taking the following photos.
• The original 5x5 grid that your helper lays out (you might want to take this
photo at the very end if your helper hasn't seen the trick before - you don't
want to draw attention to what you are doing and ruin it for them!)
• The grid after you have added the parity bits.
• The grid after your helper has flipped a card (edit the image afterwards,
circling the flipped card).
For each of your photos, you will need to write (briefly, remember this is only at
the achieved level) what was done. Where you added the parity bits, you
should describe how you knew which way up to put the cards. For the final
photo where you identify the flipped card, you should describe how you knew
which it was.
Finish this section by briefly (one or two sentences) describing what the cards
represent, and what flipping a card represents. This should be straightforward -
you are simply linking the parity trick back to actual computers.
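If you would like to see the whole trick end to end, here is a rough Python sketch of it (the grid is random each run, and the 5x5 size matches the version described above):

```python
# The parity card trick: lay out a random 5x5 grid of cards (1 = black
# side up, 0 = white side up), add a 6th row and column of parity cards
# so every row and column has an even number of 1s, then find a single
# flipped card by spotting the row and column whose parity is now odd.
import random

size = 5
grid = [[random.randint(0, 1) for _ in range(size)] for _ in range(size)]

# Add a parity bit to the end of each row, then add a parity row.
for row in grid:
    row.append(sum(row) % 2)
grid.append([sum(grid[r][c] for r in range(size)) % 2 for c in range(size + 1)])

# The helper secretly flips one card.
r, c = random.randrange(size + 1), random.randrange(size + 1)
grid[r][c] ^= 1

# The magician finds it: the odd row and odd column cross at the flip.
flipped_row = next(i for i, row in enumerate(grid) if sum(row) % 2 == 1)
flipped_col = next(j for j in range(size + 1)
                   if sum(grid[i][j] for i in range(size + 1)) % 2 == 1)
print("Found the card at row", flipped_row, "column", flipped_col,
      "- actually flipped:", (r, c))
```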
Remember that you only need to do one of the three encoding topics
(compression, encryption, and error control coding) to the excellence level. If
you are either not interested in getting more than achieved, or are doing either
compression or error control coding to the excellence level, then this is the
right guide for you. If you were wanting to do encryption up to the excellence
level, then you should select the RSA cipher assessment guide instead. Note
that there is no excellence level assessment guide for the Caesar Cipher - while it is
great for the achieved criteria, it is unsuitable for merit/ excellence due to not
being used in the real world.
In order to fully cover the standard, you will also need to have done projects
covering the topics of compression and error control coding to at least the
achieved level (with one of them to the excellence level if you are attempting
to get more than achieved), and projects covering the topics of representing
data using bits and human computer interaction, and include these in your
report.
Overview
Encoding has the following achieved bullet points in achievement standard 2.44
which this guide covers.
Read all of these sections, as they give the necessary introduction of the topics
Substitution Ciphers
If you are really keen, you might like to read further into the problems with
substitution ciphers, although note that this is optional because it is not
necessary for the project in this guide.
Project
Start this section by writing an introduction to the topic of encryption. Briefly
explain what encryption is, what it is used for, and what kinds of problems
would exist if there was no such thing as encryption. This introduction only
needs to be a few sentences - you are just showing the marker that you
understand the bigger picture of what encryption is, and some of the typical
uses of it.
Now you are going to make an example of encryption being used to encrypt a
sentence. For this example, you should use the Caesar Cipher. Despite it not
having been used in practice for a very long time, it is ideal for illustrating the basic
ideas of encryption. Your example should consist of the following components,
all of which should be included in your report.
1. Start by writing a short sentence (a few words will be enough, and keep it
appropriate for your report). You will be encrypting this sentence.
2. Choose a number between 1 and 25 that will be your encryption key.
3. Make a conversion table that shows how each letter in your original
sentence should be changed using your encryption key (format the
conversion table to 3 columns in your report to make it compact and easy
to read)
4. Encrypt your original sentence using the conversion table you have
generated.
Briefly describe what you have done throughout the various parts of your
example, and be sure your descriptions include the terms "plain text", "key",
and "cipher text" in the relevant places.
In order to fully cover the standard, you will also need to have done projects
covering the topics of Representing Data and Encoding, and included these in
your report.
Overview
Human Computer Interaction has the following achieved bullet points in
achievement standard 2.44, which this guide covers.
As with all externally assessed reports, you should base your explanations
around personalised examples, so that the marker can be confident that your
report is your own work.
In your report, do not:
• List all the heuristics (it's not necessary and wastes space/ bulks out the
report unnecessarily)
• List examples you were given in class or read about on the Internet (the
marker is only interested in the heuristic violations you identify and
classify).
Merit/ Excellence
Because this is an achieved only guide, you should follow the merit/
excellence guide for HCI instead if you are aiming for higher than achieved.
In order to fully cover the standard, you will also need to have done projects
covering the topics of Representing Data and Encoding, and included these in
your report.
Overview
Human Computer Interaction has the following bullet points in achievement
standard 2.44, which this guide covers.
As with all externally assessed reports, you should base your explanations
around personalised examples, so that the marker can be confident that your
report is your own work.
Interface Usability
Then read the section on usability heuristics. You will need to understand this
material well.
Usability Heuristics
Project
Achieved
Everybody has experienced times where a computer system does not do what
they expect and/ or they cannot get it to do what they want. A natural reaction
to this for many people is to blame themselves. However, in almost every case,
the real problem is with the design of the interface. In the field guide or in
class, you will have learnt a lot about Nielsen's heuristics. If you think back on
some of the times where you've had difficulty with a computer system, you'll
probably be able to identify which of Nielsen's heuristics were violated.
To start this section of your report, you should write a brief (around half a page)
introduction to Nielsen's heuristics, using a couple of examples (preferably two
good ones, but no more than three) you have come across in your day to day
life to illustrate your introduction.
In your report, do not:
• List all the heuristics (it's not necessary and wastes space/ bulks out the
report unnecessarily)
• List examples you were given in class or read about on the Internet (the
marker is only interested in the heuristic violations you identify and
classify).
Merit/ Excellence
Next, you are going to carry out a usability evaluation using heuristics. If you
did 1.44, this might seem a lot like what you did then. However, you should
keep in mind the following points:
• You must use heuristics. While in 1.44 the heuristics were optional, in 2.44
they are a requirement.
• You only need to evaluate one interface.
• Excellence is not comparing multiple interfaces, but instead using your
own intuition to suggest improvements which would address the usability
problems you identify.
Once you have three or four issues written down and know which heuristic was
violated (make sure there are at least two different heuristics across all the
issues you identify), you can stop.
Writing about the usability evaluation in your
report
Put a heading that says something like "Usability evaluation" to make it really
clear to the marker that you are now doing your usability evaluation. Start the
section by writing a paragraph stating which device you chose and which tasks
you carried out. Also briefly explain how you carried out your evaluation.
Then, for two or three of the issues you identified, write a paragraph that
covers what the issue was, which heuristic it violates, and how you would address it.
The length of the paragraphs will depend largely on how concisely you can
write. For this reason, keep in mind that more is not necessarily better.
You might like to include images of usability issues (e.g. strange layouts of
buttons, strange icons, etc), but be sure these don't take up more than 1 page
in total, are clear, and contribute directly to your explanations. Images are not
required, so don't include them if you don't need them.
Finish up with a conclusion of two or three sentences. Was the interface a good
design overall? Why did you reach this conclusion? While the conclusion is
supposed to be your opinion, keep your focus on the usability of the interface.
Stay away from external factors, such as your view towards the company as a
whole. For example, saying "It sucks because it is Internet Explorer" is not what
the marker is looking for. Instead, say "Internet Explorer was difficult to use
because ...".
Keep in mind that these are maximums. Concise writing will make it easily
possible to cover the requirements in less space.
In order to fully cover the standard, as well as one more investigation on data
representation for another type of data, you will need to do projects covering
the three encoding topics up to the achieved level (error control coding,
encryption, and compression), follow one of the three coding methods to the
excellence level, do a project covering human computer interaction, and
include these in your report.
Overview
The topic of Data Representation has the following bullet points in achievement
standard 2.44, which this guide covers.
This guide focuses on one of the types of data you will need to cover (you will
need to cover two).
As with all externally assessed reports, you should base your explanations
around personalised examples.
Project
Writing an Introduction to Data Representation
You only need to do this if you have not already done it (the other
guide you follow for data representation will also tell you to write this
intro)
This explanation must be in your own words, based on what you understood in
class (e.g. do not paraphrase a definition).
Choose two text samples, which you will use to explain Unicode and the
representations for it. One text sample should be in English, and the other
should be in an Asian language, for example Japanese. The text samples should
be no longer than 50 characters long. You should be able to find a suitable text
sample online, for example by visiting a Japanese forum. Check it in Google
Translate to ensure it is appropriate for your report.
If you are confused, reread the Unicode section in the CS Field Guide or ask
your teacher for help.
These should not take up more than ⅓ of a page in your report, for both of
them. If they take more space, then use smaller samples.
Showing examples of UTF-32 representation
Show how the first character in each of your samples is represented using
UTF-32. Briefly explain what UTF-32 does. Then show how the same characters
are represented using UTF-8, following these steps (a sketch illustrating them
comes after the list):
1. Look up the character in the Unicode table to get its Unicode number
2. Convert the Unicode number into binary
3. Look at the UTF-8 conversion table to find the correct pattern to use
4. Fill in the blanks in the pattern with the bits in the character's binary
representation
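Here is the sketch mentioned above. It shows the first character of a sample as UTF-32, then builds the UTF-8 encoding by hand using the 3-byte pattern; 'あ' is a stand-in for the first character of your own Asian-language sample, and Python's built-in encoder is used only to check the hand-built answer:

```python
# Show the first character of a text sample as UTF-32 and UTF-8.
char = "あ"                                   # stand-in; use your own sample
code_point = ord(char)                        # its Unicode number
print("Unicode number:", code_point, hex(code_point))
print("UTF-32:", format(code_point, "032b"))  # always 32 bits

# UTF-8, built by hand with the 3-byte pattern 1110xxxx 10xxxxxx 10xxxxxx
# (characters between U+0800 and U+FFFF use this pattern):
bits = format(code_point, "016b")
utf8 = "1110" + bits[:4] + " 10" + bits[4:10] + " 10" + bits[10:]
print("UTF-8 by hand:", utf8)

# Check against Python's built-in encoder:
print("UTF-8 built-in:",
      " ".join(format(b, "08b") for b in char.encode("utf-8")))
```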
Include a table in your report that shows the size for each text sample, with
each representation.
Explain your findings. Which seems to be the best for English text? Which
seems to be the best for Asian text? Why is this the case?
(optional) If you are keen, you might like to read more about UTF-16 and
UTF-32 to try and figure out why you got the results that you did.
This (including the table) should not take more than ⅔ of a page in your report.
Hints for success
• Your 2.44 report should be structured in the following way. Note that the
bolded parts are recommended headings. Details have not been included
for sections covered in other guides. Look carefully at the Representing
Data section to ensure you have structured your report properly. This will
help the marker find what they need to find.
• Representing Data
• Representing Text
• Encoding
In order to fully cover the standard, as well as one more investigation on data
representation for another type of data, you will need to do projects covering
the three encoding topics up to the achieved level (error control coding,
encryption, and compression), follow one of the three coding methods to the
excellence level, do a project covering human computer interaction, and
include these in your report.
Overview
The topic of Data Representation has the following bullet points in achievement
standard 2.44, which this guide covers.
This guide focuses on one of the types of data you will need to cover (you will
need to cover two).
As with all externally assessed reports, you should base your explanations
around personalised examples.
Project
Writing an Introduction to Data Representation
You only need to do this if you have not already done it (the other
guide you follow for data representation will also tell you to write this
intro)
This explanation must be in your own words, based on what you understood in
class (e.g. do not paraphrase a definition).
Representing Positive and Negative Numbers in
Binary
This is the main part of your project.
This section will be around 2 pages in your report. There are 4 parts to it.
You will need to pick three numbers, which you will use to illustrate the
various ways of representing numbers in binary. Your chosen numbers should
not be in field guide examples.
The numbers should be within the ranges, to ensure they work with all the
representations you'll be using them for.
This section should not take up more than ¼ of a page. Keep it very brief, and
you should be able to explain it in no more than 1 - 2 sentences.
Common pitfall warning: Don't forget to remove the sign bit before
calculating the Two's Complement representation!
The field guide showed a few examples of adding and subtracting binary
numbers using a simple sign bit, and then Two's Complement. These examples
illustrated that Two's Complement is far easier to work with than a simple sign
bit. You'll now be doing your own calculations to illustrate this point.
Start by ensuring you have all the following binary representations ready to use
(some of them you will not have done yet so will need to do now, but do not
show any more conversions to binary in your report).
All the numbers should be 8 bits long. Add leading 0's if needed, but put them
after the simple sign bit if there is one.
Use this first calculation to show how two positive binary numbers can be
added.
This calculation shows whether or not subtraction can be done with two
positive numbers using simple sign bits, without making a special case for the
sign.
This calculation shows what happens when a bigger number is subtracted from
a smaller number.
Hint: You should find that calculation 1 works (it's just adding positive
numbers, so no surprise), calculation 2 fails (the sign bits mess up the
calculation), and calculations 3 and 4 work (thanks to Two's Complement).
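If you want to check your calculations, here is a rough Python sketch contrasting the two representations (38 and -17 are made-up values; substitute your own chosen numbers):

```python
# 8-bit addition with a simple sign bit versus Two's Complement.

def to_twos_complement(n, bits=8):
    # Negative numbers are stored as 2**bits + n (e.g. -17 -> 239).
    return format(n % (1 << bits), f"0{bits}b")

a, b = 38, -17

# Simple sign bit: sign in the leftmost bit, magnitude in the other 7.
sign_a = "0" + format(38, "07b")    # 00100110
sign_b = "1" + format(17, "07b")    # 10010001
naive_sum = int(sign_a, 2) + int(sign_b, 2)
print("Sign-bit add gives", format(naive_sum, "08b"), "- the wrong answer")

# Two's Complement: ordinary binary addition just works (drop the carry).
tc_a, tc_b = to_twos_complement(a), to_twos_complement(b)
tc_sum = (int(tc_a, 2) + int(tc_b, 2)) % 256   # keep only 8 bits
print(tc_a, "+", tc_b, "=", format(tc_sum, "08b"))  # 00010101 = 21
```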
This section should not take more than 1 page in your report.
Hints for success
• Your 2.44 report should be structured in the following way. Note that the
bolded parts are recommended headings. Details have not been included
for sections covered in other guides. Look carefully at the Representing
Data section to ensure you have structured your report properly. This will
help the marker find what they need to find.
• Representing Data
• Representing Numbers
• Encoding
You will need to do one more project in data representation in addition to this
one, because the standard requires you to cover two types of data. Note that
there are no excellence requirements for data representation --- it only goes
up to merit. This guide is called an excellence guide though to avoid
confusion about whether or not it is suitable for students aiming for excellence.
In order to fully cover the standard, you will also need to have done one more
project on data representation, projects covering the three encoding topics up
to the achieved level (error control coding, encryption, and compression), and a
project covering human computer interaction, and included these in your
report.
Overview
The topic of Data Representation has the following bullet points in achievement
standard 2.44, which this guide covers.
This guide focuses on one of the types of data you will need to cover (you will
need to cover two).
As with all externally assessed reports, you should base your explanations
around personalised examples.
Reading from the Computer
Science Field Guide
• Representing whole numbers with Binary - It is important that you
understand this section really well before going any further, as every other
concept is based on it.
• Images and Colours - You will need to read this entire section well, as the
project is based on it.
Project
Writing an Introduction to Data Representation
You only need to do this if you have not already done it (the other
guide you follow for data representation will also tell you to write this
intro)
This explanation must be in your own words, based on what you understood in
class (e.g. do not paraphrase a definition)
This section will be around 1 to 1½ pages in your report. There are 4 parts to it.
1. Choosing a colour that you like, explaining how it is represented with red,
green, and blue components, explaining why computers represent colours
in this way.
2. Converting your chosen colour to its 24 bit binary representation,
explaining the process clearly.
3. Showing how your colour would be represented with 8 bits, and explaining
whether or not the colour is still the same as it was with 24 bits, and why.
4. Discussing why we commonly use 24 bit colour as opposed to 8 bit colour
or 16 bit colour, but not a higher number of bits, such as 32 bits or more.
Show the 24 bit representation for your chosen colour, clearly explaining the
process you used to get to it.
Explain whether or not your colour looks different with just 8 bits. How many
different colours can be represented with 8 bits? And what about 24 bits? Why
is it impossible to represent every colour that can be represented with 24 bits,
with just 8 bits?
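If you would like to experiment beyond the interactives, here is a rough Python sketch (the colour values are made up, and the 3-3-2 split shown is just one common way of squeezing a colour into 8 bits):

```python
# A (made-up) colour with red=145, green=60, blue=229, shown as 24-bit
# binary (8 bits per component) and then reduced to 8 bits using a
# 3-3-2 split (3 bits red, 3 green, 2 blue).
red, green, blue = 145, 60, 229

bits24 = format(red, "08b") + format(green, "08b") + format(blue, "08b")
print("24 bit:", bits24)

# Keep only the highest few bits of each component.
bits8 = format(red >> 5, "03b") + format(green >> 5, "03b") + format(blue >> 6, "02b")
print(" 8 bit:", bits8)

# Converting back shows how much precision was lost.
print("approximately:", (red >> 5) << 5, (green >> 5) << 5, (blue >> 6) << 6)
```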
The field guide contains a few ideas, and you may find the interactives useful
for experimenting, although you should do some of your own reading as well.
Wikipedia is a good place to start, although remember not to paraphrase
information. Put it into your own words, and relate it back to your own
understanding of representing colours.
Be sure that your final report is printed in colour, if you include images.
• Representing Data
• Representing Colours
• Encoding
In order to fully cover the standard, you will also need to have done a project in
one other 3.44 topic. The other project should be in either Complexity and
Tractability, Artificial Intelligence, Formal Languages, Network Protocols, or
Graphics and Visual Computing.
For this project, students will need to interview somebody from the
software engineering industry about what software methodologies (Agile
and Waterfall) their company uses. It would be best to pick a company that
uses Agile (which is now very common). They will also need a couple of
examples of projects they have done themselves: one where they used an
Agile approach, and one where they used a Waterfall approach. The
projects could either be ones they do this year, or ones they have done in
the past.
The teacher should arrange for the class to visit a local software company
where there is a software engineer they can interview, or for a software
engineer to visit the class. The students should prepare “interview”
questions before the visit, to ask the software engineer in order to get the
information they need to complete their report.
To get an example of an Agile project the student did: some of the level 3
digital technology standards, such as 3.46 or 3.43, give students an
excellent opportunity to put agile processes into practice. If they were to
approach these standards using a Waterfall approach (by trying to design
their entire project and then implementing their entire project, and then
testing their entire project), they would be risking wasting a lot of time
unnecessarily or completely failing at them when something takes longer
than expected or doesn’t go as planned. Instead, it is far better to work in
small increments, designing and then implementing small pieces of
functionality so that they always have something that works that they can
show to their teacher. It is possible they are already using Agile processes
in this way without realising it.
By including both a real world project a software engineer told them about
and some of their own projects, students will be able to both consider the
real world issues, and have a deeper personalised understanding of the
issues than if they were only told about them by the software engineer.
This approach to the standard seems to have worked well for students in
the past.
Students who choose this project for 3.44 will need to have the following
skills (keep this in mind when picking projects, because different projects
require different main skills):
These skills are also important for software engineers, for very similar
reasons. It isn’t just about programming!
Overview
Note that a plural means “at least two” in NZQA documents. Because this is
one of the two areas of computer science that the student must cover, most of
the plurals effectively become singular for this project, which is half their
overall report. This project will give them at least one algorithm/technique and
practical application, and the other project they do will give them another of
each.
Project
This project is based on the “Software Processes” project in the CS Field Guide.
Your teacher will arrange for a software engineer to visit your class or for your
class to visit a local software company. In this visit, you will interview a software
engineer to find out about the software development practices their company
uses and how this impacts their work, in particular Agile processes. This guide
assumes an agile process is used by the company interviewed, because this is
generally used in large and innovative software projects - the projects that are
the most challenging to bring to success. Before the interview, you need to
come up with a list of interview questions which will help you to get the
information that you need to complete this report.
In addition, you should also have your own experience using Agile processes
(and Waterfall processes). You might have even used a form of Agile without
realising it! Being able to discuss these experiences and how they were
impacted by the use of Waterfall and Agile will be valuable for your report.
Some ideas are: Working on a large program or website, you might have
designed, implemented, and tested small parts of the functionality at a time, as
opposed to trying to design your entire project, then implement it, and then
test it.
In other technology projects, you might have used the Waterfall Process. For
example, in hard materials technology, you would typically do analysis to get
ideas, and then come up with several designs, and then pick one to actually
build. In other digital technology projects, you might also have used the
Waterfall Process. In some cases, this would have worked well (if the project
was simple for you and you fully understood it). In other cases, you may realise
that an Agile Process would have been more effective.
Follow the page limit recommendations, as these will help you to ensure you
have enough room in your report for everything. Your goal should be to write
the best possible explanations and discussions you can within the page limits.
Try to write concisely and to only include the most interesting points (you don’t
need to include everything you know on the topic). Some points also have
paragraph recommendations, to help prevent you writing more than you
needed to.
The kinds of things you should include are: What is the key problem you are
discussing in your report? Start the first sentence of your report with “A key
problem in software engineering is …”. Also make sure you include some of the
risk factors for projects failing (e.g. new ideas, size, characteristics of the
engineers working on it…). This will ensure the marker can easily tick off bullet
point A1. (1 paragraph is enough)
What is some motivation for being interested in the key problem? e.g. what
interesting software disasters have you read about? What are the
consequences of not paying attention to it? (1 paragraph, or possibly 2 short
paragraphs should be enough) What company did you visit (or that visited
you), and what kind of software do they make? What methodology do they use?
(just 1 short paragraph is enough)
What are your own projects that you will be discussing in your report? Describe
each project in one sentence, and state (without defining or justifying) whether
it is your example of Agile or Waterfall. If it wasn’t purely one or the other, state
what it is closest to, and you can discuss why later in your report (i.e. no details
are required at this point).
This information is mostly to satisfy the first bullet point, and to convey key
information to the marker about your personalised examples that your report
uses.
Explaining how Agile Processes and the Waterfall
Process are applied (Achieved/ Merit/ Excellence)
[A2, M1]
Additional Information: This section mostly covers M1, but will help with
A2
Even students that are only aiming for achieved should do this part. It will
help count towards A2 as well. Note that forgetting to address this bullet
point is a common reason that students who have otherwise done great
work did not get excellence. They could potentially include it in the next
section, although we recommend doing it explicitly here, unless they are a
very good writer. Students who mix this section and the next one should be
very careful that their report does satisfy all the criteria, in particular
ensuring that they have clear examples of how Agile and Waterfall are put
into practice (that is required for M1 and good students frequently overlook
it).
The point of this section is to define what the Waterfall process and Agile
processes are, and give some specific examples of how Agile methodologies
are actually implemented in practice; that is, what specific practices mean that
the overall methodology the software engineer you interviewed uses is Agile?
And what practices did you use in your own project(s) that made them Agile?
You don’t need to go into too much depth here; you just need to give several
examples of practices, and make sure you have explained them, so that it is
clear to the marker what they mean. (For example, just saying “X company
does development in sprints” is not enough. You need to explain what a sprint
actually is). You should also include 1 or 2 examples from one of your own
Waterfall projects (e.g. the order in which you did the phases in it).
• Explain what the Waterfall process is, and the key steps in it.
• Explain what Agile processes are, and the key steps in them, ensuring that
your explanation makes clear why they are different to the Waterfall process.
• Give a list of examples of ways in which the Agile and Waterfall methods
are put into practice. Each example should be something you did in one of
your projects, or from what the software engineer you interviewed told
you. State what the practice is, which of your example projects it is from,
and explain what it means (especially where jargon is involved). Don’t go
into depth about the implications, you should do this in the next section.
The information for this section could safely be presented as bullet points,
as long as the bullet points are each 2 to 3 full sentences.
These points could be in the form of something like “xxxx company has
stand up meetings every morning. These go for xxxx minutes, and are so
that everybody in the team can catch up with what each other is doing,
and xxxx”. The key idea is that they are short, but long enough for the
student to briefly explain, and they are personalised based on the situation
they have heard about.
In this section, you now need to pull everything together, and discuss how your
example projects were impacted by whatever methodology was used.
Remember that your example projects are the one that the software engineer
you interviewed was working on, and a couple of the projects you have done
yourself (one Agile and one Waterfall). The way you go about this is very open
ended, although some key points you might like to consider when discussing
each project are:
• What the project involves, and its characteristics (size, novelty, number of
people involved in it, etc).
• Some examples of the Agile/ Waterfall processes in action in the projects
(this is more in-depth than the previous section, where you were just
trying to define what the Waterfall Process and Agile Processes actually
are).
• Whether or not the process being used is effective for the project.
• What the positives of the process being used for the project are.
• What the downsides of the process being used are for the project.
• What you would do differently (with regards to Waterfall and Agile) if you
were to do the project again (for the ones that are your projects).
• What would happen if the company you interviewed used Waterfall instead
of Agile? Would they still be able to develop their software effectively? Do
they consider Agile to be effective? Is there anything about Agile that still
isn’t ideal for them? (The questions you ask them when you interview
them will be essential for this.)
• How well does the Agile process address the key problem you outlined in
your introduction?
• What shortcomings does the Agile process have? What kinds of ideas are
being considered to address them?
• Are there any modifications the company you interviewed are considering
to address issues in their own Agile Process?
In order to fully cover the standard, you will also need to have done a project in
one other 3.44 topic. The other project should be in either Software
Engineering, Artificial Intelligence, Formal Languages, Network Protocols, or
Graphics and Visual Computing.
Overview
Each project needs to satisfy all bullet points in the standard, which are given
below.
Note that a plural means “at least 2” in NZQA documents. Because this is one
of the two areas of computer science that the student must cover, most of the
plurals effectively become singular for this project, which will make up half of
the overall report. This project will give at least one algorithm/technique and
practical application, and the other project will give another of each.
Note that the term heuristics means “rules of thumb”, and that this should
not be confused with the heuristics covered in Human Computer
Interaction.
In HCI, they are also rules of thumb (for making interfaces usable). The
heuristics in this context are “rules of thumb” for solving the algorithmic
problems encountered in complexity and tractability; a heuristic isn't
guaranteed to give the best possible solution, but usually will give a fairly
good result.
Project
This project is based around a fictional scenario where there is a cray fisher
who has around 18 cray pots that have been laid out in open water. Each day
the fisher uses a boat to go between the cray pots and check each one for
crayfish.
The cray fisher has started wondering what the shortest route to take to check
all the cray pots would be, and has asked you for your help. Because every few
weeks the craypots need to be moved around, the fisher would prefer a general
way of solving the problem, rather than a solution to a single layout of
craypots. Therefore, your investigations must consider more than one possible
layout of craypots, and the layouts investigated should have the cray pots
placed randomly i.e. not in lines, patterns, or geometric shapes.
In this project, you will need to make two maps, a “small” map with 7 or 8
craypots and a “large” map, with 15 to 25 craypots.
Briefly introduce the Craypot Problem. You should be able to explain how the
Craypot Problem is equivalent to the Travelling Salesman Problem. Briefly
describe how you determined this, and what the equivalent of a town and road
is in the Craypot Problem. Explain why computer scientists are so interested in
the Travelling Salesman Problem. Include this introduction at the start of the
complexity/ tractability section of your report. We strongly recommend the
following sentence being at the start or end of the introduction “The key
problem I am looking at is [a few words describing the problem]”, so that it is
really clear to the marker.
Showing how the brute force algorithm can be applied to the Craypot
Problem + Explaining why it is not helpful to the cray fisher + Showing
how a greedy heuristic algorithm can be applied to the Craypot Problem +
Explaining what kind of solution the greedy heuristic algorithm has found,
and why this is more helpful to the cray fisher (A2/M1/M2).
Note that the difference between achieved and merit will be in the quality
of the explanations (i.e whether or not the marker considers them to be
“describing” or “explaining”). Generating the personalised examples is
necessary for achieved, because the marker needs to be able to see that
the student has done their own work.
Generate a map with 7 or 8 craypots using the random map generation method
described above. This is your “small” map. Then make a map with somewhere
between 15 and 25 craypots. This is your “large” map. Make a copy of each of
your maps, as you will need them again. Read the “hints for success” at the
bottom of this guide before making the maps, because it has some advice on
making the maps so that they are legible and minimise the usage of precious
space in your report.
Using your intuition, find the shortest path between the cray pots in your small
map. Do the same with your large map. Don’t spend more than 5 minutes on
the large map - you don’t need to include a solution to the large map in your
report. It is extremely unlikely you’ll find the optimum for the large map
(Recognising the challenges in the problem is far more important than finding a
solution). Number the order in which you visit the cray pots on your map.
Use the field guide interactive to estimate how long it would take a computer to
solve each of your craypot problems and find an optimal solution.
Explain why using the brute force algorithm to find the optimal route is
unhelpful to the cray fisher (remember that they generally have between 20 -
25 cray pots in the water at a time). Save this explanation so that you can
include it in your report later. One paragraph is enough.
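If you are comfortable with Python, you can also get a feel for the brute force approach with a rough sketch like the one below. The map is randomly generated, the dock position is made up, and 8 craypots are used because 8! = 40320 orderings is still feasible to check:

```python
# Brute force on a small map: try every possible ordering of the craypots
# and keep the shortest route. With n craypots there are n! orderings,
# which is why this becomes hopeless as the map grows.
import itertools, math, random

random.seed(1)                      # so the example map is reproducible
dock = (0.0, 0.0)
craypots = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(8)]

def route_length(route):
    # Total straight-line distance travelling along the route in order.
    return sum(math.dist(route[i], route[i + 1]) for i in range(len(route) - 1))

best = min(itertools.permutations(craypots),
           key=lambda order: route_length((dock,) + order))
print("shortest route length:", round(route_length((dock,) + best), 2))
print("orderings checked:", math.factorial(len(craypots)))  # 40320 for 8 pots
```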
Unless your locations were laid out in a circle or oval, you probably found it
very challenging to find the shortest route. A computer would find it even
harder, as you could at least take advantage of your visual search and
intuition to make the task easier; a computer can only consider two locations
at a time, whereas you can look at several at once. But even for you, the
problem would have been challenging! Even if you measured the distance
between each pair of locations and drew labelled lines between them on the
map, so that you didn't have to judge distances in your head, it would still be
very challenging to figure out! It is clear the cray fisher isn't going to want to
wait for you to calculate the optimal solution. But can you still provide a
solution that is better than visiting the cray pots in a random order?
There are several ways of approaching this. Some are better than others in
general, and some are better than others with certain layouts. One of the more
obvious approximate algorithms is to start from the boat dock in the top left
corner of your map, go to the nearest cray pot, and from there repeatedly go
to the nearest cray pot that hasn't yet been checked. This approach is known
as the "Nearest Neighbour" algorithm, and is an example of a greedy heuristic
algorithm: it always makes the decision that looks best at the current time,
rather than making a less promising choice now in the hope of a bigger payoff
later. In the excellence part of this guide, you will explore why it is not always
optimal, even though it might initially seem like it is.
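The same idea as a sketch in code, under the same assumption that pots are
(x, y) points and that the dock is the starting location (the names here are
illustrative):

    # Nearest Neighbour heuristic: repeatedly visit the closest unchecked pot.
    from math import dist

    def nearest_neighbour(dock, pots):
        route, current, unchecked = [], dock, list(pots)
        while unchecked:
            closest = min(unchecked, key=lambda pot: dist(current, pot))
            unchecked.remove(closest)
            route.append(closest)
            current = closest
        return route

Notice that each step only looks one decision ahead; that is what makes the
algorithm greedy (and fast), and also what allows it to be led astray.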
For both your small map and large map, use this greedy algorithm to find a
solution (it shouldn’t take you too long). Number the order in which you visit
the cray pots on your map. A computer would be a lot faster than you at this,
so you should have a pretty good idea about how the two algorithms compare.
Explain which algorithm is more suitable for determining a route for the cray
fisher, and why. What are the implications of each choice that you considered
in arriving at your conclusion? (e.g. how long the fisher would have to wait for
the route vs how much time and fuel it saves going between cray pots)
Save this explanation so that you can include it in your report later. 2 to 3
paragraphs is enough.
You should now have 4 maps (small + brute force algorithm, large + brute force
algorithm, small + greedy heuristic algorithm, large + greedy heuristic
algorithm), and several explanations. You now need to put your findings into
report form. After your introduction, briefly explain how a computer would do a
brute force algorithm (hint: check the field guide; the principle of the algorithm
is simple, but it is very inefficient!). Next, include your two maps using the
brute force algorithm, and briefly explain why the large one was so challenging,
followed by your explanation on why this algorithm is no good for the cray
fisher. Then explain how the greedy heuristic algorithm you used works, and
include your two maps for it. Finally, include your explanation of which
algorithm was best for the Cray Pot Problem, and the
implications of each choice. All up, you should have around 2 to 3 pages (if you
have more, and are planning on attempting excellence, you might want to
consider shortening some parts or shrinking down images a little more).
• Analyse the greedy heuristic algorithm (e.g. cases where it finds a really
bad solution, and cases where it finds the optimal solution), and explore
how effective it would be in practice. One way you could do this is to do a
search in Google maps for something like supermarkets in a city (ideally
you'll want at least 20 to 30 to appear), and then, ensuring that the roads
are visible on the map, take a screenshot of it. Trace a greedy heuristic
algorithm path onto it, and then evaluate how effective it was. You will
probably find that some parts of the path make sense, although in other
parts it is inefficient because a destination was “skipped” when others that
were somewhat near it were visited, and the shortest path heuristic pulled
the path away from that destination. What kinds of heuristics would you
use to get a better solution? (e.g. could you somehow break the TSP into a
bunch of clusters, in which you visit everything in one cluster before moving
on to the next cluster?) You might want to include a second map showing your
other heuristic ideas. The maps should take up around ½ a page each.
Make sure to discuss your conclusions.
• Exploring and discussing why companies that carry out tasks such as
delivering goods (e.g. a soft drink company sending people around to
restock their many vending machines or a courier service delivering
parcels to various addresses) are willing to invest so much money in
finding better solutions to their own travelling salesman problems (and the
closely related problems that arise when additional constraints such as
speed limits and road blocks are added).
• Investigating and discussing real world examples of the Travelling
Salesman Problem e.g. a soft drink company sending people around to
restock their many vending machines or a courier service delivering
parcels to various addresses, or someone dropping off invitations to a
group of friends.
• Identifying and discussing some of the additional issues that come up in
real world examples, for example, traffic conditions, temporary roadblocks,
roads only going between some of the destinations, speed limits, police
checkpoints, congestion rules, and traffic lights. Once these are added in,
the problem is no longer strictly the Travelling Salesman Problem,
although it is still likely to be an intractable problem (although some of the
additional issues can actually make the problem easier to solve;
identifying these and exploiting them is another big goal for solutions).
• Evaluating some of the Android and iPhone apps that claim to help the
user with TSP style problems. Do they live up to the claims they make?
• This project is one that might be difficult to fit into the page limit because
of all the diagrams. You want to be able to make the diagrams as small as
possible, while ensuring they are still legible. It would be possible to get
the cray pot maps side by side in pairs to save space. Consider making
them vertical rather than horizontal (you will need to make this decision
before starting, because the numbering must be the right way up). This
means that you could place two side by side, taking up about half the
page in total. The total space of your four maps would be 1 page. The
following tips may also help.
• Use a fine tipped marker or pen that gives a solid line to draw the
dots and lines.
• Use a ruler to draw the lines.
• If scanning them makes any lines unclear, just redraw them with
image editing software.
• Use image sharpening or increase the contrast. It is okay if the image
is black and white.
• Don't save them as JPEGs (if this seems strange to you, read about
how JPEG compression works when you have some spare time).
• Remember that the marker wants to know about the applications, and the
use of algorithms to solve them. This is what the standard asks for, and
should always be kept in mind. Real world implications are important, i.e.
in practice, the optimal solution is not essential for the result to be useful.
• For excellence, you will need to do additional background reading. Be
careful what you include in your report; it is very obvious to the marker
when a student has copied text from a source that they don’t actually
understand. In particular, be sure that you understand the jargon you use
so that you can be certain you are using it correctly. Incorrectly used
jargon sounds really bad to a reader (e.g. marker) who knows the topic!
• Note that it is essential for Achieved, Merit, and Excellence that you start
with your own examples (e.g. craypot maps) and explain those, rather
than simply paraphrasing information from the field guide and other
sources.
Staying within the page limit
The page limit for 3.44 is 10 pages. Remember that you have to do a second
project on a different topic as well, so that leaves 5 pages for Complexity and
Tractability. We recommend aiming for 4½ pages so that if some sections go
slightly over, you will still be under 10 pages overall. Also, don’t forget you will
need a bibliography at the end.
• ½ page: introducing the key problem, the Cray Pot Problem, and why the
Cray Pot Problem is equivalent to the Travelling Salesman Problem
• 1 page: Craypot maps (Achieved/Merit/Excellence). Note that they are not
necessarily on the same page, although their total area should not exceed
1 page.
• 1 page: Explanations about the Craypot maps from the Achieved/Merit
part of the guide (Achieved/Merit/Excellence)
• 2 pages: Whatever you decide to do for excellence. This might include
diagrams or discussions. Don’t let diagrams take up more than 1 page
though, because you need to have an in depth discussion as well.
In order to fully cover the standard, you will also need to have done a project in
one other 3.44 topic. The other project should be in either Complexity and
Tractability, Intelligent Systems (Artificial Intelligence), Software Engineering,
Network Protocols, or Computer Graphics and Computer Vision. Anything
outside these topics is not part of the standard.
Overview
Note that a plural means "at least 2" in NZQA documents. Note that because
this is one of the two areas of computer science that the student must cover,
most of the plurals effectively become singular for this project, which is half
their overall report. This project will give them at least one algorithm/technique
and practical application, and the other project they do will give them another
of each.
You will also convert a regular expression to its equivalent finite state
automata, just as a computer does internally.
Important: If you are only attempting Achieved/Merit, you should only do
email addresses (or a similar type of input). It is far better to focus on doing
one type of input well rather than covering many different ones. Any other
inputs looked at should be with the intention of discussing and evaluating the
effectiveness of regular expressions/finite state automata.
[A2, M1, M2] - Bulk of report for Achieved/Merit. In total, this will take up
around 2 to 3 pages. Students should not go beyond 3 pages for this section
if they want to attempt excellence. A further page breakdown is
provided within the section.
• An email address contains two parts: the “local part” and the “domain
part”, separated with a @. They are in the form of “local part”@”domain
part”.
• The local part can be made up of any alphanumeric characters (i.e. all the
upper and lower case letters of the alphabet, and the 10 digits), "+", or
".". There cannot be multiple "." in a row.
• The domain part can be made up of any alphanumeric characters (i.e. all
the upper and lower case letters of the alphabet, and the 10 digits), and
".".
• The local part and domain part cannot start or end with a "+" or ".".
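To make the rules concrete, here is one possible regular expression for them,
sketched in Python; your own pattern may well look quite different. The (?!\.)
lookahead is just a convenient way of writing "a dot not followed by another
dot" (the language being matched is still regular).

    import re

    # Local part: starts and ends with an alphanumeric character, with "+"
    # and single "." allowed in between (the lookahead bans ".." sequences).
    local = r"[a-zA-Z0-9](?:(?:[a-zA-Z0-9+]|\.(?!\.))*[a-zA-Z0-9])?"
    # Domain part: alphanumeric chunks separated by single dots.
    domain = r"[a-zA-Z0-9]+(?:\.[a-zA-Z0-9]+)*"
    email = re.compile(local + "@" + domain)

    for address in ["[email protected]", "[email protected]", "[email protected]"]:
        print(address, bool(email.fullmatch(address)))  # True, False, False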
Once you think you have your regular expression correct, convert it into a finite
state automata (you can either do this by hand, or by using one of the tools
linked to in the field guide). Make sure the states of your finite state automata
are numbered (or have some kind of unique ID) as this will help you with your
later explanations. You will be including the regular expression and the finite
state automata in your report. The regular expression won’t take up much
space, although you should ensure that the finite state automata does not take
up more than half a page (and less if it is still legible when shrunk down
further).
This should be around one paragraph for the regular expression and one
paragraph for the finite state automata. All up, there should be about one page
once the regular expression, finite state automata, and two paragraphs of
explanation have been put together. Explain how you designed your regular
expression (i.e. what does each part of it mean?), and then explain how your
finite state automata is equivalent to your regular expression (i.e. which parts
of the regular expression map onto what parts of the finite state automata?)
Don't include this part in your report now; you will include some of it in the
next part (although you must do this testing to ensure you have a correct
regular expression/finite state automata!). Come up with some example email
addresses to trace through your regular expression or finite state automata by
hand. You should pick examples which are effective in showing that your
regular expression accepts valid inputs and rejects invalid inputs. If you find an
example that your regular expression does not handle correctly, then you
should try and fix it. Your invalid examples should include:
It is important that students come up with their own examples to test with.
This personalises the student's work. All up, the traces and working should
take around ½ to 1 page. There are several ways of approaching it. The
student can either refer to their regular expression or their finite state
automata in the explanation (the latter is probably easier, especially if the
states have been numbered), and they may use clearly labelled diagrams
as part of their explanation (e.g. a drawn on finite state automata). Any
diagrams must be clear and legible. Once you have finished testing and are
satisfied that the regular expression/ finite state automata is working
correctly, you should pick a couple of valid email addresses and three or
four invalid email addresses (e.g. your ones from above) and explain how
the finite state automata handles them. Explain how the finite state
automata is moved into various states by the email address input.
Remember that valid email addresses should end in a terminating state,
and invalid email addresses should either end in a non-terminating state, or
be unable to transition at all at some point. Make sure that each of these
worked examples clearly states what email address was used for it (use
headings or bold the email address).
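If it helps to double-check your hand traces, a finite state automaton can be
simulated in a few lines of code. The machine below is a toy example that
accepts an "a" followed by any number of "b"s (not the email automaton), and
the dictionary-of-transitions representation is just one convenient choice.

    # Simulate an FSA: map (state, input character) to the next state.
    transitions = {(0, "a"): 1, (1, "b"): 1}
    accepting = {1}  # state 1 is the terminating (accepting) state

    def accepts(text):
        state = 0
        for ch in text:
            if (state, ch) not in transitions:
                return False  # unable to transition at all: reject
            state = transitions[(state, ch)]
        return state in accepting  # must finish in an accepting state

    print(accepts("abb"), accepts("ba"))  # True False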
For excellence, you will investigate other types of inputs in order to identify
what kinds of problems regular expressions are good for, which they could be
used for but should not be, and which there is no regular expression for.
In most assessment guides, you are provided with a set of links and questions
for excellence. For this one, most online resources you find will be full of
strange mathematical symbols, and won't be helpful for writing your
report. So instead, we provide you with several interesting problems which
involve a type of input. These can be found at the end of the document in the
section “Regular Expression Problems for Excellence”. Before you start writing
the excellence part of your report, you should explore each of the problems by
trying to write a regular expression for them (some parts of them are easier
than the email address one), and thinking about why they are either efficient,
inefficient, or impossible to solve with a regular expression. Your writing for
excellence will be based on your own understanding and thoughts about the
problems. Focus on justifying and discussing your thoughts rather than
worrying about whether or not you have "correct" answers.
Make a table which summarises the problems you have looked at (there are
around 9 of them including the sub-problems - a list is included below to help
you), and specifies which are possible to solve with regular expressions and
which are not, and which should be solved with regular expressions and which
should not. For those which are possible to solve with regular expressions, and are
simple enough that they will fit onto one line, include the regular expression in
your table. For the ones that are possible but the regular expression is very
long, describe in a sentence or two what it would look like, or what kind of thing
it would have to do. The table will probably take up around ¾ of a page. It
provides a summary of the investigations, and gives something to refer to in
the discussion/ evaluation.
The problems in the table may include: (Remember to refer to the section at
the end for more details!)
Your written discussion will cover the excellence criteria and should ideally be
around 1 to 2 pages long. The key will be to write concisely. You will then write
a discussion/evaluation on your findings, which addresses the following key
points. The problems in the above table should be referred to.
On the surface, this might have seemed straightforward. But have you
managed to ensure all these invalid dates are detected?
• 30/02/2015
• 29/02/2014 (but remember that 29/02/2016 is valid!)
• 31/11/1998
• 21/13/2013
• 38/05/1992
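To see why, here is a sketch of a plausible first attempt in Python, assuming
dates written as dd/mm/yyyy as in the examples above. It checks each field's
range, but not the interaction between day, month, and leap year; catching
those cases with a pure pattern means spelling out the valid day ranges for
each month (and for leap years) case by case.

    import re

    # Day 01-31, month 01-12, any 4-digit year: necessary but not sufficient.
    date = re.compile(r"(0[1-9]|[12][0-9]|3[01])/(0[1-9]|1[0-2])/\d{4}")

    print(bool(date.fullmatch("30/02/2015")))  # True, despite being invalid!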
You should even be able to write a regular expression for each of the major
providers (Mastercard, Visa, etc.; look on Wikipedia to discover how you can
identify which is which) that will tell you whether or not a given credit card
number is from that provider.
But what about the check digits that you learnt about in 2.44? The 16th digit of
the credit card number is calculated from the first 15 digits. Why can this not
be done effectively using regular expressions? (Think about what the role of
regular expressions is) What would a regular expression that only accepted
valid credit card numbers (including the check digit) look like? (Hint: it is at
least possible, in theory. But you'd find it impossible to write the entire regular
expression by hand in your lifetime, as it is just too long…)
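For contrast, here is a sketch of the kind of check-digit calculation involved
(this is the Luhn algorithm used for credit card numbers, assuming a 16-digit
number given as a string of digits). Notice that it relies on arithmetic and a
running total, a kind of memory that a regular expression simply does not
have; a regex could only imitate it by spelling out an astronomical number of
cases.

    def luhn_valid(number):
        # Double every second digit from the right; if a doubled digit
        # exceeds 9, subtract 9; the total must be divisible by 10.
        total = 0
        for i, ch in enumerate(reversed(number)):
            digit = int(ch)
            if i % 2 == 1:
                digit *= 2
                if digit > 9:
                    digit -= 9
            total += digit
        return total % 10 == 0

    print(luhn_valid("4539148803436467"))  # a commonly used test number: True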
• The equation can be made up of any of the 10 digits (including multiple
digits in a row), and +, -, *, and /.
• The first and last characters must be digits.
• There can be any number of *, +, -, and /, although there cannot be two
of them in a row.
Try writing a regular expression that can check these inputs. Check it with a
few examples to ensure it is correct (you should not include the testing in your
report, although you should make sure the regular expression that will be going
in your report is correct. Testing is important!). This should not be too difficult
if you were able to do the email address one without difficulty.
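One pattern that satisfies these rules is a run of digits followed by any number
of "operator then digits" groups, so an operator can never appear first, last, or
next to another operator. A quick sketch in Python:

    import re

    equation = re.compile(r"\d+(?:[+\-*/]\d+)*")

    for text in ["12+34*5", "7", "1++2", "+12", "12-"]:
        print(text, bool(equation.fullmatch(text)))
        # True, True, False, False, False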
Now, add one extra rule: parentheses are allowed, as long as they are
"balanced", i.e. there must be the same number of opening parentheses as
closing ones, and if you scan across the string from left to right, the running
count of closing parentheses should never be higher than the running count
of opening parentheses, and at the end, the two counts should be the same.
There is no limit to the number of parentheses allowed.
To save you from going crazy, we'll tell you now that it is impossible. Unlike the
other problems so far, which were all solvable with regular expressions (even if
some solutions were terribly inefficient to write), this one is impossible. The
problem is trying to check that the parentheses are balanced. Try to make
sense of why it is impossible to use regular expressions to solve this, and what
happens when you attempt to do so.
To investigate it, forget about the digits and mathematical operators, and only
consider the parentheses, using the “balanced” rules above. For example, it
should accept strings like:
• ()
• (())
• (()())
• ((()())())
• (()()(()()()(((()))(()))(()())))
and reject strings like:
• (((( -- Not all of the opening parentheses are closed
• ) -- More closing than opening
• (())) -- Too many closing parentheses
• (())())( -- At one point there have been more closing than opening
• (()))(() -- At one point there have been more closing than opening
Try to make a regular expression or finite state automata that is able to check
for strings like these. Don't forget that there is no limit to the length of these
strings of parentheses. Even if you make a regular expression/finite state
automata able to recognise some patterns, it is impossible to get them all.
What happens when you try?
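To see what the regular expression is missing, compare it with a direct check,
sketched below. The counter can grow without limit, but a finite state
automaton has only a fixed, finite number of states to "remember" with, so
whatever FSA you build, there is always a string nested too deeply for it.

    def balanced(text):
        depth = 0
        for ch in text:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            if depth < 0:      # more closing than opening so far
                return False
        return depth == 0      # every opening parenthesis must be closed

    print(balanced("(()())"), balanced("(()))(()"))  # True False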
Algorithms
Overview
• EU 4.1 Algorithms are precise sequences of instructions for processes that
can be executed by a computer and are implemented using programming
languages.
• EU 4.2 Algorithms can solve many, but not all, computational problems.
• Algorithms
• Complexity and Tractability
Learning objectives
The above chapter readings include specific knowledge for EK's marked in bold.
Work to include unmarked learning objectives in the CS Field Guide is currently
in progress.
Overview
• EU 2.1 A variety of abstractions built on binary sequences can be used to
represent all digital data.
• EU 2.2 Multiple levels of abstraction are used to write programs or create
other computational artifacts.
• EU 2.3 Models and simulations use abstraction to generate new
understanding and knowledge.
• Data Representation
• Software Engineering - Layers of Abstraction
• Programming Languages
Learning objectives
The above chapter readings include specific knowledge for EK's marked in bold.
Work to include unmarked learning objectives in the CS Field Guide is currently
in progress.
Overview
• EU 1.1 Creative development can be an essential process for creating
computational artifacts.
• EU 1.2 Computing enables people to use creative development processes
to create computational artifacts for creative expression or to solve a
problem.
• EU 1.3 Computing can extend traditional forms of human expression and
experience.
Learning objectives
The above chapter readings include specific knowledge for EK's marked in bold.
Work to include unmarked learning objectives in the CS Field Guide is currently
in progress.
Overview
• EU 3.1 People use computer programs to process information to gain
insight and knowledge.
• EU 3.2 Computing facilitates exploration and the discovery of connections
in information.
• EU 3.3 There are trade-offs when representing information as digital data.
• Data Representation
• Coding
• Compression
• Encryption
• Error Control
• Human Computer Interaction
Learning objectives
The above chapter readings include specific knowledge for EK's marked in bold.
Work to include unmarked learning objectives in the CS Field Guide is currently
in progress.
Overview
• EU 7.1 Computing enhances communication, interaction, and cognition.
• EU 7.2 Computing enables innovation in nearly every field.
• EU 7.3 Computing has a global effect, both beneficial and harmful, on
people and society.
• EU 7.4 Computing innovations influence and are influenced by the
economic, social, and cultural contexts in which they are designed and
used.
• EU 7.5 An investigative process is aided by effective organization and
selection of resources. Appropriate technologies and tools facilitate the
accessing of information and enable the ability to evaluate the credibility
of sources.
Overview
• EU 5.1 Programs can be developed for creative expression, to satisfy
personal curiosity, to create new knowledge, or to solve problems (to help
people, organizations, or society).
• EU 5.2 People write programs to execute algorithms.
• EU 5.3 Programming is facilitated by appropriate abstractions.
• EU 5.4 Programs are developed, maintained, and used by people for
different purposes.
• EU 5.5 Programming uses mathematical and logical concepts.
• Programming Languages
• Software Engineering
• Human Computer Interaction
The AP CSP framework does not require a specific programming language, but
requires students to learn programming. Choose one or more of the options
from:
• Programming Introduction
Learning objectives
The above chapter readings include specific knowledge for EK's marked in bold.
Work to include unmarked learning objectives in the CS Field Guide is currently
in progress.
LO 5.1.1 Develop a program for creative
expression, to satisfy personal curiosity, or to
create new knowledge.
• EK 5.1.1A Programs are developed and used in a variety of ways
by a wide range of people depending on the goals of the
programmer.
• EK 5.1.1B Programs developed for creative expression, to satisfy
personal curiosity, or to create new knowledge may have visual,
audible, or tactile inputs and outputs.
• EK 5.1.1C Programs developed for creative expression, to satisfy
personal curiosity, or to create new knowledge may be developed
with different standards or methods than programs developed for
widespread distribution.
• EK 5.1.1D Additional desired outcomes may be realized
independently of the original purpose of the program.
• EK 5.1.1E A computer program or the results of running a
program may be rapidly shared with a large number of users and
can have widespread impact on individuals, organizations, and
society.
• EK 5.1.1F Advances in computing have generated and increased
creativity in other fields.
Overview
• EU 6.1 The Internet is a network of autonomous systems.
• EU 6.2 Characteristics of the Internet influence the systems built on it.
• EU 6.3 Cybersecurity is an important concern for the Internet and the
systems built on it.
Learning objectives
The above chapter readings include specific knowledge for EK's marked in bold.
Work to include unmarked learning objectives in the CS Field Guide is currently
in progress.
• EK 6.1.1H The number of devices that could use an IP address has grown
so fast that a new protocol (IPv6) has been established to handle routing
of many more devices.
• EK 6.1.1I Standards such as hypertext transfer protocol (HTTP), IP, and
simple mail transfer protocol (SMTP) are developed and overseen by the
Internet Engineering Task Force (IETF).
Algorithm
A step by step process that describes how to solve a problem and/or complete
a task, which will always give a result.
Alphabet
In formal languages, a list of characters that may occur in a language, or more
generally, a list of all possible inputs that might happen.
Binary Search
Searching a sorted list by looking at the middle item, and then searching the
appropriate half recursively (used for phone books, dictionaries and computer
algorithms).
Brooks' law
An observation that adding more people to a project that is running late may
actually slow it down more.
Chomsky Hierarchy
A hierarchy of four classifications of formal languages, ranging from simple
regular expressions to very flexible (but computationally difficult) grammars.
See also Formal languages.
Complexity
How long it takes to solve a problem. A problem has an inherent complexity
(the minimum time needed to solve it); any algorithm that solves the problem
will have a complexity at least that high (i.e. it will take at least that long).
Digital signature
An encryption system that allows the receiver to verify that a document was
sent by the person who claims to have sent it.
Finite state automaton
In formal languages, a simple "machine" made up of states and transitions
between them; it reads input one symbol at a time, and accepts the input if it
finishes in an accepting state. Abbreviated FSA, and related to regular
expressions. See also Formal languages.
Grammar
In formal languages, a set of rules for specifying a language, for example, to
specify syntax for programming languages.
Interpolation
Working out values between some given values; for example, if a sequence of 5
numbers starts with 3 and finishes with 11, we might interpolate the values 5,
7, 9 in between.
Pattern matching
In formal languages, finding text that matches a particular rule, typically using
a regular expression to give the rule.
Pixel
This term is an abbreviation of picture element, the name given to the tiny
squares that make up a grid that is used to represent images on a computer.
Quicksort
A fast sorting algorithm that chooses a "pivot" item, partitions the remaining
items into those smaller and those larger than the pivot, and then recursively
sorts each group.
Regular expression
A formula used to describe a pattern in a text that is to be matched or searched
for. These are typically used for finding elements of a program (such as variable
names) and checking input in forms (such as checking that an email address
has the right format.)
Slope
This is a way of expressing the angle or gradient of a line. The slope is simply
how far up the line goes for every unit we move to the right. For example, if we
have a line with a slope of 2, then after moving 3 units to the right, it will have
gone up 6 units. A line with a slope of 0 is horizontal. Normally the slope of a
line is represented using the symbol m.
Tractable
A tractable problem is one that can be solved in a reasonable amount of time;
usually the distinction between tractable and intractable is drawn at the
boundary between problems that can be solved in an amount of time that is
polynomial; those that require exponential time are regarded as intractable.
Transition
In a finite state machine, the links between the states.
Project Leads
• Tim Bell - International Field Guide, co-founder
• Peter Denning - International Field Guide, co-founder
• Jack Morgan - Project Manager
Software Development
• Jack Morgan - Technical Lead
• Jordan Griffiths
Interactive Development
• Jack Morgan
• David Thompson
• Rhem Munro
• Heidi Newton
• James Browning
• Sam Jarman
• Hayley van Waas
• Hannah Taylor
• Marcus Stenfert Kroese
• Victor Chang
Writers
• Tim Bell - Formal Languages, Compression, Human Computer Interaction,
and Coding Introduction
• Heidi Newton - NCEA Assessment Guides, Programming Languages, Data
Representation, Compression, Encryption, Error Control, Artificial
Intelligence, and Complexity and Tractability
• Caitlin Duncan - Algorithms
• Sam Jarman - Network Communication Protocols
• David Thompson - Computer Vision
• Rhem Munro - Computer Graphics
• Janina Voigt - Software Engineering
• Jon Rutherford - Software Engineering
• Joshua Scott - Computer Graphics
Editors
• Tim Bell
• Heidi Newton
• Hayley van Waas
• Jack Morgan
• Ian Witten
Advisors
• Mike Fellows - CS Unplugged
• Andrea Arpaci-Dusseau - CS Unplugged
• Paul Curzon - CS4FN
• Quintin Cutts - Computing Science Inside
• Calvin Lin - Thriving in our Digital World
• Bradley Beth - Thriving in our Digital World
• Peter Andreae - Artificial Intelligence, Complexity and Tractability, Data
Representation, Compression, Error Control, and Encryption
• Wal Irwin - Software Engineering
• Patrick Baker - Curriculum (NCEA)
• Jenny Baker - Curriculum (NCEA)
• Neil Leslie - Curriculum (NCEA)
• Hannah Taylor - Curriculum (NCEA)
• James Atlas - Curriculum (AP-CSP)
• Paul Tymann - Curriculum (AP-CSP)
• Dr Mukundan - Computer Graphics & Vision
• Richard Green - Computer Vision
• DongSeong Kim - Network Communication Protocols, Encryption
• Andreas Willig - Network Communication Protocols
• Walter Guttmann - Formal Languages
• Joshua Scott - Programming Languages
• Brad Miller (Runestone Interactive)
• David Ranum (Runestone Interactive)
• Ian Witten
• Anthony Robins
• Shadi Ibrahim
• Renate Thies
• Jan Vahrenhold
• Paul Matthews
Other
• Sumant Murugesh - Research
• Marcus Stenfert Kroese
• Ben Gibson - Interactive games and related material
• Michael Bell (Orange Studio) - Video production
• Linda Pettigrew - Formal languages material
• Sarang Love Leehan - Illustrations
Community Contributors
• ArloL (Arlo Louis O'Keeffe)
• ner0x652 (Cornel Punga)
• alanhogan (Alan Hogan)
• StevenMaude (Steven Maude)
• rdpse (Rúben Enes)
• oughter
• digitalDojoNZ
• pvskarthikeya (Karthikeya Pammi)
• Goldob (Adam Gotlib)
• Jamie Dawson
• Alasdair Mark Smith
• microlith57
• isabelle49 (Isabelle Taylor)
• k-yle (Kyle H)
• Andrew Bell
• anzeljg (Gregor Anželj)
Funding for this online textbook has been provided by Google Inc. In addition,
countless hours of volunteer time have been contributed by those listed above.
Tim Bell prepared an initial draft of this material while visiting Huazhong
University of Science and Technology, Wuhan, China, whom we thank for
providing an excellent environment for writing. The project is based at the
University of Canterbury, Christchurch, New Zealand; other authors are at
Victoria University of Wellington, New Zealand, and Cambridge University, UK.
Partial funding for the US field guide project was provided by the US National
Science Foundation under Grant No. 0938809.
Interactives
We have plenty of interactives throughout the CSFG, to teach many different
computer science concepts. This page details some troubleshooting tips if you
encounter issues, and a list of all available interactives.
Troubleshooting
Most of our interactives require a modern browser; if yours has been updated
in the last year, you should be fine. We recommend:
• Google Chrome
• Mozilla Firefox
• Microsoft Edge
• Safari
• Opera
While most interactives work on phones, tablets, and desktop computers,
some of our more complex interactives require a desktop computer to achieve
acceptable performance.
WebGL
The computer graphics and vision chapters use WebGL, which is a system that
can render 3D images in a web browser. It is relatively new, so older browsers
and operating systems may not have it set up correctly. The CanIUse website is
a quick way to check if WebGL will work in your browser on your operating
system. The general rule of thumb is if you are using an up-to-date version of a
browser and the drivers for your operating system are up-to-date and the
computer has a suitable GPU, then it should work.
If you are still having issues, this answer on SuperUser is quite useful. Also try
searching your browser version and operating system in Google.
Available Interactives
• Action Menu
• Available Menu Items
• Awful Calculator
• Base Calculator
• Big Number Calculator
• Bin Packing
• Binary Cards
• Caesar Cipher
• Checksum Calculator
• GTIN-13 Checksum Calculator
• Close Window
• CMY Colour Mixer
• Colour Matcher
• Compression Comparer
• Confused Buttons
• Date Picker
• Deceiver
• Delay Analyser
• Delayed Checkbox
• Frequency Analysis
• High Score Boxes
• Image Bit Comparer
• Change bit mode
• JPEG Compression
• Puzzle mode
• MIPS Assembler
• MIPS Simulator
• No Help
• Number Generator
• Packet Attack
• Packet Attack Level Creator
• Parity Trick
• Setting parity only
• Detecting errors only
• Sandbox mode
• Payment Interface
• Pixel Viewer
• Python Interpreter
• Regular Expression Filter
• Regular Expression Search
• RGB Colour Mixer
• RSA Key Generator
• RSA Encrypter (no padding)
• RSA Decrypter (no padding)
• Run Length Encoding
• SHA2
• Searching Algorithms
• Sorting Algorithm Comparison
• Sorting Algorithms
• Trainsylvania
• Trainsylvania Planner
• Unicode Binary
• Unicode Character
• Unicode Length
• Viola-Jones Face Detection
Releases
This page lists updates to the Computer Science Field Guide. All notable
changes to this project will be documented in this file.
• MAJOR version change when major text modifications are made (for
example: new chapter, changing how a curriculum guide teaches a
subject).
• MINOR version change when content or functionality is added or
updated (for example: new videos, new activities, large number of
text (typo/grammar) fixes).
• HOTFIX version change when bug hotfixes are made (for example:
fixing a typo, fixing a bug in an interactive).
• A pre-release version is denoted by appending a hyphen and the
alpha label followed by the pre-release version.
2.12.2
Release date: 5th June 2018
Changelog:
2.12.1
Release date: 7th March 2018
Changelog:
2.12.0
Release date: 13th February 2018
Changelog:
2.11.0
Release date: 18th October 2017
Downloads: Source downloads are available on GitHub
Changelog:
2.10.1
Release date: 3rd September 2017
Changelog:
2.10.0
Release date: 2nd September 2017
Notable changes:
This release adds a JPEG compression interactive, along with many bug fixes
and corrections.
The version numbering scheme now does not start with the v character (so
v2.9.1 is now 2.9.1). This is to make the numbering consistent with our other
projects (CS Unplugged and cs4teachers).
Changelog:
2.9.1
Release date: 20th February 2017
Notable changes:
This release fixes a bug in the Computer Graphics chapter where some links to
the 2D Arrow Manipulation interactives were broken due to an incorrect regex.
Changelog:
2.9.0
Release date: 27th January 2017
Notable changes:
This release adds an introductory video for the Complexity and Tractability
chapter, updated text for Graphics Transformations section of the Computer
Graphics chapter, as well as updated versions of the 2D Arrow Manipulation
and FSA interactives.
Changelog:
2.8.1
Release date: 21st October 2016
Changelog:
2.8.0
Release date: 19th October 2016
Notable changes:
This release adds an introductory video for the Human Computer Interaction
chapter, plus a draft of guides for mapping the Computer Science Field Guide to
the AP CSP curriculum.
Changelog:
2.7.1
Release date: 5th September 2016
Notable changes:
2.7.0
Release date: 23rd August 2016
Notable changes:
2.6.1
Release date: 14th July 2016
Notable changes:
2.6.0
Release date: 16th June 2016
Notable changes:
2.5.0
Release date: 13th May 2016
Notable changes:
The chapter and assessment guides have been rewritten to take account
of new feedback from the marking process and our own observations of
student work.
As part of the rewrite of the Data Representation chapter, the following
interactives were developed:
The old version of the Data Representation chapter can be found here.
2.4.1
Release date: 29th April 2016
Notable changes:
2.4
Release date: 29th April 2016
• Large number of typo, grammar, link, and math syntax fixes and also
content corrections by contributors.
• New interactive: Added GTIN-13 checksum calculator interactive for
calculating the last digit for a GTIN-13 barcode.
• Updated interactive: The regular expression search interactive has been
updated and added to the repository.
• Updated interactive: The image bit comparer interactive has been updated
and added to the repository. It also has a changing bits mode which allows
the user to modify the number of bits for storing each colour.
• Added XKCD mouseover text (similar behaviour to website).
• Added feedback modal to allow developers to directly post issues to
GitHub.
• Added encoding for HTML entities to stop certain characters not appearing
correctly in browsers.
• Added summary of output at end of generation script.
• Added message for developers to contribute in the web console.
2.3
Release date: 10th March 2016
Notable changes:
2.2
Release date: 19th February 2016
Notable changes:
• New interactive: Parity trick with separate modes for practicing setting
parity, practicing detecting parity, and the whole trick. Also has a sandbox
mode.
• Updated interactives: Two colour mixers, one for RGB and one for CMY
have been added.
• Updated interactive: A colour matcher interactive has been added for
matching a colour in both 24 bit and 8 bit.
• Updated interactive: A python interpreter interactive has been added for
the programming languages chapter.
• Website improvements: Code blocks now have syntax highlighting when a
language is specified, dropdown arrows are fixed in Mozilla Firefox
browsers, and whole page interactives now have nicer link buttons.
2.1
Release date: 12th February 2016
Notable changes:
Comments: The first major step in releasing an open source version of the
Computer Science Field Guide. While some content (most notably interactives)
has yet to be added to the new system, we are releasing this update for New
Zealand teachers to use at the beginning of their academic year. For any
interactives that are missing, links are in place to the older interactives.
2.0-alpha.3
Release date: 29th January 2016
2.0-alpha.2
Release date: 25th January 2016
2.0-alpha.1
Release date: 2nd December 2015
1.?.?
Release date: 3rd February 2015
Comments: The last version of the CSFG before the open source version was
adopted.
coloured-roof-small.png
Owner: Jack Morgan
cosine-graph.png
Owner: Geek3
mips-assembler, mips-simulator
Owner: Alan Hogan
codemirror.css, codemirror.js
Used: Within regular expression interactives.
License:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
en-words.txt
Owner: atebits/Words
License:
Statement of Purpose
Certain owners wish to permanently relinquish those rights to a Work for the
purpose of contributing to a commons of creative, cultural and scientific works
("Commons") that the public can reliably and without fear of later claims of
infringement build upon, modify, incorporate in other works, reuse and
redistribute as freely as possible in any form whatsoever and for any purposes,
including without limitation commercial purposes. These owners may
contribute to the Commons to promote the ideal of a free culture and the
further production of creative, cultural and scientific works, or to gain
reputation or greater distribution for their Work in part through the use and
efforts of others.
For these and/or other purposes and motivations, and without any expectation
of additional consideration or compensation, the person associating CC0 with a
Work (the "Affirmer"), to the extent that he or she is an owner of Copyright and
Related Rights in the Work, voluntarily elects to apply CC0 to the Work and
publicly distribute the Work under its terms, with knowledge of his or her
Copyright and Related Rights in the Work and the meaning and intended legal
effect of CC0 on those rights.
1. Copyright and Related Rights. A Work made available under CC0 may be
protected by copyright and related or neighboring rights ("Copyright and
Related Rights"). Copyright and Related Rights include, but are not limited
to, the following:
i. the right to reproduce, adapt, distribute, perform, display, communicate,
and translate a Work;
vi. database rights (such as those arising under Directive 96/9/EC of the
European Parliament and of the Council of 11 March 1996 on the legal
protection of databases, and under any national implementation thereof,
including any amended or successor version of such directive); and
2. Waiver. To the greatest extent permitted by, but not in contravention of,
applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and
unconditionally waives, abandons, and surrenders all of Affirmer's
Copyright and Related Rights and associated claims and causes of action,
whether now known or unknown (including existing as well as future
claims and causes of action), in the Work (i) in all territories worldwide, (ii)
for the maximum duration provided by applicable law or treaty (including
future time extensions), (iii) in any current or future medium and for any
number of copies, and (iv) for any purpose whatsoever, including without
limitation commercial, advertising or promotional purposes (the "Waiver").
Affirmer makes the Waiver for the benefit of each member of the public at
large and to the detriment of Affirmer's heirs and successors, fully
intending that such Waiver shall not be subject to revocation, rescission,
cancellation, termination, or any other legal or equitable action to disrupt
the quiet enjoyment of the Work by the public as contemplated by
Affirmer's express Statement of Purpose.
3. Public License Fallback. Should any part of the Waiver for any reason be
judged legally invalid or ineffective under applicable law, then the Waiver
shall be preserved to the maximum extent permitted taking into account
Affirmer's express Statement of Purpose. In addition, to the extent the
Waiver is so judged Affirmer hereby grants to each affected person a
royalty-free, non transferable, non sublicensable, non exclusive,
irrevocable and unconditional license to exercise Affirmer's Copyright and
Related Rights in the Work (i) in all territories worldwide, (ii) for the
maximum duration provided by applicable law or treaty (including future
time extensions), (iii) in any current or future medium and for any number
of copies, and (iv) for any purpose whatsoever, including without limitation
commercial, advertising or promotional purposes (the "License"). The
License shall be deemed effective as of the date CC0 was applied by
Affirmer to the Work. Should any part of the License for any reason be
judged legally invalid or ineffective under applicable law, such partial
invalidity or ineffectiveness shall not invalidate the remainder of the
License, and in such case Affirmer hereby affirms that he or she will not (i)
exercise any of his or her remaining Copyright and Related Rights in the
Work or (ii) assert any associated claims and causes of action with respect
to the Work, in either case contrary to Affirmer's express Statement of
Purpose.
md5.js
Owner: Jeff Mott
License:
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list
of conditions, and the following disclaimer. Redistributions in binary form must
reproduce the above copyright notice, this list of conditions, and the following
disclaimer in the documentation or other materials provided with the
distribution. Neither the name CryptoJS nor the names of its contributors may
be used to endorse or promote products derived from this software without
specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE
COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS," AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE,
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
DAMAGE.
skulpt.min.js
Website: https://github.com/skulpt/skulpt
License:
Copyright (c) 2009-2010 Scott Graham
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
Portions of the code were written with the benefit of viewing code that's in the
official "CPython" and "pypy" distribution and/or translated from code that's in
the official "CPython" and "pypy" distribution. As such, they are:
Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006 Python Software
Foundation; All Rights Reserved"
per:
http://www.python.org/psf/license/
http://www.python.org/download/releases/2.6.2/license/
https://bitbucket.org/pypy/pypy/src/default/LICENSE
jsencrypt.js
Owner: https://github.com/travist/jsencrypt
License:
The MIT License (MIT) Copyright (c) 2013 AllPlayers.com
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
jquery.js
Owner: jQuery
materialize.scss, materialize.min.js
Owner/Creator: MaterializeCSS (Dogfalo)
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
License:
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction, and
distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by the copyright
owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all other entities
that control, are controlled by, or are under common control with that entity.
For the purposes of this definition, "control" means (i) the power, direct or
indirect, to cause the direction or management of such entity, whether by
contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity exercising permissions
granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation source, and
configuration files.
"Object" form shall mean any form resulting from mechanical transformation or
translation of a Source form, including but not limited to compiled object code,
generated documentation, and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or Object form,
made available under the License, as indicated by a copyright notice that is
included in or attached to the work (an example is provided in the Appendix
below).
"Derivative Works" shall mean any work, whether in Source or Object form, that
is based on (or derived from) the Work and for which the editorial revisions,
annotations, elaborations, or other modifications represent, as a whole, an
original work of authorship. For the purposes of this License, Derivative Works
shall not include works that remain separable from, or merely link (or bind by
name) to the interfaces of, the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including the original version
of the Work and any modifications or additions to that Work or Derivative Works
thereof, that is intentionally submitted to Licensor for inclusion in the Work by
the copyright owner or by an individual or Legal Entity authorized to submit on
behalf of the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent to the
Licensor or its representatives, including but not limited to communication on
electronic mailing lists, source code control systems, and issue tracking
systems that are managed by, or on behalf of, the Licensor for the purpose of
discussing and improving the Work, but excluding communication that is
conspicuously marked or otherwise designated in writing by the copyright
owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity on behalf
of whom a Contribution has been received by Licensor and subsequently
incorporated within the Work.
(a) You must give any other recipients of the Work or Derivative Works a copy
of this License; and
(b) You must cause any modified files to carry prominent notices stating that
You changed the files; and
(c) You must retain, in the Source form of any Derivative Works that You
distribute, all copyright, patent, trademark, and attribution notices from the
Source form of the Work, excluding those notices that do not pertain to any
part of the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its distribution, then any
Derivative Works that You distribute must include a readable copy of the
attribution notices contained within such NOTICE file, excluding those notices
that do not pertain to any part of the Derivative Works, in at least one of the
following places: within a NOTICE text file distributed as part of the Derivative
Works; within the Source form or documentation, if provided along with the
Derivative Works; or, within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents of the NOTICE
file are for informational purposes only and do not modify the License. You may
add Your own attribution notices within Derivative Works that You distribute,
alongside or as an addendum to the NOTICE text from the Work, provided that
such additional attribution notices cannot be construed as modifying the
License.
You may add Your own copyright statement to Your modifications and may
provide additional or different license terms and conditions for use,
reproduction, or distribution of Your modifications, or for any such Derivative
Works as a whole, provided Your use, reproduction, and distribution of the Work
otherwise complies with the conditions stated in this License.
2. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
3. Disclaimer of Warranty. Unless required by applicable law or agreed to in
writing, Licensor provides the Work (and each Contributor provides its
Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR
CONDITIONS OF ANY KIND, either express or implied, including, without
limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT,
MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are
solely responsible for determining the appropriateness of using or
redistributing the Work and assume any risks associated with Your exercise
of permissions under this License.
website.css
This file is a combination of website.scss and materialize.scss created for
output. See those files for more details.
website.scss, website.js
Created by our team for the CSFG project. The Computer Science Field Guide
uses a Creative Commons Attribution-NonCommercial-ShareAlike 4.0
International (CC BY-NC-SA 4.0) license.
This means you are free to:
This is a human-readable summary of (and not a substitute for) the full license.
This deed highlights only some of the key features and terms of the actual
license. It is not a license and has no legal value. You should carefully review all
of the terms and conditions of the actual license before using the licensed
material.