Paper 1 Computer Science AS
Before we jump into the world of number systems, we'll need a point of reference; I recommend that you
copy the following table that you can refer to throughout this chapter to check your answers.
Hexadecimal Binary Denary
0 0000 0
1 0001 1
2 0010 2
3 0011 3
4 0100 4
5 0101 5
6 0110 6
7 0111 7
8 1000 8
9 1001 9
A 1010 10
B 1011 11
C 1100 12
D 1101 13
E 1110 14
F 1111 15
Computer Science 9608 (Notes)
Chapter: 1.1 Information representation
Denary is the number system that you have most probably grown up with. It is another name for
base 10, which means that there are 10 different symbols that you can use for each digit, namely:
0,1,2,3,4,5,6,7,8,9
Notice that if we wish to write 'ten', we need two of the above digits, 1 and 0.
1000 100 10 1
5 9 7 3
Using the above table we can see that each column has a different value assigned to it. If we know
the column values we can work out the number, which will be very useful when we start looking at other base
systems. The number above is five thousands, nine hundreds, seven tens and three units: 5973.
Binary is a base-2 number system, which means that there are two symbols that you can write for each digit:
0, 1.
With these two digits we should be able to write (or approximate) all the numbers that we
could write in denary. Because binary has only two states, a computer's electronics can easily manipulate
numbers stored in binary by treating 1 as "on" and 0 as "off".
128 64 32 16 8 4 2 1
0 1 1 0 1 0 1 0
Using the above table we can see that each column has a value assigned to it that is a power of two (the
base number!). If we multiply each column value by its digit and add the results, we can work out the value of the
number: 1*64 + 1*32 + 1*8 + 1*2 = 106.
If you are asked to work out the value of a binary number, the best place to start is by labeling each
column with its corresponding value and adding together all the columns that hold a 1. Let's take a look at
another example:
(00011111) 2
128 64 32 16 8 4 2 1
0 0 0 1 1 1 1 1
16 + 8 + 4 + 2 + 1 = (31) 10
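The column method above can be sketched in a few lines of Python. This is only an illustration; the function name binary_to_denary is an assumption made for this example and does not come from the notes.

```python
def binary_to_denary(bits: str) -> int:
    """Add up the column values (powers of two) of every column that holds a 1."""
    total = 0
    column_value = 1  # the rightmost column is worth 1
    for digit in reversed(bits):
        if digit == "1":
            total += column_value
        column_value *= 2  # each column to the left is worth twice as much
    return total


print(binary_to_denary("01101010"))  # 106
print(binary_to_denary("00011111"))  # 31
```

Labelling the columns from the right, as the loop does, mirrors the advice to write the column values out before adding them up.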
Exercise: Binary
Answer :
128 64 32 16 8 4 2 1
0 0 0 0 1 1 0 0
8 + 4 = (12) 10
Answer :
128 64 32 16 8 4 2 1
0 1 0 1 1 0 0 1
64 + 16 + 8 + 1 = (89) 10
Answer :
128 64 32 16 8 4 2 1
0 0 0 0 0 1 1 1
4 + 2 + 1 = (7) 10
Answer :
128 64 32 16 8 4 2 1
0 1 0 1 0 1 0 1
64 + 16 + 4 + 1 = (85) 10
Answer :
Its rightmost digit is a one.
Is there a shortcut for working out a binary number that is made of solid ones, such as
(01111111) 2 ?
Answer :
Yes: take the column value of the first 0 to the left of the ones and subtract one.
128 64 32 16 8 4 2 1
0 1 1 1 1 1 1 1
128 - 1 = 127 = 64 + 32 + 16 + 8 + 4 + 2 + 1
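As a quick check of this shortcut, the sketch below compares "next column value minus one" with a direct conversion. The helper name all_ones_value is hypothetical and used only for illustration.

```python
def all_ones_value(count: int) -> int:
    """Value of a binary number made of `count` ones: the next column value minus one."""
    return 2 ** count - 1


# Seven ones: the first 0 sits in the 128 column, so the value is 128 - 1.
print(all_ones_value(7))   # 127
print(int("01111111", 2))  # 127, Python's built-in conversion agrees
```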
The language that a computer understands is very simple, so simple that it only has 2 different digits:
1 and 0. This is called binary. Everything you see on a computer (images, sounds, games, text, videos,
spreadsheets, websites and so on) is stored as a string of ones and zeroes.
What is a bit?
A bit is the smallest unit in digital representation of information. A bit has only two values, ON and OFF,
where ON is represented by 1 and OFF by 0. In terms of electrical signals, a 1 (ON) is often a higher voltage
(for example 5 volts) and a 0 (OFF) is a 0 volt signal.
Bit
1
What is a nibble?
A group of 4 bits is referred to as a nibble.
Nibble
1 0 0 1
What is a byte?
In the world of computers and microcontrollers, 8 bits are considered to be a standard group, called
a byte. Microcontrollers are normally byte oriented, and data and instructions normally use bytes. A byte
can be broken down into two nibbles.
Byte
1 0 0 1 0 1 1 1
What is a word?
Going bigger, a group of bytes is known as a word. A word normally consists of two and sometimes more
bytes belonging together.
Word
1 1 0 1 0 1 1 0 0 0 1 1 0 1 1 1
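The relationship between a byte and its two nibbles can be shown with a small sketch. The function name split_into_nibbles is an assumption for this example; the byte used is the one shown above.

```python
def split_into_nibbles(byte: str) -> tuple[str, str]:
    """Split an 8-bit string into its high (left) and low (right) nibbles."""
    if len(byte) != 8:
        raise ValueError("a byte must contain exactly 8 bits")
    return byte[:4], byte[4:]


high, low = split_into_nibbles("10010111")
print(high, low)  # 1001 0111
```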
You may notice from the table that one hexadecimal digit can represent exactly 4 binary bits. Hexadecimal
is useful to us as a shorthand way of writing binary, and makes it easier to work with long binary numbers.
Hexadecimal is a base-16 number system, which means we have 16 different symbols to represent
our digits. The only problem is that we run out of numerals after 9, and because 10 is written with
two digits we need to use letters instead:
0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F
We can do exactly the same thing as we did for denary and binary, and write out our table.
So now all we need to do is to add the columns containing values together, but remember that A = 10, B =
11, C = 12, D = 13, E = 14, F = 15.
You might be wondering why we would want to use hexadecimal when we have binary and denary, and
when computers store and calculate everything in binary. The answer is that it is entirely for human ease.
Consider the following example:
Error messages are written using hex to make it easier for us to remember and record them
Representation Base
EFFE11 base-16 hexadecimal
15728145 base-10 denary
111011111111111000010001 base-2 binary
All the numbers are the same, and the easiest version for humans to remember and understand is the
base-16 one. Hexadecimal is used in computers for representing numbers for human consumption, with uses
such as memory addresses and error codes. NOTE: Hexadecimal is used because it is shorthand for
binary and easier for people to remember. It DOES NOT take up less space in computer memory, only on
paper or in your head! Computers still have to store everything as binary, whatever it appears as on the
screen.
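The three representations in the table can be checked with Python's built-in conversions. This is just a verification sketch; the variable names are chosen for this example.

```python
value = 15728145  # the denary value from the table

as_hex = format(value, "X")     # base-16 text
as_binary = format(value, "b")  # base-2 text

print(as_hex)     # EFFE11
print(as_binary)  # 111011111111111000010001

# All three spellings describe the same number held in memory.
print(int(as_hex, 16) == int(as_binary, 2) == value)  # True
```

Only the text shown to a person changes; the stored value is the same in every case.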
Exercise: Hexadecimal
>>A1
Answer :
16 1
A 1
16 * 10 + 1 * 1 = (161) 10
>>FF
Answer :
16 1
F F
16 * 15 + 1 * 15 = (255) 10
>>0D
Answer :
16 1
0 D
16 * 0 + 1 * 13 = (13) 10
>>37
Answer :
16 1
3 7
16 * 3 + 1 * 7 = (55) 10
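The column working used in these answers can be sketched as follows. The names HEX_DIGITS and hex_to_denary are assumptions made for this illustration.

```python
HEX_DIGITS = "0123456789ABCDEF"  # position in this string = digit value (A = 10 ... F = 15)


def hex_to_denary(hex_string: str) -> int:
    """Multiply each hex digit by its column value (1, 16, 256, ...) and add the results."""
    total = 0
    column_value = 1  # the rightmost column is worth 1
    for digit in reversed(hex_string.upper()):
        total += HEX_DIGITS.index(digit) * column_value
        column_value *= 16  # each column to the left is worth 16 times more
    return total


for question in ("A1", "FF", "0D", "37"):
    print(question, hex_to_denary(question))
# A1 161
# FF 255
# 0D 13
# 37 55
```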
The sum that you saw previously to convert from hex to denary is a little cumbersome, and in the
exam you wouldn't want to make any errors. We therefore need an easier way to make the
conversion.
Since 4 binary bits are represented by one hexadecimal digit, it is simple to convert between the two. You
can group binary bits into groups of 4, starting from the right and adding extra 0's to the left if required,
and then convert each group to its hexadecimal equivalent. For example, the binary number
0110110011110101 can be written like this:
0110 1100 1111 0101
and then by using the table given at the beginning, you can convert each group of 4 bits into hexadecimal:
6 C F 5
So the binary number 0110110011110101 is 6CF5 in hexadecimal. We can check this by converting both
to denary. First we'll convert the binary number, since you already know how to do this:
32768 16384 8192 4096 2048 1024 512 256 128 64 32 16 8 4 2 1
0 1 1 0 1 1 0 0 1 1 1 1 0 1 0 1
By multiplying the columns and then adding the results, the answer is 27893.
Notice that the column headings are all 2 raised to a power: 1 = 2^0, 2 = 2^1, 4 = 2^2, 8 = 2^3, and so
on. To convert from hexadecimal to denary, we must use column headings that are powers with the base
16, like this:
4096 256 16 1
6 C F 5
Totaling them all up (6*4096 + 12*256 + 15*16 + 5*1) gives us 27893, showing that 0110110011110101 is equal to 6CF5.
To convert from denary to hexadecimal, it is recommended to just convert the number to binary first, and
then use the simple method above to convert from binary to hexadecimal.
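The grouping method described above can be sketched in Python. The function names binary_to_hex and hex_to_binary are hypothetical and chosen only for this illustration.

```python
def binary_to_hex(bits: str) -> str:
    """Group bits in fours from the right (padding with 0s on the left) and convert each group."""
    padded_length = -(-len(bits) // 4) * 4  # round the length up to a multiple of 4
    bits = bits.zfill(padded_length)
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(format(int(group, 2), "X") for group in groups)


def hex_to_binary(hex_string: str) -> str:
    """Replace every hex digit with its 4-bit binary pattern."""
    return "".join(format(int(digit, 16), "04b") for digit in hex_string)


print(binary_to_hex("0110110011110101"))  # 6CF5
print(binary_to_hex("110111"))            # 37
print(hex_to_binary("6CF5"))              # 0110110011110101
```

Because each hex digit maps to exactly one group of 4 bits, neither direction needs any arithmetic beyond looking up a 4-bit pattern.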
In summary, to convert from one number to another we can use the following rule:
>>(12) 16
Answer :
1 2 (Hex)
0001 0010 (Binary)
128 64 32 16 8 4 2 1
0 0 0 1 0 0 1 0
2+16 = 18 (decimal)
>>(A5) 16
Answer :
A 5 (Hex)
1010 0101 (Binary)
128 64 32 16 8 4 2 1
1 0 1 0 0 1 0 1
128+32+4+1 = 165 (decimal)
>>(7F) 16
Answer :
7 F (Hex)
0111 1111 (Binary)
128 64 32 16 8 4 2 1
0 1 1 1 1 1 1 1
64+32+16+8+4+2+1 = 127 (decimal)
>>(10) 16
Answer :
1 0 (Hex)
0001 0000 (Binary)
128 64 32 16 8 4 2 1
0 0 0 1 0 0 0 0
16 (decimal)
>>(10101101) 2
Answer :
1010 1101 (Binary)
A D (Hex)
>>(110111) 2
Answer :
0011 0111 (Binary)
3 7 (Hex)
>>(10101111) 2
Answer :
1010 1111 (Binary)
A F (Hex)
>>(111010100001) 2
Answer :
1110 1010 0001 (Binary)
E A 1 (Hex)
>>(87) 10
Answer :
128 64 32 16 8 4 2 1
0 1 0 1 0 1 1 1
0101 0111 = 64+16+4+2+1 = 87 (decimal)
>>(12) 10
Answer :
128 64 32 16 8 4 2 1
0 0 0 0 1 1 0 0
0000 1100 = 8+4 = 12 (decimal)
>>(117) 10
Answer :
128 64 32 16 8 4 2 1
0 1 1 1 0 1 0 1
0111 0101 = 64+32+16+4+1 = 117 (decimal)
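One way to produce these answers is to work from the largest column down, placing a 1 wherever the column value still fits into what is left of the number. The sketch below assumes 8-bit unsigned values; the name denary_to_binary is chosen for this example.

```python
def denary_to_binary(number: int, width: int = 8) -> str:
    """Fill the columns from left to right, subtracting each column value that fits."""
    if not 0 <= number < 2 ** width:
        raise ValueError("number does not fit in the given number of bits")
    bits = ""
    column_value = 2 ** (width - 1)  # leftmost column, 128 for 8 bits
    while column_value >= 1:
        if number >= column_value:
            bits += "1"
            number -= column_value
        else:
            bits += "0"
        column_value //= 2
    return bits


for question in (87, 12, 117):
    print(question, denary_to_binary(question))
# 87 01010111
# 12 00001100
# 117 01110101
```

The resulting bit string can then be grouped into nibbles to read off the hexadecimal form, as recommended earlier for denary to hexadecimal conversion.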
Answer :
So that things such as error messages and memory addresses are easier for humans to understand and
remember.
Nearly all computers work purely in binary. That means that they only use ones and zeros, and there's no -
or + symbol that the computer can use. The computer must represent negative numbers in a different
way.
We can represent a negative number in binary by making the most significant bit (MSB) a sign bit, which
will tell us whether the number is positive or negative. The column headings for an 8 bit number will look
like this:
-128 64 32 16 8 4 2 1
MSB LSB
1 0 1 1 1 1 0 1
Here, the most significant bit is negative, and the other bits are positive. You start with -128, and add the
other bits as normal. The example above is -67 in denary because: (-128 + 32 + 16 + 8 + 4 + 1 = -67).
Note that you only use the most significant bit as a sign bit if the number is specified as signed. If the
number is unsigned, then the MSB counts as positive (+128 in an 8-bit number) whenever it is a one.
Two’s Complement:
The MSB stays as a number, but is made negative. This means that the column headings are
-128 64 32 16 8 4 2 1
-117 = -128 + 11, so -117 is written 10001011.
Two’s complement seems to make everything more complicated for little reason at the moment, but later
it becomes essential for making the arithmetic easier.
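The negative column heading can be turned into a short sketch. Both function names are assumptions made for this illustration, and the encoder relies on Python's modulo operator rather than on any method given in the notes.

```python
def twos_complement_to_denary(bits: str) -> int:
    """Add the column values, treating the leftmost column as negative."""
    width = len(bits)
    total = 0
    for position, digit in enumerate(bits):
        column_value = 2 ** (width - 1 - position)
        if position == 0:
            column_value = -column_value  # e.g. -128 for an 8-bit number
        if digit == "1":
            total += column_value
    return total


def denary_to_twos_complement(number: int, width: int = 8) -> str:
    """Encode a signed number; negative values wrap round modulo 2**width."""
    if not -(2 ** (width - 1)) <= number < 2 ** (width - 1):
        raise ValueError("number does not fit in the given number of bits")
    return format(number % (2 ** width), f"0{width}b")


print(twos_complement_to_denary("10111101"))  # -67
print(denary_to_twos_complement(-117))        # 10001011
```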
Bytes are frequently used to hold individual characters in a text document. In the ASCII character set, each
binary value between 0 and 127 is given a specific character. Most computers extend the ASCII character
set to use the full range of 256 characters available in a byte. The upper 128 characters handle special
things like accented characters from common foreign languages.
You can see the 128 standard ASCII codes below. Computers store text documents, both on disk and in
memory, using these codes. For example, if you use Notepad in Windows to create a text file
containing the words "Four score and seven years ago", Notepad would use 1 byte of memory per
character (including 1 byte for each space character between the words; the space is
ASCII character 32). When Notepad stores the sentence in a file on disk, the file will also contain 1 byte
per character and per space.
Try this experiment: Open up a new file in Notepad and insert the sentence, "Four score and seven years
ago" in it. Save the file to disk under the name getty.txt. Then use the explorer and look at the size of the
file. You will find that the file has a size of 30 bytes on disk: 1 byte for each character. If you add another
word to the end of the sentence and re-save it, the file size will jump to the appropriate number of bytes.
Each character consumes a byte.
If you were to look at the file as a computer looks at it, you would find that each byte contains not a letter
but a number -- the number is the ASCII code corresponding to the character (see below). So on disk, the
numbers for the file look like this:
F o u r <spc> a n d <spc> s e v e n
70 111 117 114 32 97 110 100 32 115 101 118 101 110
By looking in the ASCII table, you can see a one-to-one correspondence between each character and the
ASCII code used. Note the use of 32 for a space: 32 is the ASCII code for a space. We could expand
these decimal numbers out to binary numbers (so 32 = 00100000) if we wanted to be technically correct;
that is how the computer really deals with things.
The first 32 values (0 through 31) are codes for things like carriage return and line feed. The space
character is the 33rd value, followed by punctuation, digits, uppercase characters and lowercase
characters. To see all 128 values try Google “ASCII codes”.
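The character codes and the 30-byte file size can be reproduced with Python's built-in ord function and an ASCII encoding. This is a small sketch; the variable names are chosen for this example.

```python
sentence = "Four score and seven years ago"

codes = [ord(character) for character in sentence]  # one ASCII code per character
encoded = sentence.encode("ascii")                  # the bytes that would be written to disk

print(codes[:5])                # [70, 111, 117, 114, 32]
print(len(encoded))             # 30
print(format(ord(" "), "08b"))  # 00100000
```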
Unicode is a computing industry standard for the consistent encoding, representation and handling of text
expressed in most of the world's writing systems or languages. Developed in conjunction with the
Universal Character Set standard and published in book form as The Unicode Standard, the latest version
of Unicode at the time these notes were written consists of a repertoire of more than 107,000 characters
covering 90 scripts, together with rules for the correct display of text containing both right-to-left
scripts, such as Arabic and Hebrew, and left-to-right scripts.
The Unicode Consortium, the nonprofit organization that coordinates Unicode's development, has the
ambitious goal of eventually replacing existing character encoding schemes like ASCII with Unicode, as
many of the existing schemes are limited in size and scope and are incompatible with multilingual
environments. Before 1996 Unicode used 16 bits (2 bytes). After 1996 the size was no longer restricted to
16 bits, and the standard was extended further to cover every possible variation of multilingual environment.
Notes: All the characters that a system can recognise are called its character set.
ASCII uses 8 bits so there are 256 different codes that can be used and hence 256 different characters.
(This is not quite true, we will see why in chapter 1.5 with reference to parity checks.)
A problem arises when the computer retrieves a piece of data from its memory. Imagine that the data is
01000001. Is this the number 65, or is it A?
They are both stored in the same way, so how can it tell the difference?
The answer is that the bit pattern itself does not say. Characters and numbers are stored in different parts
of the memory, so the program knows which one it is by knowing whereabouts it was stored.
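The point that one bit pattern can be read in two ways is easy to demonstrate. The sketch below takes the byte 01000001 and interprets it first as a number and then as an ASCII character; the variable names are assumptions for this example.

```python
stored = bytes([0b01000001])  # a single byte holding the pattern 01000001

as_number = int.from_bytes(stored, "big")  # read the byte as an unsigned number
as_character = stored.decode("ascii")      # read the same byte as an ASCII character

print(as_number)     # 65
print(as_character)  # A
```

Nothing in the byte changes between the two lines; only the interpretation chosen by the program differs.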
Some numbers are not proper numbers because they don’t behave like numbers. A barcode for chocolate
looks like a number, and a barcode for sponge cake looks like a number, but if the barcodes are added
together the result is not the barcode for chocolate cake. The arithmetic does not give a sensible answer.
Values like this that look like numbers but do not behave like them are often stored in binary coded
decimal (BCD). Each digit is simply changed into a four-bit binary number, and these are then placed one
after another in order.
8 = 1000 6 = 0110
0 = 0000 2 = 0010
Note: All the zeros are essential otherwise you can’t read it back.
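A minimal sketch of BCD encoding follows. The function names are hypothetical, and the value 8602 is an assumption built from the four digits shown in the table above.

```python
def to_bcd(digits: str) -> str:
    """Turn every denary digit into its own 4-bit group, keeping leading zeros."""
    return " ".join(format(int(digit), "04b") for digit in digits)


def from_bcd(bcd: str) -> str:
    """Read each 4-bit group back as one denary digit."""
    return "".join(str(int(group, 2)) for group in bcd.split())


print(to_bcd("8602"))                   # 1000 0110 0000 0010
print(from_bcd("1000 0110 0000 0010"))  # 8602
```

Reading a BCD value as plain binary gives the wrong answer: to_bcd("10") is 0001 0000, which as an ordinary binary number is 16.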
Application
The BIOS in many personal computers stores the date and time in BCD because the MC6818 real-time
clock chip used in the original IBM PC AT motherboard provided the time encoded in BCD. This form is
easily converted into ASCII for display.
The Atari 8-bit family of computers used BCD to implement floating-point algorithms. The MOS
6502 processor used has a BCD mode that affects the addition and subtraction instructions.
Early models of the PlayStation 3 store the date and time in BCD. This led to a worldwide outage of the
console on 1 March 2010. The last two digits of the year stored as BCD were misinterpreted as 16 causing
an error in the unit's date, rendering most functions inoperable. This has been referred to as the Year 2010
Problem.
Bitmap Graphics
Bitmap images are exactly what their name says they are: a collection of bits that form an image. The
image consists of a matrix of individual dots (or pixels) that all have their own color (described using bits,
the smallest possible units of information for a computer).
Pixel - the smallest possible addressable area defined by a solid color, represented as binary, in an image.
This example shows a bitmap image with a portion greatly enlarged, in which the individual pixels are
rendered as little squares and can easily be seen. Try looking closely at your monitor or mobile phone
screen to see if you can spot the pixels.
Resolution
Image Resolution - how many pixels an image contains per inch/cm. The more pixels used to
produce an image, the more detailed that image can be, i.e. the higher its resolution. For instance, a 10
megapixel digital camera makes use of over 10 million pixels per image, thus offering a very high
photographic quality.
Screen Resolution - The screen resolution tells you how many pixels your screen can display horizontally
and vertically. It's written in the form 1024 x 768. In this example, the screen can show 1,024 pixels
horizontally and 768 vertically.
The higher the resolution, the more pixels are available and therefore the crisper the picture.
There are many different video display formats out there, with different widths and heights, and total
numbers of pixels.
Using the diagram above we are going to work out how many pixels are required to display a single frame
on a VGA screen.
Height = 480
Width = 640
Area = Width x Height = Total Pixels
Area = 640 x 480 = 307200
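The same calculation can be written in Python and extended to estimate the storage needed for one uncompressed frame. The colour depth of 24 bits per pixel is an assumption made for this sketch (8 bits each for red, green and blue); it is not stated in the working above.

```python
width, height = 640, 480  # VGA resolution from the working above

total_pixels = width * height
bits_per_pixel = 24  # assumption: 8 bits each for red, green and blue
size_bytes = total_pixels * bits_per_pixel // 8

print(total_pixels)  # 307200
print(size_bytes)    # 921600
```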
[Table: colour depth, with columns "Number of colors/pxl" and "Example"; example pixel values include RGB(79,146,85), RGB(129,111,134) and RGB(149,146,166)]
This block of bytes is at the start of the file and is used to identify the file. A typical application reads this
block first to ensure that the file is actually a BMP file and that it is not damaged.
Many file types can be identified by using what’s known as a file header. A file header is a ‘signature’
placed at the beginning of a file, so the operating system and other software know what to do with the
following contents.
Many electronic discovery applications (computer programs) will use the file header as a means to verify
file types. The common fear is that if a user changes a file's extension, or the file wasn’t named using an
application's default naming convention, that file will lose its association with the program that created it.
For example, if you create a Microsoft Word document and name it ‘myfile.001’ instead of ‘myfile.doc’, and
then attempt to locate all Microsoft Word files at a later date, you would miss the file if you were looking
for all files ending in ‘.doc’. There are specific file extensions associated with the native application.
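Checking a header instead of an extension can be sketched as follows. BMP files begin with the two ASCII characters "BM"; the function name looks_like_bmp and the sample byte strings are assumptions made for this illustration.

```python
def looks_like_bmp(first_bytes: bytes) -> bool:
    """Return True when the data starts with the BMP signature 'BM' (0x42 0x4D)."""
    return first_bytes[:2] == b"BM"


# Hypothetical file contents: only the first few bytes matter for this check.
renamed_bitmap = b"BM\x36\x10\x0e\x00"  # a bitmap saved as, say, 'myfile.001'
not_a_bitmap = b"\x89PNG\r\n"           # the start of a PNG file

print(looks_like_bmp(renamed_bitmap))  # True
print(looks_like_bmp(not_a_bitmap))    # False
```

With a real file, the same check could be applied to the first two bytes read in binary mode, whatever the file happens to be called.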
shapes or polygon(s). Allowing for scalability. Objects and properties stored mathematically.
Drawing objects and properties - Vector graphics are made up of objects and their properties. An object is
a mathematical or geometrically defined construct such as a rectangle, line or circle.
Image
Drawing list
<rect width="100" height="80" x="0" y="70" fill="green"/>
<rect x="14" y="23" width="250" height="50" fill="green" stroke="black" stroke-width="1"/>
<circle cx="90" cy="80" r="50" fill="blue"/>
<text x="180" y="60">Un texte</text>
<line x1="5" y1="5" ...
<circle cx="100" cy="100" r="5...
... fill="red" stroke="red"/>
... width="5"/>
This image illustrates the difference between bitmap and vector images. The bitmap image is composed
of a fixed set of dots (pixels), while the vector image is composed of a fixed set of shapes. In the picture,
scaling the bitmap reveals the pixels and scaling the vector image preserves the shapes.
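The difference in how the two kinds of image scale can be sketched in Python. Both functions are hypothetical illustrations: the bitmap is a small grid of pixel values and the vector object is a dictionary of properties modelled on a rectangle with x, y, width and height.

```python
def scale_bitmap(pixels, factor):
    """Nearest-neighbour enlargement: every pixel becomes a factor x factor block."""
    scaled = []
    for row in pixels:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        scaled.extend([list(wide_row) for _ in range(factor)])
    return scaled


def scale_rect(rect, factor):
    """Scaling a vector object only changes the numbers that describe it."""
    return {name: value * factor for name, value in rect.items()}


print(scale_bitmap([[1, 0], [0, 1]], 2))
# [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
print(scale_rect({"x": 0, "y": 70, "width": 100, "height": 80}, 2))
# {'x': 0, 'y': 140, 'width': 200, 'height': 160}
```

The enlarged bitmap holds no new detail, only bigger blocks of the same pixels, whereas the rectangle is simply redrawn from its new properties.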
Raster graphics have inherently unique characteristics that can’t be created inside Flash. The only
warning related to using this option is to make sure you really need raster graphics. The following are
some cases that justify the use of raster graphics:
A photograph. The only time to consider using a vector alternative to a photograph is when the picture is
of a very geometric object. Otherwise, photographs should be raster graphics.
A series of still images extracted from frames of a short video.
An image with special effects that can’t be achieved by using a vector tool, such as clouds, fire, water, and
other natural effects.
If you’re unfamiliar with the difference between vector graphics and raster graphics, learning when one
choice is better than the other can take some time. The file formats .gif, .jpg, .png, .bmp, and .pct are all
raster graphics formats. However, just because a file was saved in one of these formats doesn’t mean it
was done appropriately. It’s the nature of the image in the file that matters. If all you have is a .gif, for
sound particles will collide with your ear drum, vibrating it and sending a message to your brain. This is
how you hear:
When you hear different volumes and pitches of sound, all that is happening is that the sound waves
differ in energy, which sets the volume (the larger the energy of the waves, the louder the sound), or in the
distance between waves, which sets the pitch (the smaller the distance between waves, the higher the pitch).
A computer representation of a stereo song; if you look carefully you'll see the volume of the song varying
as you go through it.
This section of the book will cover how we record, store and transmit sound using computers. Sound
waves in nature are continuous; this means they have an almost infinite amount of detail that you could
store for even the shortest sound. This makes them very difficult to record perfectly, as computers can
only store discrete data, data that has a limited number of data points.
An analogue sound wave is picked up by a microphone and sent to an Analogue to Digital Converter (ADC)
in the form of analogue electrical signals. The ADC converts the electrical signals into digital values which
can be stored on a computer.
Once in a digital format you can edit sounds with programs such as Audacity.
To play digital audio you convert the sound from digital values into analogue electrical signals using a
Digital to Analogue Converter (DAC). These signals are then passed to a speaker, vibrating the speaker
cone and moving the air to create analogue sound waves.
Analogue to Digital Converter (ADC) - Converts analogue sound into digital signals that can be stored on a
computer.
Digital to Analogue Converter (DAC) - Converts digital signals stored on a computer into analogue sound
that can be played through devices such as speakers.
Hertz (Hz) - the SI unit of frequency, defined as the number of cycles per second of a periodic
phenomenon.
To create digital music that sounds close to the real thing you need to look at the analogue sound waves
and try to represent them digitally. This requires you to replicate the analogue (and continuous)
waves as discrete values. The first step in doing this is deciding how often you should sample the sound
wave. If you sample too rarely, the sound stored on a computer will sound very different from the one being
recorded. Sample too often and the sound stored will resemble that being recorded, but having to store each
of the samples means you'll get very large file sizes. How often you sample the
analogue signal is called the sampling rate. Take a look at the following example:
To create digital sound as close to the real thing as possible you need to take as many samples per
second as you can. When recording MP3s you'll normally use a sampling rate of 32,000, 44,100 or
48,000 Hz (samples per second). That means that for a sampling rate of 44,100 Hz, sound waves will have
been sampled 44,100 times per second! Recording the human voice requires a lower sampling rate,
around 8,000 Hz.
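To see how the sampling rate changes the amount of data, the sketch below samples a pure tone at two of the rates mentioned above. The 440 Hz tone, the 10 millisecond duration and the function name sample_sine are all assumptions chosen for this illustration.

```python
import math


def sample_sine(frequency_hz: int, sampling_rate_hz: int, duration_ms: int) -> list[float]:
    """Measure a sine wave at evenly spaced instants, sampling_rate_hz times per second."""
    sample_count = sampling_rate_hz * duration_ms // 1000
    return [
        math.sin(2 * math.pi * frequency_hz * n / sampling_rate_hz)
        for n in range(sample_count)
    ]


voice_quality = sample_sine(440, 8_000, 10)
cd_quality = sample_sine(440, 44_100, 10)
print(len(voice_quality), len(cd_quality))  # 80 441
```

The higher rate stores more than five times as many data points for the same 10 milliseconds of sound, which is why file sizes grow with the sampling rate.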
Comparison of the same sound sample recorded at 8kHz, 22kHz and 44kHz sample rate. Note the spacing
of the data points for each sample. The higher the sample rate the more data points we'll need to store
Sampling resolution
As you saw earlier, different sounds can have different volumes. The sampling resolution allows you to set
the range of volumes storable for each sample. If you have a low sampling resolution then the range of
volumes will be very limited; if you have a high sampling resolution then the file size may become
unfeasibly large. The sampling resolution for a CD is 16 bits per sample.
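The number of distinct volume levels that a given sampling resolution can store is 2 raised to the number of bits. A short sketch, with a function name chosen for this example:

```python
def quantisation_levels(bits_per_sample: int) -> int:
    """Number of distinct values that can be stored in each sample."""
    return 2 ** bits_per_sample


for bits in (8, 16, 24):
    print(bits, quantisation_levels(bits))
# 8 256
# 16 65536
# 24 16777216
```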
To work out the size of a sound sample requires the following equation:
File size (bits) = sampling rate (Hz) x sampling resolution (bits) x length of sound (seconds)
If you wanted to record a 30 second voice message on your mobile phone you would use the following:
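A worked version of this calculation is sketched below. The sampling rate of 8,000 Hz is the voice figure quoted earlier; the resolution of 16 bits and the single (mono) channel are assumptions made for this example, since the original figures are not shown here.

```python
def sound_file_size_bits(sampling_rate_hz: int, resolution_bits: int,
                         length_s: int, channels: int = 1) -> int:
    """File size in bits = sampling rate x sampling resolution x length (x channels)."""
    return sampling_rate_hz * resolution_bits * length_s * channels


size_bits = sound_file_size_bits(8_000, 16, 30)  # 16 bits per sample is an assumption
print(size_bits)       # 3840000
print(size_bits // 8)  # 480000 bytes
```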
Sound Editing
If you are interested in sound editing you can start editing your own music using a program
called Audacity. Using Audacity you can create your own sound samples with different sample rates and
sample resolutions, listening to the difference between them and noting the different file sizes.
Features
This is a list of features in Audacity, the free, open source, cross-platform audio editor.
Recording
Audacity can record live audio through a microphone or mixer, or digitize recordings from cassette tapes,
records or minidiscs. With some sound cards, and on any Windows Vista, Windows 7 or Windows 8
Sound Quality
Supports 16-bit, 24-bit and 32-bit (floating point) samples (the latter preserves samples in excess
of full scale).
Sample rates and formats are converted using high-quality resampling and dithering.
Tracks with different sample rates or formats are converted automatically in real time.
Editing
Label tracks with selectable Sync-Lock Tracks feature for keeping tracks and labels synchronized.
Accessibility
Tracks and selections can be fully manipulated using the keyboard.
Excellent support for JAWS, NVDA and other screen readers on Windows, and for VoiceOver on
Mac.
Effects
Create voice-overs for podcasts or DJ sets using Auto Duck effect.
Plug-ins
Support for LADSPA, Nyquist, VST and Audio Unit effect plug-ins.
Effects written in the Nyquist programming language can be easily modified in a text editor - or
Analysis
second (frame/s) for old mechanical cameras to 120 or more frames per second for new professional
cameras. PAL (Europe, Asia, Australia, etc.) and SECAM (France, Russia, parts of Africa etc.) standards
specify 25 frame/s, while NTSC (USA, Canada, Japan, etc.) specifies 29.97 frame/s. Film is shot at the
slower frame rate of 24 photograms/s, which complicates slightly the process of transferring a cinematic
motion picture to video. The minimum frame rate to achieve a comfortable illusion of a moving image ...
early mechanical and CRT video displays without increasing the number of complete frames per second,
which would have required sacrificing image detail in order to remain within the limitations of a
narrow bandwidth. The horizontal scan lines of each complete frame are treated as if numbered
consecutively and captured as two fields: an odd field (upper field) consisting of the odd-numbered lines
Analog display devices reproduce each frame in the same way, effectively doubling the frame rate as far
as perceptible overall flicker is concerned. When the image capture device acquires the fields one at a
time, rather than dividing up a complete frame after it is captured, the frame rate for motion is effectively
doubled as well, resulting in smoother, more lifelike reproduction (although with halved detail) of rapidly
moving parts of the image when viewed on an interlaced CRT display, but the display of such a signal on a
progressive scan device is problematic.
NTSC, PAL and SECAM are interlaced formats. Abbreviated video resolution specifications often include
an i to indicate interlacing. For example, PAL video format is often specified as 576i50, where 576 indicates
the total number of horizontal scan lines, i indicates interlacing, and 50 indicates 50 fields (half-frames)
per second.
In progressive scan systems, each refresh period updates all of the scan lines of each frame in sequence.
When displaying a natively progressive broadcast or recorded signal, the result is optimum spatial
resolution of both the stationary and moving parts of the image. When displaying a natively interlaced
signal, however, overall spatial resolution will be degraded by simple line doubling, and artifacts such as
flickering or "comb" effects in moving parts of the image will be seen unless special signal processing is
applied to eliminate them. A procedure known as deinterlacing can be used to optimize the display of an
interlaced video signal from an analog, DVD or satellite source on a progressive scan device such as
an LCD television, digital video projector or plasma panel. Deinterlacing cannot, however, produce video
Video compression
Uncompressed video delivers maximum quality, but with a very high data rate. A variety of methods are
used to compress video streams, with the most effective ones using a Group Of Pictures (GOP) to reduce
spatial and temporal redundancy. Broadly speaking, spatial redundancy is reduced by registering
differences between parts of a single frame; this task is known as intraframe compression and is closely
related to image compression. Likewise, temporal redundancy can be reduced by registering differences
between frames; this task is known as interframe compression, including motion compensation and other
techniques. The most common modern standards are MPEG-2, used for DVD, Blu-ray and satellite
television, and MPEG-4, used for AVCHD, mobile phones (3GP) and the Internet.
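The idea of reducing temporal redundancy can be sketched in a few lines of Python. This is only a toy illustration, not how MPEG works internally: it assumes every frame is a flat list of pixel values of the same length, keeps the first frame in full, and for each later frame stores only the pixels that differ from the previous frame. The function names are made up for this example.

```python
def encode_frames(frames):
    """Keep the first frame whole, then store only (index, new_value) changes."""
    encoded = [("key", list(frames[0]))]
    for previous, current in zip(frames, frames[1:]):
        changes = [
            (i, new)
            for i, (old, new) in enumerate(zip(previous, current))
            if new != old
        ]
        encoded.append(("delta", changes))
    return encoded


def decode_frames(encoded):
    """Rebuild every frame by applying each set of changes to the frame before it."""
    frames = []
    for kind, data in encoded:
        if kind == "key":
            frame = list(data)
        else:
            frame = list(frames[-1])
            for i, value in data:
                frame[i] = value
        frames.append(frame)
    return frames


# Three tiny 4-pixel frames: only one pixel changes, and only once.
frames = [[0, 0, 0, 0], [0, 9, 0, 0], [0, 9, 0, 0]]
encoded = encode_frames(frames)
print(encoded[1])  # ('delta', [(1, 9)])
print(encoded[2])  # ('delta', [])
```

When little changes between frames, the stored differences are far smaller than the frames themselves, which is where the saving comes from.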
Video formats
There are different layers of video transmission and storage, each with its own set of formats to choose
from.
For transmission, there is a physical connector and signal protocol. A given physical link can carry certain
"display standards" which specify a particular refresh rate, display resolution, and color space.
Many analog and digital recording formats are in use, and digital video clips can also be stored on
a computer file system as files which have their own formats. In addition to the physical format used by
the data storage device or transmission medium, the stream of ones and zeros that is sent must be in a
Multimedia Container Format (MCF) was the first project to create an open and flexible media container
format that could encapsulate multiple video, audio and subtitle streams in one file. The project was
started in 2000 as an attempt to improve the aging AVI format. At first the project generated some
confusion about its intended goals. This was solved when the lead developer created a simple player for
the format which supported embedded subtitles, which sparked interest and the community began to
grow. Several new features were added and the specification refined.
One of the objectives of the new format was to simplify its handling by players. This was to be done by
making it feature-complete, eliminating the need for third-party extensions and actively discouraging
them. Because of the simple, fixed structure, the time required to read and parse the header information
was minimal. The small size of the header (2.5 kB), which at the same time contained all the important
data, facilitated quick scanning of collections of MCF files, even over slow network links.
The key feature of MCF was being able to store several chapters of video, menus, subtitles in several
languages and multiple audio streams (e.g. for different languages) in the same file. At the same time, the
content could be split between several files called segments; assembling the segments into a complete
movie was automatic, given the segments were all present. Segments could also be played separately,
We have seen the various ways that you can reduce the size of files. We have also seen that humans have
a limit to the frequencies that they can perceive, so what sampling rate would be needed to store only the
samples that humans can perceive? The full range of human hearing is between 20 Hz and 20 kHz. To capture
a frequency, the sampling rate must be at least twice that frequency, so a rate of at least 40 kHz is needed.
As you can see, we have some serious issues with the size of sound files. Take a look at the size of a 3
minute pop song recorded at a sample rate of 44 kHz and a sample resolution of 16 bits.
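The size of the raw recording can be worked out as sample rate x sample resolution x length in seconds. A minimal sketch of that arithmetic follows; it assumes a single (mono) channel, which the notes do not state, and the function name is made up for this example.

```python
def raw_audio_size_bytes(sample_rate_hz, bits_per_sample, seconds, channels=1):
    """Size of uncompressed audio: every sample of every channel is stored in full."""
    total_bits = sample_rate_hz * bits_per_sample * seconds * channels
    return total_bits // 8  # 8 bits in a byte


size = raw_audio_size_bytes(44_000, 16, 3 * 60)
print(size)              # 15840000 bytes
print(size / 1_000_000)  # 15.84 (megabytes, taking 1 MB as 1,000,000 bytes)
```

A stereo recording would need `channels=2` and would be twice this size.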
As you are probably aware, an MP3 of the same length would be roughly 3 MB, a fifth of the size. So what
gives? It is easy to see that the raw file sizes for sounds are just too big to store and transmit easily. What
is needed is a way to compress them.
Lossless
Lossless compression - compression that doesn't lose any accuracy and can be decompressed into an
identical copy of the original data.
WAV files don't involve any compression at all and will be the size of files that you have calculated
already. There are lossless compressed file formats out there such as FLAC which compress the WAV file
into data generally around 50% of the original size. A simple technique that illustrates the idea is run length encoding (RLE), which looks for
repeated patterns in the sound file, and instead of recording each pattern separately, it stores information
on how many times the pattern occurs in a row. Let us take a hypothetical set of sample points:
0000000000000000000001234543210000000000000000000123456787656789876
As you can see, the silent area takes up a large part of the file. Instead of recording these samples individually, we
can store a count of how many silent samples there are in a row, massively reducing the file size:
(21-0)123454321(17-0)123456787656789876
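The encoding above can be sketched in Python. This is a minimal illustration rather than what any real audio format does: it assumes the samples are single characters, writes a run as (count-value) only when that is shorter than the run itself, and uses function names invented for this example. The test string is built to match the encoded form shown above (21 zeros, then 17 zeros).

```python
import re
from itertools import groupby


def rle_encode(samples):
    """Replace long runs of one value with a (count-value) token."""
    out = []
    for value, run in groupby(samples):
        count = len(list(run))
        token = f"({count}-{value})"
        # Only use the token when it actually saves space.
        out.append(token if len(token) < count else value * count)
    return "".join(out)


def rle_decode(encoded):
    """Expand every (count-value) token back into the original run."""
    return re.sub(
        r"\((\d+)-(.)\)",
        lambda match: match.group(2) * int(match.group(1)),
        encoded,
    )


samples = "0" * 21 + "123454321" + "0" * 17 + "123456787656789876"
encoded = rle_encode(samples)
print(encoded)                         # (21-0)123454321(17-0)123456787656789876
print(len(samples), len(encoded))      # 65 39
print(rle_decode(encoded) == samples)  # True
```

Because decoding gives back exactly the original samples, nothing is lost, which is what makes this a lossless method.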
Lossy
FLAC files are still very large. What is needed is a format that allows you to create much smaller files
that can be easily stored on your computer and portable music device, and easily transmitted across the
internet.
Lossy compression - compression that loses some accuracy; the files are generally smaller than with lossless compression
As we have already seen, to make smaller audio files we can decrease the sampling rate and the sampling
resolution, but we have also seen the dreadful effect this can have on the final sound. There are other
There are many lossy compressed audio formats out there including: MP3, AAC and OGG (which is open
source). The compression works by reducing accuracy of certain parts of sound that are considered to be
beyond the auditory resolution ability of most people. This method is commonly referred to as perceptual
coding. It uses psychoacoustic models to discard or reduce precision of components less audible to
human hearing, and then records the remaining information in an efficient manner. Because the accuracy
of certain frequencies is lost, you can often tell the difference between the original and the lossy versions,
being able to hear the loss of high and low pitch tones.
If the first image uses 1 bit to store the color for each pixel, then the image size would be:
If the second image uses 2 bits to store the color for each pixel, then the image size would be:
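The calculation is the same in both cases: number of pixels x bits per pixel. The pixel dimensions of the two images are not given in the text, so the sketch below uses a hypothetical 10 x 10 image purely to show the arithmetic; the function names are also made up for this example.

```python
def image_size_bits(width_px, height_px, bits_per_pixel):
    """Every pixel needs the same number of bits to store its color."""
    return width_px * height_px * bits_per_pixel


def colors_available(bits_per_pixel):
    """Each extra bit doubles the number of colors a pixel can take."""
    return 2 ** bits_per_pixel


# Hypothetical 10 x 10 pixel image (the real dimensions are not stated in the notes).
print(image_size_bits(10, 10, 1), colors_available(1))  # 100 2
print(image_size_bits(10, 10, 2), colors_available(2))  # 200 4
```

Doubling the bits per pixel doubles the file size, whatever the dimensions are.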
Chapter: 1.2 Communication and Internet
technologies
The history of computing started off with centralized computers (in many cases mainframes) or servers
performing all the calculations. Client computers were then attached to these centralized computers
(servers) and if you wanted to calculate something, you would have to wait for the central computer to
respond. As computing power got cheaper, client nodes became more powerful and the central computer
less important. However, with the growth of the internet, there has been a shift back to a client-server
model. Powerful central computers store information such as emails, documents, music and videos or
offer services such as file hosting, printing, game hosting and internet access; client computers fetch
information and use services from these central servers. In the next few years you are likely to see more
and more software moving away from running on your desktop to running on remote servers, with you
accessing it as a client. This is called software as a service.
As an example of the modern client-server model, consider a video sharing website. The website, let's call it
mutube, has a server that stores all the videos that are uploaded to the site. The website is used by
millions of clients a day and each of them connects to the server to watch videos. When a client connects
to the mutube server and asks for a particular video, the server loads the video into RAM from a large
array of hard disks and sends the video to the client. The user of the client, on receiving the video, presses
play and watches the video.
Another example of a server might be a shared printing service in a college. The print server will be hosted
on a single computer, and when anyone in the building wants to print, the request is sent to the server. In
this case the server will keep track of how much printing credit each user has and make sure that the print
queue is dealt with properly.
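The two jobs of the print server described above (tracking credit and managing the queue) can be sketched as a small Python class. This is a simplified model with no networking; the class name, the method names and the idea of measuring credit in pages are all assumptions made for this example.

```python
from collections import deque


class PrintServer:
    def __init__(self, credits):
        self.credits = dict(credits)  # pages each user is still allowed to print
        self.queue = deque()          # waiting jobs, first in, first out

    def submit(self, user, document, pages):
        """A client sends a request; the server accepts or refuses it."""
        if self.credits.get(user, 0) < pages:
            return False              # not enough credit, job refused
        self.credits[user] -= pages
        self.queue.append((user, document))
        return True

    def print_next(self):
        """Take the oldest waiting job off the queue, or None if it is empty."""
        return self.queue.popleft() if self.queue else None


server = PrintServer({"ann": 5, "bob": 1})
print(server.submit("ann", "essay.doc", 3))  # True
print(server.submit("bob", "notes.doc", 2))  # False
print(server.credits["ann"])                 # 2
print(server.print_next())                   # ('ann', 'essay.doc')
```

All the clients share one queue and one record of credit because both live on the server, not on the clients.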
The current client-server model is starting to change, with companies being less likely to offer services
with a centralized server. Increasingly, internet firms are reaching a global clientele, so it makes little sense to
have a server or servers based in one location: if your servers are in America and some of your users are in
Armenia, these users will experience slow access to your services. Another problem is that if a power cut
affects your server, or the connection to that one server or set of servers goes down, then the service you
are offering over the internet will also stop.
With cloud computing the services may be distributed all over the globe, meaning that wherever you are,
you'll have a server reasonably close to you offering access to the data and services you need. It also
means that if one server goes down, other servers in different locations can keep the service running.
Keeping databases synchronized across the globe, so your mail client has the same mails in Switzerland
as in Swaziland, is a complex task and firms such as Amazon and Rackspace offer services to help you
handle this. One downside with cloud computing is you are never quite sure where your data is, and if
you're not careful you might find data being stored in countries that have less stringent data protection
laws than your own.
Servers are software programs that in most cases run off normal computing hardware. Server software
includes:
Printing
File sharing
Game hosting
Websites
Other web services
Clients are software programs and processes that connect to servers, sending requests and receiving
responses. Client examples include:
Definition: The term “WWW” refers to the “World Wide Web” or simply the Web. The World Wide Web
consists of all the public Web sites connected to the Internet worldwide, including the client devices (such
as computers and cell phones) that access Web content. The WWW is just one of many applications of the
Internet and computer networks.
Researcher Tim Berners-Lee led the development of the original World Wide Web in the late 1980s and
early 1990s. He helped build prototypes of the above Web technologies and coined the term "WWW." Web
sites and Web browsing exploded in popularity during the mid-1990s.
The Internet is named for "interconnection of computer networks". It is a massive hardware combination
of millions of personal, business, and governmental computers, all connected like roads and highways.
The Internet started in the 1960s under the original name "ARPAnet". ARPAnet was originally an
experiment in how the US military could maintain communications in case of a possible nuclear strike.
With time, ARPAnet became a civilian experiment, connecting university mainframe computers for
academic purposes. As personal computers became more mainstream in the 1980s and 1990s, the
Internet grew exponentially as more users plugged their computers into the massive network. Today, the
Internet has grown into a public spider web of millions of personal, government, and commercial
computers, all connected by cables and by wireless signals.
No single person owns the Internet. No single government has authority over its operations. Some
technical rules and hardware/software standards enforce how people plug into the Internet, but for the
most part, the Internet is a free and open broadcast medium of hardware networking.
When you want to send a message or retrieve information from another computer, the TCP/IP protocols
are what make the transmission possible. Your request goes out over the network, hitting domain name
servers (DNS) along the way to find the target server. The DNS points the request in the right direction.
Once the target server receives the request, it can send a response back to your computer. The data might
travel a completely different path to get back to you. This flexible approach to data transfer is part of what
makes the Internet such a powerful tool.
Definition: A network gateway is an internetworking system capable of joining together two networks that
use different base protocols. A network gateway can be implemented completely in software, completely
in hardware, or as a combination of both. Depending on the types of protocols they support, network
gateways can operate at any level of the OSI model.
Because a network gateway, by definition, appears at the edge of a network, related capabilities
like firewalls tend to be integrated with it. On home networks, a broadband router typically serves as the
network gateway although ordinary computers can also be configured to perform equivalent functions.
Definition: Routers are small physical devices that join multiple networks together. Technically, a router is
a Layer 3 gateway device, meaning that it connects two or more networks and that the router operates at
the network layer of the OSI model.
Home networks typically use a wireless or wired Internet Protocol (IP) router, IP being the most common
OSI network layer protocol. An IP router such as a DSL or cable modem broadband router joins the
home's local area network (LAN) to the wide-area network (WAN) of the Internet.
By maintaining configuration information in a piece of storage called the routing table, wired or wireless
routers also have the ability to filter traffic, either incoming or outgoing, based on the IP addresses of
senders and receivers. Some routers allow a network administrator to update the routing table from a Web
browser interface. Broadband routers combine the functions of a router with those of a network switch
and a firewall in a single unit.
In general, all of the machines on the Internet can be categorized as two types: servers and clients. Those
machines that provide services (like Web servers or FTP servers) to other machines are servers. And the
machines that are used to connect to those services are clients. When you connect to Yahoo! at
www.yahoo.com to read a page, Yahoo! is providing a machine (probably a cluster of very large
machines), for use on the Internet, to service your request. Yahoo! is providing a server. Your machine, on
the other hand, is probably providing no services to anyone else on the Internet. Therefore, it is a user
machine, also known as a client. It is possible and common for a machine to be both a server and a client,
but for our purposes here you can think of most machines as one or the other.
A server machine may provide one or more services on the Internet. For example, a server machine might
have software running on it that allows it to act as a Web server, an e-mail server and an FTP server.
Clients that come to a server machine do so with a specific intent, so clients direct their requests to a
specific software server running on the overall server machine. For example, if you are running a Web
browser on your machine, it will most likely want to talk to the Web server on the server machine. Your
Telnet application will want to talk to the Telnet server, your e-mail application will talk to the e-mail
server, and so on...
DNS:
In relation to the Internet, the PSTN actually furnishes much of the Internet's long-distance infrastructure.
Because Internet service providers (ISPs) pay the long-distance providers for access to their infrastructure
and share the circuits among many users through packet-switching, Internet users avoid having to pay
Many cell phone service providers offer mobile Broadband Internet Access Services for smartphones,
basic phones, tablets, netbooks, USB modems, mobile hotspots and other wireless devices over their 3G
and 4G broadband networks.
One of the key elements that determine bandwidth is the physical nature of the cable being used. A signal
becomes weaker the further it travels along a cable and eventually dies out. The length of the
cable therefore limits the bandwidth of the link. For instance, the bandwidth of a broadband DSL connection to
the home is limited by the length of copper cable between the house and the nearest telephone
exchange.
COPPER WIRES
Twisted pair cabling is a type of wiring in which two conductors (wires) are twisted together for the
purposes of cancelling out electromagnetic interference from external sources or other twisted pairs. It
was invented by Alexander Graham Bell. Twisting the wires reduces interference, which allows a higher
bandwidth than the same wires would give untwisted.
Unshielded twisted pair or UTP cables are found in many local area networks and telephone systems. A
typical subset of these colours (white/blue, blue/white, white/orange, orange/white) shows up in most
UTP cables as shown above.
UTP cable is the most common cable used in computer networking and is often used in LAN because of
its relatively lower costs compared to optical fibre and coaxial cable. UTP is also finding increasing use in
video applications, primarily in security cameras.
Copper cable is adequate for network cable runs for up to 100 meters, but above that the signal becomes
too weak, therefore an alternative technology is needed.
FIBER OPTIC
Fiber optics is a technology that uses glass (or plastic) threads (fibers) to transmit data. A fiber optic
cable consists of a bundle of glass threads, each of which is capable of transmitting messages modulated
onto light waves.
1. Fiber optic cables have a much greater bandwidth than metal cables. This means that they can carry more
data.
2. Fiber optic cables are less susceptible than metal cables to interference.
3. Fiber optic cables are much thinner and lighter than metal wires.
4. Data can be transmitted digitally (the natural form for computer data) rather than as an analogue signal.
The main disadvantage of fiber optics is that the cables are expensive to install. In addition, they are more
fragile than wire and are difficult to splice.
Fiber optics is a particularly popular technology for local-area networks. In addition, telephone companies
are steadily replacing traditional telephone lines with fiber optic cables. In the future, almost all
communications will employ fiber optics.
Wireless networks:
Wireless network refers to any type of computer network that is not connected by cables of any kind. It is
a method by which homes, telecommunications networks and enterprise (business) installations avoid
the costly process of introducing cables into a building, or as a connection between various equipment
locations. Wireless telecommunications networks are generally implemented and administered using
radio waves as the transmission medium.
Radio waves are a form of electromagnetic radiation and come in a range of wavelengths. These waves are similar to an
ocean wave. Radio waves are used for many purposes. For example, they are used to broadcast TV and in
communication between satellites, and they enable computers to share information without wires.
Radio waves have a large wavelength so they experience less interference and can travel over large
distances. However, since they do not have a high frequency, they cannot transmit as much data.
Even so, they can carry more signals than wires; they are often used for linking buildings on a college
campus or corporate site and increasingly for longer distances as telephone companies update their
networks.
MICROWAVE
Microwave radio also carries computer network signals, generally as part of long-distance telephone
systems. Low-power microwave radio is becoming common for wireless networks within a building.
Microwaves are widely used for point-to-point communications because their small wavelength allows
conveniently-sized antennas to direct them in narrow beams, which can be pointed directly at the
receiving antenna. This allows nearby microwave equipment to use the same frequencies without
interfering with each other, as lower frequency radio waves do. Another advantage is that the high
frequency of microwaves gives the microwave band a very large information-carrying capacity; the
microwave band has a bandwidth 30 times that of all the rest of the radio spectrum below it. The
attenuation of microwaves is less than that of twisted pair or coaxial cable. A disadvantage is that microwaves are
limited to line of sight propagation; they cannot pass around hills or mountains as lower frequency radio
waves can. They are also affected by anything blocking the line of sight, such as rainfall.
SATELLITES
A satellite is any object that revolves around a planet in a circular or elliptical path. The moon is Earth's
natural satellite at 240,000 miles distant. Other satellites that fulfill this definition are man-made and have
been launched into orbit to carry out specific functions. These satellites are typically between 100 and
24,000 miles away. Satellites have many purposes including data communications, scientific applications
and weather analysis. Satellite transmission requires an unobstructed line of sight. The line of sight will be
between the orbiting satellite and a station on Earth. Satellite signals must travel in straight lines but do
not have the limitations of ground based wireless transmission, such as the curvature of the Earth.
Microwave signals from a satellite can be transmitted to any place on Earth which means that high quality
communications can be made available to remote areas of the world without requiring the massive
investment in ground-based equipment.
Millions of bits travel over thousands of computer networks every day. The system works much like the
modern post office, which has to constantly send and receive letters from all over the world. Like those
letters, computer bits arrive in a continuous, ordered stream known as the bit stream. The bits identify
where they are coming from (often a computer) and where they are traveling to (often another computer).
All the information sent to and from a computer turns into a series of 1s and 0s that represent data.
When the computer sends a message, the bits travel in a specific order through a wire to their destination.
Typically, the bit stream starts with information about where it's going and how to process the information
once it arrives. An email, for example, contains information on the sender, the recipient, and the message
itself. When the user sends it, it's broken down into bits of data which travel over the bit stream to the
recipient's computer.
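To make "broken down into bits" concrete, here is a small Python sketch that turns a text message into a string of 1s and 0s and back again. It assumes the text is encoded as UTF-8 with 8 bits per byte; it leaves out the addressing information a real network adds, and the function names are made up for this example.

```python
def to_bit_stream(message):
    """Encode text as bytes, then write each byte as 8 binary digits."""
    return "".join(f"{byte:08b}" for byte in message.encode("utf-8"))


def from_bit_stream(bits):
    """Read the stream back 8 bits at a time and decode the bytes as text."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


stream = to_bit_stream("Hi")
print(stream)                   # 0100100001101001
print(from_bit_stream(stream))  # Hi
```

The order of the bits matters: the receiver can only rebuild the message because the bits arrive in the same order in which they were sent.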
Video on demand (VOD) is a system that may allow users to select and watch/listen to video or
audio content when they choose to, rather than having to watch at a specific broadcast time (Live
streaming). Some TV VOD systems such as Netflix or Hulu allow users to watch their favorite shows
whenever they please.
Real time or Live streaming, as the name suggests, is streaming a video that is happening at that exact
moment. Examples may be a football match, a concert, or a lecture happening at your university.
Bitrate is a term used to describe the amount of data that is being passed within a given amount of time.
Depending on the context, common measurements of bitrate include kbps and Mbps, respectively
meaning kilobits per second and megabits per second. No matter the units being used, a higher number is
generally good, indicating high speed or high quality.
When it comes to Internet speeds, a higher bitrate is always desirable: it simply sends you the content
that you want faster. With higher bitrates, you can do more with your Internet connection: stream
high-definition movies, play online games with minimal lag, and download large files in just a few seconds. You
can figure out what bitrates you’re getting by visiting a website such as speedtest.net.
When talking about streaming audio or video, bitrates refer to the amount of data stored for each second
of media that is played. For example, a 320 kbps MP3 audio file has a higher quality than the same file at
just 128 kbps. The same applies to videos – a higher bitrate will have a higher quality when comparing the
same video with the same resolution. Bitrates should be expected to go up whenever the resolution goes
up, as more data is being processed. High bitrates for audio and video may therefore provide excellent
quality, but they can also place a major strain on your hardware and connection, which can result in stutters or frequent
pauses in the media being streamed if your connection's bitrate is not high enough or if there is too much traffic on
your line.
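Both meanings of bitrate lead to simple calculations, sketched below in Python. The 3 minute track length and the 8 Mbps connection speed are example figures chosen for illustration, the function names are made up, and kilo and mega are taken as 1,000 and 1,000,000.

```python
def media_size_bytes(bitrate_kbps, seconds):
    """Data stored for a clip: bits per second x seconds, then bits to bytes."""
    return bitrate_kbps * 1000 * seconds // 8


def download_seconds(size_bytes, link_mbps):
    """Time to move a file over a link of the given speed."""
    return size_bytes * 8 / (link_mbps * 1_000_000)


print(media_size_bytes(320, 180))      # 7200000 bytes for 3 minutes at 320 kbps
print(media_size_bytes(128, 180))      # 2880000 bytes for 3 minutes at 128 kbps
print(download_seconds(7_200_000, 8))  # 7.2 seconds on an 8 Mbps connection
```

The 320 kbps file holds two and a half times as much data per second of music as the 128 kbps file, which is where the extra quality comes from.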
For example, if you are ordering pizza by home delivery, you need to tell the delivery person your
exact home address. Similarly, to print a document on a network printer, your computer must know the IP address of that printer on your
network.
The designers of the Internet Protocol defined an IP address as a 32-bit number and this system, known
as Internet Protocol Version 4 (IPv4), is still in use today. However, due to the enormous growth of
the Internet and the depletion of vacant addresses, a new version of IP (IPv6), using 128 bits for the
address, was developed in 1995. Deployment of IPv6 has been ongoing since the mid-2000s.
IP addresses are binary numbers, but they are usually stored in text files and displayed in human-
readable notations, such as 172.16.254.1 (for IPv4), and 2001:db8:0:1234:0:567:8:1 (for IPv6).
IPv4 addresses are canonically represented in dot-decimal notation, which consists of four decimal
numbers, each ranging from 0 to 255, separated by dots, e.g., 172.16.254.1. Each part represents a group
of 8 bits (octet) of the address. In some cases of technical writing, IPv4 addresses may be presented in
various hexadecimal, octal, or binary representations.
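The link between the dotted form and the underlying 32-bit number can be checked with Python's standard ipaddress module. The sketch below uses the example address from the text; the helper function and the dictionary keys are made up for this example.

```python
import ipaddress


def ipv4_views(text):
    """Show one IPv4 address as a 32-bit integer, as binary octets and as hex."""
    address = ipaddress.IPv4Address(text)
    octets = [int(part) for part in str(address).split(".")]
    return {
        "integer": int(address),
        "binary": ".".join(f"{octet:08b}" for octet in octets),
        "hex": f"{int(address):08X}",
    }


views = ipv4_views("172.16.254.1")
print(views["binary"])  # 10101100.00010000.11111110.00000001
print(views["hex"])     # AC10FE01
```

Each of the four decimal numbers becomes exactly 8 bits (two hexadecimal digits), giving 32 bits in total.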
A public IP address is assigned to every computer that connects to the Internet, and each public IP is
unique: no two computers anywhere on the Internet can have the same public IP address at the same time.
This addressing scheme makes it possible for the computers to “find each other” online and
exchange information. Users have no control over the public IP address that is assigned to the
computer. The public IP address is assigned to the computer by the Internet Service Provider as soon as
the computer is connected to the Internet gateway.
A public IP address can be either static or dynamic. A static public IP address does not change and is
used primarily for hosting web pages or services on the Internet. On the other hand, a dynamic public IP
address is chosen from a pool of available addresses and changes each time one connects to the
Internet.
Most Internet users will only have a dynamic IP assigned to their computer, which is released when the
computer is disconnected from the Internet. When it is re-connected it may get a new IP.
You can check your public IP address by visiting www.whatismyip.com
An IP address is considered private if the IP number falls within one of the IP address ranges reserved for
private networks such as a Local Area Network (LAN). The Internet Assigned Numbers Authority (IANA)
has reserved the following three blocks of the IP address space for private networks (local networks):
10.0.0.0 to 10.255.255.255
172.16.0.0 to 172.31.255.255
192.168.0.0 to 192.168.255.255
Private IP addresses are used for numbering the computers in a private network, including home, school
and business LANs and those in airports and hotels, which makes it possible for the computers in the network to
communicate with each other.
For example, if a network X consists of 10 computers, they can be given the IP addresses
192.168.1.1 to 192.168.1.10. Unlike the public IP, the administrator of the private network is free to
assign an IP address of their own choice (provided the IP number falls in the private IP address range as
mentioned above).
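Whether an address falls in a private range can be checked in Python. In this sketch the three private IPv4 blocks are written out in CIDR form (10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16, the blocks set aside for private use in RFC 1918); the function name is made up for this example.

```python
import ipaddress

# The three IPv4 blocks reserved for private networks, in CIDR notation.
PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_private(address):
    """True if the address lies inside one of the three private blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in PRIVATE_BLOCKS)


print(is_private("192.168.1.10"))  # True
print(is_private("172.32.0.1"))    # False
print(is_private("8.8.8.8"))       # False
```

The second address shows why the range matters: 172.32.0.1 looks similar to a private address but lies just outside the 172.16.0.0 to 172.31.255.255 block.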
If the private network is connected to the Internet (through an Internet connection via ISP), then each
computer will have a private IP as well as a public IP. Private IP is used for communication within the
network whereas the public IP is used for communication over the Internet. Most Internet users with a
DSL connection will have both a private as well as a public IP.
You can find your private IP by typing the ipconfig command at the command prompt. The number that
you see against “IPv4 Address:” is your private IP, which in many home networks will be something like 192.168.1.1 or 192.168.1.2.
A private IP address may be set statically by the administrator or handed out dynamically by the router.
Explain how a Uniform Resource Locator (URL) is used to locate a resource on the World Wide Web
(WWW) and the role of the Domain Name Service
Every file on the Web has a URL (uniform resource locator). Whether it’s an HTML file, a photo file,
whatever, it has a URL. A file’s URL is its unique address on the Web. Just as a cell phone has a unique
telephone number, a file has a unique URL.
A user clicks the “Bouncing ball” link. What does the browser do? The message is sent to a server. Which
server? The one given in the URL of course. The URL of the desired page is
http://doomdogs.com/products/ball.html. The part “doomdogs.com” is the domain name of the
server. So that’s where the message is sent: to the server at doomdogs.com. The rest of the URL
“/products/ball.html” tells the server what data to return to the browser. That part of the URL matches a
file on the server. A server might have 50,000 files that it can send to a browser. /products/ball.html is the
particular file it should send.
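The way a browser splits a URL into these parts can be seen with Python's standard urllib.parse module. This sketch uses the doomdogs.com address from the example; the variable names are made up.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://doomdogs.com/products/ball.html")

print(parts.scheme)    # http  (the protocol to use)
print(parts.hostname)  # doomdogs.com  (which server to contact)
print(parts.path)      # /products/ball.html  (what to ask that server for)
```

The hostname decides where the request is sent, and the path is passed to that server so it knows which file to return.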
Let’s look at what happens when the user clicks the link.
Somehow, we want the Web server to use the URL the browser sent, to fetch the right file from its disk
drive. It needs to convert the URL into a file path, and then use the file path to read the file.
One of the settings in the file is DocumentRoot. The setting tells the web server where on the computer’s disk
drive the files for the Web site are stored. The Web master puts all the data for the doomdogs.com Web
site in the directory “/sites/dd”. They then set DocumentRoot to “/sites/dd”, so that the web server
knows where to get the files.
The server takes its DocumentRoot setting “/sites/dd” and appends the URL path “/products/ball.html”,
giving “/sites/dd/products/ball.html.”
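The translation from URL path to file path can be sketched in Python. This is a simplified illustration, not the code of any real web server: the function name is made up, and the normalization step (which stops a path such as "/../secret.txt" from climbing out of the document root) is an extra safety assumption that the text does not describe.

```python
import posixpath


def url_path_to_file_path(document_root, url_path):
    """Append the URL path to the document root to get a path on disk."""
    # Resolve any "." or ".." parts first so the result stays inside the root.
    cleaned = posixpath.normpath("/" + url_path.lstrip("/"))
    return posixpath.join(document_root, cleaned.lstrip("/"))


print(url_path_to_file_path("/sites/dd", "/products/ball.html"))
# /sites/dd/products/ball.html
print(url_path_to_file_path("/sites/dd", "/../secret.txt"))
# /sites/dd/secret.txt
```

Changing DocumentRoot moves the whole site to a different directory without changing any of its URLs.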
Now the web server knows where to get the file the browser asked for. We’re at step 3. The server has translated the URL to
a file path on the server computer’s disk drive.
The server reads the file, and sends its contents back to the browser. The browser renders the content for
Jake (6). Recall that rendering is the process of making a display from code in a file. And finally the
contents are displayed on the screen.
The Domain Name System (DNS) is a standard technology for managing public names of Web sites and
other Internet domains. DNS technology allows you to type names into your Web browser like
“soundcloud.com” and your computer to automatically find that address on the Internet. A key element of
the DNS is a worldwide collection of DNS servers.
A DNS server is any computer registered to join the Domain Name System. A DNS server runs special-
purpose networking software, features a public IP address, and contains a database of network names
and addresses for other Internet hosts.
Using a DNS server is similar to looking up a contact on your phone. To call a contact, you simply look up
that person’s name, but the name is of no use to the phone itself; it has to look up the contact’s number
and dial that.
Simply speaking, both systems translate a name (of a website or a contact) into a number (an IP address
or a phone number).
When a DNS server receives a request not in its database (such as a geographically distant or rarely
visited Web site), it temporarily transforms from a server to a DNS client. The server automatically passes
that request to another DNS server or up to the next higher level in the DNS hierarchy as needed.
Eventually the request arrives at a server that has the matching name and IP address in its database (all
the way to the root level if necessary), and the response flows back through the chain of DNS servers to
your computer.
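The lookup and the passing-on of unknown names can be sketched as follows. This is a simplified model, not how real DNS software is written: makeDnsServer is a hypothetical helper, and all names and addresses are made up (the 192.0.2.x range is reserved for documentation).

```javascript
// Each hypothetical DNS server has its own small database and may know a
// "parent" server to ask when a name is missing from that database.
function makeDnsServer(records, parent) {
  return {
    resolve(name) {
      if (records.has(name)) return records.get(name); // answer from own database
      if (parent) return parent.resolve(name); // act as a client of the next server
      return null; // nobody in the chain knows the name
    },
  };
}

const rootServer = makeDnsServer(new Map([["example.com", "192.0.2.10"]]), null);
const localServer = makeDnsServer(new Map([["intranet.example", "192.0.2.20"]]), rootServer);

console.log(localServer.resolve("intranet.example")); // 192.0.2.20
console.log(localServer.resolve("example.com")); // 192.0.2.10
console.log(localServer.resolve("unknown.example")); // null
```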
Scripts
A script is a set of instructions. For Web pages they are instructions either to the Web browser (client-side
scripting) or to the server (server-side scripting). These are explained more below.
Scripts make a Web page change. Think of some Web pages you have visited. Any page which
changes each time you visit it (or during a visit) probably uses scripting.
All log-on systems, some menus, almost all photograph slideshows and many other pages use scripts.
Google uses scripts to fill in your search term for you, to place advertisements, to find the thing you are
searching for and so on. Amazon uses scripting to list products and record what you have bought.
Client-side
The client is the system on which the Web browser is running. JavaScript is the main client-side scripting
language for the Web. Client-side scripts are interpreted by the browser. The process with client-side
scripting is:
So client-side scripting is used to make Web pages change after they arrive at the browser. It is useful for
making pages a bit more interesting and user-friendly. It can also provide useful gadgets such as
calculators, clocks etc. but on the whole is used for appearance and interaction.
Client-side scripts rely on the user's computer. If that computer is slow they may run slowly. They may not
run at all if the browser does not understand the scripting language. As they have to run on the user's
system the code which makes up the script is there in the HTML for the user to look at (and copy or
change).
Server-side
The server is where the Web page and other content lives. The server sends pages to the user/client on
request. The process is:
The use of HTML forms or clever links allows data to be sent to the server and processed. The results may
come back as a second Web page.
Server-side scripting tends to be used for allowing users to have individual accounts and providing data
from databases. It allows a level of privacy, personalization and provision of information that is very
powerful. E-commerce, MMORPGs and social networking sites all rely heavily on server-side scripting.
PHP and ASP.NET are two of the main technologies for server-side scripting.
The script is interpreted by the server meaning that it will always work the same way. Server-side scripts
are never seen by the user (so they can't copy your code). They run on the server and generate results
which are sent to the user. Running all these scripts puts a lot of load onto a server but none on the user's
system.
The combination
A site such as Google, Amazon, Facebook or StumbleUpon will use both types of scripting:
server-side handles logging in, personal information and preferences and provides the specific
data which the user wants (and allows new data to be stored)
client-side makes the page interactive, displaying or sorting data in different ways if the user asks
for that by clicking on elements with event triggers
Each statement is executed by the browser in the sequence in which it is written.
A block starts with a left curly bracket and ends with a right curly bracket.
JavaScript operators
Comparison operators (== Equal to, === Exactly equal to, != Not equal to, <,
>, <=, >=)
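The difference between “equal to” and “exactly equal to” is easiest to see with a number and a string. A short illustrative sketch (the variable names are made up):

```javascript
// == converts the operands to a common type before comparing;
// === also requires the types to match.
const looseResult = 5 == "5";
const strictResult = 5 === "5";

console.log(looseResult); // true
console.log(strictResult); // false
console.log(5 != "5"); // false
console.log(5 <= 7); // true
```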
Alert Box:
An alert box is often used if you want to make sure information comes through to the user.
When an alert box pops up, the user will have to click “OK” to proceed.
Syntax: alert("sometext");
Confirm Box:
A confirm box is often used if you want the user to verify or accept something.
When a confirm box pops up, the user will have to click either “OK” or “cancel” to proceed.
If the user clicks “OK”, the box returns true. If the user clicks “cancel”, the box returns false.
Syntax: confirm("sometext");
Prompt Box:
A prompt box is often used if you want the user to input a value before entering a page.
When a prompt box pops up, the user will have to click either “OK” or “cancel” to proceed after
entering an input value.
If the user clicks “OK”, the box returns the input value. If the user clicks “cancel”, the box returns
null.
Syntax: prompt("sometext", "defaultvalue");
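The three boxes are often used together. The sketch below shows how their return values can drive a script; greetVisitor and the stand-in dialogs are hypothetical and exist only so that the example also runs outside a browser. In a page, the window object could be passed in instead.

```javascript
// Hypothetical helper. "dialogs" is any object offering prompt, confirm and
// alert; in a browser the window object can be passed in.
function greetVisitor(dialogs) {
  const name = dialogs.prompt("Please enter your name", "Guest");
  if (name === null) {
    return "cancelled"; // the user clicked "cancel" in the prompt box
  }
  if (dialogs.confirm("Show a greeting for " + name + "?")) {
    dialogs.alert("Hello " + name + "!");
    return "greeted"; // the user clicked "OK" in the confirm box
  }
  return "declined"; // the user clicked "cancel" in the confirm box
}

// Stand-in dialogs that answer automatically, so no browser is needed.
const shownMessages = [];
const fakeDialogs = {
  prompt: () => "Jake",
  confirm: () => true,
  alert: (text) => shownMessages.push(text),
};
const outcome = greetVisitor(fakeDialogs);
console.log(outcome); // greeted
```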
JavaScript functions
A function contains code that will be executed by an event or by a call to the function.
This gives us an overview of what event handling is, what its problems are and how to write proper
cross-browser scripts.
Event modeling
This is an example of event modeling in which we display the date when a button is clicked:
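A minimal sketch of such a handler is given below. It is an assumption about what the example looks like, not the original listing: a plain EventTarget stands in for the button so that the code runs without a page, whereas in a browser the button would be an element obtained with document.getElementById.

```javascript
// A plain EventTarget stands in for the button element.
const button = new EventTarget();
let displayedDate = "";

// Event handler: runs each time a "click" event reaches the button.
button.addEventListener("click", () => {
  displayedDate = new Date().toString();
});

button.dispatchEvent(new Event("click")); // simulate the user clicking
console.log(displayedDate);
```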
Uses of PHP:
Performs system functions, i.e. it can create, open, read, write and close files on a system.
Can handle forms, i.e. gather data from files, save data to a file, send data through email, and
return data to the user.
The PHP parsing engine needs a way to differentiate PHP code from other elements in the page.
The mechanism for doing so is known as “escaping to PHP”.
Before the browser sends the information, it encodes it using a scheme called URL encoding.
In this scheme, name/value pairs are joined with equal signs and different pairs are separated by
the ampersand sign (&).
Spaces are replaced with the “+” character and any other non-alphanumeric
characters are replaced with hexadecimal values.
The GET method sends the encoded user information appended to the page request.
The page and the encoded information are separated by the “?” character.
“http://www.test.com/index.htm?name1=value1&name2=value2”
The GET method produces a long string that appears in your server logs and in the browser’s “Location:”
box.
Never use the GET method if you have a password or other sensitive information to be sent to the server.
It can’t be used to send binary data, like images or Word documents, to the server.
The data sent by the GET method can be accessed using the QUERY_STRING environment
variable.
PHP provides the $_GET associative array to access all the information sent using the GET
method.
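The encoding itself can be tried out in JavaScript with the standard URLSearchParams class, which applies the same form-encoding rules. This is only a sketch; the field names and values are made up for illustration.

```javascript
// Encode two name/value pairs the way a browser encodes form data.
const formData = new URLSearchParams({ name: "Jake Smith", dept: "R&D" });
const queryString = formData.toString();
console.log(queryString); // name=Jake+Smith&dept=R%26D

// GET appends the encoded data to the page request after a "?".
const requestUrl = "http://www.test.com/index.htm?" + queryString;

// Decoding on the receiving side recovers the original values.
const received = new URL(requestUrl).searchParams;
console.log(received.get("name")); // Jake Smith
console.log(received.get("dept")); // R&D
```

The space becomes “+” and the ampersand inside the value becomes the hexadecimal code %26, so it cannot be confused with the ampersand that separates the pairs.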
<?php
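// Run only when the form has been submitted with at least one field.
if (isset($_GET["name"]) || isset($_GET["age"]))
{
    // Escape the input before echoing it back into the page.
    $name = htmlspecialchars($_GET["name"] ?? "");
    $age = htmlspecialchars($_GET["age"] ?? "");
    echo "Welcome " . $name . "<br />";
    echo "You are " . $age . " years old.";
    exit();
}
?>
<html>
<body>
<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>" method="GET">
Name: <input type="text" name="name" />
Age: <input type="text" name="age" />
<input type="submit" />
</form>
</body>
</html>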
The POST method transfers information in the body of the HTTP request rather than in the URL. The
information is encoded as described for the GET method.
The POST method does not have any restriction on data size to be sent.
The POST method can be used to send ASCII as well as binary data.
The data sent by the POST method travels inside the HTTP request, so its security depends on the HTTP
protocol. By using Secure HTTP (HTTPS) you can protect the information while it is in transit.
PHP provides the “$_POST” associative array to access all the information sent using the POST
method.
<?php
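// Run only when the form has been submitted with at least one field.
if (isset($_POST["name"]) || isset($_POST["age"]))
{
    $name = htmlspecialchars($_POST["name"] ?? "");
    $age = htmlspecialchars($_POST["age"] ?? "");
    echo "Welcome " . $name . "<br />";
    echo "You are " . $age . " years old.";
    exit();
}
?>
<html>
<body>
<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>" method="POST">
Name: <input type="text" name="name" />
Age: <input type="text" name="age" />
<input type="submit" />
</form>
</body>
</html>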
PHP variables
All variables in PHP are denoted with a leading dollar sign “$”.
PHP does a good job of automatically converting types from one to another when necessary.
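NULL: a special type that has only one value, NULL.
if...else statement:
Use this statement if you want to execute some code if a condition is true and other code if the condition
is false.
Example:
if (condition)
    code to be executed if the condition is true;
else
    code to be executed if the condition is false;
else if statement:
If you want to execute some code if one of several conditions is true, use this statement.
Example:
if (condition)
    code to be executed if this condition is true;
else if (condition)
    code to be executed if this second condition is true;
else
    code to be executed if neither condition is true;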
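If you want to select one of many blocks of code to be executed, use the switch statement.
Example:
switch (expression)
{
case label1:
    code to be executed if expression = label1;
    break;
case label2:
    code to be executed if expression = label2;
    break;
default:
    code to be executed if expression matches no label;
}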
Chapter: 1.3 Hardware
A computer is an electronic machine that accepts data, stores it and processes it into information. The
computer is able to work because there are instructions in its memory directing it. The instructions that
direct the computer are called software or computer programs. The physical parts of the computer that
you can see and touch, such as the keyboard, monitor and mouse, are called hardware. There are four major
categories of computer hardware:
Input devices:
An input device is any hardware component that allows the user to enter data or instructions into the
computer. There are many manual and automatic input devices. The most widely used input devices are:
Keyboard
Pointing devices
o Trackerball mouse
o Laser mouse
2D/3D Scanners
The keyboard is one of the most popular ways of inputting information into a computer. The basic
mechanical keyboard relies on spring-loaded keys being pressed down to complete an electrical circuit. This
circuit then transmits a binary signal (commonly using ASCII) to the computer to represent the key
pressed.
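To see what such a binary signal might look like, the sketch below prints the 8-bit ASCII pattern for a character. asciiBits is a hypothetical helper written for this illustration; a real keyboard sends its own key codes, which the computer then maps to characters.

```javascript
// Hypothetical helper: the 8-bit binary pattern for a character's ASCII code.
function asciiBits(character) {
  return character.charCodeAt(0).toString(2).padStart(8, "0");
}

console.log(asciiBits("A")); // 01000001
console.log(asciiBits("a")); // 01100001
```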
Scanner
A scanner creates a digital photograph of a paper document. It scans the illuminated surface of the
document with a single row of hundreds of light sensors. Each sensor produces an analogue signal that
depends on the intensity of the light it receives. The scanner’s embedded computer repeatedly reads the
signals from the sensors as they move across the document. The embedded computer then digitizes and
processes the signals and sends them to the computer.
A trackball is a pointing device consisting of a ball held by a socket containing sensors to detect a rotation
of the ball about two axes—like an upside-down mouse with an exposed protruding ball. The user rolls the
ball with the thumb, fingers, or the palm of the hand to move a pointer. Compared with a mouse, a
trackball has no limits on effective travel; at times, a mouse can reach an edge of its working area while
the operator still wishes to move the screen pointer farther. With a trackball, the operator just continues
rolling, whereas a mouse would have to be lifted and re-positioned. Some trackballs, such as Logitech's
optical-pickoff types, have notably low friction, as well as being dense (glass), so they can be spun to
make them coast. The trackball's buttons may be positioned like those of a mouse or in a unique layout
that suits the user.
A mouse allows the user to point by moving the cursor in a graphical user interface on a PC’s screen. The
optical mouse actually uses a tiny camera to take 1,500 pictures every second. Able to work on almost
any surface, the mouse has a small, red light-emitting diode (LED) that bounces light off that surface onto
a complementary metal-oxide semiconductor (CMOS) sensor.
The CMOS sensor sends each image to a digital signal processor (DSP) for analysis. The DSP, operating
at 18 MIPS (million instructions per second), is able to detect patterns in the images and see how those
patterns have moved since the previous image. Based on the change in patterns over a sequence of
images, the DSP determines how far the mouse has moved and sends the corresponding coordinates to
the computer. The computer moves the cursor on the screen based on the coordinates received from the
mouse. This happens hundreds of times each second, making the cursor appear to move very smoothly.
Sensor:
A sensor measures a specific physical property and sends a signal to the computer. Sensors can produce a
stream of input data automatically without any human intervention. Usually this is an analogue signal, so
it needs to be converted into digital data for the computer to process. This is done by an
Analogue-to-Digital Converter (ADC).
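The conversion step can be sketched as follows. This assumes an 8-bit converter and a voltage range chosen purely for illustration; analogueToDigital is a hypothetical helper, and real converters differ in range and resolution.

```javascript
// Hypothetical 8-bit ADC: maps a voltage between 0 and maxVolts to an
// integer between 0 and 255.
function analogueToDigital(volts, maxVolts) {
  const levels = 255; // largest value an 8-bit number can hold
  const clamped = Math.min(Math.max(volts, 0), maxVolts); // keep within range
  return Math.round((clamped / maxVolts) * levels);
}

console.log(analogueToDigital(0, 5)); // 0
console.log(analogueToDigital(2.5, 5)); // 128
console.log(analogueToDigital(5, 5)); // 255
```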
Sensors are used extensively in monitoring / measuring / data logging systems, and also in computer
control systems. Following is the list of commonly used sensors:
Temperature
Magnetic Field
Gas
Pressure
Moisture
Humidity
pH/Acidity/Alkalinity
Motion/ Infra-Red
The disadvantage of using sensors is that they may need a power supply to work and may need regular
calibration to check their accuracy.
Temperature Sensor:
A temperature sensor produces a signal that depends on the temperature of its surroundings. The
computer processes the digitized signal to display a measurement or to control an appliance.
Pressure sensor:
A pressure sensor produces a signal that depends on the pressure to which it is exposed. Pressure sensors
can be used in many appliances, such as an automatic blood pressure monitor. A pressure sensor can also
be used to control the pressure of gases or liquids in a chemical reaction vessel.
Magnetic Field:
The Magnetic Field Sensor can be used to study the field around permanent magnets, coils, and electrical
devices. This sensor uses a Hall effect transducer, and measures a vector component of the magnetic
field near the sensor tip. It has two ranges, allowing for measurement of relatively strong magnetic fields
around permanent magnets and electromagnets, as well as measurement of weak fields such as the
Earth’s magnetic field. The articulated sensor tip allows you to measure both transverse and longitudinal
magnetic fields.
Gas:
A gas sensor produces a signal depending on the concentration of a particular gas or vapour. We can use
a gas sensor for an inflammable gas to monitor the atmosphere and sound an alarm if there is a leak.
We can use gas sensors in other applications such as:
A breathalyser, which measures the concentration of alcohol vapour in a sample of breath and
estimates the concentration of alcohol in the blood.
Process control in chemical industry.
Environment monitoring of air pollution.
Moisture sensor:
A moisture sensor produces a signal that depends on the concentration of water vapour in the
atmosphere. A moisture sensor can control an irrigation system more efficiently by only allowing watering
when the soil is dry. We can use moisture sensors in many other applications, including:
Controlling a heating system and air conditioning system.
Maintaining sufficient humidity in the air in a greenhouse.
Measuring humidity for meteorological record and forecasting in a weather station.
pH/acidity/alkalinity sensor:
A pH sensor measures the pH of aqueous solutions in industrial and municipal process applications. It is
designed to perform in the harshest of environments, including applications that poison conventional pH
sensors. Typical uses include:
Acid-base titrations
Studies of household acids and bases
Monitoring pH change during chemical reactions or in an aquarium as a result of photosynthesis
Investigations of acid rain and buffering
Analysis of water quality in streams and lakes
Infrared sensor:
An infrared (IR) sensor produces a signal that depends on the level of invisible IR radiation falling on it.
All objects (unless they are extremely cold) emit significant IR radiation. A security camera equipped with
a lens and a grid of IR sensors uses this IR radiation to detect a person.
An output device is a piece of hardware that is used to display or output data which has been processed
or has been stored on the computer.
There are many different kinds of output device, such as inkjet, laser and 3D printers; 2D and 3D cutters;
speakers and headphones; actuators; flat panel display screens including Liquid Crystal Display (LCD)
and Light-Emitting Diode (LED); LCD projectors and Digital Light Projectors (DLP).
Printer:
A printer is an output device that prints characters and graphics on paper or other materials.
Laser Printer:
A laser printer scans a laser across a drum to print with powdered ink, known as toner. The printer places
an even, negative, static charge on a photoconductive drum. It scans a very narrow laser beam across the
surface of the rotating drum. The laser beam causes the negative charge to leak away wherever it shines
on the drum. The drum revolves past a supply of toner which is also charged negatively. The toner is
attracted onto those regions of the drum’s surface where no charge remains. Toner particles are repelled
by those regions that remain charged because they were not lit by the laser’s beam. The printer rapidly
switches the beam on and off to draw the required pattern of output. A roller presses a sheet of paper
against the rotating drum and the toner particles transfer to the paper.
Another roller presses the paper against a heated ‘fuser’ roller. The heated toner melts and bonds to the
paper, producing a printed copy. If there are four drums with four different colors of toner, the printer can
print in color.
Speakers are one of the most common output devices used with computer systems. The purpose of
speakers is to produce audio output that can be heard by the listener.
Speakers are transducers that convert electrical signals into sound waves. Speakers use magnets to
convert electricity into sound waves. This is a basic principle of physics.
Sound is made when an object makes the particles around it vibrate. These vibrations travel through the
air and reach your ears. Our brain interprets this motion as sound. High frequencies of sound are made
when the vibrations are close together (a short wavelength). Low frequencies occur when they are farther
apart. The amplitude of the vibrations determines the level of volume you hear.
To make these vibrations, speakers have a set of magnets. One of them is called the permanent magnet. It
doesn’t move or change polarity and is made of a magnetic metal like iron. The other magnet is an
electromagnet. It is a coil of metal wire like copper or aluminum. When an electric current is sent through
the electromagnet, it is either attracted to or repelled away from the permanent magnet. The polarity of
the coil can be reversed depending on the current. This back and forth movement causes the diaphragm
or cone to vibrate, because it is connected to the magnetic coil. This is the sound that you hear.
An actuator is an output device but it does not always provide output directly to the user. It can change
some physical value in response to a signal from an automated system or control system.
Actuators naturally pair up with sensors, which can provide feedback to the control program about the
effects of its actuators.
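The pairing of a sensor with an actuator can be sketched as a simple control rule. The example assumes a heating system; heaterShouldBeOn, the target temperature and the 1-degree margin are all made up for illustration.

```javascript
// Hypothetical heating controller: takes a temperature reading (sensor) and
// decides whether the heater (actuator) should be on.
function heaterShouldBeOn(temperature, target, heaterIsOn) {
  if (temperature < target - 1) return true; // too cold: switch on
  if (temperature > target + 1) return false; // too warm: switch off
  return heaterIsOn; // close enough: leave unchanged
}

// Feed a series of readings through the controller (the feedback loop).
let heaterOn = false;
for (const reading of [17, 19.5, 22, 20.5]) {
  heaterOn = heaterShouldBeOn(reading, 20, heaterOn);
  console.log(reading, heaterOn ? "heater on" : "heater off");
}
```

Each new reading reflects the effect of the previous decision, which is the feedback the control program relies on.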
Backing storage:
Backing storage (also called auxiliary storage) stores programs and data for future use. In order to store
data while the electricity is switched off or unavailable, storage must be non-volatile. Access to backing
storage is slower than access to internal memory. Operating systems and program files are loaded into
RAM from backing storage when required for execution.
It is important to distinguish between a storage device and storage medium. The storage device is the
machine that stores data; the storage medium is the material on which the device stores data. There are
three different types of backing storage device:
Magnetic media stores data by assigning a magnetic charge to metal. This metal is then processed by a
read head, which converts the charges into ones and zeros. Historically, magnetic media has been very
popular for storing programs, data, and making backups. It looks set to continue in this role for some
time. However, solid state technology is starting to be used more and more, storing programs and data on
new devices such as mobile phones and cameras.
Hard disk
Hard disks are usually found inside computers to store programs and data. They are increasingly cheap
and more and more companies are using them to back things up. Hard disks can vary in physical size with
Platter - Metallic disks where one or both sides of the platter are magnetized, allowing data to be
stored. The platter spins thousands of times a minute around the spindle. There may be several
platters, with data stored across them.
Head - The head reads magnetic data from the platter. For a drive with several platters there may be
two heads per platter, allowing data to be read from the top and bottom of each.
Actuator Arm - used to move the read heads in and out of the disk, so that data can be read from and
written to particular locations. You can access data in a random fashion: you don't need to
read your way through the entire disk to fetch a particular bit of information, you can jump right
there. Seek time is very low.
Power connector - provides electricity to spin the platters, move the read head and run the
electronics.
IDE connector - allows for data transfer from and to the platters.
Jumper block - used to get the disk working in specific ways such as RAID.
For the exam you must be able to explain how a hard disk works:
Optical disks
Optical media works by creating a disc with a pitted metallic surface. There are several different types of
disk out there ranging from 650 MB to 128 GB, with the pits and lands getting closer together for higher
volume disks. The principle behind how each of them works is the same.
Optical media
Device   Type
CD-ROM   Read only
CD-R     Write once, then read only
CD-RW    Rewritable
Capacity: 650 - 900 MB
CD-ROM
A CD-ROM is a metal disc embedded into a plastic protective housing. Each disc has to be 'mastered'; this
is the process of creating the CD and placing the data on it. CDs are WORM (Write Once, Read Many)
media; this refers to the fact that once they have been mastered, there is no way to change the data on
them.
Writing to a CD-ROM
1. A single track runs in a spiral pattern from the center of the disc to the outside. This track is made
of pits and lands to represent the ones and zeroes of binary data
2. A high-powered laser is shone onto the CD-ROM, burning pits into the metal
3. The disc spins and the laser follows the track, putting the binary data onto the CD in a spiral track
4. The data has been written
Reading from a CD-ROM
1. A single track runs in a spiral pattern from the center of the disc to the outside. This track is made
of pits and lands to represent the ones and zeroes of binary data
2. A low-powered laser is shone on the metallic surface and the reflection is captured by a photodiode
sensor. The lands reflect differently to the pits, meaning the sensor can tell the difference between a 1
and a 0
3. The disc spins and the laser follows the track
4. The binary data (the 1s and 0s) are put together and the CD-ROM has been read
Solid-state memory
Device            Description
USB flash drive   Up to 256 GB
Memory card       Up to 256 GB
Solid state storage devices are electronic and made as integrated circuits or chips. The currently
predominant technology is flash memory, which like ROM holds data that is non-volatile but can be
erased and rewritten in large blocks. We often refer to this as non-volatile memory.
2. The USB driver loads, providing the computer with code on how to read from and write to the USB drive.
3. The USB drive is read, giving information on the file and folder structure (File Allocation Table) to the
computer.
4. [Reading] The user chooses to open a file; the computer sends the address wanted to the USB
port.
5. [Reading] The USB drive returns the data at the location requested.
6. [Writing] The computer sends data to the USB port, where it is placed into empty space on the drive.
7. [Writing] The computer then requests a new version of the file and folder structure.
different connectors and are generally smaller than USB Flash drives allowing for them to be used in
cameras, mobile phones and game consoles.
AND OR NOT NAND NOR XOR
A Truth Table is simply a table listing all the combinations of inputs and their respective outputs.
The NOT gate has only one input, but the rest have 2 inputs.
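As a quick way to check truth tables, each gate can be written as a small function on bits. The sketch below is illustrative only; the names gates and NOT are chosen for this example.

```javascript
// Each two-input gate is a function of two bits (0 or 1).
const gates = {
  AND: (a, b) => a & b,
  OR: (a, b) => a | b,
  NAND: (a, b) => 1 - (a & b),
  NOR: (a, b) => 1 - (a | b),
  XOR: (a, b) => a ^ b,
};
// The NOT gate has a single input.
const NOT = (a) => 1 - a;

// Print one truth-table row per input combination.
console.log("A B AND OR NAND NOR XOR");
for (const a of [0, 1]) {
  for (const b of [0, 1]) {
    const outputs = Object.values(gates).map((gate) => gate(a, b));
    console.log(a, b, ...outputs);
  }
}
console.log("NOT 0 =", NOT(0), " NOT 1 =", NOT(1));
```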
We now look at the output in two stages. First let us consider the outputs being produced at stages S
and T. To do this we need to draw a truth table. There are three inputs (A, B and C), which gives 2³ (i.e. 8)
possible combinations of 1s and 0s. To work out the outputs at S and T we need to refer to the truth
tables for the NOR gate and for the AND gate. For example, when A = 1 and B = 1 then we have 1 NOR 1
which gives the value of S = 0. Continuing to do the same thing for all 8 possible inputs, we get the
following interim truth table:
Designing logic networks to solve a specific problem and testing using truth tables
We can convert this into logic gate terminology (ON = 1 and OFF = 0):
If (A = 1 OR B = 1) AND (C = NOT 1) then (X = 1)
(Notice: rather than write 0 we use NOT 1)
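The logic statement above can be tested against every input combination with a short program. This is a sketch; outputX and tableX are names invented for the example.

```javascript
// X = 1 when (A = 1 OR B = 1) AND (C = NOT 1).
function outputX(a, b, c) {
  return (a === 1 || b === 1) && c !== 1 ? 1 : 0;
}

// Build the truth table for all 8 input combinations.
const tableX = [];
for (const a of [0, 1]) {
  for (const b of [0, 1]) {
    for (const c of [0, 1]) {
      tableX.push({ A: a, B: b, C: c, X: outputX(a, b, c) });
    }
  }
}
console.table(tableX);
```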
A steel rolling mill is to be controlled by a logic network made up of AND, OR and NOT gates only. The mill
receives a stop signal (i.e. S = 1) depending on the following input bits:
Draw a logic network and truth table to show all the possible situations when the stop signal could be
received.
The first thing to do is to try to turn the question into a series of logic gates; the problem then
becomes much simpler.
The first statement can be re-written as: (L = 1 AND V = NOT 1) since Length > 100 metres corresponds to a
binary value of 1 and Velocity < 10 m/s corresponds to a binary value of 0 (i.e. NOT 1).
The second statement can be re-written as: (T = NOT 1 AND V = 1) since Temperature < 1000°C corresponds
to a binary value of 0 (i.e. NOT 1) and Velocity > 10 m/s corresponds to a binary value of 1.
Both these statements are joined together by OR which gives us the logic statement:
if (L = 1 AND V = NOT 1) OR (T = NOT 1 AND V = 1) then S = 1
We can now draw the logic network and truth table to give the solution to the original problem (input L has
been put at the bottom of the diagram just to avoid crossing over of lines; it merely makes it look neater
and less complex and isn’t essential):
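The truth table for this logic statement can also be generated by a short program, which is a useful check on a hand-drawn table. A minimal sketch, with stopSignal and tableS as invented names:

```javascript
// S = 1 when (L = 1 AND V = NOT 1) OR (T = NOT 1 AND V = 1).
function stopSignal(l, t, v) {
  const lengthCase = l === 1 && v === 0;
  const temperatureCase = t === 0 && v === 1;
  return lengthCase || temperatureCase ? 1 : 0;
}

// Build the truth table for all 8 combinations of L, T and V.
const tableS = [];
for (const l of [0, 1]) {
  for (const t of [0, 1]) {
    for (const v of [0, 1]) {
      tableS.push({ L: l, T: t, V: v, S: stopSignal(l, t, v) });
    }
  }
}
console.table(tableS);
```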
Questions 7 to 10 require both the logic network to be created and also the truth table. The truth table can
be derived from the logic network, but also from the problem. This is a check that the logic network
actually represents the original problem.
(7) A computer will only operate if three switches P, S and T are correctly set. An output signal
(X = 1) will occur if P and S are both ON or if P is OFF and S and T are ON. Design a logic network and draw
the truth table for this network.
(8) A traffic signal system will only operate if it receives an output signal (D = 1). This can only occur
if:
either (a) signal A is red (i.e. A = 0)
or (b) signal A is green (i.e. A = 1) and signals B and C are both red (i.e. B and C are both 0)
Design a logic network and draw a truth table for the above system.
(10) A power station has a safety system based on three inputs to a logic network. A warning signal (S
= 1) is produced when certain conditions occur based on these 3 inputs:
Draw a logic network and truth table to show all the possible situations when the warning signal could be
received.
Chapter: 1.4 Processor fundamentals
The earliest computing machines had fixed programs. For example, a desk calculator (in principle) is a
fixed program computer. It can do basic mathematics, but it cannot be used as a word processor or a
gaming console. Changing the program of a fixed-program machine requires re-wiring, re-structuring, or
re-designing the machine. The earliest computers were not so much "programmed" as they were
"designed". "Reprogramming", when it was possible at all, was a laborious process, starting with
flowcharts and paper notes, followed by detailed engineering designs, and then the often-arduous process
of physically re-wiring and re-building the machine. It could take up to 3 weeks to set up a program on
ENIAC (a computer of the 1940s) and get it working.
The phrase Von Neumann architecture derives from a paper written by computer scientist John von
Neumann in 1945. This describes a design architecture for an electronic digital computer with
subdivisions of a central arithmetic part, a central control part, a memory to store both data and
instructions, external storage, and input and output mechanisms. The meaning of the phrase has evolved
to mean “A stored-program computer”. A stored-program digital computer is one that keeps its
programmed instructions, as well as its data, in read-write, random-access memory (RAM). So John Von
Neumann introduced the idea of the stored program. Previously data and programs were stored in
separate memories. Von Neumann realized that data and programs are somewhat of the same type and
can, therefore, use the same memory. On a large scale, the ability to treat instructions as data is what
makes assemblers, compilers and other automated programming tools possible. One can "write programs
which write programs". This led to the introduction of compilers which accepted high level language
source code as input and produced binary code as output.
Register Meaning
PC Program Counter
IX Index Register
Program counter (PC): Keeps track of where to find the next instruction so that a copy of the
instruction can be placed in the current instruction register. Sometimes the program counter is
called the Sequence Control Register (SCR) as it controls the sequence in which instructions are
executed.
Current instruction register (CIR): Holds the instruction that is to be executed.
Memory address register (MAR): Used to hold the memory address that contains either the next
piece of data or an instruction that is to be used.
Memory data register (MDR): Acts like a buffer and holds anything that is copied from the memory
ready for the processor to use it.
Index register (IX): A microprocessor register used for modifying operand addresses during the run
of a program, typically for doing vector/array operations. Index registers are used for a special kind
of indirect addressing where an immediate constant (i.e. which is part of the instruction itself) is
added to the contents of the index register to form the address to the actual operand or data.
The central processor contains the arithmetic-logic unit (also known as the arithmetic unit) and the
control unit. The arithmetic-logic unit (ALU) is where data is processed. This involves arithmetic and
logical operations. Arithmetic operations are those that add and subtract numbers, and so on. Logical
operations involve comparing binary patterns and making decisions.
The control unit fetches instructions from memory, decodes them and synchronizes the operations before
sending signals to other parts of the computer.
The accumulator is in the arithmetic unit, and the program counter (PC) and the current instruction
register (CIR) are in the control unit. The memory data register (MDR) and memory address register (MAR)
are in the processor, outside these two units.
[Fig a.1: block diagram with the labels Main Memory, Control Unit, ALU, PC, Accumulator, CIR, MAR and MDR]
A simple example of an arithmetic logic unit (2-bit ALU) that does AND, OR, XOR, and addition
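To get a feel for what such a 2-bit ALU does, here is a minimal sketch in Python. It only imitates the behaviour; it is not a model of the circuit in the figure. The function name alu_2bit and the operation names passed as strings are assumptions made for this illustration.

```python
def alu_2bit(op, a, b):
    """Apply one operation to two 2-bit inputs (values 0 to 3).

    Returns (result, carry). The result is kept to 2 bits; carry is 1 only
    when an addition overflows the 2-bit range.
    """
    if not (0 <= a <= 3 and 0 <= b <= 3):
        raise ValueError("inputs must fit in 2 bits")
    if op == "AND":
        return a & b, 0
    if op == "OR":
        return a | b, 0
    if op == "XOR":
        return a ^ b, 0
    if op == "ADD":
        total = a + b
        return total & 0b11, total >> 2   # low 2 bits, then the carry bit
    raise ValueError("unknown operation: " + op)


print(alu_2bit("AND", 0b10, 0b11))  # (2, 0)
print(alu_2bit("ADD", 0b11, 0b01))  # (0, 1) because 3 + 1 = 4 does not fit in 2 bits
```

Notice that the logical operations never produce a carry, whereas addition can.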
Processor clock - a timing device connected to the processor that synchronizes when the
fetch-decode-execute cycle runs.
Your computer might contain several clocks that regulate different things. The clock we are going to look
at here keeps the processor in line. It sends the processor a signal at regular intervals telling it to start
the fetch-decode-execute routine.
Lights flash at frequencies of 0.5 Hz, 1.0 Hz and 2.0 Hz, where Hz means flashes per second. (Hz=
Hertz)
Clock speed - The number of cycles that are performed by the CPU per second
Clock speed is measured in Hertz, which means 'per second'. You have probably heard of clock speeds
such as 1 MHz; this means 1,000,000 cycles per second and potentially a million calculations. A computer
with a clock speed of 3.4 GHz might be capable of processing 3,400,000,000 instructions per second!
However, it isn't as simple as that. Some processors can perform more than one calculation on each clock
cycle, and processors from different manufacturers and using different architectures are often difficult to
compare. Also, with multi-core processors such as those in the PS3 (7 cores) and the Xbox 360 (3
cores), there might be times when the clock is ticking but there is nothing for the processor to
calculate; the processor will then sit idle.
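The unit conversions above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name cycles_per_second is an assumption made for this example, and the result is a number of clock cycles, not a promise about how many instructions actually get executed.

```python
def cycles_per_second(value, unit):
    """Convert a clock speed such as (3.4, "GHz") into cycles per second."""
    multipliers = {"Hz": 1, "kHz": 10**3, "MHz": 10**6, "GHz": 10**9}
    return round(value * multipliers[unit])


print(cycles_per_second(1, "MHz"))    # 1000000
print(cycles_per_second(3.4, "GHz"))  # 3400000000
```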
A bus is a set of parallel wires connecting two or more components of the computer.
The CPU is connected to main memory by three separate buses. When the CPU wishes to access a
particular memory location, it sends this address to memory on the address bus. The data in that location
is then returned to the CPU on the data bus. Control signals are sent along the control bus.
In the figure below, you can see that data, address and control buses connect the processor, memory and I/O
controllers. These are all system buses. Each bus is a shared transmission medium, so that only one
device can transmit along a bus at any one time.
Buses: These are the media (wires) connecting the microprocessor with the main memory and other
parts of the system. There are three types of bus: the control bus, the address bus and the data bus.
Control bus: This is used by the control unit to communicate and transfer signals to and from the internal
and external devices of the system. It is used to minimise the communication lines required for the
communication. This is a bi-directional bus that comprises the interrupt line, read/write signals and status
line. It makes sure that data, when transferred, is on one channel, from one device and in one direction only,
to avoid collision and loss of data.
Address bus: This bus is used to carry the address from which data needs to be fetched, or at which it is
to be placed, in main memory. This is a one-directional bus whose width (number of wires) defines the range
of addresses a microprocessor can reach. For example, an address bus of 12 bits can reach 2^12 = 4096 (4K)
addresses.
Data bus: A bi-directional bus used to move data to and from the memory address held in the MAR.
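The link between the width of the address bus and the number of locations it can reach is just a power of two. A short Python sketch (the function names are assumptions for this example) lets you try other widths:

```python
def addressable_locations(address_bus_width):
    """Number of distinct addresses that fit on a bus of the given width in bits."""
    return 2 ** address_bus_width


def highest_address(address_bus_width):
    """Largest address, counting from 0."""
    return addressable_locations(address_bus_width) - 1


print(addressable_locations(12))  # 4096
print(highest_address(12))        # 4095
print(addressable_locations(16))  # 65536
```

Each extra wire on the address bus doubles the number of addresses that can be reached.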
Increasing performance
If we want to increase the performance of our computer, we can try several things
For each different method, we are going to look at these old games consoles to see how performance
increase was achieved:
NES        1983   1.79 MHz   8 bit
SNES       1990   3.58 MHz   16 bit
GameCube   2001   486 MHz    128 bit   cooling fan introduced
The most obvious way to increase the speed of a computer would be to increase the speed of the
computer clock. With a faster clock speed the processor would be forced to perform more instructions per
second.
But what is to stop us increasing the clock speed as much as we want? If you study Physics you might
already know this, but the problem with increased clock speed is that an increased current will have to
flow through the circuits. The more current that flows through a circuit, the more heat will be generated.
You might notice that a laptop will get hot or even your mobile phone when you are doing a heavy task like
playing a game. The faster the clock speed, the hotter the processor runs.
To counter this issue, computer scientists have come up with more advanced chip designs and introduced
heat sinks, fans, and even liquid cooling into computers. If a processor runs too hot it can burn out!
Peripherals
Input/output devices are used by the system to get information in and out. As they are not internal but are
connected to the CPU, we refer to them as peripherals (your hands are peripheral to your torso). We cover
the specific ones you need to learn in chapter 1.3.1, but for the moment you need to know the
fundamental difference:
If you look at the Von Neumann architecture, notice that it doesn't mention a keyboard or display. This was a
very well thought out decision, as you don't want to force every computer to have a keyboard (think about a
games console) or a VDU (some devices, such as the iPod Shuffle MP3 player, don't have a screen).
However, some computer architectures do include specific I/O controllers:
I/O controller - an electronic circuit that connects to a system bus and an I/O device; it provides the
correct voltages and currents for the system bus and the I/O device. Examples would include:
a computer. This allows I/O devices to be connected to the CPU without having to have specialized
hardware for each one. Think about the USB port on your computer, you can connect Keyboards, Mice,
Game pads, Cameras, Phones, etc. and they all connect using the same type of port!
Registers involved
The circuits used in the CPU during the cycle are:
Program Counter (PC) - an incrementing counter that holds the memory address of the
instruction to be executed next
Memory Address Register (MAR) - the address in main memory that is currently being read or
written
Memory Data Register (MDR) - a two-way register that holds data fetched from memory (and data
ready for the CPU to process) or data waiting to be stored in memory
Current Instruction register (CIR) - a temporary buffer for the instruction that has just been fetched
from memory
Control Unit (CU) - decodes the program instruction in the CIR, selecting machine resources such
as a data source register and a particular arithmetic operation, and coordinates activation of those
resources
Arithmetic logic unit (ALU) - performs mathematical and logical operations
To describe the cycle we can use register notation. This is a very simple way of noting all the steps
involved. In all cases where you see brackets e.g. [PC], this means that the contents of the thing inside the
brackets are loaded. In the case of the first line, the contents of the program counter are loaded into the
Memory Address Register.
1. The contents of the Program Counter, the address of the next instruction to be executed, are placed
into the Memory Address Register.
2. The address is sent from the MAR along the address bus to the Main Memory. The instruction at
that address is found and returned along the data bus to the Memory Buffer Register (MBR), also called
the Memory Data Register (MDR). At the same time the contents of the Program Counter are increased
by 1, to reference the next instruction to be executed.
3. The MBR (MDR) loads the Current Instruction Register with the instruction to be executed.
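The three fetch steps can be traced with a small Python sketch. The registers are modelled as entries in a dictionary and main memory as another dictionary; the addresses 100 to 102 and the instructions stored there are made up for this illustration. The comments show each step in register notation.

```python
def fetch(cpu, memory):
    """Carry out the fetch steps once, updating the registers in cpu."""
    cpu["MAR"] = cpu["PC"]            # 1. MAR <- [PC]
    cpu["MDR"] = memory[cpu["MAR"]]   # 2. MDR <- [[MAR]]  (address bus out, data bus back)
    cpu["PC"] = cpu["PC"] + 1         #    PC  <- [PC] + 1
    cpu["CIR"] = cpu["MDR"]           # 3. CIR <- [MDR]
    return cpu


# Hypothetical memory contents and starting address.
memory = {100: "LDD 200", 101: "ADD 201", 102: "STO 202"}
cpu = {"PC": 100, "MAR": None, "MDR": None, "CIR": None}

fetch(cpu, memory)
print(cpu)  # {'PC': 101, 'MAR': 100, 'MDR': 'LDD 200', 'CIR': 'LDD 200'}
```

In real hardware the increment of the PC happens at the same time as the memory read; the sketch simply does one after the other.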
1. The instruction is fetched.
2. The current instruction is executed.
3. The queue is checked for an interrupt.
4. If there is an interrupt, its priority is compared with that of the current program.
5. If the interrupt's priority is higher, the current program's register contents are saved to the STACK in
RAM and the new program (the interrupt handler) is loaded by writing the address of its first instruction to the PC.
a. Serve the interrupt.
b. Load the data back from the STACK in RAM.
c. Resume the previous program.
6. Are there any more instructions in the current program?
a. If yes, go to step 1.
b. Otherwise, end.
What happens in the CPU when an Interrupt is generated during a fetch execute cycle:
We have said that 'An interrupt is a signal for the CPU to stop what it is doing and instead carry out the
interrupt task, once the task is complete, the CPU goes back to what it was doing'.
But what is meant by 'back to what it was doing'?
To appreciate this, you need to understand a little about what goes on inside a CPU. A CPU contains a
number of 'registers'. A register is a small section of on-chip memory having a specific purpose.
Registers range from 8 bits wide on an 8 bit CPU to 64 bits and beyond.
Registers in the CPU hold all of the data currently being handled. These include
The current instruction being executed (Instruction Register),
The location in primary memory of the next instruction (Program Counter)
A number of general purpose registers holding current data
The registers are updated by each tick of the system clock so at any instant in time, they hold specific
values. When an interrupt comes along, all the register values are copied to a special data structure or
memory area called the 'stack', which is in primary memory. They stay on the stack whilst the CPU
executes the interrupt service routine (ISR). Once the routine is over, the registers are loaded
back with their original values from the stack and the CPU can continue with what it was doing before the
interrupt came along. This jumping from current CPU operations to the ISR and then
back again is called 'context switching'.
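The saving and restoring of registers can be imitated in Python. This is a simplified sketch and not how any particular CPU does it: the priority check is left out, the "program" is just a list of instruction names, and the names run, registers and interrupts are assumptions made for this example.

```python
def run(program, interrupts):
    """Execute a program, servicing any interrupt raised during an instruction.

    program:    list of instruction names
    interrupts: dict mapping an instruction's position to the name of the
                interrupt service routine (ISR) raised while it runs
    """
    registers = {"PC": 0, "ACC": 0}
    stack = []
    log = []
    while registers["PC"] < len(program):
        current = registers["PC"]
        log.append("execute " + program[current])     # fetch and execute
        registers["PC"] += 1
        registers["ACC"] += 1                          # stand-in for real work
        if current in interrupts:                      # is an interrupt waiting?
            stack.append(dict(registers))              # save the context on the stack
            registers["PC"], registers["ACC"] = 0, 0   # the ISR uses the registers itself
            log.append("service " + interrupts[current])
            registers = stack.pop()                    # restore the saved context
    return log, registers


log, final = run(["A", "B", "C"], {1: "keyboard ISR"})
print(log)    # ['execute A', 'execute B', 'service keyboard ISR', 'execute C']
print(final)  # {'PC': 3, 'ACC': 3}
```

Because the register values were pushed onto the stack before the ISR overwrote them, the program carries on at instruction C as if nothing had happened.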
There are many different instructions that we can use in machine code. You have already met three (LDD,
ADD, STO), but some processors will be capable of understanding many more. The selection of
instructions that a machine can understand is called the instruction set. Below is a list of some other
instructions that might be used:
Depending on the word size, there will be different numbers of bits available for the opcode and for the
operand. There are two different philosophies at play, with some processors choosing to have lots of
different instructions and a smaller operand (Intel, AMD) and others choosing to have fewer instructions
and more space for the operand (ARM). Know it now and we will study it in detail in paper 3.
CISC - Complex Instruction Set Computer - more instructions allowing for complex tasks to be
executed, but the range and precision of the operand is reduced. Some instructions may be of variable
length, for example taking extra words (or bytes) to address full memory addresses, load full data
values or just expand the available instructions.
RISC - Reduced Instruction Set Computer - fewer instructions allowing for larger and higher
precision operands.
There are many ways to locate data and instructions in memory and these methods are called 'memory
address modes'
Memory address modes determine the method used within the program to access data either from within
the CPU or external RAM. Some memory addressing modes can control program flow.
Direct
Indirect
Immediate
Indexed
Relative
Immediate Addressing
Immediate addressing means that the data to be used is hard-coded into the instruction itself.
This is the fastest method of addressing as it does not involve main memory at all.
Nothing has been fetched from memory; the instruction simply loads 2 to the accumulator
immediately.
Immediate Addressing is very useful to carry out instructions involving constants (as opposed to
variables). For example you might want to use 'PI' as a constant 3.14 within your code.
This is a very simple way of addressing memory - direct addressing means the code refers
directly to a location in memory
For example
LDD 3001
In this instance the value held at the direct location 3001 in RAM is loaded to the accumulator.
The good thing about direct addressing is that it is fast (but not as fast as immediate addressing).
The bad thing about direct addressing is that the code depends on the correct data always being
present at the same location.
It is generally a good idea to avoid referring to direct memory addresses in order to have
're-locatable code', i.e. code that does not depend on specific locations in memory.
You could use direct addressing on computers that are only running a single program. For
example an engine management computer only ever runs the code the car engineers
programmed into it, and so direct memory addressing is excellent for fast memory access.
Indirect Addressing
Indirect addressing means that the address of the data is held in an intermediate location so that
the address is first 'looked up' and then used to locate the data itself.
Many programs make use of software libraries that get loaded into memory at run time by the
loader. The loader will most likely place the library in a different memory location each time.
So how does a programmer access the subroutines within the library if they do not know the
starting address of each routine?
1. A specific block of memory will be used by the loader to store the starting address of every
subroutine within the library. This block of memory is called a 'vector table'. A vector table holds
addresses rather than data. The application is informed by the loader of the location of the vector
table itself.
2. In order for the CPU to get to the data, the code first of all fetches the content at RAM location
5002 which is part of the vector table.
3. The data it contains is then used as the address of the data to be fetched, in this case the data
is at location 9000
This looks to location 5002 for an address. That address is then used to fetch data and load it
into the accumulator. In this instance it is 302.
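The difference between immediate, direct and indirect addressing comes down to how many times memory is looked up before the value reaches the accumulator. Here is a rough Python sketch. Memory is modelled as a dictionary; locations 5002 and 9000 and the value 302 follow the indirect example above, while the value stored at 3001 is made up, and the function name load is an assumption for this illustration.

```python
# Location 3001 holds a made-up value; 5002 -> 9000 -> 302 follows the text.
memory = {3001: 17, 5002: 9000, 9000: 302}


def load(mode, operand, memory):
    """Return the value that ends up in the accumulator."""
    if mode == "immediate":
        return operand                    # the operand is the data: no memory access
    if mode == "direct":
        return memory[operand]            # one look-up: the operand is the address
    if mode == "indirect":
        address = memory[operand]         # first look-up gives the real address
        return memory[address]            # second look-up gives the data
    raise ValueError("unknown addressing mode: " + mode)


print(load("immediate", 2, memory))     # 2
print(load("direct", 3001, memory))     # 17
print(load("indirect", 5002, memory))   # 302
```

Counting the look-ups explains why immediate addressing is the fastest of the three and indirect addressing the slowest.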
Indexed Addressing
Indexed addressing means that the final address for the data is determined by adding an offset
to a base address.
For example, it makes sense to store arrays as contiguous blocks in memory (contiguous means
being next to something without a gap). The array has a 'base address' which is the location of
the first element, then an 'index' is used that adds an offset to the base address in order to fetch
an individual element.
Index addressing is fast and is excellent for manipulating data structures such as arrays as all
you need to do is set up a base address then use the index in your code to access individual
elements.
Another advantage of indexed addressing is that if the array is re-located in memory at any point
then only the base address needs to be changed. The code making use of the index can remain
exactly the same.
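A short Python sketch shows both points: fetching an element by adding the index to the base address, and re-locating the array by changing only the base address. The addresses 2000 and 6000, the array contents and the name load_indexed are all made up for this example.

```python
def load_indexed(memory, base_address, index):
    """Fetch the element found at base address + index."""
    return memory[base_address + index]


scores = [10, 20, 30, 40]

# The array stored contiguously starting at a hypothetical base address 2000.
memory = {2000 + offset: value for offset, value in enumerate(scores)}
print(load_indexed(memory, 2000, 2))     # 30

# The same array re-located to base address 6000: only the base changes.
relocated = {6000 + offset: value for offset, value in enumerate(scores)}
print(load_indexed(relocated, 6000, 2))  # 30
```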
Quite often a program only needs to jump a little bit in order to jump to the next instruction.
Maybe just a few memory locations away from the current instruction.
A very efficient way of doing this is to just add a small offset to the current address in the
program counter. (Remember that the program counter always points to the next instruction to
be executed). This is called 'relative addressing'.
DEFINITION:
Relative addressing means that the next instruction to be carried out is an offset
number of locations away, relative to the address of the current instruction.
acc:      (code executed if accumulator is = 2)
carryon:
In the code snippet above, the first line of code is checking to see if the accumulator has the
value of 2 in it. If it has, then the next instruction is 3 lines away. This is called a conditional
jump.
Another example of relative addressing can be seen in the jmp +5 instruction. This is telling the
CPU to effectively avoid the next instruction and go straight to the 'carryon' point, which we assume is
at this address +5.
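The address arithmetic behind a relative jump is a single addition, which the Python sketch below makes explicit. It follows the definition above and measures the offset from the address of the current instruction; the function name next_address and the starting address 100 are assumptions made for this example.

```python
def next_address(current_address, offset, condition=True):
    """Address of the next instruction after a relative jump.

    If the condition is not met, the jump is not taken and execution simply
    falls through to the following instruction.
    """
    if condition:
        return current_address + offset
    return current_address + 1


accumulator = 2

# Conditional relative jump: taken only when the accumulator holds 2.
print(next_address(100, 3, condition=(accumulator == 2)))  # 103

# Unconditional relative jump, like jmp +5.
print(next_address(100, 5))                                # 105

# Condition not met: carry on with the very next instruction.
print(next_address(100, 3, condition=False))               # 101
```

Because only the offset is stored in the instruction, the same code works wherever the program is loaded in memory.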
As we already know from chapter 1.1.1, computers can only understand binary, 1s and 0s. We are now
going to look at the simplest instructions that we can give a computer. This is called machine code.
Machine code allows computers to perform the most basic, but essential tasks. For this section we are
going to use the Accumulator (you met this register earlier) to store the intermediate results of all our
calculations. Amongst others, the following instructions are important for all processors:
LDD - Loads the contents of the memory address or integer into the accumulator
ADD - Adds the contents of the memory address or integer to the accumulator
STO - Stores the contents of the accumulator into the addressed location
Assembly code is an easy-to-read interpretation of machine code. There is a one-to-one matching: one line of
assembly equals one line of machine code:
000000110101 = Store 53
There is no set binary bit pattern for different opcodes in an instruction set. Different processors will use
different patterns, but sometimes it might be the case that you are given certain bit patterns that
represent different opcodes. You will then be asked to write machine code instructions using them.
For every assembly language command there is an equivalent machine language command, i.e. the
relationship is 1:1.
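If you are given bit patterns for the opcodes, writing the machine code is a matter of placing the opcode bits in front of the operand bits. The Python sketch below assumes a 12-bit instruction split into a 4-bit opcode and an 8-bit operand, and the opcode patterns in OPCODES are invented for this example; a real processor or an exam question will define its own.

```python
# Hypothetical opcode bit patterns (4 bits each).
OPCODES = {"STO": 0b0000, "LDD": 0b0001, "ADD": 0b0010}


def assemble_line(mnemonic, operand):
    """Turn one assembly instruction into a 12-bit machine code string."""
    if not 0 <= operand < 256:
        raise ValueError("operand must fit in 8 bits")
    word = (OPCODES[mnemonic] << 8) | operand   # opcode bits, then operand bits
    return format(word, "012b")


print(assemble_line("STO", 53))  # 000000110101
print(assemble_line("LDD", 53))  # 000100110101
print(assemble_line("ADD", 5))   # 001000000101
```

One line of assembly goes in and one machine code word comes out, which is the 1:1 relationship in action.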
An assembler converts an assembly language program into machine language program. Machine
code is binary. So an Assembler converts a program that looks like the program I wrote below into
something like "10100110010101110101".
Single pass and two pass assembler:
One-pass assemblers go through the source code once and assume that all symbols will be defined
before any instruction that references them. A one-pass assembler has to create object code in a single
pass and cannot refer back to a table afterwards.
Two-pass assemblers make two passes: they create a table with all symbols and their values in the
first pass, then use the table in a second pass to generate code. The length of each instruction must be
determined on the first pass so that the addresses of symbols can be calculated. It can be
Op-Code operand
ADD A 20
Here Add is the operation and A and 20 are the operands. A programmer writes a bunch of these single
line statements that make up the assembly language program. Each line in this program is converted
into Binary format and loaded into consecutive memory locations when this program is called for
execution. These instructions in consecutive memory locations are executed one after the other
unless something called a Jump instruction is encountered. A jump instruction asks the system to go
to a particular memory location (Marked by a Label in the source code) and start executing code from
there.
The problem with this is that the current instruction under execution may ask you to jump to a
particular memory location marked by label which the assembler has not yet encountered. See in the
example below the first instruction is to jump to X, but X is not defined till line 5. To translate an
assembly statement to machine code you need the exact value of the operand in that statement. If the
operand is referring to a value in a register, you need the register number. If operand is referring to a
value in a memory location, you need the address of that memory location. If the operand is a
hardcoded value, you need that value. If your operand is referring to a particular line number in your
source code that is marked by a label to jump to, you need that line number. Without this information
you can't translate an assembly statement to machine code.
To solve this problem there are two approaches, hence leading to two types of assemblers.
(1) Two Pass Assembler: In the old days, when memory was a limitation, people used a two pass
assembler. A two pass assembler has a symbol table, which is a table with two columns: symbol name
and symbol value. When it sees the first statement, JMP X, it doesn't know what X is, so it creates a new
entry in the symbol table for X with the value field un-initialized. When it gets to instruction 5, it sets the
value of X to 5.
So after reading the whole code (after the first pass), the table will look something like this:
Symbol Name | Symbol Value
X 5
Y 2
Z 8
In the next pass it just runs through each sentence and converts the program sentence by sentence
into machine code using the symbol table.
(2) Single Pass Assembler: In the Single Pass Assembler, we use a different kind of symbol table.
Unlike the Two Pass Assembler, where we stored the name of a symbol and its value, in the Single Pass
Assembler we use a table where we have a symbol and all the locations where we have encountered
that symbol. As soon as we encounter the initialization for a symbol we use the symbol table to jump
to all the locations where the symbol was encountered, plug in the value we just read, convert those
lines into machine code, and continue from the line where we encountered the initialization.
As you can clearly see, the Single Pass Assembler needs a lot of memory at runtime, much more than
the Two Pass Assembler. In the old days people could barely fit the assembly program in memory, so they
used the Two Pass Assembler.
Thanks to Sastry Aditya for this passage.
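The two approaches described in the passage above can be compared side by side in Python. This is only a sketch of the symbol handling, not a real assembler: each source line is a (label, mnemonic, operand) tuple, a symbol's value is taken to be its line number, and the eight-line program is invented so that it produces the symbol table X = 5, Y = 2, Z = 8. The names two_pass and one_pass are assumptions for this example.

```python
def two_pass(lines):
    """Pass 1 builds the symbol table; pass 2 replaces symbols with values."""
    symbol_table = {}
    for address, (label, _, _) in enumerate(lines, start=1):    # pass 1
        if label is not None:
            symbol_table[label] = address
    resolved = []
    for _, mnemonic, operand in lines:                          # pass 2
        if isinstance(operand, str):
            operand = symbol_table[operand]
        resolved.append((mnemonic, operand))
    return symbol_table, resolved


def one_pass(lines):
    """Single pass: remember where unknown symbols were used and patch later."""
    resolved = []
    defined = {}   # symbol -> value, once its label has been seen
    pending = {}   # symbol -> positions in resolved still waiting for a value
    for address, (label, mnemonic, operand) in enumerate(lines, start=1):
        if label is not None:
            defined[label] = address
            for position in pending.pop(label, []):   # go back and plug in the value
                resolved[position] = (resolved[position][0], address)
        if isinstance(operand, str):
            if operand in defined:
                operand = defined[operand]
            else:
                pending.setdefault(operand, []).append(len(resolved))
        resolved.append((mnemonic, operand))
    if pending:
        raise ValueError("undefined symbols: " + ", ".join(sorted(pending)))
    return resolved


# Hypothetical program: line 1 jumps forward to X, which is defined on line 5.
program = [
    (None, "JMP", "X"),
    ("Y", "ADD", 20),
    (None, "STO", 53),
    (None, "JMP", "Z"),
    ("X", "LDD", 53),
    (None, "ADD", 1),
    (None, "JMP", "Y"),
    ("Z", "STO", 54),
]

table, code = two_pass(program)
print(table)                      # {'Y': 2, 'X': 5, 'Z': 8}
print(code[0])                    # ('JMP', 5)
print(one_pass(program) == code)  # True
```

Both versions produce the same result; the difference is that the single pass version has to keep a list of every unresolved use of a symbol until the symbol is finally defined.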
The symbols being defined by this bit of code are VarA, VarB. The size of the variables VarA and VarB
are defined, but notice that their location is not defined.
It is the job of the assembler to resolve the variables into locations in memory.
The advantages of using symbolic addressing over direct memory references are:
The program is re-locatable in memory. It does not particularly care about its absolute
location, it will still work
Using symbols makes the software much more understandable
When the code is ready to be loaded and run, a 'symbol table' is created by the assembler for the linker
and loader to use to place the software into memory.
Assembler Directives:
Assembler directives are instructions to the assembler to perform various bookkeeping tasks, storage
reservation, and other control functions. To distinguish them from other instructions, directive names
begin with a period. Three common directives: .data, .text, and .word. The first two (.data and .text) are
used to separate variable declarations and assembly language instructions. The .word directive is
used to allocate and initialize space for a variable.
2: Invoking a macro by using its given <label> on a separate line followed by the list of
parameters used if any:
<label> [parameter list]
Chapter: 1.5 System software
Most operating systems consist of a large set of programs, only some of which are stored in the
processor memory all of the time. Many of the subroutines available in the O.S. are stored on the hard
drive so that they can be accessed when needed. This not only saves space in the processor memory but
also means that the O.S. can be easily replaced when needed.
When you are using an applications package you are not communicating with the computer hardware, you
are communicating with the operating system. Without an operating system, no matter how many
programs you have, the computer is useless. The operating system sits between the hardware and the
application program or user.
The operating system is likely to be stored on a backing store (Hard-Drive) rather than in the memory of
the computer (RAM) because:
Summary:
1. An operating system is a software program which controls the operations of the computer system.
2. It provides a user interface (human-computer interaction (HCI)).
3. It controls how the computer responds to the user's requests.
4. It controls how the hardware components communicate with each other.
5. It provides an environment in which application software can be executed.
Utility software should be contrasted with application software, which allows users to do things like
creating text documents, playing games, listening to music or surfing the web. Rather than providing
these kinds of user-oriented or output-oriented functionality, utility software usually focuses on how the
computer infrastructure (including the computer hardware, operating system, application software and
data storage) operates. Due to this focus, utilities are often rather technical and targeted at people with an
advanced level of computer knowledge. There are many examples of utility software but we shall limit
ourselves to just a few:
1. The surface of a disk can store so much data that the computer cannot handle it all at once so it needs
to be split up so that data stored on it can be found again. When it is new a disk surface is blank so the
computer “draws lines” on the surface to split it into small areas. The process is called formatting and it is
carried out by a utility program called a disk formatter.
2. Some files are very large. In most files it is possible to find simple ways of reducing the size of the file
while keeping all its meaning. This can be very important when files are being sent from one computer to
another as the communication is speeded up. The programs that reduce the size of files are called file
compressors.
3. When files are being sent from one computer to another it is possible that they may contain a virus
which will infect the receiving computer. A virus checker (scanner, killer...) is a utility program which keeps
a constant check on files, searching for viruses and deleting any that are found.
4. Disk defragmenter software is a utility that reorganizes the files and unused space on a computer’s
hard disk so that the operating system accesses data more quickly and programs run faster. When an
operating system stores data on a disk, it places the data in the first available sector on the disk. It
attempts to place data in sectors that are contiguous (next to each other), but this is not always possible.
When the contents of a file are scattered across two or more noncontiguous sectors, the file
is fragmented.
Fragmentation slows down disk access and thus the performance of the entire computer. Defragmenting
the disk, or reorganizing it so that the files are stored in contiguous sectors, solves this problem.
Operating systems usually include a disk defragmenter. Windows disk defragmenter is available in the
System Tools list.
Main Features:
Scan and Fix multiple drives with one click.
Schedule Disk Check on next boot if a drive cannot be locked.
Boot Time Disk check can be performed on multiple drives in a click.
Use of checkdisk makes it highly safe to use.
6. A backup software utility allows users to copy, or back up, selected files or an entire hard disk to
another storage medium such as an external hard disk, optical disc, USB flash drive, or tape.
During the backup process, the utility monitors progress and alerts you if it needs additional media, such
as another disc. Many backup programs compress, or shrink the size of, files during the backup process.
By compressing the files, the backup program requires less storage space for the backup files than for the
original files.
Because they are compressed, you usually cannot use backup files in their backed up form. In the event
you need to use a backup file, a restore utility reverses the process and returns backed up files to their
original form. Backup utilities work with a restore utility.
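The compress-then-restore round trip used by file compressors and backup utilities can be tried out with Python's built-in zlib module. This is just a small demonstration on made-up, highly repetitive data; it is not how any particular backup program works, and real files will usually shrink far less.

```python
import zlib

# Made-up data with lots of repetition, which compresses very well.
original = b"AAAAAAAAAABBBBBBBBBB" * 50

compressed = zlib.compress(original)      # what a backup utility might store
restored = zlib.decompress(compressed)    # what a restore utility gives back

print(len(original), "bytes before compression")
print(len(compressed), "bytes after compression")
print(restored == original)               # True: nothing was lost
```

The compressed bytes are no use to an application as they stand, which is why a restore step is needed before the file can be opened again.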
and subroutines.
Library programs contain code and data that provide services to other programs such as interface (look
and feel), printing, network code and even the graphic engines of computer games. If you have ever
wondered why all Microsoft Office programs have the same look and feel, that is because they are using
the same graphical user interface libraries. A game developer might not have the time or budget to
write a new graphics engine, so they often buy graphics libraries to speed up development. This
allows them to quickly develop a good-looking game that runs on the desired hardware.
Most programming languages have a standard set of libraries that can be used, offering code to handle
input/output, graphics and specialist math functions. You can also create your own custom libraries and
when you start to write lots of programs with similar functionality you'll find them very useful. Below is an
example of how you might import libraries into VB.
What is a DLL?
A DLL is a library that contains code and data that can be used by more than one program at the same
time. For example, in Windows operating systems, the Comdlg32 DLL performs common dialog box
related functions. Therefore, each program can use the functionality that is contained in this DLL to
implement an Open dialog box. This helps promote code reuse and efficient memory usage.
By using a DLL, a program can be modularized into separate components. For example, an accounting
program may be sold by module. Each module can be loaded into the main program at run time if that
module is installed. Because the modules are separate, the load time of the program is faster, and a
module is only loaded when that functionality is requested.
Additionally, updates are easier to apply to each module without affecting other parts of the program. For
example, you may have a payroll program, and the tax rates change each year. When these changes are
isolated to a DLL, you can apply an update without needing to build or install the whole program again.
Assembler
An assembler translates assembly language into machine code. Assembly language consists of
mnemonics for machine opcodes, so assemblers perform a 1:1 translation from mnemonic to a
direct instruction. For example:
Conversely, one instruction in a high level language will translate to one or more instructions at machine
level.
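The 1:1 translation performed by an assembler can be sketched in Python. This is only an illustration: the mnemonics LDA, ADD and STA, their opcode values and the `assemble` function are invented for an imaginary processor and do not describe any real instruction set.

```python
# Hypothetical opcode table for an imaginary 8-bit processor.
# The mnemonics and opcode values are made up for illustration.
OPCODES = {
    "LDA": 0x01,  # load a value into the accumulator
    "ADD": 0x02,  # add a value to the accumulator
    "STA": 0x03,  # store the accumulator in memory
}


def assemble(source):
    """Translate each assembly line into exactly one (opcode, operand) pair."""
    machine_code = []
    for line in source.strip().splitlines():
        mnemonic, operand = line.split()
        machine_code.append((OPCODES[mnemonic], int(operand)))
    return machine_code


program = """
LDA 10
ADD 11
STA 12
"""
print(assemble(program))  # [(1, 10), (2, 11), (3, 12)]
```

Three lines of assembly language produce exactly three machine instructions, which is what the 1:1 translation means.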
language, object/machine code. The most common reason for translating source code is to create an
executable program (converting from a high level language into machine language).
Interpreter
An interpreter program executes other programs directly, running through program code and executing it
line-by-line. Because it analyses every line as it runs, an interpreted program is slower than compiled
code, but it can take less time to interpret program code than to compile and then run it. This is very
useful when prototyping and testing code. Interpreters are written for multiple platforms, which means
code written once can be run immediately on different systems without having to recompile for each.
Examples of this include Flash-based web programs that will run on your PC, Mac, games console and
mobile phone.
The Java Virtual Machine takes the byte code prepared by the Java compiler and executes it. The byte-code itself is
platform-independent; it is the responsibility of the Java Virtual Machine implementation to execute the program in
the byte-code form on the real computer.
So, the Java code is partially compiled (into byte code) for optimization by the programmer. The user then runs that
byte code on a Java Virtual Machine, which interprets it for the specific architecture of the user's computer.
Chapter: 1.6 Security, privacy and data
integrity
Introduction
Information systems are made up of three main components: hardware, software and communications.
Information security industry standards are identified and applied to them, as mechanisms of protection
and prevention, at three levels or layers: physical, personal and organizational. Essentially, procedures or
policies are implemented to tell people (administrators, users and operators) how to use products to
ensure information security within their organizations.
IT Security
Information security means protecting information and information systems from unauthorized access,
use, disclosure, disruption, modification or destruction. The terms information security, computer security
and information assurance are frequently, and incorrectly, used interchangeably. These fields are often
interrelated and share the common goals of protecting the confidentiality, integrity and availability of
information; however, there are some subtle differences between them.
These differences lie primarily in the approach to the subject, the methodologies used, and the areas of
concentration. Information security is concerned with the confidentiality, integrity and availability of data
regardless of the form the data may take: electronic, print, or other forms. Computer security can focus on
ensuring the availability and correct operation of a computer system without concern for the information
stored or processed by the computer.
Privacy
Integrity
In information security, integrity means that data cannot be modified without permission. This is not the
same thing as referential integrity in databases. Integrity is violated when an employee accidentally or
with malicious intent deletes important data files, when a computer virus infects a computer, when an
employee is able to modify his own salary in a payroll database, when an unauthorized user vandalizes a
web site, when someone is able to cast a very large number of votes in an online poll, and so on.
There are many ways in which integrity could be violated without malicious intent. In the simplest case, a
user on a system could mis-type someone's address. On a larger scale, if an automated process is not
written and tested correctly, bulk updates to a database could alter data in an incorrect way, leaving the
integrity of the data compromised.
Information security professionals are tasked with finding ways to implement controls that prevent errors
of integrity.
meet. Note that for most personal workstations, these are the only measures that apply. The
requirements are:
2. Firewall
Systems must be protected by a firewall that allows only those incoming connections necessary to fulfill the
business needs of that system. Client systems which have no business need to provide network services must
deny all incoming connections. Systems that provide network services must limit access to those services to the
smallest reasonably manageable group of hosts that need to reach them.
3. Password Protection
All accounts and resources must be protected by passwords which meet the following requirements, which
must be automatically enforced by the system:
Must be at least eight characters long.
Must NOT be dictionary or common slang words in any language, or be relatively easy to
guess.
Must include at least three of the following four characteristics, in any order: upper case
letters, lower case letters, numbers, and special characters, such as “!@#$%^&*”.
Must be changed at least once per year.
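The password rules above that can be tested from the password text alone could be enforced along the following lines. This is a minimal sketch: the function name `meets_policy` and the tiny `COMMON_WORDS` list are assumptions made for illustration, and the yearly-change rule is left out because it depends on stored dates rather than on the password itself.

```python
import string

# Tiny illustrative blocklist (an assumption); a real system would use a full dictionary.
COMMON_WORDS = {"password", "letmein", "qwertyuiop"}


def meets_policy(password):
    """Return True if the password satisfies the checkable rules listed above."""
    # Rule 1: at least eight characters long.
    if len(password) < 8:
        return False
    # Rule 2: not a dictionary or common word (approximated by the blocklist).
    if password.lower() in COMMON_WORDS:
        return False
    # Rule 3: at least three of the four character classes.
    classes = [
        any(c.isupper() for c in password),
        any(c.islower() for c in password),
        any(c.isdigit() for c in password),
        any(c in string.punctuation for c in password),
    ]
    return sum(classes) >= 3


print(meets_policy("Abcdefg1"))      # True: upper case, lower case and a digit
print(meets_policy("alllowercase"))  # False: only one character class
```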
Digital signatures rely on certain types of encryption to ensure authentication. Encryption is the
process of taking all the data that one computer is sending to another and encoding it into a form
that only the other computer will be able to decode. Authentication is the process of verifying that
information is coming from a trusted source. These two processes work hand in hand for digital
signatures.
Data Backup: Data protection is crucial for protecting your business's continuity. If your only data backup
is on a computer and the hard disk crashes or is damaged by a power surge, your business’s data is gone.
And having paper copies of business data isn't adequate data protection; what if your business premises
burn to the ground or are destroyed in a flood? Once again the data you need to carry on your business
could be irretrievably lost.
For adequate data protection, you need to establish a data backup system that follows these three steps:
The basic rule for business data protection is that if losing the data will interfere with doing business, then
you should back it up. You can reinstall software programs if you need to, but recovering the details of
transactions or business correspondence is impossible if those files are lost or damaged beyond repair.
The rest of this article outlines each of the steps listed above so you can establish a data backup system
that will effectively protect your critical business data from disaster.
Disk mirroring is a real-time strategy that writes data to two or more disks at the same time. If one disk
fails, the other continues to operate and provide access for users. Server mirroring provides the same
functionality, except that an entire server is duplicated. This strategy allows users to continue accessing
data if one of the servers fails.
Encryption is the translation of data into a secret code. It is the most effective way to achieve data security.
To read an encrypted file, you must have access to a secret key or password that enables you
to decrypt it. Unencrypted data is called plain text; encrypted data is referred to as cipher text.
There are two main types of encryption: asymmetric encryption (also called public-key encryption)
and symmetric encryption.
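The idea behind symmetric encryption, where the same key both encrypts and decrypts, can be illustrated with a toy example. The XOR cipher below is an assumption chosen for simplicity; it is NOT secure and is not how real systems encrypt data. The function name `xor_cipher` and the key value are also made up.

```python
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the repeating key (toy cipher, not secure)."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


plain_text = b"Meet at noon"
key = b"secret"  # hypothetical shared secret key

cipher_text = xor_cipher(plain_text, key)  # encrypt
recovered = xor_cipher(cipher_text, key)   # the same key reverses the process

print(cipher_text != plain_text)  # True
print(recovered == plain_text)    # True
```

Without the key, the cipher text cannot simply be read; with the key, the plain text is recovered exactly.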
Access Control
Access Control is any mechanism by which a system grants or revokes the right to access some data, or
perform some action. Normally, a user must first log in to a system, using some Authentication system.
Next, the Access Control mechanism controls what operations the user may or may not perform by
comparing the User ID to an Access Control database.
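A minimal sketch of this lookup is shown below. The user IDs, the operations and the dictionary standing in for the Access Control database are all hypothetical; a real system would hold this information in a protected store.

```python
# Hypothetical Access Control database: user ID -> permitted operations.
ACCESS_CONTROL = {
    "alice": {"read", "write"},
    "bob": {"read"},
}


def is_allowed(user_id, operation):
    """Return True if the logged-in user may perform the operation."""
    # Unknown users get an empty set of permissions, so everything is refused.
    return operation in ACCESS_CONTROL.get(user_id, set())


print(is_allowed("alice", "write"))  # True
print(is_allowed("bob", "write"))    # False
```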
Data integrity is a quality of data which guarantees that the data is complete and has a whole structure.
Data integrity is most often talked about with regard to data residing in databases, where it is also
referred to as database integrity. Data integrity is preserved only if and when the data satisfies all the
business rules and other important rules. These rules might be how each piece of data is related to each
other, validity of dates, lineage, etc. According to data architecture principles, functions such as data
transformation, data storage, meta-data storage and lineage storage must guarantee the integrity of data.
That means, data integrity should be maintained during transfer, storage and retrieval.
If data integrity is preserved, the data can be considered consistent and can be given the assurance to be
certified and reconciled. In terms of data integrity in databases (database integrity), in order to guarantee
that integrity is preserved, you have to ensure that the data becomes an accurate reflection of the
universe it is modeled after. In other words, it must make sure that the data stored in the database
corresponds exactly to the real world details it is modeled after.
When data is input to a computer system it is only valuable data if it is correct. If the data is in error in any
way then no amount of care in the programming will make up for the erroneous data and the results
produced can be expected to be unreliable. There are three types of error that can occur with the data on
entry. The first is that the data, while reasonable, is wrong. If your birthday is written down on a data
capture form as 18th of November 1983, it will (except in very rare cases) be wrong. It can be typed into the
computer with the utmost care as “181183”, it can be checked by the computer to make sure that is a
sensible date, and will then be accepted as your date of birth despite the fact that it is wrong. There is no
reason for the computer to imagine that it may be wrong; quite simply, when you filled out the original form
you made a mistake. The second type of error is when the operator typing in the data hits the wrong key
and types in “181193”, or the equivalent. In this case an error has been made that should be able to be
spotted if a suitable check is made on the input. This type of data checking is called a verification
check. The third type of error is when something is typed in which simply is not sensible. If the computer
knows that there are only 12 months in a year then it will know that “181383” must be wrong because it is
not sensible to be born in the thirteenth month. Checks on the sensibility of the data are called validation
checks.
Faulty data:
There is very little that can be done about faulty data except to let the owner of the data check it visually
on a regular basis. The personal information kept on the school administration system about you and your
family may well be printed off at regular intervals so that your parents can check to ensure that the stored
information is still correct.
Verification means tallying the input data with the original data to make sure that there have been no
transcription errors. The standard way to do this is to input the data twice to the computer system. The
computer then checks the two sets of data (which should be the same) and if there is a difference
between the two sets of data the computer knows that one of the inputs is wrong. It won’t know which
one is wrong but it can now ask the operator to check that particular input.
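Double entry verification can be sketched as a comparison of the two sets of input. The function name `find_mismatches` and the sample dates are assumptions made for illustration.

```python
def find_mismatches(first_pass, second_pass):
    """Return the positions of items whose two entries differ."""
    return [
        position
        for position, (first, second) in enumerate(zip(first_pass, second_pass))
        if first != second
    ]


first_pass = ["181183", "040590"]   # data as typed the first time
second_pass = ["181193", "040590"]  # the same data typed a second time

# Position 0 differs, so the operator is asked to check that input.
print(find_mismatches(first_pass, second_pass))  # [0]
```

The comparison shows where the two inputs disagree, but not which of the two is the wrong one.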
Validation:
The first thing is to clear up a common misinterpretation of validation: the use of parity bits to check
data. This is NOT validation. Parity bits and echoing back are techniques that are used to check
that data has been transmitted properly within a computer system (e.g. from the disk drive to the
processor). Validation checks are used to check the input of data to the system in the first place.
Validation is a check on DATA INPUT to the system by comparing the data input with a set of rules that
the computer has been told the data must follow. If the data does not match up with the rules then there
must be an error. There are many different types of validation checks that can be used to check input in
different applications:
1. Range check. A mathematics exam is out of 100. A simple validation rule that the computer can apply
to any data that is input is that the mark must be between 0 and 100 inclusive. Consequently, a mark of
101 would be rejected by this check as being outside the acceptable range.
2. Character check. A person’s name will consist of letters of the alphabet and sometimes a hyphen or
apostrophe. This rule can be applied to input of a person’s name so that “dav2d” will immediately be
rejected as unacceptable.
3. Format check. A particular application is set up to accept a national insurance number. Each person
has a unique national insurance number, but they all have the same format of characters, 2 letters
followed by 6 digits followed by a single letter. If the computer knows this rule then it knows what the
format of a NI number is and would reject “ABC12345Z” because it is in the wrong format, it breaks the
rule.
4. Length check. A NI number has 9 characters, if more or fewer than 9 characters are keyed in then the
data cannot be accurate.
5. Existence check. A bar code is read at a supermarket check-out till. The code is sent to the main
computer which will search for that code on the stock file. As the stock file contains details of all items
held in stock, if it is not there then the item cannot exist, which it obviously does, therefore the code must
have been wrongly read.
6. Check digit. When the code is read on the item at the supermarket, it consists of numbers. One number
is special; it is called the check digit. If the other numbers have some arithmetic done to them using a
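
Checks 1 to 4 in the list above can each be sketched as a short Python function. The function names are assumptions; the rules themselves (marks from 0 to 100, names made of letters, hyphens and apostrophes, and the 2 letters, 6 digits, 1 letter NI number format) are taken from the descriptions above.

```python
import re


def range_check(mark):
    """A mark must be between 0 and 100 inclusive."""
    return 0 <= mark <= 100


def character_check(name):
    """A name may contain only letters, hyphens and apostrophes."""
    return bool(name) and all(c.isalpha() or c in "-'" for c in name)


def format_check(ni_number):
    """An NI number is 2 letters, then 6 digits, then a single letter."""
    return re.fullmatch(r"[A-Za-z]{2}[0-9]{6}[A-Za-z]", ni_number) is not None


def length_check(ni_number):
    """An NI number has exactly 9 characters."""
    return len(ni_number) == 9


print(range_check(101))           # False: outside the acceptable range
print(character_check("dav2d"))   # False: contains a digit
print(format_check("ABC12345Z"))  # False: wrong format
print(length_check("ABC12345Z"))  # True: it does have 9 characters
```

Note that “ABC12345Z” passes the length check but fails the format check, which is why several different checks are often applied to the same input.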
When you send data across the internet or even from your USB to a computer, you are sending millions
upon millions of ones and zeros. What would happen if one of them got corrupted? Think of this situation:
You are buying a new game from an online retailer and put £40 into the payment box. You click on send
and the number 40 is sent to your bank stored in a byte: 00101000. Now imagine if the second most
significant bit got corrupted on its way to the bank, and the bank received the following: 01101000. You'd
be paying £104 for that game! Error Checking and Correction stops things like this happening. There are
many ways to detect and correct corrupted data; we are going to learn two.
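The corrupted payment in this example can be reproduced in a few lines of Python. The variable names are made up; the bit patterns are the ones given in the text.

```python
sent = 0b00101000       # the byte holding 40
mask = 0b01000000       # selects the second most significant bit
received = sent ^ mask  # XOR flips just that one bit

print(sent, received)           # 40 104
print(format(received, "08b"))  # 01101000
```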
a) Parity: All data is transmitted as bits (0s and 1s). The Number of 1s in a byte must always be either an
odd number or an even number. If two devices that are communicating decide that there will always be an
odd number of 1s, then if a byte is received that has an even number of 1s, an error must have occurred.
E.g. the byte 01011000 has 3 ones in it. 3 is an odd number, so it fits the rule that it must have an odd
number of ones. When it is sent there is an error in transmission so that the first bit is received as a one.
So, the byte received is 11011000. This has 4 ones in it, which is an even number, so there must be an
error. The receiving device would ask for it to be sent again.
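The odd parity check in this example can be sketched as follows. The function names are assumptions; the two bytes are the ones used in the example above.

```python
def count_ones(byte):
    """Count the 1 bits in a byte."""
    return bin(byte).count("1")


def passes_odd_parity(byte):
    """Return True if the byte holds an odd number of 1s."""
    return count_ones(byte) % 2 == 1


sent = 0b01011000      # 3 ones: fits the odd parity rule
received = 0b11011000  # first bit corrupted: 4 ones

print(passes_odd_parity(sent))      # True
print(passes_odd_parity(received))  # False, so a resend is requested
```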
Notes:
If two mistakes are made in the same byte they will cancel each other out and the faulty data will be
accepted. This problem can be overcome, and in the same way, a clever way of correcting
errors can be implemented. This method is not part of this course.
Earlier in this course it was said that a byte was the number of bits necessary to hold a character
code. Specifically, an ASCII character uses 8 bits in a byte, giving 256 different characters. This is not
true because one of the bits has to be reserved for a parity bit, the bit that can change to make sure
that the number of ones is always odd. This means that there are 128 different characters possible.
The implication in all the above work is that odd parity is always used. Even parity can equally well be
It is an error-checking technique involving the comparison of a transmitted block check character with
one calculated by the receiving device. Parallel parity is based on regular parity. Parallel parity can detect
whether or not there was an error; furthermore, it can detect which bit has flipped. This method is
implemented on a block of data which is made up of several words; a parity bit is then added to each
column and each row.
Here’s an example:
1 1 1 0 1 0 1 1
1 1 0 0 0 1 1 0
0 0 1 0 0 1 1 1
0 0 0 0 1 0 1 0
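Every row and every column of the block above contains an even number of 1s, so the sketch below assumes even parity was intended. If a single bit is flipped, exactly one row and one column end up with odd parity, and the bit where they cross is the faulty one. The function name `find_flipped_bit` is made up for illustration.

```python
def find_flipped_bit(block):
    """Return (row, column) of a single flipped bit, or None if all parities are even."""
    bad_rows = [r for r, row in enumerate(block) if sum(row) % 2 == 1]
    bad_cols = [c for c, col in enumerate(zip(*block)) if sum(col) % 2 == 1]
    if not bad_rows and not bad_cols:
        return None  # no error detected
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    raise ValueError("more than one bit appears to be corrupted")


block = [
    [1, 1, 1, 0, 1, 0, 1, 1],
    [1, 1, 0, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 0, 1, 0],
]
print(find_flipped_bit(block))  # None

corrupted = [row[:] for row in block]
corrupted[1][5] ^= 1  # simulate one bit flipping in transit
print(find_flipped_bit(corrupted))  # (1, 5)
```

Because the position is known, the receiver can correct the error by flipping that bit back instead of asking for the data to be sent again.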
b) Check Sum: Data will normally be sent from one place to another as a block of bytes rather than as
individual bytes. The computer can add numbers together without any trouble, so another checking
procedure is to add all the bytes together that are being sent in the block of data. The carry, out of the
byte, is not taken into account, so the answer is an 8 bit number, just like the bytes. This answer is
calculated before the data is sent, and then calculated again when it is received, and if there are no errors
in the transmission, the two answers will match.
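A minimal sketch of this check sum is shown below. Discarding the carry out of the byte is the same as taking the total modulo 256. The function name and the three byte values are invented for illustration.

```python
def checksum(block):
    """Add all the bytes and discard any carry out of the byte (modulo 256)."""
    return sum(block) % 256


data = bytes([200, 100, 40])    # hypothetical block of three bytes
sent_checksum = checksum(data)  # 340 with the carry discarded
print(sent_checksum)            # 84

received = bytes([200, 100, 104])           # last byte corrupted in transit
print(checksum(received) == sent_checksum)  # False, so an error is detected
```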
Chapter: 1.7 Ethics and ownership
Ethics can be defined as "moral principles that govern a person's or a group's behaviors". Ethical
behavior is not necessarily related to the law. For example, just because something is not against the law
doesn't mean it is okay to do it.
Computer ethics are concerned with standards of conduct applying to the use of computers.
Computer ethics can be understood as the branch of applied ethics which studies and analyzes social and
ethical impact of information technology.
The Computer Ethics Institute has published their "Ten Commandments of Computer Ethics" to guide
responsible computer use. They are as follows:
The Software Engineering Code of Ethics and Professional Practice, produced by the Institute of
Electrical and Electronics Engineers Computer Society (IEEE CS) and the Association for Computing
Machinery (ACM), acts as a professional standard for teaching and practicing software engineering. It
specifies ethical and professional obligations of software engineers and states the standards that
society at large expects them to meet and what they should expect of one another. The code
also tells the public what they should expect from software engineers. The code was produced by a
multinational task force which considered input from a variety of sources including industrial, government
and military installations and educational professions. An informative article about the development of
the code, which includes a full copy of the code itself, was published in the October 1999 issue of ACM
Computer.
Software engineers shall commit themselves to making the analysis, specification, design, development,
testing and maintenance of software a beneficial and respected profession. In accordance with their
commitment to the health, safety and welfare of the public, software engineers shall adhere to the
following Eight Principles:
Public: Software engineers shall act consistently with the public interest.
Client and Employer: Software engineers shall act in a manner that is in the best interests of their client
and employer, consistent with the public interest.
Product: Software engineers shall ensure that their products and related modifications meet the highest
professional standards possible.
Judgment: Software engineers shall maintain integrity and independence in their professional judgment.
Management: Software engineering managers and leaders shall subscribe to and promote an ethical
approach to the management of software development and maintenance.
Profession: Software engineers shall advance the integrity and reputation of the profession consistent
with the public interest.
Self: Software engineers shall participate in lifelong learning regarding the practice of their profession and
shall promote an ethical approach to the practice of the profession.
A Code of Conduct is not law, but it is a set of rules that apply when you are in an organization such as
your college. Examples might include "Don't watch pornography at the office". This would be legal at
home, but if you did it at the office, you could be fired. In addition, a code of conduct may contain laws
such as "Don't install pirated software".
The British Computer Society has produced a list of standards for the training and development of
Information Technology workers.
The Public Interest - safeguarding public health; respecting rights of 3rd parties, applying knowledge of
relevant regulations.
Duty to employers and clients - carrying out work according to the requirements, and not abusing
employers' or clients' trust in any way.
Professional duty - uphold the reputation of the profession through good practice, support fellow
members in professional development.
Professional Integrity and Competence - maintain standards of professional skill and practice, accepting
responsibility for work done, avoiding conflicts of interest with clients.
(Each of these might be perfectly legal at home, but they might get you fired at work)
One of the more controversial areas of computer ethics concerns the intellectual property rights
connected with software ownership. Some people, like Richard Stallman who started the Free Software
Foundation, believe that software ownership should not be allowed at all. He claims that all information
should be free, and all programs should be available for copying, studying and modifying by anyone who
wishes to do so [Stallman, 1993]. Others argue that software companies or programmers would not invest
weeks and months of work and significant funds in the development of software if they could not get the
investment back in the form of license fees or sales [Johnson, 1992]. Today's software industry is a
multibillion dollar part of the economy; and software companies claim to lose billions of dollars per year
through illegal copying (“software piracy”). Many people think that software should be ownable, but
“casual copying” of personally owned programs for one's friends should also be permitted [Nissenbaum,
1995]. The software industry claims that millions of dollars in sales are lost because of such copying.
Ownership is a complex matter, since there are several different aspects of software that can be owned
and three different types of ownership: copyrights, trade secrets, and patents.
The “source code” which is written by the programmer(s) in a high-level computer language like
Java or C++.
The “object code”, which is a machine-language translation of the source code.
The “algorithm”, which is the sequence of machine commands that the source code and object
code represent.
The “look and feel” of a program, which is the way the program appears on the screen and
interfaces with users.
The way you use data and computers is subject to the law of the country you are living in. Different
countries have different laws; for the exam you only need to learn about the laws that
affect the United Kingdom.
You must be familiar with the following legislation:
Copyright
Software copyright refers to the law regarding the copying of computer software. Many companies and
individuals write software and sell it for money. These products are copyrighted and you cannot copy the
code or the program without the permission of the maker. This, they believe, protects the work of the
programmers, rewarding them for their efforts.
Other companies and individuals release software under Free and Open Source Software (FOSS) licenses.
These licenses allow users the right to use, study, change, and improve a program's design through the
availability of its source code. Some adherents of FOSS believe it creates better software in the long term,
and others believe that no software should be copyrighted. FOSS licensed products are heavily used in
running the World Wide Web and in the creation of popular websites such as Facebook. Many open source
licenses require that if you create software that makes changes to open source code, and choose
to release it, you must release your new code under the same open source license; this is called
copyleft. Some free software is in the public domain, meaning that you can use it for whatever purpose
you wish. If you make a software product involving changes to public domain source code, you don't have
to release your code into the public domain.
Copyright in most works lasts until 70 years after the death of the creator if known, otherwise 70 years
after the work was created or published (fifty years for computer-generated works).
In summary the act specifies that users are not allowed to:
The Computer Misuse Act 1990 deals with people who crack computer programs or systems. Crimes
might include removing the copyright protection measures from a commercial software product, breaking
into a school database to change grades, hacking into a company's website and stealing customer credit
card details, creating viruses and Trojans, and so on. It was recognized in the late 1980s that the increase
in business and home use of computers required legislation in order to protect against their exploitation.
To this end, in 1990 the Computer Misuse Act was established.
Under the act, three new offences were created:
It prohibits:
Regulation of Investigatory Powers Act 2000 (measures to restrict access to data made available through
the Internet and World Wide Web)
The Regulation of Investigatory Powers Act was passed in 2000, and introduces the power to intercept
communications with the aim of taking into account the growth of the Internet. It regulates the manner in
which certain public bodies may conduct surveillance and access a person's electronic communications.
Critics of the act claimed this was an excuse to introduce new measures. These included
being able to force someone to reveal a cryptographic key for their data, with failure to do so resulting in
up to 2 years imprisonment. As we have seen in packet switching, data can be read in transit between
hosts. However, the act goes further than allowing this:
enables certain public bodies to demand that an ISP provide access to a customer's
communications in secret;
enables mass surveillance of communications in transit;
enables certain public bodies to demand ISPs fit equipment to facilitate surveillance;
enables certain public bodies to demand that someone hand over keys to protected information;
allows certain public bodies to monitor people's internet activities;
prevents the existence of interception warrants and any data collected with them from being
revealed in court.
Software License
A software license is a legally binding agreement that specifies the terms of use for an application and
defines the rights of the software producer and of the end-user.
All software must be legally licensed before it may be installed. Proof of purchase (purchase orders,
receipts, invoices or similar documentation are acceptable) must be maintained by individuals or
departments.
Software licensing can be a confusing subject. There are different types of licenses and licensing
contracts, and different vendors may use different terms to describe their licenses. Here are some key
terms to help you navigate through these murky waters.
Shareware. Shareware is free to try out. You usually have to pay if you want to continue using it. Some
shareware relies on the honesty of users to pay up when they should.
Open source business software. Open source software can be freely adapted by anyone with the
knowledge and inclination to do so. The open source system has created many useful pieces of software
that are the product of loose collaboration between many people, all over the world.
Commercial/Proprietary Software
Proprietary software consists of software that is licensed by the copyright holder under very specific
conditions. In general, you can use the software, but you are not allowed to modify the software or
distribute it to others.
Many proprietary software applications are also commercial, meaning that you have to pay for a license.
However, many other proprietary software applications are free. The fact that software is free does not
mean it is not proprietary.
Chapter: 1.8 Database and data modelling
A database is an organized collection of data for one or more purposes, usually in digital form. The data
are typically organized to model relevant aspects of reality (for example, the availability of rooms in
hotels), in a way that supports processes requiring this information (for example, finding a hotel with
vacancies). This definition is very general, and is independent of the technology used.
Originally all data were held in files. A typical file would consist of a large number of records each of
which would consist of a number of fields. Each field would have its own data type and hold a single item
of data. Typically a stock file would contain records describing stock. Each record may consist of the
following fields.
Description String
This led to very large files that were difficult to process. Suppose we want to know which items need to
be reordered. This is fairly straightforward, as we only need to sequentially search the file and, if Number
in Stock is less than the Reorder Level, make a note of the item and the supplier and output the details.
The problem is when we check the stock the next day, we will create a new order because the stock that
has been ordered has not been delivered. To overcome this we could introduce a new field called On
Order of type Boolean. This can be set to True when an order has been placed and reset to False when an
order has been delivered. Unfortunately it is not that easy.
The original software is expecting the original seven fields not eight fields. This means that the software
designed to manipulate the original file must be modified to read the new file layout.
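The reorder check described above can be sketched as a sequential search. This is only an illustration: the field names follow the text, but the records, the supplier names and the function name items_to_reorder are invented for the example.

```python
# Hypothetical stock records: field names follow the text, values are invented.
stock_file = [
    {"description": "Tea", "supplier": "Food & Drink Ltd.",
     "number_in_stock": 4, "reorder_level": 10, "on_order": False},
    {"description": "Coffee", "supplier": "Food & Drink Ltd.",
     "number_in_stock": 25, "reorder_level": 10, "on_order": False},
    {"description": "Sugar", "supplier": "Sweet Supplies",
     "number_in_stock": 2, "reorder_level": 5, "on_order": True},
]


def items_to_reorder(records):
    """Sequentially search the records and return (item, supplier) pairs to order."""
    to_order = []
    for record in records:
        low = record["number_in_stock"] < record["reorder_level"]
        if low and not record["on_order"]:
            to_order.append((record["description"], record["supplier"]))
            record["on_order"] = True  # set when the order is placed
    return to_order


first_day = items_to_reorder(stock_file)   # Tea is ordered
second_day = items_to_reorder(stock_file)  # nothing new: Tea is already on order
```

Without the On Order flag the second run would order Tea again, which is exactly the problem the extra field is meant to solve.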
Further ad hoc enquiries are virtually impossible. What happens if management ask for a list of best
selling products? The file has not been set up for this and to change it so that such a request can be
satisfied in the future involves modifying all existing software. Further, suppose we want to know which
products are supplied by Food & Drink Ltd. In some cases the company's name has been entered as
Food & Drink Ltd., sometimes as Food and Drink Ltd. and sometimes the full stop after Ltd has been
omitted. This means that a match is very difficult because the data is inconsistent. Another problem is
that each time a new product is added to the database both the name and address of the supplier must be
entered. This leads to redundant data or data duplication.
The following example, shown in Fig. a.1, shows how data can proliferate when each department
keeps its own files.
[Fig. a.1: The Accounts Department has programs to record accounts of customers and a file containing
customer name and address, amount owing, dates of orders, etc.]
This method of keeping data uses flat files. Flat files have the following limitations.
To try to overcome the search problems of sequential files, relational database management systems
were introduced.
A database management system (DBMS) is a software package with computer programs that control the
creation, maintenance, and the use of a database. It allows organizations to conveniently develop
databases for various applications. A database is an integrated collection of data records, files, and other
database objects. A DBMS allows different user application programs to concurrently access the same
database. DBMSs may use a variety of database models, such as the relational model, to conveniently
describe and support applications.
The relational model for database management is a database model that was first formulated and
proposed in 1969 by Edgar F. Codd. The purpose of the relational model is to provide a declarative method
for specifying data and queries: users directly state, in simple sentences, what information the
database contains and what information they want from it, and let the database management system
software take care of describing data structures for storing the data and retrieval procedures for
answering queries.
A DBMS can be defined as a collection of related records and a set of programs that access and
manipulate these records. A DBMS enables the user to enter, store, and manage data. The main problem
with the earlier DBMS packages was that the data was stored in the flat file format. So, the information
about different objects was maintained separately in different physical files. Hence, the relations between
these objects, if any, had to be maintained in a separate physical file. Thus, a single package would
consist of too many files and vast functionalities to integrate them into a single system.
A solution to these problems came in the form of a centralized database system. In a centralized
database system, the database is stored in the central location. Everybody can have access to the data
stored in a central location from their machine. For example, a large central database system would
contain all the data pertaining to the employees. The Accounts and the HR department would access the
data required using suitable programs. These programs or the entire application would reside on
individual computer terminals.
A Database is a collection of interrelated data, and a DBMS is a set of programs used to add or modify this
data. Thus, a DBMS is a set of software programs that allow databases to be defined, constructed, and
manipulated.
A DBMS provides an environment that is both convenient and efficient to use when there is a large volume
of data and many transactions to be processed. Different categories of DBMS can be used, ranging from
small systems that run on personal computers to huge systems that run on mainframes.
Advantage | Notes
... | ... individual departments.
More information | Data sharing by departments means that ...
Improved security | The database administrator (DBA) can define data ...
Increased productivity | The DBMS provides file handling processes ... applications to be re-written.
Improved back-up and ... | DBMSs automatically handle back-up and ...
Benefits of DBMS
A DBMS is responsible for processing data and converting it into information. For this purpose, the
database has to be manipulated, which includes querying the database to retrieve specific data, updating
the database, and finally, generating reports.
These reports are the source of information, that is, processed data. A DBMS is also responsible for data
security and integrity.
Data storage
A DBMS handles the physical storage of data by creating complex data structures; this process is
called data storage management.
Data definition
A DBMS provides functions to define the structure of the data in the application. These include defining
and modifying the record structure, the type and size of fields, and the various constraints/conditions to
be satisfied by the data in each field.
Data manipulation
Once the data structure is defined, data needs to be inserted, modified, or deleted. The functions that
perform these operations are also part of a DBMS. These functions can handle planned and unplanned
data manipulation needs. Planned queries are those that form part of the application. Unplanned
queries are ad hoc queries, which are performed as the need arises.
Data security is of utmost importance when multiple users access the database. It is required
for keeping a check on data access by users. The security rules specify which user has access to the
database, what data elements the user has access to, and the data operations that the user can perform.
Data in the database should contain as few errors as possible. For example, the employee number for
adding a new employee should not be left blank. Telephone number should contain only numbers. Such
checks are taken care of by a DBMS.
Thus, the DBMS contains functions that handle the security and integrity of data in the application.
These can be easily invoked by the application and hence, the application programmer need not code
these functions in the programs.
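The two integrity checks mentioned above (employee number not blank, telephone number digits only) can be declared once in the table definition so that the DBMS enforces them. The sketch below uses SQLite through Python's sqlite3 module as a stand-in DBMS; the table name, column names and sample rows are assumptions made for the example, and other database systems may express the digit check differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("""
    CREATE TABLE employee (
        emp_no    INTEGER NOT NULL,
        name      TEXT,
        -- reject any value containing a character that is not a digit
        telephone TEXT CHECK (telephone NOT GLOB '*[^0-9]*')
    )
""")


def try_insert(row):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO employee (emp_no, name, telephone) VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert((1, "Ann", "5550100")),     # valid row
    try_insert((None, "Bob", "5550101")),  # blank employee number
    try_insert((2, "Cy", "555-0102")),     # telephone contains a non-digit
]
```

The application code only attempts the insert; the checks themselves live in the database definition.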
Recovery of data after a system failure and concurrent access of records by multiple users are also
handled by a DBMS.
Performance
Optimizing the performance of the queries is one of the important functions of a DBMS. Hence, the DBMS
has a set of programs forming the Query Optimizer, which evaluates the different implementations of a
query and chooses the best among them.
At any point of time, more than one user can access the same data. A DBMS takes care of the sharing of
data among multiple users, and maintains data integrity.
The query language of a DBMS implements data access. SQL is the most commonly used query language.
A query language is a non-procedural language, where the user needs to request what is required and
need not specify how it is to be done. Procedural languages such as C, Visual Basic, and Pascal
provide data access to programmers through APIs and other tools.
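To make the contrast concrete, the sketch below answers the same question twice: once with a non-procedural SQL query (stating what is wanted) and once with a procedural loop (stating how to find it). SQLite via Python's sqlite3 module is used as a convenient stand-in; the stock table and its contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock (description TEXT, number_in_stock INTEGER, reorder_level INTEGER)")
rows = [("Tea", 4, 10), ("Coffee", 25, 10), ("Sugar", 2, 5)]
conn.executemany(
    "INSERT INTO stock (description, number_in_stock, reorder_level) VALUES (?, ?, ?)",
    rows)

# Non-procedural: say WHAT is required; the DBMS decides how to find it.
declarative = [r[0] for r in conn.execute(
    "SELECT description FROM stock "
    "WHERE number_in_stock < reorder_level ORDER BY description")]

# Procedural: spell out HOW to find it, step by step.
procedural = []
for description, in_stock, level in rows:
    if in_stock < level:
        procedural.append(description)
procedural.sort()
```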
Data Modelling
This shows that a data model can be an external model (or view), a conceptual model, or a physical model.
This is not the only way to look at data models, but it is a useful way, particularly when comparing models.
Conceptual schema: describes the semantics of a domain (the scope of the model). For example, it may
be a model of the interest area of an organization or of an industry. This consists of entity classes,
representing kinds of things of significance in the domain, and relationship assertions about
associations between pairs of entity classes. A conceptual schema specifies the kinds of facts or
propositions that can be expressed using the model. In that sense, it defines the allowed expressions in
an artificial "language" with a scope that is limited by the scope of the model. Simply described, a
conceptual schema is the first step in organizing the data requirements.
Logical schema: describes the structure of some domain of information. This consists of descriptions of
(for example) tables, columns, object-oriented classes, and XML tags. The logical schema and conceptual
schema are sometimes implemented as one and the same.
Physical schema: describes the physical means used to store data. This is concerned with partitions,
CPUs, table spaces, and the like.
According to ANSI, this approach allows the three perspectives to be relatively independent of each other.
Storage technology can change without affecting either the logical or the conceptual schema. The
table/column structure can change without (necessarily) affecting the conceptual schema. In each case,
of course, the structures must remain consistent across all schemas of the same data model.
Database Schema:
A database schema is the skeleton of a database. It is designed before the database exists, and it is
very hard to change once the database is operational. A database schema does not contain any data or
information.
A database schema defines the entities and the relationships among them. It is a
descriptive detail of the database, which can be depicted by means of schema diagrams. The database
designer produces the schema to help programmers understand all aspects of the database.
Physical Database Schema: This schema pertains to the actual storage of data and its form of storage,
such as files and indices. It defines how data will be stored in secondary storage.
Logical Database Schema: This defines all logical constraints that need to be applied to the data stored. It
defines tables, views, and integrity constraints.
Data Dictionary
In database management systems, a data dictionary is a file that defines the basic organization of a
database. A data
dictionary contains a list of all files in the database, the number of records in each file, and the names and
types of each field. Most database management systems keep the data dictionary hidden from users to
prevent them from accidentally destroying its contents.
Data dictionaries do not contain any actual data from the database, only book-keeping information for
managing it. Without a data dictionary, however, a database management system cannot access data
from the database.
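As one concrete illustration, SQLite keeps this kind of book-keeping information in a catalog table called sqlite_master and exposes field details through PRAGMA table_info. The sketch below reads both through Python's sqlite3 module; the person table is an assumption made for the example, and other database systems store their data dictionary differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE person (
        personid INT PRIMARY KEY,
        lname    VARCHAR(20),
        fname    VARCHAR(20) NOT NULL
    )
""")

# List of tables known to the database (held in SQLite's catalog).
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# Names and declared types of each field in the person table.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(person)")]

# Number of records currently in the table.
record_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
```

None of these values is data entered by a user; they describe the database rather than its contents.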
Every database software provides the interface to design schemas and manipulate data through the query
processors.
The Relational Model is an attempt to simplify database structures. It represents all data in the database
as simple row-column tables of data values. An RDBMS is a software program that helps to create,
maintain, and manipulate a relational database. A relational database is a database divided into logical
units called tables, where tables are related to one another within the database.
Tables are related in a relational database, allowing adequate data to be retrieved in a single query
(although the desired data may exist in more than one table). By having common keys, or fields, among
relational database tables, data from multiple tables can be joined to form one large result set.
Thus, a relational database is a database structured on the relational model. The basic characteristic
of the relational model is that data is stored in relations.
Both the tables have a common column, that is, the Country column. Now, if the user wants to
display the information about the currency used in Rome, first find the name of the country to
which Rome belongs. This information can be retrieved from table 1.6. Next, that country should
be looked up in table 1.7 to find out the currency.
It is possible to get this information because it is possible to establish a relation between the
two tables through a common column called Country.
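The two-step lookup described above can be expressed as a single query that joins the two tables on the common Country column. Tables 1.6 and 1.7 are not reproduced here, so the table names, column names and rows below are assumptions; SQLite through Python's sqlite3 module is used only as a convenient way to run the SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (city TEXT PRIMARY KEY, country TEXT)")
conn.execute("CREATE TABLE country (country TEXT PRIMARY KEY, currency TEXT)")
conn.executemany("INSERT INTO city (city, country) VALUES (?, ?)",
                 [("Rome", "Italy"), ("Tokyo", "Japan")])
conn.executemany("INSERT INTO country (country, currency) VALUES (?, ?)",
                 [("Italy", "Euro"), ("Japan", "Yen")])

# Join on the common column: city -> country -> currency.
currency = conn.execute(
    """
    SELECT country.currency
    FROM city
    INNER JOIN country ON city.country = country.country
    WHERE city.city = ?
    """,
    ("Rome",),
).fetchone()[0]
```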
There are certain terms that are mostly used in an RDBMS. These are described as follows:
For example, a company might have an Employee table with a row for each employee; typically called the
entity. What attributes might be interesting for such a table? This will depend on the application and
the type of use the data will be put to, and is determined at database design time.
An entity is anything living or nonliving with certain attributes to which some values can be assigned.
These are separate entities for which different tables are designed in a schema.
The tables 1.8, 1.9, 1.10, and 1.11 are used to illustrate this scenario. These tables depict tuples and
attributes in the form of rows and columns. Various terms related to these tables are given in table
1.12.
The components of an RDBMS are entities and tables, which will be explained in this section.
Entity
An entity is a person, place, thing, object, event, or even a concept, which can be distinctly identified. For
example, the entities in a university are students, faculty members, and courses.
Each entity has certain characteristics known as attributes. For example, the student entity might include
attributes such as student number, name, and grade. Each attribute should be named appropriately.
A grouping of related entities becomes an entity set. Each entity set is given a name. The name of the
entity set reflects the contents. Thus, the attributes of all the students of the university will be stored in an
entity set called Student.
The access and manipulation of data is facilitated by the creation of data relationships based on a
construct known as a table. A table contains a group of related entities, that is, an entity set. The terms
entity set and table are often used interchangeably. A table is also called a relation. The rows are known
as tuples. The columns are known as attributes. Figure 1.6 highlights the characteristics of a table.
Each table must have a key known as primary key that uniquely identifies each row.
All values in a column must conform to the same data format. For example, if the attribute is
assigned a decimal data format, all values in the column representing that attribute must be in
decimals.
Each column has a specific range of values known as the attribute domain.
The differences between a DBMS and an RDBMS are listed in table 1.13.
In an RDBMS, a relation is given more importance. Thus, the tables in an RDBMS are dependent and the
user can establish various integrity constraints on these tables so that the ultimate data used by the user
remains correct. In case of a DBMS, entities are given more importance and there is no relation
established among these entities.
[Delivery note (figure): a customer in England, with the products 1 Table, 2 Desk, 3 Chair]
In this example, the delivery note has more than one part on it. This is called a repeating group. In the
relational database model, each record must be of a fixed length and each field must contain only one
item of data, so a variable number of fields is not allowed. In
this example, we cannot say 'let there be three fields for the products', as some customers may order more
products than this and others fewer. So, repeating groups are not allowed.
At this stage we should start to use the correct vocabulary for relational databases. Instead of fields we
call the columns attributes and the rows are called tuples. The files are called relations (or tables).
DELNOTE(Num, CustName, City, Country, (ProdID, Description))
where DELNOTE is the name of the relation (or table) and Num, CustName, City, Country, ProdID and
Description are the attributes. ProdID and Description are put inside parentheses because they form a
repeating group. In tabular form the data may be represented by Fig. 3.6 (b)2.
[Fig. 3.6 (b)2: DELNOTE in tabular form; surviving rows of the repeating group: 2 Desk, 3 Chair]
This again shows the repeating group. We say that this is in un-normalised form (UNF). To put it into 1st
normal form (1NF) we complete the table and identify a key that will make each tuple unique. This is
shown in Fig. 3.6 (b)3.
To make each row unique we need to choose Num together with ProdID as the key. Remember, another
delivery note may have the same products on it, so we need to use the combination of Num and ProdID to
form the key. We can write this as
DELNOTE(Num, CustName, City, Country, ProdID, Description)
To indicate the key, we simply underline the attributes that make up the key (here Num and ProdID).
Because we have identified a key that uniquely identifies each tuple, we have removed the repeating
group.
A relation with repeating groups removed is said to be in First Normal Form (1NF). That is, a
relation in which the intersection of each tuple and attribute (row and column) contains one
and only one value.
However, the relation DELNOTE still contains redundancy. Do we really need to record the details of the
customer for each item on the delivery note? Clearly, the answer is no. Normalisation theory recognises
this and allows relations to be converted to Third Normal Form (3NF). This form solves most problems.
(Note: Occasionally we need to use Boyce-Codd Normal Form, 4NF and 5NF. This is rare and beyond the
scope of this syllabus.)
Let us now see how to move from 1NF to 2NF and on to 3NF.
Definition of 2NF
A relation is in Second Normal Form (2NF) if it is in 1NF and every non-primary-key attribute is fully
dependent on the primary key. That is, all the partial (incomplete) dependencies have been
removed.
In our example, using the data supplied, CustName, City and Country depend only on Num and not on
ProdID. Description only depends on ProdID, it does not depend on Num. We say that
ProdID determines Description
and write
ProdID → Description
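One way to read "X determines Y" is: whenever two rows agree on X, they also agree on Y. The sketch below checks that property on a few rows. The function name determines is hypothetical, and the rows are a small sample built from the product numbers and descriptions used in this example.

```python
def determines(rows, lhs, rhs):
    """Return True if equal values of lhs always come with equal values of rhs."""
    seen = {}
    for row in rows:
        key = row[lhs]
        if key in seen and seen[key] != row[rhs]:
            return False  # same lhs value, different rhs value
        seen[key] = row[rhs]
    return True


rows = [
    {"Num": "005", "ProdID": 1, "Description": "Table"},
    {"Num": "005", "ProdID": 2, "Description": "Desk"},
    {"Num": "005", "ProdID": 3, "Description": "Chair"},
    {"Num": "008", "ProdID": 2, "Description": "Desk"},
    {"Num": "008", "ProdID": 7, "Description": "Cupboard"},
]

prodid_determines_description = determines(rows, "ProdID", "Description")
num_determines_description = determines(rows, "Num", "Description")
```

Here ProdID determines Description, but Num does not, because delivery note 005 lists several different descriptions.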
DELNOTE(Num, CustName, City, Country)
PRODUCT(ProdID, Description)
DEL_PROD(Num, ProdID)
Note the keys (underlined) for each relation. DEL_PROD needs a compound key because a delivery note
may contain several parts and similar parts may be on several delivery notes. We now have the relations
in 2NF.
Can you see any more data repetitions? The following table of data may help.
A relation is in Third Normal Form (3NF) if it is in 1NF and 2NF and no non-primary-key attribute is
transitively dependent on the primary key. That is, every non-key attribute depends on the primary key
directly, not through another non-key attribute.
City → Country
Num → CustName
DELNOTE(Num, CustName, City)
CITY_COUNTRY(City, Country)
PRODUCT(ProdID, Description)
DEL_PROD(Num, ProdID)
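The decomposition from 1NF to 2NF and then 3NF can be sketched with plain sets of tuples. The delivery note numbers, product numbers and descriptions follow this example; the customer names and cities are invented placeholders, so treat the data as an assumption rather than the original table.

```python
# Rows of DELNOTE in 1NF: (Num, CustName, City, Country, ProdID, Description).
# Customer names and cities are invented for illustration.
rows_1nf = [
    ("005", "Customer A", "London", "England", 1, "Table"),
    ("005", "Customer A", "London", "England", 2, "Desk"),
    ("005", "Customer A", "London", "England", 3, "Chair"),
    ("008", "Customer B", "Paris", "France", 2, "Desk"),
    ("008", "Customer B", "Paris", "France", 7, "Cupboard"),
    ("014", "Customer C", "London", "England", 5, "Cabinet"),
    ("002", "Customer D", "Rome", "Italy", 7, "Cupboard"),
    ("002", "Customer D", "Rome", "Italy", 1, "Table"),
    ("002", "Customer D", "Rome", "Italy", 2, "Desk"),
]

# 2NF: attributes that depend on only part of the key (Num, ProdID) move out.
delnote_2nf = {(r[0], r[1], r[2], r[3]) for r in rows_1nf}
product = {(r[4], r[5]) for r in rows_1nf}
del_prod = {(r[0], r[4]) for r in rows_1nf}

# 3NF: Country depends on City, not directly on Num, so it moves out too.
delnote_3nf = {(num, cust, city) for (num, cust, city, _country) in delnote_2nf}
city_country = {(city, country) for (_num, _cust, city, country) in delnote_2nf}

# Joining the 3NF relations back together reproduces the original rows.
rebuilt = {
    (num, cust, city, country, prod_id, desc)
    for (num, cust, city) in delnote_3nf
    for (city2, country) in city_country if city2 == city
    for (num2, prod_id) in del_prod if num2 == num
    for (prod_id2, desc) in product if prod_id2 == prod_id
}
```

Each customer and each product description is now stored once, yet no information has been lost, since the join rebuilds every original row.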
[Table: DELNOTE in 1NF]
[Tables: DELNOTE and PRODUCT in 2NF; surviving PRODUCT row: 5 Cabinet]
DEL_PROD
Num ProdID
005 1
005 2
005 3
008 2
008 7
014 5
002 7
002 1
002 2
[Tables: DELNOTE, DEL_PROD, PRODUCT and CITY_COUNTRY in 3NF; the surviving DEL_PROD rows match the
2NF DEL_PROD table; surviving PRODUCT rows: 7 Cupboard, 5 Cabinet]
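UNF
DELNOTE(Num, CustName, City, Country, (ProdID, Description))
1NF
DELNOTE(Num, CustName, City, Country, ProdID, Description)
2NF
DELNOTE(Num, CustName, City, Country)
PRODUCT(ProdID, Description)
DEL_PROD(Num, ProdID)
3NF
DELNOTE(Num, CustName, City)
CITY_COUNTRY(City, Country)
PRODUCT(ProdID, Description)
DEL_PROD(Num, ProdID)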
In this Section we have seen the data presented as tables. These tables give us a view of the data. The
tables do NOT tell us how the data is stored in the computer, whether it be in memory or on backing store.
Tables are used simply because this is how users view the data. We can create new tables from the ones
that hold the data in 3NF. Remember, these tables simply define relations.
Users often require different views of data. For example, a user may wish to find out the countries to
which they have sent desks. This is a simple view consisting of one column. We can create this table by
using the following relations (tables).
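A view of this kind could be built by joining the four 3NF relations. The sketch below does so in SQLite through Python's sqlite3 module. The delivery note and product numbers follow this example, while the customer names, the cities and the view name DESK_COUNTRIES are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DELNOTE (Num TEXT PRIMARY KEY, CustName TEXT, City TEXT);
    CREATE TABLE CITY_COUNTRY (City TEXT PRIMARY KEY, Country TEXT);
    CREATE TABLE PRODUCT (ProdID INTEGER PRIMARY KEY, Description TEXT);
    CREATE TABLE DEL_PROD (Num TEXT, ProdID INTEGER, PRIMARY KEY (Num, ProdID));

    INSERT INTO DELNOTE (Num, CustName, City) VALUES
        ('005', 'Customer A', 'London'), ('008', 'Customer B', 'Paris'),
        ('014', 'Customer C', 'London'), ('002', 'Customer D', 'Rome');
    INSERT INTO CITY_COUNTRY (City, Country) VALUES
        ('London', 'England'), ('Paris', 'France'), ('Rome', 'Italy');
    INSERT INTO PRODUCT (ProdID, Description) VALUES
        (1, 'Table'), (2, 'Desk'), (3, 'Chair'), (5, 'Cabinet'), (7, 'Cupboard');
    INSERT INTO DEL_PROD (Num, ProdID) VALUES
        ('005', 1), ('005', 2), ('005', 3), ('008', 2), ('008', 7),
        ('014', 5), ('002', 7), ('002', 1), ('002', 2);

    -- The view stores no data of its own; it is defined by a query.
    CREATE VIEW DESK_COUNTRIES AS
        SELECT DISTINCT CITY_COUNTRY.Country
        FROM PRODUCT
        INNER JOIN DEL_PROD ON DEL_PROD.ProdID = PRODUCT.ProdID
        INNER JOIN DELNOTE ON DELNOTE.Num = DEL_PROD.Num
        INNER JOIN CITY_COUNTRY ON CITY_COUNTRY.City = DELNOTE.City
        WHERE PRODUCT.Description = 'Desk';
""")

countries = [row[0] for row in conn.execute(
    "SELECT Country FROM DESK_COUNTRIES ORDER BY Country")]
```

With this sample data the view has a single column listing England, France and Italy; delivery note 014 contributes nothing because it contains no desk.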
Films are shown at many cinemas, each of which has a manager. A manager may manage more than one
cinema. The takings for each film are recorded for each cinema at which the film was shown.
Therefore 2NF is
FILM(FID, Title)
In Cinema, the non-key attribute MName is dependent on MID. This means that it is transitively dependent on the
primary key. So we must move this out to get the 3NF relations
FILM(FID, Title)
MANAGER(MID, MName)
Entity-Relationship (E-R) diagrams can be used to illustrate the relationships between entities. In the
earlier example we had the four relations
In an E-R diagram DELNOTE, CITY_COUNTRY, PRODUCT and DEL_PROD are called entities. Entities have
the same names as relations but we do not usually show the attributes in E-R diagrams.
one-to-one represented by
one-to-many represented by
many-to-one represented by
many-to-many represented by
Fig. 3.6 (c)1 is the E-R diagram showing the relationships between DELNOTE, CITY_COUNTRY, PRODUCT
and DEL_PROD.
[Fig. 3.6 (c)1: E-R diagram of DELNOTE, CITY_COUNTRY, DEL_PROD and PRODUCT]
If the relations are in 3NF, the E-R diagram will not contain any many-to-many relationships. If there are
any one-to-one relationships, one of the entities can be removed and its attributes added to the entity that
is left.
FILM(FID, Title)
MANAGER(MID, MName)
FILM takes TAKINGS; TAKINGS is for FILM (connected by FID)
CINEMA takes TAKINGS; TAKINGS is for CINEMA (connected by CID)
MANAGER manages CINEMA; CINEMA is managed by MANAGER (connected by MID)
CINEMA
MANAGER TAKINGS
FILM
That is
CINEMA FILM
If you now look at Fig. 3.6.c.2, you will see that the link entity is TAKINGS.
Topic: 1.8.3 Data Definition Language (DDL) and Data Manipulation Language (DML)
DDL, which is usually part of a DBMS, is used to define and manage all attributes and properties of a
database, including row layouts, column definitions, key columns, file locations, and storage strategy. DDL
statements are used to build and modify the structure of tables and other objects such as views, triggers,
stored procedures, and so on. For each object, there are usually CREATE, ALTER, and DROP statements
(such as CREATE TABLE, ALTER TABLE, and DROP TABLE). Most DDL statements take the following form:
CREATE object_name
ALTER object_name
DROP object_name
In DDL statements, object_name can be a table, view, trigger, stored procedure, and so on.
CREATE DATABASE
Many database servers allow for the presence of many databases. In order to create a database, a
relatively standard command ‘CREATE DATABASE’ is used.
The name can be pretty much anything; usually it shouldn’t have spaces (or those spaces have to be
properly escaped). Some databases allow hyphens, and/or underscores in the name. The name is usually
limited in size (some databases limit the name to 8 characters, others to 32—in other words, it depends on
what database you use).
DROP DATABASE
Just like there is a ‘create database’ there is also a ‘drop database’, which simply removes the database.
Note that it doesn’t ask you for confirmation, and once you remove a database, it is gone forever.
CREATE TABLE
Probably the most common DDL statement is ‘CREATE TABLE’. Intuitively enough, it is used to create
tables. The general format is something along the lines of:
The ... is where column definitions go. The general format for a column definition is the column name
followed by the column type, for example:
PERSONID INT
which defines a column named PERSONID, of type INT. Column definitions have to be comma separated, i.e.:
The above creates a table named person, with person id, last name, first name, and date of birth. There is
also the ‘primary key’ definition. A primary key is a column value that uniquely identifies a database
record. So for example, we can have two ‘person’ records with the same last name and first name, but
with different ids.
Besides the primary key, there are many other flags we can specify for table columns. For example, in the
above example, FNAME is marked as NOT NULL, which means it is not allowed to have NULL values.
Many databases implement various extensions to the basics, and you should read the documentation to
determine what features are present/absent, and how to use them.
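A table matching the description above (person id as primary key, last name, first name marked NOT NULL, date of birth) might be created as follows. This is a sketch run against SQLite through Python's sqlite3 module: the column names LNAME and FNAME come from these notes, while DOB, the column sizes and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PERSON (
        PERSONID INT PRIMARY KEY,
        LNAME    VARCHAR(20),
        FNAME    VARCHAR(20) NOT NULL,
        DOB      DATE
    )
""")


def accepted(row):
    """Return True if the row is inserted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO PERSON (PERSONID, LNAME, FNAME, DOB) VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


outcomes = [
    accepted((1, "Kent", "Clark", "1990-01-01")),
    accepted((2, "Kent", "Clark", "1990-01-01")),  # same name, different id: allowed
    accepted((1, "Lane", "Lois", "1991-02-02")),   # duplicate primary key: rejected
    accepted((3, "Lane", None, "1991-02-02")),     # NULL in a NOT NULL column: rejected
]
```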
ALTER TABLE
There is a command to ‘alter’ tables after you create them. This is usually only useful if the table already
has data, and you don’t want to drop it and recreate it (which is generally much simpler). Also, most
databases have varying restrictions on what ‘alter table’ is allowed to do. For example, some databases
(including older versions of Oracle) allow you to add a column, but not to remove one.
Note that not every database lets you drop a field. The drop command is mostly present to allow for
dropping of constraints (such as indexes, etc.) on the table.
The general syntax to modify a field (change its type, etc.) is:
Note that you can only do this to a certain extent on most databases. Just as with ‘drop’, this is mostly
useful for working with table constraints (changing ‘not null’ to ‘null’, etc.)
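Adding a column to a table that already holds data is the most widely supported use of ‘alter table’. The sketch below shows it in SQLite through Python's sqlite3 module; the exact ALTER TABLE syntax varies between database systems, and the PERSON columns and sample row are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PERSON (
        PERSONID INT PRIMARY KEY,
        LNAME    VARCHAR(20),
        FNAME    VARCHAR(20) NOT NULL
    )
""")
conn.execute("INSERT INTO PERSON (PERSONID, LNAME, FNAME) VALUES (1, 'Kent', 'Clark')")

# Add a new column without dropping and recreating the table.
conn.execute("ALTER TABLE PERSON ADD COLUMN DOB DATE")

columns = [row[1] for row in conn.execute("PRAGMA table_info(PERSON)")]
row = conn.execute("SELECT PERSONID, LNAME, FNAME, DOB FROM PERSON").fetchone()
```

The existing record survives the change, and its new DOB field is simply NULL until a value is supplied.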
DML is used to select, insert, update, or delete data in the objects defined with DDL. All database users
can use these commands during the routine operations on a database. The different DML statements are
as follows:
SELECT statement
INSERT statement
UPDATE statement
DELETE statement
INSERT Statement
To get data into a database, we need to use the ‘insert’ statement. The general syntax is:
INSERT INTO <table-name> (<column1>,<column2>,<column3>,...)
VALUES (<column-value1>,<column-value2>,<column-value3>);
The column names (i.e. column1, etc.) must correspond to the column values (i.e. column-value1, etc.).
There is a short-hand for the statement:
INSERT INTO <table-name>
VALUES (<column-value1>,<column-value2>,<column-value3>);
In which the column values must correspond exactly to the order columns appear in the ‘create table’
declaration. It must be noted that this sort of statement should (or rather, must) be avoided! If someone
changes the table, moves columns around in the table declaration, the code using the shorthand insert
statement will fail.
A typical example, of inserting the ‘person’ record we’ve created earlier would be:
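A minimal sketch of such an insert, run against SQLite through Python's sqlite3 module, is given here. The PERSON columns follow the description in these notes (DOB is an assumed name) and the inserted values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PERSON (
        PERSONID INT PRIMARY KEY,
        LNAME    VARCHAR(20),
        FNAME    VARCHAR(20) NOT NULL,
        DOB      DATE
    )
""")

# Full form: every value is matched to a named column.
conn.execute(
    "INSERT INTO PERSON (PERSONID, LNAME, FNAME, DOB) "
    "VALUES (1, 'Kent', 'Clark', '1990-01-01')")

# Because the columns are named, their order in the statement is free,
# and columns left out (here DOB) are set to NULL.
conn.execute("INSERT INTO PERSON (FNAME, PERSONID, LNAME) VALUES ('Lois', 2, 'Lane')")

rows = conn.execute(
    "SELECT PERSONID, LNAME, FNAME, DOB FROM PERSON ORDER BY PERSONID").fetchall()
```

Naming the columns is what protects the statement if the table declaration is later reordered.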
SELECT Statement
Probably the most used statement in all of SQL is the SELECT statement. The select statement has the
general format of:
SELECT <column-list>
FROM <table-list>
WHERE <search-condition>
The column-list indicates what columns you’re interested in (the ones which you want
to appear in the result), the table-list is the list of tables to be used in the query, and
search-condition specifies what criteria you’re looking for.
An example of a short-hand version to retrieve all ‘person’ records we’ve been using:
SELECT * FROM PERSON;
The WHERE clause is used in UPDATE, DELETE, and SELECT statements, and has the same
format in all these cases. It has to be evaluated to either true or false. Table 1 lists some of the common
operators.
= equals to
> greater than
< less than
>= greater than or equal to
<= less than or equal to
<> not equal to
Table 1: SQL Operators
There is also IS, which can be used to check for NULL values, for example:
column-name IS NULL
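The sketch below runs a SELECT with a WHERE clause and shows why IS NULL is needed: comparing with = NULL never evaluates to true, so it selects nothing. It uses SQLite through Python's sqlite3 module, and the PERSON rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PERSON (
        PERSONID INT PRIMARY KEY,
        LNAME    VARCHAR(20),
        FNAME    VARCHAR(20) NOT NULL,
        DOB      DATE
    )
""")
conn.executemany(
    "INSERT INTO PERSON (PERSONID, LNAME, FNAME, DOB) VALUES (?, ?, ?, ?)",
    [(1, "Kent", "Clark", "1990-01-01"),
     (2, "Lane", "Lois", None),
     (3, "Olsen", "Jimmy", None)])

# Ordinary comparison in the search condition.
by_name = [r[0] for r in conn.execute(
    "SELECT PERSONID FROM PERSON WHERE LNAME = 'Kent'")]

# IS NULL finds the records with no date of birth...
missing_dob = [r[0] for r in conn.execute(
    "SELECT FNAME FROM PERSON WHERE DOB IS NULL ORDER BY PERSONID")]

# ...whereas '= NULL' is never true, so no record is returned.
equals_null = conn.execute("SELECT FNAME FROM PERSON WHERE DOB = NULL").fetchall()
```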
UPDATE Statement
The update statement is used for changing records. The general syntax is:
UPDATE <table-name>
SET <column1> = <value1>, <column2> = <value2>, ...
WHERE <criteria>
The criteria is what selects the records for update. The ‘set’ portion indicates which columns should be
updated and to what values. An example of the use would be:
UPDATE PERSON
SET FNAME='Clark', LNAME='Kent'
WHERE FNAME='Superman';
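The update above can be tried out as follows, using SQLite through Python's sqlite3 module. The starting rows are invented; only the UPDATE statement itself is taken from the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PERSON (
        PERSONID INT PRIMARY KEY,
        LNAME    VARCHAR(20),
        FNAME    VARCHAR(20) NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO PERSON (PERSONID, LNAME, FNAME) VALUES (?, ?, ?)",
    [(1, "Unknown", "Superman"), (2, "Lane", "Lois")])

cursor = conn.execute(
    "UPDATE PERSON SET FNAME='Clark', LNAME='Kent' WHERE FNAME='Superman'")
changed = cursor.rowcount  # number of records matched by the criteria

rows = conn.execute(
    "SELECT PERSONID, LNAME, FNAME FROM PERSON ORDER BY PERSONID").fetchall()
```

Only the record matching the WHERE criteria is changed; the other record is left untouched.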
DELETE Statement
The ‘delete’ statement is used to remove records from the database. The syntax is very similar to
update and select statements:
DELETE FROM <table-name>
WHERE <criteria>
Basically we select which records we want to delete using the where clause. An example use would be:
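A minimal sketch of a delete, using SQLite through Python's sqlite3 module, is given here. The PERSON rows and the chosen criteria are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PERSON (
        PERSONID INT PRIMARY KEY,
        LNAME    VARCHAR(20),
        FNAME    VARCHAR(20) NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO PERSON (PERSONID, LNAME, FNAME) VALUES (?, ?, ?)",
    [(1, "Kent", "Clark"), (2, "Lane", "Lois")])

# The WHERE clause selects which records are removed.
cursor = conn.execute("DELETE FROM PERSON WHERE LNAME = 'Lane'")
deleted = cursor.rowcount

remaining = conn.execute(
    "SELECT PERSONID, LNAME, FNAME FROM PERSON ORDER BY PERSONID").fetchall()
```

Note that a DELETE with no WHERE clause would remove every record in the table.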
Statement        Syntax
IN               SELECT column_name(s)
                 FROM table_name
                 WHERE column_name
                 IN (value1, value2, ...)
INSERT INTO      INSERT INTO table_name
                 VALUES (value1, value2, value3, ...)
                 or
                 INSERT INTO table_name
                 (column1, column2, column3, ...)
                 VALUES (value1, value2, value3, ...)
INNER JOIN       SELECT column_name(s)
                 FROM table_name1
                 INNER JOIN table_name2
                 ON table_name1.column_name=table_name2.column_name
LIKE             SELECT column_name(s)
                 FROM table_name
                 WHERE column_name LIKE pattern
ORDER BY         SELECT column_name(s)
                 FROM table_name
                 ORDER BY column_name [ASC|DESC]
SELECT           SELECT column_name(s)
                 FROM table_name
SELECT *         SELECT *
                 FROM table_name
SELECT DISTINCT  SELECT DISTINCT column_name(s)
                 FROM table_name
UPDATE           UPDATE table_name
                 SET column1=value, column2=value, ...
                 WHERE some_column=some_value
WHERE            SELECT column_name(s)
                 FROM table_name
                 WHERE column_name operator value
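Several of the quick-reference forms above (IN, LIKE, SELECT DISTINCT with ORDER BY, and INNER JOIN) can be tried on the PRODUCT and DEL_PROD relations from the normalisation example. The sketch uses SQLite through Python's sqlite3 module; the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PRODUCT (ProdID INTEGER PRIMARY KEY, Description TEXT);
    CREATE TABLE DEL_PROD (Num TEXT, ProdID INTEGER, PRIMARY KEY (Num, ProdID));
    INSERT INTO PRODUCT (ProdID, Description) VALUES
        (1, 'Table'), (2, 'Desk'), (3, 'Chair'), (5, 'Cabinet'), (7, 'Cupboard');
    INSERT INTO DEL_PROD (Num, ProdID) VALUES
        ('005', 1), ('005', 2), ('005', 3), ('008', 2), ('008', 7),
        ('014', 5), ('002', 7), ('002', 1), ('002', 2);
""")

# IN: match any value in a list.
in_list = [r[0] for r in conn.execute(
    "SELECT Description FROM PRODUCT WHERE ProdID IN (1, 3) ORDER BY ProdID")]

# LIKE: pattern match, where % stands for any run of characters.
starts_with_c = [r[0] for r in conn.execute(
    "SELECT Description FROM PRODUCT WHERE Description LIKE 'C%' ORDER BY Description")]

# SELECT DISTINCT with ORDER BY ... DESC: each product number once, highest first.
distinct_ids = [r[0] for r in conn.execute(
    "SELECT DISTINCT ProdID FROM DEL_PROD ORDER BY ProdID DESC")]

# INNER JOIN: combine rows from both tables on the common ProdID column.
note_008 = conn.execute(
    "SELECT DEL_PROD.Num, PRODUCT.Description "
    "FROM DEL_PROD INNER JOIN PRODUCT ON DEL_PROD.ProdID = PRODUCT.ProdID "
    "WHERE DEL_PROD.Num = '008' ORDER BY PRODUCT.Description").fetchall()
```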