Compa Level Notes
ALGORITHM: A set of rules or a sequence of steps specifying how to solve a problem. By most
definitions the steps of an algorithm must terminate, although not everyone agrees that this
is required. An algorithm generally contains an INPUT step, a PROCESSING step and
an OUTPUT step.
Data Types: Different types of data are stored differently in the computer’s memory; these
are:
• Integers: Number written without a fractional component, is a member of
Program Constructs: There are three basic programming constructs: Sequence, Selection
and Iteration.
• Sequence is one statement after the other.
• Selection statements are used to find which statement should be performed
next, making use of conditional operators.
• Iteration: There are two types of iteration, definite and indefinite. In
definite iteration, the number of loop cycles is set at the beginning of the
loop. An indefinite loop instead makes use of a condition (such as in a while
loop), and repeats for as long as the condition is being met.
Using Variables: The process involves three steps, declaration, followed by assignment and
then use.
Identifier Names: These names are used for everything from the names of variables to the
names of classes and functions. It is important to use meaningful identifier names, as this
makes the program easier to read for you and for others.
Integer vs Float Division: Integer division returns the answer without the remainder,
whereas float division will return the answer to the division in a decimal form. In order to get
just the remainder, modular division can be used.
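As a small illustration of the three kinds of division described above, here is a sketch in Python (the variable names are made up for the example). Note that Python's `//` rounds towards negative infinity, so the result for negative operands may differ from other languages.

```python
# Integer division, float division and modular division side by side.
total = 17
group_size = 5

quotient = total // group_size   # integer division: answer without the remainder
exact = total / group_size       # float division: answer in decimal form
remainder = total % group_size   # modular division: just the remainder

print(quotient, exact, remainder)
```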
Truncation: Process by which the number of digits after the decimal point is limited, by cutting off the extra digits without rounding.
Boolean Operations: Boolean operations are effectively passing variables through logic
gates, such as the AND, OR, and NOT gate.
Variables vs Constants: Variables are identifiers given to a specific memory
location whose value can change over the course of the program. Constants are instead
values which remain the same throughout the running of the program. In a long
program, they offer peace of mind to the programmer that the value cannot be changed by accident.
Exception Handling: Exception handling is a mechanism by which the program caters for
problems that occur while it is running, such as accessing non-existent variables or
dividing by zero.
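A minimal sketch of exception handling in Python follows. The function name `safe_divide` is hypothetical and division by zero is used as the example problem.

```python
def safe_divide(numerator, denominator):
    """Return numerator / denominator, or None if the division is impossible."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        # The program caters for the problem instead of crashing.
        print("Cannot divide by zero")
        return None
```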
Structured Programming: When a program is short and simple, there is no need to break it
up into subroutines. However, when the program is long, it is often useful to break the
program into a number of subtasks. This offers a number of advantages.
• Easier to read for the programmer and someone else reading the program.
• Easier to edit the program – maintainability.
• Easier to write an algorithm – where the problem can be divided into a
number of subtasks.
• Reduced complexity.
Hierarchy Charts can be used to show the structure of the program, including its
subroutines and the subroutines inside these subroutines. However, a hierarchy
chart does not show the detailed program structures in every module, so it
lacks the required detail. These can be shown in a structure chart.
Objects
Each object will have its own attributes and a state
Each object has behaviours - functions which can be performed by the object.
Classes
Class is a blueprint or template for an object - defines the attributes and methods of objects
in that class.
As a general rule, instance variables or attributes are declared private and most methods
public, so that other classes may use methods belonging to another class but may not see or
change its attributes directly.
This is the principle of information hiding.
Constructors are special methods used to create objects of the class.
Getters and setters are methods used to read and change the values of attributes.
Instantiation
Creating an instance of a class is known as instantiation - creating a reference type variable.
Using getters and setters, the attributes of the object can be read and changed.
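The ideas of a class, constructor, getters and setters, and instantiation can be sketched as follows. The `Account` class and its attribute names are invented for illustration. Python has no truly private attributes, so the leading underscore is only a convention standing in for "private".

```python
class Account:
    """Blueprint for account objects (hypothetical example class)."""

    def __init__(self, owner, balance=0):
        # Constructor: sets the initial state of a new object.
        self._owner = owner
        self._balance = balance

    def get_balance(self):
        # Getter: read access to a hidden attribute.
        return self._balance

    def set_balance(self, new_balance):
        # Setter: controlled write access, so invalid states can be rejected.
        if new_balance < 0:
            raise ValueError("balance cannot be negative")
        self._balance = new_balance


# Instantiation: creating an object (instance) of the class.
savings = Account("Sam", 100)
savings.set_balance(150)
```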
Encapsulation
An object encapsulates both its state and its behaviours (methods).
Related to the concept of encapsulation is information hiding, where the details of how a class
is implemented can be ignored when using the class.
Inheritance
Classes can inherit data and behaviour from a parent class in the same way that children can
inherit characteristics from their parents.
A child class in OOP is a subclass and parent class is a superclass.
The relationships can therefore be drawn as an inheritance diagram.
Inheritance should only be used where the 'is a' rule holds…
Polymorphism
Polymorphism refers to the ability to process objects differently depending on their class -
for example, by overriding a method that already exists in the parent class.
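Inheritance and polymorphism through overriding can be sketched as below. The class names `Animal`, `Dog` and `Cat` are hypothetical and chosen only because a Dog "is an" Animal.

```python
class Animal:                      # superclass (parent class)
    def __init__(self, name):
        self.name = name

    def speak(self):
        return f"{self.name} makes a sound"


class Dog(Animal):                 # subclass: a Dog "is an" Animal
    def speak(self):               # overrides the inherited method
        return f"{self.name} barks"


class Cat(Animal):
    def speak(self):
        return f"{self.name} meows"


# The same call behaves differently depending on the class of each object.
pets = (Dog("Rex"), Cat("Tom"), Animal("Generic"))
sounds = [pet.speak() for pet in pets]
```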
PROGRAM TO AN INTERFACE
An interface is a collection of abstract methods that a group of unrelated classes may
implement. Although the methods are specified in the interface, they will only be
implemented by a class that implements the interface.
This is generally a good idea, as the interface can be taken directly from the design and
therefore ensures that every implementing class provides all the required features.
• LIST
o Abstract data type - effectively an array with an undefined number of terms.
o It is possible to use a static array to store the items of a list.
o Functions
• Insert
• Remove
o Ordered List
• Some languages, like Python, have a built in dynamic list structure, which
hides all the associated functions. A dynamic list can be implemented as a linked list.
• As nodes are added, new memory locations can be dynamically pulled
from the heap
§ HEAP: A pool of memory locations which can be allocated or
deallocated as required.
• The pointers in different items can then be updated.
• 4.4.4 - CLASSIFICATION OF ALGORITHMS (time complexity and space complexity)
o 4.4.4.2 - O Notation
• Time Complexity: The definition of how much time an algorithm takes to
solve a problem
• Functions
§ Linear: f(x) = ax + b
§ Polynomial: ..
§ Logarithmic: …
§ Factorial: …
• Orders
§ O(1) - Constant time
§ O(n) - linear time
§ O(n^x) - polynomial time
§ O(2^n) - exponential time
§ O(log n) - Logarithmic time
§ O(n!) - factorial time
o 4.4.4.4 - Limits of Computation
• Insoluble practical problems
§ Problems can be theoretically soluble but would take millions of years
to solve
§ Therefore practically insoluble
§ Eg. Passwords
• Travelling Salesman Problem
§ The simple method is brute force: testing every possible
route that visits every node.
§ Therefore, practically insoluble for a large number of nodes.
§ Computationally difficult - requires heuristics to be solved in a
reasonable time.
• 4.4.4.5 - Tractable vs intractable
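§ Tractable = a polynomial-time (or better) solution exists, e.g. problems in P
§ Intractable = no polynomial-time solution is known, e.g. the Travelling Salesman Problem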
• Heuristics
§ A method which gives a solution with a high probability of being correct or
good enough - therefore things like A* instead of Dijkstra
§ Other things like that are usable for the Travelling Salesman Problem
§ Eg. A*, virus scanners use heuristics
• 4.4.4.6 - Computable vs Non-Computable
§ Things which can't be solved algorithmically are non-computable.
§ Turing proved some problems are simply non-computable.
• 4.4.4.7 - THE HALTING PROBLEM
§ The problem of determining whether, for a given input, a program will finish
running or continue for ever.
§ Turing proved that no machine can solve this for all possible
programs and inputs.
§ Shows there are some problems which cannot be solved by a
computer.
• 4.4.5 - Turing Machines
o A Turing machine can be viewed as a computer with a single fixed program with
• A finite set of states in a state transition diagram
• A finite alphabet of symbols
• An infinite tape marked off into squares, each holding one symbol
• A sensing read-write head that can travel along the tape, one square at a
time.
o States include:
• Start state
• Halting state (any state with no outgoing transitions)
o Example is:
• Binary alphabet of 1, 0 or blank
• And a machine can increment by one
o Transition Functions
• Transition rule can be represented by a function δ (delta)
• δ(Current State, Input Symbol) = (Next State, Output Symbol,
Movement)
o Universal Turing Machine
• A Turing machine can theoretically represent any computation
• The UTM can be used to compute any computable sequence - effectively
by compiling something which is written at the beginning of the tape.
• Anything which is computable can be computed by a turing machine.
Otherwise, it is not computable.
• Led to the idea of the stored program computer - program and data
stored in memory
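The binary increment example and the transition function δ(Current State, Input Symbol) = (Next State, Output Symbol, Movement) described above can be sketched as a small simulator. This is one possible rule set, not a definitive machine: the state names `carry` and `halt`, the use of a space for the blank symbol, and the choice to start the head on the rightmost digit are all assumptions made for this illustration.

```python
BLANK = " "

# DELTA[(current state, input symbol)] = (next state, output symbol, movement)
# Movement of -1 means one square to the left.
DELTA = {
    ("carry", "1"): ("carry", "0", -1),   # 1 + 1 = 0, carry continues left
    ("carry", "0"): ("halt", "1", -1),    # 0 + 1 = 1, finished
    ("carry", BLANK): ("halt", "1", -1),  # ran off the left end: new digit
}


def run_increment(bits):
    """Simulate the machine on a tape holding the binary string `bits`."""
    tape = dict(enumerate(bits))          # square index -> symbol
    head = len(bits) - 1                  # assumed start: rightmost digit
    state = "carry"
    # "halt" has no outgoing transitions, so the loop stops there.
    while (state, tape.get(head, BLANK)) in DELTA:
        state, symbol, move = DELTA[(state, tape.get(head, BLANK))]
        tape[head] = symbol
        head += move
    return "".join(tape[i] for i in sorted(tape))
```

For instance, `run_increment("1011")` walks left turning the trailing 1s into 0s and then writes a 1.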
Number Bases
The base of a number system is the number of different values that can be represented by a
single digit.
Denary: Base 10 – makes use of ‘0’ to ‘9’
Binary: Base 2 – makes use of ‘0’ and ‘1’
Hexadecimal: Base 16 – makes use of ‘0’ to ‘9’ and ‘A’ to ‘F’
Conversion
1. Denary to Binary
a. Find the largest 2^n that fits into the number, write a 1 and subtract
it from the original number. Then work down through every 2^(n-x)
where n-x >= 0, writing a 0 if it does not fit into what is left of the
number, or writing a 1 if it does and subtracting it from the number.
2. Binary to Denary
a. Using the multipliers of each column, add up the values to find the
original number, for example:
0 1 1 1
x8 x4 x2 x1
0 4 2 1
Therefore 0111 = 0 + 4 + 2 + 1 = 7
3. Binary to Hexadecimal
a. Look at each nibble (four bits) of the number separately, converting
each into hex, (largely through converting through denary) then
combining each hex equivalent of each nibble to form one complete
hexadecimal number.
4. Hexadecimal to Binary
a. Convert each hexadecimal digit directly into a nibble, converting
through denary, then recombining the value afterwards.
5. Denary to Hexadecimal
a. Follow the same process as for denary to binary just for 16 as
opposed to 2.
6. Hexadecimal to Denary
a. Using the multipliers of each digit, add up the values to find the
original number.
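The conversion methods above can be sketched in Python as follows. The function names are made up for this example. In practice Python's built-in `bin`, `hex` and `int(text, base)` do the same job, but writing the steps out mirrors the manual method.

```python
def denary_to_binary(n):
    """Subtract successively smaller powers of two, writing 1 if each fits."""
    if n == 0:
        return "0"
    power = 1
    while power * 2 <= n:          # find the largest 2^n that fits
        power *= 2
    digits = ""
    while power >= 1:
        if power <= n:
            digits += "1"
            n -= power
        else:
            digits += "0"
        power //= 2
    return digits


def binary_to_denary(bits):
    """Add up the column multipliers (1, 2, 4, 8, ...) of every 1 bit."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** position
    return total


def binary_to_hex(bits):
    """Convert each nibble (four bits) to one hexadecimal digit."""
    bits = bits.zfill((len(bits) + 3) // 4 * 4)   # pad to whole nibbles
    hex_digits = "0123456789ABCDEF"
    nibbles = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(hex_digits[binary_to_denary(nibble)] for nibble in nibbles)
```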
Hexadecimal Advantages:
1. Used to represent binary since it can represent a byte in only two digits
rather than the eight required if binary were to be used.
2. Easier for technicians and computer users to remember hexadecimal digits than long binary strings.
3. Easy to convert to and from raw values on the computer, since binary can be
easily converted to hex.
Binary
Terminology
• Bit = fundamental piece of information – a single 1 or 0
• Byte = set of eight bits
• Nibble = set of four bits
• Signed integers are those where the binary can represent both positive and negative numbers,
whereas unsigned integers can only represent non-negative integers.
Unit Nomenclature
Name   Symbol  Power
Kibi   Ki      2^10
Mebi   Mi      2^20
Gibi   Gi      2^30
Tebi   Ti      2^40
Pebi   Pi      2^50
Exbi   Ei      2^60
Zebi   Zi      2^70
Yobi   Yi      2^80
Kilo   k       10^3
Mega   M       10^6
Giga   G       10^9
Tera   T       10^12
Peta   P       10^15
Exa    E       10^18
Zetta  Z       10^21
Yotta  Y       10^24
ASCII Table
The standard manner for representing the characters on a keyboard is called ASCII
(American Standard Code for Information Interchange). ASCII originally only used 7 bits, but
it was extended to an 8 bit version to include more characters, such as
those in foreign languages. Subsequently, Unicode was produced (encoded as, for example,
UTF-16 or UTF-32), which aims to include every character that could be needed. Unicode
kept its first 128 characters the same as those in ASCII to ensure complete
compatibility.
Important Characters:
• ‘ ’ = 32
• ‘0’ = 48
• ‘A’ = 65
• ‘a’ = 97
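The character codes listed above can be checked with Python's built-in `ord` and `chr` functions. The short sketch below also shows two consequences of the layout: a digit's value is its code minus the code for '0', and upper and lower case letters differ by 32.

```python
# Look up the code of each important character.
codes = {character: ord(character) for character in " 0Aa"}

# A digit character's numeric value is its offset from '0'.
digit_value = ord("7") - ord("0")

# Lower case letters sit 32 places after their upper case equivalents.
lower = chr(ord("G") + 32)
```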
Normalisation: Normalisation is the process of moving the binary point of a floating point
number to provide the maximum level of precision for a given number of bits.
In two's complement, the mantissa of a normalised positive number starts with 01, while that of
a normalised negative number starts with 10 - the mantissa
of a negative number in normalised form always lies between -1/2 and -1.
Overflow occurs when the result of a calculation is too large to be stored in the number of
bits allocated.
Error Checking
Parity Bits: Parity checking was convenient when 7 bit ASCII code was used, since this left
one spare bit in each byte. This extra bit can be used to detect a single changed bit. There are two
types of parity, even parity and odd parity. This works by ensuring that the number of
1s in the byte is even or odd respectively, by setting the 8th bit to 0 or 1 as required. This
ensures that if there is a change in the rest of the byte (or indeed in the parity bit), the
receiver, on attempting to verify it, would find that there was a problem and hence request
retransmission. However, if there are two (or any even number of) errors, parity checking would not
detect them.
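A sketch of even parity follows. The function names are hypothetical, and placing the parity bit as the leftmost bit of the byte is an assumption made for this example.

```python
def add_even_parity(seven_bits):
    """Prefix a parity bit so that the total number of 1s is even."""
    parity = seven_bits.count("1") % 2     # 1 only if the count is odd
    return str(parity) + seven_bits


def passes_even_parity(byte):
    """Receiver's check: an even number of 1s means no error was detected."""
    return byte.count("1") % 2 == 0
```

Flipping any single bit of a valid byte makes the check fail, while flipping two bits goes unnoticed, as the text above describes.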
Majority Voting: Majority voting is a system which would require the same bit to be sent
three times. On the receiving end, these are evaluated for each bit and it chooses the
majority receipt (0 or 1) to be the correct one that had been sent, assuming that the
majority of transmissions would be correct.
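Majority voting with each bit sent three times can be sketched like this (the function names are made up for the illustration):

```python
def encode_triple(bits):
    """Send every bit three times."""
    return "".join(bit * 3 for bit in bits)


def decode_majority(received):
    """For each group of three, keep whichever value was received most often."""
    decoded = ""
    for start in range(0, len(received), 3):
        group = received[start:start + 3]
        decoded += "1" if group.count("1") >= 2 else "0"
    return decoded
```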
Checksums: Checksums work by adding up the value of all of the bytes separately and
sending this value in addition to everything else. Therefore, it is likely (though not
guaranteed) that if there are any changes in any of the bytes it would change the value of
the checksum, hence registering a problem on the receiving end. This would then allow the
receiving computer to request retransmission of the information.
Checkdigit: A check digit is similar to a checksum: an additional digit is included at the end
of a string of other digits. Check digits are used in the ISBN
(International Standard Book Number), which is specific to each book, and in the EAN
(European Article Number). They make use of the modulo 10 system.
ISBN            9  7  8  0  9  5  6  1  4  3  0  5    check digit: 1
Weight          1  3  1  3  1  3  1  3  1  3  1  3
Multiplication  9  21 8  0  9  15 6  3  4  9  0  15
Addition: the products are added together = 99
Remainder: the answer is divided by 10 with the remainder being found (99 % 10) = 9
Subtraction: the answer is subtracted from 10 = 1
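The ISBN calculation above can be sketched as a short function. The function name is hypothetical. The final `% 10` covers the case where the remainder is 0, which gives a check digit of 0 rather than 10.

```python
def isbn13_check_digit(first_twelve):
    """Compute the check digit from the first twelve digits of an ISBN-13."""
    weights = [1, 3] * 6                       # alternating weights 1, 3, 1, 3, ...
    total = sum(int(digit) * weight
                for digit, weight in zip(first_twelve, weights))
    remainder = total % 10
    return (10 - remainder) % 10
```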
Representing Graphics
Bitmapped Images
A bitmap (or raster) contains many picture elements or pixels that make up the whole
image. A pixel is the smallest identifiable area of an image. Each pixel is attributed a binary
value which represents a single colour.
The Resolution of the image can be expressed by width in pixels x height in pixels. It is
sometimes also expressed as the number of pixels per inch, PPI, and refers to the density of
the pixels. For printing, there is a similar measure of DPI (dots per inch) which refers to the
printing quality of a printer.
The Colour Depth of the image refers to the number of bits used to
describe the colour of each pixel. The more bits which are used, the greater the range and
accuracy of colours. The current standard is 3 bytes per pixel, allowing for one byte for
red, one for green and one for blue. Sometimes an extra byte is used as an alpha channel to
control transparency.
The Metadata specifies the properties of the image, including the colour depth, width and
height of the image.
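Resolution and colour depth together determine the size of the raw pixel data. The sketch below assumes an uncompressed image and ignores the metadata. The function name is made up for the example.

```python
def bitmap_size_bits(width, height, colour_depth_bits):
    """Raw size of the pixel data: one colour value per pixel."""
    return width * height * colour_depth_bits


# A 1920 x 1080 image at 3 bytes (24 bits) per pixel.
size_in_bytes = bitmap_size_bits(1920, 1080, 24) // 8
```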
Vector Graphics
Vector graphics allows for the storage of an image in terms of geometric shapes or objects
such as lines, curve, arcs and polygons. A vector file stores the necessary details about each
shape to redraw the object when the file loads, including its position, and for example for a
circle, radius, fill colour, line colour, line style and line width. These properties are stored in
a drawing list which specifies how to redraw the image.
Vector Graphics vs Bitmapped Graphics
File Size
• Vector Graphics: For simple images with lots of geometric shapes, vector graphics are
much more efficient, taking up much less space on disk. However, for complex images with
lots of changing colours and few geometric shapes which could be abstracted from the
image, the file size is very large, often with a separate list item (square) required for
every single pixel, making it far larger than the bitmap equivalent.
• Bitmapped Graphics: Bitmapped images are worse for simple shapes but more efficient for
complex images, especially images with continuous areas of changing colours.
Manipulation
• Vector Graphics: Vector graphics are easy to manipulate by changing the position and
properties of the objects. However, it is very hard to change small specifics, since the
only changes that are allowed are to the properties of an object.
• Bitmapped Graphics: While large scale manipulation is harder, fine changes are much
simpler, with the image able to be manipulated down to the individual pixel.
Representing Sound
Sample Resolution: The resolution of a sound is increased by a greater audio bit depth
which describes the number of bits which would be used to describe the amplitude of the
sound at a given point in time. Each point at the sample rate must be represented by a
binary value and hence the more bits used to describe this amplitude means a greater
accuracy compared to the original analog sound.
Sample Rate: The sampling rate is the rate at which one records the amplitude of the
sound. The more often the sample is taken, the smoother the playback is, with fewer large
changes in the amplitude between samples. However, the greater the sampling rate, the
more space is required for the information about the song, hence leading to the song taking
up more space on the hard disk.
Nyquist’s Theorem: Harry Nyquist in 1928 found that in order to produce an accurate
recording, the sample rate should be at least double the maximum frequency of the original signal,
with the theory later being proved by Claude Shannon. Therefore, music is generally sampled at
44,100 Hz (the max audible frequency for humans is around 20,000 Hz).
ADC and DAC:
• In order to convert analogue sound into digital form, the signal from a microphone is sampled
at regular intervals (the sample rate). Each sample is then quantised: the wave
height is measured and given an integer value, which is represented in binary using
the chosen audio bit depth.
• To output a sound, the binary values for each sample point are translated back into
analogue signals or voltage levels and sent to an amplifier.
MIDI
A MIDI (Musical Instrument Digital Interface) file is a list of instructions that allow a device to
synthesise a sound based on digital samples and samples of sounds created from different
instruments. Therefore, the file is generally much smaller (around 1,000 times) than a
conventional recording. Hence, they are often used for phone ringtones.
Event Messages: Since MIDI files generate music using a timed sequence of instructions,
they can also send event messages to other instruments to synchronise tempo or control
pitch and volume changes.
Metadata: MIDI files contain the information for a computer to recreate it accurately
including the duration of the note, the instrument, volume and timbre.
Data Compression
Data compression techniques attempt to reduce the size of files on the disk. This was highly
important previously, when the cost of storage was very high. It is still
important when a file must be transmitted over the internet. Compression is lossy when
the quality of the file is reduced by the compression, and lossless when
all information is retained so that the
original file can be replicated exactly.
Lossy Compression
This works by removing non-essential information. In images, this often works by reducing
the colour depth and the number of pixels. In music files, frequencies which could not
otherwise be heard are removed, while quiet sounds which are played at the same time as
loud ones are also removed, meaning the final file is around 10% of the original size. When
we talk over a telephone, the system makes use of lossy compression.
Lossless Compression
Lossless compression looks at patterns in the file to reduce the size of the data. Using the
patterns and a set of instructions on how to use them, the compressed file can easily be
returned to the original.
Run Length Encoding (RLE): Instead of recording every item in a repeating sequence, RLE
finds these sequences and records the repeated item and how many times it occurs, therefore
saving the space required to store the item many times.
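A minimal sketch of run length encoding on a string follows. Storing each run as a (character, count) pair is one possible representation chosen for this example, and the function names are hypothetical.

```python
def rle_encode(text):
    """Collapse each run of repeated characters into a (character, count) pair."""
    runs = []
    for character in text:
        if runs and runs[-1][0] == character:
            runs[-1][1] += 1               # extend the current run
        else:
            runs.append([character, 1])    # start a new run
    return [(character, count) for character, count in runs]


def rle_decode(runs):
    """Rebuild the original text exactly, showing the compression is lossless."""
    return "".join(character * count for character, count in runs)
```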
Dictionary-Based Compression: The system is useful when sending text files. Each word is
looked up in a common dictionary to find its position in the dictionary. This
number can then be transferred instead of the actual word. For example, ‘pelican’, which would
otherwise require 7 bytes to be sent, can be sent in two bytes.
Encryption
Encryption is the transformation of data from one form to another to prevent an
unauthorised third party from being able to understand it.
Caesar Cipher
One of the earliest forms of cipher, famously used by Julius Caesar to ensure secrecy in
messages, was a shift cipher, where each letter of the message is moved a certain number
of characters up or down the alphabet, such that encrypting CAT with a shift of 3 gives
FDW. This is in essence a very simple cipher, and relied mostly upon the reader not spending
the time to try every possibility: there are only 26 possible shifts (25 of which change
the message), each one producing a different result.
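The shift cipher can be sketched as below. The function name is made up, and only upper case letters A to Z are shifted in this illustration, with every other character left unchanged. The last line shows why the cipher is weak: trying every shift takes only 26 attempts.

```python
def caesar_shift(message, shift):
    """Move each upper case letter `shift` places along the alphabet, wrapping round."""
    result = ""
    for character in message:
        if "A" <= character <= "Z":
            position = (ord(character) - ord("A") + shift) % 26
            result += chr(position + ord("A"))
        else:
            result += character
    return result


ciphertext = caesar_shift("CAT", 3)

# Brute force: decrypt with every possible shift and inspect the candidates.
candidates = [caesar_shift(ciphertext, -shift) for shift in range(26)]
```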
Over time, a more sophisticated method of breaking shift ciphers, and in fact a number of other
ciphers, was developed by the Arab polymath Al-Kindi in his paper ‘A Manuscript on
Deciphering Cryptographic Messages’, by investigating the text of the Qur’an.
The process relies upon the fact that in certain languages the frequency of certain letters is
characteristic. Thus, by investigating the frequency of letters in the encrypted text, one can
easily see the code, as the graph would likely be shifted to the side, such that the peaks and
troughs would be moved a certain amount. Therefore, one can easily find the likely shifts,
reducing the number of attempts required to find the exact shift used.
Vernam Cipher
The Vernam Cipher is the only cipher proven to be unbreakable (the proof is due to Claude Shannon),
provided the key is truly random, at least as long as the message and never reused.
It relies on the use of the one-time pad.
Though the One-Time Pad is by far the most sophisticated type of cipher that we have
encountered so far, there are a few issues that it introduces. Firstly, the key
must be at least as long as the message that the people wish to encrypt. This
makes it very difficult to remember and use effectively.
Additionally, the question becomes how to ensure that both parties have the same key to
encrypt and decrypt the message, since much communication is not in person and
relies upon inherently insecure systems such as the internet. These
issues were addressed by RSA and the Diffie-Hellman Key Exchange System.
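A Vernam cipher is usually implemented by combining each unit of the message with the matching unit of the pad using XOR, so that applying the same pad a second time recovers the original. The sketch below works on bytes. The function name `vernam` and the example message are made up for the illustration, and `secrets.token_bytes` from the Python standard library stands in for a source of random key material.

```python
import secrets


def vernam(data, pad):
    """XOR every byte of the data with the corresponding byte of the pad."""
    if len(pad) < len(data):
        raise ValueError("pad must be at least as long as the message")
    return bytes(d ^ p for d, p in zip(data, pad))


message = b"ATTACK AT DAWN"
pad = secrets.token_bytes(len(message))   # one-time pad, as long as the message
ciphertext = vernam(message, pad)
recovered = vernam(ciphertext, pad)       # the same operation decrypts
```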
Classification of Software
Software is broadly classified into system software and application software. Application Software
is software completing a task which would need to be done whether or not the person had a
computer. On the other hand, System Software would not be required if the person did not have a
computer.
System Software
System software is the software needed to run the computer’s hardware and application programs,
including the operating system, utility programs, libraries and programming language translators.
Operating System: An operating system is a set of programs that lies between application software
and the computer hardware. It serves many different functions including: resource management –
managing all the computer hardware including the CPU, memory, disk drives, IO devices and
provision of a user interface to enable users to perform all the tasks required.
Utility Programs: Utility software is system software designed to optimise the performance of the
computer or perform tasks such as backing up files, restoring corrupted files from backup, etc.
One example is a Disk Defragmenter, a program which reorganises the hard disk when files
have been split up into blocks and stored all over the disk. The defragmenter recombines the
files into sequential blocks. A virus checker is another example of a utility program, which checks
your hard drive to ensure that there aren’t any viruses.
Libraries: Library programs are ready-compiled programs which can be run when needed and which are
grouped together in software libraries. Most compiled languages have their own libraries of
pre-written functions which can be invoked in a defined manner from within the user’s program.
Translators: Programming Language Translators fall into three categories: compilers, interpreters
and assemblers.
Application Software
Application Software can be categorised as General Purpose, Special Purpose or Custom-Written.
General Purpose: Software such as word-processor, spreadsheet software which can be used for
multiple purposes.
Special-Purpose: Software which completes a single specific task or set of tasks to serve a niche.
Examples include payroll and accounts software. Software can be bought as ‘off-the-shelf’ or
‘bespoke’ software. While bespoke software is likely to more correctly and properly satisfy the
requirements of the user, it is likely to be far more expensive, as well as requiring the user to well
define the exact requirements of the software which can be challenging.
The above table shows the possible opcodes of lots of different commands. Machine Code is a very
low-level programming language, because the code reflects how the computer carries out the
command – it is dependent on the specific architecture of the computer.
Assembly Code
From here, there was the development of a slightly higher (second generation) level programming
language, where the opcode was replaced by a mnemonic which showed what the operator was
doing. Additionally, the binary for the operand was replaced by a denary value.
High Level Programming Language
From here, there was the development of high level programming languages, where a single line of
code no longer corresponds to a single machine instruction. This allows the coder
to easily code algorithms using imperative high-level languages (such as FORTRAN = FORmula
TRANslation), without needing to worry about how the machine would handle the code. They are
known as imperative languages, since every line is a command to do something. Additionally,
the same language can be used on computers with different architectures and different
processors, which would each require different machine and assembly code. Though high-level
languages are easier for programmers to use, the result is sometimes less efficient than
writing the complete system in assembly code. This means for some systems where the program needs to
execute as fast as possible, such as in embedded systems, the program may be better off being written in
Assembly Code. Additionally, a program written in Assembly Code often takes up less space once
translated than one written in a higher level language, and needs only an assembler rather than a
compiler or interpreter.
Imperative Programming involves writing your program as a series of instructions that can actively
modify memory, focusing on how, in the sense that you express the logic of your program based on
how the computer would execute it.
Functional programming involves writing your program in terms of functions and other
mathematical structures, focusing on what, attempting to specify the logic of your program and
leaving the language to determine how the program runs in order to solve the problem.
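The contrast between the two styles can be sketched with one small task, summing the squares of the even numbers in a list. Python is used for both versions here purely for comparison; it is not a purely functional language, and the variable names are made up.

```python
numbers = [1, 2, 3, 4, 5]

# Imperative style: a series of instructions that modify memory (total).
imperative_total = 0
for n in numbers:
    if n % 2 == 0:
        imperative_total += n * n

# Functional style: describe the result as a composition of functions,
# with no variable being updated along the way.
functional_total = sum(map(lambda n: n * n,
                           filter(lambda n: n % 2 == 0, numbers)))
```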
Program Translators
Assembler: Converts each line of assembly code (source code = the code written by the user) into
machine code (object code = the code created by assembly or compilation, which is
executable and can be run directly by the computer). An assembler is specific to the computer that is being
used, since the machine code instruction set differs (sometimes only slightly) between different
architectures.
Compiler: Compiler converts high-level code into machine code, therefore still requiring the
compiler to be specific to the exact architecture and platform of the computer.
Interpreter: Interpreting is an alternative to compilation, where the interpreter looks at each
line in turn, checking for syntax errors, before translating it and running it.
(In fact, the interpreter first completes a cursory scan looking for structural issues, such as missing
brackets, before beginning the translation of the first line.) This means that if there is a problem in the code,
the program will run until that point before registering a problem. Therefore, it is particularly
useful for testing and finding problems in long pieces of code.
Bytecode: In fact, the majority of interpreted languages do not convert directly to machine code, instead
translating to an intermediate bytecode, which is then executed by a bytecode interpreter (a virtual machine).
Compiler vs Interpreter
Advantages of a Compiler
• Object code can be saved on disk and run without needing to recompile the program.
• Once compiled, object code runs faster than interpreted code. For interpreted code things like
loops are very inefficient, since they are converted to machine code every time they are
encountered, instead of being translated once as would happen with compiled languages.
Advantages of an Interpreter
• Faster, as the entire code does not have to be compiled before running the program. This is
particularly important during the program development phases when there would be a number of
minor errors, which would each require lengthy recompilations.
• It is easier to partially test and debug programs.
Logic Gates
Logic gates allow the user to complete conditionals, changing an outputted voltage (or signal)
depending on the various inputs that are occurring. There are a number of different logic gates:
NOT gate
$Output = \overline{Input}$
AND gate
$Output = Input_1 \cdot Input_2$
OR gate
$Output = Input_1 + Input_2$
XOR gate
$Output = Input_1 \oplus Input_2 = (Input_1 \cdot \overline{Input_2}) + (\overline{Input_1} \cdot Input_2)$
NOR gate
$Output = \overline{Input_1 + Input_2}$
NAND gate
$Output = \overline{Input_1 \cdot Input_2}$
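The six gates can be modelled as small functions on the values 0 and 1. In this sketch XOR, NOR and NAND are deliberately built out of NOT, AND and OR, to mirror the way they are defined above. The function names simply follow the gate names.

```python
def NOT(a):
    return 1 - a


def AND(a, b):
    return a & b


def OR(a, b):
    return a | b


def XOR(a, b):
    # (a AND NOT b) OR (NOT a AND b)
    return OR(AND(a, NOT(b)), AND(NOT(a), b))


def NOR(a, b):
    return NOT(OR(a, b))


def NAND(a, b):
    return NOT(AND(a, b))


# Truth table rows as (a, b, XOR, NOR, NAND).
table = [(a, b, XOR(a, b), NOR(a, b), NAND(a, b))
         for a in (0, 1) for b in (0, 1)]
```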
Boolean Algebra
De Morgan’s Laws
$\overline{A} \cdot \overline{B} = \overline{A + B}$
$\overline{A} + \overline{B} = \overline{A \cdot B}$
Other Laws
X·0=0
X·1=X
X·X=X
X · X’ = 0
X+0=X
X+1=1
X+X=X
X + X’ = 1
X’’ = X
X·Y=Y·X
X+Y=Y+X
X(Y·Z) = (X·Y)Z
X + (Y + Z) = (X + Y) + Z
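Because each variable can only be 0 or 1, any of these laws can be checked by trying every combination of inputs. The helper name `holds_for_all` below is hypothetical. The sketch checks De Morgan's Laws and a few of the identities using Python's boolean operators.

```python
from itertools import product


def holds_for_all(law, variables):
    """Test a law against every combination of True/False inputs."""
    return all(law(*values)
               for values in product([False, True], repeat=variables))


de_morgan_1 = holds_for_all(
    lambda a, b: ((not a) and (not b)) == (not (a or b)), 2)
de_morgan_2 = holds_for_all(
    lambda a, b: ((not a) or (not b)) == (not (a and b)), 2)
and_identity = holds_for_all(lambda x: (x and True) == x, 1)     # X·1 = X
complement = holds_for_all(lambda x: (x or (not x)) is True, 1)  # X + X' = 1
```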
Half Adder
S = A XOR B
C = A.B
Full Adder
S = A XOR B XOR Cin
Cout = (A.B)+(Cin.(A XOR B))
Multiple full adders can be connected together, with the carry out of each feeding the carry in
of the next, to make a concatenated adder capable of adding binary numbers of n bits.
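The full adder equations and the chaining of adders can be sketched as follows. The function names are made up, and the bit lists are assumed to be of equal length with the least significant bit first.

```python
def full_adder(a, b, carry_in):
    """Return (sum bit, carry out) for three input bits."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def ripple_add(x_bits, y_bits):
    """Chain full adders: each carry out feeds the next adder's carry in."""
    carry = 0
    result = []
    for a, b in zip(x_bits, y_bits):
        sum_bit, carry = full_adder(a, b, carry)
        result.append(sum_bit)
    return result, carry
```

For example, adding 3 (`[1, 1, 0]`) and 5 (`[1, 0, 1]`) gives three zero sum bits and a final carry of 1, which is 8.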
D-type flip-flops
An elemental sequential logic circuit that can store one bit and flip between two states. It has two
inputs: a data input labelled D and a clock signal.
A D-type flip-flop is a positive edge-triggered flip-flop, so its output can only change from 1 to 0 or
vice versa when the clock is at a rising edge, at which point it takes the value of D.
When the clock is not at a rising edge, the stored value is held and does not change. HENCE, IT CAN
BE USED AS A MEMORY CELL.
Dedicated Registers
Program Counter (PC): This holds the address of the next instruction to be executed. This may
be the next instruction in a sequence, or if the instruction is a branch or jump instruction, the
next address to jump to.
Current Instruction Register (CIR): Holds the current instruction being executed.
Memory Address Register (MAR): This holds the address of the memory location from which
data is to be fetched or to which data is to be written.
Memory Buffer Register (MBR): Used to temporarily store the data read from or written to
memory – also sometimes known as the memory data register.
Status Register (SR): Contains bits which are set or cleared depending on the result of the
instruction. It includes whether there was an overflow, whether the result was negative or
zero. There is also a bit which is used to store whether there is a carry.
Control Unit: The control unit controls and coordinates the activities of the CPU, directing the flow
of data between the CPU and other devices. It accepts the next instruction, decodes it into several
sequential steps such as fetching addresses and data from memory.
System Clock: The system clock generates a series of signals, switching between 0 and 1 millions
or (in modern processors) billions of times a second. Many operations take one clock cycle but
some take a number of clock cycles.
Buses
The processor is connected to the main memory by three different buses (a bus is a set of parallel
wires connecting two or more components of a computer). The address bus is used to specify
the location in memory that is required. The data is then returned making use of the
data bus, while control signals, including those from the peripherals, are sent along the control bus. While
both the data bus and control bus are bidirectional, with signals passing from the control unit to the
memory and vice versa, it is important to note that the address bus only carries addresses from the
control unit to the memory.
Control Bus: Given that the data and address buses are shared by all components of the system,
control lines must be provided to ensure there is no conflict, transmitting timing and status
information between components. The various commands sent over the control bus include:
• Memory Write: data bus information stored into addressed location
• Memory Read: data from addressed location into data bus
• I/O write: Data from data bus to I/O port
• I/O Read: Data from I/O port into data bus
• Bus Request: indication of a device request to use the data bus
• Bus Grant: indication that the Control Unit has granted access to the data bus
Data Bus: The data bus provides paths for moving data and instructions between the memory,
the control unit and other registers. The width of the data bus is a
key factor in determining overall system performance. Most computers are 32 or 64 bit, indicating
that they have a data bus of this width.
Address Bus: The address bus carries the address of the word (collection of memory) that the
control unit should access. As with the data bus, width matters: the wider the address bus, the greater the amount of
memory the control unit can address. For example, a system with a 32-bit address bus can address
2^32 memory locations (around 4GB if each location holds one byte). During I/O operations,
the address bus is also used to address the I/O ports. However, a full-width address bus is generally
unnecessary, and so it is more common for the address bus to make use of multiplexing, where the
first half of the address is sent in one signal and then the rest of it in another signal.
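The relationship between address bus width and addressable memory is just a power of two. The short sketch below works it out, assuming (as a simplification) one byte per memory location unless stated otherwise; `addressable_bytes` is a name chosen for this example.

```python
GIB = 2 ** 30  # number of bytes in one gibibyte


def addressable_bytes(address_bits, bytes_per_location=1):
    """Total memory reachable with an address bus of the given width."""
    locations = 2 ** address_bits          # each extra bit doubles the locations
    return locations * bytes_per_location


size_32_bit = addressable_bytes(32)        # 4,294,967,296 bytes
size_in_gib = size_32_bit / GIB            # 4.0
```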
I/O Controller: The IO Controller is a device which interfaces between an input or output device and
processor. Each device has a separate controller which connects to the control bus. IO Controllers
receive input and output requests from the processor and then can send signals to the device they
control. They also manage the data flow to and from the device. The controller is made up of three
main parts:
• An interface that allows connection of the controller to the system or IO bus
• A set of data, command and status registers.
• An interface (standardised form of connection defining things such as signals, number of
connecting pins / sockets and voltage levels) that allows connection of the controller to the
device.
Von Neumann Machine: The von Neumann machine is a machine where there is one main memory
store, in which both the instructions and the data on which the instructions are carried out are
stored. Almost all computers made today are built on this principle.
Harvard Architecture: The Harvard Architecture is a computer architecture where there are
physically separate memories for instructions and data, allowing the data and instructions to be
fetched in parallel instead of competing for the same bus. Additionally, the two memories can have
different properties, allowing embedded systems where the instructions should never change to
make use of faster and more stable Read only Memory (ROM). Additionally, the system would allow
for different sized data buses (to main memory) and instruction buses to allow for times when the
instructions are complex and long, and hence should have a much larger bus.
Operation Code                                 Operand
Basic Machine Operation    Addressing Mode
0 1 0 0 0 1                0 1                 0 0 0 0 0 0 0 1
The number of bits allocated to the operation code (opcode) and the operand will vary according to
the architecture and word size of the particular processor type. The above example shows an
instruction held in one 16-bit word. In this particular machine, all operations are assumed to take
place in a single register called the accumulator. In more complex architectures, each instruction
may occupy up to 32 bits and allow for multiple operands – including load from memory address x
into register y.
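The 16-bit example word can be pulled apart with shifts and masks. This sketch assumes the word is split into a 6-bit basic machine operation, a 2-bit addressing mode and an 8-bit operand; only the two addressing mode bits are stated in these notes, so the other two widths are an assumption, and `decode` is a made-up name.

```python
def decode(word):
    """Split a 16-bit instruction word into (operation, mode, operand)."""
    operation = (word >> 10) & 0b111111   # top 6 bits (assumed width)
    mode = (word >> 8) & 0b11             # next 2 bits: addressing mode
    operand = word & 0b11111111           # bottom 8 bits (assumed width)
    return operation, mode, operand


word = 0b0100010100000001                 # the example instruction
operation, mode, operand = decode(word)
```

A processor with a different word size would use different shift amounts, but the idea of fixed bit fields stays the same.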
Addressing Modes: There are a number of addressing modes, which are indicated by the two binary
digits of the instruction that give the addressing mode. One of these modes is immediate addressing, where
the operand is the actual numerical value to be operated on. In direct addressing, the operand holds
the memory address of the value to be operated on. Other addressing modes include register
addressing, where the operand names a register that contains the value. There is also indirect addressing, where the
operand contains the memory address of a memory address where the number to operate on is
stored.
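The difference between the modes is easiest to see by resolving the same operand in each one. In this sketch memory and registers are plain dictionaries and the mode is passed as a string; these representations, and the name `resolve`, are assumptions made purely for illustration.

```python
def resolve(mode, operand, memory, registers):
    """Return the value an instruction would operate on."""
    if mode == "immediate":
        return operand                     # the operand is the value itself
    if mode == "direct":
        return memory[operand]             # the operand is the address of the value
    if mode == "indirect":
        return memory[memory[operand]]     # the operand is the address of an address
    if mode == "register":
        return registers[operand]          # the operand names a register
    raise ValueError(f"unknown addressing mode: {mode}")


memory = {5: 9, 9: 42}
registers = {5: 7}
values = {m: resolve(m, 5, memory, registers)
          for m in ("immediate", "direct", "indirect", "register")}
```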
Assembly Code: Assembly code directly makes use of this system, representing each instruction as:
Opcode AddressingMode Operand
However, assembly code uses mnemonics for the opcode as opposed to directly using the binary.
Additionally, the addressing mode is indicated by characters such as brackets, a ‘#’ or a lack of
any symbol, while the operand is represented in
denary (or sometimes hex) instead of binary.
Input-Output Devices
Barcodes
There are two main types of barcode: 1D barcodes, such as those found on products, which are made
of black and white rectangles of differing thicknesses and patterns, and QR (Quick Response) codes,
which can store far more information than 1D barcodes.
Barcode Readers: There are a few types of barcode readers available, including pen-type readers,
laser scanners, CCD readers and camera based readers.
In pen-type readers, a light source and a photodiode are placed next to each other in the tip of a
pen. To read a barcode, the tip of the pen is dragged across all the bars at an even speed. The
photodiode measures the intensity of the light reflected back and therefore, due to the different reflectivity
of white and black (black absorbs more light than white), can measure the widths of the bars and
spaces in the barcode. The simplest systems make use of a specific thickness with black meaning 1,
and white meaning 0. However, there are also more complex alphabetic systems, such as the UPC
(Universal Product Code) alphabet, which has an entirely different pattern for each digit. Pen-type
scanners are the most durable type of scanner; however, they must come in contact with the
barcode, so cannot be used for many applications. Additionally, it is sometimes challenging to use
them, as one has to move at a very constant speed.
In laser scanners, the principle is the same, but the reader uses a line of laser light with an array of light
sensors. This is much faster, and laser scanners are reliable and economical for low-volume applications, such
as in supermarkets.
In CCD (Charge-Coupled Device) readers, there is a large array of tiny light sensors lined up in a row
at the head of the reader. These rely on the ambient light that the white parts of the barcode
reflect. Therefore, in the same way, the pattern of bars can be measured and the barcode number found.
In Camera Based Readers, a camera is used to take an image of the barcode. From here, there is a
use of image processing tools to locate a barcode, reorientate and deal with any issues of alignment
in all three axes, and then measure the thicknesses of black and white bars. Hence, the method
could be used when the code is damaged or poorly printed. This is generally also much cheaper than
other methods, as one could even use the hardware on a smartphone.
Digital Cameras
A digital camera makes use of a CMOS (Complementary Metal Oxide Semiconductor) sensor
comprising millions of tiny light sensors arranged in a grid. When the shutter opens, light
enters the camera and the lens focuses it, projecting an image onto the sensor behind the
lens. Each sensor measures the colour and the brightness of the light which is reaching it, before
sending the information to a microcontroller which can recombine the readings of all of these
sensors into a complete image.
Some cameras (usually higher-end ones) make use of a CCD sensor, which produces a far higher
quality image but uses far more power and is more expensive, since there are fewer
nodes between the individual sensors and the analogue-to-digital converters.
RFID
An RFID tag is useful because, unlike a barcode, it can be read from over 300 metres away and
can transfer more information, passing information between the transmitter and
the receiver wirelessly. The RFID tag consists of a small microchip transponder and an antenna. Since
so little is required, the tag can be very small, even the size of a grain of rice. There are two types of
tag: active tags and passive tags. Active tags also contain a battery to power the tag so that it actively
transmits a signal for a reader to pick up. This means that they can be picked up from further
away. Passive tags instead make use of the radio waves emitted by a reader to provide
sufficient electromagnetic power to the tag via its antenna. Once powered, the transponder
inside the RFID tag can send its data to the reader. Passive tags are much smaller and cheaper, and are
far more widely used, in everything from credit cards to Oyster cards.
Laser Printer
A laser printer makes use of powdered ink called toner. The printer generates a bitmap image of
the printed page and, using a laser unit and a mirror, draws a negative, reverse image onto a
negatively charged drum. The laser light causes the affected areas of the drum to lose their charge.
The drum then rotates past a toner hopper to attract charged toner particles onto the areas which
have not been lasered. The toner is then transferred to the paper and bonded to it using heat and pressure.
Laser printers are affordable and are largely used when the volume of printing is high. They are also
generally very fast, though colour laser printers (which use cyan, magenta and yellow as well as
black) require the same process to be repeated four times, once for each colour, and so take more time. Additionally,
the quality of a laser printer is generally lower than the quality which can be achieved using an inkjet
printer.
Storage Devices
A secondary storage device is needed because RAM, which is the memory that is
directly accessed by the processor, loses its contents when the computer’s power is turned off. In
order for the computer to be able to store any data for extended periods of time, therefore, there
must be another place where the data can be stored more permanently. This is generally the
hard disk, though today there has been a switch towards Solid State Drives (SSDs).
Hard Disk: A hard disk uses rigid rotating platters coated with magnetic material. The iron particles
on the disk are polarised to be in either a north or south state, representing 0s or 1s. The disk is
separated into concentric circles called tracks, and each track is subdivided into sectors.
The disk spins very quickly, at speeds up to 10,000 RPM, while the drive head moves across the
disk to access different sectors, using a magnet to read and write data to the disk. Finally, the hard
disk usually makes use of a number of platters, each with its own drive head, in order to increase the
amount of data which can be stored on a single hard disk without vastly increasing the disk's size.
Solid State Drives are much faster, since they use a number of flash chips, but hard
drives generally have very large capacities, allowing people to store all required information on
them comfortably. Additionally, they tend to be very cheap.
Optical Disks: Optical disks (CD ROMs, DVDs and Blu-Ray disks) make use of a high powered laser to
burn sections of their surface, making it less reflective at those points. A laser at a lower power
is then used to read the disk by measuring how much light is absorbed and how much reflected by each
part of the disk. Reflective and non-reflective areas are read as 1s and 0s respectively. There is
generally only one track on an optical disk, generally arranged in a spiral.
CD ROMs have a max storage space of 650MB, while a Blu Ray can store around 50GB, since it uses a
much shorter wavelength laser, meaning the thickness of the track is reduced.
Rewritable compact disks use a laser and a magnet to heat a spot on the disk and then set its state
to become a 0 or 1 before it cools again.
A DVD-RW makes use of a phase change alloy that can change between amorphous and crystalline
states by changing the power of the laser beam.
Optical storage is very cheap to produce and can be easily and safely distributed. However, it is
easily damaged by excessive sunlight or scratches.
Solid State Drives (SSDs): SSDs make use of an array of chips arranged on a board, made up of a
large number of NAND flash memory cells and a controller that manages the memory. Each cell
works by delivering a current that forces electrons onto a floating gate, where they are trapped. Hence, the
charge on this floating gate can be measured to find whether the cell holds a 0 or a 1.
Data is stored in pages (of around 4KB) grouped in blocks. Flash memory cannot be overwritten,
instead requiring the old data to be erased before the new data can be written to the same
location. However, individual pages cannot be erased, requiring instead that the entire block be
erased. Hence, the rest of the data needs to be copied to another block before erasing the block and
then moving the required data back.
Solid State Drives tend to have much lower capacities than hard disks as well as being much more
expensive, though the cost of SSDs is reducing over time. They are much faster than HDDs
(lower latency), with read/write speeds being almost twice as fast.
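The copy, erase and write-back sequence can be sketched as follows. This is a toy model, not how a real SSD controller is written: a block is represented as a Python list of pages, an erased page is represented by `None`, and `rewrite_page` is a name invented for the example.

```python
def rewrite_page(block, page_index, new_data):
    """Change one page of a flash block that can only be erased as a whole."""
    # 1. Copy every page of the block somewhere else.
    saved = list(block)

    # 2. Erase the entire block (single pages cannot be erased).
    for i in range(len(block)):
        block[i] = None

    # 3. Substitute the new data for the old page in the saved copy.
    saved[page_index] = new_data

    # 4. Write all the pages back into the freshly erased block.
    for i, page in enumerate(saved):
        block[i] = page
    return block


block = ["page A", "page B", "page C", "page D"]
rewrite_page(block, 2, "new C")
```

Changing a single page therefore costs a whole block's worth of copying, which is why the controller has to manage the memory carefully.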
Cyber-Attacks
• ESTONIA
• In 2007, Estonia suffered a series of cyber-attacks which swamped websites of
organisations including the Estonian Parliament, banks, ministries, newspapers and
broadcasters. This was one of the largest cases of state-sponsored cyber warfare
ever seen, sponsored by Russia in retaliation for the relocation of the Bronze
Soldier of Tallinn, an elaborate Soviet-era grave marker, and war graves in Tallinn.
• In 2008, an ethnic Russian Estonian national was charged and convicted of the
crime.
• SONY PICTURES
• In June 2014, the North Korean government threatened action against the US
government if the movie ‘The Interview’ was released.
• In November, Sony Pictures' computers were hacked by the ‘Guardians of Peace’, a
group believed by the FBI to be linked to the North Korean government.
Asynchronous Transmission
• Using asynchronous transmission, one character is sent at a time, with each character being
preceded by a start bit and followed by a stop bit.
• The start bit alerts the receiving device and synchronises the clock inside the receiver ready to
receive the character, ensuring the baud rate at the receiving end is the same as the sender’s
baud rate.
• A parity bit is also included to check against incorrect transmission. Thus for every 7-bit character, 10
bits are transmitted: the start bit, the seven data bits, the parity bit and the stop bit.
• The start bit can be either a 0 or a 1, with the stop bit being the other one.
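The ten bits sent for one character can be built up as below. Several details here are assumptions chosen for the example rather than anything fixed by these notes: a 7-bit character code, even parity, the data bits sent least significant bit first, and a start bit of 0 by default. The name `frame_character` is also made up.

```python
def frame_character(char, start_bit=0):
    """Return the 10 bits transmitted for one 7-bit character."""
    code = ord(char)
    if code > 127:
        raise ValueError("only 7-bit characters are supported in this sketch")

    data_bits = [(code >> i) & 1 for i in range(7)]   # least significant bit first
    parity_bit = sum(data_bits) % 2                   # even parity: total number of 1s is even
    stop_bit = 1 - start_bit                          # the stop bit is the opposite of the start bit
    return [start_bit] + data_bits + [parity_bit] + [stop_bit]


frame = frame_character("A")   # "A" has character code 65 (1000001 in binary)
```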
4.9.2 -Networking
LAN (local area network): A LAN consists of a number of computing devices on a single site
connected together by cables, connecting a number of PCs, printers and scanners and a central
server. Users can communicate with each other as well as sharing data and hardware devices such
as printers. LANs can transmit data very fast but only over a very short distance and there is a limit
to the number of computers that can be connected to a single LAN.
• In a Bus topology, all computers are connected to a single cable, with the ends of the cable
connected to a computer or a terminator.
• Data is transmitted in one direction only at any one time – where all the traffic has equal
transmission priority.
• A device wanting to communicate with another device sends a broadcast message onto the wire
that all other devices see, but only the intended recipient accepts and processes the message,
with access to the cable managed using Carrier Sense Multiple Access / Collision Detect
(CSMA/CD).
• Inexpensive to install as it requires less cable than a star topology and does not require any
additional hardware.
• New devices can be easily added without disrupting the network.
• Well suited to small networks not requiring high speeds.
• If the main cable fails, the whole network will go down.
• Limited cable length and number of stations
• The performance of the system will degrade with heavy traffic.
• There is relatively low security, as all the computers on the network can see all data
transmissions.
Star Topology
• A star network has a central node, which may be a switch, hub or computer, that acts as a
router to transmit messages
• A hub simply transmits all the messages received from one line to all other lines, whereas a
switch or router can be smarter, recording which line each computer is connected to (using its MAC
address) and only sending each message to the specific computer that it should be sent to.
• MAC Address: Every networked device has a Network Interface Card (NIC), which
has a unique media access control address (MAC address) that is assigned and hard-coded
into the card by the manufacturer and uniquely identifies the device.
• If one cable fails, only one station is affected, so it is easy to isolate faults
• Consistent performance even when the network is being heavily used
• Performance is better than bus network
• No problems with collisions of data
• Messages are more secure
• Easy to add new stations without disrupting the network
• Different stations can communicate with the switch using different protocols.
• More costly, because more cables are required, as well as a central hub or server.
• If the central device fails, the entire network will go down.
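The difference between a hub and a switch in the star topology above can be sketched as a small table of MAC addresses. The class `Switch`, its method `receive` and the short MAC strings are all hypothetical; real switches work on frames in hardware, so this only models the decision about which ports a message is sent out of.

```python
class Switch:
    """Toy model of a switch that learns which port each MAC address is on."""

    def __init__(self, port_count):
        self.ports = range(port_count)
        self.mac_table = {}                # MAC address -> port number

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the message is forwarded to."""
        self.mac_table[src_mac] = in_port  # remember where the sender is
        if dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]            # send to one port only
        # Destination not yet known: behave like a hub for this message.
        return [p for p in self.ports if p != in_port]


switch = Switch(port_count=4)
first = switch.receive(0, "AA", "BB")    # BB unknown, so sent to every other port
second = switch.receive(1, "BB", "AA")   # AA was learned from the first message
third = switch.receive(0, "AA", "BB")    # BB was learned from the second message
```

Because later messages reach only their intended port, other stations never see them, which is where the security and collision advantages of a switched star network come from.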
Physical vs Logical Topology
The physical topology of a network is the actual design layout, which describes the wiring scheme.
The logical topology is the path in which the data travels and describes how components
communicate across the physical topology.
Thin-Client
+                               -
Easy to set up, maintain        Reliant on the server
More secure                     Requires powerful and reliable server - therefore expensive
Easier to update                Server demand and bandwidth increased
o Client-Server Databases
• Many modern servers provide a service for client-server operation.
• Advantages are:
1. Consistency of data as it is only stored in one place.
2. Expensive Resource can be made available to a large number of users.
3. Access rights can be managed centrally
4. Backup and recovery.
• PROBLEMS
1. People simultaneously updating things.
a. Methods of fixing this:
i. When an item is updated, the entire block is copied to local
memory area and when the record is saved, the block is
rewritten to the server.
ii. BUT, then people's changes can be lost
• Concurrency Fixes
1. RECORD LOCKS
• Technique of preventing simultaneous access to objects in a
database in order to prevent updates being lost or inconsistencies in
the data from arising.
• Record is locked whenever a user retrieves it for editing or updating.
• PROBLEMS
• Deadlock - if two users each lock one record and then need the record locked by the other, neither can proceed
• Therefore other techniques can be used as well.
2. SERIALISATION
a. Transaction cannot start until the previous one has finished.
3. TIMESTAMP ORDERING
a. When a transaction starts, it is given a timestamp.
b. If two transactions attempt to affect the same object, the one with
the earlier timestamp is applied first.
c. Every object has a read and write timestamp
i. When completing an update, reading the record sets its read timestamp to
the current timestamp
ii. Before it writes, the transaction can check whether another transaction
has read the record in the meantime (has the read timestamp changed?)
iii. Therefore, allows problems to be avoided.
4. COMMITMENT ORDERING
a. Transactions are ordered in terms of their dependencies on one
another as well as the time when they were started.
b. Can stop deadlock by blocking certain requests until others have been
completed.
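As an illustration of the timestamp idea from the list above, the sketch below gives each record a read and a write timestamp and refuses a write if some other transaction has read the record since this one did. It is a simplified model under stated assumptions: timestamps come from a simple counter, a refused write just returns False, and the names `Record` and `Transaction` are invented for the example.

```python
import itertools

_clock = itertools.count(1)        # stand-in for a real timestamp source


class Record:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0           # timestamp of the most recent read
        self.write_ts = 0          # timestamp of the most recent write


class Transaction:
    def __init__(self):
        self.ts = next(_clock)     # timestamp given when the transaction starts
        self.seen_read_ts = None

    def read(self, record):
        record.read_ts = self.ts               # reading sets the read timestamp
        self.seen_read_ts = record.read_ts     # remember it for the later check
        return record.value

    def write(self, record, value):
        # If the read timestamp has changed, another transaction has read
        # the record in the meantime, so this update is refused.
        if record.read_ts != self.seen_read_ts:
            return False
        record.value = value
        record.write_ts = self.ts
        return True


balance = Record(100)
t1, t2 = Transaction(), Transaction()
t1.read(balance)
t2.read(balance)                       # t2 reads before t1 has written
t1_saved = t1.write(balance, 150)      # refused: the read timestamp changed
t2_saved = t2.write(balance, 120)      # accepted
```

In a full system the refused transaction would be restarted rather than silently dropped.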
LACK OF STRUCTURE: This is the most significant feature of Big Data, rather than its volume. It poses a
number of challenges: the data becomes harder to analyse, and relational databases
are not appropriate because they require the data to fit into a row and column format.
Big Data collection and processing enables us to detect and analyse relationships within and among
individual pieces of information that previously we were unable to grasp.
GRAPH SCHEMA
Graphs can be used to represent connected data and to perform analyses on very large
datasets.
Can be used to easily graphically represent the data
Data is stored as nodes and relationships (arrows) while properties describe the stored data.
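A graph schema can be sketched with ordinary Python structures: a dictionary of nodes with their properties and a list of relationships. The people, the relationship names and the helper `related` are all made up for this example and are not taken from any particular graph database.

```python
# Nodes, each with properties describing the stored data
nodes = {
    "alice": {"type": "Person", "age": 30},
    "bob": {"type": "Person", "age": 25},
    "acme": {"type": "Company", "city": "Leeds"},
}

# Relationships (arrows) as (from_node, relationship, to_node)
relationships = [
    ("alice", "FRIEND_OF", "bob"),
    ("alice", "WORKS_FOR", "acme"),
    ("bob", "WORKS_FOR", "acme"),
]


def related(node_id, relationship):
    """Follow every arrow of one kind that starts at the given node."""
    return [dst for src, rel, dst in relationships
            if src == node_id and rel == relationship]


friends_of_alice = related("alice", "FRIEND_OF")
employers_of_bob = related("bob", "WORKS_FOR")
```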
Programming Paradigms
Programming Paradigms are styles of programming - eg Procedural Programming, Object Oriented
Programming, Declarative Languages and Functional Programming
What is a function
A function is a mapping from a set of inputs (domain) to a set of possible outputs, known as the co-
domain
A function is a rule, that for each element in the domain, assigns a chosen value from the co-domain,
without necessarily using every member of the co-domain.
Function Application
Function application means applying a function to its arguments, e.g. add(3, 4) applies the function add to the arguments 3 and 4.
Higher-order functions
A function is higher-order if it takes a function as an argument or returns a function as a result or
does both.
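Both kinds of higher-order function can be shown in a few lines. The names `apply_twice` and `make_adder` are illustrative only: the first takes a function as an argument, and the second returns a function as its result.

```python
def apply_twice(f, x):
    """Takes a function as an argument and applies it two times."""
    return f(f(x))


def make_adder(n):
    """Returns a new function as its result."""
    def add_n(x):
        return x + n
    return add_n


add_three = make_adder(3)
result = apply_twice(add_three, 10)   # (10 + 3) + 3
```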
STATELESSNESS: the value of a variable can never change - they are immutable.
NO SIDE EFFECTS: The only thing a function can do is calculate something and return a result, and it
is said to have no side effects. A function which is called twice with the same parameters will always
return the same result. (Referential Transparency)
Composition of Function
Two functions can be combined to get a single function - this is referred to as functional
composition.
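A small sketch of composition, assuming the usual convention that the second function is applied first; `compose`, `double` and `increment` are names chosen for the example.

```python
def compose(f, g):
    """Return a single function that applies g first, then f."""
    return lambda x: f(g(x))


def double(x):
    return x * 2


def increment(x):
    return x + 1


double_after_increment = compose(double, increment)   # double(increment(x))
increment_after_double = compose(increment, double)   # increment(double(x))
```

The order matters: with an input of 3, the first combined function gives 8 and the second gives 7.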
Map
Map is a higher-order function which takes a list and the function to be applied to the elements of
the list as inputs and returns a list made by applying the function to each element of the old list.
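Python has a built-in `map` that behaves in this way. It returns a lazy iterator, so `list` is used here to produce the new list; `square` is just an example function.

```python
def square(x):
    return x * x


numbers = [1, 2, 3, 4]
squares = list(map(square, numbers))   # apply square to each element
```

The original list is left unchanged, which fits the idea of immutability.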
Filter
Filter is a higher-order function which takes a condition and a list and returns the elements in the list
which satisfy the condition.
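Python's built-in `filter` works the same way, with the condition supplied as a function that returns True or False; `is_even` is an example condition.

```python
def is_even(x):
    return x % 2 == 0


numbers = [1, 2, 3, 4, 5, 6]
evens = list(filter(is_even, numbers))   # keep only elements satisfying the condition
```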
Fold (REDUCE)
The fold function reduces a list of values to a single value by repeatedly applying a combining function to
the list values.
Foldl means to start from the left while Foldr means to start from the right.
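Python does not have functions called foldl and foldr, so the sketch below builds both from `functools.reduce`, which folds from the left. The wrapper names `foldl` and `foldr` are defined here only to mirror the terms used above. Subtraction is used as the combining function because it gives different answers depending on the direction.

```python
from functools import reduce
from operator import sub


def foldl(f, start, values):
    """Fold from the left: f(f(f(start, v1), v2), v3)."""
    return reduce(f, values, start)


def foldr(f, start, values):
    """Fold from the right: f(v1, f(v2, f(v3, start)))."""
    return reduce(lambda acc, v: f(v, acc), reversed(values), start)


left = foldl(sub, 0, [1, 2, 3])    # ((0 - 1) - 2) - 3
right = foldr(sub, 0, [1, 2, 3])   # 1 - (2 - (3 - 0))
```

With a combining function such as addition, where the order does not matter, both folds give the same result.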