Computer Science A2 Level 9618 Theory Notes

Helps aspiring programmers.

Uploaded by

stacey.bobo2022
Copyright
© All Rights Reserved

Data Representation

User-Defined Data Types

 A user-defined data type is a data type designed by the programmer. When object-oriented programming is not being used, a programmer may choose user-defined data types for a large program because they can reduce errors and make the program easier to understand. They are also less restrictive than built-in types, since the programmer decides exactly what each type contains.

 Built-in data types are used in the same way in every program. There cannot be a built-in record type, however, because each problem needs its own record definition. There are two kinds of user-defined data type: composite and non-composite.

Composite User-Defined Data Types

Composite user-defined data types have a definition referencing at least one other type.

 Record data type: a data type that contains a fixed number of components (fields), which can be of different types. It lets the programmer group values of different data types when together they form a coherent whole. It can also be used to implement a data structure in which one or more fields are pointer variables.

    TYPE <myRecord>                                    // The record consists of several fields
        DECLARE <identifier1> : <built-in data type>
        DECLARE <identifier2> : <built-in data type>   // Each field has its own data type
    ENDTYPE

    // Creating a variable using this record data type
    DECLARE myVariable : <myRecord>                    // The variable is declared as a record

    // Assigning a value to <identifier1> of myVariable
    myVariable.<identifier1> ← <value>                 // Fields are accessed using dot notation

 e.g.

    TYPE TEmployeeRecord
        DECLARE FirstName : STRING
        DECLARE LastName : STRING
        DECLARE Salary : REAL
        DECLARE Position : STRING
    ENDTYPE

    DECLARE Employee1 : TEmployeeRecord

    Employee1.FirstName ← "John"
    Employee1.LastName ← "Doe"
    Employee1.Salary ← 2830.80
    Employee1.Position ← "Project Manager"

 Set data type: allows a program to create sets and apply the mathematical operations defined in set theory. Every element in a set must be unique. The operations include:

 • Union

 • Difference

 • Intersection

 • Include an element in the set

 • Exclude an element from the set

 • Check whether an element is in a set

    // In pseudocode the type definition has this structure:
    TYPE <set-identifier> = SET OF <Basetype>

    // Variable definition
    DEFINE <identifier> (value1, value2, value3, ...) : <set-identifier>

 e.g.

    TYPE Days = SET OF STRING
    DEFINE Today ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday") : Days

 Classes: in object-oriented programming, a program defines the classes to be used, and objects are then created from each class. A class includes variables (attributes) and methods (the functions or procedures that an object of that class can run).
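For readers who want to try the record and set ideas in a real language, here is a minimal Python sketch. It is not syllabus pseudocode: it assumes a `dataclass` is an acceptable stand-in for a record and Python's built-in `set` for the set type. The field values mirror the employee example above; the `weekend` and `working_days` names are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class EmployeeRecord:
    """Record: a fixed number of fields, each with its own type."""
    first_name: str
    last_name: str
    salary: float
    position: str


# Declare a record variable and access its fields with dot notation.
employee1 = EmployeeRecord("John", "Doe", 2830.80, "Project Manager")
employee1.salary = 3000.00

# Set: elements are unique, and set-theory operations are available.
days = {"Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"}
weekend = {"Saturday", "Sunday"}          # illustrative subset

working_days = days - weekend             # difference
all_days = working_days | weekend         # union
overlap = working_days & weekend          # intersection (empty here)

weekend.add("Friday")                     # include an element
weekend.discard("Friday")                 # exclude an element
is_member = "Monday" in working_days      # check membership
```

Each set operation maps onto one of the operations listed for the set data type.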

Non-Composite User-Defined Data Types

Non-composite user-defined data types do not reference another type. With a simple built-in type (such as STRING, INTEGER or REAL), the only requirement is to declare an identifier with that type. A non-composite user-defined type, by contrast, must be explicitly defined before an identifier of that type can be declared.

 Enumerated data type: a non-composite user-defined data type defined by a list of its possible values. The values have an implied order, so value2 is greater than value1, which allows comparisons to be made. The values are not strings and must not be quoted. The list is finite, so the values are countable.

    TYPE <Datatype> = (<value1>, <value2>, <value3>, ...)
    DECLARE <identifier> : <Datatype>

 e.g.

    TYPE Season = (Summer, Winter, Autumn, Spring)   // Note: we write Summer, not "Summer",
                                                     // because the values are enumerated
                                                     // identifiers, not strings
    DECLARE ThisSeason : Season
    DECLARE NextSeason : Season

    ThisSeason ← Autumn
    NextSeason ← ThisSeason + 1                      // NextSeason is set to Spring
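The same enumerated type can be sketched in Python. This assumes `IntEnum` from the standard library is a reasonable analogue: the integer values are chosen here only to reproduce the implied order of the list.

```python
from enum import IntEnum


class Season(IntEnum):
    # Values are identifiers, not strings; the numbers give the implied order.
    SUMMER = 0
    WINTER = 1
    AUTUMN = 2
    SPRING = 3


this_season = Season.AUTUMN
next_season = Season(this_season + 1)      # the value after Autumn in the list

is_later = Season.WINTER > Season.SUMMER   # comparisons use the implied order
```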

 Pointer data type: used to reference a memory location. It may be used to construct dynamically varying data structures. A pointer does not hold a data value itself but the address of (a reference to) the data, and the pointer definition has to state the type of the variable being pointed to. In the type declaration, ^ shows that the type being declared is a pointer, and <Typename> is the data type found at the memory location, e.g. STRING.

    TYPE <PointerName> = ^<Typename>

    // Declaring a pointer variable
    DECLARE <FirstPointer> : <PointerName>

    <assignment value> ← <FirstPointer>^   // Accesses the data stored at the address that
                                           // <FirstPointer> points to. This is known as
                                           // dereferencing.

    SecondPointer ← @<identifier>          // Stores the memory address of <identifier>
                                           // in SecondPointer

    SecondPointer^ ← MyVariable            // Stores the value of MyVariable in the memory
                                           // location currently pointed to by SecondPointer

e.g

Use the diagram to state the current values of the following expressions.

IPointer: 4402 // This is the address that IPointer is pointing to.

IPointer^: 33 //This is the value stored at the address (4402) the IPointer is pointing to.

@MyInt1: 3427 //This is the address of MyInt1

IPointer^ = MyInt2 : TRUE //This compares the value of MyInt2 to the value stored at the address (4402)

Write pseudocode statements that will achieve the following.

1. Place the address of MyInt2 in the IPointer.

IPointer <-- @MyInt2

2. Assign the value 33 to the variable MyInt1.

MyInt1 <-- 33

3. Copy the value of MyInt2 into the memory location currently pointed at by the IPointer.

IPointer^ <-- MyInt2

File Organisation and Access

The contents of any file are stored using a defined binary code that allows the file to be used as intended. For storing data to be used by a computer program, however, there are only two file types: text and binary.

 A text file contains data stored according to a defined character code, such as ASCII or Unicode. A text file can be created using a text editor.

 A binary file stores data for use by a computer program in its internal representation (for example, an integer might be stored in 2 bytes using two's complement so that negative values can be represented). A binary file is created by a specific program. Its organisation is based on records, each a collection of fields containing data values: file → records → fields → values.

Methods of File Organisation and Access

 Proper file organisation is crucial as it determines access methods, efficiency, flexibility, and storage devices.

Serial files: contain records that have no defined order. Records are stored one after another in the order in which they were added, and new records are appended to the end of the file. A text file may be a serial file of repeating lines, each terminated by an end-of-line character(s).

 There is no end-of-record character, so each record in a serial file must have a defined format to allow data to be input and output correctly. To access a specific record, the program has to read through every record in turn until the required one is found.

 Serial file organisation is frequently utilised for temporarily storing transaction files that will eventually be moved to more permanent

storage.

Advantages of serial file organisation

 It is simple to implement: new records are just appended to the file.

 The cost is low.

Disadvantages of serial file organisation

 Access is slow because all preceding records must be read before the required one is retrieved.

 It cannot support modern high-speed requirements for quick record access.

 File access: records in this file type are searched using sequential access. Records are read one by one until the required data is found or the whole file has been searched without finding it, which makes searching slow. Uses:

 Batch processing

 Backing up data on magnetic tape

 Recording bank transactions on customer accounts as each transaction occurs.

Sequential Files

Sequential files hold records in order of a key field. They are suited to long-term data storage and so can be considered an alternative to a database. The key field used to order the file must hold values that are unique and sequential, which makes records easy to locate.

 A sequential database file is more efficient than a text file because of better data integrity, better privacy and less data redundancy. A change in one file would update any other files affected.

 Primary keys in a DBMS (database management system) must be unique but need not be ordered, unlike the key field of a sequential file, which must be both ordered and unique.

 A particular record is found by reading the key field of each record in sequence until the required value is found. New records must be inserted into the file at the correct position to preserve the order.

Advantages of sequential file organisation

 Sorting makes records easier to find, although a sequential search still has to read the records that precede the required one.

 A binary search can be used on the ordered records; it halves the number of records left to search at each step.

File Access

Records in this type of file are searched using the Sequential Access and Direct Access methods.

 Retrieving records using Sequential Access:

o Successively read the value in the key field until the required key is found, or until the key field of the current record is greater than the key searched for (which means the required record is not in the file).

 The rest of the file does not need to be searched because the records are sorted in ascending key order. This method is efficient when every record in the file needs to be processed.
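The early-stopping search described above, and the binary search mentioned under the advantages, can be sketched in Python. The `records` list is a hypothetical in-memory stand-in for a sequential file, and the binary search assumes records can be reached by position.

```python
from bisect import bisect_left

# Hypothetical sequential file: records sorted in ascending order of key.
records = [(101, "Ali"), (104, "Bea"), (109, "Chen"), (115, "Dev"), (120, "Eve")]


def sequential_search(records, wanted_key):
    """Read records in order; stop early once a larger key is seen."""
    reads = 0
    for key, data in records:
        reads += 1
        if key == wanted_key:
            return data, reads
        if key > wanted_key:        # sorted file: the record cannot be further on
            break
    return None, reads


def binary_search(records, wanted_key):
    """Halve the remaining search space at each step."""
    keys = [key for key, _ in records]
    position = bisect_left(keys, wanted_key)
    if position < len(keys) and keys[position] == wanted_key:
        return records[position][1]
    return None
```

`sequential_search` also returns the number of records read, which makes the benefit of stopping early visible when the key is absent.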

Retrieving records using Direct Access:

 This method finds the required record without reading other records in the file. This allows the retrieval of records more quickly. An

index of all the key fields is kept and used to look up the address of the file location where the required record is stored.

 This method is efficient when an individual record in the file needs to be processed.

To edit/delete data:

 Create a new version of the file. Records are copied from the old file to the new file until the record that needs editing or deleting is reached.

 To delete, that record is skipped and copying continues from the next record. To edit, the new version of the record is written to the new file, and then the remaining records are copied across.
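A minimal Python sketch of this copy-to-a-new-file approach follows. Lists of `(key, data)` pairs stand in for the old and new files, and the function name `rewrite_file` is hypothetical.

```python
def rewrite_file(old_records, target_key, new_data=None):
    """Return a new 'file' (list) built by copying the old one.

    new_data=None deletes the target record; otherwise the target
    record is replaced by the edited version.
    """
    new_records = []
    for key, data in old_records:
        if key != target_key:
            new_records.append((key, data))       # copy unchanged record
        elif new_data is not None:
            new_records.append((key, new_data))   # write the edited version
        # else: deleting, so the record is simply not copied
    return new_records


old_file = [(101, "Ali"), (104, "Bea"), (109, "Chen")]
after_delete = rewrite_file(old_file, 104)
after_edit = rewrite_file(old_file, 104, "Beatrice")
```

The old file is never modified, which matches the description: the new version replaces it only once copying is complete.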

Random Files

 Records are stored in no particular sequence in the file but are accessed directly. The location of each record is calculated by applying a hashing algorithm to the record's key field. Storage devices that allow direct access, such as magnetic and optical disks, support random file organisation.

Advantages of random file organisation

 Quick retrieval of records.

 The records may vary in size.

Direct Access Files

File access: Records in this file type are searched using the Direct Access method. A hashing algorithm is used on the key field to

calculate the address of the file location where a given record is stored.

Direct access/random access files: access does not depend on reading the file sequentially. This suits larger files, which take longer to search sequentially. Each record is stored at an identifiable position, which can be found by direct access to a nearby position followed by a limited serial search.

 The position of a record must be calculated from data in the record so that the same calculation can be repeated when the record is searched for. One method is a hashing algorithm, which takes the key field as input and outputs the record's position relative to the start of the file. To access a record, its key is hashed to give that position.

 The algorithm also has to take into account the potential maximum length of the file, that is, the number of records the file will store.

 e.g. if the key field is numeric, divide it by a suitably large number and use the remainder as the position. This does not guarantee unique positions: if the calculated position is already occupied by a record with a different key, the next free position in the file is used. This is why a search involves direct access, possibly followed by a limited serial search, and why the organisation is considered partly direct and partly serial.

File access:

 The value in the key field is submitted to the hashing algorithm, which returns the same position in the file that it returned when the record was written. The search goes to that position and, because of possible collisions, may continue with a short linear search. This is the fastest access method.
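The hashing scheme can be sketched in Python under a few stated assumptions: the file holds at most `FILE_SIZE` records (7 is an arbitrary illustrative choice), a list of slots stands in for record positions, and a collision moves to the next position, wrapping round to the start at the end of the file. Deletion flags are not modelled.

```python
FILE_SIZE = 7                       # assumed maximum number of records
slots = [None] * FILE_SIZE          # stands in for record positions in the file


def hash_position(key):
    """The remainder after division gives a position relative to the file start."""
    return key % FILE_SIZE


def store(key, data):
    position = hash_position(key)
    for _ in range(FILE_SIZE):
        if slots[position] is None:
            slots[position] = (key, data)
            return position
        position = (position + 1) % FILE_SIZE   # collision: try the next position
    raise RuntimeError("file is full")


def find(key):
    position = hash_position(key)
    for _ in range(FILE_SIZE):
        entry = slots[position]
        if entry is None:
            return None                          # empty slot: key was never stored
        if entry[0] == key:
            return entry[1]
        position = (position + 1) % FILE_SIZE   # limited serial search
    return None


first = store(15, "record A")    # 15 % 7 == 1
second = store(22, "record B")   # 22 % 7 == 1 as well, so it moves to position 2
```

Looking up key 22 hashes to position 1, finds a different key there, and succeeds after one further step, which is the "direct access followed by a limited serial search" pattern.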

To edit/delete data:

 A new file only needs to be created if the current file is full. A deleted record can have a flag set so that it is skipped in subsequent reads. This allows its position to be overwritten later.

Uses:

 Most suited for when a program needs a file in which individual data items might be read, updated or deleted.

Factors that Determine the File Organisation to Use

 How often do transactions occur, and how often does one need to add data?

 How often does it need to be accessed, edited, or deleted?

Floating-Point Numbers, Representation, and Manipulation

 The Real Number: A number that contains a fractional part.

 Floating-point representation: the approximate representation of a real number using binary digits.

 Format: Number = ±Mantissa × Base^Exponent

o Mantissa: the significant digits of the number.

o Exponent: the power to which the base is raised.

 Base: the number of different values a digit can take in the number system; 2 in the case of binary floating-point representation.

 The floating point representation stores a value for the mantissa and the exponent.

 Some bits are used for the significand (mantissa), ±M. The remaining bits are used for the exponent, E. The radix, R, is not stored in the representation because it has an implied value of 2 (binary).

 Suppose a real number is stored using 8 bits: four bits for the mantissa and four bits for the exponent, each in two's complement representation. The exponent is stored as a signed integer. The mantissa has to be stored as a fixed-point real value.

 The binary point of the mantissa could be placed immediately after the first bit (the sign bit) or before the last bit. The former gives smaller spacing between the values that can be represented and is preferred. Floating-point representation also gives a greater range than fixed-point representation.

Converting a denary real number into floating-point binary: most fractional parts do not convert exactly, because the binary fractional places represent a half, a quarter, an eighth and so on. Only fractions that are a sum of these values (such as .5, .25 or .75) can be represented exactly; any other fraction has to be approximated. The fractional part is converted by repeatedly multiplying by two and recording the whole-number part each time.

For example, for 8.63: 0.63 * 2 = 1.26, giving .1; 0.26 * 2 = 0.52, giving .10; 0.52 * 2 = 1.04, giving .101; and so on until the required number of bits is reached.
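The multiply-by-two method is easy to check with a short Python sketch. The function name `fraction_to_binary` is an assumption for illustration; it returns only the bits after the binary point.

```python
def fraction_to_binary(fraction, bits):
    """Return the first `bits` binary places of a fraction 0 <= fraction < 1."""
    digits = []
    for _ in range(bits):
        fraction *= 2
        whole = int(fraction)       # the whole-number part is the next bit
        digits.append(str(whole))
        fraction -= whole
    return "".join(digits)


approx = fraction_to_binary(0.63, 3)   # first three bits of .63
exact = fraction_to_binary(0.75, 3)    # .75 is a half plus a quarter
```

For 0.63 the remaining fraction never reaches zero, so any fixed number of bits gives an approximation; for 0.75 it reaches zero after two bits.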

The method for converting a positive value is:

1. Convert the whole-number part to binary.

2. Add the sign bit 0.

3. Convert the fractional part and combine the two parts; at this stage the exponent is zero.

4. Normalise: move the binary point to just after the sign bit, increasing the exponent by one for each place moved. Depending on the number of bits available, pad with extra 0s at the end of the mantissa and at the beginning of the exponent.

Therefore, for 8.75: 1000 -> 01000 -> .11 -> 01000.11 -> 0.100011 (mantissa) with exponent 4 -> 0100011000 0100 (10 bits for M and 4 for E).

 For negatives, use 2's complement.
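One way to check a conversion is to decode the bit pattern back into a denary value. The Python sketch below assumes the format used in these notes: a two's complement mantissa with the binary point immediately after the sign bit, and a two's complement integer exponent. The function names are illustrative.

```python
def twos_complement(bits):
    """Interpret a bit string as a two's complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":                 # sign bit set: value is negative
        value -= 1 << len(bits)
    return value


def decode(mantissa_bits, exponent_bits):
    """Value of a float whose binary point follows the mantissa's sign bit."""
    mantissa = twos_complement(mantissa_bits) / (1 << (len(mantissa_bits) - 1))
    exponent = twos_complement(exponent_bits)
    return mantissa * 2 ** exponent


value = decode("0100011000", "0100")   # the 8.75 example: 10-bit M, 4-bit E
```

Dividing the mantissa integer by 2^(n-1) places the binary point after the sign bit, so `0100011000` becomes 0.546875, and multiplying by 2^4 gives 8.75.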

 When implementing the floating-point representation, a decision must be made about the total number of bits to use and how to split them between the mantissa and the exponent.

 Usually the total number of bits can be chosen when the program is written, but the floating-point processor determines the split between the two parts.

 If there were a choice, increasing the number of bits for the mantissa would give better precision but leave fewer bits for the exponent, reducing the range of possible values, and vice versa. For maximum precision, a floating-point number must be normalised.

 Optimum precision is achieved only when full use is made of the bits in the mantissa, which means using the largest possible magnitude for the value the mantissa represents.

 In a normalised number the two most significant bits of the mantissa differ: 01 for positive numbers and 10 for negative numbers.

 In each pair below, both representations have the same value, but the second, which makes fuller use of the mantissa bits, is the normalised one and allows the most precision.

 0.125 * 2^4 = 2, stored as 0 001 0100

 0.5 * 2^2 = 2, stored as 0 100 0010

For negatives:

 -0.25 * 2^4 = -4, stored as 1 110 0100

 -1.0 * 2^2 = -4, stored as 1 000 0010

 When a number is represented with the largest possible magnitude for the mantissa, the two most significant bits are different, and the number is said to be in normalised form. To normalise a positive number, the bits in the mantissa are shifted left until the most significant bits are 0 followed by 1.

 For each shift left, the exponent is reduced by 1. The same shifting process is used for a negative number until the most significant bits are 1 followed by 0. In this case the bits that fall off the most significant end of the mantissa are ignored. In short, normalisation shifts the mantissa bits left until the two most significant bits differ.
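The shifting rule can be sketched in Python. The sketch assumes the mantissa is held as a two's complement bit string and the exponent as an ordinary integer; the function name `normalise` is illustrative. A mantissa of all zeros is rejected because shifting can never make its top two bits differ.

```python
def normalise(mantissa_bits, exponent):
    """Shift the mantissa left until its two most significant bits differ."""
    if "1" not in mantissa_bits:
        raise ValueError("zero has no normalised form")
    while mantissa_bits[0] == mantissa_bits[1]:
        mantissa_bits = mantissa_bits[1:] + "0"   # shift left, fill with 0
        exponent -= 1                              # each shift lowers the exponent
    return mantissa_bits, exponent


positive = normalise("0001", 4)   # 0.125 * 2^4
negative = normalise("1110", 4)   # -0.25 * 2^4
```

The two calls reproduce the pairs used in the notes: `0 001 0100` becomes `0 100 0010`, and `1 110 0100` becomes `1 000 0010`.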

Why are Floating Point numbers represented in normalised form?

 Saves space by storing many numbers using the smallest bytes possible.

 Normalization reduces the representation of leading zeros or ones.

 Maximizing the precision or accuracy of the number for the given number of bits.

 Allows for the precise storage of both very large and very small numbers.

 Avoids the possibility of many numbers having multiple representations.

Precision vs Range.

 Increasing the number of bits for mantissa increases the precision of the number.

 The number range can be increased by increasing the number of bits for exponent.

 Precision and range will always be a trade-off between the mantissa and exponent size.

Problems with Using Floating Point Numbers

1. Converting a real denary value to binary usually requires some approximation, because the number of bits used to store the mantissa is limited. The resulting rounding errors can become significant after many calculations. The only way to reduce the problem is to increase the precision by using more bits for the mantissa, which is why programming languages offer double or quadruple precision.

2. The range is limited. With the 8-bit format described earlier (4-bit mantissa, 4-bit exponent), the highest value that can be represented is 0.111 * 2^7 = 112; a result larger than the maximum produces an overflow condition. If a result is smaller in magnitude than the smallest value that can be stored, an underflow condition occurs. Such a very small number can be converted to zero, but this carries risks, for example when the value is later used in a multiplication or division.

3. Zero cannot be stored as a normalised floating-point number, because a normalised mantissa must begin with either 0.1 or 1.0.
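The first two problems can be observed directly in Python, assuming its `float` is the usual 64-bit double-precision type (which is the case on mainstream platforms). The variable names are illustrative.

```python
import math

# 0.1, 0.2 and 0.3 have no exact binary representation, so the sum is slightly off.
rounding_error = (0.1 + 0.2) - 0.3
exactly_equal = (0.1 + 0.2) == 0.3
close_enough = math.isclose(0.1 + 0.2, 0.3)   # compare with a tolerance instead

# Values that are sums of halves, quarters, eighths... are stored exactly.
exact_sum = (0.5 + 0.25) == 0.75

overflowed = 1e308 * 10        # too large for a 64-bit float: becomes infinity
underflowed = 5e-324 / 2       # too small in magnitude: becomes zero
```

Comparing with a tolerance, as `math.isclose` does, is the usual way to cope with rounding errors when testing floating-point results for equality.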

For example, one use of floating point numbers is in extended mathematical procedures involving repeated calculations like

weather forecasting, which uses the mathematical model of the atmosphere.

Communication and Internet Technologies
Protocols
Protocols are essential for successful transmission of data over a network. Each protocol defines a set of rules that must be

agreed between sender and receiver. At the simplest level, a protocol could define that a positive voltage represents a bit with a

value of 1.

 At the other extreme, a protocol could define the format of the first 40 bytes in a packet. The complexity of networking requires many protocols; a protocol suite is a collection of related protocols. TCP/IP is the dominant protocol suite for internet usage.

 Protocol: A set of rules governing communication between computers.

o Ensures the computers that communicate understand each other.

 MAC address: A unique number assigned to each device’s networking hardware worldwide.

 IP address: A unique number assigned to each node/networking device in a network.

 Port number: A software-generated number that specifies an application or a process communication endpoint attached to an IP

address.

 IP: Internet Protocol. The function of the network layer, and of IP, is to ensure correct routing over the internet. To do so, it takes the packet received from the transport layer and adds a further header containing the IP addresses of the sender and the receiver.

o To find the IP address of the receiver, DNS can be used to look up the address corresponding to the URL supplied in the user data. The IP packet (datagram) is then passed to the data link layer, and therefore to a different protocol suite. The data link layer assembles datagrams into frames, and transmission begins.

o IP has no further duty once the IP packet has been passed to the data link layer. IP is a connectionless service, so if it receives a packet containing an acknowledgement of a previously sent packet, it simply passes the packet on to TCP with no awareness of its content.

 TCP: Transmission Control Protocol.

o When an application running on one end system needs to send a message to a different end system, the application is controlled by an application layer protocol. That protocol passes the user data to the transport layer, where TCP takes responsibility for ensuring the safe delivery of the message to the receiver.

o To do this, TCP creates enough packets to hold all the data. Each packet consists of a header plus user data. TCP must ensure safe delivery and return any response to the application protocol.

o The header contains port numbers that identify the application layer protocol at the sending and receiving end systems (TCP is not otherwise concerned with the receiving end system). If the packet is one of a sequence, a sequence number is included so that the user data can eventually be reassembled correctly.

o TCP is connection-oriented. Initially, just one packet of the sequence is sent to the network layer. Once the connection has been established, TCP sends the other packets and receives response packets containing acknowledgements. This allows missing packets to be identified and resent.
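The role of sequence numbers can be illustrated with a toy Python model. This is a deliberate simplification, not how TCP is implemented: real TCP numbers bytes rather than whole packets and uses acknowledgements and timers. All function names here are hypothetical.

```python
def make_packets(message, size):
    """Split a message into (sequence_number, data) packets."""
    return [(seq, message[start:start + size])
            for seq, start in enumerate(range(0, len(message), size))]


def missing_sequence_numbers(received, total):
    """Sequence numbers that have not arrived and would need resending."""
    arrived = {seq for seq, _ in received}
    return [seq for seq in range(total) if seq not in arrived]


def reassemble(received):
    """Put packets back in sequence order and join the data."""
    return "".join(data for _, data in sorted(received))


packets = make_packets("HELLO WORLD", 4)          # three packets
arrived = [packets[2], packets[0]]                # out of order, packet 1 lost
to_resend = missing_sequence_numbers(arrived, len(packets))
arrived.append(packets[1])                        # the resent packet arrives
message = reassemble(arrived)
```

Because each packet carries its sequence number, the receiver can both spot the gap and restore the original order regardless of arrival order.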

 TCP/IP suite: the common suite of protocols used to send data over a network.

o Protocols are split into separate layers, which are arranged as a stack.

o They service each other, thus maintaining the flow of the data.

 Layer: A division of the TCP/IP suite.

 Stack: A collection of elements/protocols/layers.

Layer              Purpose
Application        Encodes the data being sent
Network/Internet   Adds IP addresses stating where the data is from and where it is going
Link               Adds MAC address information to specify which hardware device the message came from and which hardware device the message is going to
Physical           Enables the successful transmission of data between devices
 When a message is sent from one host to another:

o Sender side: Application Layer

 Encodes the data in an appropriate format.

o Sender side: Transport Layer

 The data to be sent is broken down into smaller chunks known as packets

o Sender side: Network Layer

 IP addresses (sender and receiver) and a checksum are added to the header

o Sender side: Link Layer

 Formats the packets into a frame. These protocols attach a third header and a footer to “frame” the packet. The frame header

includes a field that checks for errors as the frame travels over the network media.

o Sender side: Physical Layer

 Receives the frames and converts the IP addresses into the hardware addresses appropriate to the network media. The physical

network layer then sends the frame out over the network media.

o Server/ Service Provider

 Re-routes the packets according to the IP address

o Receiver side: Physical Layer

 Receives the packet in its frame form. It computes the packet's checksum and sends the frame to the data link layer.

o Receiver side: Link Layer

 Verifies that the checksum for the frame is correct and strips off the frame header and checksum. The data link protocol then passes the packet to the Internet layer.

o Receiver side: Network Layer

 Reads the information in the header to identify the transmission and determine whether it is a fragment. If the transmission was fragmented, IP reassembles the fragments into the original datagram. It then strips off the IP header and passes the datagram on to the transport layer protocols.

o Receiver side: Transport Layer

 Reads the header to determine which application layer protocol must receive the data. TCP then strips off its header and sends the message or stream up to the receiving application.

o Receiver side: Application Layer

 Receives the message and performs the operation requested by the sender
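The wrapping and unwrapping performed by the layers can be sketched in Python with nested dictionaries. This is a simplified model: the field names, the addresses (taken from ranges reserved for documentation and local use) and the port numbers are all illustrative, and checksums, footers and fragmentation are left out.

```python
def send(data, src_port, dst_port, src_ip, dst_ip, src_mac, dst_mac):
    """Each layer wraps what it receives from the layer above in its own header."""
    segment = {"src_port": src_port, "dst_port": dst_port, "payload": data}   # transport
    datagram = {"src_ip": src_ip, "dst_ip": dst_ip, "payload": segment}       # network
    frame = {"src_mac": src_mac, "dst_mac": dst_mac, "payload": datagram}     # link
    return frame


def receive(frame):
    """Each layer strips its own header and passes the rest upwards."""
    datagram = frame["payload"]      # link layer removes the frame header
    segment = datagram["payload"]    # network layer removes the IP header
    return segment["dst_port"], segment["payload"]   # transport layer picks the application


frame = send("GET /index.html", 49152, 80,
             "192.0.2.1", "192.0.2.2",
             "02:00:00:00:00:01", "02:00:00:00:00:02")
port, data = receive(frame)
```

The destination port recovered at the end is what lets the transport layer hand the data to the correct application layer protocol.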

 BitTorrent protocol: A protocol that allows fast sharing of files via peer-to-peer networks.

o Torrent File: A file that contains details regarding the tracker

o Tracker: A server that keeps track of the peers

o Peer: A user who is currently downloading and uploading the same file in the swarm

o Swarm: A network of peers that are sharing the torrent – simultaneously downloading and uploading the file.

o Seeding: The act of uploading a part of the file or the file itself as a whole after/while downloading

o Leeching: The act of simply downloading a part of the file or the file itself as a whole and not seeding it during or after the

download.

o Seeders: Users who are currently seeding the file.

o Leechers/Free-riders: Peers who are currently leeching the file.

Other Protocols

Acronym   Protocol                        Purpose
HTTP      Hyper Text Transfer Protocol    Handles transmission of data to and from a website
FTP       File Transfer Protocol          Handles transmission of files across a network
POP3      Post Office Protocol 3          Handles the receiving of emails
SMTP      Simple Mail Transfer Protocol   Handles the sending of emails
 SMTP is a push protocol. POP3 is a pull protocol; a more recent alternative to POP3 is IMAP (Internet Message Access Protocol), which offers the facilities of POP3 and more.

 Web-based mail has largely superseded this approach. A browser is used to access the email application, so HTTP is now the protocol used (direct and automatic email from a website). However, SMTP remains in use for transfer between mail servers.

Peer to Peer File Sharing

P2P file sharing generates a lot of network traffic on the internet. It is an architecture that has no structure and no controlling mechanism. Peers act as both clients and servers, and each peer is just one end system. The BitTorrent protocol is the most widely used because it allows fast file sharing. There are three basic problems to solve if end systems are to be confident in using BitTorrent:

1. How does a peer find others that have the wanted content? BitTorrent's answer is for every content provider to supply a content description called a torrent: a file that contains the name of the tracker (a server that leads peers to the content) and a list of the chunks that make up the content. The torrent file is at least three orders of magnitude smaller than the content, so it can be transferred quickly. The tracker maintains a list of all the peers (the swarm) actively downloading and uploading the content.

2. How do peers replicate content to provide high-speed downloads for everyone? Peers download and upload chunks simultaneously, but they have to exchange lists of chunks and give preference to downloading rare chunks. Each time a rare chunk is downloaded, it automatically becomes less rare.

3. How do peers encourage other peers to provide content rather than just using the protocol to download for themselves? This requires dealing with the free-riders/leechers who only download. The solution is for a peer to try other peers at random initially, but then to continue uploading only to those peers that regularly provide downloads in return. A peer that provides no uploads, or only slow ones, will eventually be isolated (choked).
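The preference for rare chunks in point 2 can be sketched in Python. The data structures and the function name `rarest_first` are assumptions made for illustration; a real BitTorrent client is considerably more involved.

```python
from collections import Counter


def rarest_first(my_chunks, peer_chunks):
    """Order the chunks still needed so the rarest in the swarm comes first.

    my_chunks: set of chunk numbers already held.
    peer_chunks: dict mapping a peer name to the set of chunks it holds.
    """
    availability = Counter(chunk
                           for chunks in peer_chunks.values()
                           for chunk in chunks)
    needed = [chunk for chunk in availability if chunk not in my_chunks]
    # Fewest copies first; ties are broken by chunk number.
    return sorted(needed, key=lambda chunk: (availability[chunk], chunk))


swarm = {
    "peer_a": {0, 1, 2},
    "peer_b": {0, 1},
    "peer_c": {0, 3},
}
order = rarest_first({0}, swarm)
```

Chunks 2 and 3 each exist on only one peer, so they are requested before chunk 1, which two peers hold.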

Circuit Switching, Packet Switching and Routers

 Circuit switching: A method of data transfer in which the message is sent over a dedicated communication channel.

E.g. landline phone

 Packet switching: A method of data transfer in which the message is broken down into packets, each of which is sent over whichever route is optimum to reach its destination.

o Each packet travels through several other networks, “switching” between them to reach its destination.

E.g. the Internet

 Router: A device that connects two or more computer networks.

o Directs incoming packets towards their destination, taking the data traffic in the network into account.

Transport Layer Security (TLS)

 TLS protocol: TLS aims to provide secure communication over a network, maintain data integrity and add an additional layer of

security.

 TLS provides improved security over SSL (Secure Sockets Layer). TLS is composed of two layers: the record protocol and the handshake protocol. TLS protects the data exchanged by using encryption. It also allows servers and clients to be authenticated. A handshake has to take place before any data is exchanged using the TLS protocol.

 The handshake establishes how the exchange of data will take place. Digital certificates and keys are used. The handshake proceeds as follows:

1. The client sends some communication data to the server.

2. The client asks the server to identify itself.

3. The server sends its digital certificate, including the public key.

4. The client validates (the server’s) TLS Certificate.

5. The client sends its digital certificate (to the server if requested).

6. The client sends an encrypted message to the server using the server’s public key.

7. The server can use its private key to decrypt the message and get the data needed to generate the symmetric key.

8. Both server and client compute symmetric key (to be used for encrypting messages) // session key established.

9. The client sends back a digitally signed acknowledgement to start an encrypted session.

10. The server sends back a digitally signed acknowledgement to start an encrypted session.

E.g. for online banking.

Local Area Networks (LAN)

 Bus topology: A network topology in which each workstation is connected to a main cable (backbone) through which the network

is established.

o The Backbone acts as the common medium; any signals sent or received go through the backbone to reach the recipient.

 Star topology: A network topology in which each workstation is connected to a central node/connection point through which the

network is established.

o The central node (hub) directs the packets according to the data traffic and their recipient.

 Wireless networks: A computer network that uses wireless data connections between its network components.

 Bluetooth: A type of short-range wireless communication that uses radio transmission.

 Wi-Fi (IEEE 802.11x): A type of wireless communication that allows users to communicate within a particular area and to access the Internet.

Component | Purpose in a LAN
Switch | Connects devices within the network and sends each incoming transmission to its intended device
Router | Directs the incoming packets to their receiver
Servers | Provide a medium for the storage and sharing of files and applications for users
Network Interface Cards (NICs) | Consist of the electronic circuitry required to communicate with other networks/devices
 Ethernet: an array of networking technologies and systems used in local area networks (LAN), where computers are connected

within a primary physical space.

 CSMA/CD:

Standard ethernet was implemented on a LAN configured as a bus or a star topology with a hub as the central device where the

transmission was broadcast in a connectionless service. Because of the broadcast transmission, there was a need for access to

the shared medium by end systems to be controlled.

o Without control, two messages sent simultaneously would collide, and each message would be corrupted. The method adopted

was CSMA/CD(carrier sense multiple access with collision detection). If a frame was being transmitted, there was a voltage level

on the ethernet cable which an end system could detect. If this was the case, the protocol defined a time the end system had to

wait before it tried again.

o However, because two end systems could both have waited and then decided to transmit at the same time, collisions could still happen; thus, there was also a need for an end system to detect a collision and to discontinue transmission if one occurred. Before transmitting, a device checks whether the channel is busy. If it is busy, the device waits; if the channel is free, the data is sent. When transmission begins, the device listens for other devices also beginning transmission. If a collision occurs, transmission is aborted and a jam signal is transmitted.

 Both devices wait a (different) random time, then try again. The modern implementation of ethernet is switched. The star

configuration has a switch as the central device, which controls transmission to specific end systems. Each end system is

connected to the switch by a full duplex link, so no collision is possible along the link, and therefore, CSMA/CD is no longer needed

as collisions are impossible.
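
The "wait a random time, then try again" step can be sketched as a small simulation. This sketch assumes binary exponential backoff for choosing the random wait, which the notes above do not specify; the names `backoff_slots` and `resolve_collision` are hypothetical.

```python
import random

def backoff_slots(attempt, rng):
    """Pick a random wait (in slot times) after the given number of collisions."""
    # Assumption: binary exponential backoff, 0 .. 2^attempt - 1 slots, capped at 2^10.
    return rng.randrange(2 ** min(attempt, 10))

def resolve_collision(rng, max_attempts=16):
    """Two stations have collided; return (winner, attempts) once their waits differ."""
    for attempt in range(1, max_attempts + 1):
        wait_a = backoff_slots(attempt, rng)
        wait_b = backoff_slots(attempt, rng)
        if wait_a != wait_b:
            # The station with the shorter wait transmits first.
            return ("A" if wait_a < wait_b else "B"), attempt
        # Equal waits mean both retransmit together and collide again.
    return None, max_attempts

winner, attempts = resolve_collision(random.Random(1))
print(winner, attempts)
```

Because the range of possible waits doubles after each collision, repeated collisions between the same two stations become increasingly unlikely.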

 Ethernet is the most likely protocol to operate in the data link layer when the IP in the network layer sends a datagram to the data

link layer.

 When the data link layer uses Ethernet, the protocol defines two sublayers. The upper one is the logical link control (LLC) sublayer, which handles flow control, error control and part of the framing process. The lower one is the media access control (MAC) sublayer, which completes the framing process and defines the access method. The MAC sublayer transmits the frames containing the physical addresses of the sender and receiver, which is why these are called MAC addresses.

Hardware Connection Device

 An end system on an Ethernet LAN needs a network interface card (NIC). Each NIC has a unique physical address, known as its MAC address. The end system itself has no identification on the network: if the NIC is removed and inserted into a different end system, it takes the address with it.

 The simplest device used for the center of the star topology LAN is the hub which ensures that any incoming communication is

broadcast to all connected end systems. However, a hub is not restricted to supporting an isolated network; it can have a

hierarchical configuration with one hub connected to other hubs supporting individual LANs. A hub can also have a built-in

broadband modem, which allows all of the end user systems on the LAN to have an internet connection when this modem is

connected to a telephone line.

 A switch can function as a hub, but it's more intelligent and can keep track of the addresses of connected devices; this allows a

switch to send an incoming transmission to a specific end system as a unicast. This reduces the amount of network traffic

compared to the hubs.
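
The difference between a hub's broadcast and a switch's unicast can be sketched as follows. This is a simplified model, not a real switch implementation; the class name `LearningSwitch` and the short MAC strings are assumptions made for illustration.

```python
class LearningSwitch:
    """Toy switch that learns which port each MAC address is connected to."""

    def __init__(self, port_count):
        self.ports = list(range(port_count))
        self.mac_table = {}  # maps MAC address -> port number

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is forwarded to."""
        self.mac_table[src_mac] = in_port          # remember where the sender is
        if dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]       # unicast to the known port
        # Unknown destination: behave like a hub and send to every other port.
        return [p for p in self.ports if p != in_port]

switch = LearningSwitch(4)
print(switch.receive(0, "AA", "BB"))  # BB not yet known, so the frame is flooded
print(switch.receive(1, "BB", "AA"))  # AA was learned on port 0, so unicast
```

Once both addresses have been learned, traffic between the two end systems no longer reaches the other ports, which is where the reduction in network traffic comes from.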

 A router is the most intelligent of the connecting devices. It can function as a switch and decide to which device it will forward a received transmission. The main use of routers is in the backbone fabric of the internet. Nearer to the end systems, a router may function as a gateway or as a network address translation box, or be combined with a firewall.

Wireless Networks

Wireless technology has become dominant over cabled connections. The following are discussed in order of increasing scale of operation.

 Bluetooth: this has been standardized as IEEE 802.15. Communication is by short-range radio transmission in a confined area. A Bluetooth LAN is an ad hoc network, so there is no defined infrastructure and network connections are created spontaneously, e.g. a wireless keyboard.

 Wi-Fi (WLAN): wireless Ethernet, standardized as IEEE 802.11. This is a wireless LAN protocol which uses radio frequency transmission. A Wi-Fi LAN is mostly centred on a wireless access point in an infrastructure network, not an ad hoc network. The wireless access point communicates wirelessly with any end systems connected to the device. It also has a wired connection to the internet.

 WiMAX (Worldwide Interoperability for Microwave Access): standardized as IEEE 802.16, this is a protocol for a MAN or WAN. It is designed for use by PSTNs to provide broadband access to the internet without having to lay underground cables. Local subscribers connect to the antenna of a local base station using a microwave signal.

 Cellular networks: used in mobile/cell phones. Each cell has a base station at its centre. The system works because each cell has

a defined frequency for transmission, which is different from the frequencies used in adjacent cells. The technology available in cell

phones has vastly progressed:

1. 1G was designed for voice communication using analogue technology

2. 2G went digital

3. 3G introduced multimedia and serious internet connection capability

4. 4G introduced smartphones with high bandwidth broadband connectivity.

Wireless Access Points

• Allowing devices to connect to the LAN via radio communication instead of using a cable

• Easy to move a device to a different location.

Hardware and Virtual Machines
Logic Gates & Circuit Design
Logic Gates: A component of a logical circuit that can perform a Boolean operation (logical function).

 AND Gate: $A.B = X$

A B X
0 0 0
0 1 0
1 0 0
1 1 1

 OR Gate: $A + B = X$

A B Output
0 0 0
0 1 1
1 0 1
1 1 1

 NOT Gate: $\overline{A} = X$

A Output
0 1
1 0

 NAND Gate: $\overline{A.B} = X$

A B Output
0 0 1
0 1 1
1 0 1
1 1 0

 NOR Gate: $\overline{A + B} = X$

A B Output
0 0 1
0 1 0
1 0 0
1 1 0

 XOR Gate: $A.\overline{B} + \overline{A}.B = X$

A B Output
0 0 0
0 1 1
1 0 1
1 1 0
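
The truth tables above can be reproduced programmatically. The sketch below models each gate as a function on the bits 0 and 1; the names `GATES`, `NOT` and `truth_table` are chosen for illustration only.

```python
from itertools import product

# Each two-input gate maps a pair of bits to one output bit.
GATES = {
    "AND":  lambda a, b: a & b,
    "OR":   lambda a, b: a | b,
    "NAND": lambda a, b: 1 - (a & b),
    "NOR":  lambda a, b: 1 - (a | b),
    "XOR":  lambda a, b: a ^ b,
}

def NOT(a):
    """Single-input gate: invert the bit."""
    return 1 - a

def truth_table(gate):
    """Return rows (A, B, X) in the order 00, 01, 10, 11."""
    return [(a, b, gate(a, b)) for a, b in product((0, 1), repeat=2)]

for name, gate in GATES.items():
    print(name, truth_table(gate))
```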

 Logic circuits: A circuit that performs logical operations on symbols.

 Sequential circuit: a circuit whose output depends on the input and previous output values. E.g.: - Flip-flops (Section 3.3.4)

 Combinational circuit: a circuit whose output is dependent only on the input values

o Half-Adder: A logic circuit that adds two bits together and outputs their sum (S) and carry (C).

Input Output
A B S C
0 0 0 0
0 1 1 0
1 0 1 0
1 1 0 1
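
The half-adder table corresponds to an XOR gate for the sum and an AND gate for the carry. A minimal sketch, with `half_adder` as an illustrative function name:

```python
def half_adder(a, b):
    """Return (S, C): the sum bit and the carry bit for two input bits."""
    s = a ^ b   # sum is 1 when exactly one input is 1 (XOR)
    c = a & b   # carry is 1 only when both inputs are 1 (AND)
    return s, c

for a in (0, 1):
    for b in (0, 1):
        print(a, b, *half_adder(a, b))
```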

Boolean Algebra
 Double Complement: $\overline{\overline{A}} = A$

 Identity Law

o $1.A = A$

o $0 + A = A$

 Null Law

o $0.A = 0$

o $1 + A = 1$

 Idempotent Law

o $A.A = A$

o $A + A = A$

 Inverse Law

o $A.\overline{A} = 0$

o $A + \overline{A} = 1$

 Commutative Law

o $A.B = B.A$

o $A + B = B + A$

 Associative Law

o $(A.B).C = A.(B.C)$

o $(A + B) + C = A + (B + C)$

 Distributive Law

o $A + B.C = (A + B).(A + C)$

o $A.(B + C) = A.B + A.C$

 Absorption Law

o $A.(A + B) = A$

o $A + A.B = A$

 De Morgan’s Law

o $\overline{A.B} = \overline{A} + \overline{B}$

o $\overline{A + B} = \overline{A}.\overline{B}$
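
Since each variable can only be 0 or 1, any of these laws can be checked by trying every combination of inputs. The sketch below does this for a few of them; `LAWS` and `holds` are illustrative names.

```python
from itertools import product

def NOT(x):
    return 1 - x

# Each entry compares the left-hand side with the right-hand side of a law.
LAWS = {
    "De Morgan (AND)":  lambda a, b, c: NOT(a & b) == (NOT(a) | NOT(b)),
    "De Morgan (OR)":   lambda a, b, c: NOT(a | b) == (NOT(a) & NOT(b)),
    "Absorption (AND)": lambda a, b, c: (a & (a | b)) == a,
    "Absorption (OR)":  lambda a, b, c: (a | (a & b)) == a,
    "Distributive (+)": lambda a, b, c: (a | (b & c)) == ((a | b) & (a | c)),
    "Distributive (.)": lambda a, b, c: (a & (b | c)) == ((a & b) | (a & c)),
}

def holds(law):
    """True if the law is satisfied for all eight combinations of A, B, C."""
    return all(law(a, b, c) for a, b, c in product((0, 1), repeat=3))

for name, law in LAWS.items():
    print(name, holds(law))
```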

Karnaugh Maps
 Karnaugh maps: a method of obtaining a Boolean algebra expression from a truth table involving the

 Benefits of using Karnaugh Maps:

o Minimises the number of Boolean expressions.

o Minimises the number of Logic Gates used, thus providing a more efficient circuit.

 Methodology

o Try to look for trends in the output, thus predicting the presence of a term in the final expression

o Draw out a Karnaugh Map by filling in the truth table values into the table

o Column labeling follows the Gray coding sequence

o Select groups of adjacent ‘1’ bits whose size is a power of two (1, 2, 4, 8, etc.), making each group as large as possible; a ‘1’ with no adjacent ‘1’ forms a group on its own

o Note: Karnaugh Maps wrap around, so cells in the first and last columns are adjacent

o Within each group, only the variables whose values remain constant are retained
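
The Gray-code labelling used for Karnaugh map columns can be generated rather than memorised. A small sketch, with `gray_code` and `differs_by_one_bit` as illustrative helper names:

```python
def gray_code(bits):
    """Return the Gray code sequence for the given number of bits as strings."""
    # i XOR (i shifted right by one) converts a binary count into Gray code.
    return [format(i ^ (i >> 1), f"0{bits}b") for i in range(2 ** bits)]

def differs_by_one_bit(x, y):
    """True if two equal-length labels differ in exactly one position."""
    return sum(a != b for a, b in zip(x, y)) == 1

labels = gray_code(2)
print(labels)
```

Neighbouring labels differ in exactly one bit, including the last and first labels, which is why groups are allowed to wrap around the edges of the map.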

Examples

Flip-Flops
 Flip flops can store a single bit of data as 0 or 1

 Computers use bits to store data.

 Flip-flops can be used to store bits of data.

 Memory can be created from flip-flops.

SR Flip Flops

JK Flip Flops

 JK flip flops are an improvement over SR flip flops.

 Invalid input combinations are eliminated in JK flip flops.

 All four combinations of input values (J and K) are valid in JK flip-flops.

 JK flip flops use a clock pulse for synchronization to ensure proper functioning.

 Advantages of JK flip flops include the validity of all input combinations, avoidance of unstable states, and increased stability

compared to SR flip flops.
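
The difference between the two flip-flops can be seen by modelling their next-state behaviour. This sketch ignores the clock and simply computes the stored bit after one update; `sr_next` and `jk_next` are illustrative names.

```python
def sr_next(q, s, r):
    """Next stored bit of an SR flip-flop; S = R = 1 is the invalid combination."""
    if s and r:
        raise ValueError("S = R = 1 is not a valid input for an SR flip-flop")
    if s:
        return 1      # set
    if r:
        return 0      # reset
    return q          # no change

def jk_next(q, j, k):
    """Next stored bit of a JK flip-flop; all four input combinations are valid."""
    if j and k:
        return 1 - q  # toggle the stored bit
    if j:
        return 1      # set
    if k:
        return 0      # reset
    return q          # no change

print([jk_next(0, j, k) for j in (0, 1) for k in (0, 1)])
```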

RISC & CISC Processors


 RISC: Reduced Instruction Set Computers.

 CISC: Complex Instruction Set Computers.

RISC | CISC
Fewer instructions | More instructions
Simpler instructions | More complicated instructions
A small number of instruction formats | Many instruction formats
Single-cycle instructions whenever possible | Multi-cycle instructions
Fixed-length instructions | Variable-length instructions
Only load and store instructions to address memory | Many types of instructions to address memory
Fewer addressing modes | More addressing modes
Multiple register sets | Fewer registers
Hard-wired control unit | Microprogrammed control unit
Pipelining easier | Pipelining much more difficult
 Pipelining: Instruction level parallelism

o Used extensively in RISC processor-based systems to reduce the time taken to run processes

o Multiple registers are employed

 Interrupt handling in CISC and RISC Processors:

o As soon as the interrupt is detected, the current processes are paused and moved into registers

o The ISR (Interrupt Service Routine) is loaded onto the pipeline and is executed.

 When the interrupt has been serviced, the paused processes are resumed by bringing them back from the registers to the pipeline

 RISC processors allow efficient pipelining. Pipelining is instruction-level parallelism. Its underlying principle is that the fetch-decode-execute cycle can be separated into several stages.

One possible division into stages is:

1. Instruction fetch (IF)

2. Instruction decode (ID)

3. Operand fetch (OF)

4. Instruction execution (IE)

5. Result write back (WB)

Pipelining for Five-Stage Instruction Handling

 For pipelining to be implemented, the construction of the processor must have five independent units, each handling one of the five

identified stages.

 This explains the need for a RISC processor to have many register sets. Each processor unit must have access to its own set of

registers. The representations 1.1, 1.2, and so on are used to define the instruction and the stage of the instruction. Initially, only

the first stage of the first instruction has entered the pipeline. At clock cycle 6, the first instruction has left the pipeline, the last stage

of instruction 2 is being handled, and instruction 6 has just been entered. Once underway, the pipeline is handling 5 stages of 5

individual instructions. At each clock cycle, the complete processing of one instruction is finished. Without the pipelining, the

processing time would've been 5 times longer.
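
The timing described above can be checked with a short model. It assumes an ideal pipeline with no stalls, in which instruction i enters stage 1 at clock cycle i; the function names are illustrative.

```python
STAGES = ["IF", "ID", "OF", "IE", "WB"]

def pipeline_snapshot(cycle, instruction_count, stages=len(STAGES)):
    """Return {instruction: stage number} for what is in the pipeline at a clock cycle."""
    snapshot = {}
    for instr in range(1, instruction_count + 1):
        stage = cycle - instr + 1      # instruction i is in stage 1 at cycle i
        if 1 <= stage <= stages:
            snapshot[instr] = stage
    return snapshot

def cycles_pipelined(n, stages=len(STAGES)):
    """Clock cycles to finish n instructions with an ideal pipeline."""
    return n + stages - 1

def cycles_sequential(n, stages=len(STAGES)):
    """Clock cycles to finish n instructions one after another."""
    return n * stages

print(pipeline_snapshot(6, 10))
print(cycles_pipelined(100), cycles_sequential(100))
```

At clock cycle 6 the model shows instruction 2 in its last stage and instruction 6 in its first, matching the description. The "5 times" figure is approached as the number of instructions grows, because filling the pipeline costs a few extra cycles at the start.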

 One disadvantage is interrupt handling. There will be 5 instructions in the pipeline when an interrupt occurs. There are two options for dealing with this:

o Erase the pipeline contents for the latest 4 instructions to have entered. Then, the normal interrupt handling routine can be applied to the remaining instruction.

o Construct the individual units in the processor with individual program counter registers. This allows current data to be stored for all of the instructions in the pipeline while the interrupt is handled.

Parallel Processing
 SISD

o Single Instruction Single Data stream

o Found in the early computers

o Contains a single processor; thus, there is no pipelining

 SIMD

o Single Instruction Multiple Data stream.

o Found in array processors

o Contains multiple processors, which have their own memory.

 MISD

o Multiple Instruction Single Data stream

o Used to sort large quantities of data.

o Contains multiple processors which process the same data

 MIMD

o Multiple Instruction Multiple Data.

o Found in modern personal computers.

o Each processor executes a different individual instruction.

 Massively parallel computers

o Computers that contain vast amounts of processing power.

o Has a bus structure to support multiple processors and a network infrastructure to support multiple ‘Host’ computers.

o Commonly used to solve highly complex mathematical problems.

Virtual Machines
 Virtual machine:

o Process interacts with the software interface provided by the OS. This provides an exact copy of the hardware.

o OS kernel handles interaction with actual host hardware

Pros | Cons
Allows more than one OS to run on a system | Performance drop from native OS
Allows multiple copies of the same OS | The time and effort needed for implementation is high
Examples and Usage:

 Used by companies wishing to run legacy software on newer hardware and by companies consolidating servers

 Virtualising machines allows developers to test applications on many systems without making expensive hardware purchases.

System Software
Purposes of an Operating System (OS)
 Optimizes the use of computer resources

o Implements process scheduling to ensure efficient CPU use

o Manages main memory usage

o Optimizes I/O

 Dictates whether I/O passes through CPU or not

 Hides the complexities of the hardware

o UI allows users to interact with application programs

o Automatically provides drivers for new devices

o Provides file system

 Organizes physical storage of files on disk

o Provides a programming environment, removing the need for knowledge of processor functions

o Provides system calls/APIs

 Portability

 Multitasking:

o More than one program can be stored in memory, but only one can have CPU access at any given time

o The rest of the programs remain ready

 Process:

o A program being executed which has an associated Process Control Block (PCB) in memory

 PCB: a complex data structure containing all data relevant to the execution of a process

o Process states

 Ready: A new process arrived at the memory, and the PCB is created

 Running: Has CPU access

 Blocked: Cannot progress until some event has occurred

 Scheduling ensures that the computer system can serve all requests and obtain a certain quality of service.

 Interrupt:

o Causes OS kernel to invoke ISR

 The kernel may have to decide on a priority

 Register values stored in PCB

o Reasons

 Errors

 Waiting for I/O

 Scheduler halts process

 Low-level scheduling: Allocation of specific processor components to complete specific tasks.

 Low-level scheduling algorithms

 Preemptive: Will stop a process that would otherwise have continued to execute normally.

 First-come-first-served

o Non-preemptive

o FIFO(First In First Out) queue

 Round-robin

o Allocates time slice to each process

o Preemptive

o Can be a FIFO queue

o Does not prioritize

 Priority-based

o Most complex

 Priorities re-evaluated on queue change

 Priority calculation requires computation

o Criteria for priority time

 Estimated time of execution

 Estimated remaining time of execution

 Is the process CPU-bound or I/O-bound?

 Length of time spent in waiting queue
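
The difference between the non-preemptive and preemptive approaches can be sketched by comparing first-come-first-served with round-robin. The process names and burst times below are made up for illustration, and all processes are assumed to arrive at time 0.

```python
from collections import deque

def fcfs(bursts):
    """First-come-first-served: each process runs to completion in arrival order."""
    clock, finished = 0, {}
    for name, burst in bursts.items():
        clock += burst
        finished[name] = clock
    return finished

def round_robin(bursts, time_slice):
    """Round-robin: each process runs for at most one time slice, then rejoins the queue."""
    queue = deque(bursts.items())   # FIFO queue of (name, remaining time)
    clock, finished = 0, {}
    while queue:
        name, remaining = queue.popleft()
        run = min(time_slice, remaining)
        clock += run
        if remaining > run:
            queue.append((name, remaining - run))  # pre-empted, back of the queue
        else:
            finished[name] = clock                 # process completed
    return finished

bursts = {"P1": 5, "P2": 2, "P3": 3}
print(fcfs(bursts))
print(round_robin(bursts, time_slice=2))
```

With round-robin the short process P2 finishes earlier than under first-come-first-served, at the cost of the long process P1 finishing later.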

 Paging:

o Process split into pages, memory split into frames

o All pages loaded into memory at once

 Virtual memory:

o No need for all pages to be in memory

o CPU address space is thus larger than physical space

 Addresses resolved by the memory management unit

o Benefits

 Not all of the program has to be in memory at once

 Large programs can be run with or without large physical memory

o Process

 All pages on the disk initially

 One/more loaded into memory when process ‘ready’

 Pages replaced from disk when needed

 This can be done with a FIFO queue or usage-statistics-based algorithm

 Disk thrashing: Perpetual loading/unloading of pages due to a page from disk immediately requiring the page it replaced.
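
Page replacement with a FIFO queue can be sketched by counting page faults for a sequence of page references. The reference string below is a commonly used illustrative example, not taken from the notes, and `fifo_page_faults` is a hypothetical name.

```python
from collections import deque

def fifo_page_faults(references, frame_count):
    """Count page faults when the oldest loaded page is always the one replaced."""
    frames = deque()   # pages currently in memory, oldest on the left
    faults = 0
    for page in references:
        if page in frames:
            continue               # page already in memory, no fault
        faults += 1                # page must be loaded from disk
        if len(frames) == frame_count:
            frames.popleft()       # evict the page that was loaded first
        frames.append(page)
    return faults

references = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_page_faults(references, 3), fifo_page_faults(references, 4))
```

For this particular sequence, FIFO causes more faults with 4 frames than with 3, which is one reason usage-statistics-based algorithms are also used.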

OS Structure
An OS has to be structured to provide a platform for resource management and facilities for users. The logical structure provides 2

modes of operation:

1. The user mode is the one available for the user or an application program.

2. Privileged/kernel mode has the sole access to parts of the memory and to certain system functions that the user mode can’t

access.

Translation Software
 Lexical analysis: The process of converting a sequence of characters to a sequence of tokens.

o Tokens: Strings with an assigned meaning

 Syntax analysis: The process of double-checking the code for grammar mistakes (syntax errors).

 Code generation: The process by which an intermediate code is generated after syntax analysis.

 Optimization: A process in which the code is edited to make efficiency improvements.

 For interpreters:

o Analysis and code generation run for each code line as above

o Each line is executed as soon as the intermediate code is generated
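
Lexical analysis can be sketched with regular expressions. The token categories below (`NUMBER`, `IDENTIFIER`, and so on) are assumptions made for this example and do not belong to any particular language.

```python
import re

# Each token type is paired with the pattern that recognises it.
TOKEN_SPEC = [
    ("NUMBER",     r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR",   r"[+\-*/=]"),
    ("LPAREN",     r"\("),
    ("RPAREN",     r"\)"),
    ("SKIP",       r"\s+"),
    ("MISMATCH",   r"."),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenise(source):
    """Convert a sequence of characters into a list of (token type, text) pairs."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue                       # white space carries no meaning
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {text!r}")
        tokens.append((kind, text))
    return tokens

print(tokenise("total = count + 42"))
```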

Syntax Diagrams and BNF


BNF is a formal mathematical way of defining syntax unambiguously.

It consists of:

 A set of terminal symbols

 A set of non-terminal symbols

 A set of production rules
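
As an illustration, consider two hypothetical production rules (not taken from these notes): `<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9` and `<integer> ::= <digit> | <digit><integer>`. Here the digits are terminal symbols and `<digit>` and `<integer>` are non-terminal symbols. Each rule can be turned into a function that checks whether a string matches it:

```python
DIGITS = set("0123456789")   # the terminal symbols

def is_digit(text):
    """<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9"""
    return len(text) == 1 and text in DIGITS

def is_integer(text):
    """<integer> ::= <digit> | <digit><integer>"""
    if is_digit(text):
        return True
    # Recursive alternative: a digit followed by a shorter integer.
    return len(text) > 1 and is_digit(text[0]) and is_integer(text[1:])

print(is_integer("2024"), is_integer("12a"))
```

The second rule is recursive, which is how a finite set of rules can describe integers of any length.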

RPN (Reverse Polish Notation)
Reverse Polish notation (RPN): A method of representing an arithmetic or logical expression without brackets or special

punctuation. RPN uses postfix notation, where an operator is placed after the variables it acts on. For example, A + B would be

written as A B +

 Compilers use RPN because any expression can be processed from left to right without backtracking.

Advantages of RPN

● RPN expressions do not need brackets, and there is no need for the precedence of operators

● RPN is simpler for a machine to evaluate

● There is no need for backtracking in evaluation as the operators appear in the order required for computation and can be

evaluated from left to right
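
Left-to-right evaluation without backtracking can be sketched with a stack. This minimal evaluator assumes tokens are separated by spaces and supports only the four basic arithmetic operators; `evaluate_rpn` is an illustrative name.

```python
import operator

OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def evaluate_rpn(expression):
    """Evaluate a space-separated RPN expression using a stack."""
    stack = []
    for token in expression.split():
        if token in OPERATORS:
            right = stack.pop()     # the most recent operand is the right-hand one
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        else:
            stack.append(float(token))   # operands are pushed as they are read
    if len(stack) != 1:
        raise ValueError("malformed RPN expression")
    return stack[0]

print(evaluate_rpn("2 3 4 * +"))   # infix: 2 + 3 * 4
```

Each token is read exactly once, and operator precedence never has to be consulted because the order of the tokens already encodes it.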

Infix to Reverse Polish Notation

Security
Asymmetric Keys and Encryption Methods
 Plain text: data before encryption.

 Cipher text: the result of applying an encryption algorithm to data.

 Encryption: the making of cipher text from plain text.

Encryption can be used:

o When transmitting data over a network.

o It is a routine procedure when storing data within a computing system.

 Public key: A key that is shared between the user and sender for encryption of the data and verifying digital signatures.

 Private key: A key which is kept secret and used to decrypt the data encrypted by the public key.

 Symmetric key encryption: when there is just one key used to encrypt and then decrypt. The sender and the receiver of a

message share the secret key.

 Asymmetric encryption is when two different keys are used, one for encryption and another for decryption. Only one of

these is a secret.

 Sending a private message:

 Sending verified messages to the public:

Encryption and Decryption

1. The process starts with original data, plaintext, whatever form it takes.

2. This is encrypted by an encryption algorithm, which uses a key.

3. The product of the encryption is ciphertext, which is transmitted to the recipient.

4. When the transmission is received, it is decrypted using a decryption algorithm and a key to produce the original plaintext.

Security concerns relating to a transmission:

 Confidentiality: only the intended recipient should be able to decrypt the ciphertext.

 Authenticity: the receiver must be confident who sent the ciphertext.

 Integrity: the ciphertext must not be modified during transmission.

 Non-repudiation: neither the sender nor the receiver should be able to deny involvement in the transmission.

 Availability: nothing should happen to prevent the receiver from receiving the transmission.

At the sending end, the sender has a key to encrypt some plaintext, and the ciphertext produced is transmitted to the receiver.

Now, the receiver needs to get the key needed for decryption.

1. If symmetric key encryption is used, there needs to be a secure method for the sender and receiver to be provided with the secret

key.

2. Using asymmetric key encryption, the process starts with the receiver. The receiver must have two keys. One is a public key, which is not secret. The other is a private key, which is secret and known only to the receiver. The receiver can send the public key to a sender, who uses the public key for encryption and sends the ciphertext to the receiver. Only the receiver can decrypt the message, because only the matching private key reverses encryption done with the public key. The public key can be provided to different people, allowing the receiver to receive a private message from any of them.
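
The matched key pair can be illustrated with a toy version of RSA, which is one asymmetric algorithm (the notes do not name a specific one). The primes 61 and 53 and the exponent 17 are textbook-sized values chosen for readability and are far too small to be secure; the function names are illustrative.

```python
def make_keys(p, q, e):
    """Build a (public key, private key) pair from two primes and a public exponent."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)          # private exponent: modular inverse of e (Python 3.8+)
    return (e, n), (d, n)

def encrypt(message, public_key):
    """Anyone holding the public key can encrypt a number smaller than n."""
    e, n = public_key
    return pow(message, e, n)

def decrypt(ciphertext, private_key):
    """Only the holder of the private key can recover the original number."""
    d, n = private_key
    return pow(ciphertext, d, n)

public_key, private_key = make_keys(61, 53, 17)
ciphertext = encrypt(65, public_key)
print(ciphertext, decrypt(ciphertext, private_key))
```

The receiver would publish `public_key` and keep `private_key` secret; any number of senders can then use the same public key.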

Digital Signatures and Digital Certificates


 Using asymmetric encryption, the decryption also works if the keys are used the other way around. An individual can encrypt a message with a private key and send this to many recipients who hold the corresponding public key and can, therefore, decrypt the message. This is not suitable for confidential messages but can be used to verify who the sender is. Only the sender has the private key, and the public key only works with that specific private key. Therefore, used this way, the message has a digital signature identifying the sender. However, this digital signature requires encryption of the whole message, which is slow.

 A cryptographic one-way hash function creates from the message a number, called a digest, that is uniquely defined for the particular message. The private key is used to encrypt this digest, producing the signature. This speeds up the process of confirming the sender's identity.

 The message is assumed to be transmitted as plaintext, and the digital signature is assumed to be a separate file. The receiver applies the same public hash function that the sender used, so the same digest is produced if the message has been transmitted without alteration. The decryption of the digital signature produces an identical digest if the message was genuinely sent by the original owner of the public key the receiver used. This makes the receiver confident that the message is authentic and unaltered. However, someone might forge a public key and pretend to be someone else. Therefore, there is a need for a more rigorous means of ensuring authentication. This can be provided by a Certification Authority (CA) as part of a Public Key Infrastructure (PKI).
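
Signing a digest and checking it at the receiving end can be sketched as follows. This is a toy illustration only: it uses SHA-256 from Python's `hashlib` for the hash function and unpadded RSA with two well-known Mersenne primes for the key pair, both of which are assumptions of this sketch and not a secure scheme.

```python
import hashlib

# Toy key pair (assumption for illustration): two Mersenne primes and a common exponent.
P, Q, E = 2**31 - 1, 2**61 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, kept by the sender (Python 3.8+)

def digest(message):
    """Hash the message and reduce the result to a number smaller than N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message):
    """Sender: encrypt the digest with the private key to form the signature."""
    return pow(digest(message), D, N)

def verify(message, signature):
    """Receiver: decrypt the signature with the public key and compare digests."""
    return pow(signature, E, N) == digest(message)

signature = sign(b"pay 10 to Bob")
print(verify(b"pay 10 to Bob", signature))
print(verify(b"pay 1000 to Bob", signature))
```

Only the short digest is encrypted, not the whole message, which is what makes the check fast.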

Suppose a would-be receiver with a public-private key pair wishes to receive secure messages from other individuals. In that case,

the public key must be made available in a way that ensures authentication. The would-be receiver would need to obtain the digital

certificate to allow safe public key delivery:

1. An individual(A) who is a would-be receiver with a public-private key pair contacts a local CA.

2. The CA confirms the identity of A.

3. A's public key is given to the CA.

4. The CA creates a public-key certificate(a digital certificate) and writes A's key into this document.

5. The CA uses encryption with the CA's private key to add a digital signature to this document.

6. The digital certificate is given to A.

7. A posts the digital certificate on a website.

 An individual can place the digital certificate on their own website, or it can be posted on a website designed to keep digital certificate data. Alternatively, a digital certificate might be used solely for authenticating emails. Once a signed digital certificate has been posted on a website, any other person wishing to use A's public key downloads the signed digital certificate from the website and uses the CA's public key to extract A's public key from the digital certificate. For this overall process to work, standards need to be defined.

Encryption Protocols
 SSL and TLS encryption protocols are used in client-server applications.

 SSL (Secure Socket Layer) and TLS (Transport Layer Security) are closely related Internet security protocols. TLS is a slightly

modified version of SSL. The main use of SSL is in the client-server application. The interface between an application and TCP

uses a port number. Without a security protocol, TCP services an application using the port number.

 The combination of an IP address and a port number is the socket. When the SSL protocol is implemented it functions as an

additional layer between TCP in the transport layer and the application layer. The HTTP application protocol becomes HTTPS

when the SSL protocol is in place.

 Provides:

o Encryption

o Compression of data

o Integrity checking

 Connection Process:

 Used in online shopping and banking websites.

Malware
 Virus: tries to replicate inside other executable programs.

 Worm: runs independently and propagates to other network hosts.

 Spyware: collects info & transmits to another system.

 Phishing: email from seemingly legit source requesting confidential info.

 Pharming: setting up a bogus website that appears to be legit.

Malware | Vulnerabilities exploited
Virus | Executable files used to run or install software
Worm | Shared networks
Spyware | Background processes
Phishing | Users' tendency to treat emails from random addresses as trustworthy
Pharming | Users' tendency to rely on the website's user interface rather than the URL to judge its validity

Malware | Methods of restriction
Virus | Install and use anti-virus software that runs daily scans.
Worm | Set up a firewall to protect yourself from external networks.
Spyware | Install and use real-time anti-spyware protection.
Phishing | Always check the sender’s email address.
Pharming | Always double-check the website name.

Artificial Intelligence (AI)
Introduction
 Artificial Intelligence is the ability of computers to perform tasks that usually only a human would be able to do, such as decision-

making, speech recognition, etc.

 Machine learning (ML) is a subfield of artificial intelligence where computers learn to perform tasks without being explicitly

programmed. Machine learning computers are fed with historical training data, which produces a model from which predictions

about previously unseen data can be made.

E.g.

1. Spam filtering (the system can detect spam emails without human interaction.)

2. The search engine can refine searches based on previously conducted searches.

Types of ML include Supervised Learning, Unsupervised Learning, and Reinforcement Learning.

 Deep Learning (DL) is a subset of ML where computers learn to solve problems using neural networks similar to how the human

brain functions.

E.g. Image Classification.

 Labelled and Unlabelled data: Labelled data is fully defined and recognisable. Unlabelled data is data which is unidentified and

unrecognisable.

Supervised, Unsupervised & Reinforcement Learning


Supervised Learning

 This is where the machine learning algorithm is fed a labelled training dataset to improve a computer program's ability to perform similar tasks. The model utilises regression analysis and classification analysis techniques. A set of inputs and corresponding outputs is given for the model to be trained.

 The labels contain the expected outcome for that data. The machine uses the labels and training data to train the model.

 Once the model is trained, it is tested using labelled data and compared to expected outputs to determine if further refinement is

necessary. The model is run using unlabelled data to predict the outcome.

E.g.

1. Labelled data is split into training and test data.

2. Training data is used to train the model.

3. The trained model is ready to test.

4. The model is then tested using the test data.

5. The tested model is then ready to deploy.

6. The trained model can then be deployed.
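
The split, train and test sequence can be sketched with a very simple classifier. The sketch below uses a 1-nearest-neighbour rule, chosen here only because it is short; the data points and the labels "small" and "large" are invented for illustration.

```python
def split(labelled, train_fraction=0.75):
    """Split labelled data into training data and test data."""
    cut = int(len(labelled) * train_fraction)
    return labelled[:cut], labelled[cut:]

def predict(training, features):
    """1-nearest-neighbour: return the label of the closest training example."""
    def distance(example):
        return sum((a - b) ** 2 for a, b in zip(example[0], features))
    return min(training, key=distance)[1]

def accuracy(training, test):
    """Fraction of test examples whose predicted label matches the expected label."""
    correct = sum(predict(training, x) == label for x, label in test)
    return correct / len(test)

labelled = [((1.0, 1.2), "small"), ((9.0, 8.5), "large"), ((1.5, 0.8), "small"),
            ((8.0, 9.5), "large"), ((0.9, 1.1), "small"), ((9.2, 9.1), "large"),
            ((1.2, 1.4), "small"), ((8.8, 8.9), "large")]
training, test = split(labelled)
print(accuracy(training, test))          # compare predictions with expected outputs
print(predict(training, (2.0, 2.0)))     # prediction for new, unlabelled data
```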

Unsupervised Learning

 The machine learning algorithm is trained on unlabelled data and is left to cluster data to improve a computer program's ability to

perform similar tasks. Certain hyper-parameters may be set (such as how many clusters to form), but the process is generally

unstructured. Systems identify hidden patterns from provided data without specific guidance or labels. It groups data with similar

features, finds hidden patterns and structures, and represents it in a compressed format.

 Helpful in categorising many different objects

 Identifying hidden trends or patterns

 Anomaly detection (e.g. fraudulent transactions, spotting skin cancer, crime detection)

 Data compression
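
Clustering unlabelled data can be sketched with the k-means algorithm, used here as one example of an unsupervised method. The one-dimensional data and the starting centroids are invented for illustration; the number of starting centroids plays the role of the "how many clusters" hyper-parameter.

```python
def k_means(points, centroids, iterations=10):
    """Group 1-D points around centroids; returns (final centroids, clusters)."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid.
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (unchanged if the cluster is empty).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

centroids, clusters = k_means([1, 2, 3, 10, 11, 12], centroids=[0, 5])
print(centroids, clusters)
```

No labels are supplied at any point: the two groups emerge only from how close the values are to each other.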

Reinforcement Learning

 Reinforcement learning is a reward-based system where an agent is not given specific instructions (labelled data) but is rewarded for how well it performs, to improve the efficiency of a system in accomplishing similar tasks. Often, this learning follows a Darwinian model where multiple agents attempt the task (each with slightly different random parameters), and those that perform the best form the base settings for their children (slightly mutated versions).

Artificial Neural Networks (ANN)
 Artificial Neural Networks are computational models inspired by the human brain, allowing the system to exhibit human-like thinking and improve its performance with more data. An ANN is a group of interconnected input and output units where each connection has an associated weight.

 ANNs are excellent at identifying patterns that are too complex or time-consuming for humans. Many recent advancements have

been made in Artificial Intelligence, including Voice Recognition, Image Recognition, and Robotics using Artificial Neural Networks.

An ANN consists of 3 or more layers :

 Input Layer: it accepts inputs in several different formats provided by the programmer.

 Hidden Layers: They are present in between the input and output layers. It performs all the calculations to find hidden features and

patterns.

 Output Layer: The output from the hidden layer is conveyed using this layer.

The purpose of the hidden layer in ANN is:

 Allows deep Learning to take place.

 A problem having a higher level of complexity requires more layers to solve.

 To allow the neural network to learn and make decisions independently.

 To improve the accuracy of the results.

How ANN enables ML:

 ANNs replicate the way human brains work.

 Weightings are allocated for each connection between nodes.

 Data is inputted at the input layer and analyzed at each hidden layer to calculate outputs.

 The training process is repeated to achieve the best possible outputs (Reinforcement Learning takes place).

 Decisions can be made without being directly programmed.

 The deep learning net creates complex feature detectors.

 The output layer provides the results.

 Back Propagation of errors is used to correct any mistakes made.

Classification, Regression & Clustering


Classification

Split the data into two or more predefined groups. Example: spam email filtering, where emails are split into either spam or not.

Regression

It is a supervised machine-learning technique that predicts the value of a dependent variable based on

one or more explanatory variables.

Regression models can be used for:

 Weather forecasting

 Predicting health-care trends

 Financial forecasting.

 Sales prediction

Linear Regression

Used where there is a straight-line correlation between variables.

Non-Linear Regression

Used where there is a correlation, but it is not linear.

Clustering

Split the data into smaller groups or clusters based on specific features. The programmer might specify a target number of groups

or let the algorithm decide.

Back Propagation
When creating neural networks, assigning random weights to each neural connection is a crucial step. However, we don’t know the

ideal weight factors for achieving optimal results. Therefore, training the neural networks during the development stage is essential.

Backpropagation in neural networks is a short form for “backward propagation of errors.” It is a standard method of training artificial

neural networks.

In this method:

 The results generated by the systems are compared to the expected outcome.

 The difference between the two results is calculated.

 The error is propagated back from the output layer through the hidden layers to adjust the initial weightings on each neuron.

 Error = Actual Output – Desired Output

 If the error difference is too large, the weightings are altered.

 The process is repeated until the error is within an acceptable range or until the weights stop changing. The model has then

been successfully set up.
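The compare, calculate-error and adjust-weight loop above can be sketched in Python. This is only a minimal illustration, assuming a single neuron with one weight (a real ANN propagates the error back through several hidden layers). The function name train, the learning rate, the starting weight and the training data are all assumptions made for this sketch.

```python
def train(inputs, targets, learning_rate=0.1, tolerance=1e-6, max_epochs=10_000):
    """Adjust one weight until the output error is acceptably small."""
    weight = 0.5  # arbitrary starting weight (normally chosen at random)
    for _ in range(max_epochs):
        largest_error = 0.0
        for x, desired in zip(inputs, targets):
            actual = weight * x                    # output produced by the neuron
            error = actual - desired               # Error = Actual Output - Desired Output
            weight -= learning_rate * error * x    # alter the weighting to reduce the error
            largest_error = max(largest_error, abs(error))
        if largest_error < tolerance:              # error is within the acceptable range
            break
    return weight


# Hypothetical training data where the desired output is always 2 * input.
weight = train([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

After training, weight should be very close to 2, because that is the weighting that makes every actual output match its desired output.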

Graphs
A Graph is a non-linear data structure consisting of nodes, which hold data, and edges, which connect the nodes.

Components of a Graph:

 Nodes: Nodes are the fundamental units of the graph. Every node can be labelled or unlabelled.

 Edges: Edges are lines used to connect two nodes of the graph. In a directed graph, an edge is an ordered pair of nodes. Every edge

can be labelled/unlabelled.

 This graph has a set of nodes V = {1, 2, 3, 4, 5} and a set of edges E = {(1,2), (1,3), (2,3), (2,4), (2,5), (3,5), (4,5)}.
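As a rough sketch, the node set V and edge set E above (reading the last edge as (4,5)) could be held in Python as an adjacency list. The graph is assumed to be undirected, and the names adjacency and are_adjacent are chosen only for this illustration.

```python
nodes = {1, 2, 3, 4, 5}
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5), (4, 5)]

# Adjacency list: each node maps to the list of nodes it shares an edge with.
adjacency = {node: [] for node in nodes}
for u, v in edges:
    adjacency[u].append(v)
    adjacency[v].append(u)  # undirected graph, so record the edge in both directions


def are_adjacent(a, b):
    """Two nodes are adjacent if they are endpoints of the same edge."""
    return b in adjacency[a]
```

For example, are_adjacent(4, 5) is True, while are_adjacent(1, 4) is False because no edge joins nodes 1 and 4.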

Terminologies of Graph

 Adjacency: 2 nodes are said to be adjacent if they are endpoints of the same edge.

 Path: A sequence of alternating nodes and edges that allows you to go from one node to another. A path with unique nodes is called

a simple path.

A graph is labelled or weighted when each edge has a value or weight representing the cost of traversing that edge.

Uses of Graphs in the Real World

Social media and Google Maps utilize graphs to organize and present information. In the case of social media, each user is

represented as a node, and the connections between users are the edges. In Google Maps, each location serves as a node, with the

roads linking them serving as edges. This use of graphs accurately represents the relationships and connections between different

pieces of information, allowing for easy comprehension and interpretation.

Uses of Graphs to help AI

 Graphs can be used to represent ANN

 The graph tells the relationships between nodes

 Find solutions to AI problems, such as finding a path in a graph.

 A range of Algorithms may examine graphs.

 Also used in ML

 An example of a method is the Back Propagation of Errors.

Dijkstra’s Algorithm
Dijkstra's algorithm is an algorithm for finding the shortest path between nodes in a weighted graph, which may represent, for

example, a road network.

How to Implement Dijkstra’s Algorithm?

Initially, the distance from the start node to itself is 0, and its distance to every other node (immediate or not) is set to “∞”.

1. We will find the shortest path from node A to the other nodes in the graph, assuming the start node is A.

2. We look at the immediate nodes connected to node (A), which in this case are node (B) and node (D), and select the node

whose distance from A is shorter. From Node A to Node B the cost = 0 + 3 = 3. The path becomes {A, B}.

3. Now calculate the distance from B to its immediate nodes, i.e. node (D) and node (E), and add the previous distance. From Node B

to Node D the cost = 3 + 5 = 8. The path becomes {A, B, D}.

4. Then, from Node D to Node F, the cost = 3 + 5 + 2 = 10. The path becomes {A, B, D, F}.

5. Now, from Node F, we choose Node C, because the path {A, B, D, F, E, C} costs 20 while the path {A, B, D, F, C} costs 13,

so we select the cheaper path.
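A minimal Python sketch of Dijkstra's algorithm is given below. The diagram for the worked example is not reproduced in these notes, so the edge weights in graph are assumptions: only A-B = 3, B-D = 5, D-F = 2 and the path totals of 13 and 20 come from the steps above; the rest are made up so that the result agrees with them. The function names are also chosen just for this sketch.

```python
import heapq


def dijkstra(graph, start):
    """Return the shortest distance from start to every node, plus predecessor links."""
    distances = {node: float("inf") for node in graph}  # every distance starts at infinity
    distances[start] = 0
    previous = {}
    queue = [(0, start)]                                # priority queue of (distance, node)
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > distances[node]:
            continue                                    # stale entry: a shorter route was already found
        for neighbour, weight in graph[node].items():
            new_dist = dist + weight
            if new_dist < distances[neighbour]:
                distances[neighbour] = new_dist
                previous[neighbour] = node
                heapq.heappush(queue, (new_dist, neighbour))
    return distances, previous


def path_to(previous, start, goal):
    """Rebuild the route by following predecessor links back from the goal."""
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return path[::-1]


# Hypothetical undirected graph (weights partly assumed, see the note above).
graph = {
    "A": {"B": 3, "D": 9},
    "B": {"A": 3, "D": 5, "E": 8},
    "C": {"E": 4, "F": 3},
    "D": {"A": 9, "B": 5, "F": 2},
    "E": {"B": 8, "C": 4, "F": 6},
    "F": {"C": 3, "D": 2, "E": 6},
}
distances, previous = dijkstra(graph, "A")
route = path_to(previous, "A", "C")
```

With these weights, route is ['A', 'B', 'D', 'F', 'C'] and distances["C"] is 13, matching step 5.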

Limitations

 A lack of heuristics

Dijkstra’s algorithm has no notion of the overall shortest direction to the end goal, so it will spend a lot of time searching in

completely the wrong direction if the routes in the wrong direction are shorter than the route in the correct direction. It will find the

shortest route but waste a lot of time.

This isn’t a problem in small networks, but when you have massive networks (like road networks or the internet), it will result in

massive inefficiencies.

 Negative Weighted Costs

On physical networks with physical distances, you can’t have negative weights, but on some networks where you calculate costs,

you might have negative costs for a particular leg. Dijkstra can’t handle these negative costs.

 Directed Networks

Dijkstra’s algorithm doesn’t always work best when there are directed networks (such as motorways that only run in one direction).

A* Algorithm
 One of the biggest problems with Dijkstra’s algorithm is that it can be inefficient when searching for the shortest path because it just

looks for the next shortest leg.

 A* is an informed search algorithm, or a best-first search, formulated in terms of weighted graphs: starting from a specific start

node of a graph, it aims to find the path to the given goal node that has the smallest cost (least distance travelled, shortest time,

etc.). It uses a heuristic value, which gives priority to nodes that are estimated to be better than others. It maintains a tree

of paths originating at the start node and extends those paths one edge at a time until its termination criterion is satisfied.

 At each iteration of its main loop, A* must determine which paths to extend. It does so based on the path's cost and an estimate of

the cost required to extend the path to the goal.

How to Implement A* Algorithm?

Below is an example of the implementation of the A* Algorithm:

 h is the heuristic value

 g is the movement cost

 f is the sum of the g and h values.

 Start from Home. The cost from Home to Home is 0, so g = 0. The heuristic cost of Home is 14, so h = 14 and f = g + h = 14.

 Now, there are three immediate nodes from Home: A, B, and C. Calculate the values of g, h and f for A, B and C from Home and

write them in the table.

 Select the node whose f value is the smallest (in this case, Node A).

 From A, there are two immediate nodes, B and E. Calculate the g value for each node and add the g value of A. Then, add the

corresponding h values to get f for each node.

 From E, there are two immediate nodes, School and F. Calculate the g value for each node and add the g value of E. Then, add the

corresponding h values to get f for each node.

 From F to School, add the cost of the edge (3) to the g value of F (8), giving g = 11, and calculate f.

Final path = Home → A → E → F → School.
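The procedure above can be sketched in Python as follows. The original diagram and table are not reproduced here, so most edge costs and heuristic values below are assumptions; only h(Home) = 14, g(F) = 8, the F to School cost of 3 and the final path are taken from the text. The function name a_star is likewise just for this illustration.

```python
import heapq


def a_star(graph, h, start, goal):
    """Return (path, cost) for the cheapest route found from start to goal."""
    g = {start: 0}                        # g: movement cost from the start node
    previous = {}
    open_list = [(h[start], start)]       # ordered by f = g + h
    while open_list:
        f, node = heapq.heappop(open_list)
        if node == goal:
            path = [node]
            while node in previous:       # walk back to the start node
                node = previous[node]
                path.append(node)
            return path[::-1], g[goal]
        if f > g[node] + h[node]:
            continue                      # stale entry: a cheaper route to node exists
        for neighbour, cost in graph[node].items():
            new_g = g[node] + cost
            if new_g < g.get(neighbour, float("inf")):
                g[neighbour] = new_g
                previous[neighbour] = node
                heapq.heappush(open_list, (new_g + h[neighbour], neighbour))
    return None, float("inf")


# Hypothetical edge costs and heuristic values (see the note above).
graph = {
    "Home": {"A": 3, "B": 5, "C": 6},
    "A": {"B": 4, "E": 3},
    "B": {},
    "C": {},
    "E": {"School": 7, "F": 2},
    "F": {"School": 3},
    "School": {},
}
h = {"Home": 14, "A": 8, "B": 10, "C": 12, "E": 5, "F": 3, "School": 0}

path, cost = a_star(graph, h, "Home", "School")
```

With these values, path is ['Home', 'A', 'E', 'F', 'School'] and cost is 11 (the g value of F, 8, plus 3).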

Further Programming
Programming Paradigms
 A programming paradigm defines the style or model followed when programming.

o Low-Level Programming

 Machine code (binary – lowest level) or Assembly language

 “Low” refers to the small/non-existent amount of abstraction between the language and machine language

 Instructions can be converted to machine code without a compiler or interpreter (assembly language needs only an assembler)

 The resulting code runs directly on the specific computer processor, with a small memory footprint

 Programs written in low-level languages tend to be relatively non-portable – code written for one processor family (e.g. x86)

might not work on another (e.g. ARM)

 Simple language, but considered challenging to use due to numerous technical details that the programmer must remember.

o Imperative (Procedural) Programming

 It uses a sequence of statements to determine how to reach a specific goal. These statements are said to change the program's

state as each one is executed in turn.

 Each statement changes the program's state, from assigning a value to each variable to the final addition of those values. For

example, a sequence of five statements can explicitly tell the program how to add the numbers 5, 10, and 15 together.
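A possible five-statement version in Python is sketched below; the variable names are chosen only for this illustration.

```python
a = 5                # statement 1: assign the first value
b = 10               # statement 2: assign the second value
c = 15               # statement 3: assign the third value
total = a + b + c    # statement 4: add the values together
print(total)         # statement 5: output the result (30)
```

Each line changes the state of the program (the contents of a, b, c and total) and the statements are executed strictly in order.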

o Object-Oriented Programming (OOP)

 An extension of imperative programming.

 The focus is on grouping functions and data into logical classes and instances of classes called objects.

 Object-oriented programming is further explained in its section later in the notes.

o Declarative Programming

 Non-procedural and very high level (4th generation)

 Control flow is implicit, not explicit as it is in imperative programming

 The programmer states only what needs to be done and what the result should look like, not how to obtain it.

 A vital feature → backtracking – where a search goes partially back on itself if it fails to find a complete match the first time around

 Goal – a statement that we are trying to prove either true or false; it effectively forms a query.

 Instantiation – giving a value to a variable in a statement

 A Declarative Language is further explained with examples in its section in the “theory” section of the notes.

File Processing and Exception Handling


File Processing

 Records are user-defined data structures

Defining a record structure for a Customer record with relevant fields (e.g., customer ID) in Python:
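The original code listing is not reproduced in these notes. One possible way to do this, sketched here with Python's dataclass decorator, is shown below; apart from the customer ID, the field names and sample values are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    """A record with a fixed set of fields, each with its own data type."""
    customer_id: int
    name: str        # hypothetical field
    balance: float   # hypothetical field


# Create one record and read a field from it.
first_customer = Customer(customer_id=1, name="Ada", balance=25.50)
print(first_customer.customer_id)
```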

 Files are needed to import contents (from a file) saved in secondary memory into the program or to save the output of a program (in

a file) into secondary memory so that it is available for future use.

Pseudocode:

 Opening a file:

OPENFILE <filename> FOR READ/WRITE/APPEND

 Reading a file:

READFILE <filename>, <variable>

 Writing a line of text to the file:

WRITEFILE <filename>, <string>

 Closing a file:

CLOSEFILE <filename>

 Testing for the end of the file:

EOF(<filename>)

Python:

 Opening a file

variable = open("filename", "mode")

Where the mode can be:

Mode   Description
r      Opens a file for reading only. The pointer is placed at the beginning of the file.
w      Opens a file for writing only. Overwrites the file if it exists or creates a new file if it doesn’t.
a      Opens a file for appending. The pointer is placed at the end of the file if it exists; a new file is created if not.
 Reading a file:

o Read all characters

variable.read()

o Read each line and store it as a list

variable.readlines()

 Writing to a file:

o Write a fixed sequence of characters to file

variable.write("Text")

o Write a list of strings to file

variable.writelines(["line1\n", "line2\n", "line3\n"])

 Using a direct-access (random) file allows us to read records directly. The name ‘random’ is misleading, since records are still

read from and written to the file systematically.

Pseudocode:

 Opening a file using the RANDOM file mode, where once the file has been opened, we can read and write as many times as we

would like in the same session:

OPENFILE <filename> FOR RANDOM

 Move a pointer to the disk address for the record before Reading/writing to a file can occur:

SEEK <filename>, <address>

Each record is given an ‘address’ at which it is to be written – the record key.

 Write a record to the file:

PUTRECORD <filename>, <identifier>

 Read a record from a file:

GETRECORD <filename>, <identifier>

 Close the file:

CLOSEFILE <filename>

Algorithms for File Processing Operations for Serial and Sequential Files:

 Display all records:

 Search for a record:

*Special Case: If the records in a sequential file are of a fixed length, a record can be retrieved using its relative position. So, the

start position in the file could be calculated for the record with the key number 15, for example.

 Add a new record – Serial Organisation:

 Add a new record – Sequential Organisation:

*Some file processing tasks, like this one, require two files because a serial/sequential file can only be opened either to read

from or to write to, not both, in the same session.*

 Delete a record:

 Amend an existing record:

Python example of Sequential File Handling:
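The original listing is not reproduced here. As a hedged sketch, sequential file handling could look like the code below. It assumes a text file in which every line is a record of the form key,name ending with a newline, stored in ascending key order; the function names and the record layout are assumptions for illustration.

```python
import os


def read_records(filename):
    """Display-all style read: return every (key, name) record in stored order."""
    records = []
    with open(filename, "r") as file:
        for line in file:
            key, name = line.rstrip("\n").split(",", 1)
            records.append((int(key), name))
    return records


def search_record(filename, wanted_key):
    """Read from the start until the key is found or passed."""
    with open(filename, "r") as file:
        for line in file:
            key, name = line.rstrip("\n").split(",", 1)
            if int(key) == wanted_key:
                return name
            if int(key) > wanted_key:   # keys are in order, so the record cannot be later
                break
    return None


def add_record(filename, new_key, new_name):
    """Copy the old file into a new file, inserting the new record in key order."""
    temp_name = filename + ".tmp"
    inserted = False
    with open(filename, "r") as old_file, open(temp_name, "w") as new_file:
        for line in old_file:
            key = int(line.split(",", 1)[0])
            if not inserted and new_key < key:
                new_file.write(f"{new_key},{new_name}\n")
                inserted = True
            new_file.write(line)
        if not inserted:                # new key is the largest, so it goes at the end
            new_file.write(f"{new_key},{new_name}\n")
    os.replace(temp_name, filename)     # the new file replaces the old one
```

Note how add_record uses two files: the existing file is only read, and a second file is only written.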

Algorithms for File Processing Operations for Random Files:

 Display all records:

 Add a new record:

Python:

 Delete a record:

 Amend an existing record:

 Search for a record:

Python:

Python example of Random File Handling:
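The original listing is not reproduced here. The sketch below shows one way random (direct) access could be done in Python, assuming fixed-length binary records built with the struct module and using the record key as the record's position in the file. The record layout (an integer key and a 20-byte name), the function names and the file capacity are all assumptions for illustration.

```python
import struct

RECORD_FORMAT = "<i20s"                        # integer key followed by a 20-byte name
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)   # every record has the same length


def create_file(filename, capacity):
    """Create a file with room for `capacity` empty (all-zero) records."""
    with open(filename, "wb") as file:
        file.write(b"\x00" * (capacity * RECORD_SIZE))


def put_record(filename, key, name):
    """Move the pointer to the record's address, then write the record there."""
    with open(filename, "r+b") as file:
        file.seek(key * RECORD_SIZE)
        file.write(struct.pack(RECORD_FORMAT, key, name.encode("utf-8")))


def get_record(filename, key):
    """Move the pointer to the record's address and read just that record."""
    with open(filename, "rb") as file:
        file.seek(key * RECORD_SIZE)
        data = file.read(RECORD_SIZE)
    if len(data) < RECORD_SIZE:
        return None                            # address lies beyond the end of the file
    _stored_key, raw_name = struct.unpack(RECORD_FORMAT, data)
    name = raw_name.rstrip(b"\x00").decode("utf-8")
    return name if name else None              # an empty slot holds no name
```

Because every record has the same size, the address of a record is simply key * RECORD_SIZE, so no other records need to be read first.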

Exception Handling

 An exception is a runtime error/ fatal error/situation which causes a program to terminate/crash.

 Exception-handling – code that is called when a run-time error or “exception” occurs to prevent the program from crashing

 When an exception occurs, we say it has been “raised.” You can “handle” the exception raised by using a try block.

 A corresponding except block “catches” the exception and returns a message to the user if an exception occurs.

e.g.
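The original example is not reproduced here; a minimal sketch is shown below, with the function name safe_divide chosen only for illustration.

```python
def safe_divide(numerator, denominator):
    try:
        # Code that might raise an exception goes inside the try block.
        return numerator / denominator
    except ZeroDivisionError:
        # The except block catches the exception instead of letting the program crash.
        print("Cannot divide by zero.")
        return None


print(safe_divide(6, 3))   # 2.0
print(safe_divide(1, 0))   # prints the error message, then None
```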

Use of Development Tools and Programming Environments


 Integrated Development Environment: an application that provides several tools for software development. An IDE usually

includes: a Source code editor, debugger and automated builder

 Features in editors that benefit programming:

o Syntax Highlighting: keywords are coloured differently according to their category

o Automatic indentation: after colons, for example, to make code blocks more distinct, allowing for better code readability

o A library of preprogrammed subroutines that can be implemented into a new program to speed up the development process

Compiler:

 Translates the whole source code into machine code, which can then be run and executed by the computer.

 It takes significant time to analyze the source code, but the overall execution time is comparatively faster.

 Generates intermediate object code, which further requires linking and more memory.

 It generates the error message only after scanning the whole program. Hence, debugging is comparatively complex.

 Programming languages like C and C++ use compilers.

Interpreter:

 Directly executes/performs instructions written in a programming language by translating one statement at a time.

 It takes less time to analyze the source code, but the execution time is slower.

 No intermediate object code is generated; hence, it is memory efficient.

 Continues translating the program until the first error is met, in which case it stops. Hence, debugging is easy.

 Programming languages like Python and Ruby use interpreters.

 Systems that require high performance and are intended to run for a long time should be written in compiled languages like C and C++

 Systems that need to be created quickly and easily should be written in interpreted languages

Features available in debuggers:

 Stepping - traces through each line of code and steps into procedures. Allows you to view the effect of each statement on

variables

 Breakpoints - set within code; program stops temporarily to check that it is operating correctly up to that point

 Go to File/Line - Look at the current line. Use the cursor and the line above for a filename and line number. If found, open the file if

not already open, and show the line. Use this to view source lines referenced in an exception traceback and lines found by Find in

Files. Also available in the context menu of the Shell window and Output windows.

 Debugger (toggle) - When active, code entered in the Shell or run from an Editor will run under the debugger. In the Editor,

breakpoints can be set with the context menu. This feature is still incomplete and somewhat experimental.

 Stack Viewer - Show the stack traceback of the last exception in a tree widget with access to local and global variables.

 Auto-open Stack Viewer - Toggle automatically opening the stack viewer on an unhandled exception.

Sorting and Searching Algorithms
Part of the Computational Thinking and Problem-Solving Chapter

Linear Search
How does it work?

 The user is asked to enter an item they want to find in an array.

 All elements of an array are searched one by one until the item the user entered is found.

 When an item is found, the algorithm outputs an appropriate message saying that the item is found and which index/location the

item is at.

 If the particular item is not found, the algorithm outputs an appropriate message saying that the item is not found.

The linear search algorithm has a Big O notation of O(n).

Python Code
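The original listing is not reproduced here. A minimal sketch that follows the steps above is given below; it takes the item as a parameter rather than asking the user for input, and the function name is chosen for illustration.

```python
def linear_search(arr, target):
    # Check every element in turn until the target is found.
    for index, item in enumerate(arr):
        if item == target:
            return f"Item {target} found at index {index}."
    # The whole array was searched without finding the target.
    return f"Item {target} not found in the array."


print(linear_search([4, 2, 7], 7))   # Item 7 found at index 2.
```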

Binary Search
The necessary condition for a binary search is that the list/array being searched must be ordered/sorted.

How Does It Work?

 The middle item/index of the list is found

 Item at the middle of the list is compared to item user inputs

 If the item in the middle of the list is the same as the item that the user inputs, a message saying “item found” will be output.

 If the middle item is greater than what the user inputted, the middle item and all the items at indexes greater than the middle index are discarded.

 If the middle item is lower than what the user inputted, the middle item and all the items at indexes lower than the middle index are discarded.

 The above steps are repeated until the item searched for is found

 If one item is left in the list and it is not the item searched for, a message saying “item not found” is outputted.

The binary search algorithm has a Big O notation of O(log n). The log is of base 2.

Python Code

def binary_search(arr, target):
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        mid_element = arr[mid]
        if mid_element == target:
            return f"Item {target} found at index {mid}."
        elif mid_element < target:
            low = mid + 1
        else:
            high = mid - 1
    return f"Item {target} not found in the array."

Bubble Sort
How Does It Work?

 Bubble sort compares adjacent elements in the list/array.

 They are swapped if the elements are in the wrong order (according to the desired sorting).

 The algorithm iterates through the array multiple times in passes.

 On each pass, the largest unsorted element "bubbles up" to its correct position at the end of the array.

 The process is repeated until the entire array is sorted.

Python Code

def bubble_sort(arr):
    n = len(arr)
    # Traverse through all array elements
    for i in range(n):
        # Last i elements are already sorted, so we don't need to check them
        for j in range(0, n - i - 1):
            # Swap if the element found is greater than the next element
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]

In the worst case, it has a time complexity of O(n^2), where n is the number of elements in the array.

Insertion Sort
How Does It Work?

 The algorithm starts with the assumption that the first element in the array is already sorted.

 It then compares the next element with the sorted portion of the array.

 If the next element is smaller, it shifts the larger elements to the right until it finds the correct position for the next element and

inserts it there.

 The sorted portion of the array grows with each iteration until the entire array is sorted.

 The process is repeated until all elements are in their correct positions.

Python Code

def insertion_sort(arr):
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        # Move elements of arr[0..i-1] that are greater than key to one position ahead of their current position
        while j >= 0 and key < arr[j]:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key

In the worst case, it has a time complexity of O(n^2), where n is the number of elements in the array.

Insertion sort is efficient for small datasets or partially sorted datasets.

Abstract Data Types (ADTs)
Part of the Computational Thinking and Problem-Solving Chapter

Stacks
 Stack – an ADT where items can be pushed onto or popped from the top of the stack only

 LIFO – Last In First Out data structure

Popping (pseudocode)

PROCEDURE PopFromStack
    IF TopOfStack = -1
      THEN
        OUTPUT "Stack is already empty"
      ELSE
        OUTPUT MyStack[TopOfStack], " is popped"
        TopOfStack ← TopOfStack - 1
    ENDIF
ENDPROCEDURE

Pushing (pseudocode)

PROCEDURE PushToStack
    // The stack is indexed from 0 (TopOfStack = -1 means empty), so the last valid index is MaxStackSize - 1
    IF TopOfStack = MaxStackSize - 1
      THEN
        OUTPUT "Stack is full"
      ELSE
        TopOfStack ← TopOfStack + 1
        MyStack[TopOfStack] ← NewItem
    ENDIF
ENDPROCEDURE

Use of Stacks:

 Interrupt Handling

o The contents of the register and the PC are saved and put on the stack when the interrupt is detected

o The return addresses are saved onto the stack as well

o Retrieve the return addresses and restore the register contents from the stack once the interrupt has been serviced

 Evaluating mathematical expressions held in Reverse Polish Notation

 Procedure Calling

o Every time a new call is made, the return address must be stored

o Return addresses are recalled in the order ‘the last one stored will be the first to be recalled.’

o If there are too many nested calls, a stack overflow occurs

Queues
 Queue: an ADT where new elements are added at the end of the queue, and elements leave from the start of the queue

 FIFO: First In, First Out Data structure

o Creating a Circular Queue (pseudocode):

PROCEDURE Initialise
    Front ← 1
    Rear ← 6
    NumberInQueue ← 0
ENDPROCEDURE

o To add an Element to the Queue (pseudocode):

PROCEDURE EnQueue
    IF NumberInQueue = 6
      THEN
        OUTPUT "Queue overflow"
      ELSE
        IF Rear = 6
          THEN Rear ← 1
          ELSE Rear ← Rear + 1
        ENDIF
        Q[Rear] ← NewItem
        NumberInQueue ← NumberInQueue + 1
    ENDIF
ENDPROCEDURE

 The front of the queue is accessed through the pointer Front.

 To add an element to the queue, the pointers have to be followed until the node containing the pointer of 0 is reached → the end of

the queue, and this pointer is then changed to point to the new node.

 In some implementations, two pointers are kept: 1 to the front and 1 to the rear. This saves traversing the whole queue when a new

element is to be added.

 To Remove an Item from the Queue (pseudocode):

PROCEDURE DeQueue
    IF NumberInQueue = 0
      THEN
        OUTPUT "Queue empty"
      ELSE
        NewItem ← Q[Front]
        NumberInQueue ← NumberInQueue - 1
        IF Front = 6
          THEN Front ← 1
          ELSE Front ← Front + 1
        ENDIF
    ENDIF
ENDPROCEDURE

 Items may only be removed from the front of the queue and added to the end of the queue

Linked Lists
 It can be represented as two 1-D arrays - a string array for data values and an integer array for pointer values

 Creating a linked list → setting the values of the pointers in the free list and creating an empty data linked list:

FOR Index ← 1 TO 49
    NameList[Index].Pointer ← Index + 1
ENDFOR
NameList[50].Pointer ← 0
HeadPointer ← 0
FreePointer ← 1

A user-defined record type should first be created to represent a node’s data and pointer:

 Inserting into a Linked List

 Searching a Linked List

 Deleting an Item from a Linked List

 Use a Boolean value to know when an item has been found and deleted (initially false)

 Use a pointer (CurrentPointer) to go through each node’s address

 If the item to delete is found at the head of the list:

1. Set head pointer to pointer of node at CurrentPointer

2. Set the pointer on node at CurrentPointer to free pointer

3. The free pointer points to CurrentPointer

4. Set the Boolean value to True

 Otherwise:

1. Search for the item while the end of the linked list has not been reached and the Boolean value is false.

1. Use a Previous Pointer to keep track of the node located just before the one deleted

2. CurrentPointer points to the next node’s address

3. If data in the node at CurrentPointer matches SearchItem

 Set the pointer of the node at PreviousPointer to the pointer of the node at CurrentPointer

 Set the pointer of the node at CurrentPointer to FreePointer

 Set FreePointer to CurrentPointer

 Boolean value becomes true

 If the Boolean value is false

1. Inform the user that the item to be deleted has not been found
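The linked list operations described in this section can be sketched in Python as below. This is an assumption-laden illustration rather than the original listing: it uses two parallel lists (data and pointer) with 0 as the null pointer, inserts new items at the head for simplicity, and the class and method names are made up for the sketch.

```python
NULL_POINTER = 0
SIZE = 50


class LinkedList:
    def __init__(self):
        # Index 0 is unused so that 0 can act as the null pointer.
        self.data = [""] * (SIZE + 1)
        self.pointer = [i + 1 for i in range(SIZE + 1)]  # free list: 1 -> 2 -> ... -> 50
        self.pointer[SIZE] = NULL_POINTER
        self.head = NULL_POINTER                          # the data list starts empty
        self.free = 1

    def insert(self, item):
        """Take a node from the free list and link it in at the head."""
        if self.free == NULL_POINTER:
            return False                                  # no free nodes left
        new_node = self.free
        self.free = self.pointer[new_node]
        self.data[new_node] = item
        self.pointer[new_node] = self.head
        self.head = new_node
        return True

    def delete(self, item):
        """Unlink the node holding item and return it to the free list."""
        previous, current = NULL_POINTER, self.head
        while current != NULL_POINTER and self.data[current] != item:
            previous = current
            current = self.pointer[current]
        if current == NULL_POINTER:
            return False                                  # item not found
        if previous == NULL_POINTER:
            self.head = self.pointer[current]             # item was at the head
        else:
            self.pointer[previous] = self.pointer[current]
        self.pointer[current] = self.free
        self.free = current
        return True

    def to_list(self):
        """Follow the pointers from the head and collect the data values."""
        items, current = [], self.head
        while current != NULL_POINTER:
            items.append(self.data[current])
            current = self.pointer[current]
        return items
```

The return value of delete plays the role of the Boolean flag in the steps above: False means the item to be deleted was not found.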

Binary Tree
 Dynamic Data Structure: can match the size of data requirement.

 Takes memory from the heap as required and returns memory as required, following a node deletion

 An ADT consisting of nodes arranged hierarchically, starting with a root node

 Usually implemented using three 1-D arrays

 A node can have no more than two descendants in a binary tree.

 A binary tree node is like a linked list node but with two pointers, LeftChild and RightChild.

 Binary trees can be used in many ways. One use is to hold an ordered set of data. In an ordered binary tree, all items to the root's

left will have a smaller key than those on the root's right. This applies equally to all the sub-trees.

 Tree algorithms are often recursive.

 To insert data into an ordered tree, the following recursive algorithm can be used:

PROCEDURE insert(Tree, Item)
    IF Tree is empty
      THEN create new tree with Item as the root
      ELSE
        IF Item < Root
          THEN insert(Left sub-tree of Tree, Item)
          ELSE insert(Right sub-tree of Tree, Item)
        ENDIF
    ENDIF
ENDPROCEDURE

 Another common use of a binary tree is to hold an algebraic expression, for example X + Y * 2

It could be stored with + at the root, X as its left child and * as its right child; the children of * are Y and 2.

Binary Tree

START at Root Node
REPEAT
    IF WantedItem = ThisItem
      THEN Found ← TRUE
      ELSE
        IF WantedItem > ThisItem
          THEN Follow Right Pointer
          ELSE Follow Left Pointer
        ENDIF
    ENDIF
UNTIL Found or Null Pointer Encountered
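The insert and search algorithms for an ordered binary tree can be sketched in Python as follows. This version assumes object references instead of the array-based pointers mentioned earlier, and the names Node, insert, search and in_order are chosen only for this illustration.

```python
class Node:
    def __init__(self, item):
        self.item = item
        self.left = None    # LeftChild pointer
        self.right = None   # RightChild pointer


def insert(tree, item):
    """Recursive insert: returns the root of the (sub)tree after insertion."""
    if tree is None:
        return Node(item)                       # empty tree: item becomes the root
    if item < tree.item:
        tree.left = insert(tree.left, item)
    else:
        tree.right = insert(tree.right, item)
    return tree


def search(tree, wanted):
    """Follow left/right pointers until the item is found or a null pointer is met."""
    current = tree
    while current is not None:
        if wanted == current.item:
            return True
        current = current.right if wanted > current.item else current.left
    return False


def in_order(tree):
    """Left sub-tree, root, right sub-tree: gives the items in ascending order."""
    if tree is None:
        return []
    return in_order(tree.left) + [tree.item] + in_order(tree.right)


root = None
for value in [50, 30, 70, 20, 40]:
    root = insert(root, value)
```

After these insertions, in_order(root) gives [20, 30, 40, 50, 70], search(root, 40) is True and search(root, 60) is False.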

Recursion
Part of the Computational Thinking and Problem-Solving Chapter

Essential Features of a Recursion


 Must have a base case/stopping condition

 Must have a general case which calls itself (recursively) // Defined in terms of itself

 The general case should change its state and move towards the base case

 Unwinding occurs once the base case is reached.

Advantages and Disadvantages of a Recursion


Advantages:

 Can produce simpler, more natural solutions to a problem

Disadvantages:

 Less efficient in terms of computer time and storage space

 A lot more storage space is used to store return addresses and states.

 This could lead to infinite recursion.

How A Compiler Translates Recursive Programming Code


 Before the procedure call is executed, the current state of the registers/local variables is saved onto a stack.

 When returning from a procedure call, the registers/local variables are re-instated from the stack.

 When the stopping condition/base case is met, the algorithm unwinds: the saved sets of values are taken off the stack in reverse

order.
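A short Python sketch of a recursive function is shown below, using factorial as an assumed example (it is not taken from the original notes) to illustrate the base case, the general case and unwinding.

```python
def factorial(n):
    if n == 0:
        return 1                      # base case / stopping condition
    return n * factorial(n - 1)       # general case: calls itself, moving towards the base case


# The calls are stacked as factorial(3) -> factorial(2) -> factorial(1) -> factorial(0);
# once the base case returns 1, the calls unwind in reverse order: 1, 1, 2, 6.
print(factorial(3))   # 6
```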

Object-Oriented Programming
Part of the Further Programming Chapter

Key Terms & Definitions


 Objects: Instances of classes representing real-world entities.

 Properties/Attributes: The data items/attributes and the data types // characteristics defined in a class.

 Methods: the procedures/ functions / programmed instructions in a class that act on the properties/attributes.

 Classes: Blueprint for creating objects with shared attributes and methods.

 Inheritance: It is a mechanism for creating a new class based on an existing one, inheriting its attributes and methods. Through

inheritance, attributes contained in one class (parent class) are made available to / reused by another class (child class).

 Polymorphism: Ability to use different classes through a common interface. It allows the same method to take on different

behaviours depending on which class is instantiated. These methods can be redefined for derived classes.

 Containment (Aggregation): Combining multiple objects to create a more complex object.

 Encapsulation: Hiding the internal details of a class from the outside.

 Getters and Setters: Methods for accessing and modifying object attributes. Get methods/Getters are used to access attributes,

while set methods/setters are used to modify object attributes.

 Instances: Individual objects created from a class.

Constructor
Constructors are functions that are used for initializing the attributes/Properties of a class.

 A constructor in Object-Oriented Programming (OOP) is a special method within a class that is automatically called when an object

of that class is created.

 Its primary purpose is to initialize the object's attributes or perform setup actions.

 In Python, the constructor is named __init__ and takes self as its first parameter, which refers to the object being created.

 Inside the constructor, you initialize the object's attributes.

For example:

 When you create an instance of the class, the constructor is invoked automatically to set the object's initial state. This ensures that

objects are created in a valid and consistent state.
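The original example is not reproduced in these notes. A minimal sketch of a constructor is shown below; the Book class and its attributes are hypothetical and are used only for illustration.

```python
class Book:
    def __init__(self, title, pages):
        # The constructor runs automatically when Book(...) is called
        # and sets the initial state of the new object.
        self.title = title
        self.pages = pages


novel = Book("Dune", 412)   # __init__ is invoked here
print(novel.title)          # Dune
```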

Defining Classes and Methods
• To define a class in Python, use the class keyword followed by the class name.

• Inside the class, you can define attributes (variables) and methods (functions) that belong to the class.

Here's a simple example of defining a class called Person with attributes and methods as shown below:
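The original listing is not reproduced here, so the version below is only a sketch. The name and age attributes and the get_name and set_name methods are taken from the surrounding notes; get_age and the validation rule in set_name are assumptions.

```python
class Person:
    def __init__(self, name, age):
        self.__name = name     # private attributes (double leading underscore)
        self.__age = age

    # Get methods (getters) give read access to private attributes.
    def get_name(self):
        return self.__name

    def get_age(self):
        return self.__age

    # Set methods (setters) change private attributes in a controlled way.
    def set_name(self, new_name):
        if new_name:           # assumed rule: reject empty names
            self.__name = new_name


person = Person("Amina", 40)
person.set_name("Amina Khan")
print(person.get_name())       # Amina Khan
```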

Get and Set Methods


These methods are used to access/change attributes set to be private in a class. These methods are declared inside the class.

Inheritance
• Inheritance is a fundamental concept in OOP that allows you to create a new class (a derived or child class) based on an existing

class (a base or parent class).

• The child class inherits the attributes and methods of the parent class and can also add new attributes and methods or override

the ones inherited.

In the context of the Person class, Inheritance would involve creating a child class, such as a Student or Employee, that inherits

attributes and methods from the Person class. For example, a Student class could inherit the name and age attributes and the

get_name method from the Person class.

Polymorphism

• Polymorphism is the ability of different classes to be treated as instances of a common base class. It allows objects of different

classes to be used interchangeably if they implement the same methods or interface.

• Polymorphism promotes flexibility and extensibility in your code.

In the context of the Person class, Polymorphism could be applied when you have different types of persons, such as students,

employees, and teachers, each having a get_name method. You can call get_name on instances of these different classes without

knowing their specific type, as long as they all have a get_name method.

Encapsulation

• Encapsulation is the practice of hiding the internal details of a class and providing a controlled interface to access and modify the

class's attributes.

• This helps maintain the integrity and consistency of the object's state by controlling how data is accessed and modified.

In the context of the Person class, Encapsulation is applied by making the name attribute private (by convention, using leading

underscores, as in __name). This indicates that it should not be accessed directly from outside the class. Instead, you provide a

controlled interface through the get_name and set_name methods, ensuring that name changes are validated and controlled.

Below is a brief code example illustrating how inheritance, polymorphism, and encapsulation could be applied to the Person class:
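The original listing is not reproduced here; the code below is a sketch under assumptions. The school attribute, the sample names and the "(student)" suffix are made up to show an overridden method.

```python
class Person:
    def __init__(self, name, age):
        self.__name = name                  # encapsulation: private attributes
        self.__age = age

    def get_name(self):
        return self.__name

    def set_name(self, new_name):
        if new_name:                        # assumed validation: ignore empty names
            self.__name = new_name


class Student(Person):                      # inheritance: Student is a child of Person
    def __init__(self, name, age, school):
        super().__init__(name, age)         # reuse the parent constructor
        self.__school = school

    def get_name(self):                     # polymorphism: overrides the parent method
        return super().get_name() + " (student)"


people = [Person("Amina", 40), Student("Ben", 17, "Hillview College")]
names = [person.get_name() for person in people]   # same call, different behaviour
print(names)                                # ['Amina', 'Ben (student)']
```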

• In the example above, the Student class inherits from the Person class, demonstrating the concept of inheritance.

• Both classes can be treated interchangeably when calling the get_name method, showing polymorphism.

• Encapsulation is maintained by controlling access to the name attribute through getter and setter methods.
